Perbandingan Metode Ward’s, DBSCAN, Dan K-Means Untuk Klasterisasi Aduan Pelayanan Masyarakat Di Dispendukcapil Kota Surabaya

Wardhana, Mareta Putri (2026) Perbandingan Metode Ward’s, DBSCAN, Dan K-Means Untuk Klasterisasi Aduan Pelayanan Masyarakat Di Dispendukcapil Kota Surabaya. Undergraduate thesis, UPN Veteran Jawa Timur.

Cover: 21083010002.-cover.pdf (937kB)
Chapter 1: 21083010002.-bab1.pdf (210kB)
Chapter 2: 21083010002.-bab2.pdf (605kB), restricted to repository staff only until 8 March 2029
Chapter 3: 21083010002.-bab3.pdf (502kB), restricted to repository staff only until 8 March 2029
Chapter 4: 21083010002.-bab4.pdf (3MB), restricted to repository staff only until 8 March 2029
Chapter 5: 21083010002.-bab5.pdf (165kB)
References: 21083010002.daftarpustaka.pdf (162kB)
Appendix: 21083010002.-lampiran.pdf (137kB), restricted to repository staff only

Abstract

Text clustering is a text mining technique for identifying patterns in unstructured data. In public services, community complaints usually arrive as free-text descriptions that vary widely, which makes them difficult to analyze manually. The complaint data of the Department of Population and Civil Registration (Dispendukcapil) of Surabaya City shows the same problem: variation in content and language structure makes it hard to identify recurring problem patterns. Clustering methods are therefore needed to group complaints automatically by content similarity. This study compares the performance of Ward's Method, DBSCAN, and K-Means in clustering public complaints and determines the best-performing method based on the validity of the resulting cluster structure. The research stages are text preprocessing (cleaning, tokenizing, normalization, stopword removal, and stemming), followed by TF-IDF weighting to transform the text into a numerical representation. Clustering is then performed with three methods: Ward's Method, a hierarchical agglomerative approach using Euclidean distance; DBSCAN, a density-based approach that handles noise; and K-Means, a partitioning approach whose number of clusters is chosen with the Variance Ratio Criterion (VRC). Cluster quality is evaluated with the Silhouette Coefficient and the Davies-Bouldin Index (DBI), which measure the compactness of clusters and the separation between them. Ward's Method produces 2 clusters, DBSCAN produces 10 clusters plus one group of noise points, and K-Means produces 3 clusters. In the evaluation, Ward's Method obtains the highest Silhouette Coefficient (0.76) and the lowest DBI (1.02), compared with DBSCAN (Silhouette 0.52, DBI 1.12) and K-Means (Silhouette 0.49, DBI 3.47). These findings indicate that Ward's Method is the most suitable of the three for the characteristics of public service complaint data and can be used as an analytical approach to support improvements in public service quality.
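
The pipeline described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn. It is not the thesis implementation: the complaint texts are hypothetical and assumed to be already preprocessed, the DBSCAN parameters (`eps`, `min_samples`) and the candidate range for k are arbitrary choices for this toy data, and the helper names `pick_k_by_vrc` and `evaluate` are invented for illustration. Two further assumptions are labeled in the comments: the sketch also uses the VRC to choose the number of clusters for Ward's Method (the abstract states this only for K-Means), and it excludes DBSCAN noise points before scoring.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Hypothetical complaints, assumed already cleaned, normalized, and stemmed.
docs = [
    "ktp elektronik belum cetak",
    "ktp elektronik belum jadi",
    "ktp elektronik cetak jadi",
    "akta lahir anak terbit",
    "akta lahir anak salah",
    "akta lahir terbit salah",
    "kartu keluarga pindah alamat",
    "kartu keluarga alamat baru",
    "kartu keluarga pindah baru",
]

# TF-IDF weighting; scikit-learn L2-normalizes each row by default.
X = TfidfVectorizer().fit_transform(docs).toarray()


def pick_k_by_vrc(make_model, X, k_values):
    """Return (k, labels) with the highest Variance Ratio Criterion."""
    best = None
    for k in k_values:
        labels = make_model(k).fit_predict(X)
        vrc = calinski_harabasz_score(X, labels)  # VRC = Calinski-Harabasz index
        if best is None or vrc > best[0]:
            best = (vrc, k, labels)
    return best[1], best[2]


def evaluate(X, labels):
    """Return (silhouette, davies_bouldin), or None if the scores are undefined."""
    labels = np.asarray(labels)
    mask = labels != -1  # assumption: DBSCAN noise (-1) is left out of scoring
    n_clusters = len(set(labels[mask]))
    if not 2 <= n_clusters <= mask.sum() - 1:
        return None
    return (
        silhouette_score(X[mask], labels[mask]),
        davies_bouldin_score(X[mask], labels[mask]),
    )


k_range = range(2, 6)  # arbitrary candidate range for this toy data

# Ward linkage (Euclidean distance); choosing k by VRC here is an assumption.
ward_k, ward_labels = pick_k_by_vrc(
    lambda k: AgglomerativeClustering(n_clusters=k, linkage="ward"), X, k_range
)

# K-Means with the number of clusters chosen by VRC, as stated in the abstract.
kmeans_k, kmeans_labels = pick_k_by_vrc(
    lambda k: KMeans(n_clusters=k, n_init=10, random_state=0), X, k_range
)

# DBSCAN finds the number of clusters itself; eps and min_samples are guesses.
dbscan_labels = DBSCAN(eps=1.0, min_samples=2).fit_predict(X)

results = {
    "ward": evaluate(X, ward_labels),
    "kmeans": evaluate(X, kmeans_labels),
    "dbscan": evaluate(X, dbscan_labels),
}
for name, scores in results.items():
    print(name, scores)
```

A higher Silhouette Coefficient and a lower Davies-Bouldin Index both indicate more compact, better separated clusters, which is how the abstract ranks the three methods. On real complaint data the DBSCAN parameters would need tuning, and whether noise points are excluded from the scores can change the reported values noticeably.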

Item Type: Thesis (Undergraduate)
Contributors:
Contribution | Contributors | NIDN/NIDK | Email
Thesis advisor | Hindrayani, Kartika Maulida | NIDN0009099205 | kartika.maulida.ds@upnjatim.ac.id
Thesis advisor | Muhaimin, Amri | NIDN0023079502 | amri.muhaimin.stat@upnjatim.ac.id
Subjects: Q Science > QA Mathematics
Q Science > QA Mathematics > QA76.6 Computer Programming
Divisions: Faculty of Computer Science > Department of Data Science
Depositing User: Mareta Putri Wardhana
Date Deposited: 10 Mar 2026 01:30
Last Modified: 10 Mar 2026 01:30
URI: https://repository.upnjatim.ac.id/id/eprint/50275
