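Wardhana, Mareta Putri (2026) Perbandingan Metode Ward’s, DBSCAN, Dan K-Means Untuk Klasterisasi Aduan Pelayanan Masyarakat Di Dispendukcapil Kota Surabaya [Comparison of Ward’s Method, DBSCAN, and K-Means for Clustering Public Service Complaints at Dispendukcapil, Surabaya City]. Undergraduate thesis, UPN Veteran Jawa Timur.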

Text (Cover)
21083010002.-cover.pdf (937kB)

Text (Chapter 1)
21083010002.-bab1.pdf (210kB)

Text (Chapter 2)
21083010002.-bab2.pdf (605kB) Restricted to Repository staff only until 8 March 2029.

Text (Chapter 3)
21083010002.-bab3.pdf (502kB) Restricted to Repository staff only until 8 March 2029.

Text (Chapter 4)
21083010002.-bab4.pdf (3MB) Restricted to Repository staff only until 8 March 2029.

Text (Chapter 5)
21083010002.-bab5.pdf (165kB)

Text (Bibliography)
21083010002.daftarpustaka.pdf (162kB)

Text (Appendix)
21083010002.-lampiran.pdf (137kB) Restricted to Repository staff only.

Abstract
Text clustering is a text-mining technique for identifying patterns in unstructured data. In public services, complaints typically arrive as free-text descriptions that vary widely, which makes them difficult to analyze manually. This is the case for complaint data at the Department of Population and Civil Registration (Dispendukcapil) of Surabaya City, where variation in content and language structure makes it harder to identify recurring problems. Clustering methods are therefore needed to group complaints automatically by content similarity. This study compares the performance of Ward’s Method, DBSCAN, and K-Means in clustering public complaints and determines the best-performing method based on the validity of the resulting cluster structure. The research stages comprise text preprocessing (cleaning, tokenizing, normalization, stopword removal, and stemming), followed by TF-IDF weighting to transform the text into a numerical representation. Clustering is carried out with Ward’s Method (hierarchical agglomerative clustering with Euclidean distance), DBSCAN (density-based, with noise handling), and K-Means (partition-based, with the optimal number of clusters determined using the Variance Ratio Criterion, VRC). Cluster quality is evaluated using the Silhouette Coefficient and the Davies-Bouldin Index (DBI), which measure cluster compactness and separation. The results show that Ward’s Method produces 2 clusters, DBSCAN produces 10 clusters plus 1 group of noise points, and K-Means produces 3 clusters. Ward’s Method obtains the highest Silhouette Coefficient (0.76) and the lowest Davies-Bouldin Index (1.02), compared with DBSCAN (Silhouette 0.52, DBI 1.12) and K-Means (Silhouette 0.49, DBI 3.47).
These findings indicate that Ward’s Method is the most suitable method for the characteristics of public service complaint data and can be used as an analytical approach to support improvements in public service quality.
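The abstract ranks the three methods by two validity indices: a higher Silhouette Coefficient and a lower Davies-Bouldin Index both indicate more compact, better-separated clusters. The following is a minimal sketch of how those two indices can be computed with Euclidean distance. It is not the thesis implementation, which is not shown on this page. The function names, the toy 2-D points, and the two label assignments are assumptions made for illustration; in the study the inputs would be TF-IDF vectors of the complaints. The sketch treats every label as an ordinary cluster, since the page does not state how DBSCAN noise points were handled during evaluation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_by_label(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def silhouette_coefficient(points, labels):
    """Mean silhouette value over all points (requires >= 2 clusters)."""
    clusters = group_by_label(points, labels)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    scores = []
    for label, members in clusters.items():
        for p in members:
            if len(members) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance from p to the other members of its own cluster
            a = sum(euclidean(p, q) for q in members) / (len(members) - 1)
            # b: smallest mean distance from p to the members of another cluster
            b = min(
                sum(euclidean(p, q) for q in other) / len(other)
                for other_label, other in clusters.items()
                if other_label != label
            )
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def davies_bouldin_index(points, labels):
    """Mean, over clusters, of the worst scatter-to-separation ratio."""
    clusters = group_by_label(points, labels)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    centroids = {
        label: [sum(coord) / len(members) for coord in zip(*members)]
        for label, members in clusters.items()
    }
    # scatter: mean distance of a cluster's points to its own centroid
    scatter = {
        label: sum(euclidean(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }
    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / euclidean(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Toy data (hypothetical): two tight groups far apart on the x-axis.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = [0, 0, 1, 1]  # labels that match the visible grouping
bad = [0, 1, 0, 1]   # labels that cut across the grouping

for name, labels in (("good", good), ("bad", bad)):
    print(name, silhouette_coefficient(points, labels), davies_bouldin_index(points, labels))
```

On this toy data the labeling that matches the visible grouping scores a higher silhouette and a lower Davies-Bouldin Index than the labeling that cuts across it, which is the same direction of comparison the abstract uses to prefer Ward’s Method.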
| Item Type: | Thesis (Undergraduate) |
|---|---|
| Contributors: | |
| Subjects: | Q Science > QA Mathematics; Q Science > QA Mathematics > QA76.6 Computer Programming |
| Divisions: | Faculty of Computer Science > Department of Data Science |
| Depositing User: | Mareta Putri Wardhana |
| Date Deposited: | 10 Mar 2026 01:30 |
| Last Modified: | 10 Mar 2026 01:30 |
| URI: | https://repository.upnjatim.ac.id/id/eprint/50275 |