Classification of Thyroid Cancer Recurrence Using Stack Ensemble and SMOTE

Putri, Endin Rahmanda (2025) Klasifikasi Perulangan Kanker Tiroid Menggunakan Stack Ensemble dan SMOTE. Undergraduate thesis, UPN Veteran Jawa Timur.

Text (Cover): 20081010070.-cover.pdf (2MB)
Text (Chapter 1): 20081010070.-bab1.pdf (68kB)
Text (Chapter 2): 20081010070.-bab2.pdf (473kB). Restricted to Repository staff only until 29 April 2028.
Text (Chapter 3): 20081010070.-bab3.pdf (596kB). Restricted to Repository staff only until 29 April 2028.
Text (Chapter 4): 20081010070.-bab4.pdf (501kB). Restricted to Repository staff only until 29 April 2028.
Text (Chapter 5): 20081010070.-bab5.pdf (44kB)
Text (Bibliography): 20081010070.-daftarpustaka.pdf (78kB)
Text (Appendix): 20081010070.-lampiran.pdf (238kB). Restricted to Repository staff only until 29 April 2028.

Abstract

Differentiated Thyroid Cancer (DTC) is the most common type of thyroid cancer, with a recurrence rate of approximately 20% among survivors. Classification can help medical professionals detect recurrence earlier, which increases treatment success rates and enables more timely interventions. This study aims to improve the accuracy of thyroid cancer recurrence classification using a Stack Ensemble Learning approach combined with the Synthetic Minority Over-sampling Technique (SMOTE). The dataset was obtained from the UCI Machine Learning Repository and consists of 17 clinical attributes from patients monitored over ten years. Feature selection was performed using Information Gain with a threshold of 0.04, resulting in nine relevant features for prediction. Decision Tree, Support Vector Machine (SVM), and Logistic Regression were used as base learners, while Logistic Regression served as the meta-learner. The study evaluates the Stack Ensemble model across several training-to-testing data ratios (90:10, 80:20, 75:25, 70:30, and 60:40) and applies K-fold cross-validation with varying values of K. The results indicate that SMOTE improves model accuracy compared to the original imbalanced dataset, particularly by improving recall, meaning the model is better at identifying minority-class instances. K = 7 consistently gave the best performance on the SMOTE dataset, while K = 10 gave the best results on the original dataset. The Stack Ensemble model with SMOTE achieved its highest accuracy at data ratios of 80:20 and 75:25, both reaching 94.26%. Precision was higher on the original dataset, whereas recall and F1-score improved with SMOTE. The combination of SMOTE, K = 7, and a data ratio of 75:25 or 80:20 is the optimal configuration found for a high-performance classification model for detecting thyroid cancer recurrence.
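
The oversampling step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. This is not the thesis code. The function name `smote`, its parameters, and the toy data are assumptions made for illustration, and the sketch assumes all features are already numerically encoded.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority rows
    and their k nearest minority neighbours (illustrative sketch only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Made-up toy data: 6 majority rows and 3 minority rows, two numeric features.
majority = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.3], [0.2, 0.4], [0.1, 0.0]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]

# Generate enough synthetic rows to match the majority class size.
new_rows = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
```

Because every synthetic row lies between two real minority rows, the minority class is enlarged without duplicating records exactly, which is consistent with the recall improvement the abstract reports.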

Item Type: Thesis (Undergraduate)
Contributors:
Contribution | Contributors | NIDN/NIDK | Email
Thesis advisor | Prasetya, Dwi Arman | NIDN0005128001 | arman.prasetya.sada@upnjatim.ac.id
Thesis advisor | Junaidi, Achmad | NIDN0710117803 | achmadjunaidi.if@upnjatim.ac.id
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Computer Science > Department of Informatics
Depositing User: Endin Rahmanda Putri
Date Deposited: 29 Apr 2025 06:22
Last Modified: 29 Apr 2025 06:22
URI: https://repository.upnjatim.ac.id/id/eprint/36062
