OPTIMIZATION OF A HEART DISEASE CLASSIFICATION MODEL USING EXTREME GRADIENT BOOSTING WITH GRIDSEARCHCV HYPERPARAMETER TUNING AND SMOTE-ENN DATA BALANCING

Adias Pradana, Marchel (2025) OPTIMASI MODEL KLASIFIKASI PENYAKIT JANTUNG MENGGUNAKAN EXTREME GRADIENT BOOSTING DENGAN HYPERPARAMETER TUNING GRIDSEARCHCV DAN BALANCING DATA SMOTE-ENN. Undergraduate thesis, Universitas Pembangunan Nasional "Veteran" Jawa Timur.

Text (Cover): Cover.pdf (1MB)
Text (BAB I): BAB I.pdf (92kB)
Text (BAB II): BAB II.pdf (454kB). Restricted to Repository staff only until 27 May 2028.
Text (BAB III): BAB III.pdf (705kB). Restricted to Repository staff only until 27 May 2028.
Text (BAB IV): BAB IV.pdf (2MB). Restricted to Repository staff only until 27 May 2028.
Text (BAB V): BAB V.pdf (10kB)
Text (Daftar Pustaka): Daftar Pustaka.pdf (86kB)

Abstract

Heart disease is one of the leading causes of death, particularly in Indonesia, so effective classification methods are needed to reduce the risk of premature death from this condition. A common challenge in medical data, however, is class imbalance, which can degrade the performance of classification models. This study optimizes a classification model based on the Extreme Gradient Boosting (XGBoost) algorithm across three scenarios: baseline XGBoost, XGBoost with hyperparameter tuning using GridSearchCV, and XGBoost with GridSearchCV combined with the SMOTE-ENN data balancing technique. XGBoost was chosen for its ability to build strong predictive models through boosting. GridSearchCV was applied to search the hyperparameter combinations automatically and select the best one using cross-validation. To handle the class imbalance, SMOTE-ENN was used as a hybrid resampling method that combines oversampling of the minority class with undersampling of the majority class to reduce noise and improve balance. Baseline XGBoost achieved an accuracy of 0.95 but a low geometric mean (g-mean) of 0.47. Adding GridSearchCV raised the g-mean to 0.81, although accuracy dropped to 0.81. The combination of GridSearchCV and SMOTE-ENN yielded a g-mean of 0.79, with accuracy recovering to 0.87. Based on these results, XGBoost with GridSearchCV and SMOTE-ENN was selected as the best scenario because it balances overall accuracy with performance across both classes. The proposed optimization method is expected to serve as a reference for developing classification models on imbalanced medical datasets.
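
The gap between accuracy and g-mean reported in the abstract is easier to see with a small worked example. The sketch below is not taken from the thesis: it assumes the usual binary definition of g-mean as the square root of sensitivity times specificity, and the label counts (95 negative, 5 positive) are invented purely for illustration. The helper names `confusion_counts`, `accuracy`, and `g_mean` are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def accuracy(y_true, y_pred):
    """Fraction of all samples classified correctly."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + tn + fp)


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Invented, illustrative labels (NOT thesis data): 95 healthy, 5 diseased.
y_true = [0] * 95 + [1] * 5
# A model biased toward the majority class: it finds only 1 of 5 positives.
y_pred = [0] * 95 + [1] + [0] * 4

acc = accuracy(y_true, y_pred)
gm = g_mean(y_true, y_pred)
print(f"accuracy = {acc:.2f}, g-mean = {gm:.2f}")
```

In this toy case the classifier is right on 96 of 100 samples, yet its sensitivity is only 0.2, so the g-mean falls to about 0.45. This is the same pattern as the baseline scenario in the abstract, where a high accuracy coexists with a low g-mean.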

Item Type: Thesis (Undergraduate)
Contributors:
Contribution | Contributors | NIDN/NIDK | Email
Thesis advisor | Rahmat, Basuki | NIP19690723 202121 1 002 | basukirahmat.if@upnjatim.ac.id
Thesis advisor | Junaidi, Achmad | NIDN0710117803 | achmadjunaidi.if@upnjatim.ac.id
Subjects: Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science
Divisions: Faculty of Computer Science > Department of Informatics
Depositing User: Marchel Adias Pradana
Date Deposited: 27 May 2025 06:53
Last Modified: 27 May 2025 06:53
URI: https://repository.upnjatim.ac.id/id/eprint/36733
