Classification of Lumpy Skin Disease (LSD) in Cattle Using an Ensemble Vision Transformer – DenseNet121 Model

Khairah, Ama Maulidatul (2026) Classification of Lumpy Skin Disease (LSD) in Cattle Using an Ensemble Vision Transformer – DenseNet121 Model. Undergraduate thesis, UPN Veteran Jawa Timur.

Cover: 22081010329.-cover.pdf (965kB)
Chapter 1: 22081010329-bab1.pdf (244kB)
Chapter 2: 22081010329-bab2.pdf (1MB), restricted to repository staff only until 25 June 2028
Chapter 3: 22081010329-bab3.pdf (898kB), restricted to repository staff only until 25 June 2028
Chapter 4: 22081010329-bab4.pdf (1MB), restricted to repository staff only until 25 June 2028
Chapter 5: 22081010329-bab5.pdf (185kB)
References: 22081010329-daftarpustaka.pdf (179kB)
Appendix: 22081010329-lampiran.pdf (506kB), restricted to repository staff only

Abstract

Lumpy Skin Disease (LSD) is an infectious disease of cattle characterized by bumps or nodules on the skin, and it can cause significant economic losses for farmers. Early identification allows prompt treatment and helps control its spread. In the field, identification still relies largely on clinical examination and laboratory testing, which require time, money, and expert personnel. This study classifies LSD in cattle from digital images using an ensemble of a Vision Transformer (ViT) and DenseNet121. The dataset was obtained from Kaggle and contains 1445 images: 700 of normal cattle and 745 of cattle infected with LSD. Preprocessing consisted of resizing the images to 224 × 224 pixels and normalizing pixel values, after which the models were evaluated using 5-fold cross-validation. Hyperparameters were tuned with grid search to obtain the best configuration for each model, and the ensemble weight was determined by testing α values in the range 0.0 to 1.0. In the proposed approach, ViT captures global information from the images, while DenseNet121 captures more detailed local features. The ensemble model achieved the best performance, with an accuracy of 91.35%, precision of 91.41%, recall of 91.30%, and F1-score of 91.32%. Its accuracy was higher than that of the Vision Transformer alone (89.34%) and of DenseNet121 alone (90.24%). These results indicate that combining ViT and DenseNet121 can improve classification performance by leveraging global and local features simultaneously.
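
The abstract does not state how the two models' outputs are combined, only that a weight α was tested over the range 0.0 to 1.0. One common reading, assumed here, is a weighted average of the class probabilities: α times the ViT output plus (1 − α) times the DenseNet121 output. The sketch below illustrates that sweep in Python with NumPy. The function names, the step size of 0.1, and the toy probabilities are hypothetical and are not taken from the thesis.

```python
import numpy as np


def ensemble_probs(p_vit, p_dense, alpha):
    """Weighted average of two models' class probabilities.

    Assumed combining rule (not confirmed by the abstract):
    alpha * ViT + (1 - alpha) * DenseNet121.
    """
    p_vit = np.asarray(p_vit, dtype=float)
    p_dense = np.asarray(p_dense, dtype=float)
    return alpha * p_vit + (1.0 - alpha) * p_dense


def sweep_alpha(p_vit, p_dense, labels, num_steps=11):
    """Try evenly spaced alpha values in [0, 1] and keep the most accurate.

    Ties are resolved in favor of the smallest alpha. The default of
    11 steps (increments of 0.1) is an illustrative assumption.
    """
    labels = np.asarray(labels)
    best_alpha, best_acc = None, -1.0
    for alpha in np.linspace(0.0, 1.0, num_steps):
        preds = ensemble_probs(p_vit, p_dense, alpha).argmax(axis=1)
        acc = float((preds == labels).mean())
        if acc > best_acc:
            best_alpha, best_acc = float(alpha), acc
    return best_alpha, best_acc


# Toy validation-set probabilities, made up for illustration only.
# Columns are [normal, LSD]; each model misclassifies one different image.
labels = np.array([0, 1, 0, 1])
p_vit = np.array([[0.90, 0.10], [0.55, 0.45], [0.80, 0.20], [0.10, 0.90]])
p_dense = np.array([[0.45, 0.55], [0.10, 0.90], [0.70, 0.30], [0.20, 0.80]])

best_alpha, best_acc = sweep_alpha(p_vit, p_dense, labels)
print(f"best alpha = {best_alpha:.1f}, accuracy = {best_acc:.2f}")
```

In this toy case each model alone gets three of four images right, while intermediate α values get all four right because each model is confident where the other is wrong. To keep the reported score unbiased, α would normally be selected on validation folds rather than on the data used for the final evaluation.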

Item Type: Thesis (Undergraduate)
Contributors:
Contribution | Contributor | NIDN/NIDK | Email
Thesis advisor | Mandyartha, Eka Prakarsa | NIDN 0725058805 | eka_prakarsa.fik@upnjatim.ac.id
Thesis advisor | Puspaningrum, Eva Yulia | NIDN 0005078908 | evapuspaningrum.if@upnjatim.ac.id
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Computer Science > Department of Informatics
Depositing User: Ama Maulidatul Khairah
Date Deposited: 25 Jun 2026 06:45
Last Modified: 25 Jun 2026 07:08
URI: https://repository.upnjatim.ac.id/id/eprint/54204
