Pakpahan, Fredrik Sahalutua (2025) IMPLEMENTASI CONVOLUTIONAL NEURAL NETWORK (CNN) UNTUK KLASIFIKASI EMOSI DALAM LAGU INSTRUMENTAL BERDASARKAN FITUR AUDIO [Implementation of a Convolutional Neural Network (CNN) for Emotion Classification in Instrumental Songs Based on Audio Features]. Undergraduate thesis, Universitas Pembangunan Nasional "Veteran" Jawa Timur.
- Text (Cover): Skripsi-Final-Fredrik-Revisi-2-merged (1)_merged.pdf. Download (6MB)
- Text (Chapter 1): Skripsi-Final-Fredrik-Revisi-24-27.pdf. Download (263kB)
- Text (Chapter 2): Skripsi-Final-Fredrik-Revisi-28-37.pdf. Restricted to Repository staff only until 5 December 2028. Download (304kB)
- Text (Chapter 3): Skripsi-Final-Fredrik-Revisi-38-58.pdf. Restricted to Repository staff only until 5 December 2028. Download (903kB)
- Text (Chapter 4): Skripsi-Final-Fredrik-Revisi-59-153.pdf. Restricted to Repository staff only until 5 December 2028. Download (2MB)
- Text (Chapter 5): Skripsi-Final-Fredrik-Revisi-154-156.pdf. Download (258kB)
- Text (Bibliography): Skripsi-Final-Fredrik-Revisi-157-158.pdf. Download (218kB)
- Text (Appendix): Skripsi-Final-Fredrik-Revisi-159.pdf. Restricted to Repository staff only until 2028. Download (136kB)
Abstract
Music is a powerful medium for conveying emotions, yet automatically classifying musical emotions remains challenging due to the complexity of audio structures and the wide variability of emotional expression. This study aims to develop and evaluate a Convolutional Neural Network (CNN)–based method for classifying musical emotions using audio features. The DEAM (Database for Emotional Analysis of Music) dataset is employed, providing valence and arousal annotations for thousands of songs. Feature extraction is carried out through several signal-processing stages including pre-emphasis, windowed framing, Fourier transform, mel filter bank, logarithmic compression, and Discrete Cosine Transform to produce Mel-Frequency Cepstral Coefficients (MFCC). Multiple feature configurations are evaluated, such as MFCC (13, 24, and 30 coefficients), mel-spectrograms, and pitch-shifting augmentation. Results show that the CNN effectively captures emotional patterns in music, particularly for majority classes, achieving 63–66% accuracy for 4-quadrant valence–arousal classification and improving significantly to 77% when simplified to a 2-quadrant task. However, performance on minority classes remains low due to severe class imbalance and the limitations of MFCC in representing subtle emotional nuances. These findings indicate that CNN is effective for audio-based emotion classification, especially within simplified emotional spaces, and they provide a foundation for future development of more robust and accurate music emotion recognition systems.
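The pipeline described in the abstract (pre-emphasis, framing, FFT, mel filter bank, log compression, and DCT yielding MFCCs, followed by a CNN over the coefficient matrix) can be sketched in code. The following is a minimal illustration assuming the librosa and TensorFlow/Keras libraries; the file path, clip duration, filter counts, layer sizes, and training settings are illustrative assumptions, not the thesis's exact configuration.

```python
# Minimal sketch of the MFCC-extraction and CNN stages described in the
# abstract. Assumes librosa and TensorFlow/Keras; all hyperparameters
# (sample rate, duration, layer sizes) are illustrative, not the
# thesis's exact settings.
import numpy as np
import librosa
import tensorflow as tf

def extract_mfcc(path: str, n_mfcc: int = 13, sr: int = 22050,
                 duration: float = 30.0) -> np.ndarray:
    """Compute an MFCC matrix for one clip. librosa internally performs
    the framing/windowing, FFT, mel filter bank, log compression, and
    DCT steps; pre-emphasis is applied explicitly first."""
    y, sr = librosa.load(path, sr=sr, duration=duration)
    y = librosa.effects.preemphasis(y)                # pre-emphasis stage
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc[..., np.newaxis]                      # channel axis for Conv2D

def pitch_shift_augment(y: np.ndarray, sr: int, n_steps: int = 2) -> np.ndarray:
    """Pitch-shifting augmentation, one of the feature configurations
    the abstract mentions."""
    return librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)

def build_cnn(input_shape, n_classes: int = 4) -> tf.keras.Model:
    """Small 2-D CNN over the MFCC 'image'; n_classes=4 for the
    valence-arousal quadrants, 2 for the simplified task."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Usage (hypothetical file name):
# features = extract_mfcc("clip.wav")
# model = build_cnn(features.shape)
# model.fit(...)  # with quadrant labels derived from DEAM annotations
```

The 24- and 30-coefficient MFCC configurations follow by changing `n_mfcc`, and the mel-spectrogram variant would swap in `librosa.feature.melspectrogram` as the input representation; the CNN skeleton above is otherwise unchanged.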
| Item Type: | Thesis (Undergraduate) |
|---|---|
| Contributors: | |
| Subjects: | M Music and Books on Music > M Music; T Technology > T Technology (General) |
| Divisions: | Faculty of Computer Science > Department of Informatics |
| Depositing User: | Fredrik Pakpahan |
| Date Deposited: | 05 Dec 2025 09:08 |
| Last Modified: | 05 Dec 2025 09:08 |
| URI: | https://repository.upnjatim.ac.id/id/eprint/48095 |