DEVELOPMENT OF SEMANTIC SEARCH SYSTEM WITH QUERY EXPANSION FOR THESIS RETRIEVAL IN THE UPN ACADEMIC REPOSITORY

LASMININGRUM, DELA PUSPITA (2026) DEVELOPMENT OF SEMANTIC SEARCH SYSTEM WITH QUERY EXPANSION FOR THESIS RETRIEVAL IN THE UPN ACADEMIC REPOSITORY. Undergraduate thesis, UPN "Veteran" Jawa Timur.

Text (Cover): cover_organized.pdf, Download (1MB)
Text (Chapter 1): skripsi dela bab 1.pdf, Download (141kB)
Text (Chapter 2): skripsi dela bab 2.pdf, Download (481kB). Restricted to Repository staff only until 22 May 2029.
Text (Chapter 3): skripsi dela bab 3.pdf, Download (2MB). Restricted to Repository staff only until 22 May 2029.
Text (Chapter 4): skripsi dela bab 4.pdf, Download (946kB). Restricted to Repository staff only until 22 May 2029.
Text (Chapter 5): skripsi dela bab 5.pdf, Download (10kB)
Text (Bibliography): skripsi dela bab daftar pustaka.pdf, Download (269kB)

Abstract

Information retrieval systems help users find relevant documents efficiently. In higher education, students need thesis retrieval both to find writing references and to avoid duplicating existing research topics. The digital repository of UPN “Veteran” Jawa Timur provides access to student theses; however, its existing search system relies on exact keyword matching, which makes it less effective when users submit short, general queries or use different terms with similar meanings. This study designs and implements a thesis retrieval system based on semantic search that captures the contextual meaning of queries and documents. The proposed approach uses IndoSBERT to generate semantic representations (embeddings) of thesis titles, abstracts, and user queries. To handle term variation and the limited context of short queries, an embedding-based query expansion technique is applied. Retrieval over the embeddings is performed with FAISS using cosine similarity to keep search efficient on a large dataset. The study covers system design, embedding generation, query expansion integration, and the implementation and functional testing of the retrieval system. The experimental results show that the system retrieves relevant documents based on semantic meaning despite vocabulary differences between queries and documents. The fine-tuned semantic model with query expansion achieved an nDCG@15 score of 0.8470, indicating that relevant documents tend to appear near the top of the search results. Fine-tuning and query expansion improved the quality of the semantic representations and enriched query context, enabling the system to capture semantic relationships better than the baseline semantic approach.
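As an illustration of the nDCG@15 metric reported above, the following is a minimal sketch of how nDCG@k can be computed for one ranked result list. The relevance labels are hypothetical, and the ideal ranking is taken from the retrieved list only (a simplification); the thesis's actual evaluation protocol is not described in this abstract.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k graded relevance labels."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the same labels sorted best-first."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents at all
    return dcg_at_k(relevances, k) / ideal_dcg


# Hypothetical graded labels (0 = irrelevant, 3 = highly relevant),
# listed in the order the system returned the documents.
ranked_labels = [3, 2, 3, 0, 1, 2]
score = ndcg_at_k(ranked_labels, k=15)
```

A score of 1.0 means the list is already in the best possible order; lower scores mean relevant documents were pushed down the ranking.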
In terms of performance, the system searched 15,326 documents with an average retrieval time of approximately 0.5 seconds using FAISS. The IndoSBERT-based semantic search approach, combined with fine-tuning and query expansion, can therefore serve as an alternative for thesis retrieval systems, particularly for queries with vocabulary variations.

Keywords: semantic search, IndoSBERT, query expansion, FAISS, thesis retrieval.
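The retrieval flow summarized in the abstract (embed, expand the query, rank by cosine similarity) can be sketched as follows. This is only a toy illustration under stated assumptions: the three-dimensional vectors stand in for IndoSBERT embeddings, a brute-force scan stands in for the FAISS index, and the expansion rule (blending the query vector with the centroid of its nearest term vectors, weighted by a hypothetical `alpha`) is one possible form of embedding-based query expansion, not necessarily the one used in the thesis.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def expand_query(query_vec, term_vecs, top_n=2, alpha=0.7):
    """Assumed expansion rule: blend the query with the centroid of its
    top_n most similar term vectors (alpha weights the original query)."""
    nearest = sorted(term_vecs.values(),
                     key=lambda v: cosine(query_vec, v), reverse=True)[:top_n]
    q = normalize(query_vec)
    if not nearest:
        return q
    unit_terms = [normalize(v) for v in nearest]
    centroid = [sum(col) / len(unit_terms) for col in zip(*unit_terms)]
    return normalize([alpha * qi + (1 - alpha) * ci for qi, ci in zip(q, centroid)])


def search(query_vec, doc_vecs, k=3):
    """Brute-force top-k ranking by cosine similarity (stand-in for an index)."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy embeddings; real ones would come from the embedding model.
doc_vecs = {"thesis_A": [0.9, 0.1, 0.0],
            "thesis_B": [0.1, 0.9, 0.1],
            "thesis_C": [0.0, 0.2, 0.9]}
term_vecs = {"term_1": [0.8, 0.2, 0.0],
             "term_2": [0.7, 0.0, 0.1],
             "term_3": [0.0, 0.1, 0.9]}

expanded = expand_query([1.0, 0.3, 0.0], term_vecs)
results = search(expanded, doc_vecs, k=2)
```

With a vector index such as FAISS, cosine similarity is commonly obtained by L2-normalizing all vectors and searching by inner product, which is why the sketch normalizes before taking dot products.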

Item Type: Thesis (Undergraduate)
Contributors:
Contribution | Contributors | NIDN/NIDK | Email
Thesis advisor | PUSPANINGRUM, EVA YULIA | NIDN0005078908 | evapuspaningrum.if@upnjatim.ac.id
Thesis advisor | MULYO, BUDI MUKHAMAD | NIDN0718118904 | budi.m.mulyo.fasilkom@upnjatim.ac.id
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Computer Science > Department of Informatics
Depositing User: Dela Puspita Lasminingrum
Date Deposited: 22 May 2026 07:45
Last Modified: 22 May 2026 08:31
URI: https://repository.upnjatim.ac.id/id/eprint/52219
