OPTIMIZING TREE TRAVERSAL WITH INTERLEAVING CHAIN-OF-THOUGHT IN A HIERARCHICAL RETRIEVAL-AUGMENTED GENERATION ARCHITECTURE

Maassobirin, Mukhamad Khafid (2026) OPTIMASI TREE TRAVERSAL BERBASIS INTERLEAVING CHAIN-OF-THOUGHT PADA ARSITEKTUR RETRIEVAL-AUGMENTED GENERATION HIERARKIS. Undergraduate thesis, UPN Veteran Jawa Timur.

Text (COVER)
ilovepdf_merged (2)_organized.pdf
Download (1MB) | Preview

Text (Chapter I)
SKRIPSI KHAFID INDONESIA BAB 1.pdf
Download (356kB) | Preview

Text (Chapter II)
SKRIPSI KHAFID INDONESIA BAB 2.pdf
Restricted to Repository staff only until 1 July 2029.
Download (1MB) | Request a copy

Text (Chapter III)
SKRIPSI KHAFID INDONESIA BAB 3.pdf
Restricted to Repository staff only until 1 July 2029.
Download (1MB) | Request a copy

Text (Chapter IV)
SKRIPSI KHAFID INDONESIA BAB 4.pdf
Restricted to Repository staff only until 1 July 2029.
Download (1MB) | Request a copy

Text (Chapter V)
SKRIPSI KHAFID INDONESIA bab 5.pdf
Download (243kB) | Preview

Text (References)
SKRIPSI KHAFID INDONESIA DAFTAR PUSTAKA.pdf
Download (203kB) | Preview

Text (Appendices)
SKRIPSI KHAFID INDONESIA LAMPIRAN.pdf
Restricted to Repository staff only until 1 July 2029.
Download (543kB) | Request a copy

Abstract

Hierarchical RAG systems such as RAPTOR and HIRO traverse their trees using only static semantic similarity to the initial query, so they cannot adapt as reasoning needs evolve during exploration. Static similarity-based retrieval is insufficient for tasks that require deep reasoning over long documents, as shown by the BRIGHT benchmark, on which even top-performing retrieval models degrade significantly. This study proposes an Interleaving Chain-of-Thought (IRCoT)-based tree traversal mechanism for a hierarchical RAG architecture that integrates dynamic reasoning signals into the Depth-First Search algorithm. The system has three main phases: (1) hierarchical tree construction through embedding, clustering, and LLM-based summarization; (2) traversal using a combined scoring function with adaptive dual-threshold pruning, interleaved with reasoning generation at each step; and (3) answer generation. The key innovation is that the reasoning signals evolve dynamically throughout traversal, whereas HIRO relies on a static query representation. The approach was evaluated on four benchmark datasets (NarrativeQA, QASPER, QuALITY, and TyDi QA), using multilingual-e5-large-instruct for embedding and Qwen2.5-7B-Instruct-AWQ for generation. IRCoT improves ROUGE-L on NarrativeQA from 0.1144 to 0.1275 and Answer F1 on QASPER from 0.3135 to 0.3288, but lowers accuracy on QuALITY from 54.98% to 54.49% and Token F1 on TyDi QA from 0.3922 to 0.3862, and runs two to four times slower. Empirical analysis indicates that reasoning signals help for documents with dispersed information but offer limited benefit for documents with shallow structure or for questions that demand holistic comprehension.
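
The traversal phase described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not give the scoring formula, the threshold rule, or how the reasoning signal is produced, so everything below is an assumption made for illustration. In particular, the convex mix controlled by `alpha`, the depth-decayed `accept` and `prune` thresholds, the handling of nodes that score between the two thresholds, and the `update_reasoning` stand-in (which replaces an LLM reasoning step followed by embedding) are all hypothetical.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Node:
    text: str                      # chunk text (leaf) or LLM summary (internal node)
    emb: list[float]               # embedding of `text`
    children: list["Node"] = field(default_factory=list)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def combined_score(node, query_emb, reasoning_emb, alpha=0.5):
    # Assumed form: mix of static query similarity and similarity to the
    # current reasoning signal. The thesis may define this differently.
    return (alpha * cosine(query_emb, node.emb)
            + (1 - alpha) * cosine(reasoning_emb, node.emb))


def update_reasoning(reasoning_emb, node, rate=0.5):
    # Hypothetical stand-in for "generate a reasoning step, then embed it":
    # here the signal simply drifts toward the accepted node's embedding.
    return [(1 - rate) * r + rate * e for r, e in zip(reasoning_emb, node.emb)]


def traverse(root, query_emb, accept=0.6, prune=0.3, decay=0.9):
    """Depth-first traversal with two thresholds that relax with depth (assumed)."""
    reasoning = list(query_emb)    # the signal starts as the query itself
    selected = []

    def dfs(node, depth):
        nonlocal reasoning
        hi = accept * decay ** depth
        lo = prune * decay ** depth
        score = combined_score(node, query_emb, reasoning)
        if score < lo:
            return                              # prune the whole subtree
        if score < hi:
            selected.append(node.text)          # keep as coarse context only
            return
        reasoning = update_reasoning(reasoning, node)   # interleaved step
        if not node.children:
            selected.append(node.text)          # relevant leaf: collect it
        for child in node.children:
            dfs(child, depth + 1)

    dfs(root, 0)
    return selected


# Toy tree with 2-dimensional embeddings (made-up values).
tree = Node("root summary", [1.0, 1.0], [
    Node("leaf A", [1.0, 0.1]),
    Node("leaf B", [-1.0, 0.0]),
    Node("summary C", [0.2, 1.0], [Node("leaf D", [0.0, 1.0])]),
])
print(traverse(tree, [1.0, 0.0]))
```

Because the reasoning signal is updated after each accepted node, the score of a later sibling depends on what was visited before it, which is the property the abstract contrasts with a static query representation. The collected texts would then be passed to the answer generation phase.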

Item Type:

Thesis (Undergraduate)

Contributors:

Contribution	Contributors	NIDN/NIDK	Email
Thesis advisor	Saputra, Wahyu Syaifullah Jauharis	NIDN0725088601	wahyu.s.j.saputra.if@upnjatim.ac.id
Thesis advisor	Adziima, Andri Fauzan	NUPTK9844773674130292	andri.fauzan.fasilkom@upnjatim.ac.id

Subjects:

Q Science > Q Science (General)
Q Science > QA Mathematics > QA76 Computer software
Q Science > QA Mathematics > QA76.6 Computer Programming
Q Science > QA Mathematics > QA76.76.E95 Expert Systems
Q Science > QA Mathematics > QA76.87 Neural computers

Divisions:

Faculty of Computer Science > Department of Data Science

Depositing User:

Mukhamad Khafid Maassobirin

Date Deposited:

01 Jul 2026 04:38

Last Modified:

01 Jul 2026 04:58

URI:

https://repository.upnjatim.ac.id/id/eprint/54348