Agri-Query: A Case Study on RAG vs. Long-Context LLMs for Cross-Lingual Technical Question Answering
📝 Paper: Agri-Query: A Case Study on RAG vs. Long-Context LLMs for Cross-Lingual Technical Question Answering
This main README serves as a central navigation point. Please refer to the specific README files and results folders within each project directory for detailed information, setup instructions, and findings.
This project focuses on evaluating the fundamental retrieval and reasoning capabilities of LLMs. It compares the performance of Long-Context LLMs (processing up to 128k tokens directly) against Retrieval-Augmented Generation (RAG) strategies (Keyword, Semantic, Hybrid) in a cross-lingual agricultural setting.
- RAG (Retrieval-Augmented Generation):
- Project Details: RAG README
- RAG Evaluation: RAG Evaluation
- RAG Results: RAG Results Folder
- Long-Context Evaluation ("Zeroshot"):
- Project Details: ZeroShot README
- Results: ZeroShot Results Folder
- Visualizations: ZeroShot Visualization Plots
This subsequent project shifts focus toward the practical, embedded deployment of these models in agricultural machinery over the ISO 11783 (ISOBUS) network.
Difference from the first paper: While the first paper establishes how to best retrieve answers (proving Hybrid RAG is superior to Long-Context ingestion), the second paper establishes how to deliver the necessary data to the edge hardware given network bandwidth limits (comparing Markdown, JSON, and XML transfer efficiencies) and identifies the Minimum Viable Intelligence (MVI) for offline deployment.
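The transfer-format comparison can be illustrated with a small, stdlib-only sketch that serializes the same manual pages as Markdown, JSON, and XML and measures the UTF-8 payload size of each. The page records, field names, and serialization details below are invented for illustration; they are not the encodings or data used in the paper.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical manual excerpt: a few page records (illustrative only).
pages = [
    {"page": 1, "heading": "Hydraulic System", "text": "Check oil level daily."},
    {"page": 2, "heading": "PTO Operation", "text": "Engage PTO only at idle."},
]

def as_markdown(pages):
    # One heading + body per page, separated by blank lines.
    return "\n\n".join(
        f"## {p['heading']} (p. {p['page']})\n{p['text']}" for p in pages
    )

def as_json(pages):
    # Compact separators to minimize on-wire size.
    return json.dumps(pages, separators=(",", ":"))

def as_xml(pages):
    root = ET.Element("manual")
    for p in pages:
        el = ET.SubElement(root, "page", number=str(p["page"]), heading=p["heading"])
        el.text = p["text"]
    return ET.tostring(root, encoding="unicode")

def payload_bytes(s):
    # Size as it would travel over a bandwidth-limited link.
    return len(s.encode("utf-8"))

if __name__ == "__main__":
    for name, fn in [("markdown", as_markdown), ("json", as_json), ("xml", as_xml)]:
        print(f"{name}: {payload_bytes(fn(pages))} bytes")
```

On a constrained bus like ISOBUS, the per-record markup overhead of each format dominates the comparison, which is why byte counts of identical content are the natural metric here.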
- Project Details & Codebase: RAG2_COMPAG README
- Directory: `RAG2_COMPAG/`
This evaluation assesses the capability of Large Language Models (LLMs) to answer questions when provided with extensive context. The tests are conducted without any model fine-tuning, focusing on the LLMs' inherent ability to process and retrieve information from varying lengths of text.
Key Aspects of this Evaluation:
- Guaranteed Answer Presence: For each question, the context supplied to the LLM always contains the page with the correct answer. This setup tests the model's ability to locate information within the provided text, rather than its ability to recall information from prior training.
- Variable Context Lengths with "Noise": To simulate challenges of finding relevant information in large documents, tests are run with different context sizes. This includes scenarios where "noise" – additional, potentially irrelevant pages – is appended to the core context. For example, tests might involve adding 10k tokens of noise or using the entire document (approximately 59k tokens) as context.
- Performance Metrics: Model performance is measured using standard information retrieval metrics, including accuracy, precision, recall, and F1-score.
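The noise-padding setup above can be sketched as follows: assemble a context that always contains the gold page, padded with distractor pages up to a token budget. The 4-characters-per-token heuristic, the function names, and the shuffling scheme are assumptions for illustration, not the project's actual evaluation harness.

```python
import random

def approx_tokens(text):
    # Rough heuristic: ~4 characters per token (an assumption, not the
    # tokenizer used in the paper).
    return max(1, len(text) // 4)

def build_context(gold_page, noise_pages, noise_token_budget, seed=0):
    """Assemble an evaluation context that always contains the gold page,
    padded with distractor pages up to roughly `noise_token_budget` tokens."""
    rng = random.Random(seed)
    pool = list(noise_pages)
    rng.shuffle(pool)

    selected, used = [], 0
    for page in pool:
        cost = approx_tokens(page)
        if used + cost > noise_token_budget:
            break
        selected.append(page)
        used += cost

    # Place the gold page at a random position so its location in the
    # context is not a fixed artifact of the setup.
    pos = rng.randrange(len(selected) + 1)
    selected.insert(pos, gold_page)
    return "\n\n".join(selected)
```

For example, `build_context(gold, pages, 10_000)` approximates the "10k tokens of noise" condition, while passing every page of the manual approximates the full-document (~59k token) condition.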
Codebase and Tools:
- "Zeroshot" Terminology: In this project's codebase, evaluations of this nature are referred to as "Zeroshot." The
Zeroshot/directory contains all relevant scripts and utilities for conducting these long context evaluations. - PDF to Text Conversion: To prepare PDF documents for this framework (specifically, converting them into a page-wise plain text format suitable for ingestion), use the
docling_page_wise_pdf_convertertool. This tool is located in thezeroshot/docling_page_wise_pdf_converter/directory.
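As a rough illustration of the page-wise plain-text layout such a converter produces, here is a minimal stdlib sketch that writes one text file per page. The `page_NNN.txt` naming is an assumption for illustration; the actual tool in `zeroshot/docling_page_wise_pdf_converter/` (which relies on Docling for the PDF parsing itself) may name and organize its output differently.

```python
from pathlib import Path

def write_pagewise_text(page_texts, out_dir):
    """Write one plain-text file per page (page_001.txt, page_002.txt, ...).

    `page_texts` is an iterable of already-extracted page strings; the
    numbering scheme here is assumed, not taken from the repo tool.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i, text in enumerate(page_texts, start=1):
        path = out / f"page_{i:03d}.txt"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

Keeping one file per page is what makes the "guaranteed answer presence" setup easy to construct: the gold page can be pulled in by filename and arbitrary noise pages appended around it.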
Example Visualization: The following image illustrates how the results from long context evaluations are typically visualized, showing accuracy against varying levels of noise:
