GitHub - loadinglit/RAG-EXCEL-TO-CHAT: This repository provides a Document Query System powered by Light-RAG (Lightweight Retrieval-Augmented Generation), Azure OpenAI APIs, and Streamlit. The application facilitates document ingestion and querying using advanced LLMs and embedding functions.

Light-RAG Document Query System

This repository provides a Document Query System powered by Light-RAG (Lightweight Retrieval-Augmented Generation), Azure OpenAI APIs, and Streamlit. The application facilitates document ingestion and querying using advanced LLMs and embedding functions.

Key Features

Document Ingestion: Supports .txt and .csv files for ingestion, with automatic summarization for structured data.
Querying: Allows users to perform local, global, and hybrid search queries across ingested documents.
Azure OpenAI Integration: Utilizes Azure OpenAI for LLM-based summarization and embeddings.
Streamlit Interface: Provides a user-friendly interface for uploading files and querying documents.

Setup Instructions

Prerequisites

Python 3.8 or higher.
Azure OpenAI access with the following endpoints and keys configured in .env file:
- Chat Completions
- Embeddings

Environment Variables

Create a .env file with the following content:

AZURE_OPENAI_API_KEY=<your-api-key>
AZURE_OPENAI_API_VERSION=<api-version>
AZURE_OPENAI_ENDPOINT=<api-endpoint>
AZURE_OPENAI_DEPLOYMENT=<deployment-name>

AZURE_EMBEDDING_API_KEY=<your-embedding-api-key>
AZURE_EMBEDDING_API_VERSION=<api-version>
AZURE_EMBEDDING_ENDPOINT=<api-endpoint>
AZURE_EMBEDDING_DEPLOYMENT=<embedding-deployment-name>

Installation

Clone the repository:

git clone <repository-url>
cd <repository-directory>

Install dependencies:
```
pip install lightrag-hku
```

Running the Application

Start the Streamlit application:

streamlit run app.py

Usage Guide

Document Ingestion

Upload .txt or .csv files via the Document Ingestion section.
For .csv files, the data is automatically summarized into concise paragraphs before ingestion.

Query Documents

Enter a query in the Query Documents section.
Choose a search type:
- local: Search within specific documents.
- global: Search across all documents.
- hybrid: Combines local and global search results.
View query results in a structured format.

Code Overview

Ingestion:
- CSV files are summarized into narrative paragraphs using Azure OpenAI.
- Text files are directly processed.
- Content is ingested into the Light-RAG system in the form of relationships in json format
- Note : You can also store the generated graphml & json files in MongoDB Cluster
Querying:
- Queries are executed using Light-RAG's aquery function.
- Results are displayed in JSON or text format based on their structure.
Integration:
- Azure OpenAI APIs handle LLM queries and embeddings.
- LightRAG provides efficient document retrieval and query handling.
Streamlit Interface:
- File upload and query submission are handled through an intuitive web interface.

Dependencies

Streamlit: For building the user interface.
LightRAG: For lightweight RAG functionalities.
Azure OpenAI: For LLM and embedding services.
Pandas: For CSV processing.
NumPy: For embedding array manipulation.

Future Enhancements

Data Visualization using Neo4j-DB for Admin Purposes
Add support for additional file types (e.g., PDFs).
Include advanced query filters and ranking.
Expand embedding support for multilingual documents.

Acknowledgments

This project uses LightRAG for efficient retrieval-augmented generation and Azure OpenAI for state-of-the-art LLM functionalities.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
README.md		README.md
app.py		app.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Light-RAG Document Query System

Key Features

Setup Instructions

Prerequisites

Environment Variables

Installation

Running the Application

Usage Guide

Document Ingestion

Query Documents

Code Overview

Dependencies

Future Enhancements

Acknowledgments

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Light-RAG Document Query System

Key Features

Setup Instructions

Prerequisites

Environment Variables

Installation

Running the Application

Usage Guide

Document Ingestion

Query Documents

Code Overview

Dependencies

Future Enhancements

Acknowledgments

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages