A Node.js/TypeScript backend for a Document Q&A SaaS application that allows users to upload documents (PDF, DOCX, TXT) and ask questions about their content using AI.
- Document Processing: Support for PDF, DOCX, and TXT files
- Text Chunking: Intelligent text splitting with overlap for better context
- Vector Embeddings: Using transformers.js for local embedding generation
- Q&A System: Local Llama model integration for intelligent question answering
- RAG Pipeline: Retrieval-Augmented Generation for accurate answers
- TypeScript: Full type safety with strict configuration
- SQLite Database: Prisma ORM with SQLite for development
Tech stack:
- Runtime: Node.js 18+ with TypeScript 5+
- Framework: Express.js with TypeScript
- Database: SQLite with Prisma ORM
- Document Processing: pdf-parse, mammoth
- Embeddings: @xenova/transformers (all-MiniLM-L6-v2)
- Q&A: Local Llama model
- File Upload: Multer with validation
Prerequisites:
- Node.js 18+
- npm or yarn
- Local Llama model file

- Clone and install dependencies:
  npm install
- Set up environment variables:
  cp .env.example .env
  Edit .env and configure your local Llama model:
  LLAMA_MODEL_PATH="./models/llama-2-7b-chat.gguf"
  LLAMA_MODEL_TYPE="llama-2-7b-chat"
- Set up the database:
  npm run db:generate
  npm run db:migrate
- Start the development server:
  npm run dev

The server will start on http://localhost:3001
- GET /health - Server health status
- GET /api/documents - List all documents
- GET /api/documents/:id - Get specific document
- POST /api/documents - Upload new document
- DELETE /api/documents/:id - Delete document
- GET /api/documents/:id/chunks - Get document chunks
- POST /api/qa/ask - Ask question about document
- GET /api/qa/models - Get AI model information
- POST /api/qa/batch - Ask multiple questions

Upload a document:
curl -X POST http://localhost:3001/api/documents \
  -F "file=@document.pdf" \
  -F "userId=user123"

Ask a question about a document:
curl -X POST http://localhost:3001/api/qa/ask \
  -H "Content-Type: application/json" \
  -d '{
    "documentId": "doc_id_here",
    "question": "What is the main topic of this document?",
    "userId": "user123"
  }'

List all documents:
curl http://localhost:3001/api/documents

src/
├── index.ts              # Main server entry point
├── types/                # TypeScript type definitions
│   └── index.ts
├── middleware/           # Express middleware
│   └── errorHandler.ts
├── services/             # Business logic services
│   ├── documentProcessor.ts
│   ├── embeddingService.ts
│   └── qaService.ts
├── routes/               # API route handlers
│   ├── documents.ts
│   └── qa.ts
└── prisma/               # Database schema
    └── schema.prisma

- npm run dev - Start development server with hot reload
- npm run build - Build for production
- npm run start - Start production server
- npm run type-check - TypeScript type checking
- npm run db:migrate - Run database migrations
- npm run db:studio - Open Prisma Studio
- npm test - Run tests

| Variable | Description | Default |
|---|---|---|
| PORT | Server port | 3001 |
| NODE_ENV | Environment | development |
| DATABASE_URL | Database connection | file:./dev.db |
| LLAMA_MODEL_PATH | Path to Llama model file | ./models/llama-2-7b-chat.gguf |
| LLAMA_MODEL_TYPE | Llama model type | llama-2-7b-chat |
| MAX_FILE_SIZE | Max file size (bytes) | 10485760 (10 MB) |
| EMBEDDING_MODEL | Embedding model | Xenova/all-MiniLM-L6-v2 |

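The variables above could be read into a typed config object at startup. The following is a minimal sketch, not the project's actual loader: `AppConfig`, `loadConfig`, and `parsePositiveInt` are hypothetical names, and only the variable names and defaults come from the table.

```typescript
// Hypothetical config shape; field names are illustrative, defaults mirror the table.
interface AppConfig {
  port: number;
  nodeEnv: string;
  databaseUrl: string;
  llamaModelPath: string;
  llamaModelType: string;
  maxFileSize: number;
  embeddingModel: string;
}

// Parse a positive integer, falling back to the default when the variable is unset.
function parsePositiveInt(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Takes the environment as a plain object so it can be tested without touching process.env.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  return {
    port: parsePositiveInt(env["PORT"], 3001, "PORT"),
    nodeEnv: env["NODE_ENV"] ?? "development",
    databaseUrl: env["DATABASE_URL"] ?? "file:./dev.db",
    llamaModelPath: env["LLAMA_MODEL_PATH"] ?? "./models/llama-2-7b-chat.gguf",
    llamaModelType: env["LLAMA_MODEL_TYPE"] ?? "llama-2-7b-chat",
    maxFileSize: parsePositiveInt(env["MAX_FILE_SIZE"], 10485760, "MAX_FILE_SIZE"),
    embeddingModel: env["EMBEDDING_MODEL"] ?? "Xenova/all-MiniLM-L6-v2",
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)`, so that a malformed value fails at startup rather than on the first request.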
The application uses Prisma with the following models:
- User: User accounts (basic implementation)
- Document: Uploaded documents with metadata
- Chunk: Text chunks with embeddings for vector search
- Usage: Usage tracking for analytics
Document processing pipeline:
- Upload: File validation and storage
- Extraction: Text extraction based on file type
- Chunking: Intelligent text splitting with overlap
- Embedding: Vector embedding generation
- Storage: Database storage with relationships
Q&A pipeline:
- Question: User submits question
- Embedding: Generate embedding for question
- Search: Vector similarity search for relevant chunks
- Context: Prepare context from relevant chunks
- Answer: Generate answer using local Llama model
- Response: Return structured response with sources
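
The search and context steps of the question flow can be sketched as follows. This is an illustration under assumptions, not the code in `qaService.ts`: it assumes chunk embeddings are available in memory as plain number arrays and ranks them by brute-force cosine similarity. `StoredChunk`, `topKChunks`, and `buildPrompt` are hypothetical names, and the prompt wording is invented for the example.

```typescript
// Hypothetical in-memory view of a stored chunk.
interface StoredChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length; returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the question embedding and keep the k best.
function topKChunks(query: number[], chunks: StoredChunk[], k = 3) {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Join the retrieved chunks into a numbered context block followed by the question.
function buildPrompt(question: string, context: StoredChunk[]): string {
  const contextText = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return `Answer using only the context below.\n\nContext:\n${contextText}\n\nQuestion: ${question}\nAnswer:`;
}
```

A linear scan like this is simple and needs no extra infrastructure, but its cost grows with the number of chunks.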
The application includes comprehensive error handling:
- Input validation with detailed error messages
- File processing error handling
- API error responses with appropriate HTTP status codes
- Database error handling
- Graceful server shutdown
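
One common way to get consistent status codes is a small error class plus a single mapping function. The sketch below assumes that pattern; `AppError` and `toErrorResponse` are hypothetical names and are not taken from `errorHandler.ts`.

```typescript
// Hypothetical application error that carries the HTTP status it should map to.
class AppError extends Error {
  readonly statusCode: number;

  constructor(message: string, statusCode: number) {
    super(message);
    this.name = "AppError";
    this.statusCode = statusCode;
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, AppError.prototype);
  }
}

interface ErrorResponse {
  status: number;
  body: { error: string };
}

// Known errors keep their status and message; anything else becomes a generic 500
// so internal details are not exposed to the client.
function toErrorResponse(err: unknown): ErrorResponse {
  if (err instanceof AppError) {
    return { status: err.statusCode, body: { error: err.message } };
  }
  return { status: 500, body: { error: "Internal server error" } };
}
```

In an Express app, a function like this would usually be called from the four-argument error-handling middleware `(err, req, res, next)`, which then sends `body` with `status`.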
Security:
- Helmet.js for security headers
- CORS configuration
- File type validation
- File size limits
- Input sanitization
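
The file type and size checks could look roughly like the sketch below. It is an assumption-laden illustration: `validateUpload`, `UploadCandidate`, and `ValidationResult` are hypothetical names, the check uses only the file extension, and the 10 MB limit is the documented `MAX_FILE_SIZE` default.

```typescript
// Extensions for the supported formats (PDF, DOCX, TXT).
const ALLOWED_EXTENSIONS = [".pdf", ".docx", ".txt"];
// Documented default for MAX_FILE_SIZE: 10485760 bytes.
const DEFAULT_MAX_FILE_SIZE = 10 * 1024 * 1024;

// Hypothetical minimal description of an uploaded file.
interface UploadCandidate {
  originalName: string;
  sizeBytes: number;
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateUpload(
  file: UploadCandidate,
  maxSize: number = DEFAULT_MAX_FILE_SIZE
): ValidationResult {
  // Take everything from the last dot onward and compare case-insensitively.
  const dot = file.originalName.lastIndexOf(".");
  const ext = dot === -1 ? "" : file.originalName.slice(dot).toLowerCase();
  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    return { ok: false, reason: `Unsupported file type: ${ext || "(none)"}` };
  }
  if (file.sizeBytes <= 0) {
    return { ok: false, reason: "File is empty" };
  }
  if (file.sizeBytes > maxSize) {
    return { ok: false, reason: `File exceeds ${maxSize} bytes` };
  }
  return { ok: true };
}
```

An extension check alone is easy to bypass by renaming a file, so a stricter setup would likely also inspect the reported MIME type or the file's leading bytes.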
Performance:
- Efficient text chunking with overlap
- Local embedding generation (no external API calls)
- Database indexing for fast queries
- Memory-efficient file processing
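
To make "chunking with overlap" concrete, here is a minimal character-based sketch. It is not the splitter in `documentProcessor.ts`: `chunkText` is a hypothetical helper, the sizes are illustrative defaults, and a real splitter would likely prefer sentence or paragraph boundaries.

```typescript
// Split text into fixed-size chunks where each chunk repeats the last
// `overlap` characters of the previous one. Sizes are in characters.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be at least 0 and smaller than chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Example: a window of 4 characters that overlaps the previous chunk by 1.
const example = chunkText("abcdefghij", 4, 1);
// example is ["abcd", "defg", "ghij"]
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of storing and embedding some text twice.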
This is the MVP backend. Future enhancements include:
- User authentication and authorization
- Rate limiting
- Caching layer
- Advanced vector database (Pinecone/Chroma)
- Multi-document conversations
- Usage analytics and billing
- Admin dashboard
- API documentation with Swagger
Contributing:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
MIT License - see LICENSE file for details