vLLM Manager

A comprehensive web application for managing vLLM (Virtual Large Language Model) instances with HuggingFace integration. This application allows you to easily deploy, manage, and monitor multiple vLLM instances through a modern web interface.

Features

🚀 Easy Instance Management: Create, start, stop, restart, and remove vLLM instances
🔍 HuggingFace Integration: Search and browse models directly from HuggingFace
🔐 Authentication Support: Full support for gated and private models with API keys
📊 Real-time Monitoring: Live status updates and container logs
🌐 Modern UI: Clean, responsive interface built with React and Tailwind CSS
🐳 Docker-based: Containerized deployment for easy setup and scaling
📱 Mobile Friendly: Responsive design works on all devices

Screenshots

Model Discovery & Browsing

Search and browse HuggingFace models with detailed information and popularity metrics

Instance Management

View detailed instance information, logs, and manage running containers

Testing & API Usage

Test your vLLM instances with an interactive chat interface

Architecture

Frontend: React.js with Tailwind CSS for styling
Backend: Node.js with Express.js API
Database: SQLite for instance configuration storage
Container Management: Docker API integration for vLLM instances
Model Discovery: HuggingFace API integration

Quick Start

Prerequisites

Docker and Docker Compose v2
Node.js 18+ (for development)
At least 4GB RAM for running models

Development Setup

Clone the repository

git clone git@github.com:ddunford/vLLMManager.git
cd vLLMManager

Install dependencies
```
npm run install:all
```

Set up environment variables

cp .env.example .env
# Edit .env with your configuration

Start the development servers

# Terminal 1: Start backend
npm run dev

# Terminal 2: Start frontend
npm run dev:frontend

Access the application
- Frontend: http://localhost:3000
- Backend API: http://localhost:3001

Production Deployment

Configure environment variables

cp .env.example .env
# Edit .env with your production settings

Build and start with Docker Compose

# For production
docker compose -f docker-compose.prod.yml up -d

# For development
docker compose up -d

Access the application
- Application: http://localhost:3001

For detailed production deployment instructions, see DEPLOYMENT.md.

Manual Docker Build

# Build the image
docker build -t vllm-manager .

# Run the container
docker run -d \
  -p 3001:3001 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "$(pwd)/server/data:/app/server/data" \
  --name vllm-manager \
  vllm-manager

Usage

Creating a New Instance

Navigate to the Create Instance page
Enter an instance name and model name (e.g., microsoft/DialoGPT-medium)
Optionally provide a HuggingFace API key for gated models
Click Create Instance

Browsing Models

Use the model discovery interface to search and browse available models:

Search: Find models by name, description, or tags
Popular Models: Browse trending and most downloaded models
Model Details: View comprehensive information including parameters, license, and usage examples
Direct Integration: Click any model to use it for creating a new instance

Managing Instances

Dashboard: View all instances with their status and basic controls
Instance Details: Click on any instance to view logs, detailed information, and API usage examples
Actions: Start, stop, restart, or remove instances directly from the dashboard

The instance detail page provides:

Real-time Logs: Monitor container output and debug issues
Status Information: Current state, port assignments, and resource usage
API Examples: Copy-paste ready code examples for different programming languages
Configuration Details: View model parameters and container settings

Testing Your Instances

Use the built-in testing interface to verify your vLLM instances:

Interactive Chat: Test conversational models with a chat interface
API Testing: Send custom requests and view responses
Response Analysis: Examine model outputs and performance metrics
Error Diagnosis: Debug connection and model issues

Using the API

Once an instance is running, you can access the OpenAI-compatible API:

curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer localkey" \
  -d '{
    "model": "your-model-name",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 100
  }'
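
The same request can be sent from Node.js 18+, which ships a global `fetch`. This is a minimal sketch, not the project's own client: `buildChatRequest` and `chat` are hypothetical helper names, and the port `8001`, the `localkey` bearer token, and the model name are carried over from the curl example and will differ per instance.

```javascript
// Build the URL and fetch options for an OpenAI-compatible chat completion.
// Kept separate from the network call so it can be inspected or unit-tested.
function buildChatRequest(baseUrl, model, userMessage, apiKey = "localkey") {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: userMessage }],
        max_tokens: 100,
      }),
    },
  };
}

// Send the request and return the assistant's reply text.
async function chat(baseUrl, model, userMessage) {
  const { url, options } = buildChatRequest(baseUrl, model, userMessage);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${response.statusText}`);
  }
  const data = await response.json();
  // OpenAI-compatible responses carry the reply in choices[0].message.content.
  return data.choices[0].message.content;
}
```

A call such as `chat("http://localhost:8001", "your-model-name", "Hello!").then(console.log)` prints the model's reply once the instance has finished loading.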

API Endpoints

Container Management

GET /api/containers - List all instances
POST /api/containers - Create new instance
POST /api/containers/:id/start - Start instance
POST /api/containers/:id/stop - Stop instance
POST /api/containers/:id/restart - Restart instance
DELETE /api/containers/:id - Remove instance
GET /api/containers/:id/logs - Get container logs
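
As an illustration, the routes above can be wrapped in a small lookup helper. This is a sketch under assumptions: `INSTANCE_ROUTES` and `instanceRequest` are hypothetical names, and the base URL `http://localhost:3001` is the default backend address from Quick Start. `POST /api/containers` is left out because its request body is not documented here.

```javascript
// Method and path template for each documented instance operation.
const INSTANCE_ROUTES = {
  list: { method: "GET", path: "/api/containers" },
  start: { method: "POST", path: "/api/containers/:id/start" },
  stop: { method: "POST", path: "/api/containers/:id/stop" },
  restart: { method: "POST", path: "/api/containers/:id/restart" },
  remove: { method: "DELETE", path: "/api/containers/:id" },
  logs: { method: "GET", path: "/api/containers/:id/logs" },
};

// Resolve an operation name (and optional instance id) to a method and URL.
function instanceRequest(action, id, baseUrl = "http://localhost:3001") {
  const route = INSTANCE_ROUTES[action];
  if (!route) {
    throw new Error(`Unknown action: ${action}`);
  }
  if (route.path.includes(":id") && !id) {
    throw new Error(`Action "${action}" requires an instance id`);
  }
  // Encode the id so unusual characters cannot alter the path.
  const path = route.path.replace(":id", encodeURIComponent(String(id)));
  return { method: route.method, url: `${baseUrl}${path}` };
}
```

With Node.js 18+, the result can be passed straight to `fetch`, for example `const { method, url } = instanceRequest("restart", "abc123"); await fetch(url, { method });`.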

Model Discovery

GET /api/models/search?query=<query> - Search HuggingFace models
GET /api/models/popular - Get popular models
GET /api/models/:modelId - Get model details
POST /api/models/validate - Validate model access
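
Search terms often contain spaces or slashes, so the `query` parameter should be URL-encoded. The sketch below shows one way to do that with the standard `URLSearchParams` class. `modelSearchUrl` and `searchModels` are hypothetical names, the base URL is the default backend address, and the response shape is not documented here, so the parsed JSON is returned as-is.

```javascript
// Build the search URL; URLSearchParams encodes the query as form data,
// turning spaces into "+" and characters such as "/" or "&" into %XX escapes.
function modelSearchUrl(query, baseUrl = "http://localhost:3001") {
  const params = new URLSearchParams({ query });
  return `${baseUrl}/api/models/search?${params.toString()}`;
}

// Call the search endpoint (Node.js 18+ global fetch) and return the parsed JSON.
async function searchModels(query, baseUrl = "http://localhost:3001") {
  const response = await fetch(modelSearchUrl(query, baseUrl));
  if (!response.ok) {
    throw new Error(`Search failed with status ${response.status}`);
  }
  return response.json();
}
```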

Configuration

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `PORT` | Server port | `3001` |
| `NODE_ENV` | Environment | `development` |
| `HF_TOKEN` | HuggingFace API token | - |
| `MIN_PORT` | Minimum port for instances | `8001` |
| `MAX_PORT` | Maximum port for instances | `9000` |
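
To make the defaults concrete, here is a sketch of how these variables could be read and validated in Node.js. It is not the project's actual loader: `loadConfig` is a hypothetical name, and the range checks are assumptions added for illustration.

```javascript
// Read the documented variables from an env object, applying the listed defaults.
function loadConfig(env = process.env) {
  // Parse a port-like variable and reject anything outside 1-65535.
  const toPort = (name, fallback) => {
    const value = Number.parseInt(env[name] ?? String(fallback), 10);
    if (!Number.isInteger(value) || value < 1 || value > 65535) {
      throw new Error(`${name} must be a port number between 1 and 65535`);
    }
    return value;
  };

  const config = {
    port: toPort("PORT", 3001),
    nodeEnv: env.NODE_ENV ?? "development",
    hfToken: env.HF_TOKEN ?? null, // no default; used for gated or private models
    minPort: toPort("MIN_PORT", 8001),
    maxPort: toPort("MAX_PORT", 9000),
  };

  // Assumed sanity check: the instance port range must not be inverted.
  if (config.minPort > config.maxPort) {
    throw new Error("MIN_PORT must not exceed MAX_PORT");
  }
  return config;
}
```

Passing the environment in as an argument, rather than reading `process.env` directly, keeps the function easy to test with plain objects.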

Model Selection

The application supports any HuggingFace model compatible with vLLM:

Text Generation: GPT-style models, LLaMA, Mistral, etc.
Conversational: Chat-tuned models (ChatGPT-style assistants)
Code Generation: CodeLLaMA, CodeT5, etc.

Resource Requirements

Minimum: 4GB RAM, 2 CPU cores
Recommended: 8GB+ RAM, 4+ CPU cores
Storage: 10GB+ for model caching

Troubleshooting

Common Issues

Port already in use
- The application automatically assigns available ports
- Check whether other services are using ports in the instance range (default 8001-9000, set by `MIN_PORT` and `MAX_PORT`)

Model download fails
- Ensure internet connectivity
- Check if the model requires authentication
- Verify the HuggingFace API key for gated models

Container creation fails
- Ensure the Docker daemon is running
- Check Docker socket permissions
- Verify available disk space
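
For the "Port already in use" case, it can help to picture how automatic port assignment might work. The manager's actual allocation strategy is not documented here, so the following is only one plausible approach: `firstFreePort` is a hypothetical helper that picks the lowest port in the configured range that is not already taken.

```javascript
// Return the lowest port in [minPort, maxPort] that is not already taken,
// or null when every port in the range is in use.
function firstFreePort(usedPorts, minPort = 8001, maxPort = 9000) {
  const used = new Set(usedPorts);
  for (let port = minPort; port <= maxPort; port += 1) {
    if (!used.has(port)) {
      return port;
    }
  }
  return null;
}
```

Under this scheme, a `null` result means the range is exhausted. The remedy would be to remove unused instances or widen the range through `MIN_PORT` and `MAX_PORT`.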

Logs

Application logs: docker compose logs vllm-manager
Instance logs: Available through the web interface
Container logs: docker logs <container-name>

Development

Project Structure

vllm-manager/
├── server/              # Backend API
│   ├── routes/         # API routes
│   ├── services/       # Business logic
│   ├── middleware/     # Security and logging middleware
│   ├── database/       # Database management
│   └── tests/          # Backend tests
├── frontend/           # React frontend
│   ├── src/
│   │   ├── components/ # Reusable components
│   │   ├── pages/      # Page components
│   │   └── services/   # API clients
│   └── public/
├── .github/            # GitHub Actions CI/CD
├── docker-compose.yml  # Development Docker configuration
├── docker-compose.prod.yml # Production Docker configuration
└── Dockerfile         # Container build instructions

Development Scripts

# Start development servers
npm run dev              # Backend with auto-reload
npm run dev:frontend     # Frontend development server

# Testing
npm test                 # Run backend tests
npm run test:coverage    # Run tests with coverage
npm run test:watch       # Watch mode for tests

# Code quality
npm run lint             # Run ESLint
npm run lint:fix         # Fix linting issues
npm run format           # Format code with Prettier

# Docker
npm run docker:up        # Start development containers
npm run docker:down      # Stop containers
npm run docker:prod      # Start production containers

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Run linting and formatting: npm run lint:fix && npm run format
Ensure all tests pass: npm test
Submit a pull request

The project includes:

✅ Automated Testing: Jest test suite with coverage reporting
✅ Code Quality: ESLint and Prettier for consistent code style
✅ CI/CD Pipeline: GitHub Actions for automated testing and deployment
✅ Security Scanning: Automated vulnerability scanning in CI
✅ Docker Support: Multi-stage builds with security best practices

Adding New Features

Backend routes: Add to server/routes/
Frontend pages: Add to frontend/src/pages/
UI components: Add to frontend/src/components/
API services: Add to server/services/
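
As a rough illustration of a new backend route, the sketch below assumes that files in `server/routes/` attach handlers to an Express router, which the Architecture section suggests but does not confirm. The file name, the `/health` path, and both function names are hypothetical.

```javascript
// server/routes/health.js (hypothetical file name)

// Handler: reports that the API process is up.
function healthHandler(req, res) {
  res.json({ status: "ok" });
}

// Attach the route to any Express-style router and return the router.
function registerHealthRoutes(router) {
  router.get("/health", healthHandler);
  return router;
}
```

In an Express app, this could be mounted with `app.use("/api", registerHealthRoutes(express.Router()))`. Taking the router as a parameter also lets the route be tested with a stub object instead of a running server.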

Security Considerations

✅ Production-Ready Security: Comprehensive security middleware with Helmet.js
✅ Rate Limiting: Protection against DoS attacks and API abuse
✅ Environment Variables: All sensitive data configured via environment variables
✅ Container Security: Non-root user and security options enabled
✅ Security Headers: CORS, CSP, HSTS, and other security headers configured
✅ Input Validation: Server-side validation of all user inputs
✅ Security Logging: Monitoring and logging of suspicious activities
✅ Container Isolation: Docker networks and security options prevent interference
✅ Regular Updates: Automated dependency scanning and security updates

See SECURITY.md for detailed security documentation.

Performance Tips

Use smaller models for testing and development
Monitor resource usage through the dashboard
Scale horizontally by running multiple instances
Use SSD storage for better model loading performance
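
When several instances serve the same model, requests have to be spread across them somehow. No built-in load balancer is documented here, so the sketch below assumes the client does the distribution itself with simple round-robin selection. `createRoundRobin` is a hypothetical helper, and the URLs in the usage note are placeholders.

```javascript
// Cycle through instance base URLs so successive requests are spread evenly.
function createRoundRobin(instanceUrls) {
  if (instanceUrls.length === 0) {
    throw new Error("At least one instance URL is required");
  }
  let next = 0;
  return function pick() {
    const url = instanceUrls[next];
    // Wrap around to the first instance after the last one.
    next = (next + 1) % instanceUrls.length;
    return url;
  };
}
```

For example, `const pick = createRoundRobin(["http://localhost:8001", "http://localhost:8002"])` returns the two URLs alternately on each `pick()` call.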

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For issues and questions:

Check the troubleshooting section
Review container logs
Open an issue on GitHub
