A real-time music detection system that uses convolutional neural networks to identify songs from live audio input through the user's microphone.
- Real-time detection: Identifies songs in 3-second segments
- Machine Learning: CNN trained on mel spectrograms for audio classification
- Modern web interface: Frontend built with Astro and TypeScript
- Automatic download: Spotify integration for playlist downloading
- Data augmentation: Improves model robustness with audio transformations
- Real-time visualization: Training progress charts with Chart.js
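To make the 3-second segmentation concrete, here is a minimal sketch of cutting a stream of samples into fixed-length segments. The function name `iter_segments`, the choice to drop a trailing partial segment, and any specific sample rate are assumptions for illustration and are not taken from the NeuraZam code.

```python
from typing import Iterator, List, Sequence

SEGMENT_SECONDS = 3  # segment length stated in the feature list


def iter_segments(samples: Sequence[float], sample_rate: int,
                  seconds: int = SEGMENT_SECONDS) -> Iterator[List[float]]:
    """Yield consecutive, non-overlapping segments of `seconds` length.

    A trailing partial segment is dropped (an assumption of this sketch).
    """
    size = sample_rate * seconds
    for start in range(0, len(samples) - size + 1, size):
        yield list(samples[start:start + size])
```

Each yielded segment would then be converted to a mel spectrogram and classified on its own.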
The system follows a client-server architecture with separate frontend and backend services communicating via WebSockets:
- Frontend (Port 3000): Web interface with Astro, audio handling and WebSockets
- Backend (Ports 5000, 5001): ML processing, model training and prediction
- Communication: WebSocket for model state (5000) and predictions (5001)
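The two-channel layout can be summarized in a small configuration object. This is only a sketch: the class name `BackendEndpoints`, the `localhost` host, and the bare `ws://host:port` URL shape (no path) are assumptions, while the port numbers come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendEndpoints:
    """Hypothetical description of the backend's two WebSocket channels."""
    host: str = "localhost"      # assumed host for a local Docker setup
    state_port: int = 5000       # model state channel
    prediction_port: int = 5001  # prediction channel

    def state_url(self) -> str:
        return f"ws://{self.host}:{self.state_port}"

    def prediction_url(self) -> str:
        return f"ws://{self.host}:{self.prediction_port}"
```

Keeping state updates and predictions on separate sockets lets the frontend show training progress without waiting on the prediction stream.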
- Docker and Docker Compose
- NVIDIA GPU with CUDA support (recommended for training)
- .env file with Spotify configuration
- Clone the repository:
    git clone https://github.com/moraxh/NeuraZam.git
    cd NeuraZam
- Configure environment variables:
    cp .env.example .env
    # Edit .env with your Spotify credentials
- Run with Docker Compose:
    docker-compose up --build
- Access the application at http://localhost:3000
The system implements a complete ML pipeline:
- Audio download: Retrieves songs from Spotify playlists using spotdl
- Feature extraction: Converts audio to normalized mel spectrograms
- Data augmentation: Applies transformations (noise, pitch shift, filters)
- CNN training: Convolutional neural network with early stopping
- Real-time inference: Live audio classification
In internal tests, training on a dataset of 100 songs took approximately 2 hours and reached over 95% accuracy in song identification.
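As an illustration of the noise transformation in the data augmentation step, the following sketch adds Gaussian noise to a waveform at a target signal-to-noise ratio. The function name `add_noise`, the use of Gaussian noise, and the SNR parameter are assumptions; the source only states that noise is one of the transformations applied.

```python
import math
import random
from typing import List, Sequence


def add_noise(samples: Sequence[float], snr_db: float,
              rng: random.Random) -> List[float]:
    """Return a copy of `samples` with Gaussian noise at the given SNR (dB)."""
    if not samples:
        return []
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR(dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]
```

Passing a seeded `random.Random` keeps augmented datasets reproducible between training runs.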
- Input: Mel spectrograms (128 x time)
- Layers: 4 convolutional layers + 2 dense layers
- Optimization: Adam with ReduceLROnPlateau
- Regularization: Dropout, BatchNorm, Early Stopping
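Early stopping, listed under regularization, can be sketched independently of any framework. The class below is a generic version, not the project's implementation; the `patience` and `min_delta` parameters and their defaults are assumptions.

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 5, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: remember the new best and reset the counter.
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, `step` would be called once per epoch with the validation loss, and the loop would break as soon as it returns `True`.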
- MusicDetector: Main component for music detection
- StateDialogManager: Training state management
- AudioRecorder: Audio capture and processing
- WebSocketManager: Real-time communication
- ChartManager: Training metrics visualization
- Astro 5.7.12: Static site framework
- TypeScript: Static typing
- TailwindCSS: Utility-first styling
- Chart.js: Data visualization
NeuraZam/
├── app/
│   ├── frontend/ # Astro application
│   │   ├── src/
│   │   │   ├── components/ # Reusable components
│   │   │   ├── layouts/ # Page layouts
│   │   │   ├── lib/ # TypeScript utilities
│   │   │   └── pages/ # Application pages
│   │   └── Dockerfile
│   └── backend/ # Python server
│       ├── src/
│       │   ├── audio/ # Audio processing
│       │   ├── models/ # ML models
│       │   └── utils/ # Utilities
│       └── Dockerfile
├── docker-compose.yml # Service orchestration
└── README.md
The system manages multiple states during operation:
- LOADING_SERVER: Initializing server
- DOWNLOADING_SONGS: Downloading playlist
- PROCESSING_SONGS: Processing audio
- EXTRACTING_FEATURES: Generating spectrograms
- TRAINING_MODEL: Training CNN
- READY: Ready for predictions
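The states can be modeled as an enumeration. The sketch below uses the state names and descriptions from this README; the `ServerState` class, the `next_state` helper, and the assumption that states advance strictly in the listed order are illustrative and not confirmed by the source.

```python
from enum import Enum
from typing import Optional


class ServerState(Enum):
    LOADING_SERVER = "Initializing server"
    DOWNLOADING_SONGS = "Downloading playlist"
    PROCESSING_SONGS = "Processing audio"
    EXTRACTING_FEATURES = "Generating spectrograms"
    TRAINING_MODEL = "Training CNN"
    READY = "Ready for predictions"


def next_state(state: ServerState) -> Optional[ServerState]:
    """Return the state that follows `state`, or None after READY.

    Assumes a linear progression in definition order.
    """
    order = list(ServerState)
    index = order.index(state)
    return order[index + 1] if index + 1 < len(order) else None
```

A frontend component such as StateDialogManager could display `state.value` as the human-readable status while the backend works through the pipeline.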
- Initial setup: System automatically downloads and processes songs
- Training: CNN trains with processed data
- Detection: Use the listening button to identify songs in real time
- Results: View predictions with song information
# Frontend
pnpm install # Install dependencies
pnpm dev # Development server
pnpm build # Production build
# Backend
python src/main.py # Run backend server

The system is optimized for NVIDIA GPU:
- CUDA-accelerated training
- Optimized real-time inference
- 2GB shared memory for processing
This project is under the MIT License. See LICENSE for more details.
- spotDL: For music access
- PyTorch: Deep learning framework
- Astro: Modern frontend framework
- Chart.js: Data visualization



