A Python chatbot application that combines AIML (Artificial Intelligence Markup Language) pattern matching with a Flask web interface, Neo4j graph storage for users and long-term memory, and optional NLTK-based word analysis. An alternate terminal workflow can combine AIML with SWI-Prolog family reasoning for demos.
The project originated as an AIML tutorial-style bot and has been extended with authentication, a web UI, voice input, graph-backed social and episodic memory, and ESP32 firmware that captures audio over Wi-Fi and forwards it to the same voice API.
- Web application (`app.py`): Sign up, login (Flask-Login), chat UI, and speech-to-text via WAV posted to `/voice` (Google Speech Recognition on the server).
- ESP32 microphone pipeline (MicroPython): Scripts under the project root target an ESP32 with an analog microphone on an ADC pin. They connect to Wi-Fi, record 8 kHz, mono, 8-bit PCM audio for a fixed duration, wrap it in a WAV container, and POST the bytes to `http://<your-pc-ip>:5050/voice` so the Flask app returns transcribed text and the chatbot reply (same contract as the browser).
- AIML engine (`chatbot.py`): Loads a serialized brain from `pretrained_model/aiml_pretrained_model.dump` when present; otherwise bootstraps from `pretrained_model/LearningFileList.aiml`, which loads patterns under `data/*.aiml`.
- Neo4j integration (`family_bot.py`): User accounts, chat episodes, sensory memory (per-message word nodes with POS, sentiment, WordNet links), person nodes, custom relationships, and person properties (facts).
- Natural-language shortcuts (in `get_chatbot_response`): Phrases such as `my brother is John`, `Alice age is 30`, `what is the age of Alice`, and `who is my brother` are parsed in Python and mapped to Neo4j before falling back to AIML.
- Optional CLI (`conversation.py`): Loads AIML from `data/*.aiml`, consults `familyfacts.pl` and `familyrule.pl` for family and age queries, and can talk to Neo4j (intended for terminal experimentation; ensure method names match `FamilyBot` if you use this path).
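The natural-language shortcuts can be pictured as a few regular expressions tried in order before AIML is consulted. The sketch below is not the code in `get_chatbot_response`; the function name `parse_shortcut`, the intent labels, and the exact patterns are assumptions made for illustration only.

```python
import re

# Hypothetical patterns, ordered from most to least specific.
_PATTERNS = [
    ("get_property", re.compile(r"^what is the (\w+) of (\w+)\??$", re.I)),
    ("get_relation", re.compile(r"^who is my (\w+)\??$", re.I)),
    ("set_relation", re.compile(r"^my (\w+) is (\w+)$", re.I)),
    ("set_property", re.compile(r"^(\w+) (\w+) is (\w+)$", re.I)),
]


def parse_shortcut(message):
    """Return (intent, *captured_words), or None if no shortcut matches."""
    text = message.strip()
    for intent, pattern in _PATTERNS:
        match = pattern.match(text)
        if match:
            return (intent,) + match.groups()
    # No shortcut matched: the caller would fall back to AIML here.
    return None
```

A caller could dispatch on the first tuple element to the matching Neo4j helper and hand the message to the AIML kernel whenever the function returns `None`.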
- Python 3.x (version aligned with your installed `aiml` and `neo4j` driver packages).
- Neo4j with Bolt access. The code uses database name `chatbot` (see `FamilyBot` constructor usage).
- NLTK data: `pam_utils.py` expects tokenizers, taggers, NER chunker, names, VADER, and WordNet. Download as needed, for example:

  ```python
  import nltk

  nltk.download("punkt_tab")
  nltk.download("averaged_perceptron_tagger_eng")
  nltk.download("maxent_ne_chunker_tab")
  nltk.download("words")
  nltk.download("names")
  nltk.download("vader_lexicon")
  nltk.download("wordnet")
  ```

- Voice (server): `speech_recognition` (and a working audio pipeline; the server writes a temporary WAV file). Install if you use `/voice`:

  ```bash
  pip install SpeechRecognition
  ```

- Voice (ESP32): A board running MicroPython with `urequests` (often preinstalled or installable via `mip`). The analog mic must be wired to the configured ADC pin with appropriate attenuation (`ADC.ATTN_11DB` in the scripts). The ESP32 and the machine running Flask must be on the same LAN so the device can reach your PC’s IP address and port `5050`.
- Optional CLI + Prolog: SWI-Prolog and `pyswip` if you run `conversation.py` with Prolog rules.
- Clone or extract the project and open a terminal in the project root.
- Create a virtual environment (recommended), then install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

  The repository `requirements.txt` lists core packages (`flask`, `flask_login`, `aiml`, `neo4j`, `nltk`, `pytz`, and others). Add `SpeechRecognition`, `prettyprinter`, and `pyswip` manually if you use voice, full `pam_utils` features, or the Prolog CLI path.
- Start Neo4j, create or select the `chatbot` database, and ensure credentials match your configuration (default URI and credentials are set in `app.py` and `family_bot.py`; change them for your environment).
- Download NLTK corpora as described above before the first chat, or the first requests that hit `createPAM` may fail.
- Neo4j: In `app.py`, `FamilyBot` is constructed with `bolt://localhost:7687`, user `neo4j`, and password `12345678`. Adjust the URI, user, password, and database name in code or move these to environment variables for production.
- Flask secret: Replace `app.secret_key` in `app.py` with a strong random value; never commit production secrets.
- Passwords: User passwords are stored and compared as plain text in the sample code. This is suitable only for local development; use hashing (for example `werkzeug.security`) for any shared or public deployment.
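The configuration advice above can be sketched with the standard library alone. The environment variable names (`NEO4J_URI`, `NEO4J_USER`, `NEO4J_PASSWORD`, `NEO4J_DATABASE`, `FLASK_SECRET_KEY`) and the helper names are assumptions, not part of the project. The hashing helpers use PBKDF2 from `hashlib` in place of `werkzeug.security` so that the example stays self-contained.

```python
import hashlib
import hmac
import os
import secrets


def load_settings(env=os.environ):
    """Read connection settings from the environment (variable names are hypothetical)."""
    return {
        "neo4j_uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "neo4j_user": env.get("NEO4J_USER", "neo4j"),
        "neo4j_password": env.get("NEO4J_PASSWORD", ""),
        "neo4j_database": env.get("NEO4J_DATABASE", "chatbot"),
        # Random fallback for local runs only; sessions reset on every restart.
        "secret_key": env.get("FLASK_SECRET_KEY") or secrets.token_hex(32),
    }


def hash_password(password, iterations=200_000):
    """Return a salted PBKDF2-SHA256 hash encoded as a single string."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password, stored):
    """Recompute the hash with the stored salt and compare in constant time."""
    _scheme, iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)
```

With helpers like these, sign-up would store the output of `hash_password` and login would call `verify_password` instead of comparing strings directly.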
From the project root:
```bash
python app.py
```

The server listens on all interfaces at port 5050 (`host='0.0.0.0'`, `port=5050`). Open http://localhost:5050 in a browser, register or log in, then use the chat interface.
- POST `/chat`: JSON body `{"message": "..."}` returns `{"response": "..."}`.
- POST `/voice`: Raw WAV body (`Content-Type: audio/wav`); returns JSON with recognized `text`, `response`, or an `error` field. Used by the web client and by the ESP32 scripts.
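The two endpoints can be exercised from a desktop Python script roughly as follows. This is a sketch using only `urllib`; the helper names are hypothetical, and the base URL is whatever address your server is reachable at. Whether `/chat` requires a logged-in session is not covered here; if it does, the request would also need the session cookie.

```python
import json
import urllib.request


def build_chat_request(base_url, message):
    """Build a POST /chat request carrying {"message": ...} as JSON."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_voice_request(base_url, wav_bytes):
    """Build a POST /voice request with a raw WAV body."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/voice",
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `send(build_chat_request("http://localhost:5050", "hello"))` would return the parsed JSON reply once the server is running.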
Three MicroPython-oriented scripts complement the Flask voice endpoint. Flash MicroPython to the ESP32, copy the files you need, then edit SSID, password, and SERVER_URL (or the hardcoded URL in Mic_server.py) to use your PC’s LAN IP (not localhost from the device’s point of view).
| Script | Purpose |
|---|---|
| `Wifi_Esp.py` | Wi-Fi only: enables station mode, optionally scans SSIDs, connects with your credentials, prints the ESP32’s assigned IP. Use this to verify radio and DHCP before adding audio. |
| `Mic_server.py` | Record and upload: samples the microphone on ADC pin 1 at 8 kHz for 3 seconds, builds a minimal WAV file, and POSTs it to the Flask `/voice` URL. If your build omits top-level imports, add the same `machine` / `time` / `struct` / `urequests` imports used in `Final_mic.py`. |
| `Final_mic.py` | End-to-end: configurable `SSID`, `PASSWORD`, `SERVER_URL`, `ADC_PIN`, `SAMPLE_RATE`, and `DURATION`; connects with a 15 s timeout; then loops in the REPL (e.g. Thonny) so each trigger records and sends audio to the server. This is the recommended single script for demos once Wi-Fi is stable. |
Audio format (must match what speech_recognition expects): mono, 8-bit PCM, WAV RIFF; default 8000 Hz sample rate and 3 s clip length in these scripts.
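The audio format described above corresponds to a 44-byte RIFF header followed by unsigned 8-bit samples. The sketch below shows one way to build such a file with `struct`; the function name `build_wav` is an assumption, and the ESP32 scripts may assemble their header differently.

```python
import struct


def build_wav(samples, sample_rate=8000):
    """Wrap unsigned 8-bit mono PCM samples in a minimal 44-byte RIFF/WAV header."""
    data = bytes(samples)
    channels = 1
    bits_per_sample = 8
    block_align = channels * bits_per_sample // 8  # bytes per sample frame
    byte_rate = sample_rate * block_align          # bytes per second of audio

    # RIFF chunk: total size is everything after these first 8 bytes.
    header = b"RIFF" + struct.pack("<I", 36 + len(data)) + b"WAVE"
    # fmt chunk: 16-byte body, format tag 1 means uncompressed PCM.
    header += b"fmt " + struct.pack(
        "<IHHIIHH", 16, 1, channels, sample_rate, byte_rate, block_align, bits_per_sample
    )
    # data chunk: raw samples follow immediately.
    header += b"data" + struct.pack("<I", len(data))
    return header + data
```

At the default settings a clip carries 8000 × 3 = 24,000 data bytes, so the uploaded body is 24,044 bytes including the header.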
Authentication note: Unauthenticated requests to /voice are handled with user context guest for chatbot and memory wiring in the current app.py, so the ESP32 does not need a browser login. If you later protect /voice with Flask-Login, you will need a token or API key strategy for the device. Prefer running voice tests on a trusted network and avoid committing real Wi-Fi passwords into public repositories.
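If `/voice` is later protected, one possible device-friendly strategy is a shared API key sent in a request header. The sketch below only shows the comparison step; the header name `X-Api-Key` and the function `is_authorized` are hypothetical and do not exist in the current `app.py`.

```python
import hmac


def is_authorized(headers, expected_key):
    """Check a shared API key from request headers using a constant-time comparison."""
    if not expected_key:
        # Refuse everything when no key is configured, rather than allowing all.
        return False
    supplied = headers.get("X-Api-Key", "")
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```

The ESP32 side would then add the same header to its POST, with the key kept out of version control alongside the Wi-Fi credentials.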
```bash
python conversation.py
```

This loop reads stdin, responds via AIML, and can delegate to Prolog (`familyfacts.pl`, `familyrule.pl`) or Neo4j depending on AIML templates. You must have SWI-Prolog available and a compatible `pyswip` setup. Verify that any Neo4j helper calls match the current `FamilyBot` API.
| Path | Role |
|---|---|
| `app.py` | Flask app: auth, chat, voice, Neo4j wiring |
| `chatbot.py` | AIML kernel wrapper and brain load/save |
| `family_bot.py` | Neo4j driver: users, memory, relations, facts, episodes |
| `pam_utils.py` | Per-word PAM-style features (NLTK / VADER / WordNet) |
| `conversation.py` | Optional AIML + Prolog + Neo4j CLI |
| `familyfacts.pl`, `familyrule.pl` | Example family knowledge and rules for Prolog |
| `data/*.aiml` | AIML pattern libraries |
| `pretrained_model/LearningFileList.aiml` | Bootstrap file that loads `data/*.aiml` |
| `templates/` | `login.html`, `signup.html`, `home.html` |
| `static/style.css` | Stylesheet for the web UI |
| `imp.txt` | Developer notes (Cypher snippets, search hints) |
| `AIML_TAG_descriptions.txt` | AIML tag reference (if present) |
| `Wifi_Esp.py` | ESP32 MicroPython: Wi-Fi scan and connect |
| `Mic_server.py` | ESP32 MicroPython: ADC capture, WAV, POST to `/voice` |
| `Final_mic.py` | ESP32 MicroPython: Wi-Fi + interactive record/send loop |
AIML files under data/ define categories and templates. The kernel learns from the list file in pretrained_model/; the first run without aiml_pretrained_model.dump parses AIML and saves a brain file for faster subsequent starts.
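The load-or-bootstrap behavior described above can be sketched as a small helper around the `aiml` package's `Kernel.bootstrap` and `Kernel.saveBrain` methods. This is not the code in `chatbot.py`; the function name `load_brain` is hypothetical, and the load command that triggers learning depends on the pattern defined in your list file, so it is passed in rather than assumed.

```python
import os


def load_brain(kernel, brain_path, learn_file, load_command):
    """Load a saved brain if one exists; otherwise parse AIML and save a brain.

    `kernel` is expected to behave like aiml.Kernel (bootstrap and saveBrain).
    """
    if os.path.isfile(brain_path):
        # Fast path: restore the serialized brain from a previous run.
        kernel.bootstrap(brainFile=brain_path)
        return "loaded"
    # Slow path: learn the list file, then issue the command that makes it
    # load the AIML files it references.
    kernel.bootstrap(learnFiles=learn_file, commands=load_command)
    kernel.saveBrain(brain_path)
    return "bootstrapped"


# Example wiring (requires the aiml package; the command string is a placeholder):
#   import aiml
#   kernel = aiml.Kernel()
#   load_brain(
#       kernel,
#       "pretrained_model/aiml_pretrained_model.dump",
#       "pretrained_model/LearningFileList.aiml",
#       "<command matched by your list file>",
#   )
```

Deleting the `.dump` file forces a fresh parse, which is useful after editing anything under `data/`.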
Custom domains can be added by extending AIML or by handling patterns in Python (as in get_chatbot_response).
- Neo4j connection errors: Confirm Bolt is enabled, the `chatbot` database exists, and URI/credentials match `FamilyBot` initialization.
- Case-sensitive paths: On Linux or macOS, ensure `chatbot.py` references the correct capitalization for `LearningFileList.aiml` under `pretrained_model/`.
- NLTK errors: Run the downloads listed in Requirements before chatting.
- Voice failures: Check microphone permissions, WAV format, and that `SpeechRecognition` can reach Google’s recognition service (network required for the default recognizer).
- ESP32 cannot reach the server: Confirm firewall rules allow inbound TCP 5050 on the Flask host, verify the URL uses the host’s LAN IP, and ping or browse from another device on the same subnet.
- ESP32 audio unusable by recognizer: Ensure sample rate, bit depth, mono channel, and WAV headers match the scripts; clipping or wrong pin wiring often yields silence or noise.
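For the reachability check, a short script run from another computer on the same subnet can confirm that TCP port 5050 on the Flask host accepts connections. This is a desktop Python sketch, not ESP32 code, and the helper name `can_reach` is hypothetical.

```python
import socket


def can_reach(host, port=5050, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Call it with the Flask host's LAN IP. If it returns `False` from a second machine while `http://localhost:5050` works on the host itself, a firewall rule or the bind address is the likely cause.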
This project builds on AIML concepts and sample data common in AIML tutorials. HTML/CSS may derive from third-party snippets; review template comments and original sources if you redistribute. Add a license file if you intend to publish the repository formally.
Initial project date (from prior README): November 2018. The stack has since been extended with Flask, Neo4j, browser and ESP32 voice paths (Wifi_Esp.py, Mic_server.py, Final_mic.py), and related modules as described above.