Building a Chatbot for Students Mentorship based on Extracted Knowledge Graphs

📖 Abstract

The student mentorship process is often time-consuming for students and repetitive for university staff due to the high volume of inquiries regarding university regulations. This project introduces a chatbot that utilizes a Knowledge Graph constructed from unstructured text (specifically university PDF regulations). The system employs KG embedding models to predict missing links and infer relationships, enabling it to answer complex student queries with high accuracy.

🎓 Academic Resources

For a deep dive into the methodology, architectural design, and evaluation metrics of this project, please refer to the following documents:

Full Thesis PDF – A comprehensive breakdown of the research, implementation, and results.
Presentation Slides – The final defense deck used for the graduation committee.

🎬 Demo

Experience the chatbot in action by viewing the recorded demonstration:

Click here to watch the Demo Video

(The demo showcases the end-to-end pipeline from processing a student's natural language query to the generation of a factual response based on the Knowledge Graph.)

🏗️ Architecture

The chatbot pipeline consists of four sequential stages:

1. Pre-processing User Input: Normalizing text via spell-checking, grammar correction, and lemmatization.
2. Input Comprehension: Extracting subjects and predicates using spaCy for dependency parsing and Fuzzywuzzy for entity mapping.
3. Knowledge Graph Embedding Model: Utilizing trained embeddings (TransE, DistMult, or ComplEx) to predict the missing head or tail of a triplet.
4. Response Generation: Converting predicted triplets back into natural language using NLTK and the Pattern library for grammatical conjugation.
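
To make the flow through these stages concrete, here is a minimal, standard-library-only sketch. It is not the project's implementation: `difflib` stands in for Fuzzywuzzy, a fixed template stands in for NLTK/Pattern conjugation, pre-processing is reduced to lowercasing, and the entities, relation, and 2-D vectors are invented for illustration. All function names (`map_entity`, `predict_tail`, `render`, `answer`) are hypothetical.

```python
import math
from difflib import SequenceMatcher

# Toy 2-D embeddings; every name and value here is invented for illustration.
ENTITY_VECS = {
    "bachelor thesis": (0.0, 0.0),
    "one semester": (1.0, 0.0),
    "probation": (0.0, 1.0),
}
RELATION_VECS = {"lasts": (1.0, 0.0)}


def map_entity(mention, candidates):
    """Stage 2 stand-in: pick the known entity most similar to the mention."""
    mention = mention.lower()
    return max(candidates, key=lambda c: SequenceMatcher(None, mention, c).ratio())


def predict_tail(head, relation):
    """Stage 3 stand-in: TransE-style ranking, smallest ||h + r - t|| wins."""
    h, r = ENTITY_VECS[head], RELATION_VECS[relation]
    target = (h[0] + r[0], h[1] + r[1])
    candidates = [e for e in ENTITY_VECS if e != head]
    return min(candidates, key=lambda e: math.dist(target, ENTITY_VECS[e]))


def render(head, relation, tail):
    """Stage 4 stand-in: a fixed template instead of grammatical conjugation."""
    return f"The {head} {relation} {tail}."


def answer(mention, relation):
    """Chain the stages: map the mention, predict the tail, render a sentence."""
    head = map_entity(mention, ENTITY_VECS)
    tail = predict_tail(head, relation)
    return render(head, relation, tail)
```

Calling `answer("Bachelor thesiss", "lasts")` maps the misspelled mention to the known entity, completes the triplet `(bachelor thesis, lasts, ?)`, and returns a sentence built from the predicted tail.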

🛠️ Tech Stack

The project is built using Python 3.10 and the following core libraries:

| Library | Version | Purpose |
| --- | --- | --- |
| AmpliGraph | 2.0.0 | KG Embedding and Link Prediction |
| spaCy | 3.5.1 | NER, POS tagging, and Coreference Resolution |
| Stanford-OpenIE | 1.3.1 | Information extraction of (Subject-Relation-Object) triplets |
| Flask | 2.2.2 | Web framework for the chatbot interface |
| NLTK / Pattern | 3.6.3 / 3.6 | Natural Language Generation and text processing |
| PyPDF2 | 3.0.1 | Extracting raw text from university regulation PDFs |

📁 Repository Structure

├── data/
│   ├── raw_pdfs/           # University regulation documents 
│   └── triplets.csv        # Extracted Subject-Relation-Object data 
├── src/
│   ├── preprocessing.py    # Text cleaning and normalization 
│   ├── kg_construction.py  # Triple extraction and KG building 
│   ├── embedding_model.py  # Model training (TransE, ComplEx, etc.) 
│   └── app.py              # Flask application for user interaction 
├── docs/
│   └── Thesis_Full_PDF.pdf # Full academic documentation
└── README.md
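
As an illustration of how the extracted triplets might be consumed, the sketch below reads Subject-Relation-Object rows and groups them by subject. The column names `subject`, `relation`, and `object` are assumptions, since the actual header of `data/triplets.csv` is not documented here, and the sample rows are invented. The helper names `load_triples` and `index_by_subject` are hypothetical.

```python
import csv
import io
from collections import defaultdict


def load_triples(handle):
    """Read (subject, relation, object) rows from an open CSV file handle."""
    reader = csv.DictReader(handle)
    return [(row["subject"], row["relation"], row["object"]) for row in reader]


def index_by_subject(triples):
    """Group triples by subject so outgoing edges can be looked up directly."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[subject].append((relation, obj))
    return dict(index)


# Invented sample rows standing in for data/triplets.csv (assumed header).
SAMPLE = io.StringIO(
    "subject,relation,object\n"
    "student,registers for,course\n"
    "student,submits,thesis\n"
)
TRIPLES = load_triples(SAMPLE)
GRAPH = index_by_subject(TRIPLES)
```

With a real file, the `io.StringIO` sample would be replaced by `open("data/triplets.csv", newline="", encoding="utf-8")`, assuming the header matches.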

🚀 Installation & Usage

Clone the repository:

git clone https://github.com/davidsamy1/Thesis-Chatbot.git
cd Thesis-Chatbot

Install dependencies:
```
pip install -r requirements.txt
```
Run the application (`app.py` is located in `src/`, per the repository structure above):
```
python src/app.py
```

📊 Evaluation

The system was tested using three primary KG embedding algorithms to predict missing academic facts:

ComplEx: Captured anti-symmetric relations and complex interactions.
TransE: Provided efficient distance-based reasoning.
DistMult: Used for semantic matching energy modeling.

The experimental results demonstrated that the models successfully captured semantic relationships and structural properties of the university KG.
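
For readers unfamiliar with the three models, the sketch below writes out their standard textbook scoring functions in plain Python. It is not the project's training code (which relies on AmpliGraph), and the vectors in the usage note are invented. It shows why DistMult cannot distinguish the direction of a relation while ComplEx can.

```python
def transe_score(h, r, t):
    """Negative L1 distance between h + r and t (higher is more plausible)."""
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))


def distmult_score(h, r, t):
    """Trilinear product sum_i h_i * r_i * t_i; symmetric in h and t."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def complex_score(h, r, t):
    """Re(sum_i h_i * r_i * conj(t_i)) over complex-valued vectors."""
    return sum(hi * ri * ti.conjugate() for hi, ri, ti in zip(h, r, t)).real
```

With `h = [1.0, 2.0]`, `r = [0.5, -1.0]`, and `t = [3.0, 0.5]`, `distmult_score(h, r, t)` and `distmult_score(t, r, h)` are identical, so swapping head and tail leaves the score unchanged. With the complex vectors `[1 + 1j]`, `[1j]`, and `[2 + 0j]`, `complex_score` gives different values for the two directions, which is what lets ComplEx model anti-symmetric relations.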

🎓 Citation

If you use this work in your research, please cite it as follows:

@bachelorthesis{Samy2023,
  author = {David Samy},
  title  = {Building a Chatbot for Students Mentorship based on Extracted Knowledge Graphs},
  school = {German University in Cairo (GUC)},
  faculty = {Media Engineering and Technology},
  year   = {2023},
  month  = {June}
}
