Building systems that think, retrieve, and decide at scale, from multi-agent RAG pipelines and fine-tuned transformers to production ML infrastructure on GCP and AWS.
Portfolio: vuchau0802.github.io/Portfolio
LinkedIn: linkedin.com/in/vutrongchau
GitHub: github.com/vuchau0802
Email: chautrongvu@gmail.com
M.S. in Computer Science (AI) from Troy University with hands-on experience across the full ML lifecycle: data engineering, model training, inference optimization, and MLOps deployment. Currently an AI Systems & LLM Engineering Intern at TechX, building multi-LLM orchestration pipelines and GPU-accelerated inference services on GCP Vertex AI.
Python · LangGraph · LangChain · FAISS · Groq · AWS · Docker · RAGAS
- 5-agent LangGraph RAG pipeline over 500K+ medical records with 90.0% accuracy, 0.898 macro F1 (Linear SVM)
- AWS S3 + FAISS vector storage with ETL pipelines; RAGAS-guided chunking cut unsafe response rate to <0.3%
- Dockerized microservices with GitHub Actions CI/CD and Prometheus monitoring; deploys end-to-end in <4 minutes
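For readers unfamiliar with the metric, macro F1 (quoted in the bullets above) is the unweighted mean of per-class F1 scores, so rare classes count as much as common ones. A minimal sketch in plain Python follows; the function name `macro_f1` and the toy labels are illustrative assumptions, not the project's evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy example with hypothetical labels
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
score = macro_f1(y_true, y_pred)
```

Accuracy and macro F1 can diverge on imbalanced data, which is presumably why both are reported.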
Python · XGBoost · Scikit-learn · Pandas · Flask · FRED API · Tableau
- Full ML lifecycle pipelines over 2.26M financial records (110K+ loans, 7 states)
- AUC-ROC 0.79, R²=0.91, 89.7% accuracy; SMOTE oversampling lifted minority-class F1 by 12 pp
- Automated ETL integrating FRED, BLS, and BEA macroeconomic APIs into a feature store with no null values
Python · PyTorch · Hugging Face Transformers · FastAPI · Docker · AWS · CI/CD
- Fine-tuned Toxic-BERT on 130K+ labeled texts with 84.9% accuracy and 0.855 F1 (+9.3 pp F1 over baseline)
- FastAPI inference service with async handling and caching at <80 ms median latency
- Full MLOps CI/CD via GitHub Actions, Docker, AWS EC2/S3, and Hugging Face Spaces
Python · Scikit-learn · Flask · Pandas · NumPy
- End-to-end ML pipelines on 110K+ health records with 82.4% accuracy, 0.747 macro F1 (Random Forest)
- Regression model: R²=0.671, RMSE=0.737 (Logistic Regression) via 5-fold cross-validated model selection
- Flask prediction API with real-time analytics dashboard and personalized health recommendations
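The 5-fold cross-validated model selection mentioned above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's code: `k_fold_indices`, `select_model`, and the convention that each candidate is a `fit(X_train, y_train)` function returning a `predict(x)` callable are all hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, val_indices) pairs for k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def select_model(candidates, X, y, score, k=5):
    """Return the candidate name with the best mean validation score.

    candidates: dict mapping a name to fit(X_train, y_train) -> predict(x)
    score:      function (y_true, y_pred) -> float, higher is better
    """
    results = {}
    for name, fit in candidates.items():
        fold_scores = []
        for train, val in k_fold_indices(len(X), k):
            predict = fit([X[i] for i in train], [y[i] for i in train])
            preds = [predict(X[i]) for i in val]
            fold_scores.append(score([y[i] for i in val], preds))
        results[name] = mean(fold_scores)
    best = max(results, key=results.get)
    return best, results
```

Every candidate is scored on the same folds, so the comparison is not skewed by one model getting an easier split.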
Python · Scikit-learn · Pandas · D3.js · ETL Pipelines · Flask
- ETL pipeline integrating 12 World Bank indicators across 195+ countries and 60+ years (1960–2023)
- Linear Regression outperformed RNN and CNN: R²=0.9453, MAE=3.11%, MSE=0.60%
- Interactive D3.js dashboard with choropleth map, time-series analytics, and demographic comparison charts
| Category | Tools |
|---|---|
| Programming Languages | Python, SQL, C/C++, JavaScript |
| ML / DL Frameworks | PyTorch, TensorFlow, Scikit-learn, XGBoost, Hugging Face Transformers, LangChain, LangGraph, Pandas, NumPy, NLTK |
| Generative AI & LLMs | RAG, Prompt Engineering, FAISS, Chroma, RAGAS, Fine-tuning, Quantization, Agentic Pipelines |
| MLOps & Cloud | Docker, Kubernetes, CI/CD, GCP Vertex AI, AWS, FastAPI, Triton Inference Server, Tableau, D3.js, ETL Pipelines |
M.S. Computer Science — Artificial Intelligence (GPA: 3.5/4.0)
Troy University · Jul 2025
Coursework: Machine Learning, Advanced AI, Analysis of Algorithms, Data Visualization, Business Analytics (MBA)
B.Eng. Electronic & Electrical Engineering (UK 2:1 Honours)
University of Sunderland · Jul 2021
- 🎓 LLM Application Engineering and Development — Simplilearn (Oct 2025)
- 🎓 Generative AI with Large Language Models — DeepLearning.AI (Sep 2025)
- 🎓 Data Science Methodology — IBM (Sep 2025)