8 changes: 7 additions & 1 deletion Documentations/DevSync.md
@@ -7,9 +7,15 @@ into an interactive SSH session running the agent.
Run from anywhere inside the repo:

```bash
./devsync.sh
./devsync.sh # default: python -m chatdku.core.agent
./devsync.sh chatdku/core/agent.py # as a file path
./devsync.sh chatdku.core.agent # as a module (runs with python -m)
```

Arguments containing `/` or ending in `.py` are run as file paths. Everything
else is treated as a module name and run with `python -m`. Absolute local
paths are accepted and automatically stripped to repo-relative before sync.
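
The rule above can be restated as a small Python sketch. This is an illustration only: `devsync.sh` itself implements the dispatch in bash and quotes with `printf %q`, whereas this sketch uses `shlex.quote`. The function names and the example repo root `/home/me/ChatDKU` are hypothetical.

```python
import shlex

DEFAULT_MODULE = "chatdku.core.agent"


def looks_like_file(target: str) -> bool:
    """Return True when the argument should be run as a file path."""
    return "/" in target or target.endswith(".py")


def to_repo_relative(target: str, repo_root: str) -> str:
    """Strip the repo root from an absolute path; leave other paths untouched."""
    prefix = repo_root.rstrip("/") + "/"
    if target.startswith("/") and target.startswith(prefix):
        return target[len(prefix):]
    return target


def remote_run_command(target: str = "", repo_root: str = "/home/me/ChatDKU") -> str:
    """Build the command that would run on the remote for a given argument."""
    if not target:
        # No argument: fall back to the agent module.
        return f"uv run python -m {DEFAULT_MODULE}"
    if looks_like_file(target):
        rel = to_repo_relative(target, repo_root)
        return f"uv run python {shlex.quote(rel)}"
    # Anything else is treated as a dotted module name.
    return f"uv run python -m {shlex.quote(target)}"


if __name__ == "__main__":
    for arg in ("", "chatdku/core/agent.py", "chatdku.core.agent"):
        print(repr(arg), "->", remote_run_command(arg))
```

Note that a bare `agent.py` counts as a file path because of the `.py` suffix, even though it contains no `/`.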

## What it does

1. Resolves the remote user (prefers `gh api user`, falls back to `whoami`) and
75 changes: 50 additions & 25 deletions GUIDE.md
@@ -2,19 +2,20 @@

This is a set of guides intended for you to get ready to contribute to our project.
This guide is intended for **newcomers**, as well as, our **members**.
I (Temuulen) will be explaining our core dependencies as well as any other useful stuff you should learn about before getting into coding.
I (Temuulen) will be explaining our core dependencies as well as any other useful stuff you should learn about before getting into coding.

> [!IMPORTANT]
> This is a work in progess. Please tell me what you don't understand about this guide and our project and I will add it to this document for future use.
> This is a work in progress. Please tell me what you don't understand about this guide and our project and I will add it to this document for future use.

When I was coming into this project, even though it was structured very clearly, it was hard to get my head around everything.
I felt like the code was just very messy and there were just a lot of things that did not have clear explanations.

And most of our code is like that even today. However, with this guide I hope you will at least have some support and start contributing faster.
And most of our code is like that even today. However, with this guide I hope you will at least have some support and start contributing faster.

> Please remember that at first you will be learning *slow* to **develop** faster in the future by following this guide.
> Please remember that at first you will be learning _slow_ to **develop** faster in the future by following this guide.

Here are some list of members and their respective roles they **self-assigned** themselves into:

- Anar: Frontend (React.js), Syllabi SQL agent tool
- Munish: Backend (Flask, Django), System health monitoring
- Temuulen: Agent logic (DSPy), Document ingestion Logic (Transferring to ZhiWei)
@@ -23,11 +24,12 @@ Here are some list of members and their respective roles they **self-assigned**

### 1. Python

First, obviously you need to know python. While we don't require you to be a pythonic expert, a quality code is generally preferred. So, what makes a code ***good code***?
First, obviously you need to know python. While we don't require you to be a pythonic expert, a quality code is generally preferred. So, what makes a code **_good code_**?

This is completely subjective, but there are some qualities that you can start from:
- Functions have [docstrings](https://numpydoc.readthedocs.io/en/latest/format.html)
- Account for future contributers to understand the code

- Functions have [docstrings](https://numpydoc.readthedocs.io/en/latest/format.html)
- Account for future contributors to understand the code
- Obvious naming practices and using python naming practices.
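
As a rough illustration of these qualities, here is a small function with a numpydoc-style docstring and descriptive names. The function `split_into_chunks` is a hypothetical example and is not part of the ChatDKU codebase.

```python
def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive chunks of at most ``chunk_size`` characters.

    Parameters
    ----------
    text : str
        The text to split.
    chunk_size : int
        Maximum number of characters per chunk. Must be positive.

    Returns
    -------
    list of str
        The chunks in their original order. Empty when ``text`` is empty.

    Raises
    ------
    ValueError
        If ``chunk_size`` is not positive.

    Examples
    --------
    >>> split_into_chunks("abcdefg", 3)
    ['abc', 'def', 'g']
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Step through the text in strides of chunk_size; the last chunk may be shorter.
    return [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]
```

A reviewer can tell what the function accepts, returns, and rejects without reading the body, which is the point of the docstring.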

I mean I can go on and on about coding practices. What you need to understand is that you need to build scalable code, accounting for any other person to review your code and understand it.
@@ -39,13 +41,13 @@ I mean I can go on and on about coding practices. What you need to understand is
> While these things seem very annoying at first, believe me that they will help.
> When I come back to DKU next Spring, I plan to give every member a crash course on a new GIT workflow. Please read all the articles I will be linking to.

Git is a version control system that intelligently tracks changes in files.
Git is a version control system that intelligently tracks changes in files.
Git is particularly useful when you and a group of people are all making changes to the same files at the same time.

Typically, to do this in a Git-based workflow, you would:

- Create a branch to ***show the intent of your work***.
- Create issues ***before*** you do the work/code.
- Create a branch to **_show the intent of your work_**.
- Create issues **_before_** you do the work/code.
- Make edits to the files independently and safely on your own personal branch.
- Close or update issues [with your commits or Merge Requests](https://docs.gitlab.com/user/project/issues/managing_issues/#closing-issues-automatically)
- Let Git intelligently merge your specific changes back into the main copy of files, so that your changes don't impact other people's updates.
@@ -55,35 +57,37 @@ Typically, to do this in a Git-based workflow, you would:
> Our `Main` branch is a **SACRED** branch. DO NOT PUSH CODE WITHOUT PROPER REVIEW FROM OTHER MEMBERS.

Please read these articles:
- [Github Flow](https://docs.github.com/en/get-started/using-github/github-flow)
- [Always start with an issue](https://web.archive.org/web/20230214040753/https://about.gitlab.com/blog/2016/03/03/start-with-an-issue/)
- Try creating an issue now on what you want to do next.
- Also if you don't see our issue board under the projects tab in our repo. Please contact Mingxi and ask to be added to the Project issue board.

- [GitHub Flow](https://docs.github.com/en/get-started/using-github/github-flow)
- [Always start with an issue](https://web.archive.org/web/20230214040753/https://about.gitlab.com/blog/2016/03/03/start-with-an-issue/)
- Try creating an issue now on what you want to do next.
- Also if you don't see our issue board under the projects tab in our repo. Please contact Mingxi and ask to be added to the Project issue board.
- [Write good commit messages!](https://cbea.ms/git-commit/)
- [Issue board](https://about.gitlab.com/blog/announcing-the-gitlab-issue-board/)
- While we are not using Gitlab, Github has the same feature called "Project".
- [It's all connected in Gitlab](https://about.gitlab.com/2016/03/08/gitlab-tutorial-its-all-connected/)
- Again, Github has the equilavent features at [here](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/autolinked-references-and-urls)
- While we are not using GitLab, GitHub has the same feature called "Project".
- [It's all connected in Gitlab](https://about.gitlab.com/2016/03/08/gitlab-tutorial-its-all-connected/)
- Again, GitHub has the equivalent features at [here](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/autolinked-references-and-urls)

As you incorperate these steps into your developer journey, you will be better equipped for real world team-coding.
All the industry experts follow some form of stardards using GIT. You should learn to use it properly while you are here with us.
As you incorporate these steps into your developer journey, you will be better equipped for real world team-coding.
All the industry experts follow some form of standards using GIT. You should learn to use it properly while you are here with us.

And [here is a longer video](https://www.youtube.com/watch?v=1ffBJ4sVUb4) that gives you more in-depth details on how GIT works.
And [here is a longer video](https://www.youtube.com/watch?v=1ffBJ4sVUb4) that gives you more in-depth details on how GIT works.

Here is an [interactive](https://learngitbranching.js.org/?locale=en_US) Git simulator for you to practice.
Here is an [interactive](https://learngitbranching.js.org/?locale=en_US) Git simulator for you to practice.

### 3. Using the Terminal

Using the terminal, you can do a lot of stuff with it. I assure you that to get better at it you just have to use it daily. At first you might google a lot of stuff, and that is **okay!**.
All of us started out like that. Here are some of the common commands I use when working with CHATDKU:

- `ssh`: Used to connect to our server
- `git`: Working with github
- `git`: Working with GitHub
- `sftp`: ssh like file transferring
- `nvidia-smi`: Used to inspect GPUs

Again, just google these stuff and learn. Good luck! It will be worth it.

## Role-specific guides
## Role-specific Guides

Please be careful when interacting with Docker. It hosts our Embedding Model, Vector Database, and Redis Database.

@@ -93,9 +97,30 @@ Please be careful when interacting with Docker. It hosts our Embedding Model, Ve
- For creating tools: https://github.com/Glitterccc/ChatDKU/issues/122
- Arize Phoenix for observability: https://arize.com/docs/phoenix

### Document ingestion
### Iterating on the agent with `devsync.sh`

Edit code on your laptop, then push and run it on the shared dev server in one
command. From the repo root:

```bash
./devsync.sh # runs the agent
```

```bash
./devsync.sh chatdku/core/tools/your_file.py # runs any file you're hacking on
```

The script rsyncs your working tree, runs `uv sync`, and drops you into a live
session on the remote. Your `.venv/`, `.env`, and `.git/` are left alone.
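
The three steps described above can be sketched as a list of commands. This is a simplified Python model rather than the script itself: the server name `you@dev-server` is a placeholder, the rsync source and destination arguments are assumptions, and treating `.env` as an `--exclude` pattern is also an assumption based on the sentence above. Nothing is executed; the function only builds the argument lists.

```python
import shlex


def devsync_plan(
    server: str,
    local_dir: str,
    remote_dir: str = "~/ChatDKU-DevSync",
    run_cmd: str = "uv run python -m chatdku.core.agent",
) -> list[list[str]]:
    """Return the commands for: prepare remote dir, sync, then run remotely."""
    # Paths that are left alone on the remote (assumed to be rsync excludes).
    excludes = [".git/", ".venv/", ".env"]

    prepare = ["ssh", server, f"mkdir -p {remote_dir}"]

    rsync = ["rsync", "-avz", "--delete"]
    rsync += [f"--exclude={pattern}" for pattern in excludes]
    rsync += [f"{local_dir}/", f"{server}:{remote_dir}/"]  # assumed src/dest form

    # A login shell on the remote: cd, install dependencies, then run the target.
    remote_line = f"cd {remote_dir} && uv sync && {run_cmd}"
    run = ["ssh", "-t", server, f"bash -l -c {shlex.quote(remote_line)}"]

    return [prepare, rsync, run]


if __name__ == "__main__":
    for command in devsync_plan("you@dev-server", "/home/me/ChatDKU"):
        print(" ".join(command))
```

Because `uv sync` runs on every invocation, dependency changes in your working tree take effect on the remote before your code runs.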

See [Documentations/DevSync.md](Documentations/DevSync.md) for configuration,
Windows-specific notes, and troubleshooting. If you're new, also skim
[Documentations/Shared-Secrets.md](Documentations/Shared-Secrets.md) — once an
admin adds you to `chatdku_devs`, all project secrets load into your remote
shell automatically, no `.env` copying needed.

### Document Ingestion

- Llamaindex for document ingestion: https://developers.llamaindex.ai/python/framework/getting_started/concepts
- ChromaDB for vector store: https://docs.trychroma.com/docs/overview/introduction
- Redis for keyword search: https://redis.io/docs/latest/develop/

5 changes: 4 additions & 1 deletion chatdku/core/agent.py
@@ -12,7 +12,10 @@
from chatdku.core.dspy_classes.synthesizer import Synthesizer
from chatdku.core.tools.course_schedule import CourseScheduleLookupOuter
from chatdku.core.tools.get_prerequisites import PrerequisiteLookupOuter
from chatdku.core.tools.llama_index import KeywordRetrieverOuter, VectorRetrieverOuter
from chatdku.core.tools.llama_index_tools import (
    KeywordRetrieverOuter,
    VectorRetrieverOuter,
)
from chatdku.core.tools.major_requirements import MajorRequirementsLookupOuter
from chatdku.core.tools.syllabi_tool.query_curriculum_db import QueryCurriculumOuter
from chatdku.core.utils import format_trajectory, load_conversation, span_start
2 changes: 1 addition & 1 deletion chatdku/django/chatdku_django/chat/tools.py
@@ -1,4 +1,4 @@
from chatdku.core.tools.llama_index import KeywordRetrieverOuter, VectorRetrieverOuter
from chatdku.core.tools.llama_index_tools import KeywordRetrieverOuter, VectorRetrieverOuter
from chatdku.core.tools.syllabi_tool.query_curriculum_db import QueryCurriculumOuter


2 changes: 1 addition & 1 deletion chatdku/django/chatdku_django/chat/views.py
@@ -30,7 +30,7 @@
from rest_framework.views import APIView

from chatdku.core.agent import Agent
from chatdku.core.tools.llama_index import KeywordRetrieverOuter, VectorRetrieverOuter
from chatdku.core.tools.llama_index_tools import KeywordRetrieverOuter, VectorRetrieverOuter
from chatdku.core.tools.syllabi_tool.query_curriculum_db import QueryCurriculumOuter
from chat.tools import get_tools

43 changes: 40 additions & 3 deletions devsync.sh
@@ -1,5 +1,12 @@
#!/usr/bin/env bash
# devsync.sh — rsync local sources to the dev server, then drop into an interactive agent session.
# devsync.sh — rsync local sources to the dev server, then run Python remotely.
#
# Usage:
#   ./devsync.sh                      # runs: python -m chatdku.core.agent
#   ./devsync.sh path/to/file.py      # runs: python path/to/file.py
#   ./devsync.sh chatdku.core.agent   # runs: python -m chatdku.core.agent
# Arguments with `/` or a `.py` suffix are treated as file paths; everything
# else is treated as a module name and run with `python -m`.
set -euo pipefail

BOLD="\033[1m"
@@ -21,6 +28,34 @@ SERVER="${CHATDKU_SERVER:-${_SSH_USER}@10.200.14.82}"
REMOTE_DIR="${CHATDKU_REMOTE_DIR:-~/ChatDKU-DevSync}"
LOCAL_DIR="$(git rev-parse --show-toplevel)"

# Accept a leading `-m` / `--module` flag for familiarity; we always decide
# file-vs-module from the argument shape below.
if [[ "${1:-}" == "-m" || "${1:-}" == "--module" ]]; then
  shift
fi

TARGET="${1:-}"
if [[ -n "$TARGET" ]]; then
  if [[ "$TARGET" != *"/"* && "$TARGET" != *.py ]]; then
    # Looks like a module (e.g. chatdku.core.agent) — run with -m
    REMOTE_RUN_CMD="uv run python -m $(printf %q "$TARGET")"
    RUN_DESC="python -m $TARGET"
  else
    # Treat as a file path
    if [[ "$TARGET" = /* ]]; then
      TARGET="${TARGET#"$LOCAL_DIR"/}"
    fi
    if [[ ! -f "$LOCAL_DIR/$TARGET" ]]; then
      warn "target '$TARGET' not found under $LOCAL_DIR — syncing anyway"
    fi
    REMOTE_RUN_CMD="uv run python $(printf %q "$TARGET")"
    RUN_DESC="python $TARGET"
  fi
else
  REMOTE_RUN_CMD="uv run python -m chatdku.core.agent"
  RUN_DESC="agent"
fi

step "preparing remote directory $REMOTE_DIR on $SERVER"
ssh "${SERVER}" "mkdir -p ${REMOTE_DIR}"

@@ -40,6 +75,8 @@ info "syncing ${BOLD}$LOCAL_DIR${RESET}${CYAN} → ${BOLD}$SERVER:$REMOTE_DIR"

rsync -avz --delete \
  --exclude='.git/' \
  --exclude='.venv/' \
  --exclude='venv/' \
  --exclude='__pycache__/' \
  --exclude='*.pyc' \
  --exclude='*.egg-info/' \
@@ -52,5 +89,5 @@ rsync -avz --delete \

success "synced"

info "connecting to ${BOLD}$SERVER${RESET}${CYAN} — running agent"
ssh -t "${SERVER}" "bash -l -c 'cd ${REMOTE_DIR} && uv sync && uv run python -m chatdku.core.agent'"
info "connecting to ${BOLD}$SERVER${RESET}${CYAN} — running ${BOLD}${RUN_DESC}${RESET}"
ssh -t "${SERVER}" "bash -l -c 'cd ${REMOTE_DIR} && uv sync && ${REMOTE_RUN_CMD}'"