Cut LLM costs by ~10–30%—just by changing orchestration.
This project demonstrates how orchestration strategy—not model choice—drives LLM cost and efficiency.
It compares how different pipeline designs affect:
- 🔢 Token usage
- 💰 Cost
- ⚡ Efficiency
across three pipeline designs:
- 🧵 Weft-style orchestration
- 🔁 Map-Reduce orchestration
- 📦 Python full-buffer baseline (control)
👉 Same input. Same model.
👉 Only orchestration changes.
Courtesy: https://github.com/WeaveMindAI/weft
Weft-style orchestration focuses on:
- Passing only the data needed at each step
- Avoiding repeated context sharing
- Using structured, minimal data flow
💡 In this project, we simulate Weft-style orchestration in Python to demonstrate its impact on token efficiency.
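To make this concrete, here is a minimal sketch of that simulation. All names, sample data, and the word-count token approximation are invented for illustration; the project's real pipelines and tokenizer differ.

```python
# Rough illustration: why passing only the needed fields cuts tokens.
# Token counting is approximated by whitespace word count here; a real
# pipeline would use the model's tokenizer.

def count_tokens(text: str) -> int:
    return len(text.split())

DOCS = [
    {"id": i, "title": f"Doc {i}", "body": "lorem ipsum " * 50}
    for i in range(5)
]

def full_buffer_prompts():
    """Baseline: every step re-sends the entire corpus."""
    corpus = "\n".join(d["body"] for d in DOCS)
    steps = ["summarize", "extract entities", "rank"]
    return [f"{step}:\n{corpus}" for step in steps]

def weft_style_prompts():
    """Weft-style: each step receives only the field it needs."""
    summaries = [d["body"][:100] for d in DOCS]  # stand-in for step-1 output
    return [
        "summarize:\n" + "\n".join(d["body"] for d in DOCS),   # one full pass
        "extract entities:\n" + "\n".join(summaries),          # summaries only
        "rank:\n" + "\n".join(d["title"] for d in DOCS),       # titles only
    ]

baseline = sum(count_tokens(p) for p in full_buffer_prompts())
weft = sum(count_tokens(p) for p in weft_style_prompts())
print(f"baseline={baseline} weft={weft} saved={1 - weft / baseline:.0%}")
```

The saving comes entirely from the shape of the data flow: the model, the input, and the number of steps are identical in both functions.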
Most LLM pipelines are inefficient because they:
- ❌ Re-send the same context repeatedly
- ❌ Grow token usage at every step
- ❌ Increase cost without improving output
Weft-style orchestration fixes this by:
- ✅ Sending only relevant data
- ✅ Avoiding full-buffer context passing
- ✅ Reducing unnecessary token duplication
The benchmark reports:
- Input / Output / Total tokens
- Estimated cost per approach
- ✅ Tokens saved (vs baseline)
- ✅ Cost saved
- 🚀 % cost reduction (Weft vs Map-Reduce)
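A sketch of how the percent reduction can be computed. The token totals below are placeholders for illustration, not measured results:

```python
# Placeholder total-token counts for the three pipelines (assumed
# values; the real numbers come from running the benchmark).
totals = {"full_buffer": 128_000, "map_reduce": 96_000, "weft": 53_000}

def reduction(baseline: int, candidate: int) -> float:
    """Percent of tokens saved relative to the baseline."""
    return (baseline - candidate) / baseline * 100

for name in ("map_reduce", "weft"):
    pct = reduction(totals["full_buffer"], totals[name])
    print(f"{name}: {pct:.1f}% fewer tokens than full-buffer")
```

The same function applied to dollar costs instead of token counts yields the cost-reduction column.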
Runs three orchestration pipelines:
- Full-buffer baseline
- Map-Reduce
- Weft-style
Measures:
- Input tokens
- Output tokens
- Total tokens
Computes:
- Cost using `configs/pricing.yaml`
- Absolute savings
- Percentage (%) reduction
Displays everything in a side-by-side comparison UI
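As an illustration of the cost step: the actual schema of `configs/pricing.yaml` is not shown in this README, so the dict below stands in for its parsed contents, with assumed field names and placeholder prices.

```python
# Hypothetical pricing lookup. PRICING stands in for the parsed
# contents of configs/pricing.yaml; field names and prices are
# assumptions for this sketch, not the project's real config.
PRICING = {
    "claude-3-5-sonnet": {"input_per_million": 3.00, "output_per_million": 15.00},
}

def estimate_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Estimate USD cost from token counts and the pricing table."""
    p = PRICING[model]
    return (tokens_in * p["input_per_million"]
            + tokens_out * p["output_per_million"]) / 1_000_000

def savings(baseline_cost: float, candidate_cost: float) -> tuple[float, float]:
    """Return (absolute savings, percent reduction) vs the baseline."""
    saved = baseline_cost - candidate_cost
    return saved, saved / baseline_cost * 100
```

For example, `estimate_cost("claude-3-5-sonnet", 100_000, 5_000)` returns 0.375 under these placeholder prices.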
python scripts.py seed-demo-data
python scripts.py benchmark --model claude-3-5-sonnet
python backend/main.py
Open the app and click Try Live Demo to launch the main UI.
For deeper understanding of the system design:
- 🧠 Orchestration Design — explains how Weft-style and baseline pipelines are structured
- 📄 Design Goals — outlines the objectives, constraints, and comparison methodology
Key takeaways:
- 🔥 Context duplication is the real cost driver
- 🧠 Pipeline design directly impacts token usage
- ⚡ Structured data flow > raw text passing
- 💰 Reduce cost without changing models
- 📊 Measure efficiency using tokens, cost, and % reduction
Applies to:
- RAG pipelines
- Agents
- Multi-step workflows
- Tool-using systems
❌ “Which model is cheaper?”
✅ “Why am I sending so much data?”
Better orchestration beats better models (for cost). The cheapest token is the one you never send.
