Navigate and query your Dify knowledge base like a filesystem: ls, cat, grep, find, and search over documents organized by virtual paths.
Author: ki3nd
Type: tool
Github Repo: https://github.com/ki3nd/DifyFS
Github Issues: issues
The idea comes from two sources:
- Mintlify ChromaFS: treats a vector store as a filesystem, assigning each document a slug path (e.g. guides/quickstart) and using the vector DB as both a coarse filter and a content store. Grep works in two stages: a coarse filter via vector similarity, then a fine line-by-line regex.
- vkfs: a Go implementation of the same idea, supporting SQLite and Zilliz backends.

difyfs adapts the same virtual path model and two-stage grep to work on top of the Dify Knowledge Base API.
Each document in your dataset gets a slug metadata field — its virtual path in the filesystem. Tools then navigate and read documents by slug.
Credentials required:
| Field | Description |
|---|---|
| Service API Endpoint | Your Dify instance API base URL, e.g. https://api.dify.ai/v1 |
| API Key | A dataset-scoped API key from your Dify workspace |
Set the slug (and any other metadata) on a document. This is the first step before using any other tool.
dataset_id: <your-dataset-id>
document_id: <document-id>
key: slug
value: guides/quickstart
ls / → top-level dirs and files
ls guides → contents of guides/
ls guides/api → contents of guides/api/
Output:
/guides
api/
quickstart
reference/
Returns the full text of a document by its slug path. Content is reconstructed by joining all text chunks in order.
cat guides/quickstart
Note: difyfs assumes datasets are chunked without overlap. If your dataset uses chunk overlap, cat output will contain duplicated text at chunk boundaries.
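The reconstruction step and the overlap caveat can be illustrated with a minimal sketch. The `join_chunks` helper, the `(position, text)` pair shape, and the separator are assumptions made for illustration; they are not taken from the difyfs source.

```python
def join_chunks(segments, sep="\n"):
    """Rebuild document text from (position, text) pairs by joining them in position order."""
    ordered = sorted(segments, key=lambda seg: seg[0])
    return sep.join(text for _, text in ordered)

# Chunks may arrive out of order; the position restores the original sequence.
clean = [(2, "Then call the API."), (1, "Install the plugin.")]

# The same document chunked with overlap: "Then" ends chunk 1 and starts chunk 2.
overlapping = [(1, "Install the plugin. Then"), (2, "Then call the API.")]

print(join_chunks(clean, sep=" "))        # Install the plugin. Then call the API.
print(join_chunks(overlapping, sep=" "))  # Install the plugin. Then Then call the API.
```

A plain join has no way to tell which text at a boundary is repeated, which is why overlapping chunks show up as duplicates.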
stat guides/quickstart → word count, tokens, indexing status, created_at, metadata
stat guides → type: directory, child count
Searches line by line, returns path:lineNum — line output.
grep dataset_id=<id> pattern=access_token path=guides
Two modes:
- Single-file mode: when path matches an exact slug. Fetches all chunks and applies the regex. 100% accurate.
- Directory mode: when path is a prefix or empty. Uses Dify full-text search as a coarse filter (top_k segments), then applies the regex line by line. Best-effort: recall depends on top_k.
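The sketch below shows why directory mode is best-effort. `coarse_filter` is a hypothetical stand-in for the full-text search (it ranks chunks by term count), and `fine_grep` and the sample chunks are illustrative only. None of these names come from difyfs or the Dify API.

```python
import re

def fine_grep(pattern, candidates):
    """Apply a regex line by line to (path, text) pairs; return (path, line_no, line) hits."""
    rx = re.compile(pattern)
    hits = []
    for path, text in candidates:
        for line_no, line in enumerate(text.splitlines(), start=1):
            if rx.search(line):
                hits.append((path, line_no, line))
    return hits

def coarse_filter(query, chunks, top_k):
    """Hypothetical coarse stage: rank chunks by how often the query occurs, keep top_k."""
    ranked = sorted(chunks, key=lambda c: c[1].count(query), reverse=True)
    return ranked[:top_k]

chunks = [
    ("guides/auth", "Send the access_token header.\naccess_token expires hourly."),
    ("guides/api/endpoints", "GET /me\nRequires access_token."),
    ("guides/quickstart", "Install the plugin."),
]

# With top_k=1 the match in guides/api/endpoints is never seen by the regex stage.
narrow = fine_grep(r"access_token", coarse_filter("access_token", chunks, top_k=1))
wide = fine_grep(r"access_token", coarse_filter("access_token", chunks, top_k=2))
print(len(narrow), len(wide))  # 2 3
```

Raising top_k widens the candidate set and improves recall, at the cost of fetching and scanning more segments.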
Glob matching on the filename part of each slug. Supports * and ?.
find dataset_id=<id> name_pattern=*.md path=guides
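As a rough illustration, filename globbing over slugs can be sketched with Python's standard `fnmatch` module. The `find` function and sample slugs are hypothetical, and case-sensitive matching is an assumption. Note that `fnmatch` also accepts `[seq]` character classes, which is more than the `*` and `?` described above.

```python
from fnmatch import fnmatchcase

def find(slugs, name_pattern, path=""):
    """Return slugs under `path` whose last path component matches the glob."""
    prefix = path.strip("/")
    matches = []
    for slug in slugs:
        # Keep only slugs at or below the requested directory.
        if prefix and not (slug == prefix or slug.startswith(prefix + "/")):
            continue
        filename = slug.rsplit("/", 1)[-1]
        if fnmatchcase(filename, name_pattern):
            matches.append(slug)
    return matches

slugs = ["guides/intro.md", "guides/api/endpoints.md", "guides/quickstart", "notes/todo.md"]
print(find(slugs, "*.md", "guides"))  # ['guides/intro.md', 'guides/api/endpoints.md']
print(find(slugs, "quick?tart"))      # ['guides/quickstart']
```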
search dataset_id=<id> query="authentication flow" search_method=semantic_search path=guides
Returns matching chunks with virtual file path, relevance score, and a 300-character preview.
A slug is a /-separated path string stored as document metadata:
guides/quickstart → file at /guides/quickstart
guides/api/endpoints → file at /guides/api/endpoints
Virtual directories are derived automatically — any common prefix becomes a navigable directory. There is no need to create directory entries explicitly.
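A minimal sketch of how a directory listing can be derived from slug strings alone, with no stored directory entries. The `list_dir` function and the sample slugs are illustrative and not taken from the difyfs source.

```python
def list_dir(slugs, path=""):
    """Return the immediate children of `path`; directories carry a trailing slash."""
    prefix = [p for p in path.strip("/").split("/") if p]
    children = set()
    for slug in slugs:
        parts = [p for p in slug.strip("/").split("/") if p]
        # Skip slugs that are not strictly below the requested directory.
        if len(parts) <= len(prefix) or parts[:len(prefix)] != prefix:
            continue
        child = parts[len(prefix)]
        if len(parts) == len(prefix) + 1:
            children.add(child)        # the slug ends here, so it is a file
        else:
            children.add(child + "/")  # deeper components exist, so it is a directory
    return sorted(children)

slugs = ["guides/quickstart", "guides/api/endpoints", "guides/reference/errors", "changelog"]
print(list_dir(slugs, "guides"))  # ['api/', 'quickstart', 'reference/']
print(list_dir(slugs, "/"))       # ['changelog', 'guides/']
```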
Use metadata_set to assign slugs to documents before navigating the filesystem. Documents without a slug fall back to using their document name as the path, placed at root.
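The fallback rule can be sketched as follows. The `virtual_path` function and the plain-dict metadata shape are assumptions for illustration; the actual Dify metadata representation may differ.

```python
def virtual_path(name, metadata):
    """Resolve a document's virtual path: its slug if set, else its name at root."""
    slug = (metadata.get("slug") or "").strip("/")
    return "/" + (slug or name)

print(virtual_path("Quickstart", {"slug": "guides/quickstart"}))  # /guides/quickstart
print(virtual_path("README.md", {}))                              # /README.md
```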
- No overlap support in cat: chunk overlap produces duplicate text at boundaries. Use overlap = 0 when configuring your dataset chunking.
- Directory mode grep is best-effort: Dify's retrieve API returns at most top_k chunks. Segments not in the top-k are not searched.
- No write operations: difyfs is read-only by design. Document creation and deletion are not supported.
- Group-based filtering: a group metadata field on documents could be used to scope ls, find, grep, and search to a named group (e.g. group=engineering).
- Public/private access control: a public metadata field (true/false) could let tools filter out private documents, enabling basic visibility control within a shared dataset.