diff --git a/docs/memory/phase-plan.md b/docs/memory/phase-plan.md
new file mode 100644
index 00000000..ece84cf3
--- /dev/null
+++ b/docs/memory/phase-plan.md
@@ -0,0 +1,230 @@
+# HUF Memory Architecture: Implementation Phase Plan
+
+> **Status:** Active — Phase 1 in progress (PR #275)
+> **Last updated:** 2026-05-29
+> **Related:** [zero-to-hero.md](./zero-to-hero.md) | [RFC](../SCOPED_MEMORY_KNOWLEDGE_BRIDGE_RFC.md)
+
+---
+
+## Overview
+
+```
+Phase 0 → Architecture alignment (done)
+Phase 1 → Canonical Memory Record + tools (PR #275, in progress)
+Phase 2 → Policy enforcement: inject + auto-promote
+Phase 3 → Learning: post-run extraction
+Phase 4 → Learning profiles + learning agent formalization
+Phase 5 → Retrieval upgrades: hybrid search, metadata filters
+```
+
+---
+
+## Phase 0 — Architecture Alignment ✅ Done
+
+**Branch:** `docs/scoped-memory-knowledge-bridge` | **PR:** #274
+
+RFC defining the three-layer model, terminology, phase roadmap.
+
+---
+
+## Phase 1 — Canonical Memory Record + Tools 🔄 In Progress
+
+**Branch:** `feature/scoped-memory-core` | **PR:** #275
+
+### What is included
+
+**DocTypes:**
+- `Memory Record` — full schema: scopes, visibility, lifecycle, projection fields, quality signals
+- `Memory Policy` — config shell for future enforcement. Schema complete, no runtime enforcement yet.
+
+**Backend tools (whitelisted, agent-callable):**
+- `save_memory_record` — scoped write with permission enforcement
+- `get_memory_record` — scoped read with permission enforcement
+- `search_memory_records` — multi-scope search, query filtering, limit cap
+- `archive_memory_record` — sets status to Archived, checks both read + write permission
+- `promote_memory_to_knowledge` — manager-only, queues projection to Knowledge Input
+
+**Knowledge projection pipeline:**
+- Memory Record → formatted text → Knowledge Input (input_type=Text) → Knowledge Source queue
+- Projection status: `Not Indexed → Queued → Projected → Error / Removed`
+- `Projected` means Memory Record has been handed to Knowledge Input pipeline
+- Actual indexing status owned by Knowledge Input (Pending → Processing → Indexed → Error)
+- Re-projection updates existing Knowledge Input rather than creating duplicates
+
+**Permission model:**
+- Desk access to Memory Record: System Manager and Huf Manager only
+- User/agent access: through tool-level scope/visibility filtering only
+- Normal users: can write Conversation and their own User memory
+- Managers: can write any scope, promote to knowledge
+
+**Native tool wiring:**
+- 5 new types in `agent_tool_function.json`: Save Memory Record, Search Memory Records, Get Memory Record, Archive Memory Record, Promote Memory to Knowledge
+- Each type maps to the corresponding handler in `huf/ai/memory_tools.py` via `sdk_tools.py`
+
+### What is explicitly NOT included
+
+- Memory Policy runtime enforcement (config shell only)
+- Automatic memory capture from runs
+- Frontend memory tab or UI (Desk only, manager-visible)
+- New vector DB logic
+- Hindsight-style retain/recall/reflect
+
+### Definition of done
+
+- [ ] `python -m py_compile` passes on all three new Python files
+- [ ] `bench migrate` applies cleanly
+- [ ] Memory Record can be created, activated, promoted to Knowledge
+- [ ] Projection status shows `Projected` after queuing
+- [ ] Normal user cannot create Role/Site/Global memory via tools
+- [ ] Manager can promote memory to an existing Knowledge Source
+- [ ] Agent Tool Function can use Save Memory Record and Search Memory Records types
+
+---
+
+## Phase 2 — Policy Enforcement: Inject + Auto-promote
+
+**Depends on:** Phase 1 merged
+
+### Goal
+
+Make Memory Policy do something at runtime. Focus on the two highest-value paths: injecting memory into agent context before a run, and auto-promoting records that meet quality thresholds.
+
+### What is included
+
+**Agent-level policy linking:**
+- Add `memory_policy` Link field to Agent DocType
+- Add `default_memory_policy` Link field to Agent Settings (singleton) for site-wide fallback
+- Policy resolver: `resolve_memory_policy(agent_name)` → Agent Policy → Site Default → None
+
+**New module: `huf/ai/memory_policy_resolver.py`**
+- `resolve_memory_policy(agent_name)` — returns effective MemoryPolicy doc or None
+- `get_injectable_memory(agent_name, conversation_id, policy)` — returns Memory Records within token budget
+- `build_memory_context_block(records, policy)` — formats records for system prompt injection
+
+**Hook into `agent_integration.py`:**
+- Before run: if `inject_mode == "Append to System Prompt"` → add memory context block to system prompt
+- After run: if `auto_promote_to_knowledge` → check records meeting thresholds → queue projection
+
+**Injection modes:**
+- `None` — no injection (default, backward-compatible)
+- `Append to System Prompt` — memory records added as structured block
+- `Tool Available` — no injection, but `search_memory_records` tool is available
+
+**Token budget enforcement:**
+- Records sorted by `importance_score desc, modified desc`
+- Trimmed to fit within `token_budget` (estimated 4 chars/token)
+
+---
+
+## Phase 3 — Learning: Post-run Extraction
+
+**Depends on:** Phase 2 merged
+
+### Goal
+
+Let Memory Policy control when and how the system extracts Memory Records from completed runs. Optionally delegate extraction to a dedicated learning agent.
+
+### Memory Policy — new Learning section fields
+
+- `learning_enabled` (Check)
+- `learning_trigger` (Select: Manual | End of Conversation | Every N Turns)
+- `turns_per_extraction` (Int)
+- `learning_agent` (Link → Agent)
+- `extracted_record_default_status` (Select: Draft | Active)
+- `extraction_model` (Data — optional model override for built-in extraction)
+
+### New module: `huf/ai/memory_extractor.py`
+
+- `extract_memories_from_run(agent_run_id, memory_policy_name)`
+- Assembles transcript from Agent Messages
+- If `learning_agent` set: routes to that agent via `run_agent_sync()`, parses output
+- If not: calls built-in extraction prompt against base model
+- Saves extracted records with source_type = "Agent Run"
+
+**Trigger hooks:**
+- End of Conversation: hook on Agent Conversation `on_update` when status → Closed/Complete
+- Every N Turns: hook on Agent Message `after_insert`, fire when threshold reached
+- Manual: whitelist endpoint `trigger_memory_extraction(agent_run_id)`
+
+**Draft-first safety:**
+- If `approval_required = True`, always Draft regardless of `extracted_record_default_status`
+
+---
+
+## Phase 4 — Learning Profiles + Learning Agent Formalization
+
+**Depends on:** Phase 3 merged
+
+### Built-in presets (seeded at install)
+
+| Profile | capture_mode | inject_mode | approval_required | learning_trigger |
+|---------|-------------|-------------|-------------------|-----------------|
+| Minimal | Manual | None | Yes | Manual |
+| Conversational | Auto | Append to System Prompt | No | End of Conversation |
+| Research | Both | Tool Available | Yes | End of Conversation |
+| Operational | Auto | Append to System Prompt | No | Every N Turns (5) |
+
+**Agent DocType gains `agent_role`** (Select: General | Learning | Orchestrator | ...)
+
+Memory Policy `learning_agent` field filtered to agents with `agent_role = Learning`.
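
The Phase 4 presets and the Phase 3 draft-first rule could be captured as plain data plus one small helper. This is only a sketch under stated assumptions, not the seeding code from the PR: `LEARNING_PROFILES` and `resolve_extracted_status` are hypothetical names, inject modes are written out in full as `Append to System Prompt`, and placing `turns_per_extraction` on the profile is an assumption.

```python
# Hypothetical seed data mirroring the Phase 4 preset table (names are illustrative).
LEARNING_PROFILES = {
    "Minimal": {
        "capture_mode": "Manual", "inject_mode": "None",
        "approval_required": True, "learning_trigger": "Manual",
    },
    "Conversational": {
        "capture_mode": "Auto", "inject_mode": "Append to System Prompt",
        "approval_required": False, "learning_trigger": "End of Conversation",
    },
    "Research": {
        "capture_mode": "Both", "inject_mode": "Tool Available",
        "approval_required": True, "learning_trigger": "End of Conversation",
    },
    "Operational": {
        "capture_mode": "Auto", "inject_mode": "Append to System Prompt",
        "approval_required": False, "learning_trigger": "Every N Turns",
        "turns_per_extraction": 5,  # assumption: the "(5)" in the table maps to this field
    },
}


def resolve_extracted_status(approval_required: bool, default_status: str = "Draft") -> str:
    """Draft-first rule: approval always forces Draft, whatever the configured default."""
    if approval_required:
        return "Draft"
    # Fall back to Draft for any value outside the documented Select options.
    return default_status if default_status in {"Draft", "Active"} else "Draft"


if __name__ == "__main__":
    for name, profile in LEARNING_PROFILES.items():
        status = resolve_extracted_status(profile["approval_required"], "Active")
        print(f"{name}: extracted records start as {status}")
```

Keeping the presets as data means a profile can be reviewed or changed without touching the status logic, which stays a single pure function.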
+
+**Agent Settings gains `default_learning_profile`** for site-wide default.
+
+---
+
+## Phase 5 — Retrieval Upgrades: Hybrid Search + Metadata Filters
+
+**Depends on:** Phase 2 merged (independent of Phases 3–4)
+
+### What is included
+
+**Hybrid search (per Knowledge Source):**
+- New `search_mode` field: FTS | Vector | Hybrid
+- Hybrid runs both FTS + vector, combines scores (reciprocal rank fusion)
+- Requires embedding to be configured
+
+**Metadata filters in retrieval:**
+- Knowledge Input gains `metadata_json` field
+- During memory projection: tags + scope_type + record_type written to metadata
+- `knowledge_search()` accepts `metadata_filters` dict
+
+**Memory-specific search helper:**
+- `search_memory_knowledge(query, agent_name, filters)` — searches Knowledge Sources containing projected memory, attributes results back to source Memory Record
+
+**New backend abstractions:**
+- `supports_metadata_filters()` method on `KnowledgeBackend`
+- `supports_hybrid_search()` method on `KnowledgeBackend`
+- Future backends (pg_vector, Qdrant) can implement both
+
+**Chunk cleanup on Knowledge Input deletion:**
+- `KnowledgeInput.on_trash()` calls `backend.delete_chunks(input_id)`
+- Ensures removing a projected Memory Record removes indexed content
+
+---
+
+## Cross-cutting concerns
+
+### Security (all phases)
+
+- Memory Record Desk access: Managers only
+- Tool-level access: scoped per user/role/agent context
+- Wider scopes (Role, Site, Global): Managers only
+- Knowledge promotion: Managers only
+- Extracted draft records: Manager review before activation
+- User-scoped memory: never leaks to role/site/global search
+
+### Backward compatibility
+
+- Phase 1: new DocTypes, plus five new options on the Agent Tool Function `types` Select; no other changes to existing DocTypes
+- Phase 2: optional fields added to Agent and Agent Settings
+- Phase 3: optional fields added to Memory Policy
+- Phase 5: optional fields added to Knowledge Source and Knowledge Input
+- No existing Knowledge Source or Agent Tool Function behavior changes without opt-in
+
+### Open questions
+
+1. Should memory injection be visible to users in chat UI? (transparency / trust)
+2. Should users be able to view and delete their own User-scoped memory?
+3. How should contradictory memory records be handled in Phase 3+?
+4. Should Data Tables be eligible as memory sources in a future phase?
+5. What telemetry should capture memory injection quality and extraction cost?
diff --git a/docs/memory/zero-to-hero.md b/docs/memory/zero-to-hero.md
new file mode 100644
index 00000000..08e9db74
--- /dev/null
+++ b/docs/memory/zero-to-hero.md
@@ -0,0 +1,323 @@
+# HUF Memory Architecture: Zero to Hero
+
+> **Who this is for:** Anyone new to the memory/knowledge area of HUF — engineers, contributors, or agents reading this cold. This document captures the full intellectual journey: where the ideas came from, what was tried, what was decided, and why. Read this before touching any memory-related code.
+
+---
+
+## 1. The problem we are solving
+
+HUF started as a conversational AI platform. Agents talk to users, run tools, and produce results. But every conversation was a blank slate. Agents had no memory of what they'd learned, no way to build up preferences or patterns over time, and no way to carry useful facts from one session to the next.
+
+This became a real limitation:
+
+- A user tells an agent their preferences in one conversation. The agent has forgotten them in the next.
+- An agent learns a reliable routing pattern after many runs. That learning disappears.
+- Research done in one session cannot be made available as searchable knowledge for other agents.
+- There is no way to say "this fact, learned from a conversation, should be considered authoritative at the site level."
+
+The gap was: **HUF had Knowledge Sources (static documents, PDFs, URLs) and conversation-local working data, but nothing in between** — no layer for learned, scoped, evolving data that could optionally become searchable.
+
+---
+
+## 2. The reference systems that shaped the design
+
+Before arriving at the current architecture, several external systems were studied carefully. Understanding these is essential for understanding why HUF's design looks the way it does.
+
+### 2.1 Agno (formerly phidata)
+
+Agno is an open-source Python framework for building multi-modal AI agents. It has a clean separation between:
+
+- **Agent memory**: short-term per-session state
+- **Agent knowledge**: structured, indexed, searchable content (PDFs, URLs, tables, text)
+- **Agent storage**: long-term persistence of runs and sessions
+
+Agno's knowledge system uses a `Knowledge` class with pluggable readers (PDFReader, URLReader, etc.) and vector stores (pgvector, Qdrant, Pinecone, LanceDB, etc.). It separates the *reader* (how you get text from a source) from the *store* (how you index and retrieve it).
+
+**What HUF borrowed from Agno:**
+- The concept of pluggable knowledge backends (sqlite_fts, sqlite_vec, chroma — all implement a common `KnowledgeBackend` ABC)
+- The idea that agents should be able to search knowledge before responding
+- The pattern of separating ingestion from retrieval
+
+**What HUF did differently:**
+- HUF wraps this in Frappe DocTypes (Knowledge Source, Knowledge Input) so it integrates with the rest of the platform's permissions, workflows, and UI
+- HUF has a stronger multi-tenancy and scoping requirement (User, Role, Agent, Site, Global)
+
+### 2.2 Hindsight (memory consolidation pattern)
+
+Hindsight is a research-inspired design pattern for long-term agent memory. The core idea is three operations:
+
+- **retain**: extract and save something worth remembering from a conversation
+- **recall**: retrieve relevant memory when starting a new conversation
+- **reflect**: periodically consolidate, deduplicate, and upgrade memory quality
+
+Hindsight-style systems typically run as a background process after each conversation. A second LLM call (the "reflection agent") reads the transcript and decides what to save.
+
+**What HUF borrowed from Hindsight:**
+- The idea of a dedicated "learning agent" that reads transcripts and extracts memory
+- The concept of a learning trigger (end of conversation, every N turns)
+- The draft → review → active lifecycle for extracted memories
+
+**What HUF is NOT doing (yet):**
+- Hindsight's reflect step (periodic consolidation across many memories) is not implemented
+- Automatic contradiction detection is not implemented
+- Memory decay and supersession are manual, not automatic
+
+### 2.3 Mem0 / MemGPT patterns
+
+These systems maintain a dedicated memory layer that agents read from and write to during a conversation, with a structured schema for different memory types (episodic, semantic, procedural).
+
+**What HUF borrowed:**
+- The scoped record type model (Fact, Preference, Decision, Pattern, Research, Instruction)
+- The importance score + confidence fields for quality filtering
+- The visibility model (Private, Shared with Role, Shared with Agent, Site, Global)
+
+---
+
+## 3. How HUF's existing knowledge system works
+
+Before memory makes sense, you need to understand knowledge. Here is the actual pipeline:
+
+```
+Knowledge Source (DocType)
+├── knowledge_type: sqlite_fts | sqlite_vec | chroma
+├── embedding_model (for vector backends)
+└── Knowledge Inputs []
+    ├── input_type: Text | File | URL
+    ├── status: Pending | Processing | Indexed | Error
+    └── text / file / url content
+
+Indexing pipeline (huf/ai/knowledge/indexer.py):
+  Knowledge Input → extract text → chunk → embed (if vector) → store in backend
+
+Retrieval pipeline (huf/ai/knowledge/retriever.py):
+  query → embed query → search backend → return ChunkResult[]
+```
+
+**Three backends exist today, all implementing `KnowledgeBackend` ABC:**
+
+| Backend | Type | When to use | Dependencies |
+|---------|------|-------------|--------------|
+| `sqlite_fts` | Keyword (BM25) | Always available, no GPU needed | None |
+| `sqlite_vec` | Vector (semantic) | When you need semantic similarity | pysqlite3-binary + sqlite-vec |
+| `chroma` | Vector (semantic) | When you want a separate vector store, optionally server-mode | chromadb + llama-index |
+
+**Adding a new backend** (e.g., pg_vector) means:
+1. Implement `KnowledgeBackend` ABC in `huf/ai/knowledge/backends/`
+2. Register it in `get_backend()` in `__init__.py`
+3. Add the option to Knowledge Source's `knowledge_type` Select field
+
+The memory layer **never needs to change** when backends are added or removed.
+
+---
+
+## 4. The memory architecture
+
+### 4.1 The three-layer model
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│ Conversation Data (temporary, per-session)                          │
+│ → selected items, form values, current state, agent working memory  │
+│ → lives in Agent Conversation / run context only                    │
+└────────────────────────────┬────────────────────────────────────────┘
+                             │ manual save or policy-triggered extract
+                             ▼
+┌─────────────────────────────────────────────────────────────────────┐
+│ Memory Record (canonical, scoped, governed)                         │
+│ → Fact, Preference, Decision, Pattern, Research, Instruction        │
+│ → scoped: Conversation / User / Role / Agent / Site / Global        │
+│ → governed by Memory Policy                                         │
+└────────────────────────────┬────────────────────────────────────────┘
+                             │ promote_to_knowledge (explicit or auto)
+                             ▼
+┌─────────────────────────────────────────────────────────────────────┐
+│ Knowledge Source (indexed, searchable)                              │
+│ → sqlite_fts | sqlite_vec | chroma | future: pg_vector...           │
+│ → used by agents via mandatory/optional knowledge linking           │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+**Key principle:** Memory Record does not know about backends. It targets a Knowledge Source. The Knowledge Source owns the backend type. This means new backends require zero changes to memory tools or Memory Policy.
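
The key principle above can be illustrated with a toy model of the decoupling. This is a hedged sketch, not HUF's code: the `KnowledgeBackend` methods shown here (`index_text`, `search`), the `InMemoryKeywordBackend` class, the dataclasses, and `project()` are all hypothetical stand-ins, and the real ABC in `huf/ai/knowledge/backends/` may expose different names.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class KnowledgeBackend(ABC):
    """Hypothetical stand-in for the backend ABC; real method names may differ."""

    @abstractmethod
    def index_text(self, input_id: str, text: str) -> None: ...

    @abstractmethod
    def search(self, query: str, limit: int = 5) -> list[str]: ...


class InMemoryKeywordBackend(KnowledgeBackend):
    """Toy backend: case-insensitive substring match over stored text."""

    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def index_text(self, input_id: str, text: str) -> None:
        self._docs[input_id] = text

    def search(self, query: str, limit: int = 5) -> list[str]:
        needle = query.lower()
        return [i for i, t in self._docs.items() if needle in t.lower()][:limit]


@dataclass
class KnowledgeSource:
    name: str
    backend: KnowledgeBackend  # the source owns the backend choice


@dataclass
class MemoryRecord:
    name: str
    title: str
    summary_text: str


def project(record: MemoryRecord, source: KnowledgeSource) -> str:
    """Hand a record to its target source; the record never inspects the backend type."""
    input_id = f"memory:{record.name}"
    source.backend.index_text(input_id, f"{record.title}\n{record.summary_text}")
    return input_id


if __name__ == "__main__":
    source = KnowledgeSource("Team Memory", InMemoryKeywordBackend())
    record = MemoryRecord("MR-0001", "Preferred currency", "User prefers invoices in EUR")
    print(project(record, source), source.backend.search("eur"))
```

Swapping `InMemoryKeywordBackend` for any other `KnowledgeBackend` subclass leaves `MemoryRecord` and `project()` untouched, which is the property the design relies on.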
+
+### 4.2 Memory Record scopes
+
+| Scope | Who can write | Who can read | scope_key value |
+|-------|--------------|--------------|-----------------|
+| Conversation | Any user (in that conversation) | Same conversation only | conversation docname |
+| User | That user only | That user only | frappe.session.user |
+| Role | Managers only | Users with that role + visibility="Shared with Role" | role name |
+| Agent | Managers only (or agent if allowed by policy) | Agent with matching name | agent docname |
+| Site | Managers only | Everyone if visibility="Site" | frappe.local.site |
+| Global | Managers only | Everyone if visibility="Global" | "global" |
+
+### 4.3 Memory Policy
+
+Memory Policy is the governance and behavior config layer. It sits between Memory Records and agent runtime.
+
+**What it configures (fields exist, enforcement is phased):**
+
+```
+Capture:
+  capture_mode: Manual | Auto (on_run_end) | Both
+  approval_required: bool
+  default_status: Draft | Active
+  allowed_record_types: [Fact, Preference, ...]
+
+Retrieval:
+  inject_mode: None | Append to System Prompt | Tool Available
+  max_records: int
+  token_budget: int
+
+Write controls:
+  allow_agent_write: bool
+  allow_user_scope_write: bool
+  allow_role_scope_write: bool (manager override)
+
+Projection:
+  auto_promote_to_knowledge: bool
+  knowledge_source: Link → Knowledge Source
+  promotion_min_confidence: float
+  promotion_min_importance: float
+
+Lifecycle:
+  ttl_days: int
+```
+
+Policy resolution at runtime (Phase 2+): **Agent Policy → Site Default → built-in safe defaults.**
+
+---
+
+## 5. The learning system
+
+### 5.1 What "learning" means here
+
+An agent "learns" when something worth remembering is extracted from a run or conversation and saved as a Memory Record for future use. This is different from fine-tuning the model. It is structured, auditable, and reversible.
+
+### 5.2 How extraction works (Phase 3)
+
+When a learning trigger fires:
+1. The agent run transcript is assembled
+2. If `learning_agent` is set on the Memory Policy, that agent is called with the transcript
+3. If not, a built-in extraction prompt runs against the same base model
+4. Extracted facts/preferences/decisions are saved as Memory Records
+5. Status is `Draft` when `approval_required` is set; otherwise it follows `extracted_record_default_status` (`Draft` or `Active`)
+
+### 5.3 Learning triggers
+
+| Trigger | When it fires |
+|---------|--------------|
+| Manual | Only when explicitly called |
+| End of Conversation | When conversation is closed / marked complete |
+| Every N Turns | After every N agent turns in the conversation |
+
+### 5.4 Learning agent pattern
+
+A "learning agent" is just a regular HUF Agent with a specialized system prompt. It receives a conversation transcript and returns structured memory records. This means:
+- You can use any model for extraction (not necessarily the same as the active agent)
+- You can version and iterate the extraction prompt without touching the main agent
+- You can inspect what the learning agent produces before it becomes Active
+
+---
+
+## 6. What exists today vs what is planned
+
+### Today (after PR #275 is merged)
+
+| Capability | Status |
+|-----------|--------|
+| Memory Record DocType (full schema) | ✅ Done |
+| Memory Policy DocType (schema + validation) | ✅ Done (config shell, no runtime enforcement) |
+| 5 memory tool handlers | ✅ Done |
+| Scoped permission enforcement in tools | ✅ Done |
+| Memory → Knowledge Input projection | ✅ Done |
+| Projection status tracking (`Projected`) | ✅ Done |
+| Manager-only Desk access | ✅ Done |
+| Native tool wiring in Agent Tool Function | ✅ Done |
+| Memory Policy enforcement at runtime | ❌ Phase 2 |
+| Agent-linked memory policy | ❌ Phase 2 |
+| Auto-inject memory into agent context | ❌ Phase 2 |
+| Post-run memory extraction | ❌ Phase 3 |
+| Learning agent delegation | ❌ Phase 3 |
+| Learning profiles (presets) | ❌ Phase 4 |
+| Hybrid FTS + vector search for memory | ❌ Phase 5 |
+
+### Not in scope (ever, by design)
+
+- Automatic promotion of all conversation data to memory (too noisy)
+- Fine-tuning or model weight updates
+- Replacing Frappe permissions with custom auth
+- Memory leaking across user/role/site boundaries
+
+---
+
+## 7. Key files and where to look
+
+| What | Where |
+|------|-------|
+| Memory tool handlers (save/get/search/archive/promote) | `huf/ai/memory_tools.py` |
+| Memory Record controller (validation, projection queue) | `huf/huf/doctype/memory_record/memory_record.py` |
+| Memory Record schema | `huf/huf/doctype/memory_record/memory_record.json` |
+| Memory Policy controller | `huf/huf/doctype/memory_policy/memory_policy.py` |
+| Memory Policy schema | `huf/huf/doctype/memory_policy/memory_policy.json` |
+| Knowledge backend abstraction | `huf/ai/knowledge/backends/__init__.py` |
+| FTS backend | `huf/ai/knowledge/backends/sqlite_fts.py` |
+| Vector backend | `huf/ai/knowledge/backends/sqlite_vec_backend.py` |
+| ChromaDB backend | `huf/ai/knowledge/backends/chroma_backend.py` |
+| Indexing pipeline | `huf/ai/knowledge/indexer.py` |
+| Retrieval pipeline | `huf/ai/knowledge/retriever.py` |
+| Knowledge Source controller | `huf/huf/doctype/knowledge_source/knowledge_source.py` |
+| RFC (architecture decisions) | `docs/SCOPED_MEMORY_KNOWLEDGE_BRIDGE_RFC.md` |
+| Phase plan | `docs/memory/phase-plan.md` |
+
+---
+
+## 8. How to contribute
+
+**Adding a new knowledge backend:**
+1. Implement `KnowledgeBackend` in `huf/ai/knowledge/backends/`
+2. Register in `get_backend()` in `__init__.py`
+3. Add option to Knowledge Source `knowledge_type` field
+4. Add validation in `knowledge_source.py` if dependencies need checking
+
+**Adding a new memory scope or record type:**
+- Scope types: add to `scope_type` Select field in `memory_record.json`, update `can_read()` and `can_write()` in `memory_tools.py`, update scope resolver in `resolved_key()`
+- Record types: add to `record_type` Select field in `memory_record.json` (no code changes needed)
+
+**Implementing a new Memory Policy enforcement (Phase 2+):**
+- Policy resolver logic will live in `huf/ai/memory_policy_resolver.py` (to be created)
+- Reads agent's linked policy or falls back to site default from Agent Settings
+- Hooks into `agent_integration.py` before and after agent runs
+
+**Writing a learning agent:**
+- Create a regular Agent with a specialized system prompt for memory extraction
+- Link it as `learning_agent` in a Memory Policy
+
+---
+
+## 9. Design decisions and their reasons
+
+| Decision | Why |
+|---------|-----|
+| Memory Record doesn't know about backends | Adding pg_vector shouldn't require touching memory tools |
+| Memory Policy is config-shell-first | The schema needs to stabilize before enforcement. Wrong enforcement is worse than no enforcement. |
+| Manager-only Desk access to Memory Records | Desk DocPerm can't enforce per-scope visibility rules. Tool-level access is the correct control path for users. |
+| Projection status = "Projected" not "Indexed" | "Indexed" implies the knowledge pipeline has completed. It hasn't — Knowledge Input processing is async. "Projected" means we've handed it off. |
+| Learning agent is a regular Agent, not special-cased | Reuses the entire Agent infrastructure. Can be versioned, tested, and swapped independently. |
+| Draft default for extracted memory | Automatic extraction without human review is a data quality risk. Draft-first is safer. |
+
+---
+
+## 10. Glossary
+
+| Term | Meaning |
+|------|---------|
+| **Memory Record** | A canonical scoped fact, preference, decision, pattern, or research note. The source of truth for what an agent has learned. |
+| **Memory Policy** | Config governing capture, retrieval, injection, and promotion rules for memory records. Linked to an Agent or set site-wide. |
+| **Knowledge Source** | A HUF DocType representing an indexed, searchable knowledge store. Has a pluggable backend (FTS, vector, chroma). |
+| **Knowledge Input** | A single item (text, file, URL) that gets indexed into a Knowledge Source. |
+| **Knowledge Projection** | The act of converting a Memory Record into a Knowledge Input so it becomes searchable. Status = Projected means it has been handed to the Knowledge Input pipeline. |
+| **Learning Agent** | A regular HUF Agent configured specifically to read transcripts and extract Memory Records. |
+| **Scope** | The boundary within which a Memory Record is valid and visible. (Conversation / User / Role / Agent / Site / Global) |
+| **Visibility** | Fine-grained access control within a scope. (Private / Shared with Role / Shared with Agent / Site / Global) |
+| **Agno-direction** | Pluggable reader/store architecture for knowledge, inspired by the Agno (phidata) framework. |
+| **Hindsight-direction** | Post-run memory extraction pattern with retain/recall/reflect semantics.
|
diff --git a/huf/ai/memory_tools.py b/huf/ai/memory_tools.py
new file mode 100644
index 00000000..21242756
--- /dev/null
+++ b/huf/ai/memory_tools.py
@@ -0,0 +1,148 @@
+import json
+import frappe
+
+MANAGER_ROLES = {"System Manager", "Huf Manager"}
+
+
+def is_manager():
+    return bool(set(frappe.get_roles(frappe.session.user)) & MANAGER_ROLES)
+
+
+def json_value(value):
+    if value in (None, ""):
+        return None
+    if isinstance(value, str):
+        try:
+            json.loads(value)
+            return value
+        except Exception:
+            return json.dumps(value, ensure_ascii=False)
+    return json.dumps(value, ensure_ascii=False, default=str)
+
+
+def resolved_key(scope_type, scope_key=None, conversation_id=None, agent_name=None):
+    if scope_key:
+        return scope_key
+    return {"Conversation": conversation_id, "User": frappe.session.user, "Agent": agent_name, "Site": frappe.local.site, "Global": "global"}.get(scope_type)
+
+
+def can_read(row, conversation_id=None, agent_name=None):
+    if is_manager():
+        return True
+    get = row.get if isinstance(row, dict) else lambda k, d=None: getattr(row, k, d)
+    scope_type = get("scope_type")
+    scope_key = get("scope_key")
+    visibility = get("visibility") or "Private"
+    if scope_type == "Conversation":
+        return conversation_id and scope_key == conversation_id
+    if scope_type == "User":
+        return scope_key == frappe.session.user
+    if scope_type == "Role":
+        return visibility == "Shared with Role" and scope_key in frappe.get_roles(frappe.session.user)
+    if scope_type == "Agent":
+        return agent_name and scope_key == agent_name and visibility in {"Private", "Shared with Agent"}
+    if scope_type == "Site":
+        return visibility == "Site" and scope_key == frappe.local.site
+    if scope_type == "Global":
+        return visibility == "Global" and scope_key == "global"
+    return False
+
+
+def can_write(scope_type, scope_key=None, agent_name=None):
+    if is_manager():
+        return True
+    if scope_type in {"Role", "Workspace", "Site", "Global"}:
+        return False
+    if scope_type == "User":
+        return scope_key == frappe.session.user
+    if scope_type == "Agent":
+        return agent_name and scope_key == agent_name
+    if scope_type == "Conversation":
+        return True
+    return False
+
+
+@frappe.whitelist()
+def save_memory_record(title, summary_text, record_type="Fact", scope_type="Conversation", scope_key=None, data_json=None, status="Draft", visibility="Private", tags=None, confidence=0, importance_score=0, source_type="Manual", conversation_id=None, agent_run_id=None, agent_name=None, promote_to_knowledge=False, knowledge_source=None, raw_context_excerpt=None, **kwargs):
+    key = resolved_key(scope_type, scope_key, conversation_id, agent_name)
+    if not key or not can_write(scope_type, key, agent_name):
+        frappe.throw("Memory write blocked")
+    if promote_to_knowledge and not is_manager():
+        frappe.throw("Knowledge promotion blocked")
+    tag_text = ", ".join(tags) if isinstance(tags, list) else (tags or "")
+    doc = frappe.get_doc({"doctype": "Memory Record", "title": title, "summary_text": summary_text, "record_type": record_type, "scope_type": scope_type, "scope_key": key, "status": status, "visibility": visibility, "tags": tag_text, "confidence": float(confidence or 0), "importance_score": float(importance_score or 0), "source_type": source_type, "conversation": conversation_id if scope_type == "Conversation" else None, "run": agent_run_id, "agent": agent_name, "data_json": json_value(data_json), "raw_context_excerpt": raw_context_excerpt, "promote_to_knowledge": 1 if promote_to_knowledge else 0, "knowledge_source": knowledge_source})
+    doc.insert()
+    if promote_to_knowledge and knowledge_source and doc.status == "Active":
+        doc.queue_knowledge_projection()
+    return {"success": True, "memory_record": doc.name, "status": doc.status, "scope_type": doc.scope_type, "scope_key": doc.scope_key, "projection_status": doc.projection_status}
+
+
+@frappe.whitelist()
+def get_memory_record(memory_record, conversation_id=None, agent_name=None, **kwargs):
+    doc = frappe.get_doc("Memory Record", memory_record)
+    if not can_read(doc, conversation_id, agent_name):
+        frappe.throw("Memory read blocked")
+    return doc.as_dict()
+
+
+@frappe.whitelist()
+def search_memory_records(query=None, record_type=None, scope_type=None, status="Active", limit=10, conversation_id=None, agent_name=None, **kwargs):
+    scopes = []
+    if conversation_id:
+        scopes.append({"scope_type": "Conversation", "scope_key": conversation_id})
+    if frappe.session.user != "Guest":
+        scopes.append({"scope_type": "User", "scope_key": frappe.session.user})
+    scopes += [{"scope_type": "Role", "scope_key": r} for r in frappe.get_roles(frappe.session.user)]
+    if agent_name:
+        scopes.append({"scope_type": "Agent", "scope_key": agent_name})
+    scopes += [{"scope_type": "Site", "scope_key": frappe.local.site}, {"scope_type": "Global", "scope_key": "global"}]
+    if scope_type:
+        scopes = [s for s in scopes if s["scope_type"] == scope_type]
+    results, seen, max_rows = [], set(), min(int(limit or 10), 50)
+    base = {"status": status} if status else {}
+    if record_type:
+        base["record_type"] = record_type
+    for scope in scopes:
+        filters = dict(base)
+        filters.update(scope)
+        rows = frappe.get_all("Memory Record", filters=filters, fields=["name", "title", "record_type", "scope_type", "scope_key", "visibility", "status", "summary_text", "confidence", "importance_score", "tags", "agent", "conversation", "knowledge_source", "projection_status", "modified"], order_by="importance_score desc, modified desc", limit_page_length=max_rows)
+        for row in rows:
+            text = " ".join(str(row.get(f) or "") for f in ["title", "summary_text", "record_type", "tags"])
+            if row.name in seen or (query and query.lower() not in text.lower()) or not can_read(row, conversation_id, agent_name):
+                continue
+            seen.add(row.name)
+            results.append(row)
+            if len(results) >= max_rows:
+                return {"success": True, "results": results}
+    return {"success": True, "results": results}
+
+
+@frappe.whitelist()
+def archive_memory_record(memory_record, conversation_id=None, agent_name=None, **kwargs):
+    doc = frappe.get_doc("Memory Record", memory_record)
+    if not can_write(doc.scope_type, doc.scope_key, agent_name) or not can_read(doc, conversation_id, agent_name):
+        frappe.throw("Memory archive blocked")
+    doc.status = "Archived"
+    doc.save()
+    return {"success": True, "memory_record": doc.name, "status": doc.status}
+
+
+@frappe.whitelist()
+def promote_memory_to_knowledge(memory_record, knowledge_source=None, **kwargs):
+    if not is_manager():
+        frappe.throw("Knowledge promotion blocked")
+    doc = frappe.get_doc("Memory Record", memory_record)
+    if knowledge_source:
+        doc.knowledge_source = knowledge_source
+    doc.promote_to_knowledge = 1
+    if doc.status != "Active":
+        doc.status = "Active"
+    doc.save()
+    return doc.queue_knowledge_projection()
+
+
+handle_save_memory_record = save_memory_record
+handle_get_memory_record = get_memory_record
+handle_search_memory_records = search_memory_records
+handle_archive_memory_record = archive_memory_record
+handle_promote_memory_to_knowledge = promote_memory_to_knowledge
diff --git a/huf/ai/sdk_tools.py b/huf/ai/sdk_tools.py
index 53e281ed..53092c61 100644
--- a/huf/ai/sdk_tools.py
+++ b/huf/ai/sdk_tools.py
@@ -138,6 +138,16 @@ def create_agent_tools(agent) -> list[FunctionTool]:
             function_path = "huf.ai.sdk_tools.handle_set_conversation_data"
         elif function_doc.types == "Load Conversation Data":
             function_path = "huf.ai.sdk_tools.handle_load_conversation_data"
+        elif function_doc.types == "Save Memory Record":
+            function_path = "huf.ai.memory_tools.handle_save_memory_record"
+        elif function_doc.types == "Search Memory Records":
+            function_path = "huf.ai.memory_tools.handle_search_memory_records"
+        elif function_doc.types == "Get Memory Record":
+            function_path = "huf.ai.memory_tools.handle_get_memory_record"
+        elif function_doc.types == "Archive Memory Record":
+            function_path = "huf.ai.memory_tools.handle_archive_memory_record"
+        elif function_doc.types == "Promote Memory to Knowledge":
+            function_path = "huf.ai.memory_tools.handle_promote_memory_to_knowledge"
         else:
             continue

diff --git a/huf/huf/doctype/agent_tool_function/agent_tool_function.json b/huf/huf/doctype/agent_tool_function/agent_tool_function.json
index 3081c585..3c9c5a07 100644
--- a/huf/huf/doctype/agent_tool_function/agent_tool_function.json
+++ b/huf/huf/doctype/agent_tool_function/agent_tool_function.json
@@ -52,7 +52,7 @@
    "fieldname": "types",
    "fieldtype": "Select",
    "label": "Types",
-   "options": "\nGet Document\nGet Multiple Documents\nGet List\nCreate Document\nCreate Multiple Documents\nUpdate Document\nUpdate Multiple Documents\nDelete Document\nDelete Multiple Documents\nSubmit Document\nCancel Document\nGet Amended Document\nCustom Function\nApp Provided\nAttach File to Document\nGet Report Result\nGet Value\nSet Value\nGET\nPOST\nRun Agent\nClient Side Tool\nGet Conversation Data\nSet Conversation Data\nLoad Conversation Data"
+   "options": "\nGet Document\nGet Multiple Documents\nGet List\nCreate Document\nCreate Multiple Documents\nUpdate Document\nUpdate Multiple Documents\nDelete Document\nDelete Multiple Documents\nSubmit Document\nCancel Document\nGet Amended Document\nCustom Function\nApp Provided\nAttach File to Document\nGet Report Result\nGet Value\nSet Value\nGET\nPOST\nRun Agent\nClient Side Tool\nGet Conversation Data\nSet Conversation Data\nLoad Conversation Data\nSave Memory Record\nSearch Memory Records\nGet Memory Record\nArchive Memory Record\nPromote Memory to Knowledge"
   },
   {
    "depends_on": "eval: [\n 'Get Document',\n 'Get Multiple Documents',\n 'Get List',\n 'Create Document',\n 'Create Multiple Documents',\n 'Update Document',\n 'Update Multiple Documents',\n 'Delete Document',\n 'Delete Multiple Documents',\n 'Submit Document',\n 'Cancel Document',\n 'Get Amended Document',\n 'Attach File to Document',\n 'Get Report Result',\n 'Get Value',\n 'Set Value'\n].includes(doc.types)",
diff --git a/huf/huf/doctype/memory_policy/__init__.py b/huf/huf/doctype/memory_policy/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/huf/huf/doctype/memory_policy/memory_policy.json b/huf/huf/doctype/memory_policy/memory_policy.json
new file mode 100644
index 00000000..3b0326ca
--- /dev/null
+++ b/huf/huf/doctype/memory_policy/memory_policy.json
@@ -0,0 +1,89 @@
+{
+  "actions": [],
+  "allow_rename": 1,
+  "autoname": "field:policy_name",
+  "creation": "2026-05-27 00:00:00.000000",
+  "doctype": "DocType",
+  "engine": "InnoDB",
+  "field_order": [
+    "policy_section",
+    "policy_name",
+    "enabled",
+    "agent",
+    "column_break_policy",
+    "scope_type",
+    "scope_key",
+    "capture_section",
+    "capture_mode",
+    "approval_required",
+    "default_status",
+    "allowed_record_types",
+    "retrieval_section",
+    "inject_mode",
+    "max_records",
+    "token_budget",
+    "write_section",
+    "allow_agent_write",
+    "allow_user_scope_write",
+    "allow_role_scope_write",
+    "allow_agent_scope_write",
+    "allow_site_scope_write",
+    "projection_section",
+    "auto_promote_to_knowledge",
+    "knowledge_source",
+    "promotion_min_confidence",
+    "promotion_min_importance",
+    "lifecycle_section",
+    "ttl_days",
+    "metadata_json"
+  ],
+  "fields": [
+    {"fieldname": "policy_section", "fieldtype": "Section Break", "label": "Policy"},
+    {"fieldname": "policy_name", "fieldtype": "Data", "label": "Policy Name", "reqd": 1, "unique": 1, "in_list_view": 1},
+    {"default": "1", "fieldname": "enabled", "fieldtype": "Check", "label": "Enabled", "in_list_view": 1},
+    {"fieldname": "agent", "fieldtype": "Link", "label": "Agent", "options": "Agent", "in_list_view": 1},
+    {"fieldname": "column_break_policy", "fieldtype": "Column Break"},
+    {"default": "Agent", "fieldname": "scope_type", "fieldtype": "Select", "label": "Scope Type", "options": "Conversation\nUser\nRole\nAgent\nWorkspace\nSite\nGlobal", "reqd": 1},
+    {"description": "Optional default scope key for this policy. 
If empty, runtime context decides.", "fieldname": "scope_key", "fieldtype": "Data", "label": "Scope Key"}, + {"fieldname": "capture_section", "fieldtype": "Section Break", "label": "Capture"}, + {"default": "Manual", "fieldname": "capture_mode", "fieldtype": "Select", "label": "Capture Mode", "options": "Manual\nAgent Suggested\nAutomatic", "reqd": 1}, + {"default": "1", "fieldname": "approval_required", "fieldtype": "Check", "label": "Approval Required"}, + {"default": "Draft", "fieldname": "default_status", "fieldtype": "Select", "label": "Default Status", "options": "Draft\nActive", "reqd": 1}, + {"description": "Optional newline-separated allowed record types. Empty means all types are allowed.", "fieldname": "allowed_record_types", "fieldtype": "Small Text", "label": "Allowed Record Types"}, + {"fieldname": "retrieval_section", "fieldtype": "Section Break", "label": "Retrieval"}, + {"default": "Tool Only", "fieldname": "inject_mode", "fieldtype": "Select", "label": "Inject Mode", "options": "Never\nRelevant Only\nAlways\nTool Only", "reqd": 1}, + {"default": "5", "fieldname": "max_records", "fieldtype": "Int", "label": "Max Records", "non_negative": 1}, + {"default": "1000", "fieldname": "token_budget", "fieldtype": "Int", "label": "Token Budget", "non_negative": 1}, + {"fieldname": "write_section", "fieldtype": "Section Break", "label": "Write Permissions"}, + {"default": "0", "fieldname": "allow_agent_write", "fieldtype": "Check", "label": "Allow Agent Write"}, + {"default": "1", "fieldname": "allow_user_scope_write", "fieldtype": "Check", "label": "Allow User Scope Write"}, + {"default": "0", "fieldname": "allow_role_scope_write", "fieldtype": "Check", "label": "Allow Role Scope Write"}, + {"default": "1", "fieldname": "allow_agent_scope_write", "fieldtype": "Check", "label": "Allow Agent Scope Write"}, + {"default": "0", "fieldname": "allow_site_scope_write", "fieldtype": "Check", "label": "Allow Site Scope Write"}, + {"fieldname": "projection_section", 
"fieldtype": "Section Break", "label": "Knowledge Projection"}, + {"default": "0", "fieldname": "auto_promote_to_knowledge", "fieldtype": "Check", "label": "Auto Promote to Knowledge"}, + {"depends_on": "eval:doc.auto_promote_to_knowledge==1", "fieldname": "knowledge_source", "fieldtype": "Link", "label": "Knowledge Source", "options": "Knowledge Source"}, + {"default": "0.8", "fieldname": "promotion_min_confidence", "fieldtype": "Float", "label": "Min Confidence", "precision": "2"}, + {"default": "0.6", "fieldname": "promotion_min_importance", "fieldtype": "Float", "label": "Min Importance", "precision": "2"}, + {"fieldname": "lifecycle_section", "fieldtype": "Section Break", "label": "Lifecycle"}, + {"default": "0", "fieldname": "ttl_days", "fieldtype": "Int", "label": "TTL Days", "non_negative": 1}, + {"fieldname": "metadata_json", "fieldtype": "JSON", "label": "Metadata JSON"} + ], + "index_web_pages_for_search": 1, + "links": [], + "modified": "2026-05-27 00:00:00.000000", + "modified_by": "Administrator", + "module": "Huf", + "name": "Memory Policy", + "naming_rule": "By fieldname", + "owner": "Administrator", + "permissions": [ + {"create": 1, "delete": 1, "email": 1, "export": 1, "print": 1, "read": 1, "report": 1, "role": "System Manager", "share": 1, "write": 1}, + {"create": 1, "delete": 1, "email": 1, "export": 1, "print": 1, "read": 1, "report": 1, "role": "Huf Manager", "share": 1, "write": 1}, + {"create": 0, "delete": 0, "email": 1, "export": 1, "print": 1, "read": 1, "report": 1, "role": "Huf User", "share": 1, "write": 0} + ], + "row_format": "Dynamic", + "sort_field": "modified", + "sort_order": "DESC", + "states": [] +} diff --git a/huf/huf/doctype/memory_policy/memory_policy.py b/huf/huf/doctype/memory_policy/memory_policy.py new file mode 100644 index 00000000..e02322d6 --- /dev/null +++ b/huf/huf/doctype/memory_policy/memory_policy.py @@ -0,0 +1,21 @@ +# Copyright (c) 2026, HUF and contributors +# For license information, please see 
license.txt + +import frappe +from frappe import _ +from frappe.model.document import Document + + +class MemoryPolicy(Document): + def validate(self): + if self.auto_promote_to_knowledge and not self.knowledge_source: + frappe.throw(_("Knowledge Source is required when Auto Promote to Knowledge is enabled")) + + if self.agent and self.scope_type == "Agent" and not self.scope_key: + self.scope_key = self.agent + + if self.scope_type == "Site" and not self.scope_key: + self.scope_key = frappe.local.site + + if self.scope_type == "Global" and not self.scope_key: + self.scope_key = "global" diff --git a/huf/huf/doctype/memory_policy/test_memory_policy.py b/huf/huf/doctype/memory_policy/test_memory_policy.py new file mode 100644 index 00000000..d481a086 --- /dev/null +++ b/huf/huf/doctype/memory_policy/test_memory_policy.py @@ -0,0 +1,22 @@ +# Copyright (c) 2026, HUF and contributors +# For license information, please see license.txt + +import frappe +from frappe.tests.utils import FrappeTestCase + + +class TestMemoryPolicy(FrappeTestCase): + def test_auto_promote_requires_knowledge_source(self): + doc = frappe.new_doc("Memory Policy") + doc.auto_promote_to_knowledge = 1 + doc.knowledge_source = "" + with self.assertRaises(frappe.ValidationError): + doc.validate() + + def test_agent_scope_autofills_scope_key(self): + doc = frappe.new_doc("Memory Policy") + doc.agent = "test-agent" + doc.scope_type = "Agent" + doc.scope_key = "" + doc.validate() + self.assertEqual(doc.scope_key, "test-agent") diff --git a/huf/huf/doctype/memory_record/__init__.py b/huf/huf/doctype/memory_record/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/huf/huf/doctype/memory_record/memory_record.js b/huf/huf/doctype/memory_record/memory_record.js new file mode 100644 index 00000000..086d0595 --- /dev/null +++ b/huf/huf/doctype/memory_record/memory_record.js @@ -0,0 +1,40 @@ +frappe.ui.form.on('Memory Record', { + refresh(frm) { + if (frm.doc.__islocal) { + return; + } + + if 
(frm.doc.status === 'Draft') { + frm.add_custom_button(__('Activate'), () => { + frm.set_value('status', 'Active'); + frm.save(); + }); + } + + if (frm.doc.promote_to_knowledge && frm.doc.knowledge_source && frm.doc.status === 'Active') { + frm.add_custom_button(__('Queue Knowledge Projection'), () => { + frappe.call({ + method: 'huf.huf.doctype.memory_record.memory_record.queue_memory_knowledge_projection', + args: { memory_record: frm.doc.name }, + callback: () => frm.reload_doc() + }); + }, __('Knowledge')); + } + + if (frm.doc.knowledge_input) { + frm.add_custom_button(__('Open Knowledge Input'), () => { + frappe.set_route('Form', 'Knowledge Input', frm.doc.knowledge_input); + }, __('Knowledge')); + + frm.add_custom_button(__('Remove Knowledge Projection'), () => { + frappe.confirm(__('Remove the Knowledge Input projection for this memory record?'), () => { + frappe.call({ + method: 'huf.huf.doctype.memory_record.memory_record.remove_memory_knowledge_projection', + args: { memory_record: frm.doc.name }, + callback: () => frm.reload_doc() + }); + }); + }, __('Knowledge')); + } + } +}); diff --git a/huf/huf/doctype/memory_record/memory_record.json b/huf/huf/doctype/memory_record/memory_record.json new file mode 100644 index 00000000..97071e68 --- /dev/null +++ b/huf/huf/doctype/memory_record/memory_record.json @@ -0,0 +1,66 @@ +{ + "actions": [], + "allow_rename": 1, + "autoname": "hash", + "creation": "2026-05-27 00:00:00.000000", + "doctype": "DocType", + "engine": "InnoDB", + "field_order": ["overview_section", "title", "record_type", "status", "column_break_overview", "scope_type", "scope_key", "visibility", "content_section", "summary_text", "data_json", "source_section", "agent", "conversation", "run", "column_break_source", "source_type", "source_message", "raw_context_excerpt", "quality_section", "confidence", "importance_score", "tags", "lifecycle_section", "effective_from", "effective_until", "ttl_days", "column_break_lifecycle", 
"supersedes_memory_record", "metadata_json", "knowledge_projection_section", "promote_to_knowledge", "knowledge_source", "knowledge_input", "column_break_projection", "projection_status", "last_projected_at", "projection_error"], + "fields": [ + {"fieldname": "overview_section", "fieldtype": "Section Break", "label": "Overview"}, + {"fieldname": "title", "fieldtype": "Data", "label": "Title", "reqd": 1, "in_list_view": 1}, + {"default": "Fact", "fieldname": "record_type", "fieldtype": "Select", "label": "Record Type", "options": "Fact\nPreference\nResearch Note\nDecision\nExtracted Data\nState\nSummary\nPolicy Hint\nObservation\nInsight\nCustom", "reqd": 1, "in_list_view": 1}, + {"default": "Draft", "fieldname": "status", "fieldtype": "Select", "label": "Status", "options": "Draft\nActive\nArchived\nExpired\nSuperseded\nRejected", "reqd": 1, "in_list_view": 1}, + {"fieldname": "column_break_overview", "fieldtype": "Column Break"}, + {"default": "Conversation", "fieldname": "scope_type", "fieldtype": "Select", "label": "Scope Type", "options": "Conversation\nUser\nRole\nAgent\nWorkspace\nSite\nGlobal", "reqd": 1, "in_list_view": 1}, + {"description": "Concrete scope identifier.", "fieldname": "scope_key", "fieldtype": "Data", "label": "Scope Key", "reqd": 1, "in_list_view": 1}, + {"default": "Private", "fieldname": "visibility", "fieldtype": "Select", "label": "Visibility", "options": "Private\nShared with Agent\nShared with Role\nSite\nGlobal", "reqd": 1}, + {"fieldname": "content_section", "fieldtype": "Section Break", "label": "Content"}, + {"fieldname": "summary_text", "fieldtype": "Text Editor", "label": "Summary Text", "reqd": 1}, + {"description": "Canonical structured payload for this memory/data record.", "fieldname": "data_json", "fieldtype": "JSON", "label": "Data JSON"}, + {"fieldname": "source_section", "fieldtype": "Section Break", "label": "Source"}, + {"fieldname": "agent", "fieldtype": "Link", "label": "Agent", "options": "Agent", "in_list_view": 
1}, + {"fieldname": "conversation", "fieldtype": "Link", "label": "Conversation", "options": "Agent Conversation"}, + {"fieldname": "run", "fieldtype": "Link", "label": "Run", "options": "Agent Run"}, + {"fieldname": "column_break_source", "fieldtype": "Column Break"}, + {"default": "Manual", "fieldname": "source_type", "fieldtype": "Select", "label": "Source Type", "options": "Conversation\nRun\nManual\nEvent\nScheduled\nImported\nTool Output", "reqd": 1}, + {"fieldname": "source_message", "fieldtype": "Data", "label": "Source Message"}, + {"fieldname": "raw_context_excerpt", "fieldtype": "Long Text", "label": "Raw Context Excerpt"}, + {"fieldname": "quality_section", "fieldtype": "Section Break", "label": "Quality"}, + {"default": "0", "fieldname": "confidence", "fieldtype": "Float", "label": "Confidence", "precision": "2"}, + {"default": "0", "fieldname": "importance_score", "fieldtype": "Float", "label": "Importance Score", "precision": "2"}, + {"fieldname": "tags", "fieldtype": "Small Text", "label": "Tags"}, + {"fieldname": "lifecycle_section", "fieldtype": "Section Break", "label": "Lifecycle"}, + {"fieldname": "effective_from", "fieldtype": "Datetime", "label": "Effective From"}, + {"fieldname": "effective_until", "fieldtype": "Datetime", "label": "Effective Until"}, + {"default": "0", "fieldname": "ttl_days", "fieldtype": "Int", "label": "TTL Days", "non_negative": 1}, + {"fieldname": "column_break_lifecycle", "fieldtype": "Column Break"}, + {"fieldname": "supersedes_memory_record", "fieldtype": "Link", "label": "Supersedes Memory Record", "options": "Memory Record"}, + {"fieldname": "metadata_json", "fieldtype": "JSON", "label": "Metadata JSON"}, + {"fieldname": "knowledge_projection_section", "fieldtype": "Section Break", "label": "Knowledge Projection"}, + {"default": "0", "fieldname": "promote_to_knowledge", "fieldtype": "Check", "label": "Promote to Knowledge"}, + {"depends_on": "eval:doc.promote_to_knowledge==1", "fieldname": "knowledge_source", 
"fieldtype": "Link", "label": "Knowledge Source", "options": "Knowledge Source"}, + {"fieldname": "knowledge_input", "fieldtype": "Link", "label": "Knowledge Input", "options": "Knowledge Input", "read_only": 1}, + {"fieldname": "column_break_projection", "fieldtype": "Column Break"}, + {"default": "Not Indexed", "fieldname": "projection_status", "fieldtype": "Select", "label": "Projection Status", "options": "Not Indexed\nQueued\nProjected\nError\nRemoved", "read_only": 1, "in_list_view": 1}, + {"fieldname": "last_projected_at", "fieldtype": "Datetime", "label": "Last Projected At", "read_only": 1}, + {"fieldname": "projection_error", "fieldtype": "Small Text", "label": "Projection Error", "read_only": 1} + ], + "index_web_pages_for_search": 1, + "links": [], + "modified": "2026-05-27 00:00:00.000000", + "modified_by": "Administrator", + "module": "Huf", + "name": "Memory Record", + "naming_rule": "Random", + "owner": "Administrator", + "permissions": [ + {"create": 1, "delete": 1, "email": 1, "export": 1, "print": 1, "read": 1, "report": 1, "role": "System Manager", "share": 1, "write": 1}, + {"create": 1, "delete": 1, "email": 1, "export": 1, "print": 1, "read": 1, "report": 1, "role": "Huf Manager", "share": 1, "write": 1}, + {"create": 0, "delete": 0, "email": 1, "export": 0, "print": 1, "read": 0, "report": 0, "role": "Huf User", "share": 0, "write": 0} + ], + "row_format": "Dynamic", + "sort_field": "modified", + "sort_order": "DESC", + "states": [] +} diff --git a/huf/huf/doctype/memory_record/memory_record.py b/huf/huf/doctype/memory_record/memory_record.py new file mode 100644 index 00000000..31523f57 --- /dev/null +++ b/huf/huf/doctype/memory_record/memory_record.py @@ -0,0 +1,147 @@ +# Copyright (c) 2026, HUF and contributors +# For license information, please see license.txt + +import json + +import frappe +from frappe import _ +from frappe.model.document import Document +from frappe.utils import add_days, now, now_datetime + + 
+SCOPE_TYPES_REQUIRING_KEY = {"Conversation", "User", "Role", "Agent", "Workspace", "Site", "Global"} + + +class MemoryRecord(Document): + """Canonical scoped memory/data record.""" + + def validate(self): + self.set_defaults() + self.validate_scope() + self.validate_status() + self.validate_projection_settings() + + def set_defaults(self): + if not self.status: + self.status = "Draft" + if not self.source_type: + self.source_type = "Manual" + if not self.projection_status: + self.projection_status = "Not Indexed" + if not self.visibility: + self.visibility = "Private" + if not self.record_type: + self.record_type = "Fact" + if self.scope_type == "Global" and not self.scope_key: + self.scope_key = "global" + elif self.scope_type == "Site" and not self.scope_key: + self.scope_key = frappe.local.site + elif self.scope_type == "User" and not self.scope_key: + self.scope_key = frappe.session.user + elif self.scope_type == "Agent" and not self.scope_key and self.agent: + self.scope_key = self.agent + elif self.scope_type == "Conversation" and not self.scope_key and self.conversation: + self.scope_key = self.conversation + if self.ttl_days and not self.effective_until: + self.effective_until = add_days(now_datetime(), int(self.ttl_days)) + + def validate_scope(self): + if self.scope_type in SCOPE_TYPES_REQUIRING_KEY and not self.scope_key: + frappe.throw(_("Scope Key is required for scope type {0}").format(self.scope_type)) + if self.scope_type == "Conversation" and self.conversation and self.scope_key != self.conversation: + frappe.throw(_("For Conversation scope, Scope Key must match Conversation")) + if self.scope_type == "Agent" and self.agent and self.scope_key != self.agent: + frappe.throw(_("For Agent scope, Scope Key must match Agent")) + + def validate_status(self): + if self.status == "Active" and not self.summary_text: + frappe.throw(_("Summary Text is required before activating a memory record")) + + def validate_projection_settings(self): + if 
self.promote_to_knowledge and not self.knowledge_source: + frappe.throw(_("Knowledge Source is required when Promote to Knowledge is enabled")) + if not self.promote_to_knowledge and self.projection_status == "Queued": + self.projection_status = "Not Indexed" + + def on_update(self): + if self.promote_to_knowledge and self.knowledge_source and self.status == "Active": + if self.has_value_changed("summary_text") or self.has_value_changed("data_json") or not self.knowledge_input: + self.queue_knowledge_projection() + + @frappe.whitelist() + def queue_knowledge_projection(self): + self.db_set("projection_status", "Queued", update_modified=False) + self.db_set("projection_error", None, update_modified=False) + frappe.enqueue("huf.huf.doctype.memory_record.memory_record.project_memory_to_knowledge", queue="default", memory_record=self.name, job_id=f"project_memory_to_knowledge_{self.name}", deduplicate=True, enqueue_after_commit=True) + return {"status": "queued", "memory_record": self.name} + + @frappe.whitelist() + def remove_knowledge_projection(self): + if self.knowledge_input and frappe.db.exists("Knowledge Input", self.knowledge_input): + frappe.delete_doc("Knowledge Input", self.knowledge_input, ignore_permissions=False) + self.db_set("knowledge_input", None, update_modified=False) + self.db_set("projection_status", "Removed", update_modified=False) + self.db_set("last_projected_at", now(), update_modified=False) + self.db_set("projection_error", None, update_modified=False) + return {"status": "removed", "memory_record": self.name} + + +def _memory_to_knowledge_text(doc: MemoryRecord) -> str: + parts = [f"# {doc.title}", "", f"Type: {doc.record_type}", f"Scope: {doc.scope_type} / {doc.scope_key}"] + if doc.agent: + parts.append(f"Agent: {doc.agent}") + if doc.tags: + parts.append(f"Tags: {doc.tags}") + parts.extend(["", doc.summary_text or ""]) + if doc.data_json: + try: + data = json.loads(doc.data_json) if isinstance(doc.data_json, str) else doc.data_json + 
parts.extend(["\n## Structured Data\n", json.dumps(data, ensure_ascii=False, indent=2, default=str)]) + except Exception: + parts.extend(["\n## Structured Data\n", str(doc.data_json)]) + return "\n".join(parts).strip() + + +@frappe.whitelist() +def project_memory_to_knowledge(memory_record: str): + doc = frappe.get_doc("Memory Record", memory_record) + if not frappe.has_permission("Memory Record", "read", doc.name): + frappe.throw(_("Not permitted to read Memory Record {0}").format(doc.name)) + if doc.status != "Active": + frappe.throw(_("Only Active memory records can be projected to knowledge")) + if not doc.promote_to_knowledge or not doc.knowledge_source: + frappe.throw(_("Memory Record is not configured for knowledge projection")) + try: + text = _memory_to_knowledge_text(doc) + if doc.knowledge_input and frappe.db.exists("Knowledge Input", doc.knowledge_input): + knowledge_input = frappe.get_doc("Knowledge Input", doc.knowledge_input) + knowledge_input.text = text + knowledge_input.status = "Pending" + knowledge_input.error_message = None + knowledge_input.save(ignore_permissions=False) + knowledge_input.queue_processing() + else: + knowledge_input = frappe.get_doc({"doctype": "Knowledge Input", "knowledge_source": doc.knowledge_source, "input_type": "Text", "text": text}) + knowledge_input.insert(ignore_permissions=False) + doc.db_set("knowledge_input", knowledge_input.name, update_modified=False) + doc.db_set("projection_status", "Projected", update_modified=False) + doc.db_set("last_projected_at", now(), update_modified=False) + doc.db_set("projection_error", None, update_modified=False) + return {"status": "projected", "memory_record": doc.name, "knowledge_input": knowledge_input.name, "knowledge_source": doc.knowledge_source} + except Exception as exc: + frappe.log_error(frappe.get_traceback(), "Memory Knowledge Projection Error") + doc.db_set("projection_status", "Error", update_modified=False) + doc.db_set("projection_error", str(exc), 
update_modified=False) + raise + + +@frappe.whitelist() +def queue_memory_knowledge_projection(memory_record: str): + doc = frappe.get_doc("Memory Record", memory_record) + return doc.queue_knowledge_projection() + + +@frappe.whitelist() +def remove_memory_knowledge_projection(memory_record: str): + doc = frappe.get_doc("Memory Record", memory_record) + return doc.remove_knowledge_projection() diff --git a/huf/huf/doctype/memory_record/test_memory_record.py b/huf/huf/doctype/memory_record/test_memory_record.py new file mode 100644 index 00000000..af8fe775 --- /dev/null +++ b/huf/huf/doctype/memory_record/test_memory_record.py @@ -0,0 +1,56 @@ +# Copyright (c) 2026, HUF and contributors +# For license information, please see license.txt + +import frappe +from frappe.tests.utils import FrappeTestCase + + +class TestMemoryRecord(FrappeTestCase): + def test_active_requires_summary_text(self): + doc = frappe.new_doc("Memory Record") + doc.status = "Active" + doc.summary_text = "" + with self.assertRaises(frappe.ValidationError): + doc.validate_status() + + def test_promote_to_knowledge_requires_knowledge_source(self): + doc = frappe.new_doc("Memory Record") + doc.promote_to_knowledge = 1 + doc.knowledge_source = "" + doc.status = "Active" + doc.summary_text = "test" + with self.assertRaises(frappe.ValidationError): + doc.validate_projection_settings() + + def test_global_scope_autofills_scope_key(self): + doc = frappe.new_doc("Memory Record") + doc.scope_type = "Global" + doc.scope_key = "" + doc.set_defaults() + self.assertEqual(doc.scope_key, "global") + + def test_site_scope_autofills_scope_key(self): + doc = frappe.new_doc("Memory Record") + doc.scope_type = "Site" + doc.scope_key = "" + doc.set_defaults() + self.assertIsNotNone(doc.scope_key) + self.assertNotEqual(doc.scope_key, "") + + def test_conversation_scope_key_must_match_conversation(self): + doc = frappe.new_doc("Memory Record") + doc.scope_type = "Conversation" + doc.conversation = "CONV-001" + doc.scope_key = "WRONG" +
doc.summary_text = "test" + with self.assertRaises(frappe.ValidationError): + doc.validate_scope() + + def test_agent_scope_key_must_match_agent(self): + doc = frappe.new_doc("Memory Record") + doc.scope_type = "Agent" + doc.agent = "my-agent" + doc.scope_key = "other-agent" + doc.summary_text = "test" + with self.assertRaises(frappe.ValidationError): + doc.validate_scope()
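
The scope-key rules exercised by the tests above can be tried outside a Frappe site. The following is a minimal sketch, not the PR's implementation: `ScopedRecord`, `ScopeError`, `fill_scope_key`, and `check_scope` are hypothetical stand-ins that mirror the branches of `MemoryRecord.set_defaults()` and `MemoryRecord.validate_scope()` shown in the diff, with the site and user passed in explicitly instead of read from `frappe.local.site` and `frappe.session.user`.

```python
from dataclasses import dataclass
from typing import Optional

# Same seven scope types as SCOPE_TYPES_REQUIRING_KEY in memory_record.py.
SCOPE_TYPES_REQUIRING_KEY = {
    "Conversation", "User", "Role", "Agent", "Workspace", "Site", "Global",
}


class ScopeError(ValueError):
    """Hypothetical stand-in for frappe.ValidationError."""


@dataclass
class ScopedRecord:
    """Hypothetical stand-in holding only the scope-related fields."""
    scope_type: str = "Conversation"  # DocType default for scope_type
    scope_key: str = ""
    agent: Optional[str] = None
    conversation: Optional[str] = None


def fill_scope_key(rec: ScopedRecord, site: str, user: str) -> None:
    """Fill an empty scope_key following the branches of set_defaults()."""
    if rec.scope_key:
        return  # every defaulting branch only applies to an empty key
    if rec.scope_type == "Global":
        rec.scope_key = "global"
    elif rec.scope_type == "Site":
        rec.scope_key = site
    elif rec.scope_type == "User":
        rec.scope_key = user
    elif rec.scope_type == "Agent" and rec.agent:
        rec.scope_key = rec.agent
    elif rec.scope_type == "Conversation" and rec.conversation:
        rec.scope_key = rec.conversation
    # Role and Workspace have no default: the caller must supply a key.


def check_scope(rec: ScopedRecord) -> None:
    """Raise ScopeError under the same conditions as validate_scope()."""
    if rec.scope_type in SCOPE_TYPES_REQUIRING_KEY and not rec.scope_key:
        raise ScopeError(f"Scope Key is required for scope type {rec.scope_type}")
    if (rec.scope_type == "Conversation" and rec.conversation
            and rec.scope_key != rec.conversation):
        raise ScopeError("For Conversation scope, Scope Key must match Conversation")
    if rec.scope_type == "Agent" and rec.agent and rec.scope_key != rec.agent:
        raise ScopeError("For Agent scope, Scope Key must match Agent")
```

One consequence worth noting: a record left at the default `Conversation` scope with no `conversation` set has no key to fall back on, so the scope check fails before any later validator runs. A test that targets status or projection rules through the full `validate()` chain on such a record may therefore pass because of the scope error, not the rule it names.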