Status: Specification Only — No Implementation Yet
longshade generates a conversable persona from personal data. Given conversations and writings, it produces everything needed to instantiate an LLM that can speak in your voice.
This is the "ghost" — your digital echo that can answer questions, share perspectives, and represent your thinking after you're gone.
"The ghost is not you. But it echoes you."
```bash
# Generate persona from input data
longshade generate ./input/ --output ./persona/

# Test the persona interactively
longshade chat ./persona/

# Analyze inputs without generating
longshade analyze ./input/
```

Conversational data — your voice in dialogue.
```jsonl
{"role": "user", "content": "What do you think about...", "timestamp": "2024-01-15T10:30:00Z", "source": "ctk"}
{"role": "assistant", "content": "I think...", "timestamp": "2024-01-15T10:31:00Z", "source": "ctk"}
```

Required fields:

- `role`: `"user"` (your messages) or `"assistant"` (AI responses for context)
- `content`: Message text

Optional fields:

- `timestamp`: ISO 8601 datetime
- `source`: Where this came from (for attribution)
- `conversation_id`: Group related messages
- `topic`: Subject/theme
Note: Your messages (`role: "user"`) are the primary signal for voice. AI responses provide context but are not part of the persona.
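A minimal sketch of ingesting this format, assuming Python; it validates the two required fields and keeps only the user-authored messages that carry the voice signal (the function name is illustrative, not part of the spec):

```python
import json

def load_user_messages(path):
    """Read a JSONL conversation file, keeping only 'user' messages.

    Every line must carry the required 'role' and 'content' fields;
    optional fields (timestamp, source, conversation_id, topic)
    pass through untouched.
    """
    messages = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            msg = json.loads(line)
            if "role" not in msg or "content" not in msg:
                raise ValueError(f"missing required field in: {line[:60]}")
            if msg["role"] == "user":  # primary voice signal
                messages.append(msg)
    return messages
```

AI responses are dropped here for brevity; a fuller loader would keep them alongside as conversational context.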
Long-form writing — your voice in prose.
```markdown
---
title: Why I Care About Durability
date: 2024-01-15
tags: [philosophy, archiving]
type: essay
---
When I think about what matters...
```

Frontmatter (optional but helpful):

- `title`: Title of the piece
- `date`: When written
- `tags`: Topics/themes
- `type`: essay, post, note, letter, etc.
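A sketch of splitting frontmatter from the body, assuming flat `key: value` headers like the example above; this is not a full YAML parser (nested structures are out of scope), and the function name is hypothetical:

```python
def split_frontmatter(text):
    """Split a Markdown document into (metadata dict, body text).

    Expects an optional leading '---' block of flat 'key: value'
    lines; documents without frontmatter come back with empty
    metadata and an untouched body.
    """
    if not text.startswith("---\n"):
        return {}, text
    header, _, body = text[4:].partition("\n---\n")
    meta = {}
    for line in header.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta, body.lstrip("\n")
```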
longshade produces a persona/ directory:
```
persona/
├── README.md            # How to use this persona
├── system-prompt.txt    # Ready-to-use LLM system prompt
├── rag/                 # Embeddings and index for retrieval
│   ├── index.faiss
│   ├── metadata.json
│   └── chunks.jsonl
├── voice-samples.jsonl  # Example Q&A pairs
└── fine-tune/           # Optional training data
```
The system prompt captures voice, values, and style. The RAG index enables grounded responses with semantic search. Voice samples demonstrate correct tone for few-shot prompting.
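Under these assumptions — the directory layout above, an OpenAI-style messages list, and voice samples stored as `{"question": ..., "answer": ...}` pairs (that sample shape is a guess, not from the spec) — consuming a generated persona might look like:

```python
import json
from pathlib import Path

def build_messages(persona_dir, question, n_examples=2):
    """Assemble a chat payload from a generated persona directory.

    system-prompt.txt becomes the system message; the first few
    voice-samples.jsonl entries become few-shot examples
    demonstrating tone; the caller's question goes last.
    """
    persona = Path(persona_dir)
    messages = [{"role": "system",
                 "content": (persona / "system-prompt.txt").read_text()}]
    samples = persona / "voice-samples.jsonl"
    if samples.exists():
        for line in samples.read_text().splitlines()[:n_examples]:
            pair = json.loads(line)
            messages.append({"role": "user", "content": pair["question"]})
            messages.append({"role": "assistant", "content": pair["answer"]})
    messages.append({"role": "user", "content": question})
    return messages
```

Because the payload is plain message dicts, the same persona directory can back any chat-completion API without modification.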
```
    Any Source                longshade                 Output
┌─────────────────┐     ┌─────────────────┐      ┌────────────────┐
│ conversations/  │────→│                 │      │ persona/       │
│ *.jsonl         │     │ Analyze voice   │      │   README.md    │
├─────────────────┤     │ Extract style   │─────→│   system-prompt│
│ writings/       │────→│ Build RAG index │      │   rag/         │
│ *.md            │     │ Generate prompt │      │   voice-samples│
└─────────────────┘     └─────────────────┘      └────────────────┘
```
- Ingest — Read conversations and writings
- Analyze — Extract voice characteristics, values, patterns
- Chunk & Embed — Build semantic search index
- Generate — Produce system prompt and artifacts
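The Chunk & Embed step could be sketched as a fixed-size splitter with overlap; the sizes here are illustrative defaults, not values from the spec:

```python
def chunk_text(text, size=800, overlap=100):
    """Split text into overlapping character chunks for embedding.

    A fixed character window is the simplest strategy; a real
    implementation would likely prefer paragraph or sentence
    boundaries and fall back to character offsets only when needed.
    """
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```

Each chunk would then be embedded and written into `rag/index.faiss`, with its source recorded in `rag/metadata.json` for attribution.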
longshade is part of the ECHO ecosystem but works independently:
- longshade defines what it accepts — Input formats are longshade's specification
- Any source can provide input — If you can produce JSONL conversations or Markdown writings, longshade accepts them
- Outputs are self-contained — The persona directory works with any LLM
Compatible data sources:
longshade processes personal data. Consider:
- Review inputs before processing
- Think about what you're comfortable having in a conversable persona
- Use filtering options to exclude sensitive content
- Control who has access to the output
The generated persona can answer questions you never anticipated. Think carefully about what's included.
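One way a filtering pass might work — a minimal sketch with two illustrative regex patterns, not the actual filtering options, which the spec has yet to define:

```python
import re

# Illustrative patterns only; a real filter would be configurable.
SENSITIVE_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),            # email addresses
    re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),  # US-style phone numbers
]

def redact(text, replacement="[REDACTED]"):
    """Replace obviously sensitive substrings before persona generation."""
    for pattern in SENSITIVE_PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Pattern-based redaction catches the obvious cases; it is no substitute for reviewing the inputs yourself.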
For the complete technical specification, see SPEC.md.
- longecho — ECHO compliance validator
- ctk — Conversation toolkit
- btk — Bookmark toolkit
- ebk — Ebook toolkit
"The ghost is not you. But it echoes you."