voidfreud · voidfreud · Mar 27, 2026 · Mar 27, 2026 · Mar 27, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -0,0 +1,36 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.10", "3.11", "3.12", "3.13"]
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
+
+      - name: Install dependencies
+        run: uv sync --group dev
+
+      - name: Lint
+        run: uv run ruff check chatgpt_export_tool tests
+
+      - name: Format check
+        run: uv run ruff format --check chatgpt_export_tool tests
+
+      - name: Test
+        run: uv run pytest --cov=chatgpt_export_tool --cov-report=term-missing
diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml
@@ -0,0 +1,28 @@
+name: Publish to PyPI
+
+on:
+  release:
+    types: [published]
+
+permissions:
+  id-token: write
+
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+    environment: pypi
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.13"
+
+      - name: Install build tools
+        run: pip install build
+
+      - name: Build package
+        run: python -m build
+
+      - name: Publish to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
diff --git a/Fields.md b/Fields.md
@@ -1,232 +1,161 @@
 # Field Selection Reference
 
-This document describes the field-selection and metadata-selection features that the current CLI actually supports.
+Practical reference for the `--fields` and `--include`/`--exclude` options in `chatgpt-export`.
 
-It is intentionally practical rather than exhaustive. The goal is to document the fields, groups, and selectors you can use with `chatgpt-export`, not to guess every field that might appear in every historical `conversations.json` file.
+---
 
-## Structural Levels
+## How Data Is Structured
 
-The tool understands conversation data at these nested levels:
+ChatGPT exports nest data at these levels:
 
-```text
+```
 conversation
 └── mapping node
     └── message
-        ├── author
-        ├── content
-        └── metadata
+        ├── author      (role, name)
+        ├── content     (content_type, parts, text, ...)
+        └── metadata    (model_slug, message_type, ...)
 ```
 
-The field selector can retain or remove fields across those levels while preserving the containers needed to reach nested selected fields.
-
-Text export is transcript-oriented: it follows the active branch defined by `current_node` and `parent` links, then applies transcript visibility rules from the TOML config passed to `export`.
-
-## `--fields`
+The field selector retains or removes fields across these levels while preserving the containers needed to reach any selected nested field.
 
-The `--fields` argument accepts one field-selection spec.
+Text/Markdown export is **transcript-oriented** — it follows the active branch via `current_node` and `parent` links, then applies visibility rules from the TOML config.
 
-Supported forms:
+---
 
-```text
-all
-none
-include field1,field2
-exclude field1,field2
-groups group1,group2
-```
+## `--fields`
 
-Examples:
+Controls which structural fields survive before formatting.
 
 ```bash
-chatgpt-export export data.json --fields all
-chatgpt-export export data.json --fields none
-chatgpt-export export data.json --fields "include title,create_time,mapping"
-chatgpt-export export data.json --fields "exclude moderation_results,plugin_ids"
-chatgpt-export export data.json --fields "groups minimal"
-chatgpt-export export data.json --fields "groups conversation,message"
+--fields all                                  # keep everything (default)
+--fields none                                 # empty structure
+--fields "include title,create_time,mapping"  # whitelist
+--fields "exclude moderation_results"         # blacklist
+--fields "groups minimal"                     # named group
+--fields "groups conversation,message"        # combine groups
 ```
 
-Multi-word specs must be quoted.
+> Multi-word specs must be quoted.
 
-## Field Groups
+---
 
-The current built-in field groups are:
+## Built-in Field Groups
 
 ### `conversation`
 
-Includes:
-
-- `_id`
-- `conversation_id`
-- `create_time`
-- `update_time`
-- `title`
-- `type`
+`_id` · `conversation_id` · `create_time` · `update_time` · `title` · `type`
 
 ### `message`
 
-Includes:
-
-- `author`
-- `content`
-- `status`
-- `end_turn`
+`author` · `content` · `status` · `end_turn`
 
 ### `metadata`
 
-Includes:
-
-- `model_slug`
-- `message_type`
-- `is_archived`
+`model_slug` · `message_type` · `is_archived`
 
 ### `minimal`
 
-Includes:
+`title` · `create_time` · `message`
 
-- `title`
-- `create_time`
-- `message`
+---
 
-## Known Structural Fields
+## All Known Structural Fields
 
-These are the structural fields the tool currently categorizes by level.
+<details>
+<summary><strong>Conversation level</strong></summary>
 
-### Conversation
+`title` · `create_time` · `update_time` · `mapping` · `moderation_results` · `current_node` · `plugin_ids` · `_id` · `conversation_id` · `type`
 
-- `title`
-- `create_time`
-- `update_time`
-- `mapping`
-- `moderation_results`
-- `current_node`
-- `plugin_ids`
-- `_id`
-- `conversation_id`
-- `type`
+</details>
 
-### Mapping Node
+<details>
+<summary><strong>Mapping node level</strong></summary>
 
-- `id`
-- `parent`
-- `children`
-- `message`
+`id` · `parent` · `children` · `message`
 
-### Message
+</details>
 
-- `author`
-- `content`
-- `status`
-- `end_turn`
-- `weight`
-- `recipient`
-- `channel`
-- `create_time`
-- `update_time`
+<details>
+<summary><strong>Message level</strong></summary>
 
-### Author
+`author` · `content` · `status` · `end_turn` · `weight` · `recipient` · `channel` · `create_time` · `update_time`
 
-- `role`
-- `name`
+</details>
 
-### Content
+<details>
+<summary><strong>Author</strong></summary>
 
-- `content_type`
-- `parts`
-- `language`
-- `response_format_name`
-- `text`
-- `user_profile`
-- `user_instructions`
+`role` · `name`
 
-Unknown names are still allowed in `include` and `exclude` field specs, but the validator may warn about them.
+</details>
 
-## Metadata Filtering
+<details>
+<summary><strong>Content</strong></summary>
 
-Metadata filtering is separate from `--fields`.
+`content_type` · `parts` · `language` · `response_format_name` · `text` · `user_profile` · `user_instructions`
 
-Use:
+</details>
 
-- `--include PATTERN [PATTERN ...]`
-- `--exclude PATTERN [PATTERN ...]`
+> Unknown field names are allowed in `include`/`exclude` specs, but the validator may warn about them.
 
-These apply to known metadata names inside nested `message.metadata` dictionaries after structural field filtering.
+---
 
-Examples:
+## Metadata Filtering
+
+Separate from `--fields`. Applies only to keys inside `message.metadata` dictionaries, *after* structural filtering.
 
 ```bash
 chatgpt-export export data.json --include model_slug
 chatgpt-export export data.json --include "model*" --exclude plugin_ids
 chatgpt-export export data.json --fields "groups message" --include is_archived
 ```
 
-Pattern matching supports:
-
-- exact matches
-- substring matches
-- shell-style wildcards such as `model*`
-
-## Known Metadata Names
+**Pattern matching:** exact, substring, and shell-style globs (`model*`).
 
-The current metadata filter recognizes these names:
+**Known metadata names:** `model_slug` · `message_type` · `plugin_ids` · `is_archived`
 
-- `model_slug`
-- `message_type`
-- `plugin_ids`
-- `is_archived`
+---
 
-## How Filtering Combines
+## Filtering Pipeline
 
-Filtering happens in this order:
-
-1. structural field selection through `--fields`
-2. metadata filtering through `--include` and `--exclude`
-3. formatting to text or JSON
-
-This means:
+```
+conversations.json
+  → 1. Structural field selection (--fields)
+  → 2. Metadata filtering (--include / --exclude)
+  → 3. Formatting (md / txt / json)
+  → output
+```
 
-- `--fields` decides whether structural containers like `mapping`, `message`, `author`, `content`, and `metadata` survive
-- `--include` and `--exclude` decide which metadata keys remain inside metadata dictionaries
+`--fields` decides whether containers like `mapping`, `message`, and `metadata` survive.
+`--include`/`--exclude` decides which keys remain inside those metadata containers.
 
-## Practical Recipes
+---
 
-Keep only a small readable subset:
+## Recipes
 
 ```bash
+# Minimal readable export
 chatgpt-export export data.json --fields "groups minimal"
-```
-
-Keep titles and timestamps but drop plugin noise:
 
-```bash
+# Drop noise, keep structure
 chatgpt-export export data.json --fields "exclude plugin_ids,moderation_results"
-```
-
-Keep only message-oriented structure and model metadata:
 
-```bash
+# Message structure + model info only
 chatgpt-export export data.json --fields "groups message" --include "model*"
-```
 
-Write one file per conversation with a minimal payload:
-
-```bash
+# One file per conversation, minimal payload
 chatgpt-export export data.json --split subject --output-dir exports --fields "groups minimal"
 ```
 
+---
+
 ## Notes
 
-- `analyze --fields` reports field coverage; it does not accept the export-style field-selection spec.
-- `export` can load defaults from a TOML file via `--config PATH`.
-- The repo ships `chatgpt_export.toml.example` as a template; copy it to a local file before use.
-- `export --split single` writes to stdout unless `--output` is provided.
-- Subject split files are named from the source conversation title plus identifier.
-- Split modes such as `subject`, `date`, and `id` write to `--output-dir`.
-- Text export follows the active conversation branch and is configurable through the `[transcript]` and `[text_output]` TOML sections.
-- Default text export shows user text, assistant text, assistant thoughts, and a compact preview of `user_editable_context`.
-- Default text export hides assistant code, reasoning recap, and tool plumbing unless the transcript policy explicitly enables them.
-- Advanced transcript controls include `user_editable_context_mode`, `show_visually_hidden_content_types`, `include_content_types`, and `exclude_content_types`.
-- Text layout controls include `layout_mode`, `heading_style`, `include_turn_count_in_header`, `include_turn_numbers`, `turn_separator`, `strip_chatgpt_artifacts`, and `wrap_width`.
-- A practical default is `layout_mode = "reading"` with `turn_separator = "---"` and artifact stripping enabled.
-- For tighter exports, use `layout_mode = "compact"` and disable turn counts.
-- For notes-oriented output, use `heading_style = "markdown"`.
+- `analyze --fields` reports field *coverage* — it does not use the export-style field-selection spec.
+- `export --split single` writes to stdout unless `--output` is given.
+- Subject split files are named `Title_ID.md` (or `.txt`/`.json`).
+- Default Markdown export shows user text, assistant text, thoughts, and compact context previews.
+- Hidden by default: assistant code, reasoning recap, tool plumbing.
+- All transcript visibility is configurable via `[transcript]` in the TOML config.
+- Text layout is configurable via `[text_output]`: layout mode, heading style, wrap width, separators, turn numbering.