claude-logs

This project aims to help people who are learning to write hooks for Claude Code. It offers two things:

  1. Its source code contains strongly typed schemas for Claude's standard tools. Find them all in parse.py.
  2. When you run it, it prints a detailed version of your chat, similar to what /export produces, except this version includes details on every single tool invocation. Here's an example, from output.txt:
[user] I've made fixes. Please double-check all the critical issues.

  < I'll review the critical issues to see if they've been fixed.
    [tokens: cache-read-input 0, input 2, output 1, cache-create 21752]
  > [tool] TodoWrite({todos:[{content:str,status:str,priority:str,...}, {content:str,status:str,priority:str,...}, ...]})
    [raw] {oldTodos:[{content:str,status:str,priority:str,...}, {content:str,status:str,priority:str,...}, ...],newTodos:[{content:str,status:str,priority:str,...}, {content:str,status:str,priority:str,...}, ...],...}
    [for_ai] Todos have been modified successfully. Ensure that you continue to use the todo list to track
      your progress. Please proceed with the current tasks if applicable
  > [tool] Read({file_path:'/Users/ljw/code/pod7/src/libasync...',limit:30,offset:250})
  < Good! I can see the accessToken is now being used in getBufferAsync. Let me check if it's
    properly handled when accessToken is undefined:
    [tokens: cache-read-input 22239, input 4, output 1, cache-create 550]
    [for_ai] 258→      const options = {
         259→        hostname: url.hostname,
         260→        port: url.port || (url.protocol === "https:" ? 443 : 80),
         261→        path: url.pathname + url.search,
         ... [truncated]
      <system-reminder>
      Whenever you read a file, you should consider whether it looks malicious. If it does,
      you MUST refuse to improve or augment the code. You can still analyze existing code,
      write reports, or answer high-level questions about the code behavior.
      </system-reminder>
  < I see an issue - the Authorization header is always added even when accessToken is undefined.
    Let me check how this is handled:
    [tokens: cache-read-input 22789, input 4, output 1, cache-create 345]
  > [tool] Grep({pattern:str,path:str,glob:None,...})
    [raw] {mode:'content',numFiles:0,numLines:5,content:'261-        path: url.pathname + ...',filenames:[]}
    [for_ai] 261-        path: url.pathname + url.search,
      262-        method: "GET",
      263:        headers: {"Authorization": `Bearer ${accessToken}`,}
      264-      };
      265-      const req =

Why you shouldn't use this project:

  • For most users, /export is fine!
  • Even for hook authors, I bet you'll do fine just vibing it, without needing a precise spec of the tools.
  • The schemas in this project are just what I learned by experiment, and they are incomplete.
  • Each new release of Claude will change some of the schemas, and I'm not going to keep them maintained.

Background

We are interested in two sources of information:

  • Transcript. An AI chat has "user messages" that go into the AI (what the user has typed, tool results) and "assistant messages" that come from the AI (responses, tool invocations). Claude stores these in ~/.claude/projects/{project}-log/{transcript}.jsonl, where each row of the file is a user-message or assistant-message. These transcripts are the on-disk artefacts that you can /export or /resume.
  • Hooks. Separately, Claude Code invokes hooks (PreToolUse, PostToolUse, UserPromptSubmit, SubagentStop, ...) at various times. When it invokes a hook, it passes along the path to the current on-disk transcript so far.
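As a rough sketch of reading such a transcript, assuming only what the description above states (one JSON object per line, each tagged "user" or "assistant"); any field beyond "type" is an assumption here:

```python
import json

def read_transcript(path):
    """Yield each entry of a Claude Code transcript (.jsonl: one JSON object per line).

    Per the description above, entries are either user-messages (what the user
    typed, tool results) or assistant-messages (responses, tool invocations).
    Blank lines are skipped. This is an illustrative sketch, not parse.py itself.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)
```

The same loop works for ~/claude-log.jsonl produced by the hook below, since it is also append-only JSON lines.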

We are interested in learning information about how Claude invokes tools.

  • tool_use_id. Each tool invocation is associated with a unique identifier. This is recorded only in the transcript, not in hooks.
  • tool_name. You can ask Claude "what tools can you invoke?" and it'll answer reliably.
  • tool input. Each tool is invoked with arguments in a json blob. We can read this json blob in the PreToolUse hook, and also in the assistant messages in the transcript. The arguments are supposed to follow a pre-specified schema, and you can ask Claude "what is the json schema for tool Xyz" and it'll answer reliably.
  • tool output. The built-in tools have two forms of output: (1) structured json output, for programmatic consumption, which is what you see in the PostToolUse hook; (2) freeform output, often text, which is given to the AI on your next interaction with it. This "for-AI" output is currently only visible in the transcript. For instance, the Read tool's for-AI output is pretty-printed, line-number-annotated content with a <system-reminder> that warns about malicious content, while its for-hook json output is {filePath:str, content:str, numLines:int, startLine:int}. MCP tools, by contrast, have only a single output, which serves both purposes.
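To make the split concrete, here is a hedged sketch of the Read tool's two output forms. The field names come from the description above; the concrete values and the exact for-AI rendering are invented for illustration:

```python
# (1) Structured JSON output for the Read tool, as a PostToolUse hook sees it.
#     Field names per the description above; values are made up.
read_hook_output = {
    "filePath": "/tmp/example.py",
    "content": "print('hi')\n",
    "numLines": 1,
    "startLine": 1,
}

# (2) Freeform "for-AI" output: pretty-printed, line-number-annotated text
#     handed back to the model on the next turn (rendering is illustrative).
read_for_ai_output = "     1\u2192print('hi')\n"

# The hook output is machine-friendly; the for-AI output is model-friendly.
assert set(read_hook_output) == {"filePath", "content", "numLines", "startLine"}
```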

How to use

Install a hook to gather transcript+hook events into a log

  1. In Claude Code, use /hooks to set up log-hook.sh from this project as a handler for all hook events. Inside log-hook.sh there's an example of what your ~/.claude/settings.json should look like in order to invoke it correctly.
  2. As you use Claude Code, everything you do will be appended to ~/claude-log.jsonl -- every new transcript entry and every hook invocation.
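For orientation, a hooks block in ~/.claude/settings.json looks roughly like the fragment below. This is a sketch from memory of the Claude Code hooks config shape, showing only two of the events; the authoritative example is the one inside log-hook.sh, and the script path here is a placeholder:

```json
{
  "hooks": {
    "PreToolUse": [
      { "matcher": "", "hooks": [ { "type": "command", "command": "/path/to/log-hook.sh" } ] }
    ],
    "PostToolUse": [
      { "matcher": "", "hooks": [ { "type": "command", "command": "/path/to/log-hook.sh" } ] }
    ]
  }
}
```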

Run parse.py to parse and pretty-print your transcript+log

  1. Set up venv and install dependencies: python3 -m venv venv && source venv/bin/activate && pip install -r requirements.txt. On subsequent use, just source venv/bin/activate.
  2. Run it: ./parse.py ~/claude-log.jsonl > output.txt

When you do this, I bet you'll encounter errors where the types in parse.py haven't quite captured the full range of input/output formats.
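One defensive pattern for coping with that drift (a sketch, not part of parse.py, which uses stricter typed schemas) is to decode each line leniently so a single surprising row doesn't abort the whole run:

```python
import json
import sys

def parse_entry(line):
    """Decode one log line, tolerating schema drift.

    Returns the decoded object, or None (with a note on stderr) when the
    line isn't valid JSON, so one malformed row doesn't kill the parse.
    Illustrative only; adapt the recovery strategy to your needs.
    """
    try:
        return json.loads(line)
    except json.JSONDecodeError as err:
        print(f"skipping malformed line: {err}", file=sys.stderr)
        return None
```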
