Autonomous Hack The Box agent powered by Hermes + DeepSeek V4 Pro
SPECTER is a fully autonomous HTB penetration testing agent built on Hermes Agent, the self-improving AI agent framework from Nous Research. Unlike Claude Code-based agents, SPECTER has persistent cross-session memory, self-improving skills, Telegram-native notifications, and automatic IP continuity when HTB resets machines.
| Feature | Claude Code | SPECTER (Hermes) |
|---|---|---|
| Memory across sessions | ❌ Starts blank every time | ✅ MEMORY.md: persists every finding |
| Self-improving skills | ❌ Static skill files | ✅ Skills patch themselves after each machine |
| Cross-session search | ❌ No history | ✅ FTS5 SQLite: recall any past session |
| Telegram interface | ❌ Terminal only | ✅ Native gateway: full conversation from phone |
| Cron automation | ❌ Manual triggers only | ✅ Scheduled IP monitor, auto-checkpoint |
| IP change handling | ❌ Manual /etc/hosts edit | ✅ HTB API auto-resolves, zero user input |
| Flag auto-submission | ❌ Manual | ✅ Regex detects + submits via HTB API |
| Context window recovery | ❌ Start over | ✅ Checkpoint every 8 iterations, resume anywhere |

    ┌─────────────────────────────────────────────────────────┐
    │                      SPECTER LOOP                       │
    │                                                         │
    │  Telegram ──► Hermes Gateway ──► DeepSeek V4 Pro        │
    │                      │                                  │
    │                SOUL.md (identity)                       │
    │                AGENTS.md (project context)              │
    │                MEMORY.md (persistent findings)          │
    │                skills/ (evolving playbooks)             │
    │                      │                                  │
    │  OBSERVE → HYPOTHESIZE → PLAN → EXECUTE → REFINE        │
    │                      │                                  │
    │  HTB API ────────────┤ auto-submit flags                │
    │  /etc/hosts ─────────┤ auto-update on IP change         │
    │  Telegram ───────────┘ flag capture / checkpoint notify │
    └─────────────────────────────────────────────────────────┘

When HTB resets a machine and assigns a new IP (same machine, different address), SPECTER handles it automatically:
- A cron job polls the HTB API every 5 minutes (`wakeAgent: false` when nothing changed, so zero LLM cost)
- On a detected IP change, it updates `/etc/hosts`, `state/.env`, and the `MEMORY.md` entry
- It sends a Telegram message: `TwoMillion IP updated: 10.10.11.227 → 10.10.11.235`
- It continues from the last checkpoint, so no context is lost
You never touch /etc/hosts manually.
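The `/etc/hosts` step can be pictured as a pure function over the file's text. This is only a minimal sketch, not SPECTER's actual code: the helper name `update_hosts_text` and the hostname `twomillion.htb` are assumptions made for illustration.

```python
def update_hosts_text(hosts_text: str, hostname: str, new_ip: str) -> str:
    """Return hosts-file text in which `hostname` maps to `new_ip`.

    Hypothetical helper. Any existing line that maps `hostname` is dropped
    (including other aliases on that same line) and one fresh mapping is
    appended at the end.
    """
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments when looking for the hostname.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            continue  # stale mapping for this host
        kept.append(line)
    kept.append(f"{new_ip}\t{hostname}")
    return "\n".join(kept) + "\n"


old = "127.0.0.1\tlocalhost\n10.10.11.227\ttwomillion.htb\n"
new = update_hosts_text(old, "twomillion.htb", "10.10.11.235")
```

Keeping the rewrite free of file I/O makes it easy to test; the caller would read and write the real file with whatever privileges that requires.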
SPECTER's MEMORY.md grows with every machine:

    active_machine: TwoMillion (10.10.11.227)
    last_phase: PRIVESC
    TwoMillion ports: 22/ssh OpenSSH 8.2, 80/http Apache 2.4.41
    TwoMillion owned via: LFI → log poisoning → RCE, then sudo /usr/bin/dd SUID
    [TECHNIQUE] LFI log poison works on: Apache, Nginx with error logging
    [TECHNIQUE] sudo dd privesc: copy /etc/passwd, add root user

On the next Apache machine, SPECTER tries LFI first because it remembers.
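In SPECTER the model itself reads MEMORY.md, so no lookup code is implied by the text above. Purely to illustrate the idea of recalling a technique by service name, here is a small sketch; the function `recall_techniques` is hypothetical and only relies on the `[TECHNIQUE]` prefix shown in the example.

```python
def recall_techniques(memory_text: str, service: str) -> list[str]:
    """Return [TECHNIQUE] notes from MEMORY.md text that mention `service`."""
    prefix = "[TECHNIQUE]"
    hits = []
    for line in memory_text.splitlines():
        line = line.strip()
        # Case-insensitive substring match on the service name.
        if line.startswith(prefix) and service.lower() in line.lower():
            hits.append(line.removeprefix(prefix).strip())
    return hits


memory = (
    "[TECHNIQUE] LFI log poison works on: Apache, Nginx with error logging\n"
    "[TECHNIQUE] sudo dd privesc: copy /etc/passwd, add root user\n"
)
apache_notes = recall_techniques(memory, "apache")
```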
After owning a machine, SPECTER patches its own skill files:

    skill_manage(action="patch", name="exploit",
        old_string="# LFI payloads",
        new_string="# LFI payloads\n# Apache 2.4.41: /var/log/apache2/access.log via User-Agent")

Skills get smarter with every engagement. Never re-derive what already worked.
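The `old_string` / `new_string` pair suggests an exact-string replacement. Assuming that is how a patch behaves (the source does not spell it out), the idea can be sketched as below. `patch_text` is a hypothetical stand-in, not the Hermes implementation, and the rule that the match must be unique is a design choice of this sketch.

```python
def patch_text(text: str, old_string: str, new_string: str) -> str:
    """Replace exactly one occurrence of `old_string`; refuse ambiguous patches."""
    count = text.count(old_string)
    if count != 1:
        raise ValueError(
            f"expected exactly 1 match for {old_string!r}, found {count}"
        )
    return text.replace(old_string, new_string, 1)


skill = "# Exploit skill\n# LFI payloads\n../../../../etc/passwd\n"
patched = patch_text(
    skill,
    "# LFI payloads",
    "# LFI payloads\n# Apache 2.4.41: /var/log/apache2/access.log via User-Agent",
)
```

Rejecting zero or multiple matches keeps an automated patch from silently landing in the wrong place.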
Talk to SPECTER from your phone while it works. All notifications fire automatically:

    You:     Start machine TwoMillion and begin recon
    SPECTER: Machine spawned. IP: 10.10.11.227
             Nmap fast scan running.
             Ports: 22/ssh, 80/http
             Apache 2.4.41 on port 80.
             Running feroxbuster.
    SPECTER: 🚩 TwoMillion USER FLAG: HTB{r3v3rs3_3ng1n33r1ng_...}
    SPECTER: ☠️ TwoMillion ROOT FLAG: HTB{4dm1n_t0k3n_...}
    SPECTER: 🎯 OWNED: TwoMillion - user+root in 2h 14m

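Flag capture depends on spotting a flag in command output. A minimal sketch of that detection step follows, assuming flags use the `HTB{...}` shape shown in the transcript; the pattern and the name `find_flags` are assumptions, and targets that use a different flag format would need a different pattern.

```python
import re

# Assumed pattern: matches the HTB{...} shape from the transcript above.
FLAG_RE = re.compile(r"HTB\{[^{}\s]+\}")


def find_flags(output: str) -> list[str]:
    """Return unique flags found in `output`, in order of first appearance."""
    seen: list[str] = []
    for match in FLAG_RE.findall(output):
        if match not in seen:
            seen.append(match)
    return seen


shell_output = "uid=33(www-data)\nHTB{abc_123}\nnoise HTB{abc_123} HTB{x}\n"
flags = find_flags(shell_output)
```

De-duplicating before submission avoids sending the same flag twice when it appears in repeated output.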
Two scheduled jobs run without any user interaction:
IP Monitor (every 5 min): uses `wakeAgent: false` when the IP is unchanged, so it costs almost nothing to run constantly. It wakes the agent only when the IP actually changes.
Checkpoint saver (every 8 iterations): writes compressed state to wiki/sessions/ and sends a Telegram summary. Any session can be restarted from any checkpoint.
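The two scheduling decisions can be sketched as small pure functions. The `wakeAgent` key comes from the text above; the rest of the returned dictionary and both function names are assumptions for illustration, not the Hermes cron API.

```python
def ip_monitor_tick(last_ip: str, current_ip: str) -> dict:
    """Decide whether a poll result should wake the agent."""
    if current_ip == last_ip:
        # Nothing changed: the job ends without invoking the LLM.
        return {"wakeAgent": False}
    return {"wakeAgent": True, "old_ip": last_ip, "new_ip": current_ip}


def should_checkpoint(iteration: int, every: int = 8) -> bool:
    """True on every `every`-th iteration (8, 16, 24, ... by default)."""
    return iteration > 0 and iteration % every == 0


unchanged = ip_monitor_tick("10.10.11.227", "10.10.11.227")
changed = ip_monitor_tick("10.10.11.227", "10.10.11.235")
checkpoints = [i for i in range(1, 20) if should_checkpoint(i)]
```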

    specter-htb/
    ├── SOUL.md              ← Agent identity (Hermes global personality)
    ├── AGENTS.md            ← Project context (loaded at every session start)
    ├── config.yaml          ← Hermes config: DeepSeek, Telegram, cron, skills
    ├── setup.sh             ← One-command setup wizard
    ├── .env.example         ← Environment variables template
    │
    ├── skills/              ← Hermes SKILL.md files (slash commands)
    │   ├── htb-api/SKILL.md ← /htb-api (spawn, IP resolve, flag submit)
    │   ├── recon/SKILL.md   ← /recon (nmap, web enum, service fingerprint)
    │   ├── exploit/SKILL.md ← /exploit (web, network, Metasploit)
    │   ├── privesc/SKILL.md ← /privesc (Linux + Windows LPE)
    │   └── report/SKILL.md  ← /report (HTB writeup generator)
    │
    ├── wiki/                ← LLM-maintained knowledge base
    │   ├── targets/         ← One .md per HTB machine
    │   ├── techniques/      ← Reusable attack patterns
    │   ├── sessions/        ← Checkpoint files (resume points)
    │   └── flags/           ← Captured flags archive
    │
    ├── state/
    │   ├── session.json     ← Current machine, phase, findings, iteration count
    │   ├── scope.txt        ← Authorized IP ranges (HTB standard ranges pre-loaded)
    │   └── .htb_token       ← HTB API token (gitignored)
    │
    └── tools/
        ├── htb.py           ← HTB API v4: spawn/ip/reset/submit-flag
        └── save.py          ← Auto-save engine + checkpoint generator

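The tree describes `state/session.json` as holding the current machine, phase, findings, and iteration count. One plausible shape for that state is sketched below; the field names and the `SessionState` class are assumptions, since the actual schema is not shown here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SessionState:
    """Hypothetical mirror of state/session.json."""
    machine: str
    phase: str
    iteration: int = 0
    findings: list[str] = field(default_factory=list)


def dump_state(state: SessionState) -> str:
    """Serialize the state to a JSON string."""
    return json.dumps(asdict(state), indent=2)


def load_state(raw: str) -> SessionState:
    """Rebuild the state from a JSON string produced by dump_state."""
    return SessionState(**json.loads(raw))


state = SessionState("TwoMillion", "PRIVESC", 12, ["22/ssh", "80/http"])
restored = load_state(dump_state(state))
```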
| File | Hermes Role | What SPECTER Uses It For |
|---|---|---|
| `SOUL.md` | Global personality | Agent identity, loaded every session, every platform |
| `AGENTS.md` | Project context | Session start protocol, architecture, memory strategy |
| `skills/*/SKILL.md` | Slash commands | `/recon`, `/exploit`, `/privesc`, `/report`, `/htb-api` |
| `MEMORY.md` | Persistent memory | Cross-session findings, owned machines, technique patterns |
- Hermes Agent installed
- DeepSeek API key (V4 Pro)
- HTB App Token (Settings → API Key → App Token)
- Telegram bot token from @BotFather
- HTB VPN connected (`tun0` interface)
- Python 3.10+ with the `requests` library

    # Install Hermes Agent
    curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
    source ~/.bashrc

    # Clone SPECTER and run the setup wizard
    git clone https://github.com/0xb4bal/specter.git
    cd specter
    chmod +x setup.sh
    ./setup.sh

The setup wizard will ask for:
- DeepSeek API key
- HTB App Token
- Telegram bot token + your user ID

    # Start the Hermes gateway (runs in background, enables Telegram)
    HERMES_HOME=~/.hermes-specter hermes gateway start
    # Or run interactive CLI
    HERMES_HOME=~/.hermes-specter hermes

Then send SPECTER its first instruction:

    Start machine TwoMillion and begin recon

SPECTER resolves the machine name → spawns it → gets the IP → updates /etc/hosts → begins Phase 1.
Inside any Hermes session (CLI or Telegram):
| Command | What it does |
|---|---|
| `/htb-api` | Verify/refresh target IP, spawn machine, submit flags |
| `/recon` | Phase 1: nmap, web enum, service fingerprint |
| `/exploit` | Phase 3: web/network exploitation, reverse shells |
| `/privesc` | Phase 5: Linux/Windows privilege escalation |
| `/report` | Generate HTB writeup from wiki + memory |
| `/compress` | Compress context when approaching token limit |
| `/skills` | List all available skills |
| `/cron list` | Show scheduled jobs |

      You: "Start TwoMillion"
                 │
    ┌────────────▼────────────┐
    │  /htb-api               │
    │  Resolve name → ID      │
    │  Spawn machine          │
    │  Get IP (poll API)      │
    │  Update /etc/hosts      │
    │  Save to MEMORY.md      │
    └────────────┬────────────┘
                 │
    ┌────────────▼────────────┐
    │  /recon                 │
    │  Check MEMORY.md first  │ ← Did we do this stack before?
    │  nmap fast + full       │
    │  Web/SMB/SSH enum       │
    │  Save all to MEMORY     │
    └────────────┬────────────┘
                 │
    ┌────────────▼────────────┐
    │  /exploit               │
    │  Check session_search   │ ← What worked on Apache 2.4.x?
    │  Try proven techniques  │
    │  Get shell              │
    │  Flag detected → auto   │
    │  submit + Telegram      │
    └────────────┬────────────┘
                 │
    ┌────────────▼────────────┐
    │  /privesc               │
    │  Run linpeas            │
    │  Check memory for LPE   │
    │  Escalate to root       │
    │  Root flag → auto       │
    │  submit + Telegram      │
    └────────────┬────────────┘
                 │
    ┌────────────▼────────────┐
    │  /report                │
    │  Compile writeup        │
    │  Update skills (patch)  │
    │  Update MEMORY.md       │
    └─────────────────────────┘


    # config.yaml
    model: deepseek/deepseek-chat-v4-pro
    provider: deepseek

To switch to another model:

    HERMES_HOME=~/.hermes-specter hermes model
    # Interactive picker: change to any Hermes-supported model
    # Works with: OpenRouter, Anthropic, OpenAI, local endpoints

- Message @BotFather → `/newbot` → name it "SPECTER HTB"
- Copy the token into setup.sh when prompted
- Get your user ID from @userinfobot
- Start the gateway: `HERMES_HOME=~/.hermes-specter hermes gateway start`
- Send any message to your bot to activate it
SPECTER uses two Hermes memory files stored in ~/.hermes-specter/memories/:
MEMORY.md (~800 tokens), the agent's working notes:

    active_machine: TwoMillion (10.10.11.227)
    last_phase: PRIVESC | iteration: 12
    TwoMillion ports: 22/ssh, 80/http Apache 2.4.41
    TwoMillion owned via: LFI log poison → RCE → sudo dd privesc
    [TECH] Apache 2.4.41 → check LFI first, /var/log paths work

USER.md (~500 tokens), your preferences:

    Prefers caveman output mode
    HTB focus: medium/hard Linux machines
    Notify Telegram on every flag, every checkpoint

Memory is bounded (2,200 chars max) and auto-consolidated. The agent manages it; you don't.
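How consolidation works is not described here, and in practice the agent decides what to keep. As a rough illustration of what a hard character bound implies, the sketch below keeps only the newest lines that fit. The function `trim_memory` and its "newest wins" policy are assumptions, not the Hermes behavior.

```python
MEMORY_CHAR_LIMIT = 2200  # bound stated above


def trim_memory(lines: list[str], limit: int = MEMORY_CHAR_LIMIT) -> list[str]:
    """Keep the newest lines whose joined length fits within `limit` chars."""
    kept: list[str] = []
    used = 0
    for line in reversed(lines):  # newest entries are assumed to be last
        cost = len(line) + 1  # +1 for the trailing newline
        if used + cost > limit:
            break
        kept.append(line)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


notes = ["a" * 10, "b" * 10, "c" * 10]
trimmed = trim_memory(notes, limit=22)
```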
- HTB API token stored in `state/.htb_token` (gitignored)
- Never stored in MEMORY.md or wiki/
- Scope enforcement: only targets listed in `state/scope.txt` are tested
- All actions logged to `wiki/log.md` (audit trail)
- Responsible use: HTB lab machines only
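Scope enforcement comes down to checking a target address against the ranges in `state/scope.txt` before any tool runs. A minimal sketch with Python's standard `ipaddress` module follows; the file format (one CIDR range per line, `#` comments) and the example ranges are assumptions for illustration.

```python
import ipaddress


def load_scope(scope_text: str) -> list:
    """Parse one CIDR range per line, ignoring blanks and # comments."""
    networks = []
    for line in scope_text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def in_scope(target_ip: str, networks: list) -> bool:
    """True if `target_ip` falls inside any authorized network."""
    addr = ipaddress.ip_address(target_ip)
    return any(addr in net for net in networks)


# Illustrative ranges only; use the ranges actually listed in scope.txt.
scope = load_scope("# lab ranges\n10.10.10.0/24\n10.10.11.0/24\n")
allowed = in_scope("10.10.11.227", scope)
blocked = in_scope("192.168.1.10", scope)
```

Running this check first and refusing to continue on a miss is what turns the scope file into a hard guard instead of a convention.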
| Project | What was taken |
|---|---|
| Hermes Agent | Agent runtime, skills system, memory, Telegram gateway, cron |
| PHANTOM HTB Agent | HTB methodology, recursive loop, wiki pattern |
| Karpathy autoresearch | Observe → hypothesize → execute recursive loop |
| LLM Wiki | Three-layer knowledge architecture |
MIT. See LICENSE.
For authorized use on HTB lab machines only. Never test systems without explicit written permission.