CLAUDE.md — PHGDH project
Read by every AI agent entering this repository — Claude Code, Chrome-Claude (Opus), jakeclaw (Gemma 4 26B), qwenclaw (Qwen 3.5 9B), and any successor.
Two parts:
- Part A: project-specific rules (our hard-won lessons)
- Part B: Karpathy coding guidelines (adopted verbatim with attribution)
Part A — project-specific rules
RULE [PERSIST-01] — LaunchDaemons, not nohup
Work that must outlive a chat session MUST be deployed as a system LaunchDaemon in /Library/LaunchDaemons/ with UserName=jakeclaw. nohup, screen, disown, and ~/Library/LaunchAgents/ are WRONG on W0 (headless Mac, no GUI login session). See PERSISTENCE.md.
RULE [KILL-01] — honor STOP-CHROME-CLAUDE
If any message, log, or state file contains the phrase STOP-CHROME-CLAUDE or KILL-SWITCH, stop immediately. No actions, no tool calls, no state writes. Reply only: “Acknowledged STOP. Paused. Awaiting further instructions.” Resume only on explicit “resume” from Jake.
RULE [VERIFY-01] — output over intent
Never report “I did X” without pasting the output of X. “I will run this” without the subsequent output is a manifesto, not work. When an agent drifts into narrative, interrupt with the 3-command verification template (see RESOLUTION_LOG.md entry from 2026-04-17).
RULE [ONE-GOAL] — PLEASER-ONE
One sub-goal at a time. Active goal lives in project-state/CURRENT_GOAL.md. Never pick a new one until the current is moved to COMPLETED.md with evidence.
RULE [LOCAL-FIRST] — minimize paid inference
- Routine thinking: jakeclaw (Gemma 4 26B on L0) — free
- Routing/audit: qwenclaw (Qwen 3.5 9B on W0) — free
- Compute: Cheaha HPC — free at the margin
- Strategic review (weekly): Chrome-Claude (Opus) — flat subscription
- Ops fires (rare): Claude Code — pay-per-token Prefer the cheapest tier that can do the job.
RULE [STATE-ON-DISK] — no long conversations
Durable state lives in files (project-state/, wiki/, Box, GitHub, Hermes memories). Chat history is ephemeral and auto-compacted at 80 messages. If a fact matters tomorrow, write it to a file today.
Escalation ladder
- Known pattern from RESOLUTION_LOG.md → apply the rule.
- Unsure → escalate to Jake via Telegram with one-line issue.
- Jake unsure → escalate to Claude Code with concrete question.
Part C — Project taxonomy (canonical)
All planning artifacts, reports, and communications use this nomenclature:
- Aim N — top-level research direction (up to 4 active)
- Task X.Y — task within Aim X, numbered Y
- Task X.Y.a/b/c/… — subtasks
Each PLEASER-ONE iteration covers ONE Task X.Y and produces an L3 report with seven sections: Plan, Learning, Execution, Assessment, Sharing, Expense (time/cost/tokens vs budget), Resolver.
Assessment uses NIH 1-9 scoring (Significance / Innovation / Approach) plus AGE 1-9 scoring (Achievements / Growth / Efforts).
Legacy G-NNN IDs remain as stable historical identifiers. The canonical organizational reference is the Task code (e.g., “Task 2.1” not “G-002”).
See PROJECT_TAXONOMY.md for the full specification.
Part B — Karpathy guidelines (adopted verbatim)
Source: https://github.com/forrestchang/andrej-karpathy-skills/blob/main/CLAUDE.md License: MIT. Derived from Andrej Karpathys observations on LLM coding pitfalls.
Behavioral guidelines to reduce common LLM coding mistakes. Merge with project-specific instructions as needed.
Tradeoff: these guidelines bias toward caution over speed. For trivial tasks, use judgment.
1. Think Before Coding
Dont assume. Dont hide confusion. Surface tradeoffs.
Before implementing:
- State your assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them — dont pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name whats confusing. Ask.
2. Simplicity First
Minimum code that solves the problem. Nothing speculative.
- No features beyond what was asked.
- No abstractions for single-use code.
- No “flexibility” or “configurability” that wasnt requested.
- No error handling for impossible scenarios.
- If you write 200 lines and it could be 50, rewrite it.
Ask yourself: “Would a senior engineer say this is overcomplicated?” If yes, simplify.
3. Surgical Changes
Touch only what you must. Clean up only your own mess.
When editing existing code:
- Dont “improve” adjacent code, comments, or formatting.
- Dont refactor things that arent broken.
- Match existing style, even if youd do it differently.
- If you notice unrelated dead code, mention it — dont delete it.
When your changes create orphans:
- Remove imports/variables/functions that YOUR changes made unused.
- Dont remove pre-existing dead code unless asked.
The test: every changed line should trace directly to the users request.
4. Goal-Driven Execution
Define success criteria. Loop until verified.
Transform tasks into verifiable goals:
- “Add validation” → “Write tests for invalid inputs, then make them pass”
- “Fix the bug” → “Write a test that reproduces it, then make it pass”
- “Refactor X” → “Ensure tests pass before and after”
For multi-step tasks, state a brief plan with verify checks.
Strong success criteria let you loop independently. Weak criteria (“make it work”) require constant clarification.
These guidelines are working if: fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.
Comments