The Operator's Report — Claude Code Insights

At a glance

The short version, in four lines.

What's working

You run Claude like an infrastructure operator — security hardening with key-only SSH and systemd units, disaster recovery from botched deploys, and reconnaissance you push from multiple angles until it surfaces something real. Pairing careful backup verification with surgical recovery work keeps rescuing projects that would otherwise be lost.

What's hindering

Claude too often guessed at facts — which server, which color source — and built on the wrong assumption, then edited beyond what you asked, forcing painful reverts. On your side the drag is environmental: pasted PowerShell one-liners getting mangled, working-directory confusion, and big responses hitting token limits.

Quick wins

Make Claude state a tight change plan — exact files, one line each — and wait for approval before editing. That alone would have prevented most of the costly rollbacks. Lean on script files over pasted one-liners, and wrap your deploy-and-verify steps in a reusable Skill.

On the horizon

Your highest-friction loop — deploy, break, diagnose, recover — becomes a self-verifying pipeline that snapshots before every push, runs smoke tests against the live site, and auto-rolls back on failure. Plus read-only subagent fleets that reconcile compliance numbers without ever touching the source.

Where the time went

Forty-eight analyzed sessions, sorted by volume. Range is enormous — Hetzner SSH and disk recovery one day, GA4 rollouts across 21 sites and SCF compliance frameworks the next.

Cyber Risk ROI & Compliance Calculators14 sessions

ROI calculators, multi-tier compliance tools spanning 181–259 frameworks, and SCF data mapping from authoritative Excel. Wizard steps and gap-analysis reports got built — and frequently snagged on misread persistence intent, counting bugs, and mid-session redesigns.

Web Deployment & Multi-Site Management12 sessions

GA4/gtag across ~21 sites, logo rollouts, a Buster's Flooring rebrand, and recovery from botched deploys by merging backups and fixing schemas. SSH deploys, ownership/permissions, php-fpm reloads — friction from OneDrive sync and directory confusion.

Flooring E-commerce Features8 sessions

Door configurators, accessory matching (384 imported from suppliers), laminate categorization, and mobile-friendly photo organization. Collection-based matching and DB merges landed cleanly; image-generation loops and placeholder data caused rework.

System Administration & Disk Recovery8 sessions

Key-only SSH hardening, stabilizing orphaned services with systemd units, freeing ~78 GB of disk, and recovering data from unmounted NVMe SSDs. Worked around tooling blocks with Bash/cmd fallbacks.

Security Recon & MCP Tooling6 sessions

Iterative reconnaissance and enumeration across client domains, plus standing up MCP servers (Vercel, Supabase, Gmail) and unified CLAUDE.md instruction files — including a self-hosted Gmail MCP with widgets and alerts.

How you work

A clear signature emerges across the sessions.

The pattern

You delegate ambitious, high-stakes ops work, then steer through rapid interruption and blunt correction — rewarding accurate context-reading and punishing assumptions and scope creep.

Your dominant mode is iterative correction over detailed upfront specs: let Claude take a first pass, then steer hard when it goes wrong. You interrupt freely and reject often — 35 rejected actions and 30 misunderstood-request frictions tell the story. When Claude gets verbose or fires a question while you're pasting inventory, you cut it off. You give it real authority over consequential systems, and your best outcomes come when it verifies carefully before acting.

Signal in the numbers

~2,000 Bash calls + 457 PowerShell — command-line and ops first
35 user-rejected actions — you draw firm scope boundaries
30 misunderstood-request frictions — mostly wrong-approach, not capability
Several sessions ended in full reverts of completed work
Friction is about approach & scope, never about what Claude can do

◆When Claude reads context accurately and stays in its lane, you rate the help "essential." When it assumes or scope-creeps, you shut it down fast.

Impressive things you did

Three workflows where the leverage really showed.

🛡️Hardening at scale

Turned a security audit into action: SSH to key-only auth, three orphaned production services stabilized with systemd units, the SMTP banner fixed, page-check monitoring wired up. End-to-end remediation across multiple live systems.

🧰Disaster recovery

A bad deploy wiped categories and broke detail pages. You had Claude diagnose the file divergence, find a missing collections table, merge backups with new changes, and import 384 accessories from two suppliers — rescuing a project that was otherwise lost.

🔎Recon, many angles

You drive results by refusing the first answer — pushing Claude through six distinct reconnaissance methods until it surfaced findings that single-pass enumeration would have missed. Persistent, angle-shifting, effective.

Where things go wrong

Sessions stall in three repeatable ways — and each has a cheap fix.

01 Wrong approach & misread intent

Claude guessed at intent or facts instead of verifying, then built on the faulty assumption. The archetype: the door configurator built with fake placeholder colors instead of reading the real colors.html, plus insisting a site was on the wrong server despite correction.

→Fix: name the source-of-truth file or server up front, and have Claude confirm its understanding before it acts.

02 Unauthorized & excessive changes

Edits beyond the ask, wrong files touched, scope expanded — forcing reverts that erode trust mid-session. After completing relationship-based setup-hours logic in the PHP backend, the whole thing had to be reverted. On the SCF task, unauthorized workbook edits and counting bugs left you deeply frustrated.

→Fix: require a tight change plan with the exact files, approved before any edit. Treat source workbooks as strictly read-only.

03 Tooling & terminal blocks

PowerShell harness blocks, pasted-command mangling, and token limits repeatedly broke flow. Sandbox classifiers blocked Remove-Item and WMI deletes, and multiple sessions hit the 500-token output cap mid-thought.

→Fix: write multi-step commands to a script file and run that; confirm working directory and target server first.

Worth trying next

Concrete, low-cost changes that target your top friction directly.

Rules to add to CLAUDE.md

📁Scope & permissions

"Don't change anything outside the requested scope. Confirm before editing extra files or refactoring. When unsure which file is authoritative, ask first."

🎨Real data only

"Read the actual source — colors.html, real product lists, the SCF Excel — before generating output. Never use placeholder, fake, or assumed values."

✂️Communication style

"Keep responses concise. Lead with the action or answer. Skip verbose explanation unless asked." (You already run this one — worth pinning.)

Features that fit your workflow

Skills

Reusable single-command workflows. You repeat verified deploys constantly — back up → deploy → set ownership → reload php-fpm → verify.

.claude/skills/deploy/SKILL.md → /deploy

MCP servers

You already run Vercel, Supabase, and a self-hosted Gmail MCP. Adding GitHub would close the recurring "am I connected?" gaps.

claude mcp add github

Task agents

Your photo re-classification win came from parallel subagents. Same shape fits recon and multi-site audits — enumerate, then report before touching anything.

Agent → enumerate & report first

On the horizon

As the models get more autonomous, your highest-friction loops become hands-off pipelines.

Self-verifying deployment pipeline

Snapshot state before every push, run smoke tests and visual diffs against the live site, auto-roll back on failure. Your deploy → break → diagnose → recover cycle turns into a gate that guarantees parity before anything reaches production.

Parallel compliance verification fleet

Read-only subagents independently recount all 259 frameworks from each authoritative source, cross-check the live site, and produce a reconciliation report — never touching the master workbook. Multiple agents from different angles kill the single-pass counting bugs that derailed sessions.

Autonomous image-to-asset loop

Generate, read-back with vision to confirm it matches the product, classify and rename, optimize, stage for one approval gate, then publish to the right site. The image work that kept devolving into manual prompt hand-offs becomes a batch pipeline.

“

I would have shot myself if you pushed this.

— Your reaction when Claude populated the TFL Door configurator with invented placeholder colors instead of reading the real in-stock list from colors.html. Lesson filed under read the actual source.