[bmdpat]

SEC 001 / LOCAL HARDWARE

Local GPU work, measured plainly.

I use an RTX 5090 to test local models, agent runs, and workflow experiments. The notes track what fits in VRAM, what breaks, and when local hardware is worth the extra setup.

AgentGuard stays in the stack when a run needs budget, loop, timeout, or rate limits before it touches real work.
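As a rough illustration of what budget, loop, and timeout limits mean for a run, here is a minimal sketch in plain Python. It is not AgentGuard's API: the `RunLimits` class, the `LimitExceeded` exception, and every parameter name are hypothetical, and rate limiting is left out to keep it short.

```python
import time


class LimitExceeded(RuntimeError):
    """Raised when a run crosses one of its configured limits."""


class RunLimits:
    """Hypothetical guard: caps spend, step count, and wall-clock time for one run."""

    def __init__(self, max_cost_usd, max_steps, max_seconds):
        self.max_cost_usd = max_cost_usd
        self.max_steps = max_steps
        self.max_seconds = max_seconds
        self.cost_usd = 0.0
        self.steps = 0
        self.started = time.monotonic()

    def charge(self, cost_usd):
        """Record one agent step and its cost, then check every limit."""
        self.steps += 1
        self.cost_usd += cost_usd
        if self.cost_usd > self.max_cost_usd:
            raise LimitExceeded(f"budget: ${self.cost_usd:.2f} > ${self.max_cost_usd:.2f}")
        if self.steps > self.max_steps:
            raise LimitExceeded(f"loop: {self.steps} steps > {self.max_steps}")
        if time.monotonic() - self.started > self.max_seconds:
            raise LimitExceeded(f"timeout: exceeded {self.max_seconds}s")
```

The agent loop would call `charge()` once per step and stop on `LimitExceeded`, so a runaway run ends before it touches real work.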

RTX 5090 / benchmark artifacts / failure logs / VRAM fit tools / local-agent notes

SEC 002 / BENCHMARK EVIDENCE

The lab notebook is the proof surface.

The first public run proved RTX 5090 hardware detection and exposed an Ollama runner timeout before it produced a valid tokens/sec row. That failure stays in the notebook.

Every public claim needs a concrete artifact: benchmark CSV, failure report, architecture diagram, repo note, or cost curve. If a run fails, it stays in the record.
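One way a benchmark CSV can carry a failed run is to keep the same columns and leave the result empty. The sketch below assumes a hypothetical `BenchmarkRow` schema (the column names are not taken from the actual reports) and fills it with the failed gemma4:26b run described in this notebook.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class BenchmarkRow:
    """Hypothetical schema for one benchmark attempt, successful or not."""
    date: str
    model: str
    num_ctx: int
    num_predict: int
    status: str                      # "ok" or "failed"
    tokens_per_sec: Optional[float]  # None when the run produced no valid number
    failure_mode: str = ""


def to_csv(rows):
    """Serialize rows to CSV text; a None value becomes an empty cell."""
    buf = io.StringIO()
    names = [f.name for f in fields(BenchmarkRow)]
    writer = csv.DictWriter(buf, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()


failed = BenchmarkRow("2026-06-12", "gemma4:26b", 1024, 16,
                      "failed", None, "timeout after 5 s")
print(to_csv([failed]))
```

The empty `tokens_per_sec` cell is the point: the row exists, but no performance number is invented for it.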

Tokens/sec
Model x quant

Measured on real local-agent prompts, not placeholder demos.

VRAM pressure
Context + cache

What fits, what spills, and what changes after quantization.

Cost curve
Local vs API

Per-workload math for agents that run often enough to matter.

Failure log
Timeouts included

Runner crashes, bad configs, and dead ends stay in the record.
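For the VRAM pressure card, the context-dependent part is the KV cache, which grows linearly with context length. The sketch below uses the standard transformer estimate (keys plus values, per layer, per KV head). The model dimensions in the usage lines are hypothetical placeholders, not measurements of any model in the lab, and the fixed `overhead_gib` allowance is an assumption.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Estimate KV cache size: the factor 2 covers keys and values."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


def fits(weights_gib, kv_gib, vram_gib, overhead_gib=1.5):
    """Rough fit check: weights + KV cache + assumed runtime overhead vs. VRAM."""
    return weights_gib + kv_gib + overhead_gib <= vram_gib


# Hypothetical model shape: 48 layers, 8 KV heads, head_dim 128, fp16 cache.
kv_gib = kv_cache_bytes(48, 8, 128, n_ctx=8192) / 2**30
print(kv_gib)                                            # 1.5
print(fits(weights_gib=16.0, kv_gib=kv_gib, vram_gib=32.0))  # True
```

Quantization shrinks `weights_gib`, and a quantized cache lowers `bytes_per_elem`, which is why the fit changes after quantization. The `vram_gib` value should come from a real hardware snapshot rather than the card's advertised capacity.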

2026-06-12 / benchmark-failed

The 5090 Reports - 2026-06-12

Hardware capture is live. The first bounded Ollama benchmark failed before producing a valid tokens/sec row, so the public artifact reports the miss instead of inventing a performance claim.

Source: nvidia-smi + Reports/5090/benchmarks
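A hardware snapshot of this kind can be taken with `nvidia-smi`'s query mode. The sketch below is an assumption about how such a capture might look, not the script behind the report: the function names and dictionary keys are hypothetical, while the `--query-gpu` and `--format` flags are standard `nvidia-smi` options.

```python
import subprocess

# With nounits, nvidia-smi reports memory values in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_snapshot(text):
    """Parse one CSV line per GPU into a dict."""
    gpus = []
    for line in text.strip().splitlines():
        name, total, used = [part.strip() for part in line.split(",")]
        gpus.append({
            "name": name,
            "memory_total_mib": int(total),
            "memory_used_mib": int(used),
        })
    return gpus


def capture():
    """Run nvidia-smi and return the parsed snapshot (requires an NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True,
                         check=True, timeout=10)
    return parse_snapshot(out.stdout)
```

Keeping the parser separate from the subprocess call means it can be checked against saved output on a machine without a GPU.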

2026-06-12 / failed

5090 Benchmark Failure - gemma4:26b

The Ollama request timed out after 5 seconds with gemma4:26b at num_ctx 1024 and num_predict 16.

Source: Reports/5090/failures/2026-06-12-gemma4-26b.md
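A bounded request like the one in this failure report can be sketched against Ollama's `/api/generate` endpoint. This is an illustration under stated assumptions, not the lab's benchmark runner: the prompt, the function names, and the shape of the returned dict are hypothetical, and the URL assumes Ollama's default local port. The `num_ctx` and `num_predict` options and the `eval_count` and `eval_duration` response fields are part of Ollama's API.

```python
import json
import socket
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumes the default local port


def build_payload(model, prompt, num_ctx, num_predict):
    """Non-streaming generate request with explicit context and output caps."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx, "num_predict": num_predict},
    }


def tokens_per_sec(response):
    """eval_count tokens were generated over eval_duration nanoseconds."""
    return response["eval_count"] / (response["eval_duration"] / 1e9)


def bounded_generate(payload, timeout_s=5.0):
    """Return an ok row with tokens/sec, or a failed row naming the error."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            body = json.load(resp)
        return {"status": "ok", "tokens_per_sec": tokens_per_sec(body)}
    except (TimeoutError, socket.timeout, urllib.error.URLError) as exc:
        return {"status": "failed", "failure_mode": repr(exc)}


payload = build_payload("gemma4:26b", "hypothetical prompt", num_ctx=1024, num_predict=16)
```

In this sketch the timeout covers the whole request, so time spent loading a cold model counts against the same 5 seconds as generation.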

SEC 003 / PRODUCT PATH

Local hardware is the lab. Repeated measurements become tools.

The destination is self-serve local-agent infrastructure: observability, memory, MCP, runtime limits, and local hardware fit.

Phase 0

Instrument the lab

Weekly reports from hardware snapshots, benchmark CSVs, and failure logs.

Phase 1

Distribute artifacts

Three posts per week across LinkedIn, X, and r/LocalLLaMA, all pointing here.

Phase 2

Capped deployments

Inbound-only, async paid R&D for regulated teams that need local AI.

Phase 3

Extract product

Local agent observability, memory, or MCP tooling rebuilt from repeated deployment work.

Operating rules

Publish the model, quant, prompt, hardware, and result.

Run one local experiment per week, even when the result is a failure.

No fake benchmark numbers. A failed run is a valid artifact.

Prefer measured notes over broad claims.

OPERATING LOOP

One person. Small tools. Agent-assisted ops.

01

Run

Execute the local model path on local GPUs.

02

Measure

Record tokens, latency, VRAM, cost, and failure mode.

03

Publish

Turn the result into a report, tool, or guarded SDK path.
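The cost figure recorded in the Measure step feeds the local-vs-API comparison. A minimal sketch of that per-workload math follows. It assumes local cost is electricity only and that API cost is billed per million tokens; all function and parameter names are hypothetical, and no real prices are implied.

```python
import math


def local_cost_per_run(watts, seconds, usd_per_kwh):
    """Electricity cost of one local run at an average power draw."""
    return watts / 1000 * seconds / 3600 * usd_per_kwh


def api_cost_per_run(input_tokens, output_tokens, usd_per_m_input, usd_per_m_output):
    """API cost of the same run, priced per million input and output tokens."""
    return (input_tokens / 1e6 * usd_per_m_input
            + output_tokens / 1e6 * usd_per_m_output)


def break_even_runs(hardware_usd, api_usd_per_run, local_usd_per_run):
    """Runs needed to recover the hardware cost; None if local never wins."""
    saving = api_usd_per_run - local_usd_per_run
    if saving <= 0:
        return None
    return math.ceil(hardware_usd / saving)
```

An agent that runs rarely may never reach the break-even count, which is the sense in which only workloads that run often enough matter.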

SEC 007 / LAB NOTES

Get the local GPU build notes.

Weekly notes from the 5090 lab: benchmark rows, failure logs, private-agent deployment notes, and the tools that fall out of repeated local AI work.
