July 4, 2026
Two Equations for Forcing KV-Cache Offload
Or how to stop setting --gpu-memory-utilization by vibes and start forcing your offloading tier to actually do something.
A personal journal exploring computer science. Expect a diverse mix of content, from research paper breakdowns to hands-on tutorials: whatever captures my curiosity.
We built Archē: a self-hosted agentic operating system for startups and teams. Not another AI chat wrapper. A real system: agents, knowledge, and integrations, all in one place, all under your control.
AI can make you faster. But if you stop thinking, you do not become faster, you become dependent. My goal is simple: use agents to accelerate thinking, not to outsource it.