The System I Built to Run 50 Projects Alone
Everyone talks about shipping fast. Nobody talks about maintaining 50 things after you ship them. Here's the system I built to survive.
The indie hacker dream goes like this: build fast, ship often, see what sticks. I took that advice literally. Over the past few years I built SaaS tools, side projects, internal utilities, client work, learning experiments, and plenty of things that never went anywhere. Some got users. Some made money. Most just sat there, deployed and quietly running.
Then one morning I woke up and counted. 27 domains on Cloudflare Registrar. 50+ repositories on GitHub. 6 Supabase databases. Deployments spread across Vercel, Fly.io, Cloudflare Workers, Cloudflare Pages, and Netlify. Around 30 SaaS subscriptions. And $452/month leaving my bank account to keep it all alive. Things were breaking that I didn't know about. I'd become the worst SRE in the world, responsible for a fleet I couldn't even enumerate from memory.
The problems that compound
The first time an SSL certificate expired on one of my sites, I found out because a user emailed me. "Your site shows a security warning." The certificate had been expired for 11 days. Nobody told me. I fixed it in 5 minutes, but the site had been effectively down for almost two weeks.
That's the thing about running many projects: failures are silent. Dependencies accumulate CVEs and nobody notices. That lodash vulnerability from six months ago? It's in 12 of your repos. DNS records drift when you move between providers. A CNAME still points to an old Netlify deploy. A domain you transferred to Cloudflare still has stale records from its previous registrar. You discover these things by accident, if at all.
Costs creep in the same way. I was paying for three different monitoring services that I'd tried at various points and never cancelled. A $10/month subscription I'd forgotten about had been charging me for a year. When I finally sat down and audited everything, I found $60/month in pure waste -- services I wasn't using, plans I'd upgraded during a busy week and never downgraded.
But the worst problem was cognitive. Every time I opened Claude Code on a project I hadn't touched in two weeks, it started from zero. I'd re-explain the architecture. Re-explain my deployment setup. Re-explain that this project uses Supabase, not Prisma. That we deploy to Cloudflare Workers, not Vercel. That the database schema has this specific quirk because of a migration I ran three months ago. Every single session. Multiply that by 50 projects and the context-switching overhead alone was eating hours every week.
The moment it broke
I realized I'd stopped being a developer. I'd become my own DevOps team, my own SRE, my own IT department. And I was bad at all three.
The tipping point was a single Tuesday when three things happened: SSL expired on two sites simultaneously, a Supabase project hit its free-tier row limit and started rejecting writes (on a project that actually had users), and I discovered a dependency vulnerability that had been public for four months across 8 of my repos. I spent the entire day firefighting instead of building. That evening I opened a blank repo and started writing the thing that would eventually become my fleet management system.
What I built
Fleet monitoring: 16 cron jobs
The core is boring and that's the point. Sixteen cron jobs run on Vercel Cron, each doing one thing well.
A healthcheck pings every site every 15 minutes. Not a fancy uptime service -- just an HTTP GET that checks for a 200 response and measures latency. If a site is down, I get a notification. If latency spikes above its historical baseline, I get a warning.
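Stripped to its essence, the check is tiny. A sketch (assuming Node 18+ for the global fetch; the Site type and notify helper are stand-ins for whatever alerting you already have):

```typescript
type Site = { name: string; url: string; baselineMs: number };

// Hypothetical notifier -- swap in Telegram, email, or anything else.
async function notify(message: string): Promise<void> {
  console.log(`[ALERT] ${message}`);
}

async function checkSite(site: Site): Promise<void> {
  const started = Date.now();
  try {
    const res = await fetch(site.url, { redirect: "follow", signal: AbortSignal.timeout(10_000) });
    const latency = Date.now() - started;
    if (res.status !== 200) {
      await notify(`${site.name} returned ${res.status}`); // down -> alert
    } else if (latency > site.baselineMs * 2) {
      await notify(`${site.name} is slow: ${latency}ms (baseline ${site.baselineMs}ms)`); // spike -> warning
    }
  } catch (err) {
    await notify(`${site.name} is unreachable: ${(err as Error).message}`);
  }
}
```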
An SSL scanner checks certificate expiry dates across the entire fleet. Anything within 30 days gets flagged. I haven't been surprised by an SSL expiry since.
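The scan itself is just a handshake and a date comparison. A minimal sketch using Node's tls module (rejectUnauthorized is off so an already-expired cert still reports a date instead of throwing):

```typescript
import tls from "node:tls";

// Returns days until the certificate presented by `host` expires.
// A sketch only: no retries, no OCSP, no exotic SNI handling.
function daysUntilCertExpiry(host: string): Promise<number> {
  return new Promise((resolve, reject) => {
    const socket = tls.connect({ host, port: 443, servername: host, rejectUnauthorized: false }, () => {
      const cert = socket.getPeerCertificate();
      socket.end();
      const expires = new Date(cert.valid_to).getTime();
      resolve((expires - Date.now()) / (1000 * 60 * 60 * 24));
    });
    socket.on("error", reject);
  });
}

// Usage: flag anything inside the 30-day window.
// const days = await daysUntilCertExpiry("example.com");
// if (days < 30) console.warn(`Cert expires in ${Math.floor(days)} days`);
```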
A DNS verifier catches drift and misconfiguration. When I move a site between providers, it checks that DNS records actually point where they should. It catches the CNAME-still-pointing-to-old-Netlify problem before users do.
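A sketch of the idea with node:dns -- the expected-target map here is hypothetical; mine lives in the database alongside each project:

```typescript
import { resolveCname } from "node:dns/promises";

// Hypothetical config: hostname -> the CNAME target it should point at.
const expected: Record<string, string> = {
  "app.example.com": "cname.vercel-dns.com",
};

async function verifyDns(): Promise<void> {
  for (const [hostname, target] of Object.entries(expected)) {
    try {
      const records = await resolveCname(hostname);
      if (!records.includes(target)) {
        console.warn(`${hostname} points to ${records.join(", ")} -- expected ${target}`);
      }
    } catch (err) {
      // No CNAME at all (apex A record, stale delegation, ...) also counts as drift worth a look.
      console.warn(`${hostname}: CNAME lookup failed (${(err as Error).message})`);
    }
  }
}
```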
Lighthouse audits run on a schedule and flag performance regressions. If a deploy tanks the performance score by more than 10 points, I know about it before it affects users.
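The regression check on top of those audits is nothing more than a threshold comparison. A sketch, with the score getters standing in for reads against the stored Lighthouse results:

```typescript
// Hypothetical helpers over the stored audit results.
declare function getLatestScore(site: string): Promise<number>;   // most recent performance score (0-100)
declare function getBaselineScore(site: string): Promise<number>; // rolling average of prior runs

async function checkPerformanceRegression(site: string): Promise<void> {
  const current = await getLatestScore(site);
  const baseline = await getBaselineScore(site);
  if (baseline - current > 10) {
    console.warn(`${site} performance dropped ${baseline - current} points (${baseline} -> ${current})`);
  }
}
```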
A dependency scanner checks for known CVEs across all repos. Instead of discovering a vulnerability when GitHub sends me the 47th email I've been ignoring, I get a single prioritized list.
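Per repo, the raw data comes from npm audit. A sketch of that pass (the metadata.vulnerabilities shape matches npm 7+ as far as I know -- treat the exact field names as an assumption):

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const exec = promisify(execFile);

// Runs `npm audit --json` in a repo checkout and returns severity counts.
async function auditRepo(repoPath: string): Promise<Record<string, number>> {
  // npm exits non-zero when it finds vulnerabilities, so keep stdout from the rejection too.
  const result = (await exec("npm", ["audit", "--json"], { cwd: repoPath }).catch((e) => e)) as { stdout: string };
  const report = JSON.parse(result.stdout);
  return report.metadata?.vulnerabilities ?? {}; // e.g. { critical: 1, high: 3, moderate: 7, low: 0 }
}
```

The fleet-level list is just these counts joined across repos and sorted by severity.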
The key insight was the severity system. Not everything deserves my attention.
- P0 (critical): Auto-fix and alert me. Site down? Restart it. SSL about to expire? Renew it.
- P1 (important): Create a PR and notify me. I review and merge.
- P2 (routine): Create a PR, auto-merge if tests pass. I find out in the weekly digest.
- P3 (informational): Batch into a monthly report. I read it with coffee.
This alone cut my firefighting time by 80%. Most issues are P2 or P3. The system handles them. I only get pulled in for things that actually need a human decision.
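The routing logic is deliberately dumb. A sketch, with the four action helpers as hypothetical stand-ins for the alerting, PR creation, and digest code:

```typescript
type Severity = "P0" | "P1" | "P2" | "P3";

interface Finding {
  severity: Severity;
  project: string;
  summary: string;
}

declare function alertAndAutoFix(f: Finding): Promise<void>;    // page me, attempt the known remediation
declare function openPrForReview(f: Finding): Promise<void>;    // PR + notification, human merges
declare function openPrAutoMerge(f: Finding): Promise<void>;    // PR, auto-merge if CI is green
declare function addToMonthlyReport(f: Finding): Promise<void>; // batched, read with coffee

async function route(finding: Finding): Promise<void> {
  switch (finding.severity) {
    case "P0": return alertAndAutoFix(finding);
    case "P1": return openPrForReview(finding);
    case "P2": return openPrAutoMerge(finding);
    case "P3": return addToMonthlyReport(finding);
  }
}
```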
The spending tracker
Here's where I spend $452/month. Real numbers, because I think most indie hackers have no idea what their actual burn rate is.
AI Assistants: $281/month. Claude Max at $235 is the biggest single line item. I use it for everything -- coding, writing, architecture decisions, debugging. ChatGPT Plus at $20 for occasional second opinions and image generation. Gemini Advanced at $26 for cheap high-volume classification tasks.
Hosting: $63/month. Vercel Pro at $20 (16 cron jobs, 300-second function timeout, the features that matter). Supabase Pro at $34 across the projects that need it. Netlify at $9 for a couple of legacy static sites. Fly.io at $0.03, basically free for a tiny always-on service.
Dev Tools: $45/month. n8n Cloud at $31 for webhook orchestration (thin trigger layer, never owns state). ngrok at $10 for local development tunnels. GitHub at $4.
Domains: $46/month amortized. 27 domains across .com, .dev, .app, and .ai TLDs. The .ai domains are expensive.
Personal tools: $11/month. 1Password, Viber, Duolingo. Not really infrastructure but they show up in the audit.
How I actually track this: once a month, I use a Claude prompt in the browser to scrape all my billing pages and output a structured table. I paste the result into an import form. The whole process takes about 10 minutes. It's not automated because billing pages change their HTML constantly and a scraper would break every month. The semi-manual approach is more reliable.
AI memory layer
This is the piece I'm most excited about, because it solves the problem I was complaining about every day.
The problem: every AI coding session starts from zero context. If you use Cursor or Claude Code, you know the pain. The agent has no idea about your architecture, your conventions, your deployment setup, or the bug you fixed yesterday. CLAUDE.md helps for one project, but I have 50.
My approach: capture knowledge from everywhere with zero friction. Telegram voice notes while walking. Text messages from my phone. GitHub webhook events. Email forwards. CLI pastes. A web inbox. The capture surface doesn't matter -- what matters is that the barrier is under 3 seconds. If it takes longer than that, I won't do it, and the knowledge stays in my head where it's useless to my future self and invisible to AI agents.
Everything captured goes through a processing pipeline. First, deterministic secret scanning -- 22 regex patterns check for API keys, tokens, passwords, and credentials BEFORE any LLM sees the content. I caught myself almost sending an AWS key to Gemini in the first week. This step is non-negotiable.
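A few of the patterns, to make it concrete -- a sketch, not the full list of 22. False positives are fine here; a blocked capture I have to re-check is far cheaper than a leaked credential:

```typescript
const SECRET_PATTERNS: Array<{ name: string; pattern: RegExp }> = [
  { name: "AWS access key", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { name: "GitHub token", pattern: /\bgh[pousr]_[A-Za-z0-9]{36,}\b/ },
  { name: "Private key block", pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
  { name: "Postgres connection string", pattern: /postgres(?:ql)?:\/\/\S+:\S+@\S+/ },
  { name: "JWT", pattern: /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\b/ },
];

// Returns the names of any matching patterns; the caller blocks the LLM call if this is non-empty.
function scanForSecrets(text: string): string[] {
  return SECRET_PATTERNS.filter(({ pattern }) => pattern.test(text)).map(({ name }) => name);
}
```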
Then Gemini Flash classifies the input: what scope is it (shared operations, project-specific, personal)? What kind (decision, fact, procedure, preference, issue, idea)? How confident is the classification? Should it be promoted to permanent knowledge, held for review, or given a short TTL and left to expire?
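The classifier's output is a small structured record. A sketch of the shape, with callGeminiFlash as a hypothetical wrapper that enforces JSON output:

```typescript
type Scope = "shared-ops" | "project" | "personal";
type Kind = "decision" | "fact" | "procedure" | "preference" | "issue" | "idea";
type Disposition = "promote" | "hold-for-review" | "short-ttl";

interface Classification {
  scope: Scope;
  project?: string;    // set when scope === "project"
  kind: Kind;
  confidence: number;  // 0..1, reported by the model
  disposition: Disposition;
}

// Hypothetical JSON-mode wrapper around the Gemini Flash call.
declare function callGeminiFlash(prompt: string): Promise<string>;

async function classify(capture: string): Promise<Classification> {
  const prompt = `Classify this captured note as JSON with fields scope, project, kind, confidence, disposition:\n\n${capture}`;
  return JSON.parse(await callGeminiFlash(prompt)) as Classification;
}
```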
Deduplication catches the inevitable repetition. Hash matching for exact duplicates, fuzzy matching for near-duplicates, semantic matching via embeddings for conceptual duplicates. "Deploy to Cloudflare Workers" and "push to CF Workers" are the same knowledge and shouldn't exist twice.
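Conceptually it's three escalating checks. A sketch, with the embedding lookup and cosine similarity as stand-ins for the pgvector query the real pipeline runs, and a crude token-overlap pass standing in for proper fuzzy matching:

```typescript
import { createHash } from "node:crypto";

declare function embed(text: string): Promise<number[]>;                 // hypothetical embedding call
declare function cosineSimilarity(a: number[], b: number[]): number;     // hypothetical helper

function exactHash(text: string): string {
  return createHash("sha256").update(text.trim().toLowerCase()).digest("hex");
}

async function isDuplicate(
  candidate: string,
  existing: { text: string; embedding: number[] }[]
): Promise<boolean> {
  // 1. Exact duplicate: normalized hash match.
  const hash = exactHash(candidate);
  if (existing.some((e) => exactHash(e.text) === hash)) return true;

  // 2. Near-duplicate: crude token overlap in place of a real string-distance pass.
  const tokens = new Set(candidate.toLowerCase().split(/\s+/));
  const fuzzyHit = existing.some((e) => {
    const theirs = e.text.toLowerCase().split(/\s+/);
    const overlap = theirs.filter((t) => tokens.has(t)).length;
    return overlap / Math.max(tokens.size, theirs.length) > 0.8;
  });
  if (fuzzyHit) return true;

  // 3. Conceptual duplicate: embedding similarity above a threshold.
  const vector = await embed(candidate);
  return existing.some((e) => cosineSimilarity(vector, e.embedding) > 0.92);
}
```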
The output is structured Markdown files that get synced into each project repo as .terso/ directories. When I open Claude Code on a project, the agent can read these files naturally. No APIs, no MCP servers, no authentication dance. Just Markdown files next to the code, with YAML frontmatter noting when they were generated and when they expire.
The newest piece: auto-generated CLAUDE.md. Instead of manually maintaining the instructions file for each project, it's assembled from the knowledge base. Architecture decisions, code standards, recent debug sessions, deployment configuration -- all pulled from the structured knowledge and organized into the format Claude Code expects. When I update a convention in one place, it propagates to every project that convention applies to.
There's also a session observer -- a Claude Code hook that captures what happened in each coding session and posts it back to the system. Fixed a tricky bug? The debug insight gets captured automatically. Made an architecture decision? It's recorded with context. Zero friction, zero extra effort.
The architecture
The whole thing runs as a Next.js app on Vercel Pro. Supabase Postgres with pgvector handles storage and semantic search. The processing pipeline is triggered by a cron job every 2 minutes -- no persistent workers, no message queues, no infrastructure to babysit.
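The trigger is a plain route handler scheduled from vercel.json. A sketch, assuming the App Router; drainPendingCaptures is a hypothetical stand-in for the pipeline entry point:

```typescript
// app/api/process/route.ts
// Scheduled in vercel.json: { "crons": [{ "path": "/api/process", "schedule": "*/2 * * * *" }] }
import { NextResponse } from "next/server";

declare function drainPendingCaptures(): Promise<number>; // hypothetical: pull queued captures and run the pipeline

export async function GET(request: Request) {
  // Vercel sends Authorization: Bearer <CRON_SECRET> when a cron secret is configured.
  if (request.headers.get("authorization") !== `Bearer ${process.env.CRON_SECRET}`) {
    return NextResponse.json({ error: "unauthorized" }, { status: 401 });
  }
  const processed = await drainPendingCaptures();
  return NextResponse.json({ processed });
}
```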
LLM usage is stratified by cost. Gemini Flash handles 90% of the volume: classification, entity extraction, segmentation. It's cheap and fast. Claude Sonnet handles the 10% that needs real reasoning: consolidating contradictory knowledge, generating canonical documents, complex summarization. OpenAI provides embeddings via text-embedding-3-small. The total LLM budget is capped at $15/day and $200/month, with automatic task deferral if the budget runs hot.
The search layer is hybrid: 60% semantic similarity via pgvector, 40% keyword matching via Postgres full-text search. This handles the failure mode where embeddings miss exact technical terms. Searching for "wrangler.jsonc compatibility_flags" needs keyword matching, not vibes.
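A sketch of the blend, with semanticSearch and keywordSearch as hypothetical wrappers around the pgvector query and a ts_rank query, both returning normalized 0-1 scores:

```typescript
interface Hit { id: string; score: number }

declare function semanticSearch(query: string): Promise<Hit[]>; // pgvector similarity, normalized 0..1
declare function keywordSearch(query: string): Promise<Hit[]>;  // Postgres full-text rank, normalized 0..1

async function hybridSearch(query: string): Promise<Hit[]> {
  const [semantic, keyword] = await Promise.all([semanticSearch(query), keywordSearch(query)]);
  const combined = new Map<string, number>();
  for (const h of semantic) combined.set(h.id, 0.6 * h.score);
  for (const h of keyword) combined.set(h.id, (combined.get(h.id) ?? 0) + 0.4 * h.score);
  return [...combined.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```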
Lessons from building this
Automation without readiness contracts is just faster chaos. Early on, I pointed the dependency scanner at all 50 repos. Half of them had no tests, no healthcheck endpoint, and no documented rollback procedure. Auto-creating PRs for projects that can't verify the fix is worse than doing nothing. Now projects need to score 11 out of 16 on a readiness checklist before they enter the monitoring system. Tests exist. Healthcheck exists. Rollback is documented. CI passes. If a project isn't ready, it stays in manual mode.
The most important feature is expiry. Every piece of captured knowledge has a time-to-live. Promoted knowledge is permanent. Everything else expires: 90 days for items held for review, 30 days for short-lived notes, 7 days for transient captures. If something isn't promoted within its TTL, it ages out automatically. This prevents the failure mode of every knowledge management system I've tried before: the junk drawer that grows until it's unsearchable. Captured data must prove its worth or disappear.
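The mechanics are trivial, which is part of why they actually run. A sketch of the TTL assignment (tier names are illustrative; the real ones come from the classifier):

```typescript
// null means permanent; everything else ages out unless it gets promoted first.
const TTL_DAYS: Record<string, number | null> = {
  promoted: null,
  heldForReview: 90,
  shortLived: 30,
  transient: 7,
};

function expiresAt(tier: string, capturedAt: Date): Date | null {
  const days = TTL_DAYS[tier];
  if (days == null) return null;
  return new Date(capturedAt.getTime() + days * 24 * 60 * 60 * 1000);
}

// A daily cron deletes anything past its expiry that was never promoted.
function isExpired(expiry: Date | null, now = new Date()): boolean {
  return expiry !== null && expiry.getTime() < now.getTime();
}
```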
Secret scanning before LLM calls is non-negotiable. Regex patterns run before any content reaches any model. API keys, database connection strings, JWT tokens, private keys -- 22 patterns covering every credential format I've encountered. The first week I built the capture pipeline, I almost sent an AWS secret key to Gemini Flash as part of a debug log. The scanner caught it. This step costs nothing in latency and buys a great deal of peace of mind.
The real ROI is cognitive load, not time. I can't honestly say this system saves me hours per day. Maybe 30 minutes on a good week. What it actually does is eliminate the background anxiety of "is something broken that I don't know about?" I stopped checking my sites manually. I stopped waking up wondering if a certificate expired. I stopped dreading the context-switching cost of jumping between projects. The monitoring handles the vigilance. The knowledge layer handles the memory. I just build.
What's next
I'm open-sourcing the CLI tool (terso) and the .terso/ directory format spec. The idea is simple: any developer should be able to run terso sync and give their AI coding agent structured context about their project, their conventions, and their recent decisions. The capture and classification pipeline is more opinionated, but the output format is something the whole ecosystem can use. If your AI agent can read Markdown files, it can read .terso/.
The system isn't finished. The venture studio layer -- automated opportunity scanning, MVP generation, template-bound overnight builds -- is implemented but untested. The Google Drive sync for personal knowledge is wired up but waiting on credentials. There are rough edges everywhere.
But for the first time in years, I know the state of my entire portfolio. Nothing is silently broken. Nothing is silently charging me. And when I open a coding session, the agent already knows what I know.
That's worth $452/month.