Scale Yourself Report / Owner application / 2026
This is how I've been working for the past year. The clearest example is one side project: I shipped LoveCompass, a wellness app for couples that is live on web, iOS and Android, as 1,001 merged PRs in three months, with a human gate on every feature development cycle and decision.
Primary focus: LoveCompass · Mar 09 – Jun 06, 2026
The volume came from building the right scaffolding and then getting out of its way. AI agents handle code review across four dimensions, a deterministic CI pipeline gates every merge, and a dashboard keeps everything visible. I step in for feature decisions and direction; the rest runs. Along the way I picked up patterns I find genuinely interesting: Deterministic Simulation Testing, saga orchestration, intuition engineering, RAG-based self-improvement.
The required chart, and the shape of the work: a heavy build-out in March, refinement in April, then sustained shipping through May.
A full-stack wellness app for couples, shipped across three platforms using AI-assisted development pipelines — automated multi-dimensional review backed by deterministic testing.
| Property | Detail |
|---|---|
| Platforms | Web · iOS · Android |
| Live | couplesapp.nextasy.co |
| Window | Mar 09 – Jun 06, 2026 |
| Throughput | 1,001 PRs · 11/day avg · 67 peak |








Here's what's actually running under the hood. These aren't magic — each one is a thing I had to build, debug, and tune before it was useful.
At this PR volume, manual review on every dimension isn't realistic. Four agents run in parallel, each focused on one area, and iterate with the dev up to three times per PR until their gate passes:
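The review loop described above can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: the reviewer functions, the `revise` callback and the two toy dimensions are hypothetical stand-ins, and only the "parallel reviewers, at most three rounds" shape comes from the text.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

MAX_ROUNDS = 3  # from the text: up to three iterations per PR


@dataclass
class Verdict:
    dimension: str
    passed: bool
    notes: str = ""


# A reviewer inspects the current diff and returns a verdict for its dimension.
Reviewer = Callable[[str], Verdict]


def review_pr(
    diff: str,
    reviewers: list[Reviewer],
    revise: Callable[[str, list[Verdict]], str],
) -> tuple[bool, int]:
    """Run all reviewers in parallel; on failure let the dev revise, up to MAX_ROUNDS."""
    for round_no in range(1, MAX_ROUNDS + 1):
        with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
            verdicts = list(pool.map(lambda r: r(diff), reviewers))
        failures = [v for v in verdicts if not v.passed]
        if not failures:
            return True, round_no
        if round_no < MAX_ROUNDS:
            diff = revise(diff, failures)
    return False, MAX_ROUNDS


# Hypothetical toy reviewers, for illustration only.
def no_todo(diff: str) -> Verdict:
    return Verdict("hygiene", "TODO" not in diff, "remove TODO markers")


def never_passes(diff: str) -> Verdict:
    return Verdict("strict", False, "always fails")


def strip_todo(diff: str, failures: list[Verdict]) -> str:
    return diff.replace("TODO", "")
```

A PR that still fails after the third round returns `(False, 3)`, which is the natural point to hand it to a human.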
Model behaviour is non-deterministic — you can't rely on it for correctness gates. So I kept those separate: a three-phase pipeline that runs the same way every time, regardless of what the agents do:
Result: 100% deterministic validation on all 1,001 merges. Zero AI decision-making on merge eligibility — only human-defined gates. Snapshot (Jun 06): K8s cluster running 20 pods, ~1.1 merges/hour at peak.
Live ARC cluster, Jun 06, 2026
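The separation between agent feedback and merge eligibility can be sketched as a pure function over CI phase results. The phase names below (`build`, `test`, `verify`) are assumptions made for illustration; the text says only that there are three phases and that no AI output feeds the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseResult:
    name: str
    exit_code: int  # 0 means the phase passed


# Hypothetical phase names; the real pipeline's three phases are not named in the text.
REQUIRED = ("build", "test", "verify")


def merge_eligible(phases: list[PhaseResult], required: tuple[str, ...] = REQUIRED) -> bool:
    """Eligible only if every required phase ran and exited 0.

    Agent verdicts are deliberately not a parameter: the same phase results
    always produce the same answer.
    """
    by_name = {p.name: p for p in phases}
    return all(name in by_name and by_name[name].exit_code == 0 for name in required)
```

Keeping agent opinions out of the function signature is the design choice: they cannot influence the result even by accident.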
The thing I didn't anticipate: agents running 24/7 are hard to watch. Non-deterministic behaviour, surprise token costs, agents deadlocking — you only find out from the bill or a stuck queue. I built IntuitionOps to make the pipeline readable at a glance, borrowing a few old distributed-systems ideas:
Result: token cost is a first-class signal, stuck agents light up instantly, and infinite agent loops show as cost climbing while merges stay flat. Read the full write-up →
IntuitionOps dashboard
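The "cost climbing while merges stay flat" signal can be expressed as a small check over cumulative samples. This is a sketch under assumptions: the `Sample` shape and the `min_token_growth` threshold are hypothetical, not IntuitionOps internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    tokens_spent: int  # cumulative tokens at sample time
    merges: int        # cumulative merged PRs at sample time


def looks_like_loop(window: list[Sample], min_token_growth: int) -> bool:
    """True if tokens grew by at least min_token_growth across the window
    while the merge count did not move."""
    if len(window) < 2:
        return False
    first, last = window[0], window[-1]
    token_growth = last.tokens_spent - first.tokens_spent
    return token_growth >= min_token_growth and last.merges == first.merges
```

The threshold and window length would need tuning against real traffic. A long-running but legitimate task trips the same check, so it is better treated as a prompt to look than as a kill switch.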
The pipeline runs on Claude subscriptions, not pay-as-you-go API tokens. That means the standard Langfuse SDK instrumentation doesn't apply — there's no API call to intercept and no token counter to hook into. To get per-issue visibility anyway, I built a custom tracing layer that publishes traces to Langfuse independently of the model calls:
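One way to picture a tracing layer that is decoupled from the model calls: time each pipeline step yourself, group the spans by issue, and hand the finished record to a publisher. Everything below is a hypothetical sketch. In particular, `publish` is a plain callable standing in for whatever actually sends data to Langfuse, and no Langfuse API is called or implied.

```python
import time
import uuid
from contextlib import contextmanager
from typing import Callable, Iterator


class IssueTracer:
    """Collects timed spans for one issue and publishes them as a single record."""

    def __init__(self, issue_id: str, publish: Callable[[dict], None]) -> None:
        self.issue_id = issue_id
        self.trace_id = uuid.uuid4().hex
        self.spans: list[dict] = []
        self._publish = publish  # hypothetical sink, e.g. a queue or HTTP client

    @contextmanager
    def span(self, name: str) -> Iterator[None]:
        started_at = time.time()      # wall-clock timestamp for display
        t0 = time.perf_counter()      # monotonic clock for the duration
        try:
            yield
        finally:
            self.spans.append({
                "name": name,
                "started_at": started_at,
                "duration_s": time.perf_counter() - t0,
            })

    def flush(self) -> None:
        self._publish({
            "trace_id": self.trace_id,
            "issue_id": self.issue_id,
            "spans": list(self.spans),
        })
```

Usage is a `with tracer.span("implement"):` block around each step, followed by one `flush()` when the issue closes. Because nothing here hooks the model client, it works the same whether or not token counts are available.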
Every merged PR is an opportunity to learn. After merge, an automated agent extracts learnings from the issue — gotchas, patterns, decisions — and proposes them as structured entries. A human reviews and approves what's genuinely worth keeping. Approved entries are embedded and stored in LanceDB, and every future agent run queries that store before acting:
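The approve-then-retrieve flow can be sketched with an in-memory store. This is a stand-in, not the real setup: the bag-of-words `embed` function replaces a real embedding model, the list replaces LanceDB, and all class and function names are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class Learning:
    text: str
    approved: bool = False  # set by the human reviewer


class LearningStore:
    """In-memory stand-in for the vector store."""

    def __init__(self) -> None:
        self._rows: list[tuple[Counter, str]] = []

    def add(self, learning: Learning) -> bool:
        # Only human-approved entries are ever embedded and stored.
        if not learning.approved:
            return False
        self._rows.append((embed(learning.text), learning.text))
        return True

    def query(self, task: str, k: int = 3) -> list[str]:
        # Called before an agent run: return the k most similar learnings.
        q = embed(task)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]
```

The approval check lives inside `add`, so an unreviewed entry cannot reach future agent runs regardless of who calls the store.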
Standing on proven shoulders instead of reinventing:
| Metric | Value |
|---|---|
| Total merged PRs | 1,001 |
| Average PRs / day | 11 |
| Peak day | 67 — Mar 17, 2026 |
| Top 3 days | 67 (Mar 17) · 54 (May 18) · 47 (Mar 26) |
| Full PR log | all 1,001 PRs — searchable → |
This is just how I work now. Build the scaffolding, set the gates, stay in the loop for the decisions that matter. The volume is a side effect of having good tooling — not the goal itself.
A separate side project, included as a second data point. It isn't part of the LoveCompass story above. It's another example of the same AI-leverage instinct applied to a different problem: answering recurring questions about me, safely.
Answer recurring questions about me automatically while resisting jailbreaks and off-topic abuse. Each request flows through six guard rails:
Live at marcuss.pro · handles adversarial input · observable via Langfuse · single-command recovery.
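A chain of guard rails like the one described above can be sketched as an ordered list of checks where the first refusal wins. The two guards shown are hypothetical examples chosen for illustration; they are not the six guard rails the site actually uses.

```python
from typing import Callable, Optional

# A guard returns a refusal reason, or None to let the request continue.
Guard = Callable[[str], Optional[str]]


def max_length(limit: int) -> Guard:
    def check(question: str) -> Optional[str]:
        return "too long" if len(question) > limit else None
    return check


def blocks_phrases(phrases: list[str]) -> Guard:
    def check(question: str) -> Optional[str]:
        lowered = question.lower()
        return "blocked phrase" if any(p in lowered for p in phrases) else None
    return check


def run_guards(question: str, guards: list[Guard]) -> Optional[str]:
    """Run guards in order; return the first refusal reason, or None if all pass."""
    for guard in guards:
        reason = guard(question)
        if reason is not None:
            return reason
    return None
```

Ordering cheap checks such as length limits ahead of expensive ones keeps abusive traffic from reaching the model at all.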