OUTPUT · 02
90 DAYS · FROM $18K · FIXED

Run experiments that actually decide.

A 90-day sprint that ships 4–8 tests per month with proper power, holdouts, and writeups your team can defend. No vanity wins. No two-week tests that prove nothing.
THE PROBLEM

Your team ships tests but the readouts are theatre: too small to call, peeked early, or compared against the wrong control. You're not learning. You're guessing in a more expensive way. The growth team wants velocity; the analytics team wants rigour; the founder wants both. Nobody's wrong, and nothing gets decided.

WHAT YOU GET
  1. Test backlog, prioritised

    Every idea scored on expected lift, confidence, and effort. The next 12 tests, in order. You always know what's running and what's next.

  2. Power-aware planning

    Every test is sized before it ships. We don't run tests we can't call. Underpowered ideas get re-scoped or killed.

  3. Pre-registered hypotheses

    Hypothesis, primary metric, decision rule — written down before traffic flows. No outcome-shopping after the fact.

  4. Clean readouts

    Writeups your CFO can defend: confidence intervals, holdout validation, segment cuts, and a one-line recommendation.

  5. Weekly cadence baked in

    A standing test-review ritual the team can run after I leave. The cadence outlives the engagement.

  6. Infra + tooling

    Whatever you're running (VWO, Convert, Statsig, GrowthBook), wired correctly: sample ratio mismatch (SRM) checks, holdout management, segment fidelity.
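
To give a sense of what "sized before it ships" can look like, here is a minimal sketch of a sample-size check. It is an illustration, not the exact method used on every engagement: it assumes a two-arm test on a conversion rate, a two-sided two-proportion z-test, 5% significance, and 80% power. The function name `sample_size_per_arm` and the example numbers are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH arm of a two-arm test.

    Assumes a two-sided two-proportion z-test (normal approximation).
    baseline_rate: control conversion rate, e.g. 0.03 for 3%.
    relative_lift: smallest relative lift worth detecting, e.g. 0.20 for +20%.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2                          # pooled rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical example: 3% baseline conversion, smallest lift of interest +20%.
if __name__ == "__main__":
    print(sample_size_per_arm(0.03, 0.20))
```

Dividing the result by the weekly traffic each arm would actually receive gives a rough run time in weeks. If that number is unworkable, the idea is a candidate for re-scoping (a bolder change, a higher-traffic page) or for killing.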

FIT
For
  • Brands with ≥10K sessions/month per testable page
  • A growth or analytics lead who'll own the cadence after I leave
  • Existing test infra (or willingness to set it up in week 1)
  • Teams ready to act on losing tests, not just winning ones
Not for
  • Low-traffic landing pages that can't reach adequate statistical power
  • Teams looking for a tool installation
  • Brands that only want to ship 'winners' (the point is the decision, not the trophy)
  • Hourly engagements
FAQ
How many tests can we realistically ship?

4–8/month is the band. Below 4 means we're not learning fast enough; above 8 means each test gets sloppy. We optimise for decision throughput, not test count.

Do you run the tests or does my team?

Your team runs them. I design, prioritise, and read out. You keep the muscle when I leave.

What if the first batch of tests all lose?

That's a useful finding. Losing tests inform the backlog as much as winning ones: they kill bad ideas before they reach the product roadmap.

What tools do you work in?

VWO, Convert, Statsig, GrowthBook, Optimizely, or a homegrown framework. The tool is irrelevant; the discipline is what matters.
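
As one example of that discipline, a sample ratio mismatch (SRM) check can be run on raw visitor counts exported from any of these tools. The sketch below is illustrative only: it assumes a two-arm test, uses a chi-square goodness-of-fit test against the intended split, and flags a mismatch at p < 0.001, which is a commonly used but ultimately arbitrary threshold. The function names `srm_p_value` and `has_srm` and the example counts are hypothetical.

```python
from math import erfc, sqrt


def srm_p_value(control_count, variant_count, expected_control_share=0.5):
    """P-value of a chi-square goodness-of-fit test on the traffic split.

    A very small p-value suggests the observed split differs from the
    intended one, which usually points to an assignment or tracking bug.
    """
    total = control_count + variant_count
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = (
        (control_count - expected_control) ** 2 / expected_control
        + (variant_count - expected_variant) ** 2 / expected_variant
    )
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)), so no external library is needed.
    return erfc(sqrt(chi2 / 2))


def has_srm(control_count, variant_count, expected_control_share=0.5, threshold=0.001):
    """True when the split is suspicious enough to pause the readout."""
    return srm_p_value(control_count, variant_count, expected_control_share) < threshold


# Hypothetical counts for a test intended to split 50/50.
if __name__ == "__main__":
    print(has_srm(5000, 5100))  # small imbalance
    print(has_srm(5000, 5400))  # large imbalance
```

A flagged test isn't read out until the cause of the imbalance is found, whatever the headline metric says.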

Can we extend past 90 days?

About half of clients do. After day 90 the cadence is your team's. I can stay on in an advisory role if you want a second pair of eyes on the writeups.

Ship tests that decide.

If your test cadence has stalled or your readouts feel hand-wavy, let's talk. 90 days. 4–8 tests/month. Real decisions.

BOOK A 20-MIN CALL