Run experiments that actually decide.
Your team ships tests but the readouts are theatre — too small to call, peeked early, or compared against the wrong control. You're not learning. You're guessing in a more expensive way. The growth team wants velocity; the analytics team wants rigour; the founder wants both. Nobody's wrong, and nothing ships.
- 01 Test backlog, prioritised
Every idea scored by lift × confidence ÷ effort, so cheap, likely wins rise to the top. The next 12 tests, in order. You always know what's running and what's next.
- 02 Power-aware planning
Every test is sized before it ships. We don't run tests we can't call. Underpowered ideas get re-scoped or killed.
- 03 Pre-registered hypotheses
Hypothesis, primary metric, decision rule — written down before traffic flows. No outcome-shopping after the fact.
- 04 Clean readouts
Writeups your CFO can defend: confidence intervals, holdout validation, segment cuts, and a one-line recommendation.
- 05 Weekly cadence baked in
A standing test-review ritual the team can run after I leave. The cadence outlives the engagement.
- 06 Infra + tooling
Whatever you're running (VWO, Convert, Statsig, GrowthBook), wired correctly: sample ratio mismatch (SRM) checks, holdout management, segment fidelity.
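For readers who want to see what an SRM check involves, here is a minimal sketch. It assumes a two-arm test with a planned 50/50 split and uses a chi-square goodness-of-fit test with one degree of freedom. The function names and the 0.001 alert threshold are illustrative assumptions, not any vendor's API; most testing tools ship their own version of this check.

```python
import math


def srm_p_value(control_n: int, variant_n: int,
                expected_control_share: float = 0.5) -> float:
    """P-value of a chi-square goodness-of-fit test on arm sizes (2 arms, 1 dof)."""
    total = control_n + variant_n
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = ((control_n - expected_control) ** 2 / expected_control
            + (variant_n - expected_variant) ** 2 / expected_variant)
    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def has_srm(control_n: int, variant_n: int,
            expected_control_share: float = 0.5,
            alpha: float = 0.001) -> bool:
    """Flag a sample ratio mismatch when the split is implausible under the plan.

    alpha=0.001 is an assumed, deliberately strict threshold so that
    ordinary traffic noise does not raise alarms.
    """
    return srm_p_value(control_n, variant_n, expected_control_share) < alpha


if __name__ == "__main__":
    print(has_srm(5000, 5100))  # small imbalance
    print(has_srm(5000, 5400))  # large imbalance
```

A flagged test is usually a sign that assignment or tracking is broken, so the readout should wait until the cause is found.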
- Brands with ≥10K sessions/month per testable page
- A growth or analytics lead who'll own the cadence after I leave
- Existing test infra (or willingness to set it up in week 1)
- Teams ready to act on losing tests, not just winning ones
- Low-traffic landing pages where adequate statistical power is out of reach
- Teams looking for a tool installation
- Brands that only want to ship 'winners' (the point is the decision, not the trophy)
- Hourly engagements
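The traffic criteria above come down to arithmetic you can run yourself. The sketch below sizes a two-arm conversion test using the standard normal approximation for comparing two proportions. The function names, the 5% significance level, and the 80% power default are assumptions for illustration, and the baseline and lift in the usage line are hypothetical inputs, not client data.

```python
import math
from statistics import NormalDist


def sessions_per_arm(baseline_rate: float, relative_lift: float,
                     alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate sessions needed per arm for a two-sided two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


def weeks_to_call(n_per_arm: int, monthly_sessions: int, arms: int = 2) -> int:
    """Weeks of traffic needed to fill every arm, assuming steady traffic."""
    weekly_sessions = monthly_sessions * 12 / 52
    return math.ceil(n_per_arm * arms / weekly_sessions)


if __name__ == "__main__":
    # Hypothetical inputs: 3% baseline conversion, 20% relative lift.
    n = sessions_per_arm(baseline_rate=0.03, relative_lift=0.20)
    print(n, weeks_to_call(n, monthly_sessions=10_000))
```

Plug in your own baseline and the smallest lift worth acting on. Ideas whose required run time exceeds the window you can afford are the ones to re-scope or kill.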
How many tests can we realistically ship?
4–8/month is the band. Below 4 means we're not learning fast enough; above 8 means each test gets sloppy. We optimise for decision throughput, not test count.
Do you run the tests or does my team?
Your team runs them. I design, prioritise, and read out. You keep the muscle when I leave.
What if the first batch of tests all lose?
That's a useful finding. Losing tests inform the backlog as much as winning ones: they kill bad ideas before those ideas reach the product roadmap.
What tools do you work in?
VWO, Convert, Statsig, GrowthBook, Optimizely, or a homegrown framework. The tool is irrelevant; the discipline is what matters.
Can we extend past 90 days?
Yes. About half of clients do. After day 90 the cadence is your team's. I stay on in an advisory role if you want a second pair of eyes on the writeups.
If your test cadence has stalled or your readouts feel hand-wavy, let's talk. 90 days. 4–8 tests/month. Real decisions.