Sixteen weeks. Checkout completion plus twenty-four percent.

A $30M apparel brand had been running A/B tests on hope: call-and-response with the agency, no sample sizing, and every test 'won' until the win disappeared at the next pricing reset. We installed an experiment program that ran without me by week ten and shipped four real wins by week sixteen.

CHECKOUT COMPLETION

+24%

Vs 16-wk pre-program baseline

TESTS RUN

4 winners · 6 null · 2 losers

ANNUALIZED LIFT

+$3.8M

From 4 shipped winners

FROM HOPE TO PROGRAM

16 WK

Team runs it without me now

The problem.

The marketing team was running "tests" that weren't experiments. No sample sizing. No pre-registered hypothesis. Whichever variant looked best after a week became the new default. Revenue would slide three weeks later and nobody could explain why. The agency was a partner, not a system. The CFO had stopped believing the lift claims by the time I came in.

The approach.

First two weeks: an honest audit. Of the previous 18 "wins," nine were noise, three were seasonality, two were a pricing change the team hadn't controlled for. Six were real. Then we built a real program: pre-registered hypotheses; sample sizes set for 95% confidence and 80% power; a minimum detectable effect (MDE) each test could actually detect; three concurrent test slots; and a weekly review where every test had to declare its primary metric and stop date before launching. By week ten the marketing manager was chairing the meeting.
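For readers who want to see what "sample-sized at 95% confidence and 80% power" means in practice, here is a minimal sketch using the standard two-proportion approximation. It is an illustration, not the calculator the team used. The function names, the 40% baseline completion rate, the 5% relative MDE, and the weekly traffic figure are all hypothetical placeholders.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-sided test
    comparing two conversion rates (normal approximation)."""
    if not 0 < baseline_rate < 1:
        raise ValueError("baseline_rate must be between 0 and 1")
    if relative_mde <= 0:
        raise ValueError("relative_mde must be positive")

    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)  # rate if the variant hits the MDE
    if p2 >= 1:
        raise ValueError("baseline_rate * (1 + relative_mde) must be below 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 95% confidence, two-sided
    z_beta = NormalDist().inv_cdf(power)           # 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


def weeks_to_run(n_per_arm, weekly_visitors, arms=2):
    """Whole weeks needed to fill every arm, given total weekly traffic."""
    return math.ceil(n_per_arm * arms / weekly_visitors)


# Hypothetical inputs: 40% baseline completion, 5% relative MDE,
# 20,000 checkout visitors per week split across two arms.
n = sample_size_per_arm(baseline_rate=0.40, relative_mde=0.05)
print(f"Visitors per arm: {n}")
print(f"Stop date: {weeks_to_run(n, weekly_visitors=20_000)} week(s) after launch")
```

Halving the MDE roughly quadruples the required sample, which is why each test has to name an effect it can actually detect, and a stop date, before it launches.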

The outcome.

Twelve tests in sixteen weeks. Four winners shipped: a cart-page redesign (+11% completion), a guest-checkout default (+8%), a delivery-promise relocation (+6%), and a payment-icon order test (+3%). Together, they account for the +24% headline. The other eight tests were null or losers, which the team now treats as a result, not a failure. The CFO trusts the lift numbers because they line up with the bank statement. The marketing team runs the program. I haven't touched a test in three months.
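The four per-test lifts and the +24% headline are measured differently: each lift comes from its own test, while the headline is compared against the 16-week pre-program baseline. The sketch below is a quick arithmetic check, assuming the four figures are relative lifts in checkout completion. The write-up does not say how they were combined, so both combination rules here are illustrative.

```python
import math

# Per-test lifts as reported in the outcome section.
lifts = {
    "cart-page redesign": 0.11,
    "guest-checkout default": 0.08,
    "delivery-promise relocation": 0.06,
    "payment-icon order": 0.03,
}
reported_headline = 0.24  # measured vs the 16-week pre-program baseline


def additive(values):
    """Naive stack: simply add the relative lifts."""
    return sum(values)


def compounded(values):
    """Naive stack: multiply (1 + lift) factors, as if fully independent."""
    return math.prod(1 + v for v in values) - 1


naive_sum = additive(lifts.values())
naive_product = compounded(lifts.values())

print(f"Additive stack:    {naive_sum:+.1%}")
print(f"Compounded stack:  {naive_product:+.1%}")
print(f"Reported headline: {reported_headline:+.1%}")
```

The additive stack comes to +28.0% and the compounded stack to about +30.9%, both above the +24% headline. Per-test lifts rarely stack cleanly once changes ship together, so the baseline comparison is the more conservative number to report.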

SERVICE

CRO + Experimentation

Running tests, not getting lifts?

If your team is running A/Bs without sample sizing and your wins don't compound, we should talk. One slot open for Q3 2026.

BOOK A 20-MIN CALL

Related stories.

Eleven weeks. $1.4M of leaked revenue, recovered.

Eleven dashboards in. One decision per week out.