Structured
Regular cadence, partial discipline: the tier where ambitious in-house CRO programmes plateau, with lift in the 8-15% band.
Structured. Tier 3 of the OperatorAI Maturity Model
What this tier means
You have a regular testing cadence: 10-19 tests per quarter, scheduled rather than reactive. There's a hypothesis prioritisation framework, whether PIE, ICE, or an internal rubric. Sample size is calculated before launch (most of the time). The team tracks losing tests in a document somewhere.
This is the tier where most ambitious in-house CRO programmes plateau. The cadence is real. The discipline is partial. The lift is in the 8-15% band, depending on which disciplines have been formalised.
What it looks like in practice
- 10-19 tests per quarter (mostly hero, CTA, copy variations; some pricing-page tests)
- Hypotheses ranked via PIE, ICE, or an internal scoring rubric
- Sample size calculated against 95% confidence with a documented MDE (a sketch of the calculation follows this list)
- 95% threshold gates the winner call (tool default; no override)
- Some peeking happens, but the team feels guilty about it (the A/A simulation after this list shows why)
- Losing tests logged in a Notion / Airtable tracker, not always reviewed for patterns
- Self-serve AI tools used routinely
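For concreteness, here is a minimal sketch of the pre-launch sample-size calculation this tier already runs most of the time: a two-sided two-proportion z-test at 95% confidence and 80% power. The baseline rate and relative MDE in the usage line are hypothetical example values, not figures from any client programme.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline: float, mde_relative: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Visitors needed per variant to detect a relative lift of mde_relative
    over baseline with a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + mde_relative)             # smallest lift worth detecting
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 at 95% confidence
    z_beta = NormalDist().inv_cdf(power)           # 0.84 at 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# Hypothetical example: 3.4% baseline, 20% relative MDE -> ~12,200 per variant
print(sample_size_per_variant(0.034, 0.20))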
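And a toy A/A simulation of why the peeking matters: both arms share the same true conversion rate, so every "significant" result is a false positive. Checking once at the end holds the rate near the nominal 5%; checking at five interim looks and stopping on the first p < 0.05 inflates it well past that. All parameters here are illustrative.

```python
import math
import random
from statistics import NormalDist

def two_prop_p(ca: int, na: int, cb: int, nb: int) -> float:
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (ca + cb) / (na + nb)
    se = math.sqrt(pooled * (1 - pooled) * (1 / na + 1 / nb))
    if se == 0:
        return 1.0
    z = (ca / na - cb / nb) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def aa_false_positive_rate(looks: int, n_per_look: int = 1_000,
                           rate: float = 0.034, trials: int = 500) -> float:
    """Share of A/A tests called 'significant' at any of `looks` peeks."""
    hits = 0
    for _ in range(trials):
        ca = cb = na = nb = 0
        for _ in range(looks):
            na += n_per_look
            nb += n_per_look
            ca += sum(random.random() < rate for _ in range(n_per_look))
            cb += sum(random.random() < rate for _ in range(n_per_look))
            if two_prop_p(ca, na, cb, nb) < 0.05:  # winner called at this peek
                hits += 1
                break
    return hits / trials

print("one look:  ", aa_false_positive_rate(looks=1))   # ~0.05
print("five looks:", aa_false_positive_rate(looks=5))   # noticeably above 0.05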
Why this matters
Structured programmes have built most of the testing infrastructure. What's missing is the protocol that turns infrastructure into compounding wins:
- The 99 Rule. Moving from 95% to 99% significance cuts the false-positive rate five-fold (α = 0.05 to α = 0.01). On a 120-test programme, that 4-point gap is worth ~5 false positives a year; the arithmetic is spelled out after this list.
- Failure-as-information. The tracker exists, but losing tests aren't being mined for failure-mode patterns.
- Operator-set hypothesis quality. PIE/ICE rubrics are better than nothing, but they don't substitute for 13 years of pattern recognition.
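The false-positive arithmetic behind the 99 Rule, under the worst-case assumption that every tested change is a no-op (all null hypotheses true). The 120-test programme is the example figure from the bullet above, not this tier's cadence.

```python
tests_per_year = 120                  # the programme size used in the bullet above
fp_at_95 = 0.05 * tests_per_year      # expected false positives at 95%: 6.0
fp_at_99 = 0.01 * tests_per_year      # expected false positives at 99%: 1.2
print(f"95%: {fp_at_95:.1f}, 99%: {fp_at_99:.1f}, gap: {fp_at_95 - fp_at_99:.1f}")
# gap = 4.8, i.e. the "~5 false positives per year" claimed above; 0.05/0.01 = 5x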
Recommended next move
Pricing Experimentation Audit — £2,500
Five testable pricing hypotheses plus a 12-week implementation roadmap, delivered in 21 days end-to-end. Pricing is the most undertested surface in your funnel, and the one where conversion lift maps most directly to revenue. Built on the same testing discipline that took Enzymedica from 3.4% to 16.9% conversion on Black Friday 2021.