Appearance
Multi-Armed Bandits
Understand how Dalton's dynamic traffic routing works.
What Are Multi-Armed Bandits?
Multi-armed bandits (MAB) are an optimization algorithm that dynamically allocates traffic to better-performing variants while still collecting data from all options.
The name: Imagine a gambler at a casino facing multiple slot machines ("one-armed bandits"). The gambler must decide which machines to play to maximize winnings while learning which machines pay out best. That's the multi-armed bandit problem.
Traditional A/B Testing vs. Multi-Armed Bandits
| Traditional A/B Testing | Multi-Armed Bandits (Dalton) |
|---|---|
| 50/50 fixed split | Dynamic traffic adjustment |
| 3+ months for significance | 2-4 weeks for insights |
| Losing variants get 50% traffic | Winners get more traffic automatically |
| Manual stopping decisions | Algorithm optimizes in real-time |
How It Works
Phase 1: Exploration (Week 1-2)
Traffic is split relatively evenly to gather initial data on all variants.
Example:
- Control: 33%
- Variant A: 33%
- Variant B: 34%
The algorithm is "exploring" to learn which variants perform best.
Phase 2: Exploitation (Week 2+)
As data accumulates, the algorithm shifts traffic toward better performers while still monitoring all variants.
Example:
- Control: 15%
- Variant A: 65% (clear winner)
- Variant B: 20%
The algorithm is "exploiting" knowledge while continuing to "explore" for confirmation.
Phase 3: Convergence (Week 4+)
The winning variant gets the majority of traffic. Other variants receive enough traffic to confirm they're not catching up.
Example:
- Control: 10%
- Variant A: 80% (confirmed winner)
- Variant B: 10%
Why This Matters
Fewer Lost Conversions
Traditional tests show losing variants to 50% of visitors for months. MAB minimizes this.
Example: If Variant A converts at 5% and Control converts at 3%, traditional A/B testing shows the inferior control to 50% of visitors for the entire test. MAB quickly shifts traffic to Variant A, capturing more conversions during the test period.
More Experiments
Faster results mean you can run more experiments per year.
- Traditional: 4 experiments/year (3 months each)
- MAB: 12+ experiments/year (2-4 weeks each)
Traffic Requirements
Lower Traffic Threshold
Traditional A/B testing needs 100K+ sessions/month, making it impractical for most businesses.
Multi-armed bandits lower this to just 5K sessions/month minimum (optimal at 10K+).
With lower traffic:
- Tests take 6-8 weeks instead of 2-4
- Confidence builds more slowly
Understanding the Algorithm
Thompson Sampling
Dalton uses Thompson Sampling, a Bayesian algorithm that:
- Maintains a probability distribution for each variant's true conversion rate
- Samples from these distributions to decide traffic allocation
- Updates distributions as new data arrives
- Balances exploration (learning) with exploitation (optimizing)
You don't need to understand the math—just know it works better than fixed splits.
Common Questions
Is MAB less statistically rigorous than A/B testing? No. MAB reaches valid statistical conclusions—just faster. Traditional significance thresholds (95%) can still be applied, but directional decisions at 85-90% confidence are often sufficient for business purposes.
What if traffic is uneven—does that bias results? No. The algorithm accounts for uneven traffic when calculating significance. Uneven traffic is the point—it's how MAB optimizes.
What if I have 5+ variants? MAB handles multiple variants well. The algorithm will find the best performer(s) and allocate traffic accordingly.
When NOT to Use MAB
Not Right for Everyone
Consider traditional A/B testing when:
- Your goal is pure learning or academic research
- You need very high certainty (95%+) over ROI and number of experiments
- You have extremely low traffic (<5K sessions/month)
For most business use cases focused on ROI and continuous optimization, MAB is superior.
Best Practices
- Let it run: Don't stop tests too early—give it 2-4 weeks minimum
- Trust the algorithm: Don't manually override traffic allocation
- Monitor trends, not daily changes: Traffic shifts are normal
- Combine with good hypotheses: MAB optimizes traffic, but you still need good variant ideas