Your multivariate regression ran clean. The paid SOV coefficient came out close to zero. Dashboard conclusion: "retail media isn't incremental." But that conclusion is only as honest as the variables the model could see. Here are the four blind spots only operational scraping reveals.
The model on its own is defensible — but incomplete
In the previous article we defended the observational attribution model: a panel regression with six variables (paid SOV, relative price, promos, content score, reviews, competitor actions) on organic position. It's defensible, IAB/MRC-compliant, and better than total darkness.
But it has a structural ceiling: endogeneity and unobserved confounders. The model only sees the variables you fed it. If the variable actually moving your rank isn't in the regression, the coefficients of the variables that are in there will lie — because they'll absorb credit (or blame) for whatever wasn't measured.
Operational scraping doesn't fix every source of endogeneity. But it does narrow the unobserved-confounder gap, because it turns things most MMMs leave out into measurable variables.
Confounder #1 — Cannibalization: you paid for clicks you'd have gotten free
What the model says: paid SOV coefficient ≈ 0. Conclusion: not incremental.
What's actually happening: 50% to 70% of spend was allocated to keywords where the brand was already organic #1-3. Those paid clicks sat right next to the same brand's organic listing. The conversion was going to happen anyway. The average coefficient comes out flat because the real incremental effect (on keywords where you weren't organic) gets diluted by the zero effect of cannibalized keywords.
What scraping reveals: the organic × paid matrix per keyword. The pathological zone is top-left — organic #1-3 crossed with high paid SOV. Each cell in that zone is a hypothesis of non-incremental spend.
This isn't theory. Blake, Nosko & Tadelis (2015) at eBay ran a holdout on branded paid search and found that nearly all the brand-keyword spend was non-incremental for an established brand. Multivariate regression doesn't reach that conclusion. The experiment does. And scraping tells you where to look before you pay for the experiment.
Without the matrix, the "is it or isn't it incremental" debate stays at the aggregate level. With the matrix, the debate becomes "these 87 keywords are suspect, these other 213 probably are moving the needle." That second conversation is actionable.
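A minimal sketch of how the organic × paid matrix could be turned into a list of suspect keywords. The `KeywordRow` fields, the `high_sov` cut-off of 30%, and the sample rows are illustrative assumptions, not values from any real account; only the "organic #1-3" boundary comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class KeywordRow:
    keyword: str
    organic_rank: int  # brand's best organic position for the keyword
    paid_sov: float    # brand's share of sponsored slots, 0..1
    spend: float       # spend attributed to the keyword in the period


def cannibalization_share(rows, top_organic=3, high_sov=0.30):
    """Return (share of spend in the suspect zone, suspect keywords).

    The suspect zone is organic rank <= top_organic AND paid SOV >= high_sov.
    high_sov=0.30 is an assumed cut-off; tune it per category.
    """
    total = sum(r.spend for r in rows)
    suspects = [
        r for r in rows
        if r.organic_rank <= top_organic and r.paid_sov >= high_sov
    ]
    zone_spend = sum(r.spend for r in suspects)
    share = zone_spend / total if total else 0.0
    return share, [r.keyword for r in suspects]


# Hypothetical scraped rows for one brand at one retailer.
rows = [
    KeywordRow("brand cola 1.5l", 1, 0.60, 500.0),
    KeywordRow("brand cola zero", 2, 0.45, 300.0),
    KeywordRow("cola", 9, 0.35, 150.0),
    KeywordRow("soft drink family size", 14, 0.10, 50.0),
]

share, suspects = cannibalization_share(rows)
print(f"{share:.0%} of spend sits in the suspect zone: {suspects}")
```

Each keyword returned is a hypothesis to test with a holdout, not a verdict; the point is to narrow where the experiment budget goes.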
Confounder #2 — Defensive spend: you won by losing less
What the model says: you raised paid SOV, your rank didn't move or it dropped. Coefficient null or negative. Conclusion: spend isn't working.
What's actually happening: a competitor attacked you with their own SOV ramp. Without your defensive spend, you'd have dropped 10 positions instead of 2. The spend was completely incremental against the goal of defending share — but the model reads the lack of improvement as failure.
What scraping reveals: the competitor's paid SOV timeline. If a competitor went from 8% to 22% SOV on the top 10 keywords during the same period, you're no longer measuring "did the spend work?" — you're measuring "did our spend net of theirs move rank?". That's a different question with a different answer.
The observational model includes "competitor actions" in theory, but most teams approximate it with a weak proxy (launches, generic promos) because they don't have competitor SOV measured by keyword. Without that real variable, the regression pushes the effect of the competitor attack into the error term. Because your defensive spend rises exactly when the attack happens, your own SOV is correlated with that error, so its coefficient gets biased and says something it shouldn't.
Evidence of successful defense looks like this: the competitor's SOV ramped aggressively, your rank held, and the model "doesn't see" improvement because there was no rise — but there was a defended floor. Only scraping shows that floor.
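One way to make the competitor timeline usable is to flag the largest run-up in their SOV and compare it with your own series. The sketch below assumes weekly SOV values per keyword set; the function name `largest_ramp`, the 10-point `min_jump` threshold, and the sample numbers (which echo the 8% to 22% example) are hypothetical.

```python
def largest_ramp(series, min_jump=0.10):
    """Find the biggest rise from a low point to a later high point.

    series: chronological list of (week_label, competitor_sov in 0..1).
    Returns (start_week, peak_week, jump), or None if the biggest rise
    is smaller than min_jump (an assumed threshold).
    """
    if not series:
        return None
    low_week, low = series[0]
    best = None
    for week, sov in series[1:]:
        if sov < low:
            low_week, low = week, sov
        elif best is None or sov - low > best[2]:
            best = (low_week, week, sov - low)
    return best if best is not None and best[2] >= min_jump else None


# Hypothetical weekly SOV on the top 10 keywords.
competitor = [("W1", 0.08), ("W2", 0.09), ("W3", 0.15), ("W4", 0.22), ("W5", 0.21)]
own = [0.20, 0.22, 0.26, 0.30, 0.30]

ramp = largest_ramp(competitor)
# Net SOV: your share minus theirs, week by week.
net = [round(o - c, 2) for o, (_, c) in zip(own, competitor)]

print("competitor ramp:", ramp)
print("net SOV by week:", net)
```

In this toy series your own SOV rises every week while net SOV falls, which is the pattern a regression without competitor SOV reads as "spend went up, nothing improved."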
Confounder #3 — Stockouts contaminating the period
What the model says: the paid SOV coefficient comes out misleadingly positive and noisy.
What's actually happening: the SKU was out of stock for 11 days at the main retailer during the period. There's a key LATAM detail here that breaks the intuition imported from the Amazon playbook: on VTEX retailers (Éxito, Jumbo, Olímpica) and on Mercado Libre, the platform automatically pulls sponsored slots for out-of-stock products. You can't "keep spending through a stockout" the way you can on Amazon — the campaign simply turns itself off.
The effect on the model is perverse. During the stockout: paid SOV drops to zero, sales velocity collapses, and organic rank falls. When stock returns: paid SOV returns, and rank starts to recover. The regression sees a clean correlation between paid SOV and organic rank — both go up and down together — and attributes the movement to spend. The real driver, however, is availability. SOV and rank are both downstream consequences of the same stockout; neither is causing the other. The coefficient comes out inflated and says something it shouldn't.
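The mechanism can be reproduced with a toy dataset in which spend has zero true effect by construction. Everything here is an assumption made for illustration: the 30-day window, the `visibility` score (higher is better, standing in for organic rank), and the small hand-rolled `ols` helper, which is not a substitute for a proper statistics library.

```python
def ols(columns, y):
    """Least-squares coefficients for y ~ intercept + columns (normal equations)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


days = range(30)
# 11-day stockout (days 10 to 20): the platform pulls the sponsored slots.
avail = [0.0 if 10 <= d <= 20 else 1.0 for d in days]
sov = [a * (0.20 + 0.02 * (d % 5)) for d, a in zip(days, avail)]
# Visibility depends ONLY on availability; paid SOV has no effect here.
visibility = [50.0 + 30.0 * a for a in avail]

naive = ols([sov], visibility)        # availability omitted
full = ols([sov, avail], visibility)  # availability included

print("SOV coefficient without availability:", round(naive[1], 2))
print("SOV coefficient with availability:   ", round(full[1], 6))
```

Without availability in the model, the SOV coefficient comes out large and positive; once availability is included, it collapses to essentially zero, which is the true effect in this constructed example.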
What scraping reveals: the availability timeline by SKU × retailer × day (ideally hourly). It lets you identify stockout periods, exclude them from the analysis, or add availability explicitly to the model so the SOV coefficient is cleaned of the stock-driven movement.
This is the cleanest unobserved-confounder case: the model can't include a variable it doesn't observe. If your internal data doesn't capture stockouts at the right granularity (many brands rely on logistics inventory, which lags weeks behind what shoppers actually see on the PDP), the regression is blind. And scraping is the only source that gives you the stockout exactly as the ranker experienced it — and as your ad engine experienced it.
Operating rule: before believing any coefficient, cross the analysis period against the availability timeline. If core SKUs were out of stock for more than 5% of the period, either exclude those days from the analysis or add availability to the model as an explicit variable.
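The operating rule is simple enough to sketch. The 5% threshold is the one stated above; the daily granularity, the function names, and the sample dates are assumptions for illustration.

```python
from datetime import date, timedelta

STOCKOUT_THRESHOLD = 0.05  # the 5% rule from the text


def stockout_share(availability):
    """availability: dict mapping day -> True if the PDP showed stock."""
    if not availability:
        return 0.0
    out_days = sum(1 for in_stock in availability.values() if not in_stock)
    return out_days / len(availability)


def in_stock_days(availability):
    """Days that are safe to keep in the regression sample."""
    return [day for day, in_stock in sorted(availability.items()) if in_stock]


# Hypothetical 90-day period with an 11-day stockout on one core SKU.
start = date(2024, 3, 1)
availability = {
    start + timedelta(days=i): not (40 <= i <= 50) for i in range(90)
}

share = stockout_share(availability)
needs_treatment = share > STOCKOUT_THRESHOLD
clean_days = in_stock_days(availability)

print(f"stockout share: {share:.1%}, needs treatment: {needs_treatment}")
print(f"days kept for the regression: {len(clean_days)}")
```

Dropping days is the blunt option; adding availability as a regressor keeps the sample intact, at the cost of one more variable to estimate.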
Confounder #4 — Content gaps that cap conversion
What the model says: you raised paid SOV, rank stayed flat. Coefficient close to zero.
What's actually happening: your PDP has 4 images and the category leader's has 9. No video. You're missing three structured attributes — the fillable fields on the product sheet, like "sugar-free," "family size," "vegan-friendly" — that the retailer uses to feed the side-panel filters on the category page. Without those attributes filled in, when a shopper applies a filter your product disappears from results even when it actually meets the criterion. Your content score is below median, and paid spend drives traffic to the listing, but conversion hits a PDP ceiling that no impression increase can break.
What scraping reveals: content score timeline, attribute completeness, image count vs. competitors, review velocity, video presence. When the SOV coefficient is flat and content score is below category median, the limiter isn't SOV — it's the PDP.
This matters because the retail media team will receive a null coefficient and conclude "doesn't work, lower the bid." The correct conclusion is "doesn't work because the listing doesn't convert; fix the listing before judging the bid." Two opposite operating routes, and only one solves the problem.
The observational model includes content score as a variable, but it's typically measured at end-of-period as a snapshot — not as a time series. If the listing improved mid-period, the end snapshot paints a universe where content was always good. Continuous scraping gives you the real series and lets you see whether the problem was poor content before, or still is now.
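A small sketch of the snapshot problem, assuming a weekly content score on a 0 to 100 scale and a known category median. The scores, the median of 70, and the function name are hypothetical.

```python
def weeks_below_median(weekly_scores, category_median):
    """Return the 1-based week numbers where content score was below median."""
    return [
        week for week, score in enumerate(weekly_scores, start=1)
        if score < category_median
    ]


# Hypothetical series: the listing was fixed between week 4 and week 5.
weekly_content_score = [52, 52, 53, 55, 78, 80, 81, 81]
category_median = 70

snapshot = weekly_content_score[-1]  # what an end-of-period audit sees
below = weeks_below_median(weekly_content_score, category_median)
share_below = len(below) / len(weekly_content_score)

print("end-of-period snapshot:", snapshot)
print("weeks below category median:", below)
print(f"share of period with weak content: {share_below:.0%}")
```

The snapshot says content was fine; the series says the PDP was below median for half the period, so half the spend was judged against a listing that could not convert.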
The honest workflow
The observational model without operational context is a coefficient floating in air. With operational context, it becomes a diagnosis. The routine:
- Run the regression over the period. Note the paid SOV coefficient and its confidence interval.
- Cross against the organic × paid matrix. What percentage of spend fell in the top-left zone? If >40%, suspect cannibalization before believing the null coefficient.
- Cross against competitor SOV. Did the competitor ramp up during the period? If so, what you measured was the effect of your spend net of theirs, not the effect of your spend alone.
- Cross against the availability timeline. Were core SKUs out of stock for more than 5% of the period? If so, drop those days or add availability as a variable, then re-run.
- Cross against content score and review velocity. Was the PDP converting during the period? If content score was below median, SOV wasn't the bottleneck.
Only after those four cross-checks does the coefficient mean what you think it means. Before that, it's an interpretation with undocumented bias — exactly what the IAB/MRC standard asks you to avoid.
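The four cross-checks can be sketched as a single function that returns the caveats to document next to the coefficient. The 40% and 5% thresholds come from the routine above; the `PeriodContext` structure, its field names, and the sample values are assumptions, and each input would come from its own scraped dataset.

```python
from dataclasses import dataclass


@dataclass
class PeriodContext:
    spend_share_in_cannibal_zone: float  # from the organic x paid matrix
    competitor_ramped: bool              # from the competitor SOV timeline
    stockout_share_core_skus: float      # from the availability timeline
    content_score: float                 # period average, not a snapshot
    category_median_content: float


def coefficient_caveats(ctx):
    """List the reasons not to take the paid SOV coefficient at face value."""
    caveats = []
    if ctx.spend_share_in_cannibal_zone > 0.40:
        caveats.append("cannibalization: much of the spend sat on keywords already organic #1-3")
    if ctx.competitor_ramped:
        caveats.append("defensive spend: the coefficient reflects spend net of a competitor ramp")
    if ctx.stockout_share_core_skus > 0.05:
        caveats.append("stockouts: exclude those days or add availability and re-run")
    if ctx.content_score < ctx.category_median_content:
        caveats.append("content gap: the PDP, not SOV, was the likely bottleneck")
    return caveats


# Hypothetical period.
ctx = PeriodContext(
    spend_share_in_cannibal_zone=0.55,
    competitor_ramped=True,
    stockout_share_core_skus=0.02,
    content_score=61.0,
    category_median_content=70.0,
)

for caveat in coefficient_caveats(ctx):
    print("-", caveat)
```

An empty list doesn't prove the coefficient is causal; it only means none of these four known confounders was flagged for the period.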
Closing
Observational attribution alone is incomplete because models can't see what they don't measure. Geo holdouts remain the gold standard for a clean counterfactual; between annual experiments, the cross-check of regression plus operational scraping is what you actually run.
ePerfectStore doesn't replace your MMM. What it does is give your model the context it can't capture: the organic × paid matrix, competitor SOV by keyword, the availability timeline by SKU × retailer, and continuous content score. It turns coefficients into diagnoses. That's the difference between "retail media isn't incremental" and "these 87 keywords aren't, these other 213 are, and this period needs to be redone because the competitor attacked in week 4."
One final distinction that deserves its own article: incrementality isn't the same as share. Your retail media can be perfectly incremental — those sales wouldn't have happened without the spend — and you can still be losing ground if competitors are growing faster. "Is my retail media working?" (incrementality) and "Am I winning in the market?" (share, competitive position) are two different questions that require two different lenses. For the first, the regression + scraping cross-check described above. For the second, the organic position thermometer — the other layer where ePerfectStore.com is the natural fit, and the one that actually answers the market-share question.
Sources
- Blake, Nosko & Tadelis (2015) — Consumer Heterogeneity and Paid Search Effectiveness: A Large-Scale Field Experiment. Econometrica. The canonical eBay branded paid search holdout: nearly all brand-keyword spend was non-incremental for an established brand. Econometrica / Wiley.
- Karmaker Santu, Sondhi & Zhai (2017) — On Application of Learning to Rank for E-Commerce Search. SIGIR. Survey of state-of-the-art LETOR for retailers, including how CTR, add-to-cart, conversion, and velocity are ranking signals. arXiv 1903.04263.
- Dash, Ghosh, Mukherjee, Chakraborty & Gummadi (2024) — Sponsored is the New Organic. AAAI. Algorithmic audit showing sponsored products consistently appear above products that Amazon's own algorithm classifies as more relevant, and that sponsorship feeds back into organic rank via clicks. arXiv 2407.19099.
- Sorokina & Cantu-Paz (2016) — Amazon Search: The Joy of Ranking Products. SIGIR. Amazon's own paper on how A9 incorporates engagement signals (clicks, conversions, reviews) into ranking — the foundation for why scraping confounders matter so much. Amazon Science.
- IAB/MRC — Retail Media Measurement Guidelines (January 2024). Joint standard accepting observational attribution as a valid method as long as bias is documented. IAB official PDF.
Is your MMM measuring retail media without knowing where you're cannibalizing, where you're being attacked, where you ran out of stock, or where content is capping conversion? ePerfectStore.com gives the operational context that turns coefficients into diagnoses.