The Web’s Closing Doors – And AI’s Feeling the Lock
By Asaf Shamly | August 21, 2025

Recently, Cloudflare took the unusual step of publicly calling out Perplexity AI for scraping publisher content without permission.
And they didn’t hold back – accusing the company of disguising its crawler, bypassing robots.txt rules, and consuming massive bandwidth without attribution or consent.
Perplexity pushed back.
They said they’re playing by the rules.
That they’re helping the internet, not hurting it.
That they’re being misunderstood.
Cue the familiar debate: who owns online content? Should AI assistants be allowed to “learn” from everything on the web? Is scraping theft – or fair use?
These questions are important. But they’re also a distraction. Because the real tension here isn’t moral. It’s strategic.
This is a fight over visibility. And more precisely: who gets to see what – and what that seeing is worth.
Everyone’s chasing a data edge. Some are running out of road.
What makes tools like Perplexity valuable isn’t their UI. It’s the illusion of omniscience. The ability to summarize any topic, any site, any angle, without needing to click, browse, or read.
But that illusion depends on access – to vast amounts of content, refreshed constantly, without paywalls, slowdown, or blocks.
And that kind of access is starting to evaporate.
Cloudflare’s crackdown is just the most public signal. Publishers have been tightening their controls for months. Robots.txt is being revisited. IPs are being blacklisted. Some are exploring licensing models. Others are demanding compensation.
The message is simple:
If you want visibility, you’re going to have to earn it.
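For readers who have never looked at robots.txt up close, here is a minimal sketch of how a well-behaved crawler would consult it before fetching a page. It uses Python's standard `urllib.robotparser`. The rules, the bot name `ExampleAIBot`, and the `example.com` URLs are made up for illustration and are not taken from any real publisher or crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve. "ExampleAIBot" is an
# invented crawler name used only for this illustration.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules allow `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "SomeBrowser"):
        for url in ("https://example.com/articles/1",
                    "https://example.com/private/notes"):
            print(agent, url, can_fetch(agent, url))
```

Note that the check runs on the crawler's side. Robots.txt is a request, not a lock: nothing in the file stops a crawler that skips the check or announces itself under a different name, which is why publishers also turn to IP blocks and licensing terms.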
The web is fragmenting into gated intelligence zones
For years, AI startups could learn simply by crawling. Those days are numbered.
As content creators reclaim their walls, platforms are being forced to rethink how and where they train their models.
It’s not a technical debate. It’s a competitive one.
Because as scraping becomes harder – and legal pressure mounts – the companies that will thrive will be the ones with the strongest data partnerships, the clearest usage rights, and the richest exposure to real-world behavior.
Scraping tells you what’s on a page.
Intelligence comes from knowing what users do with it.
Visibility isn’t free (and it never really was)
There’s a growing myth that AI needs “the open web” to function.
But much of what’s out there isn’t open. It’s optimized. Stylized. Rewritten for performance. And increasingly, it’s not even human – it’s content written for other machines.
Training AI on scraped outputs isn’t learning. It’s recycling.
What’s actually scarce now is clean, signal-rich, human-centered behavioral data – the kind that shows how people scroll, react, ignore, pause. The kind that requires consent, cooperation, and context to collect.
That’s what platforms like Perplexity are struggling to access, and that’s why the fight is getting louder.
Don’t mistake the noise for the story
It’s easy to frame this as a debate about fairness, but it’s really a reshuffling of power.
The internet is dissolving into ecosystems where visibility is earned, not inherited. Where systems trained on borrowed content get beaten by those trained on direct interaction. Where permission, not reach, drives performance.
AI needs grounding.
And grounding comes from access to what’s real.