Analysis uncovered that when its official crawler was blocked, Perplexity switched tactics. Sometimes it masqueraded as a regular browser, pretending to be Chrome running on macOS. Other times, the source of the requests jumped between different IP addresses and networks, sidestepping bans by blending in as just another visitor.
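To make the distinction concrete, the sketch below contrasts a request from a crawler that declares itself with one dressed up as Chrome on macOS. The user-agent strings and target URL are illustrative placeholders, not the exact values observed in the analysis.

```python
# Illustrative sketch only: the user-agent strings and URL below are
# generic placeholders, not the actual strings seen in the analysis.
import requests

URL = "https://example.com/article"  # hypothetical target page

# A declared crawler identifies itself plainly, so robots.txt rules
# and user-agent blocks can be applied to it.
declared_headers = {
    "User-Agent": "ExampleBot/1.0 (+https://example.com/bot-info)",
}

# A stealth crawl instead presents a user agent that looks like Chrome
# on macOS, making it hard to distinguish from a normal visitor.
impersonating_headers = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/124.0.0.0 Safari/537.36"
    ),
}

for label, headers in [("declared", declared_headers),
                       ("impersonating", impersonating_headers)]:
    resp = requests.get(URL, headers=headers, timeout=10)
    print(label, resp.status_code)
```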
Attempts to block this shadowy traffic sparked a digital cat-and-mouse game: as websites tightened their restrictions, Perplexity’s crawlers simply found new doorways, swapping their digital fingerprints and routing requests through fresh channels.
A spokesperson for the research team stated, “We fingerprinted this crawler through network behavior and machine learning, watching as it rotated IPs and used undisclosed user agents to slip past automated defenses.”
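The researchers have not published their model, but a crude, rule-based stand-in for that kind of behavioral fingerprinting might look like the sketch below: group requests by some stable client fingerprint and flag groups that churn through many source IPs while presenting several browser-like user agents. The field names and thresholds are assumptions for illustration only.

```python
# A much-simplified, rule-based stand-in for the behavioral
# fingerprinting described above; the real system reportedly combines
# network signals with machine learning. Field names are assumptions.
from collections import defaultdict

def flag_rotating_clients(request_log, ip_threshold=5, ua_threshold=2):
    """Group requests by a client fingerprint (e.g. a TLS fingerprint)
    and flag fingerprints seen from many source IPs under several
    different user-agent strings."""
    ips = defaultdict(set)
    uas = defaultdict(set)
    for req in request_log:
        key = req["client_fingerprint"]   # hypothetical grouping key
        ips[key].add(req["source_ip"])
        uas[key].add(req["user_agent"])
    return [
        key for key in ips
        if len(ips[key]) >= ip_threshold and len(uas[key]) >= ua_threshold
    ]

# Example: one fingerprint seen from ten IPs under two browser-like UAs.
log = [
    {"client_fingerprint": "fp-A", "source_ip": f"203.0.113.{i}",
     "user_agent": "Mozilla/5.0 ... Chrome/124.0" if i % 2 else
                   "Mozilla/5.0 ... Safari/605.1"}
    for i in range(10)
]
print(flag_rotating_clients(log))   # -> ['fp-A']
```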
Notably, once the stealth tactics were neutralized, Perplexity shifted to gathering more generic information from third-party sites instead. Its answers lost their specificity, indicating that the blocks had in fact succeeded in cutting it off from the restricted data.
Meanwhile, other leading AI platforms are showcasing a better approach. OpenAI, for example, is transparent about its bots’ identities and explains what each one does. When tested against restricted websites, OpenAI’s ChatGPT followed instructions to the letter, halting its crawl when told not to proceed and making no further attempts under a different disguise.
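In practice, the well-behaved pattern is straightforward: declare a recognizable user agent, consult robots.txt before fetching, and stop when crawling is disallowed. The sketch below illustrates that flow with Python’s standard robots.txt parser; the bot name and URLs are placeholders, not OpenAI’s actual crawler.

```python
# A minimal sketch of the compliant pattern: declare who you are,
# check robots.txt first, and stop if crawling is disallowed.
from urllib import robotparser

BOT_NAME = "ExampleBot"                      # hypothetical crawler name
TARGET = "https://example.com/private/page"  # hypothetical page

parser = robotparser.RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()

if parser.can_fetch(BOT_NAME, TARGET):
    print("Allowed: fetching", TARGET)
    # ... perform the request under the declared user agent ...
else:
    # A compliant crawler halts here and does not retry under a
    # different identity or from a different network.
    print("Disallowed by robots.txt: skipping", TARGET)
```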
This behavior contrasts sharply with what has been seen from Perplexity, according to the experts reviewing web traffic patterns. One engineer involved in the analysis said, “The right way is clear: bots should play by the rules and respect the choices of website owners.”
For now, a growing number of sites are tuning their security systems to pick up on Perplexity’s more elusive attempts, and security tools are updating their filters to catch stealth crawlers that try to outsmart traditional blocks.
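Exactly how those filters work is proprietary, but a deliberately simplified heuristic in the same spirit might flag requests that claim to come from a browser yet lack headers real browsers normally send, or that originate from datacenter address space. Everything in the sketch below, including the network range, is a made-up illustration rather than any vendor’s actual rule.

```python
# A deliberately simplified filter: real products combine many more
# signals (TLS fingerprints, ASN reputation, behavioral scoring).
# The datacenter-network list is a made-up placeholder.
from ipaddress import ip_address, ip_network

DATACENTER_NETS = [ip_network("192.0.2.0/24")]   # placeholder ranges

def looks_like_stealth_crawler(headers, source_ip):
    ua = headers.get("User-Agent", "")
    claims_browser = "Chrome" in ua or "Safari" in ua
    # Real browsers normally send Accept-Language; bare HTTP clients often omit it.
    missing_browser_headers = not headers.get("Accept-Language")
    from_datacenter = any(ip_address(source_ip) in net
                          for net in DATACENTER_NETS)
    return claims_browser and (missing_browser_headers or from_datacenter)

print(looks_like_stealth_crawler(
    {"User-Agent": "Mozilla/5.0 ... Chrome/124.0"}, "192.0.2.10"))  # True
```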