Humans now share the web equally with bots, according to a major new report – as some fear that the internet is dying.
In recent months, the so-called “dead internet theory” has gained new popularity. It suggests that much of the content online is in fact automatically generated, and that the number of humans on the web is dwindling in comparison with bot accounts.
Now a new report from cyber security company Imperva suggests that it is increasingly becoming true. Nearly half, 49.6 per cent, of all internet traffic came from bots last year, its “Bad Bot Report” indicates.
That is up 2 per cent in comparison with last year, and is the highest number ever seen since the report began in 2013.
In some countries, the picture is worse. In Ireland, 71 per cent of internet traffic is automated, it said.
Some of that rise is the result of the adoption of generative artificial intelligence and large language models. Companies that build those systems use bots to scrape the internet and gather data that can then be used to train them.
Some of those bots are becoming increasingly sophisticated, Imperva warned. More and more of them come from residential internet connections, which makes them look more legitimate.
“Automated bots will soon surpass the proportion of internet traffic coming from humans, changing the way that organizations approach building and protecting their websites and applications,” said Nanhi Singh, general manager for application security at Imperva. “As more AI-enabled tools are introduced, bots will become omnipresent.”
The widespread use of bots has already caused problems for online services such as X, formerly known as Twitter. Popular posts on the site are now hit by a huge number of comments from accounts advertising pornography, and the company appears to be struggling to limit them.
Recently, its owner Elon Musk said that the site would start charging users to send posts and interact with others. That was the only way of stopping the proliferation of automated accounts, he said.
But X is far from the only site to be hit by automated content that is posing as real. Many similar posts are spreading across Facebook and TikTok, for instance.
When you consider how much traffic goes towards the larger sites, it’s actually believable. Even before the great migration Reddit was infested with reposter bots whose sole purpose was to farm karma in order to later sell the accounts. Those bots have gotten more sophisticated now, replicating not only original posts but entire comment threads. That’s not new content, but it’s content nevertheless, especially in the context of the dead Internet theory. Yes, it’s engagement farming, but that engagement is getting more sophisticated, both to trick the user (to drive engagement) as well as to trick the server (to prevent getting blocked).
This is a very insidious problem, because it means that such bots can and will be abused by threat actors (both internal and external) to drive popular sentiment in certain directions. We know how susceptible a generation that only watched cable news became; imagine what such campaigns can do to internet generations. If you can generate content that supports your rhetoric faster than humans can, but without appearing fake, then you can drown out dissident speech. Brigading is bad already, and it will get worse.
I think what I said still applies tbh, though I’m absolutely not disagreeing with you that the ~10% creating content is getting much more sophisticated at a potentially alarming rate.
But as someone who has experience working as an engineer on some of the biggest sites on the internet—the sheer volume of basic scraper and exploit scanner traffic that sites get is truly staggering in some cases.
Oh yes, absolutely. I’ve seen sites with millions of legitimate active users where we just dropped 98% of traffic because it’s all malicious, either exploit scanners or just plain DDoS attempts. Going back to your earlier comment,
On paper, any kind of automated traffic, be it DDoS, scanners, or automated content generation is bot activity. What is happening now though is that while consumptive bot activity is steady (because the field is already saturated), generative bot activity is skyrocketing. What it means for humans is that it turns media consumption from walking through an orchard and ignoring the rotten fruit to wading through a lake of shit and finding half-edible scraps. And I harbor no illusion that it wasn’t bad before LLMs - even years ago I remember resetting the filters on my Reddit client and the feed getting inundated with ragebait, porn, and all sorts of low quality content. But when I had my filters they were effective, and that is becoming less so these days.
It’s way past “like bots” but it wasn’t always nefarious.
The nefarious ones were good and hard to pick out. The majority were very shitty and obvious bots that individuals ran just to see how well it would work.
The thing is, some of those bots were set up with no end date, and the maker just kind of forgets about them. So we get a large percentage of them.
If Lemmy ever gets big enough, we’ll have the same problem here.