there’s a very real anxiety around the notion of a “dead internet”: the assertion that, for several years now, the internet has primarily consisted of bot activity and automated content.

for a time, this bot activity was easily discernible: premade, almost round-robin post patterns with sketchy links appended, and typo-ridden sentences scattered across social app threads and spam email.

this idea has gained particular traction, though, after the rise of LLMs in the 2020s, which have fed into the drowning out of human-authored content on the web in a way that is radically different and unprecedented. beyond the mere presence of spam content as in the past, there’s now a degree of plausibility that leads to much deeper interaction and engagement with these bots, and to domino effects of language models interacting with one another.

meanwhile, “dead internet” is a very passive term: it describes the lack of human presence on the internet, but makes no commentary on the very real harm these systems will bring to digital spaces first, at the level of plausible engagement, in the in-between where humans are still very much actively inhabiting these spaces.

to this end, i think, positing a polluted internet theory is a much better idea.

large-scale community is tainted first. it is easier than ever to produce mass amounts of plausible truth. LLMs, at their core, are probability machines that love predicting words. ignoring bias, the highest-probability tokens they generate when asked to imitate a human poster will typically form a statistically average simulacrum of a human thought. it will be unremarkable. but unremarkable and seemingly ordinary is all you need to slip by in communities so large they could not possibly hope to vet every individual member, especially when these bots can be fine-tuned to better fit the communities they seek to inhabit.
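to make the “probability machine” point concrete, here’s a minimal sketch of greedy decoding, assuming the hugging face `transformers` library and the small gpt-2 model (the prompt is made up). picking the single most probable token at every step is exactly what produces that statistically average, unremarkable voice:

```python
# a minimal sketch, not a bot blueprint: greedy decoding takes the argmax
# over next-token probabilities at every step, so the output drifts toward
# the most statistically ordinary continuation of the prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "honestly, i think the new update is"  # hypothetical forum-style prompt
inputs = tokenizer(prompt, return_tensors="pt")

output = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=False,  # greedy: always the single highest-probability token
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

real bot campaigns would sample with some temperature rather than decode greedily, but the pull toward the average continuation is the same.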

we’re already seeing this happen in open source software, where high-profile, earnest volunteer maintainers are openly demoralized by how draining it is to review ai-generated code contributions with ai-generated descriptions.

it gets hard to give the benefit of the doubt here. is the code partially written by a human? did the contributor use an LLM to translate from their own language? is the entire contribution generated? are the results it describes hallucinated? there is no way to know but to check, wasting more time on volumes of content that never needed to exist.

the mass commodification of plausibly fake information brings no benefit to anyone. what it does do is bring economies of scale to bot campaigns. API costs make these invasions laughably cheap, with bots that can plausibly interact with, reply to, and engage in conversation with posters; whether to sway ideals, to advertise, or with plain malicious intent.
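to put rough numbers on “laughably cheap”: the price and token counts below are illustrative assumptions, not quotes from any actual provider, but the order of magnitude is the point.

```python
# back-of-the-envelope cost of a reply-bot campaign.
# every number here is a hypothetical assumption for illustration.
price_per_million_output_tokens = 0.60  # USD; hypothetical cheap-model API price
tokens_per_reply = 80                   # a short, plausible forum reply
replies = 100_000                       # one sustained campaign across threads

total_tokens = replies * tokens_per_reply
cost = total_tokens / 1_000_000 * price_per_million_output_tokens
print(f"${cost:.2f} for {replies:,} replies")  # -> $4.80 for 100,000 replies
```

under those assumptions, a hundred thousand plausible replies costs less than a cup of coffee.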

this is going to affect digital spaces at every scale. it’s easier to slip by in a large community, but the plausibility aspect reaches even smaller-scale forums: even those as noble as medical support communities, and even those as isolated as invite-only mastodon servers.

the future of the internet, bleakly, is one dredged in plausible falsities, and it’s probably a good idea to maintain an active recognition of that going forward.