Stop Using AI LLMs
For the sake of our future, stop using AI LLMs. LLMs (also known as "large language models", "AI", or "generative AI") are unethical, dangerous, and pose a serious threat to the future of the internet as it exists today.
Environmental Concerns
LLMs greatly harm the environment through both their training and their everyday usage. Training a model with 213 million parameters produces 300 tCO2-eq, equivalent to the emissions from 125 round-trip flights between Beijing and New York City. Even that pales in comparison to the 552 tCO2-eq produced while training GPT-3[1], equivalent to running a hospital MRI machine for 9.5 straight years[2], or to the energy consumption of an average American household for 32 years[1:1]. These carbon footprints exclude preliminary training trials, and they keep growing as every tech company releases new models. Many of these companies are now investing in nuclear power just to sustain the energy demands of their LLMs. These statistics also leave out the other natural resources consumed by intensive computation, such as the water required to cool data-center processors[3].
Increased LLM usage will only accelerate climate change and the depletion of other natural resources.
Ethical Concerns
LLMs raise ethical concerns because of the nature of their datasets. Approximately 70% of WildChat queries contain some kind of PII (personally identifiable information), and 15% mention a non-PII sensitive topic[4]. Hashed IP addresses, country locations[4:1], private medical records, and other confidential information have all been found in training datasets[3:1]. Class action lawsuits have already been filed against Microsoft, GitHub, and OpenAI for violating copyright law by training models directly on verbatim copyrighted material[5], with further suits filed against other tech companies. Although governments have begun to consider legislation for these technologies, it may be as late as 2035 before any regulations are imposed on LLM training datasets.
LLMs additionally spread misinformation because of their assumed credibility and the polish of their outputs. Trained on vast amounts of information, they carry a degree of simulated authority, and combined with fluent, confident language, their responses are easily mistaken for expert opinion[6]. For example, two days before the US presidential primary in New Hampshire, robocalls went out using AI-generated audio of President Biden's voice, urging Democratic voters to save their votes for November rather than vote in the primary[3:2]. There are also concerns about the future credibility of internet media in general. One study found that humans successfully spot only 9.6% of LLM-generated news on average, leaving a startling 90.4% to be read as factual. Even when LLMs are pitted against each other, GPT-4 detects only 10.0% of LLM-generated news[7]. This becomes especially dangerous given the nature of the datasets involved: biases in the training data can produce outputs that reinforce stereotypes, discriminate unjustly against certain groups, and harm marginalized communities[8].
LLMs are unregulated, biased, and spreading misinformation.
The Threat to Technology and the Internet
LLMs threaten the future of our technology and of the internet itself. They are increasingly used for software creation, producing gaps in foundational software knowledge and poorly written code. For example, the rate of security issues flagged by Clang in LLM-generated code is consistently higher than in human-written code: 11.2% higher on LeetCode problems and 7.1% higher on algorithm tasks[9]. In one comparison of SHA1 implementations, GPT-4o's implementation produced incorrect hash values for every input[9:1], a serious security concern.
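Claims like this are easy to verify in practice, because cryptographic hash functions have well-known reference outputs. The sketch below is not taken from the cited study; it uses a hypothetical placeholder, llm_sha1, standing in for whatever code a model produces, and shows how a generated SHA1 implementation could be checked against Python's standard hashlib before anyone relies on it:

```python
import hashlib

def llm_sha1(data: bytes) -> str:
    """Stand-in for an LLM-generated SHA1 implementation under test (hypothetical)."""
    raise NotImplementedError("paste the generated implementation here")

def check_sha1(candidate, test_vectors):
    """Compare a candidate SHA1 implementation against hashlib on known inputs."""
    failures = []
    for data in test_vectors:
        expected = hashlib.sha1(data).hexdigest()
        try:
            actual = candidate(data)
        except Exception as exc:  # a crash counts as a failure too
            failures.append((data, f"raised {exc!r}"))
            continue
        if actual != expected:
            failures.append((data, f"got {actual}, expected {expected}"))
    return failures

if __name__ == "__main__":
    vectors = [b"", b"abc", b"The quick brown fox jumps over the lazy dog"]
    for data, problem in check_sha1(llm_sha1, vectors):
        print(f"FAIL {data!r}: {problem}")
```

A mismatch on even a single input means the generated implementation cannot be trusted, which is exactly the failure mode reported above.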
The proliferation of LLM-driven bots has sharply increased bot traffic to websites of all sizes. Many sites have been knocked offline by LLM bot spam and scraping floods that amount to a DDoS, and to combat the surge, websites are forcing every user through more CAPTCHAs, human-verification prompts, and 2FA checks.
LLM-driven bots also disregard internet standards. While some respect robots.txt, the long-standing convention that tells crawlers which pages they may scrape, a number of bots misbehave and ignore it, including Meta's[10]. For example, Baidu added directives to its robots.txt file prohibiting a rival company from scraping its content; the rival's crawlers disregarded the instructions, scraped the content without consent, and reproduced it for their own users[11]. Worse, many tech companies do not disclose a complete list of the LLM bots they operate, limiting site owners' ability to block them[10:1]. The result is an "online arms proliferation" in which creators and companies have begun deploying LLM-generated labyrinths to trap LLM-driven bot traffic.
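To see why compliance is so easy to skip, note that robots.txt is purely advisory: a crawler has to choose to consult it. Here is a minimal sketch of what a well-behaved crawler would do, using Python's standard library; the site URL and the "ExampleLLMBot" user-agent string are illustrative, not real bots:

```python
from urllib import robotparser

# Fetch and parse the target site's robots.txt (URL is illustrative).
parser = robotparser.RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()

# "ExampleLLMBot" is a hypothetical crawler user-agent string.
target = "https://example.com/articles/some-post"
if parser.can_fetch("ExampleLLMBot", target):
    print("robots.txt permits crawling:", target)
else:
    print("robots.txt disallows crawling:", target)
```

Nothing in the protocol enforces this check; a bot that never performs it faces no technical barrier, which is why misbehaving crawlers can simply ignore the file.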
LLMs are producing insecure software and breaking web standards, forcing an online arms proliferation of scraping bots and the checks sites must impose to verify that their visitors are human. This trend will only worsen over time.
Conclusion
AI LLMs threaten the future of the internet. In addition to being unethical and unregulated, these models actively harm the environment, spread misinformation and bias, produce insecure software, and break internet standards. I urge every one of you to stop using LLMs. By the time humanity comes to its senses, it may already be too late.
References
Preventing the Immense Increase in the Life-Cycle Energy and Carbon Footprints of LLM-Powered Intelligent Chatbots
Generating Harms: Generative AI's New & Continued Impacts
Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild
Beyond Fair Use: Legal Risk Evaluation for Training LLMs on Copyrighted Text
Risks and Benefits of Large Language Models for the Environment
Navigating LLM Ethics: Advancements, Challenges, and Future Directions
Artificial-Intelligence Generated Code Considered Harmful: A Road Map for Secure and High-Quality Code Generation
Somesite I Used To Crawl: Awareness, Agency and Efficacy in Protecting Content Creators From AI Crawlers