
Are we witnessing a botmageddon?

In the age of big data, information is the new gold rush. But on the web, a different kind of prospector is emerging – bots. As a managed hosting provider, we've witnessed a surge in website traffic consumption, and a significant portion of it comes from bots. This raises the question: are we witnessing a ‘botmageddon’, and what does it mean for website owners?


Understanding bots

A bot is an automated program that mimics human actions online. Bots range from the helpful, such as search engine crawlers indexing your site, to the harmful, such as bots that scrape your content or launch attacks.

The bot boom

The internet landscape is experiencing a surge in automated activity. A recent report revealed that nearly half (49.6%) of all internet traffic in 2023 originated from bots. While some bots serve legitimate purposes, like search engine crawlers indexing websites, the scale of the rise is concerning.

Malicious 'bad bots' now comprise a worrying 32% of this traffic. These bad actors can scrape valuable data, spread misinformation, and even attempt account takeovers. This highlights the growing need for website owners to understand and manage bot activity on their sites.

The impact on websites

Even good bots can cause issues. Unexpected traffic spikes can increase infrastructure costs, slow down page loads, or even crash your site. Here's a deeper dive into the challenges:

LLM issue

Large language models (such as ChatGPT, Copilot and Gemini) are trained on massive datasets, including website content. Unlike with search engines, there is no economic exchange: a model’s response does not credit or link back to the websites whose content was used to train it. You can see why original content creators could be upset about that.

Plagiarism issue

Not all AI writing is plagiarism, but tools built on LLMs can reproduce sentences or even whole paragraphs of existing web content from online publishers and original creators. As a result, other websites could end up containing your content, and yours theirs, if you use any tool drawing on an LLM trained on this data. This can cause problems such as lower search engine rankings, or even legal action for copying copyrighted content.

Cost issue

Increased bot traffic creates a double whammy for website owners. First, it drives up hosting and maintenance costs. Servers need to scale to accommodate traffic spikes, but if a large portion of that traffic comes from unwanted bots, it translates to higher costs without any actual benefit.

Second, massive bot influxes can overwhelm servers, especially if auto-scaling isn't enabled or capacity limits are reached. The result is slow page loads or outright crashes, degrading the experience for legitimate visitors.

Performance & speed issues

Google Search’s Core Web Vitals measure loading speed, responsiveness and visual stability, and these metrics directly affect how Google views and ranks your website. A bot-driven traffic influx can slow your site down, and slow loading times lead to frustrated visitors leaving before they find out whether your content is any good.
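If you want to check where you stand, Google’s public PageSpeed Insights API returns lab measurements of these metrics. Here’s a minimal Python sketch, using a placeholder site URL you’d swap for your own:

    import json
    import urllib.request

    # Placeholder URL; substitute the page you want to test.
    site = "https://www.example.com"
    api = ("https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
           f"?url={site}&category=performance")

    with urllib.request.urlopen(api) as response:
        data = json.load(response)

    # Pull the overall performance score and one Core Web Vital,
    # Largest Contentful Paint, from the Lighthouse lab results.
    result = data["lighthouseResult"]
    print("Performance score:", result["categories"]["performance"]["score"])
    print("Largest Contentful Paint:",
          result["audits"]["largest-contentful-paint"]["displayValue"])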

Analytics issues

Since digital marketing and business decisions are only as good as the data that informs them, inflated page views and skewed engagement rates can lead you to the wrong conclusion about whether a page is working. By default, known bots are excluded from your Google Analytics reports. However, unknown bots, such as LLM crawlers, and malicious ones are likely to be mixed into your data.

To understand how much you’re affected, review your reports for anomalies: traffic from locations you don’t target, unusual peaks from a single source, and shifts in time-on-page or engagement rates. Once you’ve identified the culprits, you can exclude IP ranges or create segment exclusions to clean up the data.
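As a starting point on the server side, here’s a minimal Python sketch (assuming a combined-format access log at a hypothetical path, access.log) that tallies requests per IP and per user agent so outliers stand out:

    import re
    from collections import Counter

    # Hypothetical path; point this at your web server's access log.
    LOG_PATH = "access.log"

    # Combined log format: IP - - [time] "request" status size "referer" "user-agent"
    LINE_RE = re.compile(
        r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
    )

    ip_hits, agent_hits = Counter(), Counter()
    with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = LINE_RE.match(line)
            if match:
                ip, agent = match.groups()
                ip_hits[ip] += 1
                agent_hits[agent] += 1

    print("Top client IPs:", ip_hits.most_common(10))
    print("Top user agents:", agent_hits.most_common(10))

A handful of IPs or one obscure user agent dominating the counts is a strong hint of bot traffic, and gives you concrete IP ranges to exclude in your analytics.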

Security issues

Website owners face a multitude of security threats from bad bots. These automated programs can steal valuable content through web scraping, jeopardising a business's competitive edge. Credential stuffing bots use stolen login information to gain unauthorised access to user accounts, potentially leading to fraud, account takeovers, and data breaches.

DDoS bots launch denial-of-service attacks, overwhelming websites and online stores with traffic, causing outages and financial losses. Finally, spam bots spread harmful or misleading content, damaging a company's reputation and potentially causing financial losses.

The bot balancing act

The rise of bots presents a complex challenge for website owners. While some bots play a vital role in the internet ecosystem, the sharp surge in both good and bad bot activity demands a proactive approach.

The answer isn't a simple ‘to block or not to block’; it's about finding a balance. Here are a few strategies to consider:

  • Leverage robots.txt - This file acts as a traffic guide for bots, specifying which areas of your website they may access. Use it to restrict unwanted bots while letting search engines and other legitimate crawlers do their jobs (see the sample file after this list).
  • Embrace host-level blocking - Many hosting providers, including Dynamo6’s Advanced Host service, offer bot mitigation services that identify and block malicious bots at the network level, stopping them from ever reaching your website.
  • Consider alternatives - For high-value content, consider requiring logins or subscriptions to deter content scraping by simple bots.
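As an example of the first strategy, a robots.txt along these lines welcomes Google’s search crawler while asking known AI-training crawlers (OpenAI’s GPTBot and Common Crawl’s CCBot, both of which publish their user agent names) to stay out. Bear in mind that robots.txt is voluntary: well-behaved bots honour it and bad bots ignore it, which is why it pairs well with host-level blocking.

    # Allow Google's search crawler everywhere
    User-agent: Googlebot
    Disallow:

    # Ask AI-training crawlers to stay out of the whole site
    User-agent: GPTBot
    Disallow: /

    User-agent: CCBot
    Disallow: /

    # Everyone else: keep out of a members-only area (hypothetical path)
    User-agent: *
    Disallow: /members/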

A future of collaboration

By understanding bot activity and implementing strategic solutions, website owners can mitigate risks and harness the benefits of good bots. Collaboration between website owners, hosting providers and security professionals is essential to navigating this space.

Reach out to the Dynamo6 team if you’d like to find out more about our secure web hosting service and security solutions.
