Web Scraping - Technical Measures To Stop Bots

The administrator of a website can use various measures to stop or slow a bot. Some techniques include:

  • Adding entries to robots.txt. Well-behaved bots, such as Google's crawler, adhere to these entries and can be stopped this way; a bot that ignores robots.txt is not affected.
  • Blocking an IP address. This will also block all browsing from that address.
  • Disabling any web service API that the website's system might expose.
  • Bots sometimes declare who they are and can be blocked on that basis; 'googlebot' is an example. Some bots make no distinction between themselves and a human browser.
  • Bots can be blocked by monitoring for excess traffic.
  • Bots can sometimes be blocked with tools that verify a real person is accessing the site, such as a CAPTCHA. Bots are sometimes coded to explicitly break specific CAPTCHA patterns.
  • Commercial anti-bot services: Several companies, such as Distil, SiteBlackBox and Sentor, offer anti-bot and anti-scraping services for websites. A few Web Application Firewalls have limited bot detection capabilities as well.
  • Locating bots with a honeypot or other method to identify the IP addresses of automated crawlers.
  • Using CSS sprites to display such data as phone numbers or email addresses, at the cost of accessibility to screen reader users.
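
Three of the measures above (robots.txt entries, blocking on a declared bot name, and excess traffic monitoring) can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not a production defense: the robots.txt content, the blocklisted agent names, the request limits, and the helper names robots_allows, is_blocked_agent and RateMonitor are all hypothetical and chosen only for the example.

```python
import time
from collections import defaultdict, deque
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every crawler to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical substrings to look for in a declared User-Agent header.
BLOCKED_AGENT_SUBSTRINGS = ("badbot", "examplescraper")


def robots_allows(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if a crawler that honours robots.txt may fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


def is_blocked_agent(user_agent):
    """Return True if the declared User-Agent matches the blocklist."""
    ua = user_agent.lower()
    return any(name in ua for name in BLOCKED_AGENT_SUBSTRINGS)


class RateMonitor:
    """Sliding-window request counter keyed by client IP address."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        """Record a request; return False once the IP exceeds the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    monitor = RateMonitor(max_requests=2, window_seconds=10.0)
    print(robots_allows("examplebot", "/private/data"))
    print(is_blocked_agent("Mozilla/5.0 (compatible; BadBot/1.0)"))
    print([monitor.allow("203.0.113.5", now=t) for t in (0.0, 1.0, 2.0)])
```

Note that the robots.txt check only describes what a cooperative crawler would do for itself; the server cannot enforce it. The User-Agent check is likewise defeated by any bot that presents itself as an ordinary browser, which is why traffic monitoring is usually combined with the other measures.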
