Web Scraping - Technical Measures To Stop Bots

The administrator of a website can use various measures to stop or slow a bot. Some techniques include:

  • Adding entries to robots.txt. Well-behaved bots, such as Google's crawler, adhere to these entries and can be stopped this way; bots that ignore the file are unaffected.
  • Blocking an IP address. This also blocks all legitimate browsing from that address.
  • Disabling any web service API that the website's system might expose.
  • Blocking bots that declare who they are, typically in their user agent string; 'googlebot' is an example. Some bots make no distinction between themselves and a human browser, so they cannot be blocked on this basis.
  • Monitoring for excess traffic. A client that sends far more requests than a human visitor plausibly would can be slowed or blocked.
  • Bots can sometimes be blocked with tools that verify that a real person is accessing the site, such as a CAPTCHA. Bots are sometimes coded to explicitly break specific CAPTCHA patterns.
  • Commercial anti-bot services: Several companies, such as Distil, SiteBlackBox and Sentor, offer anti-bot and anti-scraping services for websites. A few Web Application Firewalls have limited bot detection capabilities as well.
  • Locating bots with a honeypot or other method to identify the IP addresses of automated crawlers.
  • Using CSS sprites to display such data as phone numbers or email addresses, at the cost of accessibility to screen reader users.
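
As an illustration of the robots.txt measure, the following minimal sketch shows how a well-behaved bot might consult the file before fetching a page, using Python's standard urllib.robotparser module. The robots.txt content, the bot names 'examplebot' and 'otherbot', and the example.com URLs are all hypothetical and chosen only for the demonstration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the bot name and paths are assumptions.
ROBOTS_TXT = """\
User-agent: examplebot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# 'examplebot' is barred from the whole site.
blocked_bot = parser.can_fetch("examplebot", "https://example.com/index.html")
# Any other bot may fetch public pages but not anything under /private/.
other_public = parser.can_fetch("otherbot", "https://example.com/index.html")
other_private = parser.can_fetch("otherbot", "https://example.com/private/data.html")

print(blocked_bot, other_public, other_private)
```

The check happens entirely on the bot's side, which is why this measure only works against bots that choose to honor it.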
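
Excess traffic monitoring can be sketched as a sliding-window counter per IP address. This is one possible approach under stated assumptions, not a description of any particular product: the class name TrafficMonitor, the limit of 3 requests per 10 seconds, and the address 203.0.113.7 are hypothetical values picked for the example.

```python
from collections import defaultdict, deque


class TrafficMonitor:
    """Sliding-window request counter keyed by client IP (illustrative only)."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps each IP to the timestamps of its recently accepted requests.
        self._hits = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        """Return True if the request is within the limit, False if it should be refused."""
        hits = self._hits[ip]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Assumed limit: at most 3 requests per 10-second window.
monitor = TrafficMonitor(max_requests=3, window_seconds=10.0)
results = [monitor.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 3.0, 11.0)]
print(results)
```

Timestamps are passed in explicitly so the behavior is easy to test; a real deployment would more likely read a monotonic clock. Because the counter is keyed by IP address, it shares the drawback noted above for IP blocking: every user behind the same address is counted together.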
