Spider Trap

A spider trap (or crawler trap) is a set of web pages that, intentionally or unintentionally, causes a web crawler or search bot to make an infinite number of requests, or causes a poorly constructed crawler to crash. Web crawlers are also called web spiders, from which the name derives. Spider traps may be created deliberately to "catch" spambots or other crawlers that waste a website's bandwidth. They may also arise unintentionally, for example from calendars whose dynamic pages always link to the next day or year.

Common techniques include:

  • indefinitely deep directory structures such as http://foo.com/bar/foo/bar/foo/bar/foo/bar/.....
  • dynamic pages, such as calendars, that produce an unbounded number of pages for a web crawler to follow.
  • pages filled with a very large number of characters, which can crash the lexical analyzer parsing the page.
  • pages with session IDs based on required cookies.
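
On the crawler side, the first three techniques above are often blunted with simple resource limits. The following is a minimal sketch, not a description of any particular crawler: the class name CrawlBudget, the function truncate_body, and all default limits are hypothetical choices made for illustration.

```python
from urllib.parse import urlsplit


class CrawlBudget:
    """Hypothetical per-crawl limits that bound the damage a trap can do."""

    def __init__(self, max_path_depth=16, max_pages_per_host=10_000):
        # Both limits are illustrative assumptions, not recommended values.
        self.max_path_depth = max_path_depth
        self.max_pages_per_host = max_pages_per_host
        self.pages_per_host = {}

    def allow_url(self, url):
        """Return True if the URL may be fetched, and count it against its host."""
        parts = urlsplit(url)

        # Guard against indefinitely deep directory structures.
        depth = len([segment for segment in parts.path.split("/") if segment])
        if depth > self.max_path_depth:
            return False

        # Guard against dynamic pages that generate links without end.
        host = parts.netloc.lower()
        seen = self.pages_per_host.get(host, 0)
        if seen >= self.max_pages_per_host:
            return False

        self.pages_per_host[host] = seen + 1
        return True


def truncate_body(body, max_page_bytes=1_000_000):
    """Cap the bytes handed to the parser so oversized pages cannot exhaust it."""
    return body[:max_page_bytes]
```

A crawler would call allow_url before each fetch and pass each response through truncate_body before parsing. Limits like these stop the crawl from running forever, but they do not identify which pages are traps, and a low limit can also cut off legitimate deep sites.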

There is no algorithm to detect all spider traps. Some classes of traps can be detected automatically, but new, unrecognized traps arise quickly.
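
As an example of one class that can be flagged automatically, a URL whose path repeats the same segments many times is a common sign of a looping directory structure. The sketch below is a heuristic only; the function name looks_like_path_loop and the max_repeats threshold are assumptions made for illustration, and legitimate URLs can occasionally trigger it.

```python
from collections import Counter
from urllib.parse import urlsplit


def looks_like_path_loop(url, max_repeats=3):
    """Flag URLs in which any single path segment occurs more than max_repeats times."""
    segments = [segment for segment in urlsplit(url).path.split("/") if segment]
    if not segments:
        return False
    # Count how often each segment appears anywhere in the path.
    most_common_count = Counter(segments).most_common(1)[0][1]
    return most_common_count > max_repeats
```

With the default threshold, http://foo.com/bar/foo/bar/foo/bar/foo/bar/ is flagged because the segment bar occurs four times, while a path such as /a/b/c is not. A check like this does nothing for calendar-style or session-ID traps, which need different signals.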
