Website Parse Template - WPT Syntax

WPT Syntax

A Website Parse Template consists of the following sections:

  • Ontology, where the publisher defines the concepts and relations used in the website.
  • Templates, where the publisher provides templates for groups of web pages that share a content category and structure. For each template, the publisher gives the XPath or tag IDs of the HTML elements and links them to concepts in the website Ontology.
  • URLs, where the publisher provides URL patterns that collect groups of web pages and link them to a Parse Template. In the URLs section the publisher can also separate out a part of the URL as a concept and link it to the website Ontology.
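
How the three sections relate can be sketched in a few lines of Python. This is only an illustrative in-memory model, not WPT syntax: the concept names, the XPath expressions, the URL pattern, and the helper `resolve` are all hypothetical and chosen for the example. It shows a URL pattern selecting a template, a URL part being captured as an Ontology concept, and template bindings mapping XPath expressions to concepts.

```python
import re
from dataclasses import dataclass

# Ontology: the concepts the (hypothetical) website declares.
ONTOLOGY = {"Article", "Title", "Body", "Category"}


@dataclass
class Template:
    name: str
    bindings: dict  # XPath expression -> Ontology concept


@dataclass
class UrlRule:
    pattern: re.Pattern  # named groups capture URL parts as concepts
    template: str        # name of the template the matching pages use


# Templates: one entry per group of structurally similar pages.
TEMPLATES = {
    "article_page": Template(
        name="article_page",
        bindings={
            "//h1[@id='title']": "Title",
            "//div[@id='content']": "Body",
        },
    ),
}

# URLs: patterns that collect pages and link them to a template.
# The named group "Category" marks a URL part as an Ontology concept.
URL_RULES = [
    UrlRule(re.compile(r"^/(?P<Category>[a-z]+)/\d+\.html$"), "article_page"),
]


def resolve(path):
    """Return (template, concepts captured from the URL), or None if no rule matches."""
    for rule in URL_RULES:
        match = rule.pattern.match(path)
        if match:
            return TEMPLATES[rule.template], match.groupdict()
    return None
```

For example, `resolve("/sports/123.html")` selects the `article_page` template and captures `sports` as the value of the `Category` concept, while a path that matches no pattern returns `None`.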

A Website Parse Template begins with an opening <icdl> tag and ends with a closing </icdl> tag. A single Website Parse Template refers to one host, while a single host may have several Website Parse Templates describing its HTML structure. The host for the Website Parse Template must be specified at the beginning, in the <icdl> tag:

. . . . . . . . . . . . . . . . . . .
