Windows Search - Programmability

Programmability

The Windows Search index can be accessed programmatically from both managed and native code. Native code connects to the index catalog by using a Data Source Object retrieved from the Indexing Service OLE DB provider. Managed code uses the MSIDXS ADO.NET provider. A catalog on a remote machine can also be queried by specifying a UNC path. The search criteria are specified using a SQL-like syntax. The SQL query can either be written by hand or generated by an implementation of the ISearchQueryHelper interface. Windows Search provides implementations of the interface that convert AQS or NQS queries to their SQL counterparts.

The OLE DB/SQL API implements the functionality for searching and querying across the indices and property stores. Queries are expressed in a variant of SQL (regular SQL with certain restrictions), and results are returned as OLE DB Rowsets. Whenever a query is executed, the parts of the index it used are temporarily cached, so that further searches that filter the result set need not access the disk again, which improves performance. Windows Search stores its index in an Extensible Storage Engine file named Windows.edb, located by default in the \ProgramData\Microsoft\Search\Data\Applications\Windows\ folder at the root of the system drive in Windows Vista and later versions of Windows. (The corresponding location in Windows XP is \All Users\Application Data\Microsoft\Search\Data\Applications\Windows\ inside the Documents and Settings folder.)

The index store is called SystemIndex and contains all retrievable Windows IPropertyStore values for indexed items. For example, the name and location of each document on the system are exposed as table columns named System.ItemName and System.ItemUrl respectively. A SQL query can refer directly to these tables and index catalogs and use the MSIDXS provider to run queries against them. The search index can also be used via OLE DB, using the CollatorDSO provider. However, the OLE DB provider is read-only, supporting only SELECT and GROUP ON SQL statements.
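As an illustration of what such a query looks like, the following is a minimal sketch that only assembles the SQL text for a SELECT against SystemIndex; it does not open a connection. The helper names (escape_literal, build_query) are hypothetical, the use of the SCOPE and CONTAINS predicates with a quoted phrase is an assumption about how a typical query is written, and the connection string is included for context only.

```python
# Connection string commonly documented for the Windows Search OLE DB provider.
# Shown for context only; nothing below opens a connection.
CONNECTION_STRING = (
    "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"
)


def escape_literal(text):
    """Double single quotes so the text can sit inside a SQL string literal."""
    return text.replace("'", "''")


def build_query(columns, phrase, scope_url=None):
    """Assemble a read-only SELECT statement against SystemIndex.

    columns   -- property names to return, e.g. "System.ItemName"
    phrase    -- text to look for; wrapped in double quotes as a phrase
    scope_url -- optional location to restrict the search to (assumed format)
    """
    if not columns:
        raise ValueError("at least one column is required")
    if '"' in phrase:
        raise ValueError("phrase must not contain double quotes")

    predicates = []
    if scope_url is not None:
        predicates.append("SCOPE='" + escape_literal(scope_url) + "'")
    predicates.append("CONTAINS('\"" + escape_literal(phrase) + "\"')")

    return (
        "SELECT " + ", ".join(columns)
        + " FROM SystemIndex WHERE " + " AND ".join(predicates)
    )


sql = build_query(
    ["System.ItemName", "System.ItemUrl"],
    "annual report",
    scope_url="file:C:/Users/Public",
)
print(sql)
```

The resulting string would then be handed to whichever OLE DB or ADO.NET client is in use. Because the provider is read-only, the sketch deliberately produces nothing other than a SELECT.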

Windows Search also registers a search-ms application protocol, which can be used to represent searches as URIs. The search parameters and filters are encoded in the URI using AQS or its natural-language counterpart, NQS. When the URI is invoked by Explorer, Windows Search (the default registered handler for the protocol) launches the Search Explorer with the results of the search. In Windows Vista SP1 and later, third-party handlers can also register themselves as the application protocol handler, so that searches can be performed using whichever search engine the user has set as the default, and not just Windows Search.
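A search URI of this kind can be put together with ordinary percent-encoding. The sketch below is a minimal illustration, assuming the query, crumb=location: and displayname parameters of the search-ms protocol; the function name build_search_ms_uri is hypothetical, and the exact set of parameters a given handler accepts should be checked against the protocol documentation.

```python
from urllib.parse import quote


def build_search_ms_uri(query, location=None, display_name=None):
    """Encode an AQS query (and optional filters) as a search-ms URI."""
    # Percent-encode everything, including ':' and '\', so the values
    # cannot be confused with the URI's own separators.
    parts = ["query=" + quote(query, safe="")]
    if location is not None:
        parts.append("crumb=location:" + quote(location, safe=""))
    if display_name is not None:
        parts.append("displayname=" + quote(display_name, safe=""))
    return "search-ms:" + "&".join(parts)


uri = build_search_ms_uri("kind:email from:alice", location="C:\\Users\\Public")
print(uri)
```

On a Windows machine, passing such a URI to the shell (for example through the Run dialog) would hand it to whichever application is registered as the search-ms handler.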

The Windows Search service provides the Notifications API component to allow applications to "push" changed items that need indexing to the Windows Search indexer. Applications use the component to supply the URIs of the items that need to be indexed, and the URIs are written to the Gather Queue, from which the indexer reads them. Microsoft Office Outlook 2007 and Microsoft Office OneNote 2007 use this ability to index the items they manage, and use Windows Search queries to provide their in-application search features. The Notifications API is also used by the internal USN Journal Notifier component of Windows Search, which monitors the Change Journal of an NTFS volume to keep track of files that have changed on the volume. If a changed file is in a location indexed by Windows Search and does not have the FANCI (File Attribute Not Content Indexed) attribute set, the Windows Search service is notified of its path via the Notifications API.
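The filtering rule described for the USN Journal Notifier can be modelled in a few lines. The following is a simplified simulation, not the actual component: the gather queue is a plain deque, the change records are hypothetical (path, attributes) pairs, and the function names are invented for illustration. Only the attribute constant comes from Python's standard stat module.

```python
import ntpath
import stat
from collections import deque

# The "FANCI" bit: FILE_ATTRIBUTE_NOT_CONTENT_INDEXED (0x2000).
FANCI = stat.FILE_ATTRIBUTE_NOT_CONTENT_INDEXED


def is_under(path, root):
    """Return True if path equals root or lies below it (case-insensitive)."""
    p = ntpath.normcase(ntpath.normpath(path))
    r = ntpath.normcase(ntpath.normpath(root)).rstrip("\\")
    return p == r or p.startswith(r + "\\")


def should_notify(path, attributes, indexed_roots):
    """Apply the rule from the text: in an indexed location and FANCI not set."""
    in_scope = any(is_under(path, root) for root in indexed_roots)
    return in_scope and not (attributes & FANCI)


def enqueue_changes(changes, indexed_roots, gather_queue):
    """Push every qualifying changed path onto the simulated gather queue."""
    for path, attributes in changes:
        if should_notify(path, attributes, indexed_roots):
            gather_queue.append(path)


roots = ["C:\\Users\\Alice\\Documents"]
queue = deque()
enqueue_changes(
    [
        ("C:\\Users\\Alice\\Documents\\a.txt", 0x20),          # indexed
        ("C:\\Users\\Alice\\Documents\\b.txt", 0x20 | FANCI),  # opted out
        ("C:\\Temp\\c.txt", 0x20),                             # out of scope
    ],
    roots,
    queue,
)
print(list(queue))
```

Only the first file reaches the queue: the second carries the FANCI bit and the third lies outside the indexed location.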

Windows Search Configuration APIs are used to specify configuration settings, such as the roots of the URIs that need to be monitored and the frequency of crawling, and to view status information such as the number of items indexed, the length of the gather queue, or the reason for throttling the indexer. They also expose APIs to register protocol handlers (via the ISearchProtocol interface), property handlers (via the IPropertyStore interface), or IFilter implementations (via the IFilter interface). IFilter implementations allow only read-only extraction of text and properties, whereas IPropertyStore allows properties to be written as well.
