Indexing
ADS currently receives abstracts or tables of contents from almost two hundred journal sources. The service may receive data referring to the same article from multiple sources, and it merges them into a single bibliographic record, taking the most accurate data from each source. The use of TeX and LaTeX by almost all scientific journals greatly facilitates the incorporation of bibliographic data into the system in a standardized format, and importing HTML-coded web-based articles is also simple. ADS uses Perl scripts for importing, processing and standardizing bibliographic data.
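As a rough illustration of merging several source records into one, the sketch below picks each field from the most trusted source that supplies it. It is written in Python for readability rather than in the Perl that ADS uses, and the source names, field names and per-field priority table are all assumptions made for the example, not a description of the actual ADS pipeline.

```python
def merge_records(records, field_priority):
    """Combine per-source records into one record.

    records:        {source_name: {field: value}}   (hypothetical layout)
    field_priority: {field: [source_name, ...]} listing sources best-first
    """
    merged = {}
    fields = {field for record in records.values() for field in record}
    for field in sorted(fields):
        ranked = field_priority.get(field, [])
        # Ranked sources first, then any remaining sources in alphabetical order.
        order = [s for s in ranked if s in records]
        order += sorted(s for s in records if s not in ranked)
        for source in order:
            value = records[source].get(field)
            if value:  # skip missing or empty values
                merged[field] = value
                break
    return merged


# Hypothetical input: two feeds describing the same article.
records = {
    "publisher": {"title": "On Dust", "abstract": ""},
    "toc_feed": {"title": "ON DUST", "abstract": "We study dust.", "pages": "1-10"},
}
priority = {"title": ["publisher", "toc_feed"]}
merged = merge_records(records, priority)
```

Here the title comes from the publisher record, while the abstract and page range fall back to the table-of-contents feed because the publisher record leaves them empty.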
The apparently mundane task of converting author names into a standard Surname, Initial format is actually one of the more difficult tasks to automate, because naming conventions vary widely around the world and a given name such as Davis could be a first name, a middle name or a surname. Accurate conversion requires detailed knowledge of the names of authors active in astronomy, so ADS maintains an extensive database of author names, which is also used in searching the database (see below).
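The role of a name database in this conversion can be sketched as follows. This is a minimal Python illustration, not the ADS implementation: the `KNOWN_SURNAMES` set stands in for the author database, and the names in it and the fallback rule (treat the last word as the surname) are assumptions chosen for the example.

```python
# Hypothetical stand-in for a database of known author surnames.
KNOWN_SURNAMES = {"Davis", "van der Berg"}


def initials(given):
    """Turn given names such as 'Jane Q.' into initials such as 'J. Q.'."""
    return " ".join(p[0].upper() + "." for p in given.replace(".", " ").split())


def normalize_author(raw, known_surnames=KNOWN_SURNAMES):
    """Convert a free-form author name to 'Surname, Initials' form."""
    raw = " ".join(raw.split())  # collapse repeated whitespace
    if "," in raw:
        # 'Surname, Given' input: the comma already marks the surname.
        surname, given = (s.strip() for s in raw.split(",", 1))
    else:
        tokens = raw.split()
        # Fallback assumption: the last word is the surname.
        surname, given = tokens[-1], " ".join(tokens[:-1])
        # Prefer the longest trailing run of words that is a known surname.
        for i in range(1, len(tokens)):
            candidate = " ".join(tokens[i:])
            if candidate in known_surnames:
                surname, given = candidate, " ".join(tokens[:i])
                break
    given_initials = initials(given)
    return f"{surname}, {given_initials}" if given_initials else surname


print(normalize_author("Jane Davis"))          # Davis, J.
print(normalize_author("Davis Q. Jones"))      # Jones, D. Q.
print(normalize_author("Anna van der Berg"))   # van der Berg, A.
```

The second call shows the ambiguity described above: "Davis" is treated as a given name only because it is not in the final position. A real system needs far more knowledge than a flat surname list to resolve such cases reliably.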
For electronic articles, a list of the references given at the end of the article is easily extracted. For scanned articles, reference extraction relies on OCR. The reference database can then be "inverted" to list the citations for each paper in the database. Citation lists have been used in the past to identify popular articles missing from the database; mostly these were from before 1975 and have now been added to the system.
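The "inversion" of the reference database amounts to reversing a mapping from each paper to the papers it cites. The Python sketch below shows the idea on a toy dictionary; the identifiers, the in-memory data layout and the `min_citations` threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def invert_references(references):
    """Map each cited paper to the sorted list of papers that cite it.

    references: {citing_paper: [cited_paper, ...]}
    """
    citations = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            citations[cited].add(citing)
    return {paper: sorted(citers) for paper, citers in citations.items()}


def missing_but_cited(references, min_citations=2):
    """List cited papers that have no record of their own, most cited first."""
    citations = invert_references(references)
    missing = [
        (len(citers), paper)
        for paper, citers in citations.items()
        if paper not in references and len(citers) >= min_citations
    ]
    return [paper for _, paper in sorted(missing, key=lambda t: (-t[0], t[1]))]


# Toy reference lists keyed by hypothetical identifiers.
refs = {"A": ["B", "C", "X"], "B": ["C", "X"], "C": ["X"]}
citations = invert_references(refs)   # {"B": ["A"], "C": ["A", "B"], "X": ["A", "B", "C"]}
gaps = missing_but_cited(refs)        # ["X"]
```

The second function mirrors the use described above: a paper that is cited often but has no entry of its own is a candidate for addition to the database.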