Automatic Identification and Data Capture - Capturing Data From Printed Documents

One of the most useful applications of data capture is collecting information from paper documents and storing it in databases (CMS, ECM and other systems). Several basic technologies are used, depending on the type of data:

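OCR (optical character recognition) – for printed text
ICR (intelligent character recognition) – for hand-printed text
OMR (optical mark recognition) – for marks such as checkboxes
OBR (optical barcode recognition) – for barcodes
BCR (business card recognition) – for business cards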

These basic technologies extract information from paper documents so that it can be processed further in enterprise information systems such as ERP and CRM.
Documents for data capture fall into three groups: structured, semi-structured and unstructured.
Structured documents (questionnaires, tests, insurance forms, tax returns, ballots, etc.) share exactly the same structure and appearance. They are the easiest type for data capture, because every data field is located in the same place on every document.
Semi-structured documents (invoices, purchase orders, waybills, etc.) share the same structure, but their appearance depends on the number of items and other parameters. Capturing data from these documents is a complex but solvable task.
Unstructured documents (letters, contracts, articles, etc.) can vary freely in both structure and appearance.
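
As a rough illustration of why structured documents are easier to capture than semi-structured ones, the Python sketch below reads a structured form from fixed zones and a semi-structured invoice by searching for field labels. The Word and Zone types, the sample coordinates and the label names are all hypothetical; a real system would take word positions and text lines from an OCR engine rather than from hand-written lists.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognized word and its top-left corner (stand-in for OCR output)."""
    text: str
    x: int
    y: int


@dataclass(frozen=True)
class Zone:
    """A rectangular field area on a structured form template (hypothetical)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, word: Word) -> bool:
        return self.left <= word.x < self.right and self.top <= word.y < self.bottom


def extract_structured(words, template):
    """Structured form: read each field from the same fixed zone on every page."""
    fields = {}
    for name, zone in template.items():
        # Keep only the words inside the zone, in reading order.
        inside = sorted((w for w in words if zone.contains(w)),
                        key=lambda w: (w.y, w.x))
        fields[name] = " ".join(w.text for w in inside)
    return fields


def extract_semi_structured(lines, labels):
    """Semi-structured document: find each field by its label, on whatever line it appears."""
    fields = {}
    for name, label in labels.items():
        pattern = re.compile(re.escape(label) + r"\s*[:#]?\s*(.+)", re.IGNORECASE)
        for line in lines:
            match = pattern.search(line)
            if match:
                fields[name] = match.group(1).strip()
                break
    return fields


# Sample data, invented for this sketch.
FORM_TEMPLATE = {
    "name": Zone(left=100, top=50, right=400, bottom=80),
    "date": Zone(left=100, top=90, right=400, bottom=120),
}
form_words = [
    Word("Name:", 20, 55), Word("Jane", 110, 55), Word("Doe", 160, 55),
    Word("Date:", 20, 95), Word("2024-01-31", 110, 95),
]

INVOICE_LABELS = {"invoice_number": "Invoice No", "total": "Total"}
invoice_lines = [
    "ACME Ltd.",
    "Invoice No: 10023",
    "Widget  2  10.00",
    "Gadget  1  5.50",
    "Total: 25.50",
]

if __name__ == "__main__":
    print(extract_structured(form_words, FORM_TEMPLATE))
    print(extract_semi_structured(invoice_lines, INVOICE_LABELS))
```

The fixed-zone function only works while every page matches the template, which is the defining property of structured documents. The label search tolerates a varying number of item lines, but it still assumes the labels themselves are known in advance; unstructured documents offer neither guarantee.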

Developers, with their basic technologies, data capture applications and data capture SDKs:

ABBYY
Basic technologies: OCR (195 languages), ICR (113 languages), OMR, OBR, BCR
Data capture application: ABBYY FlexiCapture is intelligent data and document capture software that automates the processing of any type of structured, semi-structured and unstructured documents and forms.
Data capture SDK: ABBYY FlexiCapture Engine is a data and document capture SDK for any type of structured, semi-structured and unstructured documents and forms.

Accusoft
Basic technologies: OCR (118 languages), ICR (11 languages), OMR, OBR
Data capture SDK: ImageGear for .NET is an SDK that delivers fully managed code for WinForms, ASP.NET and WPF application development; an optional Recognition component adds an integrated OCR toolkit. FormSuite, available for .NET or ActiveX, is a structured forms processing SDK that covers forms processing from scanning to recognition; barcode recognition and creation can also be added.

AnyDoc Software
Basic technologies: OCR (4 languages), ICR, OMR, OBR
Data capture application: OCR for AnyDoc automates data capture from all business documents, including structured, semi-structured and unstructured documents, by incorporating AnyApp Technology for template-free processing.

Cvision Technologies
Basic technologies: OCR (60 languages), ICR (60 languages), OMR, OBR
Data capture application: Cvision's Trapeze is intelligent software that recognizes and captures text from structured, semi-structured and unstructured documents, including forms, invoices and EOBs.
Data capture SDK: The Trapeze SDK captures data from structured, semi-structured and unstructured documents, including forms, invoices and EOBs.

Expervision
Basic technologies: OCR (18 languages), ICR (18 languages), OMR, OBR, BCR
Data capture application: Expervision TypeReader automatically processes full-text documents. According to the vendor, it can process more than 100 pages per minute while maintaining recognition accuracy.
Data capture SDK: Expervision OpenRTK Engine is an intelligent data capture and document processing SDK. Its language support is extensible: in principle, additional languages can be added and the engine can be trained on particular fonts to meet custom requirements. Custom API definition and development are supported.

I.R.I.S. Group
Basic technologies: OCR (120 languages), ICR (Latin-based languages), OMR, OBR, BCR
Data capture application: IRISCapture for Invoices is an invoice processing solution. IRISCapture Pro for Forms is an intelligent software suite that automatically captures, sorts and identifies all types of documents and forms.

LEADTOOLS
Basic technologies: OCR (118 languages), ICR (15 languages), OMR, OBR, BCR
Data capture SDK: The LEADTOOLS Forms Recognition module is a .NET SDK that uses LEAD's image processing technology to identify form components and features, which can then be used to recognize structured forms.

Nuance Communications
Basic technologies: OCR (120 languages), ICR, OMR, OBR, BCR
Data capture application: OmniPage Professional 17 handles structured forms from start to finish: paper forms can be turned into electronic forms and the data then collected.
Data capture SDK: OmniPage Capture SDK for Windows, with its Logical Form Recognition (LFR), automates form template creation and structured forms processing.

PSIGEN Software
Basic technologies: OCR (99 languages), ICR, OMR, OBR, BCR (1D and 2D)
Data capture application: PSI:Capture is a complete capture solution that includes all the functionality required to automatically process structured and semi-structured documents, including invoices, forms and general mail. The vendor highlights its dynamic interface to SharePoint as a key strength.
