Confusion set based method and system for correcting misrecognized words appearing in documents generated by an optical character recognition technique

United States of America Patent

PATENT NO 6205261
SERIAL NO 09018575

Abstract

A method and apparatus for correcting misrecognized words appearing in electronic documents that have been generated by scanning an original document in accordance with an optical character recognition ('OCR') technique. If an incorrect word is found in the electronic document, the present invention generates at least one reference word and selects the reference word that is the most likely correct replacement for the incorrect word. This selection is accomplished by comparing each character member of every reference word to a plurality of confusion sets. On the basis of this comparison, the reference words are reduced to a smaller candidate set of reference words, from which a reference word for replacing the incorrect word is selected on the basis of predetermined criteria.
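
The selection procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The confusion sets, the word list, the equal-length restriction, and the tie-breaking rule (fewest differing characters) are all assumptions made for illustration, since the abstract does not specify the sets or the "predetermined criteria". The names `CONFUSION_SETS`, `confusable`, `candidate_set`, and `correct_word` are hypothetical.

```python
from typing import Iterable, List, Optional

# Hypothetical confusion sets: groups of characters that an OCR engine
# might mistake for one another. Real sets would be derived empirically.
CONFUSION_SETS = [
    frozenset("il1"),
    frozenset("o0"),
    frozenset("ce"),
    frozenset("s5"),
]


def confusable(a: str, b: str) -> bool:
    """Return True if a and b are identical or share a confusion set."""
    return a == b or any(a in s and b in s for s in CONFUSION_SETS)


def candidate_set(incorrect: str, reference_words: Iterable[str]) -> List[str]:
    """Reduce the reference words to those that could plausibly have been
    misread as `incorrect`.

    Simplifying assumption: a candidate must have the same length as the
    incorrect word, and each of its characters must equal, or be confusable
    with, the character at the same position in the incorrect word.
    """
    return [
        word
        for word in reference_words
        if len(word) == len(incorrect)
        and all(confusable(x, y) for x, y in zip(incorrect, word))
    ]


def correct_word(incorrect: str, reference_words: Iterable[str]) -> Optional[str]:
    """Pick a replacement for `incorrect`, or None if no candidate survives.

    Assumed selection criterion: the candidate with the fewest differing
    characters wins; ties go to the earliest candidate.
    """
    candidates = candidate_set(incorrect, reference_words)
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda word: sum(x != y for x, y in zip(incorrect, word)),
    )
```

For example, under these assumed sets, `correct_word("he1lo", ["hello", "help", "hallo", "jello"])` keeps only "hello" as a candidate, because "1" and "l" share a confusion set while the other words differ in length or in a non-confusable character.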

Patent Owner(s)

  • AT&T CORP.

Inventor(s)

Inventor Name        Address          # of Filed Patents    Total Citations
Goldberg, Randy G    Princeton, NJ    58                    2757
