Document page analyzer and method

Number of patents in Portfolio can not be more than 2000

United States of America Patent

PATENT NO 5848184
SERIAL NO

08496917

Stats

ATTORNEY / AGENT: (SPONSORED)

Importance

Loading Importance Indicators... loading....

Abstract

See full text

Apparatus and method are provided which determine the geometric and logical structure of a document page from its image. The document image is partitioned into regions (both text and non-text) which are then organized into related 'article' groups in the correct reading order. The invention uses image-based features, text-based features, and assumptions based on knowledge of expected layout, to find the correct reading order of the text blocks on a document page. It can handle complex layouts which have multiple configurations of columns on a page and inset material (such as figures and inset text blocks). The apparatus comprises two main components, a geometric page segmentor and a logical page organizer. The geometric page segmentor partitions a binary image of a document page into fundamental units of text or non-text, and produces a list of rectangular blocks, their locations on the page in points (1/72 inch), and the locations of horizontal rule lines on the page. The logical page organizer receives a list of text region locations, rule line locations, associated ASCII text (as found from an OCR) for the text blocks, and a list of text attributes (such as face style and point size). The logical page organizer groups appropriately the components (both text and non-text) which comprise a document page, sequences them in a correct reading order and establishes the dominance pattern (e.g., find the lead article on a newspaper page).

Loading the Abstract Image... loading....

First Claim

See full text

Family

Loading Family data... loading....

Patent Owner(s)

Patent OwnerAddress
UNISYS CORPORATIONBLUE BELL PA

International Classification(s)

  • [Classification Symbol]
  • [Patents Count]

Inventor(s)

Inventor Name Address # of filed Patents Total Citations
Lipshutz, Mark Philadelphia, PA 1 208
Taylor, Suzanne Amy West Chester, PA 1 208

Cited Art Landscape

Load Citation

Patent Citation Ranking

Forward Cite Landscape

Load Citation