Methods and apparatus for storing and processing natural language text data as a sequence of fixed length integers

United States of America Patent

APP PUB NO 20020165707A1
SERIAL NO 09793267

Abstract

A mechanism for processing natural language text data more rapidly and storing it more compactly in a memory array of 16-bit integers, each integer identifying an individual term of the text that is stored in a term lookup table. The original text is parsed into a sequence of substrings consisting of alternating alphanumeric terms and intervening punctuation strings. Each substring (with the exception of a single space between adjacent alphanumeric terms) is translated into an identifying integer placed in the memory array. To convert a given term into its identifying integer, the term lookup table is searched for a previously stored term that matches the given term; if a match is found, the given term is converted into the integer that identifies the matching term. If no match is found, the given term is stored in an available empty location in the term lookup table and is converted into the integer that addresses that location. High-speed term-to-integer conversion is performed using a vectored binary tree as the term lookup table. High-speed searches are performed by scanning the memory array for the integers that identify target words, and additional lookup tables, also addressable by a given term's identifying integer, may be employed to determine attributes of that term. A text file manipulation program uses the integer array to rapidly search, display, categorize, annotate, and highlight the text of a natural language text database. Highlighted passages are specified by their starting and ending positions in the integer array and are characterized by stored data specifying the highlight color, annotation text, and one or more category codes associated with the passage. A keyword-in-context listing may also be displayed, presenting a sorted list of all phrases beginning with any term in a user-specified term list.
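The encoding scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: a plain dict stands in for the vectored binary tree, and the names `TermTable`, `encode`, `decode`, and `find` are hypothetical, as is the choice to treat only ASCII letters and digits as alphanumeric. The `is_word` list is an assumed example of an additional attribute table addressed by a term's identifying integer.

```python
import re
from array import array

# Alternating runs: ASCII alphanumeric terms, or everything in between.
# (Restricting terms to ASCII is an assumption of this sketch.)
_TOKEN = re.compile(r"[A-Za-z0-9]+|[^A-Za-z0-9]+")
_WORD = re.compile(r"[A-Za-z0-9]+")


class TermTable:
    """Hypothetical term lookup table; a dict replaces the vectored binary tree."""

    def __init__(self):
        self.terms = []    # identifying integer -> term string
        self.ids = {}      # term string -> identifying integer
        self.is_word = []  # attribute table indexed by the same integer

    def intern(self, term):
        """Return the id of a matching stored term, storing the term first if new."""
        term_id = self.ids.get(term)
        if term_id is None:
            term_id = len(self.terms)  # next available empty location
            if term_id > 0xFFFF:
                raise OverflowError("term id does not fit in 16 bits")
            self.terms.append(term)
            self.is_word.append(_WORD.fullmatch(term) is not None)
            self.ids[term] = term_id
        return term_id


def encode(text, table):
    """Translate text into an array of unsigned 16-bit term ids."""
    tokens = _TOKEN.findall(text)
    codes = array("H")  # unsigned short: 2 bytes per item on common platforms
    last = len(tokens) - 1
    for i, tok in enumerate(tokens):
        # Tokens alternate, so an interior " " always sits between two terms;
        # it is left implicit and restored by decode().
        if tok == " " and 0 < i < last:
            continue
        codes.append(table.intern(tok))
    return codes


def decode(codes, table):
    """Rebuild the original text, reinserting the implicit single spaces."""
    parts = []
    prev_was_word = False
    for code in codes:
        if table.is_word[code] and prev_was_word:
            parts.append(" ")
        parts.append(table.terms[code])
        prev_was_word = table.is_word[code]
    return "".join(parts)


def find(codes, table, word):
    """Scan the integer array for every position of a target word."""
    target = table.ids.get(word)
    if target is None:
        return []  # the word never occurs in the encoded text
    return [i for i, code in enumerate(codes) if code == target]
```

Under these assumptions, `decode(encode(text, table), table)` returns the original text, and a word search compares integers rather than strings. Positions returned by `find` index into the integer array, which is the same coordinate system the abstract uses for the start and end of highlighted passages.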

Patent Owner(s)

Patent Owner      Address
CALL CHARLES G    Not Provided

Inventor(s)

Inventor Name     Address      # of Filed Patents   Total Citations
Call, Charles G   Boston, MA   30                   6767
