Method and apparatus for improved tokenization of natural language text

United States of America Patent

PATENT NO 5890103
SERIAL NO 08684002

Abstract

This invention improves information retrieval by providing a tokenizing apparatus and method that parses natural language text in a manner that increases the throughput of an information retrieval or natural language analysis system. The tokenizer includes a parser that extracts characters from the stream of text, an identifying element for identifying a token formed of characters in the stream of text that include lexical matter, and a filter for assigning tags to those tokens requiring further linguistic analysis. The tokenizer, in a single pass through the stream of text, determines the further linguistic processing suitable to each particular token contained in the stream of text.
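The single-pass idea in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the `Token` class, the `tokenize` function, and the tag names (`CAPITALIZED`, `DIGIT`, `HYPHEN`, `APOSTROPHE`) are hypothetical choices made for illustration, and "lexical matter" is assumed here to mean alphanumeric characters. The sketch reads each character once, groups characters into tokens, and attaches tags that a later stage could use to decide which tokens need further linguistic analysis.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Set

# Hypothetical mapping: punctuation allowed inside a token, and the tag it triggers.
INTERNAL_PUNCT = {"-": "HYPHEN", "'": "APOSTROPHE"}


@dataclass
class Token:
    text: str
    start: int                      # character offset of the token in the input
    tags: Set[str] = field(default_factory=set)


def tokenize(text: str) -> Iterator[Token]:
    """Yield tagged tokens in one left-to-right pass over the text."""
    buf: List[str] = []
    start = 0
    tags: Set[str] = set()
    # A trailing space acts as a sentinel so the final token is flushed.
    for i, ch in enumerate(text + " "):
        if ch.isalnum():
            if not buf:
                start = i
                if ch.isupper():
                    tags.add("CAPITALIZED")
            if ch.isdigit():
                tags.add("DIGIT")
            buf.append(ch)
        elif ch in INTERNAL_PUNCT and buf:
            # Punctuation is kept only when it follows lexical characters.
            tags.add(INTERNAL_PUNCT[ch])
            buf.append(ch)
        elif buf:
            # Any other character ends the current token.
            yield Token("".join(buf), start, tags)
            buf, tags = [], set()


if __name__ == "__main__":
    for tok in tokenize("The state-of-the-art model didn't cost 42 dollars."):
        print(tok.start, tok.text, sorted(tok.tags))
```

In this sketch, a token with an empty tag set is one that a downstream component could pass through without extra analysis, while tagged tokens would be routed to whatever processing the tag suggests. Tagging during the same pass that finds token boundaries is what avoids a second scan of the text.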

Patent Owner(s)

  • VANTAGE TECHNOLOGY HOLDINGS, LLC

Inventor(s)

Inventor Name     Address      # of Filed Patents   Total Citations
Carus, Alwin B    Newton, MA   41                   2528