Tokenizer for a natural language processing system

United States of America Patent

PATENT NO 7092871
APP PUB NO 20030023425A1
SERIAL NO 09822976

Abstract

The present invention is a segmenter used in a natural language processing system. The segmenter segments a textual input string into tokens for further natural language processing. In accordance with one feature of the invention, the segmenter includes a tokenizer engine that proposes segmentations and submits them to a linguistic knowledge component for validation. In accordance with another feature of the invention, the segmentation system includes language-specific data that contains a precedence hierarchy for punctuation. If proposed tokens in the input string contain punctuation, they can illustratively be broken into subtokens based on the precedence hierarchy.
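The two features described in the abstract can be illustrated with a minimal sketch. The abstract does not specify an algorithm, so everything below is an assumption made for illustration: the names `PUNCT_PRECEDENCE`, `LEXICON`, `is_valid`, `split_token`, and `segment` are hypothetical, the linguistic knowledge component is reduced to a lexicon lookup, and proposed tokens are taken to be whitespace-delimited strings.

```python
# Hypothetical language-specific data: punctuation marks ordered from
# highest to lowest precedence (earlier entries are split on first).
PUNCT_PRECEDENCE = ["/", "-", ".", "'"]

# Stand-in for the linguistic knowledge component: a tiny lexicon.
LEXICON = {"e.g.", "don't", "read", "write", "well-known",
           "state", "of", "the", "art"}


def is_valid(token):
    """Validate a proposed token (assumed here to be a lexicon lookup)."""
    return token.lower() in LEXICON


def split_token(token):
    """Break a proposed token into subtokens using the precedence hierarchy."""
    if not token:
        return []
    if is_valid(token):
        return [token]  # validated as-is, so internal punctuation is kept
    for mark in PUNCT_PRECEDENCE:
        if mark in token:
            # Split at the first occurrence of the highest-precedence mark,
            # then submit each side for validation again.
            left, _, right = token.partition(mark)
            return split_token(left) + [mark] + split_token(right)
    return [token]  # nothing left to split on; pass through unvalidated


def segment(text):
    """Propose whitespace-delimited tokens and refine each one."""
    tokens = []
    for proposed in text.split():
        tokens.extend(split_token(proposed))
    return tokens


print(segment("read/well-known"))   # ['read', '/', 'well-known']
print(segment("state-of-the-art"))  # ['state', '-', 'of', '-', 'the', '-', 'art']
```

In this sketch, "read/well-known" is split at the slash before the hyphen is considered, because the slash ranks higher in the assumed hierarchy; "well-known" then validates as a whole and is left intact.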

Patent Owner(s)

Patent Owner | Address
MICROSOFT TECHNOLOGY LICENSING LLC | ONE MICROSOFT WAY REDMOND WA 98052

Inventor(s)

Inventor Name | Address | # of Filed Patents | Total Citations
Bradlee, David G | Seattle, WA | 4 | 128
Knoll, Sonja S | Redmond, WA | 7 | 202
Pentheroudakis, Joseph E | Seattle, WA | 5 | 134
