System and method for context-dependent probabilistic modeling of words and documents

United States of America Patent

PATENT NO 6925433
APP PUB NO 20020194158A1
SERIAL NO 09851675

Abstract

A computer-implemented system and method are disclosed for retrieving documents using context-dependent probabilistic modeling of words and documents. The present invention uses multiple overlapping vectors to represent each document. Each vector is centered on one of the words in the document and consists of that word's local environment, i.e., the words that occur close to it. The vectors are used to build probability models that are used for predictions. In one aspect of the invention, a method of context-dependent probabilistic modeling of documents is provided wherein the text of one or more documents is input into the system, each document including human-readable words. Context windows are then created around each word in each document. A statistical evaluation of the characteristics of each window is then generated, where the results of the statistical evaluation are not a function of the order in which words appear within each window. The statistical evaluation includes counting the occurrences of particular words and particular documents and tabulating the totals of the counts. The results of the statistical evaluation for each window are then combined. These results are then used for retrieving a document, for extracting features from a document, or for finding a word within a document based on its resulting statistics.
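The windowing and counting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not specify the probability model, so this sketch substitutes a naive Bayes style score with add-one smoothing. The function names (`context_windows`, `build_counts`, `rank`), the `radius` parameter, and the sample documents are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def context_windows(words, radius=2):
    """Yield (center_word, context) for every position in `words`.

    The context is a Counter (a bag of words), so the order in which
    words appear inside the window is deliberately discarded.
    """
    for i, center in enumerate(words):
        lo, hi = max(0, i - radius), min(len(words), i + radius + 1)
        yield center, Counter(words[lo:i] + words[i + 1:hi])


def build_counts(docs, radius=2):
    """Tabulate context-word counts per document, plus per-document totals."""
    word_doc = defaultdict(Counter)  # word_doc[doc_id][word] -> count
    doc_totals = Counter()           # doc_totals[doc_id] -> sum of all counts
    for doc_id, text in docs.items():
        for _, context in context_windows(text.lower().split(), radius):
            word_doc[doc_id].update(context)
            doc_totals[doc_id] += sum(context.values())
    return word_doc, doc_totals


def rank(query, word_doc, doc_totals):
    """Rank documents by a smoothed log-probability of the query words.

    Assumption: naive Bayes with add-one smoothing stands in for the
    unspecified probability model.
    """
    vocab_size = len({w for counts in word_doc.values() for w in counts})
    scores = {}
    for doc_id, counts in word_doc.items():
        scores[doc_id] = sum(
            math.log((counts[w] + 1) / (doc_totals[doc_id] + vocab_size))
            for w in query.lower().split()
        )
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    docs = {
        "a": "the cat sat on the mat",
        "b": "stock prices fell on the market today",
    }
    word_doc, doc_totals = build_counts(docs)
    print(rank("cat mat", word_doc, doc_totals))
```

Because each window is stored as a bag of words, permuting the words inside a window leaves its counts unchanged, which matches the order-independence property stated in the abstract. Summing the per-window counts into `word_doc` and `doc_totals` corresponds to combining the results across windows.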

Patent Owner(s)

Patent Owner: NUANCE COMMUNICATIONS INC
Address: 1 WAYSIDE ROAD BURLINGTON MA 01803

Inventor(s)

Inventor Name: Stensmo, Jan Magnus
Address: San Jose, CA
# of filed Patents: 12
Total Citations: 156
