Method for learning to infer the topical content of documents based upon their lexical content

United States of America Patent

PATENT NO 5687364
SERIAL NO 08308037

Abstract

An unsupervised method of learning the relationships between words and unspecified topics in documents using a computer is described. The computer represents these relationships via word clusters and association strength values, which can be used later during topical characterization of documents. The computer learns the relationships iteratively from a set of learning documents. It first preprocesses the learning documents by generating an observed feature vector for each document in the set and by setting the association strengths to initial values. The computer then determines how well the current association strength values predict the topical content of all of the learning documents by generating a cost for each document and summing the individual costs to produce a total cost. If the total cost is excessive, the association strength values are modified and the total cost is recalculated. The computer continues calculating the total cost and modifying the association strength values until it finds a set of association strength values that adequately predicts the topical content of the entire set of learning documents.

Patent Owner(s)

Patent Owner        Address
XEROX CORPORATION   201 MERRITT 7, NORWALK, CT 06851-1056

Inventor(s)

Inventor Name     Address             # of Filed Patents   Total Citations
Hearst, Marti A   San Francisco, CA   4                    414
Saund, Eric       San Carlos, CA      66                   2379
