Method and apparatus for establishing topic word classes based on an entropy cost function to retrieve documents represented by the topic words

United States of America Patent

PATENT NO 6128613
SERIAL NO 09069618

Abstract

A computer-based method and system for establishing topic words to represent a document, the topic words being suitable for use in document retrieval. The method includes determining document keywords from the document; classifying each of the document keywords into one of a plurality of preestablished keyword classes; and selecting words as the topic words, each selected word from a different one of the preestablished keyword classes, to minimize a cost function on proposed topic words. The cost function may be a metric of dissimilarity, such as cross-entropy, between a first distribution of likelihood of appearance by the plurality of document keywords in a typical document and a second distribution of likelihood of appearance by the plurality of document keywords in a typical document, the second distribution being approximated using proposed topic words. The cost function can be a basis for sorting the priority of the documents.
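
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation, and everything concrete in it is an assumption made for illustration: the function names `cross_entropy` and `select_topic_words`, the sample keywords and class labels, the probability values, the choice to draw candidate topic words from the document's own keywords, and the simplification that the approximated distribution depends only on the topic word chosen for a keyword's own class (which lets each class be minimized independently).

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_k p[k] * log(q[k]) over the keywords in p.

    eps guards against log(0) when q assigns a keyword zero likelihood.
    """
    return -sum(pk * math.log(max(q.get(k, 0.0), eps)) for k, pk in p.items())


def select_topic_words(doc_keywords, keyword_class, p, cond):
    """Pick one topic word per keyword class by minimizing cross-entropy.

    doc_keywords : keywords determined from the document
    keyword_class: keyword -> class label (the preestablished classes)
    p            : keyword -> assumed first distribution (likelihood of appearance)
    cond         : (keyword, topic_word) -> assumed likelihood of the keyword
                   given the topic word; used to build the second,
                   approximated distribution (hypothetical model)
    """
    # Classify each document keyword into its preestablished class.
    by_class = {}
    for k in doc_keywords:
        by_class.setdefault(keyword_class[k], []).append(k)

    topics, total_cost = {}, 0.0
    for label, members in sorted(by_class.items()):
        p_class = {k: p[k] for k in members}
        best_word, best_cost = None, math.inf
        # Assumption: candidates are the document keywords in this class.
        for candidate in members:
            q = {k: cond.get((k, candidate), 0.0) for k in members}
            cost = cross_entropy(p_class, q)
            if cost < best_cost:
                best_word, best_cost = candidate, cost
        topics[label] = best_word
        total_cost += best_cost
    return topics, total_cost


# Illustrative data only; none of these values come from the patent.
keywords = ["entropy", "probability", "retrieval", "index"]
classes = {"entropy": "math", "probability": "math",
           "retrieval": "ir", "index": "ir"}
p = {"entropy": 0.2, "probability": 0.3, "retrieval": 0.4, "index": 0.1}
cond = {
    ("entropy", "entropy"): 0.9, ("probability", "entropy"): 0.5,
    ("entropy", "probability"): 0.3, ("probability", "probability"): 0.9,
    ("retrieval", "retrieval"): 0.9, ("index", "retrieval"): 0.6,
    ("retrieval", "index"): 0.2, ("index", "index"): 0.9,
}

topics, cost = select_topic_words(keywords, classes, p, cond)
print(topics, cost)
```

With this sample data the sketch selects one topic word for each of the two classes. The returned total cost is the kind of quantity the abstract says can be used to sort documents by priority, with a lower cost indicating that the topic words represent the document's keywords more closely.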

Patent Owner(s)

Patent Owner: CHINESE UNIVERSITY OF HONG KONG, THE
Address: SHATIN N T HONG KONG

Inventor(s)

Inventor Name    Address       # of Filed Patents    Total Citations
Qin, An          Shatin, CN    3                     132
Wong, Wing S     Shatin, CN    2                     160
