Phrase-based detection of duplicate documents in an information retrieval system

Number of patents in Portfolio can not be more than 2000

United States of America Patent

PATENT NO 8108412
APP PUB NO 20100161625A1
SERIAL NO

12717687

Stats

ATTORNEY / AGENT: (SPONSORED)

Importance

Loading Importance Indicators... loading....

Abstract

See full text

An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are the indexed according to their included phrases. Related phrases and phrase extensions are also identified. Phrases in a query are identified and used to retrieve and rank documents. Phrases are also used to cluster documents in the search results, create document descriptions, and eliminate duplicate documents from the search results, and from the index.

Loading the Abstract Image... loading....

First Claim

See full text

Family

Loading Family data... loading....

Patent Owner(s)

Patent OwnerAddress
GOOGLE LLC1600 AMPHITHEATRE PARKWAY MOUNTAIN VIEW CA 94043

International Classification(s)

  • [Classification Symbol]
  • [Patents Count]

Inventor(s)

Inventor Name Address # of filed Patents Total Citations
Patterson, Anna L San Jose, US 20 575

Cited Art Landscape

Load Citation

Patent Citation Ranking

Forward Cite Landscape

Load Citation