Document clustering that applies a locality sensitive hashing function to a feature vector to obtain a limited set of candidate clusters

United States of America Patent

PATENT NO 7797265
APP PUB NO 20080205774A1
SERIAL NO 12072179

Abstract

Documents from a data stream are clustered by first generating a feature vector for each document. A locality-sensitive hashing function is applied to the document's feature vector to retrieve a set of candidate cluster centroids (e.g., the feature vectors of the corresponding clusters) from a memory. The centroids may be retrieved in two steps: first, a set of cluster identifiers, each indicative of a respective cluster centroid, is retrieved from a cluster table; second, the cluster centroids corresponding to those identifiers are retrieved from a memory. The document may then be clustered into one or more of the candidate clusters using distance measures from its feature vector to the cluster centroids.
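The retrieval flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name a hash family or a distance measure, so the random-hyperplane hashing, the cosine distance, the `max_distance` threshold, and all names (`LshClusterIndex`, `assign`, and so on) are assumptions chosen for illustration. The sketch also assigns each document to only its nearest candidate and does not update centroids after assignment.

```python
import math
import random
from collections import defaultdict


class LshClusterIndex:
    """Toy index: LSH key -> cluster ids (cluster table), cluster id -> centroid."""

    def __init__(self, dim, n_bits=8, n_tables=4, seed=0):
        rng = random.Random(seed)
        # Assumption: random-hyperplane LSH, one set of hyperplanes per table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.cluster_table = [defaultdict(set) for _ in range(n_tables)]
        self.centroids = {}  # cluster id -> centroid vector

    def _keys(self, vec):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return [
            tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in table)
            for table in self.planes
        ]

    def add_cluster(self, cluster_id, centroid):
        self.centroids[cluster_id] = list(centroid)
        for table, key in zip(self.cluster_table, self._keys(centroid)):
            table[key].add(cluster_id)

    def candidates(self, vec):
        # Step 1: look up cluster identifiers in the cluster table.
        ids = set()
        for table, key in zip(self.cluster_table, self._keys(vec)):
            ids |= table.get(key, set())
        # Step 2: fetch the centroids for those identifiers.
        return {cid: self.centroids[cid] for cid in ids}


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def assign(index, vec, max_distance=0.3):
    """Return the nearest candidate cluster id, or open a new cluster."""
    cands = index.candidates(vec)
    scored = sorted((cosine_distance(vec, c), cid) for cid, c in cands.items())
    if scored and scored[0][0] <= max_distance:
        return scored[0][1]
    new_id = len(index.centroids)  # hypothetical id scheme: sequential integers
    index.add_cluster(new_id, vec)
    return new_id
```

Because only the clusters sharing a hash key with the document are compared, the distance computation runs over a limited candidate set rather than over every stored centroid. Using several hash tables, as assumed here, is one common way to reduce the chance that a nearby centroid is missed.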

Patent Owner(s)

  • SIEMENS CORPORATION

Inventor(s)

Inventor Name | Address | # of Filed Patents | Total Citations
Brinker, Klaus | Princeton, US | 6 | 438
Glomann, Bernhard | Bayonne, US | 5 | 383
Moerchen, Fabian | Princeton, US | 14 | 659
Neubauer, Claus | Monmouth Junction, US | 39 | 1037
