Online document clustering using TFIDF and predefined time windows

United States of America Patent

PATENT NO 7711668
APP PUB NO 20080205775A1
SERIAL NO 12072254

Abstract

Documents from a data stream are clustered by first generating a feature vector for each document. A set of cluster centroids (e.g., the feature vectors of their corresponding clusters) is then retrieved from a memory based on the feature vector of the document and the relative age of each cluster centroid. The centroids may be retrieved in two steps: first retrieving a set of cluster identifiers from a cluster table, each identifier indicating a respective cluster centroid, and then retrieving the corresponding cluster centroids from the memory. The list of cluster identifiers in the cluster table may be maintained based on the relative age of the corresponding cluster centroids: identifiers whose centroids have a relative age exceeding a predetermined threshold are periodically removed from the list.
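
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: documents arrive as token lists, the feature vector is a smoothed TF-IDF vector, similarity is cosine similarity, the cluster table is keyed by term, and "relative age" is measured as the time since a cluster last absorbed a document. All names (OnlineClusterer, window, min_similarity) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tfidf(tokens, doc_freq, n_docs):
    """TF-IDF weights for one document as a sparse dict (smoothed IDF, an assumption)."""
    counts = Counter(tokens)
    return {
        term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
        for term, tf in counts.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class OnlineClusterer:
    """Hypothetical sketch: term-keyed cluster table plus a centroid store."""

    def __init__(self, window, min_similarity):
        self.window = window                    # maximum allowed centroid age
        self.min_similarity = min_similarity    # threshold for joining a cluster
        self.doc_freq = Counter()               # term -> number of documents seen
        self.n_docs = 0
        self.cluster_table = defaultdict(set)   # term -> cluster identifiers
        self.centroids = {}                     # id -> [vector, size, last_seen]
        self.next_id = 0

    def prune(self, now):
        """Remove identifiers whose centroids are older than the window."""
        expired = [cid for cid, (_, _, seen) in self.centroids.items()
                   if now - seen > self.window]
        for cid in expired:
            vec, _, _ = self.centroids.pop(cid)
            for term in vec:
                self.cluster_table[term].discard(cid)

    def add(self, tokens, now):
        """Assign one document to a cluster and return the cluster identifier."""
        self.prune(now)
        self.n_docs += 1
        self.doc_freq.update(set(tokens))
        vec = tfidf(tokens, self.doc_freq, self.n_docs)

        # Step 1: look up candidate identifiers in the cluster table.
        candidates = set()
        for term in vec:
            candidates |= self.cluster_table.get(term, set())

        # Step 2: fetch the candidate centroids and pick the most similar one.
        best_id, best_sim = None, 0.0
        for cid in candidates:
            sim = cosine(vec, self.centroids[cid][0])
            if sim > best_sim:
                best_id, best_sim = cid, sim

        if best_id is not None and best_sim >= self.min_similarity:
            # Fold the document into the centroid as a running mean.
            centroid, size, _ = self.centroids[best_id]
            for term in set(centroid) | set(vec):
                centroid[term] = (centroid.get(term, 0.0) * size
                                  + vec.get(term, 0.0)) / (size + 1)
            self.centroids[best_id] = [centroid, size + 1, now]
            cid = best_id
        else:
            # No sufficiently similar live cluster: start a new one.
            cid = self.next_id
            self.next_id += 1
            self.centroids[cid] = [dict(vec), 1, now]

        for term in vec:
            self.cluster_table[term].add(cid)
        return cid


clusterer = OnlineClusterer(window=10, min_similarity=0.3)
first = clusterer.add("stock market falls sharply".split(), now=0)
second = clusterer.add("stock market falls again".split(), now=1)
print(first == second)
```

In this sketch only centroids that share at least one term with the incoming document are fetched and compared, and pruning runs before every insertion rather than on a separate periodic schedule. Both are simplifications chosen to keep the example short.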

Patent Owner(s)

  • SIEMENS CORPORATION

Inventor(s)

Inventor Name     | Address               | # of Filed Patents | Total Citations
Brinker, Klaus    | Princeton, US         | 6                  | 439
Glomann, Bernhard | Bayonne, US           | 5                  | 384
Moerchen, Fabian  | Princeton, US         | 14                 | 662
Neubauer, Claus   | Monmouth Junction, US | 39                 | 1039
