Method and system for similarity search and clustering

Number of patents in Portfolio can not be more than 2000

United States of America Patent

APP PUB NO 20030120630A1
SERIAL NO

10027195

Stats

ATTORNEY / AGENT: (SPONSORED)

Importance

Loading Importance Indicators... loading....

Abstract

See full text

Provided is a similarity search method that makes use of a localized distance metric. The data includes a collection of items, wherein each item is associated with a set of properties. The distance between two items is defined in terms of the number of items in the collection that are associated with the set of properties common to the two items. A query is generally composed of a set of properties. The distance between a query and an item is defined in terms of the number of items in the collection that are associated with the set of properties common to the query and the item. The properties can be of various types, such as binary, partially ordered, or numeric. The distance metric may be applied explicitly or implicitly for similarity search. One embodiment of this invention uses random walks such that the similarity search can be performed exactly or approximately, trading-off between accuracy and performance. The distance metric of the present invention can also be the basis for matching and clustering applications. In these contexts, the distance metric of the present invention may be used to build a graph, to which matching or clustering algorithms can be applied.

Loading the Abstract Image... loading....

First Claim

See full text

Family

Loading Family data... loading....

Patent Owner(s)

Patent OwnerAddress
ENDECA TECHNOLOGIES INC5TH FLOOR 215 FIRST STREET CAMBRIDGE MA 02142

International Classification(s)

  • [Classification Symbol]
  • [Patents Count]

Inventor(s)

Inventor Name Address # of filed Patents Total Citations
Tunkelang, Daniel Cambridge, MA 41 2098

Cited Art Landscape

Load Citation

Patent Citation Ranking

Forward Cite Landscape

Load Citation