Method for organizing structurally similar web pages from a web site

United States of America Patent

PATENT NO 7941420
APP PUB NO 20090049062A1
SERIAL NO 11838351

Abstract

Techniques are described for organizing structurally similar web pages from a website. A fingerprint of each web page's structure is computed by shingling: the page's HTML tags and attributes are placed in sequence and encoded using a standard encoding technique. Fixed-size portions of the encoded sequence are then taken, and a set of values is extracted from them using independent hash functions to compute the shingles. Alternatively, a DOM tree representation of the page's HTML is generated, each path of the DOM tree is encoded, and values are extracted using independent hash functions to compute the shingles. A specified number of shingles is retained as the fingerprint. The pages are then clustered based on their URLs and the similarity of their shingles. The resulting hierarchical organization of clustered pages is further pruned by various criteria, including the similarity of shingles and the support of each cluster node in the hierarchy.
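The tag-sequence variant described in the abstract can be illustrated with a short sketch. This is not the patented implementation. It assumes that tag and attribute names alone represent the structure, that a window of four tokens is a reasonable "fixed-size portion", and that seeded SHA-1 digests can stand in for the independent hash functions. Keeping the minimum hash value per function (min-hashing) is likewise an assumption about how the retained shingles are chosen. All function names and parameters here are hypothetical.

```python
import hashlib
from html.parser import HTMLParser


class TagSequence(HTMLParser):
    """Collects tag names and attribute names in document order."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Text content and attribute values are ignored: only structure counts.
        self.tokens.append(tag)
        self.tokens.extend(name for name, _ in attrs)


def structure_tokens(html):
    parser = TagSequence()
    parser.feed(html)
    parser.close()
    return parser.tokens


def seeded_hash(seed, text):
    # Assumption: a seed prefix on SHA-1 approximates independent hash functions.
    digest = hashlib.sha1(f"{seed}:{text}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fingerprint(html, window=4, num_hashes=8):
    """Return one min-hash value per hash function over fixed-size token windows."""
    tokens = structure_tokens(html)
    if len(tokens) < window:
        windows = [" ".join(tokens)]
    else:
        windows = [
            " ".join(tokens[i:i + window])
            for i in range(len(tokens) - window + 1)
        ]
    return tuple(
        min(seeded_hash(seed, w) for w in windows) for seed in range(num_hashes)
    )


def similarity(fp_a, fp_b):
    """Fraction of matching fingerprint positions (estimates set resemblance)."""
    return sum(a == b for a, b in zip(fp_a, fp_b)) / len(fp_a)


page_a = '<html><body><div class="x"><p>First article</p></div></body></html>'
page_b = '<html><body><div class="y"><p>Second article</p></div></body></html>'
page_c = '<table><tr><td id="a">1</td><td>2</td></tr></table>'

same_template = similarity(fingerprint(page_a), fingerprint(page_b))
different_template = similarity(fingerprint(page_a), fingerprint(page_c))
```

In this sketch, page_a and page_b differ only in text and attribute values, so they produce the same token sequence and the same fingerprint, while page_c has a different structure. A clustering step could then group pages whose similarity exceeds a chosen threshold, optionally restricted to pages sharing a URL prefix.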

Patent Owner(s)

Patent Owner        Address
R2 SOLUTIONS LLC    6136 FRISCO SQUARE BLVD SUITE 400 FRISCO TX 75034

Inventor(s)

Inventor Name                 Address          # of Filed Patents    Total Citations
Chitrapura, Krishna Prasad    Bangalore, IN    16                    687
Poola, Krishna Leela          Karnataka, IN    14                    336
