TECHNIQUES FOR KEYWORD EXTRACTION FROM URLS USING STATISTICAL ANALYSIS

United States of America Patent

APP PUB NO 20090089278A1
SERIAL NO 11937417

Abstract

Techniques are described for keyword extraction from URLs using regular expression patterns and keyword ranking. Tokenization of URLs also generates regular expressions of URLs from a website. The regular expressions are stored in the form of any type of indexing structure. When a new URL is received, the URL is examined to determine whether the URL is from a website that has previously been tokenized. If the URL is not from such a website, then the URL is tokenized using every delimiter and unit change to extract keywords. If the URL is from a website previously processed, the corresponding regular expression is used to extract keywords from the URL. The keywords extracted from the URLs are then ranked based on any ranking methodology for better relevance and performance.
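
The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `SITE_PATTERNS` dictionary, the example hosts, and the function names are hypothetical, a "unit change" is assumed to mean a switch between letters and digits or a lower-to-upper case boundary, falling back to tokenization when a stored pattern does not match is an assumption, and frequency counting stands in for "any ranking methodology".

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical index of per-website regular expressions (host -> pattern).
# The abstract allows any indexing structure; a dict is used for brevity.
SITE_PATTERNS = {
    "shop.example.com": re.compile(
        r"^/(?P<category>[a-z]+)/(?P<product>[a-z]+)-\d+\.html$"
    ),
}

# Delimiters: any run of non-alphanumeric characters.
DELIMITERS = re.compile(r"[^A-Za-z0-9]+")
# Unit changes (assumed meaning): letter/digit and camelCase boundaries.
UNITS = re.compile(r"[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def tokenize(url):
    """Split a URL's path and query on every delimiter and unit change."""
    parts = urlsplit(url)
    text = parts.path + " " + parts.query
    tokens = []
    for chunk in DELIMITERS.split(text):
        tokens.extend(UNITS.findall(chunk))
    # Keep alphabetic tokens only; purely numeric units are dropped here.
    return [t.lower() for t in tokens if t.isalpha()]


def extract_keywords(url):
    """Use the stored pattern for a known website, else tokenize fully."""
    parts = urlsplit(url)
    pattern = SITE_PATTERNS.get(parts.netloc)
    if pattern is not None:
        match = pattern.match(parts.path)
        if match:
            return [value.lower() for value in match.groupdict().values()]
    # Unknown website, or the stored pattern did not match (assumed fallback).
    return tokenize(url)


def rank_keywords(urls):
    """Rank keywords by how often they occur across the given URLs."""
    counts = Counter()
    for url in urls:
        counts.update(extract_keywords(url))
    return [keyword for keyword, _ in counts.most_common()]
```

Under these assumptions, `extract_keywords("http://shop.example.com/cameras/nikon-123.html")` takes the stored-pattern path and yields the `category` and `product` groups, while a URL from an unlisted host is split on every delimiter and unit change.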

Patent Owner(s)

Patent Owner    Address
OATH INC        770 BROADWAY NEW YORK NY 10003

Inventor(s)

Inventor Name            Address          # of filed Patents    Total Citations
Poola, Krishna Leela     Bangalore, IN    14                    336
Ramanujapuram, Arun      Bangalore, IN    14                    616
