Scheduler for search engine crawler

Number of patents in Portfolio can not be more than 2000

United States of America Patent

PATENT NO 8042112
SERIAL NO

10882956

Stats

ATTORNEY / AGENT: (SPONSORED)

Importance

Loading Importance Indicators... loading....

Abstract

See full text

A search engine crawler includes a distributed set of schedulers that are associated with one or more segments of document identifiers (e.g., URLs) corresponding to documents on a network (e.g., WWW). Each scheduler handles the scheduling of document identifiers (for crawling) for a subset of the known document identifiers. Using a starting set of document identifiers, such as the document identifiers crawled (or scheduled for crawling) during the most recent completed crawl, the scheduler removes from the starting set those document identifiers that have been unreachable in each of the last X crawls. Other filtering mechanisms may also be used to filter out some of the document identifiers in the starting set. The resulting list of document identifiers is written to a scheduled output file for use in a next crawl cycle.

Loading the Abstract Image... loading....

First Claim

See full text

Family

Loading Family data... loading....

Patent Owner(s)

Patent OwnerAddress
GOOGLE LLC1600 AMPITHEATRE PARKWAY MOUNTAIN VIEW CA 94043

International Classification(s)

  • [Classification Symbol]
  • [Patents Count]

Inventor(s)

Inventor Name Address # of filed Patents Total Citations
Acharya, Anurag Campbell, US 48 1341
Gobioff, Howard Bradley San Francisco, US 4 70
Ibel, Maximilian Pfaeffikon, CH 20 660
Zhu, Huican San Jose, US 15 896

Cited Art Landscape

Load Citation

Patent Citation Ranking

Forward Cite Landscape

Load Citation