System and method for efficient representation of data set addresses in a web crawler
United States of America Patent
Stats
- Grant Date: Oct 9, 2001
- App pub date: N/A
- Filing date: Nov 2, 1999
- Priority date (Note): Nov 2, 1999
- Status (Latency Note): In Force
Abstract
A web crawler stores fixed length representations of document addresses in first and second caches and a disk file. When the web crawler downloads a document from a host computer, it identifies URLs (document addresses) in the downloaded document. Each identified URL is converted into a fixed size numerical representation. The numerical representation is systematically compared to numerical representations in the caches and disk file. If the representation is not found in the caches and disk file, the document corresponding to the representation is scheduled for downloading, and the representation is stored in the second cache. If the representation is not found in the caches but is found in the disk file, the representation is added to the first cache. When the second cache is full, it is merged with the disk file and the second cache is reset to an initial state. When the first cache is full, one or more representations are evicted in accordance with an eviction policy. The representations include a prefix that is a function of a host component of the corresponding URLs, and the representations are stored in the disk file in sorted order. When the web crawler searches for a representation in the disk file, an index of the disk file is searched to identify a single block of the disk file, and only that single block of the disk file is searched for the representation.
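The flow described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `fingerprint` and `SeenUrls` are hypothetical, BLAKE2b stands in for whatever fingerprinting function the patent uses, the 16-bit host prefix and 48-bit URL hash widths are arbitrary, the "disk file" is simulated by a sorted in-memory list, and the first cache uses FIFO eviction because the abstract leaves the eviction policy open.

```python
import bisect
import hashlib
from urllib.parse import urlsplit


def fingerprint(url: str) -> int:
    """Fixed-size (64-bit) value whose high 16 bits depend only on the host.

    Assumption: BLAKE2b and the 16/48-bit split are illustrative choices.
    """
    host = (urlsplit(url).hostname or "").encode()
    host_part = int.from_bytes(hashlib.blake2b(host, digest_size=2).digest(), "big")
    url_part = int.from_bytes(hashlib.blake2b(url.encode(), digest_size=6).digest(), "big")
    return (host_part << 48) | url_part


class SeenUrls:
    def __init__(self, first_cap=4, second_cap=4, block_size=4):
        self.first = {}      # first cache: fingerprints recently found in the "disk file"
        self.second = set()  # second cache: new fingerprints not yet merged
        self.disk = []       # stand-in for the sorted disk file
        self.index = []      # first fingerprint of each block of the "disk file"
        self.first_cap = first_cap
        self.second_cap = second_cap
        self.block_size = block_size

    def _on_disk(self, fp: int) -> bool:
        # Use the index to pick the single block that could hold fp.
        b = bisect.bisect_right(self.index, fp) - 1
        if b < 0:
            return False
        start = b * self.block_size
        block = self.disk[start:start + self.block_size]
        # Search only that block.
        i = bisect.bisect_left(block, fp)
        return i < len(block) and block[i] == fp

    def _merge(self) -> None:
        # Merge the second cache into the sorted "disk file", rebuild the
        # block index, and reset the second cache to its initial state.
        self.disk = sorted(set(self.disk) | self.second)
        self.index = self.disk[::self.block_size]
        self.second.clear()

    def should_download(self, url: str) -> bool:
        fp = fingerprint(url)
        if fp in self.first or fp in self.second:
            return False
        if self._on_disk(fp):
            if len(self.first) >= self.first_cap:
                # Assumed policy: evict the oldest entry (FIFO).
                self.first.pop(next(iter(self.first)))
            self.first[fp] = True
            return False
        # Not seen anywhere: remember it and schedule the download.
        self.second.add(fp)
        if len(self.second) >= self.second_cap:
            self._merge()
        return True
```

Because the prefix depends only on the host, URLs from the same host sort next to each other in the file, so consecutive lookups for one host tend to land in the same block. In this sketch a caller would invoke `should_download(url)` for each URL extracted from a fetched page and enqueue only those that return `True`.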
Family
- 15 United States
- 10 France
- 8 Japan
- 7 China
- 5 Korea
- 2 Other
Patent Owner(s)
| Patent Owner | Address |
|---|---|
| R2 SOLUTIONS LLC | 6136 FRISCO SQUARE BLVD SUITE 400 FRISCO TX 75034 |
Inventor(s)
| Inventor Name | Address | # of filed Patents | Total Citations |
|---|---|---|---|
| Heydon, Clark Allan | San Francisco, CA | 13 | 711 |
| Najork, Marc Alexander | Palo Alto, CA | 27 | 1043 |
Maintenance Fees
| Fee | Large entity fee | Small entity fee | Micro entity fee |
|---|---|---|---|
| Surcharge after expiration - Late payment is unavoidable | $700.00 | $350.00 | $175.00 |
| Surcharge after expiration - Late payment is unintentional | $1,640.00 | $820.00 | $410.00 |
Patent Number: 6301614
Important Notes on Latency of Status data
Please note there is up to 60 days of latency in this Status indicator for certain status conditions. You can obtain up-to-date Status indicator readings by ordering PAIR for the file.
An application with the status "Published" (which means it is pending) may be recently abandoned, but not yet updated to reflect its abandoned status. However, an application filed less than one year ago is unlikely to be abandoned.
A patent with the status "Granted" may be recently expired, but not yet updated to reflect its expired status. However, it is highly unlikely a patent less than 3.5 years old would be expired.
An application with the status "Abandoned" is almost always current, but there is a small chance it was recently revived and the status not yet updated.
Important Note on Priority Date data
This priority date is an estimate of the earliest priority date. It should not be taken as a legal conclusion. No representations are made as to the accuracy of the date listed. Please consult a legal professional before relying on this date.