Method and system for incremental web crawling
United States of America Patent
Stats
- Grant date: Oct 7, 2003
- App pub date: N/A
- Filing date: Jun 30, 1999
- Priority date (Note): Jun 30, 1999
- Status (Latency Note): In Force
Abstract
A Web crawler creates an index of documents in a document store on a computer network. In an initial crawl, the crawler creates a first full index for the document store. The first full crawl is based on a set of predefined 'seed' URLs and crawl restrictions, and involves recursively retrieving each folder/document directly or indirectly linked to the seed URLs. In the process of creating the first full index, the crawler creates a History Table containing a list of URLs for each folder and document found in the first full crawl. The History Table also includes a local commit time (LCT) for each document and a deleted documents count (DDC) and LCT or maximum LCT (MLCT) for each folder (this assumes that the store supports a folder hierarchy and the MLCT, LCT and DDC properties). Thereafter, in an incremental crawl, the crawler determines, for each folder, (1) whether the DDC for that folder has changed and (2) whether the MLCT is more recent than the corresponding value in the History Table. If the DDC has changed, the crawler obtains a full list of items (URLs) in that folder, and compares the list with the URLs in the History Table to identify the deleted documents. The deleted documents are then deleted from the History Table and index. If the MLCT is more recent, the crawler queries the document store for the URLs of linked documents having an LCT more recent than the MLCT in the History Table for the folder. The History Table and index are then updated accordingly to reflect the changes to the document store.
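The incremental step described in the abstract can be illustrated with a small in-memory sketch. This is not the patented implementation: the `Folder`, `FolderRecord`, `full_crawl`, and `incremental_crawl` names are hypothetical, LCT values are modelled as plain integers, the document store is a dictionary of folders, and the set of folders is assumed not to change between crawls.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Toy stand-in for a store folder that exposes DDC and MLCT (assumed model)."""
    docs: dict = field(default_factory=dict)  # document URL -> LCT
    ddc: int = 0   # deleted documents count
    mlct: int = 0  # maximum LCT seen in this folder

    def put(self, url, lct):
        # Adding or modifying a document commits it with a new LCT.
        self.docs[url] = lct
        self.mlct = max(self.mlct, lct)

    def delete(self, url):
        # Deleting a document bumps the folder's DDC.
        del self.docs[url]
        self.ddc += 1


@dataclass
class FolderRecord:
    """History Table entry for one folder (hypothetical layout)."""
    ddc: int
    mlct: int
    docs: dict  # document URL -> LCT recorded at the last crawl


def full_crawl(store):
    """Build the History Table by visiting every folder and document."""
    return {url: FolderRecord(f.ddc, f.mlct, dict(f.docs)) for url, f in store.items()}


def incremental_crawl(store, history):
    """Update history in place; return (deleted URLs, new-or-changed URLs)."""
    deleted, changed = set(), set()
    for folder_url, folder in store.items():
        rec = history[folder_url]  # assumes the folder was seen in the full crawl
        # (1) DDC changed: list the folder and diff against the History Table.
        if folder.ddc != rec.ddc:
            gone = set(rec.docs) - set(folder.docs)
            for url in gone:
                del rec.docs[url]
            deleted |= gone
            rec.ddc = folder.ddc
        # (2) MLCT is newer: fetch only documents committed after the recorded MLCT.
        if folder.mlct > rec.mlct:
            newer = {u: t for u, t in folder.docs.items() if t > rec.mlct}
            rec.docs.update(newer)
            changed |= set(newer)
            rec.mlct = folder.mlct
    return deleted, changed
```

In this sketch a caller would run `full_crawl` once, then call `incremental_crawl` after each batch of store changes and apply the returned deleted and changed URL sets to the index. Folders whose DDC and MLCT are unchanged are skipped without listing their contents, which is the saving the abstract describes.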
Family
- 15 United States
- 10 France
- 8 Japan
- 7 China
- 5 Korea
- 2 Other
Patent Owner(s)
- MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor(s)
Inventor Name | Address | # of filed Patents | Total Citations |
---|---|---|---|
Meyerzon, Dmitriy | Bellevue, WA | 112 | 3584 |
Sanu, Sankrant | Redmond, WA | 6 | 1115 |
Shoroff, Srikanth | Issaquah, WA | 30 | 1469 |
Terek, F Soner | Bellevue, WA | 16 | 876 |
Maintenance Fees
Fee | Large entity fee | Small entity fee | Micro entity fee |
---|---|---|---|
Surcharge after expiration - Late payment is unavoidable | $700.00 | $350.00 | $175.00 |
Surcharge after expiration - Late payment is unintentional | $1,640.00 | $820.00 | $410.00 |
Patent Number: 6631369
Important Notes on Latency of Status data
Please note there is up to 60 days of latency in this Status indicator for certain status conditions. You can obtain up-to-date Status indicator readings by ordering PAIR for the file.
An application with the status "Published" (which means it is pending) may be recently abandoned, but not yet updated to reflect its abandoned status. However, an application filed less than one year ago is unlikely to be abandoned.
A patent with the status "Granted" may be recently expired, but not yet updated to reflect its expired status. However, it is highly unlikely a patent less than 3.5 years old would be expired.
An application with the status "Abandoned" is almost always current, but there is a small chance it was recently revived and the status not yet updated.
Important Note on Priority Date data
This priority date is an estimate of the earliest priority date. It should not be taken as a legal conclusion. No representations are made as to the accuracy of the date listed. Please consult a legal professional before relying on this date.