
US Patent No: 7,765,236
Extracting data content items using template matching
Stats
- Issued date: Jul 27, 2010
- Filing date: Aug 31, 2007
- Serial no: 11/848,987
- Status: In Force
Abstract
Systems and methods for extracting data content items from a web page are provided. A template is created by labeling data content items of interest associated with a web page and generating a template Document Object Model (DOM) tree based on the labeled web page. DOM trees are also generated for additional web pages that contain data content items for which extraction may be desired. These DOM trees are compared to the template DOM tree to determine alignment therebetween. The aligned data content items may then be extracted from the additional web pages and indexed, as desired. Labeling the data content items of interest prior to generating a template DOM tree allows the desired data content items to be specified and more accurately extracted from related and/or similarly structured web pages.
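The flow described in the abstract (label a page, build a template DOM tree, align other pages against it, extract the aligned items) can be illustrated with a minimal sketch. This is not the patented method: the abstract does not say how alignment is computed, so the sketch assumes the simplest possible rule, an exact match on the tag path from the root. The `data-label` attribute, the `Node` class, and the functions `build_template` and `extract` are hypothetical names chosen for illustration, and the parser assumes well-formed HTML with every element explicitly closed.

```python
from html.parser import HTMLParser


class Node:
    """A minimal DOM-like node: tag, attributes, children, and direct text."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a tree of Node objects (assumes every element is closed)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def walk(node):
    """Yield the node and all of its descendants in document order."""
    yield node
    for child in node.children:
        yield from walk(child)


def path_of(node):
    """Path from the root as (tag, index among same-tag siblings) steps."""
    steps = []
    while node.parent is not None:
        same_tag = [c for c in node.parent.children if c.tag == node.tag]
        steps.append((node.tag, same_tag.index(node)))
        node = node.parent
    return tuple(reversed(steps))


def build_template(labeled_html):
    """Map the path of each labeled node to its label (hypothetical attribute)."""
    root = parse(labeled_html)
    return {
        path_of(n): n.attrs["data-label"]
        for n in walk(root)
        if "data-label" in n.attrs
    }


def extract(template, page_html):
    """Align a page to the template by exact path and pull the labeled text."""
    by_path = {path_of(n): n for n in walk(parse(page_html))}
    return {
        label: by_path[path].text
        for path, label in template.items()
        if path in by_path
    }


# Made-up pages for illustration only.
TEMPLATE_PAGE = (
    "<html><body><div><h1 data-label='title'>Red Mug</h1>"
    "<span data-label='price'>$5</span></div></body></html>"
)
OTHER_PAGE = (
    "<html><body><div><h1>Blue Plate</h1>"
    "<span>$8</span></div></body></html>"
)

template = build_template(TEMPLATE_PAGE)
result = extract(template, OTHER_PAGE)
print(result)  # {'title': 'Blue Plate', 'price': '$8'}
```

Exact path matching only works when pages share an identical structure. A more tolerant alignment, such as a tree edit distance, would be needed for pages that differ in optional or repeated elements.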
Cited Art
| Patent Info | # Cites | Year |
|---|---|---|
| 6,778,703 Form recognition using reference areas | 19 | 2000 |
| 7,174,327 Generating one or more XML documents from a relational database using XPath data model | 27 | 2002 |
| 7,072,984 System and method for accessing customized information over the internet using a browser for a plurality of electronic devices | 83 | 2001 |
| 2004/0049,737 System and method for displaying information content with selective horizontal scrolling | 71 | 2002 |
| 2009/0198,714 Document processing and management approach for reflecting changes in one representation of a document to another representation | 1 | 2005 |
| 6,538,673 Method for extracting digests, reformatting, and automatic monitoring of structured online documents based on visual programming of document tree navigation and transformation | 71 | 2000 |
| 2004/0102,958 Computer-based system and method for generating, classifying, searching, and analyzing standardized text templates and deviations from standardized text templates | 8 | 2003 |
| 2008/0195,626 Data Processing Device, Document Processing Device, Data Relay Device, Data Processing Method, and Data Relay Method | 2 | 2005 |
| 2009/0070,295 Document Processing Device and Document Processing Method | 3 | 2006 |
| 2008/0010,056 Aligning hierarchal and sequential document trees to identify parallel data | 3 | 2006 |
| 2006/0242,563 Optimizing XSLT based on input XML document structure description and translating XSLT into equivalent XQuery expressions | 15 | 2005 |
| 7,176,921 Graphical rewriting system for multimedia descriptions | 2 | 2001 |
| 2006/0265,712 Methods for supporting intra-document parallelism in XSLT processing on devices with multiple processors | 2 | 2005 |
| 6,772,165 Electronic document processing system and method for merging source documents on a node-by-node basis to generate a target document | 104 | 2002 |
| 6,810,414 System and methods for easy-to-use periodic network data capture engine with automatic target data location, extraction and storage | 110 | 2000 |
| 2009/0043,777 Methods and apparatus for enabling use of web content on various types of devices | 1 | 2007 |
Maintenance Fees
| Fee | Large entity fee | Small entity fee | Micro entity fee | Due date |
|---|---|---|---|---|
| 3.5 Year Payment | $1,600.00 | $800.00 | $400.00 | Jan 27, 2014 |
| 7.5 Year Payment | $3,600.00 | $1,800.00 | $900.00 | Jan 27, 2018 |
| 11.5 Year Payment | $7,400.00 | $3,700.00 | $1,850.00 | Jan 27, 2022 |

| Fee | Large entity fee | Small entity fee | Micro entity fee |
|---|---|---|---|
| Surcharge - 3.5 year - Late payment within 6 months | $160.00 | $80.00 | $40.00 |
| Surcharge - 7.5 year - Late payment within 6 months | $160.00 | $80.00 | $40.00 |
| Surcharge - 11.5 year - Late payment within 6 months | $160.00 | $80.00 | $40.00 |
| Surcharge after expiration - Late payment is unavoidable | $700.00 | $350.00 | $175.00 |
| Surcharge after expiration - Late payment is unintentional | $1,640.00 | $820.00 | $410.00 |