Method and system for automatically extracting data from web sites

Number of patents in Portfolio can not be more than 2000

United States of America Patent

PATENT NO 8843490
SERIAL NO

13191369

Stats

ATTORNEY / AGENT: (SPONSORED)

Importance

Loading Importance Indicators... loading....

Abstract

See full text

In accordance with an embodiment, data may be automatically extracted from semi-structured web sites. Unsupervised learning may be used to analyze web sites and discover their structure. One method utilizes a set of heterogeneous “experts,” each expert being capable of identifying certain types of generic structure. Each expert represents its discoveries as “hints.” Based on these hints, the system may cluster the pages and text segments and identify semi-structured data that can be extracted. To identify a good clustering, a probabilistic model of the hint-generation process may be used.

Loading the Abstract Image... loading....

First Claim

See full text

Family

Loading Family data... loading....

Patent Owner(s)

Patent OwnerAddress
IMPORT IO CORPORATION122 E HOUSTON STREET SUITE 105 SAN ANTONIO TX 78205

International Classification(s)

  • [Classification Symbol]
  • [Patents Count]

Inventor(s)

Inventor Name Address # of filed Patents Total Citations
Gazen, Bora C Huntington Beach, US 2 77
Minton, Steven N El Segundo, US 2 77

Cited Art Landscape

Load Citation

Patent Citation Ranking

Forward Cite Landscape

Load Citation

Maintenance Fees

Fee Large entity fee small entity fee micro entity fee due date
11.5 Year Payment $7400.00 $3700.00 $1850.00 Mar 23, 2026
Fee Large entity fee small entity fee micro entity fee
Surcharge - 11.5 year - Late payment within 6 months $160.00 $80.00 $40.00
Surcharge after expiration - Late payment is unavoidable $700.00 $350.00 $175.00
Surcharge after expiration - Late payment is unintentional $1,640.00 $820.00 $410.00