System and method for automatically detecting and extracting semantically significant text from a HTML document associated with a plurality of HTML documents

United States of America Patent

PATENT NO 8051372
SERIAL NO 11734467

Abstract

A system and method for automatically detecting and extracting semantically significant text from an HTML document associated with a plurality of HTML documents is disclosed. The method may include receiving an HTML document, parsing the HTML document into a parse tree, segmenting the parse tree into one or more segments of one or more unique paths, processing the one or more segments based at least on the HTML document, and extracting one or more processed segments from at least the HTML document based on a predetermined number.
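
The steps named in the abstract (parse, segment by unique path, process, extract a predetermined number of segments) can be illustrated with a minimal sketch. This is not the patented method: the abstract does not say how segments are processed or ranked, so the scoring below (total text length per path) is purely an assumption for illustration, and the names `PathSegmenter`, `extract_segments`, and `top_n` are hypothetical. The sketch also looks at a single document only, not at a plurality of documents.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Tags with no closing tag; they must not be pushed onto the path stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
# Tags whose text is never reader-facing content.
SKIP_TAGS = {"script", "style"}


class PathSegmenter(HTMLParser):
    """Groups text nodes by the tag path from the root to their parent."""

    def __init__(self):
        super().__init__()
        self.stack = []                    # currently open tags
        self.segments = defaultdict(list)  # path -> list of text chunks

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = " ".join(data.split())
        if text and not SKIP_TAGS.intersection(self.stack):
            self.segments["/".join(self.stack)].append(text)


def extract_segments(html, top_n=1):
    """Return the top_n (path, text) segments.

    ASSUMPTION: segments are ranked by total text length; the abstract
    does not specify the actual processing or ranking step.
    """
    parser = PathSegmenter()
    parser.feed(html)
    parser.close()
    scored = sorted(parser.segments.items(),
                    key=lambda item: sum(len(t) for t in item[1]),
                    reverse=True)
    return [(path, " ".join(chunks)) for path, chunks in scored[:top_n]]
```

Calling `extract_segments(page_source, top_n=3)` would return the three paths carrying the most text. Here `top_n` stands in for the "predetermined number" in the abstract.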

Patent Owner(s)

  • THE NEW YORK TIMES COMPANY

Inventor(s)

  • Inventor Name: Sandhaus, Evan Stapleton
  • Address: Brooklyn, US
  • # of filed Patents: 2
  • Total Citations: 21
