US Patent No: 7,130,837

Systems and methods for determining the topic structure of a portion of text

Stats

ALSO PUBLISHED AS: 20030182631

Abstract

Systems and methods for determining the topic structure of a document including text utilize a Probabilistic Latent Semantic Analysis (PLSA) model and select segmentation points based on similarity values between pairs of adjacent text blocks. PLSA forms a framework for both text segmentation and topic identification. The use of PLSA provides an improved representation for the sparse information in a text block, such as a sentence or a sequence of sentences. Topic characterization of each text segment is derived from PLSA parameters that relate words to "topics", latent variables in the PLSA model, and "topics" to text segments. A system executing the method exhibits significant performance improvement. Once determined, the topic structure of a document may be employed for document retrieval and/or document summarization.
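The segmentation step described in the abstract can be illustrated with a minimal sketch. It assumes that each text block has already been mapped to a topic mixture by a fitted PLSA model (the fitting itself is omitted here). The abstract does not name a similarity measure or a selection rule, so the cosine similarity, the local-minimum test, and the `threshold` value below are illustrative assumptions, as are all function names and the sample data.

```python
import math

def cosine(p, q):
    """Cosine similarity between two topic-mixture vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def adjacent_similarities(block_topics):
    """Similarity value for each pair of adjacent text blocks."""
    return [
        cosine(block_topics[i], block_topics[i + 1])
        for i in range(len(block_topics) - 1)
    ]

def select_boundaries(sims, threshold=0.5):
    """Return indices i where a boundary is placed between block i and block i+1.

    Assumed rule: a boundary is a local minimum of the similarity curve
    that also falls below a hypothetical threshold.
    """
    boundaries = []
    for i, s in enumerate(sims):
        left = sims[i - 1] if i > 0 else float("inf")
        right = sims[i + 1] if i + 1 < len(sims) else float("inf")
        if s < threshold and s <= left and s <= right:
            boundaries.append(i)
    return boundaries

# Hypothetical topic mixtures for four text blocks over two latent topics.
blocks = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
]
sims = adjacent_similarities(blocks)
boundaries = select_boundaries(sims)
print(boundaries)
```

With this sample data the only dip in similarity lies between the second and third blocks, so a single segmentation point is selected there. In practice the threshold and the form of the minimum test would need tuning on real text.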

Patent Owner(s)

Patent Owner Address Total Patents
XEROX CORPORATION STAMFORD, CT 18074

Inventor(s)

Inventor Name Address # of filed Patents Total Citations
Brants, Thorsten H Palo Alto, CA 16 75
Chen, Francine R Menlo Park, CA 57 1418
Tsochantaridis, Ioannis Providence, RI 4 15

Cited Art Landscape

Patent Info (Count) # Cites Year
 
XEROX CORPORATION (5)
5,606,643 Real-time audio recording system for automatic speaker indexing 22 1994
5,659,766 Method and apparatus for inferring the topical content of a document based upon its lexical content without supervision 42 1994
5,687,364 Method for learning to infer the topical content of documents based upon their lexical content 55 1994
6,128,634 Method and apparatus for facilitating skimming of text 25 1998
6,239,801 Method and system for indexing and controlling the playback of multimedia documents 47 1999
 
FUJI XEROX CO., LTD. (1)
5,943,669 Document retrieval device 42 1997
 
TECHNOLOGY LICENSING CORPORATION (1)
5,675,819 Document information retrieval using global word co-occurrence patterns 301 1994

Forward Cite Landscape

Patent Info (Count) # Cites Year
 
MICROSOFT CORPORATION (2)
8,335,683 System for using statistical classifiers for spoken language understanding 0 2003
7,853,596 Mining geographic knowledge using a location aware topic model 1 2007
 
APTIMA, INC. (1)
7,822,750 Method and system to compare data entities 1 2008
 
EBAY INC. (1)
8,631,005 Header-token driven automatic text segmentation 0 2006
 
Networked Insights, LLC (1)
7,925,743 Method and system for qualifying user engagement with a website 5 2008
 
SONY CORPORATION (1)
8,666,915 Method and device for information retrieval 0 2011
 
XEROX CORPORATION (1)
7,457,808 Method and apparatus for explaining categorization decisions 17 2004

Maintenance Fees

Fee Large entity fee small entity fee micro entity fee due date
7.5 Year Payment $3600.00 $1800.00 $900.00 Apr 30, 2014
11.5 Year Payment $7400.00 $3700.00 $1850.00 Apr 30, 2018
Fee Large entity fee small entity fee micro entity fee
Surcharge - 7.5 year - Late payment within 6 months $160.00 $80.00 $40.00
Surcharge - 11.5 year - Late payment within 6 months $160.00 $80.00 $40.00
Surcharge after expiration - Late payment is unavoidable $700.00 $350.00 $175.00
Surcharge after expiration - Late payment is unintentional $1,640.00 $820.00 $410.00