US Patent No: 7,130,837

Systems and methods for determining the topic structure of a portion of text

Stats

ALSO PUBLISHED AS: 20030182631

Abstract

Systems and methods for determining the topic structure of a document including text utilize a Probabilistic Latent Semantic Analysis (PLSA) model and select segmentation points based on similarity values between pairs of adjacent text blocks. PLSA forms a framework for both text segmentation and topic identification. The use of PLSA provides an improved representation for the sparse information in a text block, such as a sentence or a sequence of sentences. Topic characterization of each text segment is derived from PLSA parameters that relate words to "topics", latent variables in the PLSA model, and "topics" to text segments. A system executing the method exhibits significant performance improvement. Once determined, the topic structure of a document may be employed for document retrieval and/or document summarization.
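The segmentation step described in the abstract can be illustrated with a minimal sketch. It assumes each text block has already been mapped to a topic mixture (for example, by a fitted PLSA model, which is not shown here). The cosine measure, the local-minimum rule, and the `threshold` value are illustrative assumptions; the abstract does not state which similarity function or selection rule the patent uses. All function names are hypothetical.

```python
import math


def cosine(p, q):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def adjacent_similarities(block_topics):
    """Similarity between each pair of adjacent text blocks.

    block_topics[i] is an assumed per-block topic mixture, e.g. P(z | block i).
    """
    return [
        cosine(block_topics[i], block_topics[i + 1])
        for i in range(len(block_topics) - 1)
    ]


def select_boundaries(sims, threshold=0.5):
    """Return gap indices i (between block i and block i + 1) where the
    similarity is a local minimum and falls below the assumed threshold."""
    boundaries = []
    for i, s in enumerate(sims):
        left = sims[i - 1] if i > 0 else float("inf")
        right = sims[i + 1] if i + 1 < len(sims) else float("inf")
        if s < threshold and s <= left and s < right:
            boundaries.append(i)
    return boundaries


# Toy data (made up for illustration): two blocks dominated by topic 0,
# followed by two blocks dominated by topic 1.
block_topics = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
sims = adjacent_similarities(block_topics)
boundaries = select_boundaries(sims)
print(boundaries)
```

In this toy input the only dip in similarity is between the second and third blocks, so the sketch reports a single segmentation point at that gap.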

Patent Owner(s)

Patent Owner Address Total Patents
XEROX CORPORATION STAMFORD, CT 17163

Inventor(s)

Inventor Name Address # of filed Patents Total Citations
Brants, Thorsten H Palo Alto, CA 10 58
Chen, Francine R Menlo Park, CA 46 1128
Tsochantaridis, Ioannis Providence, RI 3 12

Cited Art

Patent Info (Count) # Cites Year
 
XEROX CORPORATION (5)
5,606,643 Real-time audio recording system for automatic speaker indexing 22 1994
5,659,766 Method and apparatus for inferring the topical content of a document based upon its lexical content without supervision 40 1994
5,687,364 Method for learning to infer the topical content of documents based upon their lexical content 52 1994
6,128,634 Method and apparatus for facilitating skimming of text 21 1998
6,239,801 Method and system for indexing and controlling the playback of multimedia documents 45 1999
 
FUJI XEROX CO., LTD. (1)
5,943,669 Document retrieval device 39 1997
 
TECHNOLOGY LICENSING CORPORATION (1)
5,675,819 Document information retrieval using global word co-occurrence patterns 260 1994

Patent Citation Ranking

Forward Cites

Patent Info (Count) # Cites Year
 
MICROSOFT CORPORATION (2)
8,335,683 System for using statistical classifiers for spoken language understanding 0 2003
7,853,596 Mining geographic knowledge using a location aware topic model 1 2007
 
APTIMA, INC. (1)
7,822,750 Method and system to compare data entities 1 2008
 
Networked Insights, LLC (1)
7,925,743 Method and system for qualifying user engagement with a website 4 2008
 
XEROX CORPORATION (1)
7,457,808 Method and apparatus for explaining categorization decisions 8 2004

Maintenance Fees

Fee Large entity fee Small entity fee Micro entity fee Due date
7.5 Year Payment $3,600.00 $1,800.00 $900.00 Apr 30, 2014
11.5 Year Payment $7,400.00 $3,700.00 $1,850.00 Apr 30, 2018
Fee Large entity fee Small entity fee Micro entity fee
Surcharge - 7.5 year - Late payment within 6 months $160.00 $80.00 $40.00
Surcharge - 11.5 year - Late payment within 6 months $160.00 $80.00 $40.00
Surcharge after expiration - Late payment is unavoidable $700.00 $350.00 $175.00
Surcharge after expiration - Late payment is unintentional $1,640.00 $820.00 $410.00