Systems and methods for determining the topic structure of a portion of text

United States of America Patent

PATENT NO 7130837
APP PUB NO 20030182631A1
SERIAL NO 10103053

Abstract

Systems and methods for determining the topic structure of a document including text utilize a Probabilistic Latent Semantic Analysis (PLSA) model and select segmentation points based on similarity values between pairs of adjacent text blocks. PLSA forms a framework for both text segmentation and topic identification. The use of PLSA provides an improved representation for the sparse information in a text block, such as a sentence or a sequence of sentences. Topic characterization of each text segment is derived from PLSA parameters that relate words to 'topics', latent variables in the PLSA model, and 'topics' to text segments. A system executing the method exhibits significant performance improvement. Once determined, the topic structure of a document may be employed for document retrieval and/or document summarization.
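
The segmentation step described in the abstract can be illustrated with a minimal sketch. It assumes a PLSA model has already produced a topic distribution P(z | block) for each text block. The abstract does not name the similarity measure or the selection rule, so the cosine similarity, the local-minimum test, the `threshold` value, and all function names and data below are hypothetical choices made for illustration only.

```python
import math

def cosine(p, q):
    """Cosine similarity between two topic-probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def adjacent_similarities(block_topics):
    """Similarity value for each pair of adjacent text blocks."""
    return [cosine(block_topics[i], block_topics[i + 1])
            for i in range(len(block_topics) - 1)]

def segmentation_points(similarities, threshold=0.5):
    """Return gap indices i (between block i and block i + 1) where the
    similarity is a local minimum and falls below the threshold.
    Both criteria are assumptions, not taken from the patent."""
    points = []
    for i, s in enumerate(similarities):
        left = similarities[i - 1] if i > 0 else float("inf")
        right = similarities[i + 1] if i + 1 < len(similarities) else float("inf")
        if s < threshold and s <= left and s <= right:
            points.append(i)
    return points

# Hypothetical P(z | block) values over two latent topics for four blocks.
block_topics = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
]
sims = adjacent_similarities(block_topics)
boundaries = segmentation_points(sims)
print(boundaries)
```

In this toy input the first two blocks lean toward one latent topic and the last two toward the other, so the only gap selected is the one between the second and third blocks.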

Patent Owner(s)

Patent Owner       Address        Total Patents
XEROX CORPORATION  STAMFORD, CT   13693

Inventor(s)

Inventor Name            Address         # of Filed Patents  Total Citations
Brants, Thorsten H       Palo Alto, CA   8                   200
Chen, Francine R         Menlo Park, CA  39                  2288
Tsochantaridis, Ioannis  Providence, RI  2                   42

Cited Art Landscape

Patent Info (Count) # Cites Year
 
FUJI XEROX CO., LTD. (1)
5943669 Document retrieval device 53 1997
 
TECHNOLOGY LICENSING CORPORATION (1)
5675819 Document information retrieval using global word co-occurrence patterns 451 1994
 
XEROX CORPORATION (5)
5606643 Real-time audio recording system for automatic speaker indexing 22 1994
5659766 Method and apparatus for inferring the topical content of a document based upon its lexical content without supervision 51 1994
5687364 Method for learning to infer the topical content of documents based upon their lexical content 71 1994
6128634 Method and apparatus for facilitating skimming of text 30 1998
6239801 Method and system for indexing and controlling the playback of multimedia documents 51 1999
* Cited By Examiner

Forward Cite Landscape

Patent Info (Count) # Cites Year
 
Other [Check patent profile for assignment information] (1)
* 2008/0114737 METHOD AND SYSTEM FOR AUTOMATICALLY IDENTIFYING USERS TO PARTICIPATE IN AN ELECTRONIC CONVERSATION 50 2007
 
INTERNATIONAL BUSINESS MACHINES CORPORATION (1)
* 2011/0202484 ANALYZING PARALLEL TOPICS FROM CORRELATED DOCUMENTS 6 2010
 
SONY CORPORATION (1)
8666915 Method and device for information retrieval 0 2011
 
NETWORKED INSIGHTS, INC. (WISCONSIN CORPORATION) (1)
* 2009/0222551 METHOD AND SYSTEM FOR QUALIFYING USER ENGAGEMENT WITH A WEBSITE 58 2008
 
XEROX CORPORATION (2)
* 7457808 Method and apparatus for explaining categorization decisions 39 2004
* 2006/0136410 Method and apparatus for explaining categorization decisions 1 2004
 
Aptima, Inc. (2)
* 7822750 Method and system to compare data entities 9 2008
* 2008/0250064 METHOD AND SYSTEM TO COMPARE DATA ENTITIES 3 2008
 
MITSUBISHI ELECTRIC RESEARCH LABORATORIES, INC. (2)
* 9251250 Method and apparatus for processing text with variations in vocabulary usage 1 2012
* 2013/0262083 Method and Apparatus for Processing Text with Variations in Vocabulary Usage 1 2012
 
NETWORKED INSIGHTS, LLC (1)
7925743 Method and system for qualifying user engagement with a website 24 2008
 
NEC CORPORATION (2)
* 9015161 Mismatch detection system, method, and program 2 2011
* 2013/0031098 MISMATCH DETECTION SYSTEM, METHOD, AND PROGRAM 0 2011
 
PAYPAL, INC. (4)
* 8631005 Header-token driven automatic text segmentation 2 2006
* 2008/0162520 Header-token driven automatic text segmentation 6 2006
9053091 Header-token driven automatic text segmentation 1 2013
9529862 Header-token driven automatic text segmentation 0 2015
 
AVAYA INC. (1)
* 2013/0081056 SYSTEM AND METHOD FOR ALIGNING MESSAGES TO AN EVENT BASED ON SEMANTIC SIMILARITY 7 2012
 
MICROSOFT TECHNOLOGY LICENSING, LLC (7)
* 8335683 System for using statistical classifiers for spoken language understanding 6 2003
* 2004/0148154 System for using statistical classifiers for spoken language understanding 32 2003
* 2004/0148170 Statistical classifiers for spoken language understanding and command/control scenarios 78 2003
* 7853596 Mining geographic knowledge using a location aware topic model 3 2007
* 2008/0319974 MINING GEOGRAPHIC KNOWLEDGE USING A LOCATION AWARE TOPIC MODEL 27 2007
* 2009/0119284 METHOD AND SYSTEM FOR CLASSIFYING DISPLAY PAGES USING SUMMARIES 4 2008
* 8924391 Text classification using concept kernel 0 2010
 
QBASE, LLC (1)
* 9542477 Method of automated discovery of topics relatedness 0 2014
* Cited By Examiner

Maintenance Fees

Fee                                                         Large Entity Fee  Small Entity Fee  Micro Entity Fee  Due Date
11.5 Year Payment                                           $7,400.00         $3,700.00         $1,850.00         Apr 30, 2018

Fee                                                         Large Entity Fee  Small Entity Fee  Micro Entity Fee
Surcharge - 11.5 year - Late payment within 6 months        $160.00           $80.00            $40.00
Surcharge after expiration - Late payment is unavoidable    $700.00           $350.00           $175.00
Surcharge after expiration - Late payment is unintentional  $1,640.00         $820.00           $410.00