US Patent Application No: 2002/0091,671

Method and system for data retrieval in large collections of data

Abstract

A method, system and computer readable medium for retrieving relevant data in large collections of documents are disclosed. The method, system and computer readable medium of the present invention include retrieving a document to be indexed, generating a document extract from the document, wherein the document extract comprises a portion of the document, and decomposing the document extract into tokens. The tokens are then stored in a search index, wherein a search engine accesses the search index to retrieve information satisfying a search query. Through aspects of the method, system and computer readable medium of the present invention, the quality of the search result is improved because the retrieved documents are more relevant in view of the semantic concept or notion represented by the search query. Moreover, the storage requirements are reduced and the processing time for conducting a search is shortened.
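
The indexing flow described in the abstract (retrieve a document, generate an extract, decompose it into tokens, store the tokens in a search index) can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not say how the extract is chosen, so the sketch assumes the extract is simply the first two sentences. The names `make_extract`, `tokenize` and `SearchIndex`, and the sample documents, are hypothetical.

```python
import re
from collections import defaultdict


def make_extract(text, max_sentences=2):
    """Assumption: the extract is the first few sentences of the document."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


def tokenize(extract):
    """Decompose text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", extract.lower())


class SearchIndex:
    """Toy inverted index: token -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Only tokens from the extract are stored, not the full document.
        for token in tokenize(make_extract(text)):
            self.postings[token].add(doc_id)

    def search(self, query):
        """Return ids of documents whose extract contains every query token."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = None
        for token in tokens:
            ids = self.postings.get(token, set())
            result = set(ids) if result is None else result & ids
        return result


# Hypothetical sample documents.
index = SearchIndex()
index.add("doc1", "Search engines build an index. The index maps tokens to "
                  "documents. Footnotes mention bananas.")
index.add("doc2", "Bananas are a fruit. They grow in warm climates. "
                  "Some pages index them.")
```

In this sketch a query for "bananas" matches only `doc2`, because the passing mention in the third sentence of `doc1` falls outside its extract. That mirrors the abstract's two claims: the index holds fewer tokens, and incidental mentions outside the extract do not produce matches.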

Patent Owner(s)

Patent Owner | Address | Total Patents
INTERNATIONAL BUSINESS MACHINES CORPORATION | ARMONK, NY | 45606

Inventor(s)

Inventor Name | Address | # of Filed Patents | Total Citations
Prokoph, Andreas | Boeblingen, DE | 5 | 106

Cited Art Landscape

  • No Cited Art to Display

Forward Cite Landscape

Patent Info (Count) # Cites Year
 
Other [Check patent profile for assignment information] (3)
8,990,235 Automatically providing content associated with captured information, such as information captured in real-time 0 2010
* 8,069,162 Enhanced search indexing 2 2010
* 2011/0099,134 Method and System for Agent Based Summarization 1 2010
 
INTERNATIONAL BUSINESS MACHINES CORPORATION (12)
8,214,391 Knowledge-based data mining system 7 2002
* 7,010,526 Knowledge-based data mining system 6 2002
6,993,534 Data store for knowledge-based data mining system 31 2002
* 7,254,571 System and method for generating and retrieving different document layouts from a given content 9 2002
7,146,361 System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a Weighted AND (WAND) 82 2003
7,139,752 System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations 60 2003
7,289,983 Personalized indexing and searching for information in a distributed data processing system 19 2003
8,014,997 Method of search content enhancement 2 2003
7,512,602 System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a weighted and (WAND) 2 2006
8,280,903 System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a Weighted AND (WAND) 1 2008
8,027,966 Method and system for searching a multi-lingual database 2 2008
8,027,994 Searching a multi-lingual database 0 2008
 
BELLSOUTH INTELLECTUAL PROPERTY CORPORATION (2)
7,409,593 Automated diagnosis for computer networks 12 2003
* 7,324,986 Automatically facilitated support for complex electronic services 0 2003
 
HYPERTEXT SOLUTIONS INC. (1)
7,953,593 Method and system for extending keyword searching to syntactically and semantically annotated data 3 2009
 
LINKEDIN CORPORATION (1)
7,854,009 Method of securing access to IP LANs 2 2003
 
ORACLE OTC SUBSIDIARY LLC (11)
8,874,549 System and method for measuring the quality of document sets 0 2008
8,832,140 System and method for measuring the quality of document sets 1 2008
8,219,593 System and method for measuring the quality of document sets 3 2008
8,051,073 System and method for measuring the quality of document sets 11 2008
8,051,084 System and method for measuring the quality of document sets 12 2008
8,024,327 System and method for measuring the quality of document sets 14 2008
8,005,643 System and method for measuring the quality of document sets 8 2008
* 2011/0246,378 IDENTIFYING HIGH VALUE CONTENT AND DETERMINING RESPONSES TO HIGH VALUE CONTENT 1 2010
8,560,529 System and method for measuring the quality of document sets 2 2011
8,527,515 System and method for concept visualization 1 2011
8,935,249 Visualization of concepts within a collection of information 0 2012
 
INTELLECTUAL VENTURES II LLC (1)
7,735,142 Electronic vulnerability and reliability assessment 2 2007
 
VIAVIENTE (1)
7,580,929 Phrase-based personalization of searches in an information retrieval system 31 2004
 
MICROSOFT TECHNOLOGY LICENSING, LLC (3)
8,713,024 Efficient forward ranking in a search engine 0 2010
8,620,907 Matching funnel for large document index 1 2010
8,478,704 Decomposable ranking for efficient precomputing that selects preliminary ranking features comprising static ranking features and dynamic atom-isolated components 0 2010
 
SCHLUMBERGER TECHNOLOGY CORPORATION (1)
* 8,156,131 Quality measure for a data context service 2 2009
 
HARRIS CORPORATION (1)
* 7,801,887 Method for re-ranking documents retrieved from a document database 4 2004
 
FACEBOOK, INC. (5)
7,584,194 Method and apparatus for an application crawler 23 2005
* 7,370,381 Method and apparatus for a ranking engine 31 2005
7,912,836 Method and apparatus for a ranking engine 6 2008
8,954,416 Method and apparatus for an application crawler 0 2009
8,788,488 Ranking search results based on recency 0 2012
 
VCVC III LLC (1)
8,954,469 Query templates and labeled search tip system, methods, and techniques 0 2008
 
GOOGLE INC. (44)
7,711,679 Phrase-based detection of duplicate documents in an information retrieval system 21 2004
7,599,914 Phrase-based searching in an information retrieval system 26 2004
* 7,584,175 Phrase-based generation of document descriptions 33 2004
7,580,921 Phrase identification in an information retrieval system 33 2004
7,536,408 Phrase-based indexing in an information retrieval system 37 2004
7,430,556 Phrase-based indexing in an information retrieval system 1 2004
7,426,507 Automatic taxonomy generation in search results using phrases 67 2004
8,407,239 Multi-stage query processing system and method for use with tokenspace repository 0 2004
* 7,917,480 Document compression system and method for use with tokenspace repository 4 2004
7,702,618 Information retrieval system for archiving multiple document versions 27 2005
7,567,959 Multiple index based information retrieval system 49 2005
* 8,713,418 Adding value to a rendered document 1 2005
* 2008/0141,117 Adding Value to a Rendered Document 143 2005
7,603,345 Detecting spam documents in a phrase based information retrieval system 24 2006
8,166,021 Query phrasification 6 2007
8,166,045 Phrase extraction using subphrase scoring 14 2007
8,086,594 Bifurcated document relevance scoring 7 2007
7,925,655 Query scheduling using hierarchical tiers of index servers 18 2007
7,702,614 Index updating using segment swapping 17 2007
7,693,813 Index server architecture using tiered and sharded phrase posting lists 23 2007
8,117,223 Integrating external related phrase information into a phrase-based indexing information retrieval system 7 2007
8,560,550 Multiple index based information retrieval system 1 2009
8,619,287 System and method for information gathering utilizing form identifiers 0 2009
8,078,629 Detecting spam documents in a phrase based information retrieval system 5 2009
8,090,723 Index server architecture using tiered and sharded phrase posting lists 9 2010
8,612,427 Information retrieval system for archiving multiple document versions 0 2010
8,108,412 Phrase-based detection of duplicate documents in an information retrieval system 4 2010
8,874,504 Processing techniques for visual capture data from a rendered document 0 2010
8,793,162 Adding information or functionality to a rendered document via association with an electronic counterpart 0 2010
8,903,759 Determining actions involving captured information and electronic content associated with rendered documents 0 2010
8,621,349 Publishing techniques for adding value to a rendered document 0 2010
8,619,147 Handheld device for capturing text from both a document printed on paper and a document displayed on a dynamic display device 0 2010
8,620,760 Methods and systems for initiating application processes by data capture from rendered documents 0 2010
8,799,303 Establishing an interactive environment for rendered documents 0 2010
8,321,445 Generating content snippets using a tokenspace repository 2011
8,402,033 Phrase extraction using subphrase scoring 4 2011
8,489,628 Phrase-based detection of duplicate documents in an information retrieval system 1 2011
8,682,901 Index server architecture using tiered and sharded phrase posting lists 2 2011
8,631,027 Integrated external related phrase information into a phrase-based indexing information retrieval system 0 2012
8,600,975 Query phrasification 0 2012
8,799,099 Processing techniques for text capture from a rendered document 0 2012
8,781,228 Triggering actions in response to optically or acoustically capturing keywords from a rendered document 0 2012
8,831,365 Capturing text from rendered documents using supplement information 0 2013
8,943,067 Index server architecture using tiered and sharded phrase posting lists 0 2013
 
KABOODLE, INC. (2)
* 7,630,968 Extracting information from formatted sources 1 2006
* 7,606,797 Reverse value attribute extraction 1 2006
 
VCVC III LLC (11)
7,283,951 Method and system for enhanced data searching 37 2001
7,398,201 Method and system for enhanced data searching 42 2003
7,526,425 Method and system for extending keyword searching to syntactically and semantically annotated data 45 2004
8,856,096 Extending keyword searching to syntactically and semantically annotated data 0 2006
8,594,996 NLP-based entity recognition and disambiguation 1 2008
8,700,604 NLP-based content recommender 1 2008
8,131,540 Method and system for extending keyword searching to syntactically and semantically annotated data 7 2009
8,645,372 Keyword-based search engine results using enhanced query strategies 0 2010
8,645,125 NLP-based systems and methods for providing quotations 0 2011
8,838,633 NLP-based sentiment analysis 0 2011
8,725,739 Category-based content recommendation 0 2011
 
NETBASE SOLUTIONS, INC. (4)
8,055,608 Method and apparatus for concept-based classification of natural language discourse 3 2006
8,046,348 Method and apparatus for concept-based searching of natural language discourse 4 2006
8,935,152 Method and apparatus for frame-based analysis of search results 0 2008
8,949,263 Methods and apparatus for sentiment analysis 0 2012
 
TOPIX LLC (3)
* 8,271,495 System and method for automating categorization and aggregation of content from network sites 1 2004
7,930,647 System and method for selecting pictures for presentation with text content 2 2005
7,814,089 System and method for presenting categorized content on a site using programmatic and manual selection of content items 3 2007
 
YAHOO! INC. (2)
* 8,984,398 Generation of search result abstracts 0 2008
* 2010/0057,710 GENERATION OF SEARCH RESULT ABSTRACTS 2 2008
* Cited By Examiner