Compressed yet quickly searchable digital textual data format

United States of America Patent

APP PUB NO: 20040225497A1
SERIAL NO: 10429326

Abstract

A data processing method is disclosed for storing and retrieving text. Through iterative tokenization of the text and of the resulting tokens, the method compresses significantly more efficiently than the prior art without having to compress the token dictionary. Because the token dictionary remains uncompressed, searches and decompression of the tokenized text are faster. To speed up searches further, an index with a given text resolution is created for each unique word and added as an additional column element in the alphabetized word table. Because the tokenized text is populated with tokens that themselves consist of multiple tokens, these are parsed into tokens that represent unique words before a search for a word or phrase is conducted. In a relatively large text such as a Bible, there can be a large number of such multi-token tokens, and parsing them can take a fair amount of time. The method therefore includes a step of creating an additional index, which is added as a further column element in the alphabetized word table. The resulting invention enables high levels of compression and faster searches of text in documents.
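The abstract's ideas can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how token pairs are chosen or how the index is laid out, so the sketch assumes a most-frequent-adjacent-pair merge rule and an index that records, for each word, the fixed-size blocks of text in which it occurs. All function names and the `block_size` and `min_count` parameters are hypothetical.

```python
import bisect
from collections import Counter

def tokenize(text):
    """Build an alphabetized table of unique words and the word-level token list."""
    seq = text.split()
    words = sorted(set(seq))                      # alphabetized word table
    word_id = {w: i for i, w in enumerate(words)}
    return words, [word_id[w] for w in seq]

def compress(tokens, first_new_id, min_count=2):
    """Iteratively replace the most frequent adjacent token pair with a new token.

    Assumption: greedy pair merging; the abstract only says 'iterative'.
    The pair dictionary is kept uncompressed.
    """
    pairs, next_id = {}, first_new_id
    while True:
        counts = Counter(zip(tokens, tokens[1:]))
        if not counts:
            break
        pair, n = counts.most_common(1)[0]
        if n < min_count:
            break
        out, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
                out.append(next_id)
                i += 2
            else:
                out.append(tokens[i])
                i += 1
        pairs[next_id] = pair
        tokens, next_id = out, next_id + 1
    return tokens, pairs

def expand(token, pairs):
    """Parse a multi-token token down to the word-level tokens it stands for."""
    stack, out = [token], []
    while stack:
        t = stack.pop()
        if t in pairs:
            left, right = pairs[t]
            stack.extend((right, left))           # left is popped first
        else:
            out.append(t)
    return out

def decompress(tokens, pairs, words):
    """Recover the text (single-space separated) from the tokenized form."""
    return " ".join(words[w] for t in tokens for w in expand(t, pairs))

def build_index(tokens, pairs, n_words, block_size):
    """Per-word index column: the set of blocks of block_size words containing it."""
    index = [set() for _ in range(n_words)]
    pos = 0
    for t in tokens:
        for w in expand(t, pairs):
            index[w].add(pos // block_size)
            pos += 1
    return index

def search(word, words, index):
    """Binary-search the alphabetized word table, then read the index column."""
    i = bisect.bisect_left(words, word)
    if i < len(words) and words[i] == word:
        return sorted(index[i])
    return []
```

In this sketch a lookup touches only the word table and its index column, so multi-token tokens are expanded once, when the index is built, instead of on every search. A smaller `block_size` gives a finer text resolution at the cost of a larger index.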

Patent Owner(s)

Patent Owner: CALLAHAN JAMES PATRICK
Address: Not Provided

Inventor(s)

Inventor Name: Callahan, James Patrick
Address: Santa Clara, CA
# of filed Patents: 1
Total Citations: 25
