Document compression system and method for use with tokenspace repository

United States of America Patent

PATENT NO 7917480
SERIAL NO 10917739

Abstract

The disclosed embodiments enable multi-stage query scoring, including “snippet” generation, through incremental document reconstruction facilitated by a multi-tiered mapping scheme. The mapping scheme includes a first mapping between unique tokens contained in a set of documents and unique global token identifiers (e.g., 32-bit integers) contained in a global-lexicon (i.e., dictionary). The mapping scheme also includes a second mapping between the global token identifiers and a set of fixed-length local token identifiers (e.g., 8-bit integers) contained in one or more mini-lexicons (i.e., sub-dictionaries). Each mini-lexicon is associated with a range of token positions in the tokenized documents. The first and second mappings are used to encode/decode documents into local token identifiers having fixed widths which can be compactly stored in the tokenspace repository. The use of fixed-length local token identifiers allows for fast and efficient decoding of tokenized documents.
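The two-tier mapping described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `MiniLexicon`, `encode`, and `decode` are hypothetical, and the policy of starting a new mini-lexicon once 256 distinct tokens have been seen is an assumption (the abstract only states that each mini-lexicon covers a range of token positions). The 8-bit local identifier width follows the abstract's example.

```python
import bisect
from dataclasses import dataclass, field

LOCAL_ID_LIMIT = 256  # 8-bit local identifiers, per the abstract's example


@dataclass
class MiniLexicon:
    """Hypothetical sub-dictionary covering token positions from `start` onward."""
    start: int
    global_ids: list = field(default_factory=list)  # index = local id, value = global id


def encode(tokens, global_lexicon):
    """Encode tokens as one byte each; returns (bytes, list of MiniLexicon).

    `global_lexicon` maps token -> global id and is extended in place.
    Assumption: a new mini-lexicon starts when the current one is full.
    """
    minis = []
    local_ids = bytearray()
    local_of = {}  # global id -> local id within the current mini-lexicon
    for pos, tok in enumerate(tokens):
        # First mapping: token -> global token identifier.
        gid = global_lexicon.setdefault(tok, len(global_lexicon))
        if not minis or (gid not in local_of and len(local_of) == LOCAL_ID_LIMIT):
            minis.append(MiniLexicon(start=pos))
            local_of = {}
        # Second mapping: global identifier -> fixed-width local identifier.
        if gid not in local_of:
            local_of[gid] = len(local_of)
            minis[-1].global_ids.append(gid)
        local_ids.append(local_of[gid])
    return bytes(local_ids), minis


def decode(local_ids, minis, global_lexicon, start, end):
    """Reconstruct tokens at positions [start, end) without decoding the rest."""
    id_to_token = {gid: tok for tok, gid in global_lexicon.items()}
    starts = [m.start for m in minis]
    out = []
    for pos in range(start, end):
        # Find the mini-lexicon whose position range contains `pos`.
        mini = minis[bisect.bisect_right(starts, pos) - 1]
        out.append(id_to_token[mini.global_ids[local_ids[pos]]])
    return out
```

Because every local identifier occupies exactly one byte, the token at position `p` is found at byte offset `p`, so a short span (for example, a snippet) can be decoded directly, as in `decode(encoded, minis, lexicon, 1, 4)`.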

Patent Owner(s)

Patent Owner    Address
GOOGLE LLC      1600 AMPHITHEATRE PARKWAY MOUNTAIN VIEW CA 94043

Inventor(s)

Inventor Name               Address              # of Filed Patents   Total Citations
Dean, Jeffrey               Palo Alto, US        76                   3734
Ghemawat, Sanjay            Mountain View, US    118                  3845
Gomes, Benedict Anthony     Mountain View, US    5                    134
Sercinoglu, Olcan           Mountain View, US    28                   1027
Thambidorai, Gautham K      Sunnyvale, US        2                    52
