Method of identifying the language of a textual passage using short word and/or n-gram comparisons

United States of America Patent

PATENT NO 7359851
APP PUB NO 20050154578A1
SERIAL NO 10757313

Abstract

A method and system for identifying the language of a textual passage are disclosed. The method and system include parsing the textual passage into n-grams and assigning an initial weight to each n-gram, and adjusting the weight initially assigned to a word or n-gram parsed from the textual passage. The initially assigned weight is adjusted in proportion to the inverse of the number of languages in which such words or n-grams appear. Reducing the weight assigned to such words or n-grams diminishes, without completely eliminating, their importance relative to other words or n-grams parsed from the same textual passage when determining the language of the passage. The method and system of the present invention appropriately weight the short words or n-grams common to multiple languages without affecting the short words or n-grams that are not common to several languages.
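The weighting idea in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the toy `PROFILES` table, the use of relative frequency as the initial weight, and the function names `language_count`, `adjusted_weight`, `score_languages`, and `identify_language` are all assumptions made for illustration. The sketch works on whitespace-separated short words; the same adjustment would apply to n-grams.

```python
from collections import Counter

# Hypothetical toy profiles (assumption): short word -> relative frequency
# of that word in the language. Real profiles would come from corpora.
PROFILES = {
    "english": {"the": 0.06, "and": 0.03, "in": 0.02, "a": 0.02},
    "french": {"le": 0.05, "et": 0.03, "en": 0.02, "a": 0.01},
    "spanish": {"el": 0.05, "y": 0.03, "en": 0.03, "a": 0.03},
}


def language_count(token, profiles):
    """Number of language profiles in which the token appears."""
    return sum(1 for vocab in profiles.values() if token in vocab)


def adjusted_weight(initial_weight, n_languages):
    """Scale the initial weight by the inverse of the language count."""
    return initial_weight / n_languages


def score_languages(passage, profiles):
    """Sum the adjusted weights of the passage's tokens for each language."""
    counts = Counter(passage.lower().split())
    scores = dict.fromkeys(profiles, 0.0)
    for token, count in counts.items():
        k = language_count(token, profiles)
        if k == 0:
            continue  # token unknown to every profile: contributes nothing
        for lang, vocab in profiles.items():
            if token in vocab:
                # Initial weight (assumed here to be the profile frequency)
                # is reduced for shared tokens but never drops to zero.
                scores[lang] += count * adjusted_weight(vocab[token], k)
    return scores


def identify_language(passage, profiles):
    """Return the language with the highest total adjusted weight."""
    scores = score_languages(passage, profiles)
    return max(scores, key=scores.get)
```

In this sketch a word such as "a", which appears in all three toy profiles, keeps one third of its initial weight, while a word found in only one profile keeps its full weight, so shared short words still count but cannot dominate the decision.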

Patent Owner(s)

Patent Owner                    Address
JUSTSYSTEMS EVANS RESEARCH INC  5001 BAUM BOULEVARD SUITE 700 PITTSBURGH PA 15213-1854

Inventor(s)

Inventor Name            Address         # of filed Patents  Total Citations
Evans, David A           Pittsburgh, PA  101                 3351
Grefenstette, Gregory T  Gieres, FR      10                  3004
Tong, Xiang              Beaverton, OR   4                   425
