Learning word segmentation from non-white space languages corpora

United States of America Patent

PATENT NO 8165869
APP PUB NO 20090150145A1
SERIAL NO 11953635

Abstract

Illustrative embodiments provide a computer implemented method, apparatus, and computer program product for learning word segmentation from non-white space language corpora. In one illustrative embodiment, the computer implemented method receives text input characters and calculates a ratio-measure for each pair of characters in the input characters. The computer implemented method further determines whether the ratio-measure of each pair of characters is less than a predetermined threshold value. Responsive to determining that the ratio-measure is less than the predetermined threshold value and is a local-minimum value, the computer implemented method further identifies the pair as a weak pair and breaks the weak pair of characters.
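The steps in the abstract can be illustrated with a minimal sketch. The abstract does not define the ratio-measure, so the formula below (pair count divided by the smaller of the two character counts) is purely an assumption chosen for illustration, as are the function names, the default threshold of 0.5, and the treatment of ties as local minima. It is not the patented method's actual definition.

```python
from collections import Counter


def train_counts(corpus):
    """Count single characters and adjacent character pairs in unsegmented lines."""
    unigrams, bigrams = Counter(), Counter()
    for line in corpus:
        unigrams.update(line)
        bigrams.update(zip(line, line[1:]))
    return unigrams, bigrams


def ratio_measure(a, b, unigrams, bigrams):
    # ASSUMPTION: the source does not give the formula. Here the measure is
    # count(ab) / min(count(a), count(b)); unseen characters score 0.0.
    denom = min(unigrams[a], unigrams[b])
    return bigrams[(a, b)] / denom if denom else 0.0


def segment(text, unigrams, bigrams, threshold=0.5):
    """Break text at every weak pair: a pair whose score is below the
    threshold and is a local minimum among its neighbouring pairs."""
    if len(text) < 2:
        return [text] if text else []
    scores = [ratio_measure(a, b, unigrams, bigrams)
              for a, b in zip(text, text[1:])]
    words, start = [], 0
    for i, score in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("inf")
        # ASSUMPTION: ties with a neighbour still count as a local minimum.
        is_local_min = score <= left and score <= right
        if score < threshold and is_local_min:
            words.append(text[start:i + 1])  # break between i and i + 1
            start = i + 1
    words.append(text[start:])
    return words


unigrams, bigrams = train_counts(["abcd", "cdab", "abab", "cdcd"])
print(segment("abcdab", unigrams, bigrams))
```

In this toy corpus the pairs "ab" and "cd" each score 1.0 while "bc" and "da" score 0.25, so the input "abcdab" is broken into "ab", "cd", "ab".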

Patent Owner(s)

Patent Owner                                   Address
INTERNATIONAL BUSINESS MACHINES CORPORATION    NEW ORCHARD ROAD ARMONK NY 10504

Inventor(s)

Inventor Name                  Address              # of Filed Patents    Total Citations
Cohen, Daniel                  Har'ei Yehuda, IL    183                   2404
Dayan, Yigal Shai              Jerusalem, IL        5                     28
Magdalon, Josemina Marcolla    Jerusalem, IL        1                     0
Mazel, Victoria                Jerusalem, IL        11                    63
