System and method for building diverse language models

Number of patents in Portfolio can not be more than 2000

United States of America Patent

PATENT NO 9081760
SERIAL NO

13042890

Stats

ATTORNEY / AGENT: (SPONSORED)

Importance

Loading Importance Indicators... loading....

Abstract

See full text

Disclosed herein are systems, methods, and non-transitory computer-readable storage media for collecting web data in order to create diverse language models. A system configured to practice the method first crawls, such as via a crawler operating on a computing device, a set of documents in a network of interconnected devices according to a visitation policy, wherein the visitation policy is configured to focus on novelty regions for a current language model built from previous crawling cycles by crawling documents whose vocabulary considered likely to fill gaps in the current language model. A language model from a previous cycle can be used to guide the creation of a language model in the following cycle. The novelty regions can include documents with high perplexity values over the current language model.

Loading the Abstract Image... loading....

First Claim

See full text

Family

Loading Family data... loading....

Patent Owner(s)

  • NUANCE COMMUNICATIONS, INC.

International Classification(s)

Inventor(s)

Inventor Name Address # of filed Patents Total Citations
Bangalore, Srinivas Morristown, US 176 5179
Barbosa, Luciano De Andrade Madison, US 15 735

Cited Art Landscape

Load Citation

Patent Citation Ranking

Forward Cite Landscape

Load Citation

Maintenance Fees

Fee Large entity fee small entity fee micro entity fee due date
11.5 Year Payment $7400.00 $3700.00 $1850.00 Jan 14, 2027
Fee Large entity fee small entity fee micro entity fee
Surcharge - 11.5 year - Late payment within 6 months $160.00 $80.00 $40.00
Surcharge after expiration - Late payment is unavoidable $700.00 $350.00 $175.00
Surcharge after expiration - Late payment is unintentional $1,640.00 $820.00 $410.00