Patent 5,467,425 is referenced by 97 patents and cites 3 patents.

The present invention is an n-gram language modeler which significantly reduces the memory storage requirement and convergence time for language modeling systems and methods. The present invention aligns each n-gram with one of n non-intersecting classes. A count is determined for each n-gram, representing the number of times that n-gram occurred in the training data. The n-grams are separated into classes, and complement counts are determined. Using these counts and complement counts, factors are determined, one factor for each class, using an iterative scaling algorithm. The language model probability, i.e., the probability that a word occurs given the occurrence of the previous two words, is determined from these factors.
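
The abstract only outlines the method, so the following is a minimal illustrative sketch in Python rather than the patented algorithm itself: trigram events are partitioned into three non-intersecting classes by a count threshold, one multiplicative factor per class is fitted with a generalized-iterative-scaling update so that the model's expected class counts match the counts observed in training, and the conditional probability P(w3 | w1, w2) is the normalized product of a class factor and a unigram reference distribution. The toy corpus, the threshold of 2, the unigram base distribution, and the omission of the complement-count bookkeeping are all simplifying assumptions of this sketch.

    from collections import Counter

    # Toy training data (an assumption of this sketch).
    corpus = "the cat sat on the mat the cat sat on the rug the dog sat on the mat".split()
    vocab = sorted(set(corpus))

    # n-gram counts observed in the training data.
    tri = Counter(zip(corpus, corpus[1:], corpus[2:]))
    bi = Counter(zip(corpus, corpus[1:]))
    uni = Counter(corpus)

    def event_class(w1, w2, w3):
        # Assign each (history, word) event to exactly one non-intersecting class:
        # the highest-order class whose n-gram count reaches the threshold (here 2).
        if tri[(w1, w2, w3)] >= 2:
            return 2   # trigram class
        if bi[(w2, w3)] >= 2:
            return 1   # bigram class
        return 0       # unigram class

    # Reference distribution: unigram relative frequencies (an assumption of this sketch).
    base = {w: uni[w] / len(corpus) for w in vocab}

    factors = [1.0, 1.0, 1.0]   # one multiplicative factor per class

    def conditional(w1, w2):
        # P(. | w1, w2): factored scores normalized over the vocabulary.
        scores = {w: factors[event_class(w1, w2, w)] * base[w] for w in vocab}
        z = sum(scores.values())
        return {w: s / z for w, s in scores.items()}

    events = list(zip(corpus, corpus[1:], corpus[2:]))   # training trigram tokens

    for _ in range(50):   # iterative scaling loop
        observed = [0.0, 0.0, 0.0]
        expected = [0.0, 0.0, 0.0]
        for w1, w2, w3 in events:
            observed[event_class(w1, w2, w3)] += 1.0
            p = conditional(w1, w2)
            for w in vocab:
                expected[event_class(w1, w2, w)] += p[w]
        # Scale each class factor by the ratio of observed to expected class count.
        factors = [f * o / e if e > 0 else f for f, o, e in zip(factors, observed, expected)]

    # Fitted conditional distribution for the history "the cat".
    print({w: round(p, 3) for w, p in conditional("the", "cat").items()})
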

Title
Building scalable N-gram language models using maximum likelihood maximum entropy N-gram models
Application Number
08/023,543
Publication Number
5467425
Application Date
February 26, 1993
Publication Date
November 14, 1995
Inventor
Salim Roukos
Scarsdale
NY, US
Ronald Rosenfeld
Pittsburgh
PA, US
Raymond Lau
Cambridge
MA, US
Agent
Sterne Kessler Goldstein & Fox
Agent
Robert Tasinari
Assignee
International Business Machines Corporation
NY, US
IPC
G10L 9/00