06782357 is referenced by 18 patents.

Cluster- and pruning-based language model compression is disclosed. In one embodiment, a language model is first clustered, such as by using predictive clustering. The language model after clustering has a larger size than it did before clustering. The language model is then pruned, for example by using an entropy-based technique such as Stolcke pruning, or by using Rosenfeld pruning or count-cutoff techniques. In one particular embodiment, a word language model is first predictively clustered by a technique described as P(Z|xy)×P(z|xyZ), where a lower-case letter refers to a word, and an upper-case letter refers to the cluster in which the word resides.
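The decomposition P(Z|xy)×P(z|xyZ) and the pruning step can be illustrated with a minimal sketch. It is not the patented implementation: the toy corpus, the hand-written word-to-cluster map, and the function names `train`, `prob`, and `count_cutoff` are all assumptions made for illustration, the estimates are unsmoothed maximum-likelihood ratios, and only count-cutoff pruning (the simplest of the techniques named above) is shown.

```python
from collections import Counter

# Hypothetical toy data: the corpus and the cluster assignments are illustrative only.
CORPUS = ("on tuesday we met on friday we met "
          "on monday we left on tuesday we left").split()
CLUSTER = {"monday": "DAY", "tuesday": "DAY", "friday": "DAY",
           "on": "PREP", "we": "PRON", "met": "VERB", "left": "VERB"}


def train(tokens, cluster):
    """Collect the counts behind the two component models."""
    ctx = Counter()          # count(x, y)
    ctx_cluster = Counter()  # count(x, y, Z)    -> parameters of P(Z|xy)
    ctx_word = Counter()     # count(x, y, Z, z) -> parameters of P(z|xyZ)
    for x, y, z in zip(tokens, tokens[1:], tokens[2:]):
        Z = cluster[z]
        ctx[(x, y)] += 1
        ctx_cluster[(x, y, Z)] += 1
        ctx_word[(x, y, Z, z)] += 1
    return ctx, ctx_cluster, ctx_word


def prob(x, y, z, cluster, counts):
    """Unsmoothed estimate of P(z|xy) as P(Z|xy) * P(z|xyZ)."""
    ctx, ctx_cluster, ctx_word = counts
    Z = cluster[z]
    if ctx_cluster[(x, y, Z)] == 0:
        return 0.0  # a real model would back off to a lower-order estimate here
    p_cluster = ctx_cluster[(x, y, Z)] / ctx[(x, y)]
    p_word = ctx_word[(x, y, Z, z)] / ctx_cluster[(x, y, Z)]
    return p_cluster * p_word


def count_cutoff(table, cutoff):
    """Count-cutoff pruning: keep only entries seen more than `cutoff` times."""
    return Counter({key: c for key, c in table.items() if c > cutoff})


counts = train(CORPUS, CLUSTER)
_, cluster_table, word_table = counts
size_clustered = len(cluster_table) + len(word_table)
size_pruned = len(count_cutoff(cluster_table, 1)) + len(count_cutoff(word_table, 1))
```

On this toy corpus the clustered model stores two tables where a plain word trigram model stores one, so `size_clustered` exceeds the number of distinct word trigrams, which mirrors the abstract's remark that the model grows after clustering. Applying `count_cutoff` to both tables then shrinks it again.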

Title
Cluster and pruning-based language model compression
Application Number
09/565608
Publication Number
6782357 (B1)
Application Date
May 4, 2000
Publication Date
August 24, 2004
Inventor
Jianfeng Gao
Beijing
CN
Joshua Goodman
Redmond
WA, US
Agent
Westman Champlin & Kelly P A
US
Agent
Theodore M Magee
US
Assignee
Microsoft Corporation
WA, US
IPC
G06F 17/27