05371807 is referenced by 177 patents and cites 9 patents.

A text classification system and method that can be used by an application for classifying natural language text input into a computer system having a domain-specific knowledge base with a plurality of categories. The text classification system classifies natural language input text by first parsing it into a first list of recognized keywords. This list is then used to deduce further facts from the natural language input text, which are compiled into a second list. Next, a numeric similarity score is calculated for each one of the plurality of categories in the knowledge base, indicating how similar that category is to the natural language input text. A dynamic threshold is then applied to determine which ones of the plurality of categories are most similar to the recognized keywords of the natural language input text. A third list is compiled of the categories determined to be most similar to the recognized keywords. An optional rule base can be utilized to further refine the determination of which categories are most similar to the recognized keywords of the natural language input text. Also, an optional learning capability can be added to improve the accuracy of the text classification system.
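
The steps in the abstract (keyword list, deduced facts, similarity scores, dynamic threshold, list of selected categories) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not give the scoring formula or the threshold rule, so the sketch assumes a keyword-overlap score and a cutoff set to a fraction of the best score. The knowledge base contents, the inference table, and all function names are hypothetical, and the optional rule base and learning capability are left out.

```python
from typing import Dict, List, Set

# Hypothetical knowledge base: category name -> keywords describing it.
KNOWLEDGE_BASE: Dict[str, Set[str]] = {
    "printer": {"printer", "paper", "toner", "jam", "hardware"},
    "network": {"network", "connection", "router", "timeout", "hardware"},
    "email": {"email", "mailbox", "attachment", "software"},
}

# Hypothetical inference table: a recognized keyword implies further facts.
INFERENCES: Dict[str, Set[str]] = {
    "jam": {"paper", "hardware"},
    "router": {"network", "hardware"},
    "attachment": {"email", "software"},
}


def parse_keywords(text: str, vocabulary: Set[str]) -> List[str]:
    """First list: recognized keywords, in order of first appearance."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    keywords: List[str] = []
    for token in tokens:
        if token in vocabulary and token not in keywords:
            keywords.append(token)
    return keywords


def deduce_facts(keywords: List[str]) -> List[str]:
    """Second list: facts implied by the keywords but not already present."""
    facts: List[str] = []
    for keyword in keywords:
        for fact in sorted(INFERENCES.get(keyword, set())):
            if fact not in keywords and fact not in facts:
                facts.append(fact)
    return facts


def score_categories(terms: List[str]) -> Dict[str, float]:
    """Assumed score: fraction of a category's keywords matched by the terms."""
    term_set = set(terms)
    return {
        name: len(term_set & keywords) / len(keywords)
        for name, keywords in KNOWLEDGE_BASE.items()
    }


def apply_dynamic_threshold(scores: Dict[str, float], ratio: float = 0.5) -> List[str]:
    """Third list: categories scoring at least `ratio` times the best score.

    The cutoff depends on the scores for this input, which is the assumed
    sense of "dynamic" here.
    """
    best = max(scores.values(), default=0.0)
    if best == 0.0:
        return []
    cutoff = best * ratio
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, score in ranked if score >= cutoff]


def classify(text: str) -> List[str]:
    """Run the full pipeline on one piece of input text."""
    vocabulary = set().union(*KNOWLEDGE_BASE.values())
    keywords = parse_keywords(text, vocabulary)
    facts = deduce_facts(keywords)
    scores = score_categories(keywords + facts)
    return apply_dynamic_threshold(scores)


print(classify("The printer has a paper jam."))
```

In this sketch the deduced fact "hardware" raises the score of the "printer" category, while the relative cutoff drops "network", which matches only on that shared fact.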

Title
Method and apparatus for text classification
Application Number
7/855378
Publication Number
5371807
Application Date
March 20, 1992
Publication Date
December 6, 1994
Inventor
Narasimhan Kannan
Colorado Springs
CO, US
Michael S Register
Colorado Springs
CO, US
Agent
Kenyon & Kenyon
Assignee
Digital Equipment Corporation
MA, US
IPC
G06F 15/38
G06K 9/72