A computer-readable medium comprises a data structure for providing information about levels of similarity between pairs of N documents. The data structure comprises a plurality of entries of similarity values representing levels of similarity for a plurality of pairs of the documents. Each of the similarity values represents a level of similarity of one document of a given pair relative to the other document of the given pair. The similarity value of each entry is greater than a threshold similarity value that is greater than zero. The plurality of similarity-value entries are fewer than N² − N in number if the similarity values are asymmetric with regard to document pairing, and fewer than (N² − N)/2 in number if the similarity values are symmetric with regard to document pairing. A method and apparatus for generating the data structure are described.
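
The thresholded structure described in the abstract can be illustrated with a minimal Python sketch for the symmetric case. The Jaccard measure on word sets, the function names `jaccard` and `build_sparse_similarity`, the sample documents, and the threshold of 0.3 are all assumptions chosen for illustration; the abstract does not specify a similarity measure or a storage layout. The sketch also scores every pair before filtering, which a practical generator would likely avoid.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Symmetric stand-in similarity measure (an assumption, not from the source)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_sparse_similarity(docs, threshold):
    """Return {(i, j): similarity} with i < j, keeping only values above threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be greater than zero")
    token_sets = [set(d.lower().split()) for d in docs]
    entries = {}
    # Symmetric case: each unordered pair is examined once, so at most
    # (N^2 - N) / 2 entries could exist before thresholding.
    for i, j in combinations(range(len(docs)), 2):
        s = jaccard(token_sets[i], token_sets[j])
        if s > threshold:
            entries[(i, j)] = s
    return entries


# Hypothetical sample documents.
docs = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "stock prices fell sharply today",
    "stock prices rose sharply today",
]
entries = build_sparse_similarity(docs, threshold=0.3)
n = len(docs)
full_symmetric = (n * n - n) // 2  # size of a full symmetric table
print(len(entries), "entries stored out of", full_symmetric, "possible pairs")
```

Pairs that fall at or below the threshold are simply absent from the dictionary, which is what keeps the number of entries below the (N² − N)/2 bound for a symmetric measure. An asymmetric measure would instead key on ordered pairs, with N² − N as the corresponding bound.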

Title
Method and apparatus for constructing a compact similarity structure and for using the same in analyzing document relevance
Application Number
11/298500
Publication Number
20070136336
Application Date
December 12, 2005
Publication Date
June 14, 2007
Inventor
David A Evans
Pittsburgh
PA, US
Norbert Roma
Pittsburgh
PA, US
James G Shanahan
San Francisco
CA, US
Agent
Jones Day
NY, US
Assignee
Clairvoyance Corporation
PA, US
IPC
G06F 07/00