1
Jan O Pedersen, Per Kristian Halvorsen, Douglass R Cutting, John W Tukey, Eric A Bier, Daniel G Bobrow: Iterative technique for phrase query formation and an information retrieval system employing same. Xerox Corporation, Oliff & Berridge, January 11, 1994: US05278980 (435 worldwide citations)

An information retrieval system and method are provided in which an operator inputs one or more query words which are used to determine a search key for searching through a corpus of documents, and which returns any matches between the search key and the corpus of documents as a phrase containing th ...


2
Frank Zdybel Jr, Henry W Sang Jr, Jan O Pedersen, Z E Smith III, D A Henderson Jr, David L Hecht, Dan S Bloomberg: Hardcopy lossless data storage and communications for electronic document processing systems. Xerox Corporation, January 23, 1996: US05486686 (308 worldwide citations)

Machine readable electronic domain definitions of part or all of the electronic domain descriptions of hardcopy documents and/or of part or all of the transforms that are performed to produce and reproduce such hardcopy documents are encoded in codes that are printed on such documents, thereby per ...


3
Jan O Pedersen, David Karger, Douglass R Cutting, John W Tukey: Scatter-gather: a cluster-based method and apparatus for browsing large document collections. Xerox Corporation, Oliff & Berridge, August 15, 1995: US05442778 (183 worldwide citations)

Scatter-Gather is a computer based document browsing method which operates in time proportional to a number of documents in a target corpus. The Scatter-Gather method includes: preparing an initial ordering of the corpus using, for example, an off-line computational method; determining a summary of ...


4
John W Tukey, Jan O Pedersen: Method and apparatus for information access employing overlapping clusters. Xerox Corporation, December 7, 1999: US05999927 (116 worldwide citations)

The present invention is a method and apparatus for document clustering-based browsing of a corpus of documents, and more particularly to the use of overlapping clusters to improve recall. The present invention is directed to improving the performance of information access methods and apparatus thro ...


5
Jan O Pedersen, John W Tukey: Method and apparatus for automatic document summarization. Xerox Corporation, Oliff & Berridge, June 10, 1997: US05638543 (113 worldwide citations)

Regions of a document such as sentences and blocks of sentences are scored and classified based upon their scores. An abstract of the document can be formed from the classified sentences. Sentences are classified by the use of words classified as stop words and vanish words. Sentences are scored bas ...


6
Julian M Kupiec, Jan O Pedersen, Francine R Chen, Daniel C Brotsky, Steven B Putz: Automatic method of generating feature probabilities for automatic extracting summarization. Xerox Corporation, Tracy L Hurt, July 7, 1998: US05778397 (110 worldwide citations)

A method of automatically generating feature probabilities that allow later automatic generation of document extracts. The computer system generates the probabilities by analyzing the documents one at a time. First, the computer system designates one of the documents as a selected document. N ...


7
John W Tukey, Jan O Pedersen: Method and apparatus for information accesss employing overlapping clusters. Xerox Corporation, Duane C Basch, July 28, 1998: US05787422 (78 worldwide citations)

The present invention is a method and apparatus for document clustering-based browsing of a corpus of documents, and more particularly to the use of overlapping clusters to improve recall. The present invention is directed to improving the performance of information access methods and apparatus thro ...


8
Julian M Kupiec, Jan O Pedersen, Francine R Chen, Daniel C Brotsky, Steven B Putz: Automatic method of extracting summarization using feature probabilities. Xerox Corporation, Tracy L Hurt, June 29, 1999: US05918240 (47 worldwide citations)

A method of automatically generating document extracts. The method makes use of feature value probabilities generated from a statistical analysis of manually generated summaries to extract the same set of sentences an expert might. The method is based upon an iterative approach. First, the computer ...


9
Douglass R Cutting, Per Kristian G Halvorsen, Ronald M Kaplan, Lauri Karttunen, Martin Kay, Jan O Pedersen: Finite-state transduction of related word forms for text indexing and retrieval. Xerox Corporation, Rosen Dainow & Jacobs Liability Partnership, April 29, 1997: US05625554 (37 worldwide citations)

The present invention solves a number of problems in using stems (canonical indicators of word meanings) in full-text retrieval of natural language documents, and thus permits recall to be improved without sacrificing precision. It uses various arrangements of finite-state transducers to accurately ...


10
John W Tukey, Jan O Pedersen: Method of ordering document clusters without requiring knowledge of user interests. Xerox Corporation, Tracy L Hurt, July 28, 1998: US05787420 (30 worldwide citations)

A computerized method of ordering document clusters for presentation after browsing a corpus of documents that presents document clusters in a logical fashion in the absence of any indication of the computer user's interests. The method begins by grouping the corpus into a plurality of clusters, eac ...