1
Dmitriy Meyerzon, Sankrant Sanu: Method of web crawling utilizing crawl numbers. Microsoft Corporation, Christensen O'Connor Johnson Kindness PLLC, October 28, 2003: US06638314 (182 worldwide citations)

A computer based system and method of retrieving information pertaining to electronic documents on a computer network is disclosed. The method includes maintaining a database that associates each electronic document with a corresponding crawl number that indicates the most recent crawl during which ...


2
Dmitriy Meyerzon, Srikanth Shoroff, F Soner Terek, Scott Norin: Method and system for detecting duplicate documents in web crawls. Microsoft Corporation, Woodcock Washburn, April 15, 2003: US06547829 (147 worldwide citations)

A Web crawler application takes advantage of a document store's ability to provide a content identifier (CID) having a value that is a unique function of the physical storage location of a data object or document, such as a Web page. In operation, the crawler first tries to fetch the CID for a ...
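
The CID check described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical in-memory store whose `get_cid` method returns the storage-location key as a stand-in for a real content identifier; the class, method, and function names are illustrative and do not come from the patent.

```python
class InMemoryStore:
    """Hypothetical document store in which several URLs may share one stored object."""

    def __init__(self, objects, url_to_location):
        self._objects = objects                  # storage location -> content
        self._url_to_location = url_to_location  # URL -> storage location

    def get_cid(self, url):
        # Assumption: the CID is modeled as the storage location key itself.
        return self._url_to_location[url]

    def fetch(self, url):
        return self._objects[self._url_to_location[url]]


def crawl(store, urls):
    """Fetch each URL once per CID; record later URLs with a seen CID as duplicates."""
    seen_cids = set()
    indexed = {}
    duplicates = []
    for url in urls:
        cid = store.get_cid(url)  # cheap lookup before any content fetch
        if cid in seen_cids:
            duplicates.append(url)
            continue
        seen_cids.add(cid)
        indexed[url] = store.fetch(url)
    return indexed, duplicates
```

In this sketch the content is fetched only when the CID has not been seen before, so a duplicate costs one identifier lookup rather than a full retrieval.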


3
Dmitriy Meyerzon, Sankrant Sanu: Synchronizing crawler with notification source. Microsoft Corporation, Christensen O'Connor Johnson Kindness PLLC, July 23, 2002: US06424966 (143 worldwide citations)

A method and system for the processing and maintenance of electronic information retrieved from electronic documents stored on a computer network. The gatherer program of the present invention employs a crawler to crawl a portion of the computer network to retrieve electronic documents found during ...


4
Dmitriy Meyerzon, William G Nichols: Automatic tagging of documents and exclusion by content. Microsoft Corporation, Christensen O'Connor Johnson Kindness PLLC, March 6, 2001: US06199081 (139 worldwide citations)

A computer-based method and system for processing data obtained from documents retrieved from a computer network during a gathering project is disclosed. Plugging in modular active and consumer plug-ins into the gathering project configures the information processing capability of the gathering proc ...


5
Sankrant Sanu, Dmitriy Meyerzon: Method of web crawling utilizing address mapping. Microsoft Corporation, Christensen O'Connor Johnson Kindness PLLC, November 7, 2000: US06145003 (134 worldwide citations)

A computer-based system and method of retrieving information pertaining to Web documents on a computer network is disclosed. The method includes maintaining an address map that associates primary addresses with secondary addresses. A primary address includes a network retrieval protocol and a networ ...


6
Dmitriy Meyerzon, Srikanth Shoroff, F Soner Terek, Sankrant Sanu: Method and system for incremental web crawling. Microsoft Corporation, Woodcock Washburn, October 7, 2003: US06631369 (115 worldwide citations)

A Web crawler creates an index of documents in a document store on a computer network. In an initial crawl, the crawler creates a first full index for the document store. The first full crawl is based on a set of predefined “seed” URLs and crawl restrictions, and involves recursively retrieving each ...
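
The initial full crawl from seed URLs under crawl restrictions can be sketched as a graph traversal. This is an illustration only: it uses an explicit queue (breadth-first) rather than recursion, and the `get_links` and `allowed` callables are hypothetical stand-ins for link extraction and the crawl restrictions, neither of which is specified in the visible part of the abstract.

```python
from collections import deque


def full_crawl(seeds, get_links, allowed):
    """Visit every URL reachable from the seeds that passes the restriction check."""
    seen = set()
    queue = deque()
    for seed in seeds:
        if allowed(seed) and seed not in seen:
            seen.add(seed)
            queue.append(seed)

    visited = []  # visit order; a real crawler would index each document here
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            # Skip links already queued and links outside the crawl restrictions.
            if link not in seen and allowed(link):
                seen.add(link)
                queue.append(link)
    return visited
```

The `seen` set guarantees that each document is retrieved at most once even when pages link back to one another.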


7
Kenji C Obata, Dmitriy Meyerzon: Adaptive web crawling using a statistical model. Microsoft Corporation, Christensen O'Connor Johnson Kindness PLLC, February 5, 2008: US07328401 (65 worldwide citations)

A computer based system and method of retrieving information pertaining to documents on a computer network is disclosed. The method includes selecting a set of documents to be accessed during a Web crawl by utilizing a statistical model to determine which previously retrieved documents are most like ...


8
Kyle Peltonen, Dmitriy Meyerzon: Scoping queries in a search engine. Microsoft Corporation, Workman Nydegger, May 24, 2005: US06898592 (57 worldwide citations)

Systems and methods for scoping a search. When a content index for electronic data is built, one or more scope restrictions are included in the content index. The scope restriction may be, for example, a root folder identifier, a mailbox identifier, or a URL. Because the scope restriction is include ...
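
One way to picture a scope restriction stored inside the content index is a toy inverted index in which each document is posted under its terms and also under a scope key. The `ScopedIndex` class and its key layout are assumptions made for illustration, not the structure claimed in the patent.

```python
from collections import defaultdict


class ScopedIndex:
    """Toy inverted index that stores scope keys alongside term postings."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text, scope):
        for term in text.lower().split():
            self._postings[("term", term)].add(doc_id)
        # The scope (e.g. a root folder, mailbox, or URL) is indexed like a term.
        self._postings[("scope", scope)].add(doc_id)

    def search(self, term, scope=None):
        hits = set(self._postings.get(("term", term.lower()), set()))
        if scope is not None:
            # Scoping is a set intersection inside the index, not a post-filter.
            hits &= self._postings.get(("scope", scope), set())
        return hits
```

Because the scope is just another posting list in this sketch, a scoped query is answered by the same index lookup machinery as an unscoped one.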


9
Vladimir Tankovich, Dmitriy Meyerzon, Michael Taylor, Stephen Robertson: Techniques to perform relative ranking for search results. Microsoft Corporation, Merchant & Gould P.C., July 5, 2011: US07974974 (41 worldwide citations)

Techniques to perform relative ranking for search results are described. An apparatus may include an enhanced search component operative to receive a search query and provide ranked search results responsive to the search query. The enhanced search component may comprise a resource search module ope ...


10
Kenji Obata, Dmitriy Meyerzon: Proxy server using a statistical model. Microsoft Corporation, Christensen O'Connor Johnson Kindness PLLC, April 19, 2005: US06883135 (40 worldwide citations)

A computer based system and method of determining whether to re-fetch a previously retrieved document across a computer network is disclosed. The method utilizes a statistical model to determine whether the previously retrieved document likely changed since last accessed. The statistical model is co ...