06360215 is referenced by 346 patents.
A method and apparatus are provided for retrieving documents from a collection of documents based on information other than the contents of a desired document. The collection of documents, which may be a hypertext system or documents available via the World Wide Web, is indexed. In one embodiment, an indexing process of a search engine receives one or more specifications that identify documents, or document locations, and non-content information such as a tag word or code word. The indexing process searches the index to identify all documents in the index that match one or more of the specifications. If a match is found, the tag word is added to the index, and information about the matching document is stored in the index in association with the tag word. A search query is submitted to the search engine. The search query is automatically modified to add a reference to the tag word, such as a query term that will exclude any index entry for a document associated with the tag word. The search is executed against the index, and a set of search results is generated. Accordingly, the search results automatically exclude all documents associated with the tag word. These techniques may be used, for example, to implement a Web search service that produces more accurate search results or that prevents certain documents, such as pornographic materials, from appearing in search results.