Weights are assigned to features of data objects, and the weights are used to determine whether data objects are substantially identical. In one application, weights are assigned to terms in web page documents and then used to determine whether those documents are substantially identical. A set of identicals may be generated for each web page indexed by the system and used to repair broken hyperlinks. Specifically, when the uniform resource locator (URL) associated with a hyperlink cannot be resolved, or cannot be resolved in a timely fashion, one of the identicals of the desired web page document may be returned to provide the requesting party with the desired content.
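
By way of illustration only, the flow described above might be sketched as follows. The abstract does not specify a weighting scheme, a similarity measure, or a threshold, so the inverse-document-frequency weights, the weighted Jaccard overlap, the 0.9 cutoff, and the names term_weights, similarity, build_identicals, and resolve are all assumptions chosen for this example rather than the claimed method. The fetch argument is a hypothetical caller-supplied function that returns page content or raises an error when a URL cannot be resolved.

```python
import math
from collections import Counter


def term_weights(docs):
    """Assign each term a weight. IDF-style weighting is an assumption here;
    the source does not name a particular scheme."""
    n = len(docs)
    doc_freq = Counter()
    for text in docs.values():
        doc_freq.update(set(text.lower().split()))
    # Rarer terms get larger weights; every weight is at least 1.0.
    return {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in doc_freq.items()}


def similarity(text_a, text_b, weights):
    """Weighted Jaccard overlap of the two documents' term sets (assumed measure)."""
    terms_a = set(text_a.lower().split())
    terms_b = set(text_b.lower().split())
    union = sum(weights.get(t, 0.0) for t in terms_a | terms_b)
    if union == 0:
        return 0.0
    shared = sum(weights.get(t, 0.0) for t in terms_a & terms_b)
    return shared / union


def build_identicals(docs, threshold=0.9):
    """Map each URL to the URLs whose content is substantially identical.
    The threshold value is illustrative, not taken from the source."""
    weights = term_weights(docs)
    identicals = {url: [] for url in docs}
    urls = list(docs)
    for i, u in enumerate(urls):
        for v in urls[i + 1:]:
            if similarity(docs[u], docs[v], weights) >= threshold:
                identicals[u].append(v)
                identicals[v].append(u)
    return identicals


def resolve(url, fetch, identicals):
    """Try the requested URL first; if it cannot be resolved, fall back to
    its identicals. Returns (url_used, content), or None if all fail."""
    for candidate in [url, *identicals.get(url, [])]:
        try:
            return candidate, fetch(candidate)
        except (LookupError, OSError):  # OSError also covers TimeoutError
            continue
    return None
```

In this sketch the set of identicals is computed once, at indexing time, so that repairing a broken hyperlink at request time reduces to a lookup followed by a retry against each identical in turn.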