A system and method for efficiently reducing a number of duplicate blocks of stored data. A file server both removes duplicate data and prevents duplicate data from being stored in the shared storage. A sampling rate may be used to determine which fingerprints, or hash values, are stored in an index. The sampling rate may be modified in response to changes in characteristics of the system, such as a change in the shared storage size, a change in a utilization of the shared storage, a change in the size of the storage unit, and reaching a threshold corresponding to utilization of the index. Also, a small cache may be maintained for holding fingerprint and pointer pair values prefetched from the shared storage. Each prefetched pair may be associated with data corresponding to a previous hit in the index. The association may be related to spatial locality, temporal locality, or otherwise.