US Patent 8311964 is referenced by 18 patents and cites 45 patents.

A system and method for efficiently reducing a number of duplicate blocks of stored data. A file server both removes duplicate data and prevents duplicate data from being stored in the shared storage. A sampling rate may be used to determine which fingerprints, or hash values, are stored in an index. The sampling rate may be modified in response to changes in characteristics of the system, such as a change in the shared storage size, a change in a utilization of the shared storage, a change in the size of the storage unit, and reaching a threshold corresponding to utilization of the index. Also, a small cache may be maintained for holding fingerprint and pointer pair values prefetched from the shared storage. Each prefetched pair may be associated with data corresponding to a previous hit in the index. The association may be related to spatial locality, temporal locality, or otherwise.
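The mechanism in the abstract can be illustrated with a small sketch. It is not the patented implementation: the class name `SampledDedupStore`, its parameters, the in-memory list standing in for shared storage, and the rule "keep a fingerprint when its low `sample_bits` bits are zero" are all assumptions made for illustration. The sketch shows three ideas from the abstract: only a sample of fingerprints goes into the index, the sampling rate is lowered when index utilization reaches a threshold, and a hit in the index prefetches the fingerprint and pointer pairs of neighboring blocks (spatial locality) into a small cache.

```python
import hashlib
from collections import OrderedDict


def fingerprint(block: bytes) -> int:
    """Return the SHA-1 digest of a block as an integer fingerprint."""
    return int.from_bytes(hashlib.sha1(block).digest(), "big")


class SampledDedupStore:
    """Hypothetical toy store; names and policies are illustrative assumptions."""

    def __init__(self, sample_bits=0, max_index_entries=1024,
                 cache_size=64, prefetch=4):
        self.sample_bits = sample_bits        # sampling rate is 1 / 2**sample_bits
        self.max_index_entries = max(1, max_index_entries)
        self.cache_size = cache_size
        self.prefetch = prefetch              # neighbors fetched after an index hit
        self.log = []                         # stand-in for shared storage: (fp, block)
        self.index = {}                       # sampled fingerprint -> position in log
        self.cache = OrderedDict()            # prefetched fingerprint -> position

    def _sampled(self, fp: int) -> bool:
        # Assumed sampling rule: keep fingerprints whose low bits are all zero.
        return fp % (1 << self.sample_bits) == 0

    def _prefetch(self, pos: int) -> None:
        # Load the pairs stored right after the hit, oldest cache entries evicted first.
        end = min(pos + 1 + self.prefetch, len(self.log))
        for p in range(pos + 1, end):
            fp = self.log[p][0]
            self.cache[fp] = p
            self.cache.move_to_end(fp)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)

    def _shrink_index(self) -> None:
        # Utilization threshold reached: halve the sampling rate and prune the index.
        while len(self.index) > self.max_index_entries:
            self.sample_bits += 1
            self.index = {fp: p for fp, p in self.index.items()
                          if self._sampled(fp)}

    def write(self, block: bytes):
        """Store a block; return (position, was_duplicate)."""
        fp = fingerprint(block)
        pos = self.cache.get(fp)
        if pos is None:
            pos = self.index.get(fp)
            if pos is not None:
                self._prefetch(pos)           # an index hit triggers prefetching
        if pos is not None:
            return pos, True                  # duplicate detected, nothing stored
        pos = len(self.log)
        self.log.append((fp, block))
        if self._sampled(fp):
            self.index[fp] = pos
            self._shrink_index()
        return pos, False
```

In this sketch a duplicate whose fingerprint is neither sampled into the index nor present in the cache is stored again, which is the trade-off a sampled index accepts in exchange for a smaller index. The prefetch cache is meant to recover some of those misses when duplicates arrive in runs.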

Title: Progressive sampling for deduplication indexing
Application Number: 12/617426
Publication Number: 8311964 (B1)
Application Date: November 12, 2009
Publication Date: November 13, 2012
Inventor: Dharmesh Shah (San Jose, CA, US); Fanglu Guo (Los Angeles, CA, US); Petros Efstathopoulos (Los Angeles, CA, US)
Agent: Meyertons Hood Kivlin Kowert & Goetzel P C; Rory D Rankin
Assignee: Symantec Corporation (CA, US)
IPC: G06N 5/00; G06F 17/00