
Verifying checksums


I am curious if there is any best practice that currently exists, or is developing, around frequency of checksum verification for files in a dark archive. If filestreams and checksum manifests are backed up weekly and those backups are kept for six months, does it make sense to verify checksums every six months so that if a problem is found, you could theoretically go back to the first backup after the last check?



Yes, that is sensible.

There is no documented "best practice", although I have heard a few rules of thumb thrown about. Yes, what you suggest is sensible, but you need to be careful. You need to run the checksum validation far enough ahead of the backup purge that you have time to complete both the validation and any recovery effort while your known good copy still exists. You also need to be careful that the dates of backup, validation, and purging line up properly. I would probably advocate being more aggressive with the timeline and running the validation halfway through the retention period so you have two known good copies. Depending on the volume of materials (how long it takes to run the validation over the corpus) and the cost of validation (CPU time, bandwidth, etc.), you might validate as frequently as monthly (we do).
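To make the date arithmetic concrete, here is a rough sketch in Python. The function names, the 182-day retention, and the one-week validation and two-week recovery estimates are all assumptions chosen for illustration, not figures from any real archive. It also assumes a known good backup was taken on the day of the last clean check.

```python
from datetime import date, timedelta


def latest_validation_start(known_good_backup: date,
                            retention: timedelta,
                            validation_time: timedelta,
                            recovery_time: timedelta) -> date:
    """Last day the next validation can start and still leave time to
    finish validating and recovering before the known good backup is purged."""
    purge_date = known_good_backup + retention
    return purge_date - validation_time - recovery_time


def midpoint_validation_start(known_good_backup: date,
                              retention: timedelta) -> date:
    """More aggressive schedule: validate halfway through the retention period."""
    return known_good_backup + retention / 2


# Hypothetical figures: backup on 2024-01-01, kept for about six months.
backup_day = date(2024, 1, 1)
retention = timedelta(days=182)

deadline = latest_validation_start(
    backup_day,
    retention,
    validation_time=timedelta(days=7),   # assumed time to checksum the corpus
    recovery_time=timedelta(days=14),    # assumed time to restore bad files
)
halfway = midpoint_validation_start(backup_day, retention)

print("Start validation no later than:", deadline)  # 2024-06-10
print("Halfway-point validation date:", halfway)    # 2024-04-01
```

With these numbers, a strict six-month cycle would start too late: the backup is purged on 2024-07-01, so validation has to begin by 2024-06-10 at the latest, and the halfway schedule leaves far more margin.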
