The funny thing is that data deduplication storage systems work on sha-1 hashes to identify identical data blocks (two blocks are considered equal if the hash is equal - most systems don't care to check for collisions); they kept telling that the likelyhood of collisions is so astronomically small [1] ; i wonder if this statement is still considered to be true.
The risk that two blocks would hash identically by accident is astronomically small, but there is indeed some possibility that a skilled person might construct two blocks that do.
So as long as your data on disk hasn't evolved intelligence and actively tries to create two blocks with the same hash, your storage system is absolutely fine.
Security systems must retain collision-resistance even in the presence of adversaries. If your design doesn't have adversaries, you can use cheaper, less secure hashes. Nobody would still recommend SHA-1, though.
Can you tell why SHA256 was chosen?
Was it because of the larger hash, or was it a requirement to have a cryptographic secure hash?
Do you know how the hash is used in the Netapp in regard to the remark MichaelMoser123 made:
"(two blocks are considered equal if the hash is equal - most systems don't care to check for collisions" [...]
Is it (overoversimplifed ;):
if sha256(a) == sha256(b) {
performDedup()
[...]
Or is a compare added if the hashes found to be identical:
if ( sha256(a) == sha256(b) and
byteCompare(a) == byteCompare(b)
) {
performDedup()
[...]
sha256 was chosen over sha1 because of the larger hash space making collision much less likely. These systems can store 100s of TBs to PBs and at that number of hash blocks you have to start worrying about the birthday paradox.
You are talking about making a combined hash function from MD5 and SHA-1. Keep in mind that it is not trivial. For instance, if you just output both SHA-1 and MD5, you are actually only as strong as the weakest hash function (trivially for preimage, but also true for collisions, see [1]).
If you are interested in hash function combiners, have a look at Robust Multi-Property Combiners for Hash Functions [2], a recent paper on the topic. It aims at getting several properties at the same time (preimage resistance, collision resistance). For simpler schemes with a single property, just explore the bibliography at the end of the paper.
Using two algorithms together is basically just creating a new hash algorithm (with a longer hash), which could also have collisions.
If you know a technique for making SHA-1 collisions and a technique for making MD5 collisions, it's probably possible to combine the techniques and make a collision for SHA-1-MD5.
[1] https://pibytes.wordpress.com/2013/02/09/deduplication-inter...