
Will changing a file name affect the MD5 Hash of a file?

Will changing a file name affect the MD5 hash of a file?


No, the hash is of the file contents only. You can see this in the source for md5sum and its MD5 implementation. You can also test this if you have access to md5sum:

$ echo "some arbitrary content" > file1
$ cp file1 file2
$ md5sum file1
f0007cbddd79de02179de7de12bec4e6  file1
$ md5sum file2
f0007cbddd79de02179de7de12bec4e6  file2
$
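The same check can be scripted. The following is only a minimal sketch in Python, assuming nothing beyond the standard library; the helper names `md5_of_file` and `hashes_before_and_after_rename` are made up for this illustration and are not part of md5sum.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the bytes stored in the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hashes_before_and_after_rename(content):
    """Write `content` to a temp file, hash it, rename it, and hash it again."""
    with tempfile.TemporaryDirectory() as workdir:
        old_path = os.path.join(workdir, "file1")
        new_path = os.path.join(workdir, "file2")
        with open(old_path, "wb") as handle:
            handle.write(content)
        before = md5_of_file(old_path)
        os.rename(old_path, new_path)  # only the name changes, not the bytes
        after = md5_of_file(new_path)
    return before, after


if __name__ == "__main__":
    before, after = hashes_before_and_after_rename(b"some arbitrary content\n")
    print(before == after)  # prints True
```

Only the path is passed to `open`; the name itself is never fed into the digest, which is why the rename makes no difference.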


The usual definition of "MD5 hash of a file" is that the hash is based on the file contents. The name can be freely changed.

$hash1 = md5_file('file1');
rename('file1', 'file2');  // change the file name
$hash2 = md5_file('file2');

The two hash codes will be the same.

In some (fairly specialized) use cases, file metadata (name, time stamps, etc.) is part of the data used to compute the hash. In that case,

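// hypothetical scheme: the name is hashed together with the contents
$hash1 = md5('file1' . file_get_contents('file1'));
rename('file1', 'file2');  // change the file name
$hash2 = md5('file2' . file_get_contents('file2'));

will produce two different hashes.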
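To make the difference concrete, here is a rough Python sketch of such a specialized scheme. The function `md5_of_name_and_content` and the choice of a NUL byte as separator are assumptions made up for this example; they do not describe any particular tool.

```python
import hashlib


def md5_of_name_and_content(name, content):
    """Hypothetical scheme: the digest covers the file name and then the content."""
    digest = hashlib.md5()
    digest.update(name.encode("utf-8"))
    digest.update(b"\0")  # separator so name and content cannot run together
    digest.update(content)
    return digest.hexdigest()


content = b"some arbitrary content\n"
hash1 = md5_of_name_and_content("file1", content)
hash2 = md5_of_name_and_content("file2", content)  # same bytes, new name
print(hash1 != hash2)  # prints True
```

Because the name is now part of the hashed input, renaming changes the result even though the content bytes are identical.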


On Linux with an ext filesystem, it will not, because the file name is not stored in the file itself. It is stored in the directory entry (dentry) of the directory the file lives in, which maps the name to the file's inode. Changing a file name will have no effect on its md5sum on Linux. On Windows, I cannot be sure.
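The name-to-inode mapping can be observed from user space. This is a small sketch in Python, assuming a POSIX filesystem and a rename within a single directory; `inode_before_and_after_rename` is a name invented for the example.

```python
import os
import tempfile


def inode_before_and_after_rename():
    """Create a file, rename it in place, and report its inode both times."""
    with tempfile.TemporaryDirectory() as workdir:
        old_path = os.path.join(workdir, "file1")
        new_path = os.path.join(workdir, "file2")
        with open(old_path, "wb") as handle:
            handle.write(b"some arbitrary content\n")
        inode_before = os.stat(old_path).st_ino
        os.rename(old_path, new_path)  # rewrites the directory entry only
        inode_after = os.stat(new_path).st_ino
        names = os.listdir(workdir)  # the directory now lists the new name
    return inode_before, inode_after, names


if __name__ == "__main__":
    inode_before, inode_after, names = inode_before_and_after_rename()
    print(inode_before == inode_after, names)
```

The inode number stays the same while the directory listing changes, which matches the idea that the name belongs to the directory and not to the file data.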


If the hash is computed from the file contents, it shouldn't.


In ESXi (specifically ESXi 5.5), md5sum gives different results for files with the same content but different names. That leads me to believe that the VMFS-5 file structure includes the file name too. If we are not concerned about the file name, is there a way to check only the md5sum of the file content? I couldn't see any option. Any suggestions?


In response to the comment, https://stackoverflow.com/a/14360831/9392847:

This works only if one file is a copy of another, but not when two different files with different names are generated with exactly the same content. I have tried this:

nancy@nancy:~/Documents$ md5sum /home/nancy/Documents/1test.pdf
c5a445b7186dfb220ea79d2001acf3f1  /home/nancy/Documents/1test.pdf
nancy@nancy:~/Documents$ md5sum /home/nancy/Documents/2test.pdf
cefa063abf0c0a9e80b2b75e70100836  /home/nancy/Documents/2test.pdf

Both files, 1test.pdf and 2test.pdf, were created with GIMP. The same content was exported twice under two different names.
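Different hashes mean the two files differ in at least one byte, so it can be worth checking where. A possible way to do that is sketched below in Python; `first_difference` is a hypothetical helper, and the demo files with an embedded `created=` field are invented stand-ins (whether the PDFs above contain such a field is an assumption, not something confirmed here).

```python
import os
import tempfile


def first_difference(path_a, path_b, chunk_size=65536):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if chunk_a != chunk_b:
                for index, (byte_a, byte_b) in enumerate(zip(chunk_a, chunk_b)):
                    if byte_a != byte_b:
                        return offset + index
                # No mismatch in the shared part: one file is a prefix of the other.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None  # both files ended without any difference
            offset += len(chunk_a)


# Demo with two made-up files that differ only in an embedded time field.
with tempfile.TemporaryDirectory() as workdir:
    path_a = os.path.join(workdir, "1test.bin")
    path_b = os.path.join(workdir, "2test.bin")
    with open(path_a, "wb") as handle:
        handle.write(b"header created=10:00:01 body")
    with open(path_b, "wb") as handle:
        handle.write(b"header created=10:00:02 body")
    demo_offset = first_difference(path_a, path_b)
```

If the function returns `None` for two real files, their contents are byte-identical and their MD5 hashes will match regardless of the names.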


1. MD5 is calculated from the binary content of the file.
2. The file name, last-modified time and so on are metadata, and MD5 does not depend on metadata.

I tested this with the steps below, using the "last modified" metadata:

i) I created a file named "a.txt", added some content and computed its hash; say the hash is "xyz".
ii) I then added a single space to the file and computed the hash again; say it returned "abc".
iii) I removed the change from step (ii) and computed the hash once more, and got the initial hash ("xyz") back.

This shows that even though the file's metadata has changed, the hash stays the same as long as the file content is unaltered.
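A more direct variant of this experiment changes only the timestamp and leaves the bytes alone. The sketch below is in Python and uses `os.utime` to move the modification time; the helper names are invented for the illustration.

```python
import hashlib
import os
import tempfile


def md5_of_file(path):
    """Return the hex MD5 digest of the file's bytes (small files only)."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def hash_and_mtime_around_touch():
    """Hash a file, shift its modification time, then hash it again."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "a.txt")
        with open(path, "wb") as handle:
            handle.write(b"some content\n")
        hash_before = md5_of_file(path)
        mtime_before = os.stat(path).st_mtime
        # Move access and modification time one hour back; bytes are untouched.
        os.utime(path, (mtime_before - 3600, mtime_before - 3600))
        hash_after = md5_of_file(path)
        mtime_after = os.stat(path).st_mtime
    return (hash_before, mtime_before), (hash_after, mtime_after)


if __name__ == "__main__":
    (hash_before, mtime_before), (hash_after, mtime_after) = hash_and_mtime_around_touch()
    print(hash_before == hash_after, mtime_before != mtime_after)
```

Here the "last modified" value differs between the two measurements while the digest is unchanged.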

Hope it helps.
