How to identify duplicate assemblies?

While I was looking over some questions about MEF, I stumbled onto this particular answer to a question. It made me wonder a bit, since I've never had to attempt this myself but can see it being very relevant in the scenario of that question.

Scenario: If you have a directory of various .NET assemblies, all named differently, how would you identify ones that may be the same but renamed (e.g. Copy Of MyAssembly.dll vs MyAssembly.dll)?

I can think of the following items:

  1. Check File Size (should be the same)

  2. Check Assembly Version Number

  3. Loop through the assembly using Reflection and attempt to locate any differences.

Is there any other/easier way of addressing this issue? Are there other criteria to look at for determining if 2 differently named DLLs are in fact the same compiled Assembly?


At first I thought you could use Equals or ReferenceEquals to do this, but this proves too error-prone. If you load the assemblies with Assembly.LoadFile, for instance, it will not work.

With NUnit, I wrote the following tests, which are a bit basic but give you something to go on. The weird way of loading the types is necessary (see MSDN). I assume you know how to do the "quick tests" in case you want to check for binary equality etc. (see PS below).

// Requires: using System; using System.Reflection; using NUnit.Framework;
// Assembly.LoadFile needs an absolute path; the file names below are placeholders.
Assembly asm1 = Assembly.LoadFile(@"someDebugAssembly.dll");
Assembly asm2 = Assembly.LoadFile(@"someReleaseAssembly.dll");

// load all the types (the double try/catch is on purpose)
Type[] types1 = null;
Type[] types2 = null;
try
{
    types1 = asm1.GetTypes();
}
catch (ReflectionTypeLoadException e)
{
    // e.Types holds the types that did load; entries that failed are null
    types1 = e.Types;
}
try
{
    types2 = asm2.GetTypes();
}
catch (ReflectionTypeLoadException e)
{
    types2 = e.Types;
}

// same length
Assert.AreEqual(types1.Length, types2.Length);

// check each type by full name: Type objects that come from two separately
// loaded assemblies are distinct, so comparing them directly would always fail
for (int i = 0; i < types1.Length; i++)
{
    Assert.AreEqual(types1[i]?.FullName, types2[i]?.FullName);
}

A note on the code: this method of comparison will treat two assemblies as equal when they contain the same types. This means that a debug and a release build, or different versions, are not taken into consideration. Use asm1.GetName() and its properties (again: do not use Equals!) to compare the individual strings (version, full name etc).
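As a rough illustration of comparing the names rather than the Assembly objects, the sketch below reads each file's AssemblyName and compares the FullName strings (simple name, version, culture and public key token). The class and method names are made up for this example, and it assumes both paths point at files that contain an assembly manifest.

```csharp
using System;
using System.Reflection;

static class AssemblyIdentity
{
    // Hypothetical helper, not part of any library: true when both files
    // declare the same simple name, version, culture and public key token.
    public static bool SameIdentity(string path1, string path2)
    {
        // GetAssemblyName opens the file and reads its manifest;
        // the assembly itself is not loaded for execution.
        AssemblyName a = AssemblyName.GetAssemblyName(path1);
        AssemblyName b = AssemblyName.GetAssemblyName(path2);

        // Compare the display strings, never the AssemblyName objects.
        return string.Equals(a.FullName, b.FullName, StringComparison.Ordinal);
    }
}
```

Two files with the same identity can still differ in content (a debug and a release build of the same version, for example), so this only answers criterion 2 from the list below.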

PS: it would be intriguing to define what constitutes two equal assemblies, i.e.:

  1. they are binary equal
  2. their versions and fully qualified names are equal
  3. the strong names are equal
  4. all types, deeply compared, have equal signatures

Depending on what you choose, two entirely different assemblies (e.g. debug build vs release build) can come up as equal. It really depends on how you want to compare.

Update: corrected previous errors and added code sample


I'd start with a simple quick check using points 1 and 2, that is, checking file size and assembly version number. If they're all different, well, you're done.

If not, keep the files that have the same file size and version and compute their MD5/SHA1/whatever-you-prefer hash. If the hashes match, you almost certainly have the same assembly twice. Since assemblies generally aren't very large (at most a few megabytes), computing the hash should be fast enough.
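A minimal sketch of that two-step filter, assuming the candidates are the *.dll files directly inside one directory. It uses file size as the cheap first check (the version comparison is left out for brevity) and SHA-256 as the hash. DuplicateFinder and its methods are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

static class DuplicateFinder
{
    // Hex-encoded SHA-256 of a file's bytes.
    static string HashFile(string path)
    {
        using (SHA256 sha = SHA256.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream));
        }
    }

    // Hypothetical helper: returns groups of paths whose contents hash to the same value.
    public static List<List<string>> FindDuplicates(string directory)
    {
        return Directory.GetFiles(directory, "*.dll")
            .GroupBy(path => new FileInfo(path).Length)   // cheap check first
            .Where(sameSize => sameSize.Count() > 1)      // a unique size cannot be a duplicate
            .SelectMany(sameSize => sameSize.GroupBy(path => HashFile(path)))
            .Where(sameHash => sameHash.Count() > 1)      // keep only hashes seen more than once
            .Select(sameHash => sameHash.ToList())
            .ToList();
    }
}
```

Only files that share a size are ever hashed, so a directory with no same-sized files is handled without reading any file contents.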


You can also use the good old comp command line program:

c:\tests> comp one.dll two.dll
Comparing one.dll and two.dll...
Files compare OK

Update: even better. Download the Windows XP Service Pack 2 support tools and install them (choose Complete installation). Then open the 'Run command' dialog and type dupfinder. Point it to the folder you want and you'll start seeing all the duplicates in that path and its subfolders.
