C# matching two text files, case sensitive issue
What I have is two files, sourcecolumns.txt and destcolumns.txt. What I need to do is compare source to dest and, if dest doesn't contain the source value, write it out to a new file. The code below works, except I have case-sensitivity issues like this:
source: CPI
dest: Cpi
These don't match because of capital letters, so I get incorrect output. Any help is always welcome!
string[] sourcelinestotal =
    File.ReadAllLines("C:\\testdirectory\\" + "sourcecolumns.txt");
string[] destlinestotal =
    File.ReadAllLines("C:\\testdirectory\\" + "destcolumns.txt");
foreach (string sline in sourcelinestotal)
{
    if (destlinestotal.Contains(sline))
    {
    }
    else
    {
        File.AppendAllText("C:\\testdirectory\\" + "missingcolumns.txt", sline);
    }
}
You could do this using an extension method for IEnumerable<string>, like:
public static class EnumerableExtensions
{
    public static bool Contains( this IEnumerable<string> source, string value, StringComparison comparison )
    {
        if (source == null)
        {
            return false; // nothing is a member of the empty set
        }
        return source.Any( s => string.Equals( s, value, comparison ) );
    }
}
then change
if (destlinestotal.Contains( sline ))
to
if (destlinestotal.Contains( sline, StringComparison.OrdinalIgnoreCase ))
However, if the sets are large and/or you are going to do this very often, the way you're going about it is very inefficient. Essentially, you're doing an O(n²) operation: for each line in the source you compare it with, potentially, all lines in the destination. It would be better to create a HashSet from the destination columns with a case-insensitive comparer and then iterate through your source columns, checking whether each one exists in the HashSet of destination columns. This would be an O(n) algorithm, since each HashSet lookup takes constant time on average. Note that Contains on the HashSet will use the comparer you provide in the constructor.
string[] sourcelinestotal =
    File.ReadAllLines("C:\\testdirectory\\" + "sourcecolumns.txt");
HashSet<string> destlinestotal =
    new HashSet<string>(
        File.ReadAllLines("C:\\testdirectory\\" + "destcolumns.txt"),
        StringComparer.OrdinalIgnoreCase
    );
foreach (string sline in sourcelinestotal)
{
    if (!destlinestotal.Contains(sline))
    {
        // ReadAllLines strips line endings, so add one back to keep
        // each missing column on its own line in the output file.
        File.AppendAllText("C:\\testdirectory\\" + "missingcolumns.txt", sline + Environment.NewLine);
    }
}
In retrospect, I actually prefer this solution over simply writing your own case-insensitive Contains for IEnumerable<string>, unless you need the method for something else. There's actually less code (of your own) to maintain by using the HashSet implementation.
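For anyone who wants to try the HashSet approach without touching the file system, here is a minimal sketch that pulls the comparison into a helper working on in-memory lines. The class name MissingColumns and the method name FindMissing are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class MissingColumns
{
    // Hypothetical helper: returns the source lines that have no
    // case-insensitive match among the destination lines.
    public static List<string> FindMissing(IEnumerable<string> source, IEnumerable<string> dest)
    {
        // The comparer passed to the constructor is the one Contains uses below.
        var destSet = new HashSet<string>(dest, StringComparer.OrdinalIgnoreCase);
        var missing = new List<string>();
        foreach (string line in source)
        {
            if (!destSet.Contains(line))
            {
                missing.Add(line);
            }
        }
        return missing;
    }
}
```

The returned list could then be written out in a single call such as File.WriteAllLines("C:\\testdirectory\\missingcolumns.txt", missing), assuming the same directory as in the question, which puts each entry on its own line.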
Use an extension method for your Contains. A brilliant example was found here on Stack Overflow. The code isn't mine, but I'll post it below.
public static bool Contains(this string source, string toCheck, StringComparison comp)
{
    return source.IndexOf(toCheck, comp) >= 0;
}

string title = "STRING";
bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);
If you do not need case sensitivity, convert your lines to upper case using string.ToUpper before comparison.
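A rough sketch of that idea, assuming a culture-independent comparison is what you want: it uses ToUpperInvariant rather than ToUpper so the result does not depend on the current culture. The class and method names (UpperCaseMatch, ContainsIgnoringCase) are made up for the example, and it assumes no line is null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UpperCaseMatch
{
    // Hypothetical helper: upper-cases both sides before comparing them.
    // Assumes neither the value nor any entry in lines is null.
    public static bool ContainsIgnoringCase(IEnumerable<string> lines, string value)
    {
        string wanted = value.ToUpperInvariant();
        return lines.Any(line => line.ToUpperInvariant() == wanted);
    }
}
```

This allocates a new upper-cased string for every comparison, so for large files a comparer-based lookup such as StringComparer.OrdinalIgnoreCase is likely the cheaper choice.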