Compare two CSV files in C# [duplicate]
I need to find the differences between two CSV files programmatically. Is there any way to find the differences without using any loops?
Please help me.
How much information do you need about the differences? If all you need to know is whether the files differ, and the no-loops requirement is fixed, you could take an MD5 hash of each file and compare the two hashes. If you don't care about memory usage, you could instead copy each file into a MemoryStream, call ToArray, and pass the two byte arrays to Enumerable.SequenceEqual. The hash approach looks like this:
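// Requires: using System.IO; using System.Linq; using System.Security.Cryptography;
private static byte[] GetFileHash(string filename)
{
    // Open read-only so the comparison also works on files we cannot write to.
    using (var stream = new FileStream(filename, FileMode.Open, FileAccess.Read))
    using (var md5Hasher = MD5.Create())
    {
        return md5Hasher.ComputeHash(stream);
    }
}

var file1hash = GetFileHash("file1.ext");
var file2hash = GetFileHash("file2.ext");
// Equal hashes mean the files are (almost certainly) byte-for-byte identical.
var areEqual = Enumerable.SequenceEqual(file1hash, file2hash);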
Now there are loops being used, just not by you.
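The in-memory alternative mentioned above can be sketched as follows. This is only an illustration: the class name FileCompare and the method HaveSameBytes are made up for this example, and it uses File.ReadAllBytes as a shortcut for copying each file into a MemoryStream and calling ToArray. It assumes both files fit comfortably in memory.

```csharp
using System.IO;
using System.Linq;

static class FileCompare
{
    // Hypothetical helper: true when both files contain exactly the same bytes.
    // Reads both files fully into memory, so it only suits files that fit in RAM.
    public static bool HaveSameBytes(string pathA, string pathB)
    {
        byte[] a = File.ReadAllBytes(pathA);
        byte[] b = File.ReadAllBytes(pathB);
        // Cheap length check first; SequenceEqual does the element-wise loop internally.
        return a.Length == b.Length && a.SequenceEqual(b);
    }
}
```

Like the hash version, this only tells you whether the files differ, not where. It would be called as FileCompare.HaveSameBytes("file1.ext", "file2.ext").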
Have you looked at the following questions? If not, you should.
- C# - Comparing two CSV Files and giving an output
- Comparing 2 CSV files in C# advice?
No, there is no way without using loops. How do you expect any compare algorithm to iterate over the characters / words / tokens / lines of the file without using loops?
Anyway, assuming both CSV files are sorted by an ID column:
- Try splitting the files into rows
- In a loop
  - Split each row as List<string> or as array
  - Compare the lists of both files (ignore trailing empty columns etc.)
  - When differences in data columns are found, save a new row containing the differences into a List<List<string>>
  - When different IDs are found, compare both IDs: save the row with the smaller ID (which identifies the additional row) and get the next row of this file
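The steps above could be sketched roughly as follows. This is a minimal illustration, not a full CSV parser: the names SortedCsvDiff, Diff and Split are invented for the example, and it assumes the first column is an integer ID, both inputs are sorted ascending by that ID, there is no header row, and no field contains a quoted comma. For brevity it collects the report as a List<string> rather than a List<List<string>>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SortedCsvDiff
{
    // Hypothetical helper: walks both sorted row lists in step and reports differences.
    public static List<string> Diff(IReadOnlyList<string> linesA, IReadOnlyList<string> linesB)
    {
        var report = new List<string>();
        int a = 0, b = 0;
        while (a < linesA.Count && b < linesB.Count)
        {
            string[] rowA = Split(linesA[a]);
            string[] rowB = Split(linesB[b]);
            int idA = int.Parse(rowA[0]);
            int idB = int.Parse(rowB[0]);

            if (idA == idB)
            {
                // Same ID in both files: compare the data columns.
                if (!rowA.SequenceEqual(rowB))
                    report.Add($"changed {idA}: {linesA[a]} -> {linesB[b]}");
                a++;
                b++;
            }
            else if (idA < idB)
            {
                // The smaller ID identifies a row missing from the other file.
                report.Add($"only in A: {linesA[a]}");
                a++;
            }
            else
            {
                report.Add($"only in B: {linesB[b]}");
                b++;
            }
        }
        // Whatever is left in either file has no counterpart in the other.
        for (; a < linesA.Count; a++) report.Add($"only in A: {linesA[a]}");
        for (; b < linesB.Count; b++) report.Add($"only in B: {linesB[b]}");
        return report;
    }

    // Splits on commas and drops trailing empty columns, so "1,x,," equals "1,x".
    private static string[] Split(string line)
    {
        var cells = line.Split(',').ToList();
        while (cells.Count > 1 && cells[cells.Count - 1].Length == 0)
            cells.RemoveAt(cells.Count - 1);
        return cells.ToArray();
    }
}
```

The rows could come from File.ReadAllLines, for example SortedCsvDiff.Diff(File.ReadAllLines("file1.csv"), File.ReadAllLines("file2.csv")). Because each file is walked only once, the cost grows linearly with the number of rows, but the result is only meaningful if the sort assumption holds.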
The code below compares two CSV files and writes the differences to another .csv file:
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

class CompareTwoCSVFile
{
    // Compares two CSV files cell by cell and writes every mismatch to D:\Errorlist.csv.
    // Returns false, without writing a report, if the files differ in line count.
    public bool ReportErrorOnCompareCSV(string filePathOne, string filePathTwo)
    {
        var csv = new StringBuilder();
        string[] fileContentsOne = File.ReadAllLines(filePathOne);
        string[] fileContentsTwo = File.ReadAllLines(filePathTwo);
        if (fileContentsOne.Length != fileContentsTwo.Length)
            return false;

        // Column names come from the first line of file 1; text after a ';' is ignored.
        string[] columnshead1 = fileContentsOne[0].Split(new char[] { ';' });
        string[] headingsplit = columnshead1[0].Split(',');
        List<string> heading1 = new List<string>(headingsplit);

        Dictionary<string, string>[] dict1 = new Dictionary<string, string>[fileContentsOne.Length];
        Dictionary<string, string>[] dict2 = new Dictionary<string, string>[fileContentsTwo.Length];

        // Header row of the report.
        string newLine = string.Format("{0},{1},{2},{3}", "File1_ColumnName", "File1_ColumnValue", "File2_ColumnName", "File2_ColumnValue");
        csv.AppendLine(newLine);

        // Data rows start at line index 1; i is the zero-based data row number.
        for (int i = 0; i < fileContentsOne.Length - 1; ++i)
        {
            string[] columnsOne = fileContentsOne[i + 1].Split(new char[] { ';' });
            string[] columnsTwo = fileContentsTwo[i + 1].Split(new char[] { ';' });
            string[] cellOne = columnsOne[0].Split(',');
            string[] cellTwo = columnsTwo[0].Split(',');

            // Map column name -> cell value for this row in each file.
            // Assumes unique column names and at least as many cells as headings.
            dict1[i] = new Dictionary<string, string>();
            dict2[i] = new Dictionary<string, string>();
            for (int j = 0; j < headingsplit.Length; j++)
            {
                dict1[i].Add(heading1[j], cellOne[j]);
                dict2[i].Add(heading1[j], cellTwo[j]);
            }

            // Report every column whose value differs between the two files.
            foreach (KeyValuePair<string, string> entry in dict1[i])
            {
                if (dict2[i][entry.Key] != entry.Value)
                {
                    Console.WriteLine("Mismatch Values on row " + i + ":\n File1 " + entry.Key + "-" + entry.Value + "\n File2 " + entry.Key + "-" + dict2[i][entry.Key]);
                    newLine = string.Format("{0},{1},{2},{3}", entry.Key, entry.Value, entry.Key, dict2[i][entry.Key]);
                    csv.AppendLine(newLine);
                }
            }
        }

        File.WriteAllText("D:\\Errorlist.csv", csv.ToString());
        return true;
    }
}