Skip first column and get distinct from other columns
I need to select distinct rows from the text file shown below, ignoring the first column.
TextFile
123| one| two| three
124| one| two| four
125| one |two| three
The output should look like this:
123| one| two| three
124| one| two| four
OR
124| one| two| four
125| one |two| three
I am using this code to try to solve the problem:
var readfile = File.ReadAllLines(" text file location ");
var spiltfile = (from f in readfile
let line = f.Split('|')
let y = line.Skip(1)
select (from str in y
select str).FirstOrDefault()).Distinct()
Thanks
The inconsistent spacing in the question doesn't help (especially around |two|, which is spaced differently from the rest, implying that we need to trim), but here are some custom LINQ methods that do the job. I've used the anonymous type purely as a simple way of flattening out the inconsistent spacing (I could also have rebuilt a string, but it seemed unnecessary).
Note that without the odd spacing, this could simply be:
var qry = ReadLines("foo.txt")
.DistinctBy(line => line.Substring(line.IndexOf('|')));
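As a small sketch of why the spacing matters for the short version above: the key is the raw text from the first `|` onward, so two rows that differ only in whitespace produce different keys. `KeySketch` and `RawKey` are hypothetical names used only for this illustration.

```csharp
using System;

static class KeySketch
{
    // Everything from the first '|' onward, exactly as written in the file.
    // Assumes the line contains at least one '|'; otherwise IndexOf returns -1
    // and Substring throws.
    public static string RawKey(string line)
    {
        return line.Substring(line.IndexOf('|'));
    }

    static void Main()
    {
        string a = RawKey("123| one| two| three");
        string b = RawKey("125| one |two| three");

        Console.WriteLine(a);      // | one| two| three
        Console.WriteLine(b);      // | one |two| three
        Console.WriteLine(a == b); // False: the whitespace differs
    }
}
```

With the sample data as posted, the raw key would therefore treat rows 123 and 125 as distinct, which is why the full version trims each column.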
Full code:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class Program
{
    static void Main()
    {
        var qry = (from line in ReadLines("foo.txt")
                   let parts = line.Split('|')
                   select new
                   {
                       Line = line,
                       // Trimmed columns 2..4 form the comparison key
                       Key = new
                       {
                           A = parts[1].Trim(),
                           B = parts[2].Trim(),
                           C = parts[3].Trim()
                       }
                   }).DistinctBy(row => row.Key)
                     .Select(row => row.Line);
        foreach (var line in qry)
        {
            Console.WriteLine(line);
        }
    }

    // Yields the first item seen for each distinct key
    static IEnumerable<TSource> DistinctBy<TSource, TValue>(
        this IEnumerable<TSource> source,
        Func<TSource, TValue> selector)
    {
        var found = new HashSet<TValue>();
        foreach (var item in source)
        {
            if (found.Add(selector(item))) yield return item;
        }
    }

    // Lazily reads the file one line at a time
    static IEnumerable<string> ReadLines(string path)
    {
        using (var reader = File.OpenText(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
}
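The alternative mentioned above, rebuilding a string instead of using an anonymous type, might look roughly like the sketch below. It is only an illustration under assumptions: `DistinctSketch`, `NormalizeKey` and `DistinctIgnoringFirstColumn` are hypothetical names, and it uses the standard `GroupBy` in place of the custom `DistinctBy`, with in-memory sample lines instead of a file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DistinctSketch
{
    // Hypothetical helper: drops the first column and trims the rest, so
    // " one |two| three" and " one| two| three" produce the same key.
    public static string NormalizeKey(string line)
    {
        return string.Join("|", line.Split('|').Skip(1).Select(p => p.Trim()));
    }

    // Keeps the first line seen for each normalized key.
    public static List<string> DistinctIgnoringFirstColumn(IEnumerable<string> lines)
    {
        return lines.GroupBy(line => NormalizeKey(line))
                    .Select(group => group.First())
                    .ToList();
    }

    static void Main()
    {
        var lines = new[]
        {
            "123| one| two| three",
            "124| one| two| four",
            "125| one |two| three"
        };

        foreach (var line in DistinctIgnoringFirstColumn(lines))
        {
            Console.WriteLine(line);
        }
    }
}
```

Because the key is built from however many columns follow the first, this variant does not hard-code three key columns, at the cost of allocating one extra string per line.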
Try this; it should do what you want:
static void Main(string[] args)
{
    string[] readfile = System.IO.File.ReadAllLines(@"D:\1.txt");
    var strList = readfile.Select(x => x.Split('|')).ToList();
    // Distinct uses the custom comparer, which ignores the first column
    IEnumerable<string[]> noduplicates = strList.Distinct(new StringComparer());
    foreach (var res in noduplicates)
        Console.WriteLine(res[0] + "|" + res[1] + "|" + res[2] + "|" + res[3]);
}
Then implement IEqualityComparer<string[]> like this:
class StringComparer : IEqualityComparer<string[]>
{
    // Rows are equal when columns 2..4 match after trimming
    public bool Equals(string[] x, string[] y)
    {
        if (Object.ReferenceEquals(x, y)) return true;
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;
        return x[1].Trim() == y[1].Trim() && x[2].Trim() == y[2].Trim() && x[3].Trim() == y[3].Trim();
    }

    // Hash only the columns that Equals compares
    public int GetHashCode(string[] data)
    {
        if (Object.ReferenceEquals(data, null)) return 0;
        int hash1 = data[1] == null ? 0 : data[1].Trim().GetHashCode();
        int hash2 = data[2] == null ? 0 : data[2].Trim().GetHashCode();
        int hash3 = data[3] == null ? 0 : data[3].Trim().GetHashCode();
        return hash1 ^ hash2 * hash3;
    }
}
It will give you the output you expected.
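If the file may have a different number of columns, the same comparer idea can be sketched without hard-coded indexes. This is a hedged variation, not the code above: `SkipFirstColumnComparer`, `ComparerDemo` and `DistinctRows` are hypothetical names, and the sample lines are supplied in memory rather than read from a file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two rows are equal when every column after the
// first matches once surrounding whitespace is trimmed.
class SkipFirstColumnComparer : IEqualityComparer<string[]>
{
    private static IEnumerable<string> KeyColumns(string[] row)
    {
        return row.Skip(1).Select(c => (c ?? string.Empty).Trim());
    }

    public bool Equals(string[] x, string[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return KeyColumns(x).SequenceEqual(KeyColumns(y));
    }

    public int GetHashCode(string[] row)
    {
        if (row == null) return 0;
        int hash = 17;
        foreach (var column in KeyColumns(row))
        {
            // unchecked: overflow just wraps around, which is fine for a hash
            unchecked { hash = hash * 31 + column.GetHashCode(); }
        }
        return hash;
    }
}

static class ComparerDemo
{
    // Splits each line, removes duplicates, and rejoins the original text.
    public static List<string> DistinctRows(IEnumerable<string> lines)
    {
        return lines.Select(l => l.Split('|'))
                    .Distinct(new SkipFirstColumnComparer())
                    .Select(parts => string.Join("|", parts))
                    .ToList();
    }

    static void Main()
    {
        var lines = new[]
        {
            "123| one| two| three",
            "124| one| two| four",
            "125| one |two| three"
        };

        foreach (var row in DistinctRows(lines))
        {
            Console.WriteLine(row);
        }
    }
}
```

For the three sample rows this leaves two rows, since rows 123 and 125 differ only in the first column and in whitespace. The important design point is the same as in the answer: `GetHashCode` must hash exactly the values that `Equals` compares, otherwise equal rows can end up in different hash buckets and slip through `Distinct`.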