LINQ string[] against multiple fields
Lets say I have a table dataContext.Customer with the following fields
FName varchar
LName varchar
Phone varchar
DOB datetime
Address varchar
The table is filled with some sample data, lets say:
John | Smith | 3051112222 | 01/01/1978 | Roosevelt Av 787
Aron | Frank | 7871112222 | 01/01/1979 | Lambda Street 305
Dick | Bush | 9512221111 | 01/01/1980 | John Street 1
John | Allen | 7872222222 | 01/01/1981 | Liberty Av 555
We also have a string array with an arbitrary number of elements, for example:
search[0] = "1978"
search[1] = "John"
I need a LINQ query that will compare each field of the table against each item in the string array incrementally using "contains" or "any" (meaning LIKE in SQL) and only return the rows that match all the given conditions in a record, based on the previous search[] example the LINQ query should return only record #1.
Another example can be:
search[0] = "Bush"
search[1] = "111"
search[2] = "John"
and only record #3 should by returned. Finally for:
search[0] = "John"
Records #1, #3 and #4 should be returned (I think the idea is clear)
There's a question on how to compare a s开发者_Python百科tring[] against field in: LINQ: Entity string field contains any of an array of strings
If the answer is a 50 lines C# routine I prefer to solve this directly in the database via stored procedure.
It will be awesome if there is some kind of "reflection" trick to iterate all the field on dataContext.Customers while doing the query (obviously the real table don't have 5 fields).
Performance is not a problem.
I'm pretty sure this cannot be done in a single LINQ line because of the logic needed for the multiple match, but it never hurt to ask, much less to learn anything new :)
UPDATE: Ok, here's a simple SQL code that will accomplish the task. Note that I have cut the amount of search variables to just 2 for clarity. In the real life scenario we can limit the amount of arguments to 10 search parameters. I deliberately did not use functions (well, except for CONVERT) to keep the SQL as simple as possible to see if there is any way to accomplish this in LINQ. Here's the SQL:
declare @_SEARCH1 varchar(1000)
select @_SEARCH1 = 'John'
declare @_SEARCH2 varchar(1000)
select @_SEARCH2 = '111'
select *
from CUSTOMER
where
FName + ' ' + LName + ' ' + Phone + ' ' + CONVERT(varchar, DOB, 101) + ' ' + Address like '%'+@_SEARCH1+'%'
and FName + ' ' + LName + ' ' + Phone + ' ' + CONVERT(varchar, DOB, 101) + ' ' + Address like '%'+@_SEARCH2+'%'
So the question is, is there a way to write a LINQ that will generate this simple SQL? (please note that the comparison is done in the database via 'LIKE', not in the application)
UPDATE 2: Although solutions like the one from Francisco will generate the "LIKE" statement it will fail doing the comparison. The other solutions that pulls all the data from the table to the webserver will do the match correctly, but is totally impractical.
Accepted answer to RUNE FS since is the cleanest solution and will work will any number of fields.
Using the PredicateBuilder
void Main()
{
var search = new string[] { "Romania","RO"};
var query = from c in countries.AllAny(search)
orderby c.name
select c;
query.Dump();
}
public static class QueryExtensions
{
public static IQueryable<T> AllAny<T>(this IQueryable<T> query, string[] search)
{
var properties = typeof(T).GetProperties().Where(p => p.GetCustomAttributes(typeof(System.Data.Linq.Mapping.ColumnAttribute),true).Any()).Select(n=>n.Name);
var andPredicate = PredicateBuilder.True<T>();
foreach ( var term in search )
{
var orPredicate = PredicateBuilder.False<T>();
foreach (var property in properties )
orPredicate = orPredicate.Or(CreateLike<T>(property,term));
andPredicate = andPredicate.And(orPredicate);
}
return query.Where(andPredicate);
}
private static Expression<Func<T,bool>> CreateLike<T>( PropertyInfo prop, string value)
{
var parameter = Expression.Parameter(typeof(T), "f");
var propertyAccess = Expression.MakeMemberAccess(parameter, prop);
var toString = Expression.Call(propertyAccess, "ToString", null, null);
var like = Expression.Call(toString, "Contains", null, Expression.Constant(value,typeof(string)));
return Expression.Lambda<Func<T, bool>>(like, parameter);
}
private static Expression<Func<T,bool>> CreateLike<T>( string propertyName, string value)
{
var prop = typeof(T).GetProperty(propertyName);
return CreateLike<T>(prop, value);
}
}
// http://www.albahari.com/nutshell/predicatebuilder.aspx
public static class PredicateBuilder
{
public static Expression<Func<T, bool>> True<T> () { return f => true; }
public static Expression<Func<T, bool>> False<T> () { return f => false; }
public static Expression<Func<T, bool>> Or<T> (this Expression<Func<T, bool>> expr1,
Expression<Func<T, bool>> expr2)
{
var invokedExpr = Expression.Invoke (expr2, expr1.Parameters.Cast<Expression> ());
return Expression.Lambda<Func<T, bool>>
(Expression.OrElse (expr1.Body, invokedExpr), expr1.Parameters);
}
public static Expression<Func<T, bool>> And<T> (this Expression<Func<T, bool>> expr1,
Expression<Func<T, bool>> expr2)
{
var invokedExpr = Expression.Invoke (expr2, expr1.Parameters.Cast<Expression> ());
return Expression.Lambda<Func<T, bool>>
(Expression.AndAlso (expr1.Body, invokedExpr), expr1.Parameters);
}
}
Update This code is a generic solution for the following query
from c in countries
where (c.name.ToString().Contains(search[0]) || c.name.ToString().Contains(search[1]))
&& (c.iso_code.ToString().Contains(search[0]) || c.iso_code.ToString().Contains(search[1]))
/*&& ...*/
orderby c.name
select c
This code can be improved in many ways. For sample, for the string properties, there is no need to call ToString before Contains ( this will generate a convert(nvarchar)) and I really think someone who needs this will only want to look at the varchar, nvarchar columns.
Assuming that '\t' would never be part of the data you could do the following. You can of course substitute with any other character. With that assumption you could do as below:
public static IEnumerable<T> Where<T>(this IEnumerable<T> sequence,
string[] criteria){
var properties = typeof(T).GetProperties()
.Where(p=>p.GetGetMethod() != null);
return from s in sequence
let text = properties.Aggregate("",(acc,prop) =>
acc +
"\t" +
prop.GetValue(s,null)
)
where criteria.All(c => text.Contains(c))
select s;
}
EDIT
I originally didn't include the usage since I found no collection in the original post but assuming the sequence is defined as IEnumerabl<Person>
and can be accessed as a property called Persons on a variable db. the code would look similar to:
IEnumerable<Person> persons = db.Persons.Where(criteria);
I don't think linq to sql can do this efficiently, but linq to objects might be able to help if you can stomach moving the entire table over the wire between your application code and the database exactly one time per search.
The first step is to get a DataReader that will read through all of the records in your table. It's important to use a datareader, because you probably don't want to fill up memory with an entire table (and then again, maybe you do -- see my note about caching near the end).
Once you have that, you need to convert it to an IEnumerable for use with linq-to-objects. You can do that with a simple two-line utility method, like so:
IEnumerable<IDataRecord> EnumerableFromDataReader(IDataReader reader)
{
while (reader.Read())
yield return reader;
}
The actual code I normally use is a little more complicated, but what I do might not fit in well with how the rest of your project is structured so I'll leave it at this short method for now.
Once we have the Enumerable, we take advantage of the composable nature of linq to get some (relatively) simple (or at least brief) code like this:
IEnumerable<IDataRecord> SearchMyTable(string[] filters)
{
var results = EnumerableFromDataReader(GetMyTable());
foreach (string filter in filters)
{
results = results.Where(r => String.Join("",r.GetValues().Cast<string>().ToArray()).Contains(filter));
}
return results;
}
It's not a single linq query, but it's not exactly 50 lines of code either.
Ignoring network latency, the performance of this code is actually quite good. If you want to play with some caching for all or part of the table after the columns are joined, the performance is amazing.
Update: There is (at least) one flaw in this code. For each filter, I re-convert all the rows that survive to that filter to a string... even if I've already done so for a previous filter. The way to fix this is to first do a projection to pair the rows with the string version. But as it's after 11pm my time, I'll leave the code as it stands for now. The good news is the the final fixed code should be the same length: just add a .Select()
call to the first and last lines and change up the middle of the foreach loop a bit.
I don't have a C# compiler at hand right now but I have this idea:
Declare a lambda Expression like this:
public Expression<Func<Customer, bool>> GetFilterFromString(string input)
{
return p=> p.FName.Contains(input) ||
p.LName.Contains(input) ||
p.Phone.Contains(input) ||
p.DOB.ToString().Contains(input) ||
p.Address.Contains(input) ||
}
Implementation may vary depending on your needs (like concatenating all fields, like you did with your SQL query).
Then in your main query function:
IQueryable<Customer> customers = dataContext.Customers;
foreach(string inputSearch in search)
{
customers = customers.Where(GetFilterFromString(inputSearch));
}
IEnumerable<Customer> results = customers.AsEnumerable();
I think the main advantage of this approach is that you have to declare GetFilterFromString once. Hope it is what you're looking for.
Edit:
Ok, so I read the SQL statement you were looking for (kind of late... but anyway). I think it's easy to adapt my solution. We will have to tweak the lambda expression a little:
public Expression<Func<Customer, bool>> GetFilterFromString(string input)
{
return p => (p.FName + " " +
p.LName + " " +
p.Phone + " " +
p.DOB.ToString() + " " +
p.Address)
.Contains(input)
}
May be what i might do is, first ill get the records from DB having first name compared with the array item i have.
Once i get the the subset of the table data (assuming that the structure of DB table is same and the array structure as well meaning that array[0] is always first name), then i search for what ever pattern i am looking for in the memory.
I am sure this is not the exactly solution your hoping for. But lemme think further, mean while i am open for more ideas folks :)
Similar to Francisco's answer but with a single application of the where clause:
string[] search = new string[] { "Bush", "111", "John" };
var customers = new[] {
new {FName = "Dick", Surname = "Bush", Phone = "9512221111", DOB = new DateTime(1980,01,01), Address = "John Street 1" },
new {FName = "John", Surname = "Smith", Phone = "3051112222", DOB = new DateTime(1978,01,01), Address = "Roosevelt Av 787"}
};
var result = customers.Where(customer => search.All(term =>
customer.FName.Contains(term)
|| customer.Surname.Contains(term)
|| customer.DOB.ToString().Contains(term)
|| customer.Phone.Contains(term)
|| customer.Address.Contains(term)
));
精彩评论