C#: finding matching words in a table column using Linq2Sql

I am trying to use Linq2Sql to return all rows that contain values from a list of strings. The Linq2Sql class object has a string property that contains words separated by spaces.

public class MyObject
{
    public string MyProperty { get; set; }
}

Example MyProperty values are:

MyObject1.MyProperty = "text1 text2 text3 text4"
MyObject2.MyProperty = "text2"

For example, using a string collection, I pass the below list

var list = new List<string>() { "text2", "text4" };

This would return both items in my example above, as they both contain the "text2" value.

I attempted this with the code below. However, because of my extension method, Linq2Sql cannot evaluate the query.

public static IQueryable<MyObject> WithProperty(this IQueryable<MyProperty> qry,
    IList<string> p)
{
    return from t in qry
        where t.MyProperty.Contains(p, ' ')
        select t;
}

I also wrote an extension method

public static bool Contains(this string str, IList<string> list, char seperator)
{
    if (str == null) return false;
    if (list == null) return true;

    var splitStr = str.Split(new char[] { seperator },
        StringSplitOptions.RemoveEmptyEntries);

    bool retval = false;
    int matches = 0;

    foreach (string s in splitStr)
    {
        foreach (string l in list)
        {
            if (String.Compare(s, l, true) == 0)
            {
                retval = true;
                matches++;
            }
        }
    }

    return retval && (splitStr.Length > 0) && (list.Count == matches);
}

Any help or ideas on how I could achieve this?


You're on the right track. The first parameter of your extension method WithProperty has to be of type IQueryable<MyObject>, not IQueryable<MyProperty>.

Anyway, you don't need an extension method on IQueryable. Just use your Contains method in a lambda for filtering. This should work:

List<string> searchStrs = new List<string>() { "text2", "text4" };

IEnumerable<MyObject> myFilteredObjects = dataContext.MyObjects
                   .Where(myObj => myObj.MyProperty.Contains(searchStrs, ' '));

Update:

The above code snippet does not work, because the Contains extension method cannot be converted into a SQL statement. I thought about the problem for a while and arrived at a solution by asking 'how would I do that in SQL?': you could query for each single keyword and union all the results together. Sadly, the deferred execution of Linq-to-SQL prevents doing all of that in one query, so I came up with this compromise of a compromise. It runs one query per keyword. A keyword matches if it is one of the following:

  • equal to the string
  • in between two separators
  • at the start of the string and followed by a separator
  • or at the end of the string and preceded by a separator

This forms a valid expression tree that Linq-to-SQL can translate into SQL. After each query I do not defer execution; I fetch the data immediately and store it in a list. All the lists are unioned afterwards.

public static IEnumerable<MyObject> ContainsOneOfTheseKeywords(
        this IQueryable<MyObject> qry, List<string> keywords, char sep)
{
    List<List<MyObject>> parts = new List<List<MyObject>>();

    foreach (string keyw in keywords)
        parts.Add((
            from obj in qry
            where obj.MyProperty == keyw ||
                  obj.MyProperty.IndexOf(sep + keyw + sep) != -1 ||
                  // "== 0": the keyword must sit at the very start of the string
                  obj.MyProperty.IndexOf(keyw + sep) == 0 ||
                  // EndsWith also works when sep + keyw occurs earlier in the string
                  obj.MyProperty.EndsWith(sep + keyw)
            select obj).ToList());

    // Start from an empty sequence so an empty keyword list yields an empty
    // result instead of a NullReferenceException.
    IEnumerable<MyObject> union = Enumerable.Empty<MyObject>();
    foreach (List<MyObject> part in parts)
        union = union.Union(part);

    return union.ToList();
}

And use it:

List<string> searchStrs = new List<string>() { "text2", "text4" };

IEnumerable<MyObject> myFilteredObjects = dataContext.MyObjects
                    .ContainsOneOfTheseKeywords(searchStrs, ' ');

That solution is anything but elegant. For 10 keywords, I have to query the database 10 times, and each time fetch the data and store it in memory. This wastes memory and performs badly. I just wanted to demonstrate that it is possible in Linq (maybe it can be optimized here or there, but I think it won't get perfect).

I would strongly recommend moving the logic of that function into a stored procedure on your database server: one single query, optimized by the database server, and no wasted memory.

Another alternative would be to rethink your database design. If you want to query the contents of one field (you are treating this field like an array of keywords, separated by spaces), you may simply have chosen an inappropriate database design. You would rather create a new table with a foreign key to your table, where each row holds exactly one keyword. The queries would be much simpler, faster and more understandable.
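To illustrate the redesign, here is a minimal sketch of what the query could look like with one keyword per row. The class names MyObjectRow and KeywordRow, the column names, and the helper WithAnyKeyword are all hypothetical; they stand in for whatever your own mapping would generate. It assumes the provider can translate Contains on a local list and Any over a table, and it runs unchanged against in-memory data via AsQueryable().

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical normalized shape: the parent row no longer stores keywords.
public class MyObjectRow
{
    public int Id { get; set; }
}

// Hypothetical child table: exactly one keyword per row.
public class KeywordRow
{
    public int MyObjectId { get; set; }   // foreign key to MyObjectRow.Id
    public string Keyword { get; set; }
}

public static class KeywordQueries
{
    // Returns every object that has at least one of the given keywords.
    public static IQueryable<MyObjectRow> WithAnyKeyword(
        IQueryable<MyObjectRow> objects,
        IQueryable<KeywordRow> keywordRows,
        List<string> keywords)
    {
        return from o in objects
               where keywordRows.Any(k => k.MyObjectId == o.Id
                                       && keywords.Contains(k.Keyword))
               select o;
    }
}
```

Because each keyword is compared as a whole value, there is no separator handling at all, and a keyword such as "text2" cannot accidentally match "text22".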


I haven't tried, but if I remember correctly, this should work:

from t in ctx.Table
where list.Any(x => t.MyProperty.Contains(x))
select t

You can replace Any() with All() if you want all strings in the list to match.

EDIT:

To clarify what I was trying to do with this, here is how each query expands when written out by hand, to explain the use of Any and All:

where list.Any(x => t.MyProperty.Contains(x))

Translates to:

where t.MyProperty.Contains(list[0]) || t.MyProperty.Contains(list[1]) ||
      t.MyProperty.Contains(list[n])

And

where list.All(x => t.MyProperty.Contains(x))

Translates to:

where t.MyProperty.Contains(list[0]) && t.MyProperty.Contains(list[1]) &&
      t.MyProperty.Contains(list[n])
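If the provider refuses to translate Any() or All() over a local list, the expanded forms above can be built by hand as an expression tree. This is only a sketch of that idea: KeywordPredicate and Build are hypothetical names, MyObject is repeated so the snippet stands alone, and I have not run it against a real DataContext. Note that it does substring matching, exactly like the Contains calls above, so "text2" also matches "text22".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class MyObject
{
    public string MyProperty { get; set; }
}

public static class KeywordPredicate
{
    // Builds t => t.MyProperty.Contains(k0) || t.MyProperty.Contains(k1) || ...
    // or the same chain joined with && when matchAll is true.
    public static Expression<Func<MyObject, bool>> Build(
        IEnumerable<string> keywords, bool matchAll)
    {
        ParameterExpression t = Expression.Parameter(typeof(MyObject), "t");
        MemberExpression property = Expression.Property(t, "MyProperty");
        var contains = typeof(string).GetMethod("Contains", new[] { typeof(string) });

        Expression body = null;
        foreach (string keyword in keywords)
        {
            Expression test = Expression.Call(
                property, contains, Expression.Constant(keyword, typeof(string)));

            if (body == null)
                body = test;
            else if (matchAll)
                body = Expression.AndAlso(body, test);
            else
                body = Expression.OrElse(body, test);
        }

        // Empty list: Any() over nothing is false, All() over nothing is true.
        if (body == null)
            body = Expression.Constant(matchAll);

        return Expression.Lambda<Func<MyObject, bool>>(body, t);
    }
}
```

Usage would be along the lines of ctx.Table.Where(KeywordPredicate.Build(list, false)) for the Any() behaviour, and true as the second argument for All().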