
Django query case-insensitive list match

I have a list of names that I want to match case-insensitively. Is there a way to do it without using a loop like the one below?

a = ['name1', 'name2', 'name3']
result = any([Name.objects.filter(name__iexact=name) for name in a])


Unfortunately, there is no __iin field lookup. But there is an iregex lookup that might be useful, like so:

# ^ and $ anchor the pattern so that e.g. 'name10' is not matched as a substring hit
result = Name.objects.filter(name__iregex=r'^(name1|name2|name3)$')

or even:

a = ['name1', 'name2', 'name3']
result = Name.objects.filter(name__iregex=r'^(' + '|'.join(a) + r')$')

Note that if a can contain characters that are special in a regex, you need to escape them properly.

NEWS: In Django 1.7+ it is possible to create your own lookups, so you can actually use filter(name__iin=['name1', 'name2', 'name3']) after proper initialization. See documentation reference for details.


In PostgreSQL you could try creating a case-insensitive index as described here:

https://stackoverflow.com/a/4124225/110274

Then run a query:

from django.db.models import Q
name_filter = Q()
for name in names:
    name_filter |= Q(name__iexact=name)
result = Name.objects.filter(name_filter)

With such an index in place, the lookup should run faster than a regex-matching query.


Another way to do this is with Django's database functions and annotation:

from django.db.models.functions import Lower
Record.objects.annotate(name_lower=Lower('name')).filter(name_lower__in=['two', 'one'])
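
To see what the annotate/Lower approach boils down to, here is a small standard-library sketch that runs the same kind of comparison directly against an in-memory SQLite database. The table name, rows and search values are made up for illustration, and the SQL is only roughly what the ORM would emit (the exact statement depends on the backend). The point it shows is that LOWER() only touches the column, so the search values have to be lowercased by the caller.

```python
import sqlite3

# Throwaway in-memory database with a hypothetical "record" table.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE record (name TEXT)')
conn.executemany('INSERT INTO record (name) VALUES (?)',
                 [('One',), ('TWO',), ('three',)])

wanted = ['Two', 'ONE']
# LOWER() is applied to the column only, so lowercase the inputs ourselves.
params = [w.lower() for w in wanted]
placeholders = ', '.join('?' for _ in params)
sql = (f'SELECT name FROM record '
       f'WHERE LOWER(name) IN ({placeholders}) ORDER BY name')
rows = [row[0] for row in conn.execute(sql, params)]
print(rows)
```

In the ORM version that means passing something like [n.lower() for n in names] to name_lower__in rather than the raw list.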


Adding onto what Rasmuj said, escape any user input like so:

import re
result = Name.objects.filter(name__iregex=r'^(' + '|'.join(re.escape(n) for n in a) + r')$')
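
If you want to check the pattern before handing it to the database, a minimal sketch like the one below may help. build_iregex_pattern is a hypothetical helper name, not part of Django, and the check uses Python's own re module, so it only approximates what PostgreSQL or MySQL will do with the same pattern (re.escape targets Python's regex dialect).

```python
import re

def build_iregex_pattern(names):
    """Return an anchored alternation matching any of `names` as a whole value."""
    # re.escape neutralises regex metacharacters such as '.' or '|'.
    escaped = [re.escape(n) for n in names]
    # ^ and $ force a full-value match instead of a substring match.
    return r'^(' + '|'.join(escaped) + r')$'

pattern = build_iregex_pattern(['name1', 'a.b'])
checker = re.compile(pattern, re.IGNORECASE)

print(bool(checker.search('NAME1')))   # case differs, still matches
print(bool(checker.search('name10')))  # rejected thanks to the anchors
print(bool(checker.search('axb')))     # rejected because '.' was escaped
```

The resulting string can then be passed as name__iregex=pattern. Note that an empty list produces ^()$, which matches only the empty string, so you may want to handle that case separately.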


Keep in mind that, at least in MySQL, you have to set the utf8_bin collation on your tables to make string comparisons case sensitive at all. Otherwise the columns are case preserving but compared case-insensitively. E.g.:

>>> models.Person.objects.filter(first__in=['John', 'Ringo'])
[<Person: John Lennon>, <Person: Ringo Starr>]
>>> models.Person.objects.filter(first__in=['joHn', 'RiNgO'])
[<Person: John Lennon>, <Person: Ringo Starr>]

So, if portability is not crucial and you use MySQL you may choose to ignore the issue altogether.


I am expanding Exgeny's idea into a two-liner (not counting the Q import).

import functools
from django.db.models import Q
Name.objects.filter(functools.reduce(lambda acc, x: acc | Q(name__iexact=x), names, Q()))


Here is an example of a custom User model classmethod that filters users by email, case-insensitively:

from django.db.models import Q

@classmethod
def get_users_by_email_query(cls, emails):
    q = Q()
    for email in [email.strip() for email in emails]:
        q = q | Q(email__iexact=email)
    return cls.objects.filter(q)


After trying many methods, including annotate, which resulted in duplicate objects, I discovered transformers (https://docs.djangoproject.com/en/4.1/howto/custom-lookups/#a-transformer-example) which allow for a simple solution.

Add the following to models.py before model declarations:

class LowerCase(models.Transform):
    lookup_name = "lower"
    function = "LOWER"


models.CharField.register_lookup(LowerCase)
models.TextField.register_lookup(LowerCase)

You can now use the __lower transformer alongside any lookup, in this case: field__lower__in. You can also add bilateral = True to the transformer class for it to apply to both the field and the list items, which should be functionally equivalent to __iin.


If this is a common use case for anyone, you can implement this by adapting the code from Django's In and IExact lookups.

Make sure the following code is imported before all model declarations:

from django.db.models import Field
from django.db.models.lookups import In


@Field.register_lookup
class IIn(In):
    lookup_name = 'iin'

    def process_lhs(self, *args, **kwargs):
        sql, params = super().process_lhs(*args, **kwargs)

        # Convert LHS to lowercase
        sql = f'LOWER({sql})'

        return sql, params

    def process_rhs(self, qn, connection):
        rhs, params = super().process_rhs(qn, connection)

        # Convert RHS to lowercase
        params = tuple(p.lower() for p in params)

        return rhs, params

Example usage:

result = Name.objects.filter(name__iin=['name1', 'name2', 'name3'])