Unique model field in Django and case sensitivity (postgres)
Consider the following situation: -
Suppose my app allows users to create the states / provinces in their country. Just for clarity, we are considering only ASCII characters here.
In the US, a user could create the state called "Texas". If this app is being used internally, let's say the user doesn't care if it is spelled "texas" or "Texas" or "teXas"
But importantly, the system should prevent creation of "texas" if "Texas" is already in the database.
If the model is like the following:
class State(models.Model):
name = models.CharField(max_length=50, unique=True)
The uniqueness would be case-sensitive in postgres; that is, postgres would allow the user to create both "texas" and "Texas" as they are considered unique.
What can be done in this situation to prevent such behavior. How does one go about providing case-insenstitive uniqueness with Django and Postgres
Right now I'm doing the following to prevent creation of case- insensitive duplicates.
class CreateStateForm(forms.ModelForm):
def clean_name(self):
name = self.cleaned_data['name']
try:
State.objects.get(name__iexact=name)
except ObjectDoesNotExist:
return name
raise forms.ValidationError('State already exists.')
class Meta:
model = State
There are a numb开发者_开发百科er of cases where I will have to do this check and I'm not keen on having to write similar iexact checks everywhere.
Just wondering if there is a built-in or better way? Perhaps db_type would help? Maybe some other solution exists?
You could define a custom model field derived from models.CharField
.
This field could check for duplicate values, ignoring the case.
Custom fields documentation is here http://docs.djangoproject.com/en/dev/howto/custom-model-fields/
Look at http://code.djangoproject.com/browser/django/trunk/django/db/models/fields/files.py for an example of how to create a custom field by subclassing an existing field.
You could use the citext module of PostgreSQL https://www.postgresql.org/docs/current/static/citext.html
If you use this module, the the custom field could define "db_type" as CITEXT for PostgreSQL databases.
This would lead to case insensitive comparison for unique values in the custom field.
Alternatively you can change the default Query Set Manager to do case insensitive look-ups on the field. In trying to solve a similar problem I came across:
http://djangosnippets.org/snippets/305/
Code pasted here for convenience:
from django.db.models import Manager
from django.db.models.query import QuerySet
class CaseInsensitiveQuerySet(QuerySet):
def _filter_or_exclude(self, mapper, *args, **kwargs):
# 'name' is a field in your Model whose lookups you want case-insensitive by default
if 'name' in kwargs:
kwargs['name__iexact'] = kwargs['name']
del kwargs['name']
return super(CaseInsensitiveQuerySet, self)._filter_or_exclude(mapper, *args, **kwargs)
# custom manager that overrides the initial query set
class TagManager(Manager):
def get_query_set(self):
return CaseInsensitiveQuerySet(self.model)
# and the model itself
class Tag(models.Model):
name = models.CharField(maxlength=50, unique=True, db_index=True)
objects = TagManager()
def __str__(self):
return self.name
a very simple solution:
class State(models.Model):
name = models.CharField(max_length=50, unique=True)
def clean(self):
self.name = self.name.capitalize()
Explicit steps for Mayuresh's answer:
in postgres do: CREATE EXTENSION citext;
in your models.py add:
from django.db.models import fields class CaseInsensitiveTextField(fields.TextField): def db_type(self, connection): return "citext"
reference: https://github.com/zacharyvoase/django-postgres/blob/master/django_postgres/citext.py
in your model use: name = CaseInsensitiveTextField(unique=True)
On the Postgres side of things, a functional unique index will let you enforce unique values without case. citext is also noted, but this will work with older versions of PostgreSQL and is a useful technique in general.
Example:
# create table foo(bar text);
CREATE TABLE
# create unique index foo_bar on foo(lower(bar));
CREATE INDEX
# insert into foo values ('Texas');
INSERT 0 1
# insert into foo values ('texas');
ERROR: duplicate key value violates unique constraint "foo_bar"
Besides already mentioned option to override save, you can simply store all text in lower case in database and capitalize them on displaying.
class State(models.Model):
name = models.CharField(max_length=50, unique=True)
def save(self, force_insert=False, force_update=False):
self.name = self.name.lower()
super(State, self).save(force_insert, force_update)
You can use lookup='iexact' in UniqueValidator on serializer, like this:
class StateSerializer(serializers.ModelSerializer):
name = serializers.CharField(validators=[
UniqueValidator(
queryset=models.State.objects.all(),lookup='iexact'
)]
django version: 1.11.6
If you don't want to use a postgres-specific solution, you can create a unique index on the field with upper()
to enforce uniqueness at the database level, then create a custom Field
mixin that overrides get_lookup()
to convert case-sensitive lookups to their case-insensitive versions. The mixin looks like this:
class CaseInsensitiveFieldMixin:
"""
Field mixin that uses case-insensitive lookup alternatives if they exist.
"""
LOOKUP_CONVERSIONS = {
'exact': 'iexact',
'contains': 'icontains',
'startswith': 'istartswith',
'endswith': 'iendswith',
'regex': 'iregex',
}
def get_lookup(self, lookup_name):
converted = self.LOOKUP_CONVERSIONS.get(lookup_name, lookup_name)
return super().get_lookup(converted)
And you use it like this:
from django.db import models
class CICharField(CaseInsensitiveFieldMixin, models.CharField):
pass
class CIEmailField(CaseInsensitiveFieldMixin, models.EmailField):
pass
class TestModel(models.Model):
name = CICharField(unique=True, max_length=20)
email = CIEmailField(unique=True)
You can read more about this approach here.
You can do this by overwriting the Model's save method - see the docs. You'd basically do something like:
class State(models.Model):
name = models.CharField(max_length=50, unique=True)
def save(self, force_insert=False, force_update=False):
if State.objects.get(name__iexact = self.name):
return
else:
super(State, self).save(force_insert, force_update)
Also, I may be wrong about this, but the upcoming model-validation SoC branch will allow us to do this more easily.
Solution from suhail worked for me without the need to enable citext, pretty easy solution only a clean function and instead of capitalize I used upper()
. Mayuresh's solution also works but changed the field from CharField
to TextField
.
class State(models.Model):
name = models.CharField(max_length=50, unique=True)
def clean(self):
self.name = self.name.upper()
精彩评论