Want to find records with no associated records in Rails
Consider a simple association...
class Person
has_many :friends
end
class Friend
belongs_to :person
end
What is the cleanest way to get all persons that have NO friends in ARel and/or meta_where?
And then what about a 开发者_如何学Pythonhas_many :through version
class Person
has_many :contacts
has_many :friends, :through => :contacts, :uniq => true
end
class Friend
has_many :contacts
has_many :people, :through => :contacts, :uniq => true
end
class Contact
belongs_to :friend
belongs_to :person
end
I really don't want to use counter_cache - and I from what I've read it doesn't work with has_many :through
I don't want to pull all the person.friends records and loop through them in Ruby - I want to have a query/scope that I can use with the meta_search gem
I don't mind the performance cost of the queries
And the farther away from actual SQL the better...
Update 4 - Rails 6.1
Thanks to Tim Park for pointing out that in the upcoming 6.1 you can do this:
Person.where.missing(:contacts)
Thanks to the post he linked to too.
Update 3 - Rails 5
Thanks to @Anson for the excellent Rails 5 solution (give him some +1s for his answer below), you can use left_outer_joins
to avoid loading the association:
Person.left_outer_joins(:contacts).where(contacts: { id: nil })
I've included it here so people will find it, but he deserves the +1s for this. Great addition!
Update 2
Someone asked about the inverse, friends with no people. As I commented below, this actually made me realize that the last field (above: the :person_id
) doesn't actually have to be related to the model you're returning, it just has to be a field in the join table. They're all going to be nil
so it can be any of them. This leads to a simpler solution to the above:
Person.includes(:contacts).where(contacts: { id: nil })
And then switching this to return the friends with no people becomes even simpler, you change only the class at the front:
Friend.includes(:contacts).where(contacts: { id: nil })
Update
Got a question about has_one
in the comments, so just updating. The trick here is that includes()
expects the name of the association but the where
expects the name of the table. For a has_one
the association will generally be expressed in the singular, so that changes, but the where()
part stays as it is. So if a Person
only has_one :contact
then your statement would be:
Person.includes(:contact).where(contacts: { person_id: nil })
Original
Better:
Person.includes(:friends).where(friends: { person_id: nil })
For the hmt it's basically the same thing, you rely on the fact that a person with no friends will also have no contacts:
Person.includes(:contacts).where(contacts: { person_id: nil })
smathy has a good Rails 3 answer.
For Rails 5, you can use left_outer_joins
to avoid loading the association.
Person.left_outer_joins(:contacts).where( contacts: { id: nil } )
Check out the api docs. It was introduced in pull request #12071.
This is still pretty close to SQL, but it should get everyone with no friends in the first case:
Person.where('id NOT IN (SELECT DISTINCT(person_id) FROM friends)')
Persons that have no friends
Person.includes(:friends).where("friends.person_id IS NULL")
Or that have at least one friend
Person.includes(:friends).where("friends.person_id IS NOT NULL")
You can do this with Arel by setting up scopes on Friend
class Friend
belongs_to :person
scope :to_somebody, ->{ where arel_table[:person_id].not_eq(nil) }
scope :to_nobody, ->{ where arel_table[:person_id].eq(nil) }
end
And then, Persons who have at least one friend:
Person.includes(:friends).merge(Friend.to_somebody)
The friendless:
Person.includes(:friends).merge(Friend.to_nobody)
Both the answers from dmarkow and Unixmonkey get me what I need - Thank You!
I tried both out in my real app and got timings for them - Here are the two scopes:
class Person
has_many :contacts
has_many :friends, :through => :contacts, :uniq => true
scope :without_friends_v1, -> { where("(select count(*) from contacts where person_id=people.id) = 0") }
scope :without_friends_v2, -> { where("id NOT IN (SELECT DISTINCT(person_id) FROM contacts)") }
end
Ran this with a real app - small table with ~700 'Person' records - average of 5 runs
Unixmonkey's approach (:without_friends_v1
) 813ms / query
dmarkow's approach (:without_friends_v2
) 891ms / query (~ 10% slower)
But then it occurred to me that I don't need the call to DISTINCT()...
I'm looking for Person
records with NO Contacts
- so they just need to be NOT IN
the list of contact person_ids
. So I tried this scope:
scope :without_friends_v3, -> { where("id NOT IN (SELECT person_id FROM contacts)") }
That gets the same result but with an average of 425 ms/call - nearly half the time...
Now you might need the DISTINCT
in other similar queries - but for my case this seems to work fine.
Thanks for your help
Unfortunately, you're probably looking at a solution involving SQL, but you could set it in a scope and then just use that scope:
class Person
has_many :contacts
has_many :friends, :through => :contacts, :uniq => true
scope :without_friends, where("(select count(*) from contacts where person_id=people.id) = 0")
end
Then to get them, you can just do Person.without_friends
, and you can also chain this with other Arel methods: Person.without_friends.order("name").limit(10)
A NOT EXISTS correlated subquery ought to be fast, particularly as the row count and ratio of child to parent records increases.
scope :without_friends, where("NOT EXISTS (SELECT null FROM contacts where contacts.person_id = people.id)")
Also, to filter out by one friend for instance:
Friend.where.not(id: other_friend.friends.pluck(:id))
Here is an option using a subquery:
# Scenario #1 - person <-> friend
people = Person.where.not(id: Friend.select(:person_id))
# Scenario #2 - person <-> contact <-> friend
people = Person.where.not(id: Contact.select(:person_id))
The above expressions should generate the following SQL:
-- Scenario #1 - person <-> friend
SELECT people.*
FROM people
WHERE people.id NOT IN (
SELECT friends.person_id
FROM friends
)
-- Scenario #2 - person <-> contact <-> friend
SELECT people.*
FROM people
WHERE people.id NOT IN (
SELECT contacts.person_id
FROM contacts
)
精彩评论