开发者

Fighting cartesian product (x-join) when using NHibernate 3.0.0

I'm bad at math but I kind get idea what cartesian product is.

Here is my situation (simplified):

public class Project{
 public IList<Partner> Partners{get;set;}
}
public class Partner{
 public IList<PartnerCosts> Costs{get;set;}
 public IList<Address> Addresses{get;set;}
}
public class PartnerCosts{
 public Money Total{get;set;}
}
public class Money{
 public decimal Amount{get;set;}
 public i开发者_如何学运维nt CurrencyCode{get;set;}
}
public class Address{
 public string Street{get;set;}
}

My aim is to effectively load entire Project.

Problem of course is:

  • If I try to eager load partners and their costs, query returns gazillion rows
  • If I lazy load Partner.Costs, db gets request spammed (which is a bit faster than first approach)

As I read, common workaround is to use MultiQueries, but I kind a just don't get it.

So I'm hoping to learn through this exact example.

How to effectively load whole Project?

P.s. I'm using NHibernate 3.0.0.

Please, do not post answers with hql or string fashioned criteria api approaches.


Ok, I wrote an example for myself reflecting your structure and this should work:

int projectId = 1; // replace that with the id you want
// required for the joins in QueryOver
Project pAlias = null;
Partner paAlias = null;
PartnerCosts pcAlias = null;
Address aAlias = null;
Money mAlias = null;

// Query to load the desired project and nothing else    
var projects = repo.Session.QueryOver<Project>(() => pAlias)
    .Where(p => p.Id == projectId)
    .Future<Project>();

// Query to load the Partners with the Costs (and the Money)
var partners = repo.Session.QueryOver<Partner>(() => paAlias)
    .JoinAlias(p => p.Project, () => pAlias)
    .Left.JoinAlias(() => paAlias.Costs, () => pcAlias)
    .JoinAlias(() => pcAlias.Money, () => mAlias)
    .Where(() => pAlias.Id == projectId)
    .Future<Partner>();

// Query to load the Partners with the Addresses
var partners2 = repo.Session.QueryOver<Partner>(() => paAlias)
    .JoinAlias(o => o.Project, () => pAlias)
    .Left.JoinAlias(() => paAlias.Addresses, () => aAlias)
    .Where(() => pAlias.Id == projectId)
    .Future<Partner>();

// when this is executed, the three queries are executed in one roundtrip
var list = projects.ToList();
Project project = list.FirstOrDefault();

My classes had different names but reflected the exact same structure. I replaced the names and I hope there are no typos.

Explanation:

The aliases are required for the joins. I defined three queries to load the Project you want, the Partners with their Costs and the Partners with their Addresses. By using the .Futures() I basically tell NHibernate to execute them in one roundtrip at the moment when I actually want the results, using projects.ToList().

This will result in three SQL statements that are indeed executed in one roundtrip. The three statements will return the following results: 1) 1 row with your Project 2) x rows with the Partners and their Costs (and the Money), where x is the total number of Costs for the Project's Partners 3) y rows with the Partners and their Addresses, where y is the total number of Addresses for the Project's Partners

Your db should return 1+x+y rows, instead of x*y rows, which would be a cartesian product. I do hope that your DB actually supports that feature.


If you are using Linq on your NHibernate, you can simplify cartesian-prevention with this:

int projectId = 1;
var p1 = sess.Query<Project>().Where(x => x.ProjectId == projectId);


p1.FetchMany(x => x.Partners).ToFuture();

sess.Query<Partner>()
.Where(x => x.Project.ProjectId == projectId)
.FetchMany(x => x.Costs)
    .ThenFetch(x => x.Total)
.ToFuture();

sess.Query<Partner>()
.Where(x => x.Project.ProjectId == projectId)
.FetchMany(x => x.Addresses)
.ToFuture();


Project p = p1.ToFuture().Single();

Detailed explanation here: http://www.ienablemuch.com/2012/08/solving-nhibernate-thenfetchmany.html


Instead of eager fetch multiple collections and get a nasty Cartesian product:

Person expectedPerson = session.Query<Person>()
    .FetchMany(p => p.Phones)
        .ThenFetch(p => p.PhoneType)
    .FetchMany(p => p.Addresses)
    .Where(x => x.Id == person.Id)
    .ToList().First();

You should batch children objects in one database call:

// create the first query
var query = session.Query<Person>()
      .Where(x => x.Id == person.Id);
// batch the collections
query
   .FetchMany(x => x.Addresses)
   .ToFuture();
query
   .FetchMany(x => x.Phones)
   .ThenFetch(p => p.PhoneType)
   .ToFuture();
// execute the queries in one roundtrip
Person expectedPerson = query.ToFuture().ToList().First();

I just wrote a blog post about it which explains how to avoid that using Linq, QueryOver or HQL http://blog.raffaeu.com/archive/2014/07/04/nhibernate-fetch-strategies/


I just wanted to contribute to the really helpful answer by Florian. I found out the hard way that the key to all of this is are the aliases. The aliases determines what goes in to the sql and are used as "identifiers" by NHibernate. The minimal Queryover to successfully load a three level object graph is this:

Project pAlias = null;
Partner paAlias = null;

IEnumerable<Project> x = session.QueryOver<Project>(() => pAlias)
 .Where(p => p.Id == projectId)
 .Left.JoinAlias(() => pAlias.Partners, () => paAlias)
 .Future<Project>();


session.QueryOver(() => paAlias).Fetch(partner => partner.Costs).
 .Where(partner => partner.Project.Id == projectId)
 .Future<Partner>();

The first query loads project and its child partners. The important part is the alias for Partner. The partner alias is used to name the second query. The second query loads partners and costs. When this executed as a "Multiquery", Nhibernate will "know" that the first and second query are connected by the paAlias (or rather the sqls generated will have column aliases that are "identical"). So the second query will continue the loading of Partners that was already started in the first query.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜