Linq-to-SQL: How to shape the data with group by?
I have an example database, it contains tables for Movies, People and Credits. The Movie table contains a Title and an Id. The People table contains a Name and an Id. The Credits table relates Movies to the People that worked on those Movies, in a particular role. The table looks like this:
CREATE TABLE [dbo].[Credits] (
[Id] [int] IDENTITY (1, 1) NOT NULL PRIMARY KEY,
[PersonId] [int] NOT NULL FOREIGN KEY REFERENCES People(Id),
[MovieId] [int] NOT NULL FOREIGN KEY REFERENCES Movies(Id),
[Role] [char] (1) NULL
In this simple example, the [Role] column is a single character, by my convention either 'A' to indicate the person was an actor on that particular movie, or 'D' for director.
I'd like to perform a query on a particular person that returns the person's name, plus a list of all the movies the person has worked on, and the roles in those movies.
If I were to serialize it to json, it might look like this:
{
"name" : "Clint Eastwood",
"movies" : [
{ "title": "Unforgiven", "roles": ["actor", "director"] },
{ "title": "Sands of Iwo Jima", "roles": ["director"] },
{ "title": "Dirty Harry", "roles": ["actor"] },
...
]
}
How can I write a LINQ-to-SQL query that shapes the output like that?
I'm having trouble doing it efficiently.
Try #1
if I use this query:
int personId = 10007;
var persons =
from p in db.People
where p.Id == personId
select new
{
name = p.Name,
movies =
(from m in db.Movies
join c in db.Credits on m.Id equals c.MovieId
where (c.PersonId == personId)
select new {
title = m.Title,
role = (c.Role=="D"?"director":"actor")
})
};
I get something like this:
{
"name" : "Clint Eastwood",
"movies" : [
{ "title": "Unforgiven", "role": "actor" },
{ "title": "Unforgiven", "role": "director" },
{ "title": "Sands of Iwo Jima", "role": "director" },
{ "title": "Dirty Harry", "role": "actor" },
...
]
}
That's not quite right. As you can see there's a duplicate of each movie for which Eastwood played multiple roles. I would expect that because there are multiple rows in the credits table for that movie+person combination, one for each role.
Try #2
I thought I'd use a group by, like this:
var persons =
from p in db.People
where p.Id == personId
select new
{
name = p.Name,
movies =
(from m in db.Movies
join c in db.Credits on m.Id equals c.MovieId
where (c.PersonId == personId)
orderby m.Year
group ((c.Role == "A")? "actor":"director")
by m.Id
into g
开发者_如何学编程 select new {roles = g })
};
The output is pretty close to what I want. It looks like this:
{
"name" : "Clint Eastwood",
"movies" : [
{ "roles": ["actor", "director"]},
{ "roles": ["director"]},
{ "roles": ["actor"]},
...
]
}
That's close, but of course I don't have the movie titles.
Try #3
If I use a group by and include the movie title, like this:
var persons =
from p in db.People
where p.Id == personId
select new
{
name = p.Name,
movies =
(from m in db.Movies
join c in db.Credits on m.Id equals c.MovieId
where (c.PersonId == personId)
orderby m.Year
group ((c.Role == "A")? "actor":"director")
by m.Id
into g
select new { title = m.Title, roles = g })
};
...then it won't compile, due to
error CS0103: The name 'm' does not exist in the current context
How can I shape the output the way I want?
It's a lot easier to reason about if you start from the relation table (credits):
var query =
from c in context.Credits
where c.PersonId == 1
group c by c.Person into g
select new
{
PersonName = g.Key.Name,
Credits = from cr in g
group cr by cr.Movie into g2
select new
{
MovieTitle = g2.Key.Name,
Roles = g2.Select(ci =>
(ci.Role == 'A') ? "Actor" : "Director")
}
};
Here's code that will display the results:
foreach (var result in query)
{
Console.WriteLine(result.PersonName);
foreach (var credit in result.Credits)
{
string roles = string.Join(",", credit.Roles.ToArray());
Console.WriteLine(" " + credit.MovieTitle + ": " + roles);
}
}
I believe that you'll need to materialize the query, then group by name and title and use string.Join to collate the roles.
int personId = 10007;
var persons = db.People.Where( p => p.Id == personId );
var movies = db.Movies
.Join( db.Credits.Where( c => c.PersonId == personId),
m => m.Id,
c => c.MovieId,
(m,c) => new {
personid = c.PersonId,
title = m.title,
role = c.Role == "D" : "director", "actor"
})
.GroupBy( g => new { g.personid, g.title } )
.ToList()
.Select( g => new {
personid = g.Key.personid,
title = g.Key.title
roles = string.Join( ",", g.Select( g => g.role ).ToArray() )
});
var personsWithMovies = people.Join( movies, p => p.PersonId, m => m.personid, (p,m) => new {
name = p.Name,
movies = m
});
Thanks to the hint from tvanfosson, I was able to come up with this, which works for me!
var persons =
from p in db.People
where p.Id == personId
select new
{
name = p.Name,
movies =
(from m in db.Movies
join c in db.Credits on m.Id equals c.MovieId
where (c.PersonId == personId)
group ((c.Role =="A")?"actor":"director") by m into sg
orderby sg.Key.year
select new { title = sg.Key.Title, roles = sg } )
};
I also took some hints from Aaronaught and tried starting with the Credits table, and using the generated associations. That made things simpler. This code also works:
var persons =
from c in db.Credits
where c.PersonId == arg
group c by c.People into g
select new
{
name = g.Key.Name,
credits = from cr in g
group ((cr.Role == "A") ? "actor" : "director")
by cr.Movies into g2
orderby g2.Key.Year
select new { title = g2.Key.Title, roles = g2 }
};
...and it produces the same (or equivalent) output when serialized the the JavaScriptSerializer.
The key realization for me, that allowed me to get this done, was that I could use a compound key for the group, and that I could select on the fields within the key. The second key realization was that I should use the generated associations.
精彩评论