开发者

why does entity framework+mysql provider enumeration returns partial results with no exceptions

I'm trying to make sense of a situation I have using entity framework on .net 3.5 sp1 + MySQL 6.1.2.0 as the provider. It involves the following code:

Response.Write("Products: " + plist.Count() + "<br />");
var total = 0;
foreach (var p in plist)
{
//... some actions
    total++;
//... other actions
}
Response.Write("Total Products Checked: " + total + "<br />");

Basically the total products is varying on each run, and it isn't matching the full total in plist. Its varies widely, fro开发者_如何学Pythonm ~ 1/5th to half.

There isn't any control flow code inside the foreach i.e. no break, continue, try/catch, conditions around total++, anything that could affect the count. As confirmation, there are other totals captured inside the loop related to the actions, and those match the lower and higher total runs.

I don't find any reason to the above, other than something in entity framework or the mysql provider that causes it to end the foreach when retrieving an item.

The body of the foreach can have some good variation in time, as the actions involve file & network access, my best shot at the time is that when the .net code takes beyond certain threshold there is some type of timeout in the underlying framework/provider and instead of causing an exception it is silently reporting no more items for enumeration.

Can anyone give some light in the above scenario and/or confirm if the entity framework/mysql provider has the above behavior?


Update 1: I can't reproduce the behavior by using Thread.Sleep in a simple foreach in a test project, not sure where else to look for this weird behavior :(.


Update 2: in the example above the .Count() always returns the same + correct amount of items. Using ToList or ToArray as suggested gets around of the issue as expected (no flow control statements in the foreach body) and both counts match + don't vary on each run.

What I'm interested in is what causes this behavior in entity framework + mysql. Would really prefer not having to change the code in all the projects that use entity framework + mysql to do .ToArray before enumerating the results because I don't know when it'll swallow some results. Or if I do it, at least know what/why it happened.


If the problem is related to the provider or whatever, then you can solve/identify that by realising the enumerable before you iterate over it:

var realisedList = plist.ToArray();
foreach(var p in realisedList)
{
  //as per your example
}

If, after doing this, the problem still persists then

a) One of the actions in the enumerator is causing an exception that is getting swallowed somewhere

b) The underlying data really is different every time.

UPDATE: (as per your comment)

[deleted - multiple enumerations stuff as per your comment]

At the end of the day - I'd be putting the ToArray() call in to have the problem fixed in this case (if the Count() method is required to get a total, then just change it to .Length on the array that's constructed).

Perhaps MySql is killing the connection while you're enumerating, and doesn't throw an error to EF when the next MoveNext() is called. EF then just dutifully responds by saying that the enumerable is simply finished. If so, until such a bug in the provider is fixed, the ToArray() is the way forward.


I think actually that you hit on the answer in your question, but it may be the data that is causing the problem not the timeout. Here is the theory:

One (or several) row(s) in the result set has some data that causes an exception / problem, when it hits that row the system thinks that it has reached the last row.

To test this you could try:

  • Ordering the data and see if the number returned in the for each statement is the same each time.
  • Select only the id column and see if the problem goes away
  • Remove all rows from the table, add them back a few at a time to see if a specific row is causing the problem

If it is a timeout problem, have you tried changing the timeout in the connection string.


I believe it has to do with the way the EF handles lazy loading. You might have to use either Load() or Include() and also check using IsLoaded property within your processing loop. Check out these two links for more information:
http://www.singingeels.com/Articles/Entity_Framework_and_Lazy_Loading.aspx
http://blogs.msdn.com/jkowalski/archive/2008/05/12/transparent-lazy-loading-for-entity-framework-part-1.aspx

I apologize I don't know more about EF to be more specific. Hopefully the links will provide enough info to get you started and others can chime in with any questions you might have.


The issue, cause and workaround is described exactly in this mysql bug.

As suspected it Is a timeout related error in the provider, but its not the regular timeout i.e. net_write_timeout. That's why the simple reproduction in a test project didn't work, since the timeout relates to All the cycles of the foreach and not just a particularly long body between the read of 2 rows.

As of now, the issue is present in the latest version of the MySql provider and under normal conditions would only affect scenarios where rows are being read with a connection maintained for a long time (which might or not involve a slow query). This is great, because it doesn't affect all of the previous projects where I have used MySql / applying the workaround to the sources also means it doesn't fail silently.

Ps. couple of what seem to be related mysql bugs: 1, 2

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜