Mass filtering with protobuf-net
I have serialized a list of objects with protobuf-net.
Theoretically, the .bin file can contain millions of objects.
Let's assume the objects are of a class containing the following:
public string EventName;
I have to take a query and create a list containing the objects matching the query. What is the correct way to extract the matching objects from the serialized file using LINQ?
The protobuf format is a linear sequence of items; any indexing etc. you want can only be applied separately. However, IEnumerable<T> is available; you might find that:
var match = Serializer.DeserializeItems<YourType>(source)
    .First(item => item.Id == id);
does the job nicely; this:
- is lazily spooled; each item is yielded individually, so you don't need a glut of memory
- is short-circuited; if the item is found near the start, it'll exit promptly
Or for multiple items:
var list = Serializer.DeserializeItems<YourType>(source)
.Where(item => item.Foo == foo);
(add a ToList to the end of the above if you want to buffer the matching items in memory, or use it without a ToList if you just want to parse the data once in a forwards-only way)
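To make the filtering approach concrete for the EventName field from the question, here is a minimal sketch. The LogEvent type, the EventFilter helper and the file name are hypothetical names introduced for illustration. The sketch assumes each item was written with a Base128 length prefix under field 1; as far as I know that matches what protobuf-net emits when a List<T> is serialized as the root object, but check it against your own file. It uses the DeserializeItems overload that takes a prefix style and field number.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using ProtoBuf;

[ProtoContract]
public class LogEvent // hypothetical stand-in for the class in the question
{
    [ProtoMember(1)]
    public string EventName;
}

public static class EventFilter // hypothetical helper, not part of protobuf-net
{
    // Writes items one after another, each with a Base128 length prefix (field 1).
    public static void Write(Stream destination, IEnumerable<LogEvent> events)
    {
        foreach (var e in events)
            Serializer.SerializeWithLengthPrefix(destination, e, PrefixStyle.Base128, 1);
    }

    // Returns a lazy sequence: items are deserialized one at a time as the
    // caller enumerates, and only the matching ones are passed on.
    public static IEnumerable<LogEvent> Matching(Stream source, string eventName)
    {
        return Serializer.DeserializeItems<LogEvent>(source, PrefixStyle.Base128, 1)
                         .Where(e => e.EventName == eventName);
    }
}

public static class Program
{
    public static void Main()
    {
        // "events.bin" is an assumed file name.
        using (var file = File.OpenRead("events.bin"))
        {
            // ToList buffers only the matches; the stream must stay open until here.
            List<LogEvent> hits = EventFilter.Matching(file, "Login").ToList();
            Console.WriteLine(hits.Count);
        }
    }
}
```

Because the query is deferred, it has to be enumerated (here via ToList) while the stream is still open.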
If you want to add some projection over the selected list of elements, you could try a library of mine, https://github.com/Scooletz/protobuf-linq. It is available on NuGet as well. The library greatly lowers the overhead of deserialization; in some cases the cost can drop to 50% of the original query.
Unfortunately, there isn't one. In order to use LINQ, your object must implement either IQueryable<T> or IEnumerable<T>. Unless there is a LINQ provider that can provide an IQueryable<T> interface into your .bin file, you'll either have to:
- Deserialize the file into memory and use LINQ-to-objects (IEnumerable<T>)
- Write your own LINQ provider that can provide an IQueryable<T> and can process the file without loading the whole thing (this is likely the only practical option if your file is HUGE).
protobuf can give you the contents of the file as a streaming IEnumerable<T>, so you can easily do that. Unfortunately I do not know what the method is called, but it is easy to find in the docs.
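One way to expose a file as a streaming IEnumerable<T> is a small iterator method that owns the file handle. This is only a sketch: ProtoFile, ReadItems and Entry are hypothetical names, and it assumes the items were written with a Base128 length prefix under field 1, which it reads with protobuf-net's Serializer.DeserializeItems.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using ProtoBuf;

[ProtoContract]
public class Entry // hypothetical item type
{
    [ProtoMember(1)]
    public string EventName;
}

public static class ProtoFile // hypothetical helper, not part of protobuf-net
{
    // Iterator method: the file is opened when enumeration starts and is
    // closed when enumeration finishes or the enumerator is disposed early.
    public static IEnumerable<T> ReadItems<T>(string path)
    {
        using (var file = File.OpenRead(path))
        {
            foreach (var item in Serializer.DeserializeItems<T>(file, PrefixStyle.Base128, 1))
                yield return item; // one item in memory at a time
        }
    }
}

public static class Example
{
    public static List<Entry> FirstTenLogins(string path)
    {
        // Ordinary LINQ-to-objects over the stream; Take(10) stops reading
        // the file once ten matches have been found.
        return ProtoFile.ReadItems<Entry>(path)
                        .Where(e => e.EventName == "Login")
                        .Take(10)
                        .ToList();
    }
}
```

Every enumeration of the sequence rereads the file from the start, so buffer the result with ToList if you need it more than once.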