What is the use of Enumerable.Zip extension method in Linq?
The Zip operator merges the corresponding elements of two sequences using a specified selector function.
var letters = new string[] { "A", "B", "C", "D", "E" };
var numbers = new int[] { 1, 2, 3 };
var q = letters.Zip(numbers, (l, n) => l + n.ToString());
foreach (var s in q)
    Console.WriteLine(s);
Output:
A1
B2
C3
Zip is for combining two sequences into one. For example, if you have the sequences
1, 2, 3
and
10, 20, 30
and you want the sequence that is the result of multiplying elements in the same position in each sequence to obtain
10, 40, 90
you could say
var left = new[] { 1, 2, 3 };
var right = new[] { 10, 20, 30 };
var products = left.Zip(right, (m, n) => m * n);
It is called "zip" because you think of one sequence as the left-side of a zipper, and the other sequence as the right-side of the zipper, and the zip operator will pull the two sides together pairing off the teeth (the elements of the sequence) appropriately.
It iterates through two sequences and combines their elements, one by one, into a single new sequence. So you take an element of sequence A, transform it with the corresponding element from sequence B, and the result forms an element of sequence C.
One way to think about it is that it's similar to Select, except instead of transforming items from a single collection, it works on two collections at once.
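To make the "one element from each side at a time" idea concrete, here is a rough sketch of how such a method could be written by hand. MyZip and ZipSketch are made-up names for illustration; this is not the actual framework source, which also validates its arguments and may differ in detail.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipSketch
{
    // Hypothetical helper for illustration only, not the real Enumerable.Zip.
    public static IEnumerable<TResult> MyZip<TFirst, TSecond, TResult>(
        IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Func<TFirst, TSecond, TResult> selector)
    {
        using (var e1 = first.GetEnumerator())
        using (var e2 = second.GetEnumerator())
        {
            // Advance both sides together; stop as soon as either one runs out.
            while (e1.MoveNext() && e2.MoveNext())
            {
                yield return selector(e1.Current, e2.Current);
            }
        }
    }
}
```

Calling ZipSketch.MyZip(new[] { 1, 2, 3 }, new[] { 10, 20, 30 }, (m, n) => m * n) yields 10, 40, 90, and a longer sequence on either side is simply cut off at the length of the shorter one.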
From the MSDN article on the method:
int[] numbers = { 1, 2, 3, 4 };
string[] words = { "one", "two", "three" };
var numbersAndWords = numbers.Zip(words, (first, second) => first + " " + second);
foreach (var item in numbersAndWords)
    Console.WriteLine(item);
// This code produces the following output:
// 1 one
// 2 two
// 3 three
If you were to do this in imperative code, you'd probably do something like this:
var numbersAndWords = new List<string>();
for (int i = 0; i < numbers.Length && i < words.Length; i++)
{
    numbersAndWords.Add(numbers[i] + " " + words[i]);
}
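Or if LINQ didn't have Zip in it, you could do this (the Take call keeps the index within the bounds of words, since numbers is the longer array here):
var numbersAndWords = numbers.Take(words.Length).Select(
    (num, i) => num + " " + words[i]
);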
This is useful when you have data spread across simple, array-like lists, each with the same length and order, and each describing a different property of the same set of objects. Zip helps you knit those pieces of data together into a more coherent structure.
So if you have an array of state names and another array of their populations, you could collate them into a State class like so:
IEnumerable<State> GetListOfStates(string[] stateNames, int[] statePopulations)
{
    return stateNames.Zip(statePopulations,
        (name, population) => new State()
        {
            Name = name,
            Population = population
        });
}
Do not let the name Zip throw you off. It has nothing to do with zipping as in compressing a file or a folder. It actually gets its name from how a zipper on clothes works: the zipper has two sides and each side has a row of teeth. When you go in one direction, the zipper enumerates (travels) both sides and closes the zipper by clenching the teeth. When you go in the other direction, it opens the teeth. You end up with either an open or a closed zipper.
It is the same idea with the Zip method. Consider an example where we have two collections. One holds letters and the other holds the name of a food item which starts with that letter. For clarity I am calling them leftSideOfZipper and rightSideOfZipper. Here is the code:
var leftSideOfZipper = new List<string> { "A", "B", "C", "D", "E" };
var rightSideOfZipper = new List<string> { "Apple", "Banana", "Coconut", "Donut" };
Our task is to produce one collection in which each item is the letter, followed by a : separator, followed by the food name. Like this:
A : Apple
B : Banana
C : Coconut
D : Donut
Zip to the rescue. To keep up with our zipper terminology, we will call the result closedZipper, the items of the left side leftTooth, and the items of the right side rightTooth, for obvious reasons:
var closedZipper = leftSideOfZipper
.Zip(rightSideOfZipper, (leftTooth, rightTooth) => leftTooth + " : " + rightTooth).ToList();
In the above, we are enumerating (travelling) the left side of the zipper and the right side of the zipper and performing an operation on each pair of teeth. The operation is concatenating the left tooth (food letter), a :, and then the right tooth (food name). We do that using this lambda:
(leftTooth, rightTooth) => leftTooth + " : " + rightTooth
The end result is this:
A : Apple
B : Banana
C : Coconut
D : Donut
What happened to the last letter E?
If you are pulling a real clothes zipper and one side (it does not matter whether it is the left or the right) has fewer teeth than the other, what will happen? Well, the zipper will stop there. The Zip method does exactly the same: it stops once it has reached the last item on either side. In our case the right side has fewer teeth (food names), so it stops at "Donut".
A lot of the answers here demonstrate Zip, but without really explaining a real-life use case that would motivate the use of Zip.
One particularly common pattern that Zip is fantastic for is iterating over successive pairs of things. This is done by zipping an enumerable x with itself, skipping 1 element: x.Zip(x.Skip(1)). Visual example:
 x | x.Skip(1) | x.Zip(x.Skip(1), ...)
---+-----------+----------------------
 1 |     2     | (1, 2)
 2 |     3     | (2, 3)
 3 |     4     | (3, 4)
 4 |     5     | (4, 5)
 5 |           |
These successive pairs are useful for finding the first differences between values. For example, successive pairs of an IEnumerable<MouseXPosition> can be used to produce an IEnumerable<MouseXDelta>. Similarly, sampled bool values of a button can be interpreted as events like NotPressed/Clicked/Held/Released. Those events can then drive calls to delegate methods. Here's an example:
using System;
using System.Collections.Generic;
using System.Linq;
enum MouseEvent { NotPressed, Clicked, Held, Released }
public class Program {
    public static void Main() {
        // Example: Sampling the boolean state of a mouse button
        List<bool> mouseStates = new List<bool> { false, false, false, false, true, true, true, false, true, false, false, true };
        mouseStates.Zip(mouseStates.Skip(1), (oldMouseState, newMouseState) => {
            if (oldMouseState) {
                if (newMouseState) return MouseEvent.Held;
                else return MouseEvent.Released;
            } else {
                if (newMouseState) return MouseEvent.Clicked;
                else return MouseEvent.NotPressed;
            }
        })
        .ToList()
        .ForEach(mouseEvent => Console.WriteLine(mouseEvent));
    }
}
Prints:
NotPressed
NotPressed
NotPressed
Clicked
Held
Held
Released
Clicked
Released
NotPressed
Clicked
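The "first differences" use mentioned earlier can be sketched the same way. Plain int positions stand in for the MouseXPosition type here, and the names PairwiseSketch and Deltas are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairwiseSketch
{
    // Hypothetical helper: the difference between each sample and the one before it.
    // Note that the source sequence is enumerated twice (once directly, once via Skip).
    public static IEnumerable<int> Deltas(IEnumerable<int> positions)
    {
        return positions.Zip(positions.Skip(1), (previous, current) => current - previous);
    }
}
```

For the samples 10, 12, 15, 15, 11 this gives 2, 3, 0, -4: one fewer result than there are inputs, because the last element has no successor to pair with.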
I don't have the rep points to post in the comments section, but to answer the related question :
What if I want zip to continue where one list run out of elements? In which case the shorter list element should take default value. Output in this case to be A1, B2, C3, D0, E0. – liang Nov 19 '15 at 3:29
What you would do is use Array.Resize() to pad out the shorter sequence with default values, and then Zip() them together.
Code example:
var letters = new string[] { "A", "B", "C", "D", "E" };
var numbers = new int[] { 1, 2, 3 };
if (numbers.Length < letters.Length)
    Array.Resize(ref numbers, letters.Length);
var q = letters.Zip(numbers, (l, n) => l + n.ToString());
foreach (var s in q)
    Console.WriteLine(s);
Output:
A1
B2
C3
D0
E0
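Please note that using Array.Resize() has a caveat (see the question "Redim Preserve in C#?"): it allocates a new array and copies the elements into it, so any other reference to the original array still sees the old, shorter array.
If it is unknown which sequence will be the shorter one, a function can be created that works it out: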
static void Main(string[] args)
{
    var letters = new string[] { "A", "B", "C", "D", "E" };
    var numbers = new int[] { 1, 2, 3 };
    var q = letters.Zip(numbers, (l, n) => l + n.ToString()).ToArray();
    var qDef = ZipDefault(letters, numbers);
    Array.Resize(ref q, qDef.Count());
    // Note: using a second .Zip() to show the results side-by-side
    foreach (var s in q.Zip(qDef, (a, b) => string.Format("{0, 2} {1, 2}", a, b)))
        Console.WriteLine(s);
}
static IEnumerable<string> ZipDefault(string[] letters, int[] numbers)
{
    switch (letters.Length.CompareTo(numbers.Length))
    {
        case -1: Array.Resize(ref letters, numbers.Length); break;
        case 0: goto default;
        case 1: Array.Resize(ref numbers, letters.Length); break;
        default: break;
    }
    return letters.Zip(numbers, (l, n) => l + n.ToString());
}
Output of plain .Zip() alongside ZipDefault():
A1 A1
B2 B2
C3 C3
   D0
   E0
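If the inputs are not arrays, or you would rather not resize anything, the same padding behaviour can be sketched with two enumerators. ZipLongest is a made-up name for this example and, as far as I know, not a built-in LINQ method; missing elements are filled with default(T), so 0 for int and null for string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipLongestSketch
{
    // Hypothetical helper: like Zip, but continues until BOTH sequences are exhausted,
    // passing default(T) to the selector for whichever side has already run out.
    public static IEnumerable<TResult> ZipLongest<TFirst, TSecond, TResult>(
        IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Func<TFirst, TSecond, TResult> selector)
    {
        using (var e1 = first.GetEnumerator())
        using (var e2 = second.GetEnumerator())
        {
            bool has1 = true, has2 = true;
            while (true)
            {
                // Only advance a side that has not finished yet.
                has1 = has1 && e1.MoveNext();
                has2 = has2 && e2.MoveNext();
                if (!has1 && !has2) yield break;

                yield return selector(
                    has1 ? e1.Current : default(TFirst),
                    has2 ? e2.Current : default(TSecond));
            }
        }
    }
}
```

With the letters and numbers arrays from above, ZipLongestSketch.ZipLongest(letters, numbers, (l, n) => l + n) yields A1, B2, C3, D0, E0. If the string side were the shorter one, the selector would receive null for the missing letters, so it may want to substitute its own placeholder.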
Going back to the main answer of the original question, another interesting thing that one might wish to do (when the lengths of the sequences to be "zipped" are different) is to join them so that the ends of the lists line up instead of the starts. This can be accomplished by skipping the appropriate number of items using .Skip().
foreach (var s in letters.Skip(letters.Length - numbers.Length).Zip(numbers, (l, n) => l + n.ToString()).ToArray())
    Console.WriteLine(s);
Output:
C1
D2
E3
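The Skip() call above lines up the ends only when letters is the longer sequence; when numbers is longer, the skip count goes negative, nothing is skipped, and the starts line up again. A small sketch that handles either case follows. ZipFromEnd is a made-up helper name, and it takes IReadOnlyList inputs so that the counts are available.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipFromEndSketch
{
    // Hypothetical helper: pairs up the LAST n elements of each list,
    // where n is the length of the shorter list.
    public static IEnumerable<TResult> ZipFromEnd<TFirst, TSecond, TResult>(
        IReadOnlyList<TFirst> first,
        IReadOnlyList<TSecond> second,
        Func<TFirst, TSecond, TResult> selector)
    {
        int n = Math.Min(first.Count, second.Count);
        return first.Skip(first.Count - n).Zip(second.Skip(second.Count - n), selector);
    }
}
```

ZipFromEndSketch.ZipFromEnd(letters, numbers, (l, n) => l + n) gives C1, D2, E3 as before, and swapping the argument order (with the lambda adjusted to match) still aligns the two lists at their ends.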
As others have stated, Zip lets you combine two collections for use in further LINQ statements or a foreach loop.
Operations that used to require a for loop and two arrays can now be done in a foreach loop using an anonymous object.
An example I just discovered, which is kind of silly but could be useful if parallelization were beneficial, is a single-statement queue traversal with side effects:
timeSegments
    .Zip(timeSegments.Skip(1), (Current, Next) => new {Current, Next})
    .Where(zip => zip.Current.EndTime > zip.Next.StartTime)
    .AsParallel()
    .ForAll(zip => zip.Current.EndTime = zip.Next.StartTime);
timeSegments represents the current or dequeued items in a queue (the last element is truncated by Zip). timeSegments.Skip(1) represents the next or peek items in a queue. The Zip method combines these two into a single anonymous object with a Next and Current property. Then we filter with Where and make changes with AsParallel().ForAll. Of course the last bit could just be a regular foreach or another Select statement that returns the offending time segments.
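For anyone who wants to run this, here is a self-contained sketch of the plain foreach variant. The TimeSegment class with int StartTime and EndTime properties is an assumption made up for the example, since the element type of timeSegments isn't given in the snippet above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for the elements of timeSegments.
public class TimeSegment
{
    public int StartTime { get; set; }
    public int EndTime { get; set; }
}

public static class OverlapSketch
{
    // Trims each segment so that it ends no later than the next segment starts.
    public static void TrimOverlaps(IReadOnlyList<TimeSegment> timeSegments)
    {
        var overlapping = timeSegments
            .Zip(timeSegments.Skip(1), (current, next) => new { Current = current, Next = next })
            .Where(pair => pair.Current.EndTime > pair.Next.StartTime);

        // Sequential version of the AsParallel().ForAll(...) step.
        foreach (var pair in overlapping)
        {
            pair.Current.EndTime = pair.Next.StartTime;
        }
    }
}
```

Given segments (0, 5), (3, 8) and (8, 10), only the first one overlaps its successor, so its EndTime becomes 3 and the other two are left alone. This relies on TimeSegment being a reference type; with a struct the assignment would only change a copy.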
The Zip method allows you to "merge" two unrelated sequences, using a merging function provided by you, the caller. The example on MSDN is actually pretty good at demonstrating what you can do with Zip. In this example, you take two arbitrary, unrelated sequences, and combine them using an arbitrary function (in this case, just concatenating items from both sequences into a single string).
int[] numbers = { 1, 2, 3, 4 };
string[] words = { "one", "two", "three" };
var numbersAndWords = numbers.Zip(words, (first, second) => first + " " + second);
foreach (var item in numbersAndWords)
    Console.WriteLine(item);
// This code produces the following output:
// 1 one
// 2 two
// 3 three
string[] fname = { "mark", "john", "joseph" };
string[] lname = { "castro", "cruz", "lopez" };
var fullName = fname.Zip(lname, (f, l) => f + " " + l);
foreach (var item in fullName)
{
    Console.WriteLine(item);
}
// Output:
// mark castro
// john cruz
// joseph lopez