Map Select for Conditional Query in Mathematica
Using the following list,
list = {{a, b, c, d}, {1, 2, 3, 4}, {5, 6, 7, 8}};
Is it possible to select the lists where the second value is > 3? Desired output below:
{5, 6, 7, 8}
Can this be done by adjusting the following code, which currently extracts all the values > 2 in each list?
Select[#, # > 2 &] & /@ list[[2 ;;]]
Sophisticated solutions for queries can be found here: Conditional Data Manipulation in Mathematica.
Alternatively, using Select:
Select[list, #[[2]] > 3 &]
Output
{{5, 6, 7, 8}}
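A small sketch of why the header row {a, b, c, d} drops out without special handling: Select keeps only the elements for which the test evaluates to exactly True, and a comparison such as b > 3 stays unevaluated when b is an undefined symbol. This assumes b has no value assigned.

```mathematica
(* Assumes b has no value assigned, so b > 3 does not evaluate to True *)
Select[{b, 2, 6}, # > 3 &]
(* {6} *)
```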
In this case Select is the simplest method, but Pick can also be useful in related problems.
list = {{a, b, c, d}, {1, 2, 3, 4}, {5, 6, 7, 8}};
Pick[list, #>3& /@ list[[All, 2]] ]
To explain, Pick takes two lists (or nested lists) of the same shape, and returns every element from the first for which the corresponding element of the second is True. (It can also accept a third argument to match for elements other than True.)
Here, the second column is extracted with list[[All, 2]] and then the test #>3& is mapped to each element. This is then used as the selection list.
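The steps described above can be followed one at a time. This is a minimal sketch on purely numeric data; the names data, col and mask are illustrative and not part of the original answer.

```mathematica
(* Illustrative numeric data; the symbolic header row is left out *)
data = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 1, 0, 2}};

(* Step 1: extract the second column *)
col = data[[All, 2]]
(* {2, 6, 1} *)

(* Step 2: map the test over the column to build the selection list *)
mask = # > 3 & /@ col
(* {False, True, False} *)

(* Step 3: keep the rows whose selection entry is True *)
Pick[data, mask]
(* {{5, 6, 7, 8}} *)

(* Third argument: keep rows whose selector entry matches 1 instead of True *)
Pick[data, Boole[mask], 1]
(* {{5, 6, 7, 8}} *)
```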
Responding to the comments by 500 requesting a generalization of the Select method:
selectByColumn[array_, index_, value_, range_] :=
Select[array, #[[index]] > value &][[All, range]]
This allows one to specify:
array: input array to extract from
index: column index to compare
value: the value to compare to
range: the Part specification to extract from each result row
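A few calls may make the role of each argument clearer. The definition is repeated so the block runs on its own; data is an illustrative array, not taken from the question.

```mathematica
selectByColumn[array_, index_, value_, range_] :=
  Select[array, #[[index]] > value &][[All, range]]

(* Illustrative numeric data *)
data = {{1, 2, 3, 4}, {5, 6, 7, 8}};

(* Rows whose 2nd element exceeds 3, all columns returned *)
selectByColumn[data, 2, 3, All]
(* {{5, 6, 7, 8}} *)

(* Same rows, but only columns 1 and 3 *)
selectByColumn[data, 2, 3, {1, 3}]
(* {{5, 7}} *)

(* Rows whose 1st element exceeds 0, columns 2 through 3 *)
selectByColumn[data, 1, 0, 2 ;; 3]
(* {{2, 3}, {6, 7}} *)
```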
Speed comparison of different solutions
Interestingly, Select always works faster with unpacked arrays, and for large arrays the difference is about one order of magnitude!
The code for timing tests:
solutions = Hold[
Select[list, #[[2]] > 3 &],
Cases[list, _List?(#[[2]] > 3 &)],
Cases[list, x_List /; x[[2]] > 3],
Cases[list, {_, x_, ___} /; x > 3],
Cases[list, {_, _?(# > 3 &), ___}],
Cases[list, {x___} /; List[x][[2]] > 3],
Pick[list, UnitStep[list[[All, 2]] - 3], 1],
Pick[list, # > 3 & /@ list[[All, 2]]]
];
testCases = Hold[
{"Packed Reals", RandomReal[{0, 5}, {dim, dim}]},
{"Unpacked Reals",
Developer`FromPackedArray@RandomReal[{0, 5}, {dim, dim}]},
{"Packed Integers", RandomInteger[{0, 5}, {dim, dim}]},
{"Unpacked Integers",
Developer`FromPackedArray@RandomInteger[{0, 5}, {dim, dim}]},
{"Rationals",
Rationalize[#, .001] & /@ RandomReal[{0, 5}, {dim, dim}]}
];
timing :=
Function[Null,
If[(time = First[Timing[Do[#;, {n}]]]) < .3,
Print["Too small timing for ", n, " iterations (dim=", dim,
") of ", HoldForm[#], ": ", time, " seconds!"]; time, time]/n,
HoldFirst];
generateTable :=
Labeled[TableForm[
Transpose@
Table[list = testCases[[i, 2]];
tmgs = List @@ (timing /@ solutions);
d = Last@MantissaExponent[Min[tmgs]] - 3;
Row[{Round[10^-d*#], ".\[Times]", Superscript[10, d]}] & /@
tmgs, {i, 1, Length[testCases]}],
TableHeadings -> {List @@ (HoldForm /@ solutions),
List @@ testCases[[All, 1]]}, TableAlignments -> Right],
Row[{"Average timings for ", dim, "\[Times]", dim, " list"}], Top]
Column[{dim = 5; n = 30000; generateTable, dim = 100; n = 3000;
generateTable, dim = 1000; n = 150; generateTable}, Left, 1,
Frame -> All, FrameStyle -> Gray]
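For a quick check of the packed versus unpacked distinction used in the test cases, without running the whole table, something like the following sketch may help. The names packed and unpacked are illustrative, and the timings depend on the machine and version, so none are shown.

```mathematica
(* RandomInteger returns a packed array; FromPackedArray unpacks it *)
packed = RandomInteger[{0, 5}, {1000, 1000}];
unpacked = Developer`FromPackedArray[packed];

Developer`PackedArrayQ /@ {packed, unpacked}
(* {True, False} *)

(* Time the same Select on both representations (seconds) *)
First@AbsoluteTiming[Select[packed, #[[2]] > 3 &];]
First@AbsoluteTiming[Select[unpacked, #[[2]] > 3 &];]
```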
An alternate Cases syntax. The first item is skipped, the second is tested to be > 3, and we don't care about the rest of the list:
In[45]:= Cases[list, {_, _?(# > 3 &), ___}]
Out[45]= {{5, 6, 7, 8}}
I doubt this will be faster than Select, but it is sometimes clearer, especially if the test involves different data types or matches some substructure.
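As an illustration of the point about data types, a pattern can constrain the type of one element while testing another. The records list below is made up for this sketch.

```mathematica
(* Made-up records: the first element is sometimes a string, sometimes a number *)
records = {{"x", 5, 1}, {7, 9, 2}, {"y", 1, 3}};

(* Keep rows whose 1st element is a string and whose 2nd element exceeds 3 *)
Cases[records, {_String, _?(# > 3 &), ___}]
(* {{"x", 5, 1}} *)
```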
EDIT:
As noted by Alexey, the following constructs are more idiomatic in Mathematica than my earlier solution
Cases[list, x_List /; x[[2]] > 2]
and
Cases[list, _List?(#[[2]] > 2 &)]
One solution would be to use Cases:
Cases[list, {_, x_, ___} /; x > 2]
Out[1] = {{5, 6, 7, 8}}
This is not helpful if you want to (say) check whether the 95th element is > 2. A better approach, where you can specify the position easily, is:
Cases[list, {x___} /; List[x][[2]] > 2]
Out[2] = {{5, 6, 7, 8}}
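The position can also be made a parameter. The helper below is a hypothetical sketch (casesByPosition is not from the original answers); it adds a length check so that rows shorter than the requested position are skipped rather than causing a Part error.

```mathematica
(* casesByPosition is a hypothetical helper name used for illustration *)
casesByPosition[rows_, pos_, value_] :=
  Cases[rows, x_List /; Length[x] >= pos && x[[pos]] > value]

(* The short row {9} is skipped by the length check *)
casesByPosition[{{a, b, c, d}, {1, 2, 3, 4}, {5, 6, 7, 8}, {9}}, 2, 3]
(* {{5, 6, 7, 8}} *)
```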