Arbitrarily Deep Nested Pattern Matching
Is there a way to create a Mathematica pattern 开发者_如何学Pythonwhich matches expressions whose heads may be arbitrarily deep, i.e. something like f[___][___][___]...
?
The suggested solution
There seems to be no built-in construct to pattern-test nested heads automatically. We can achieve the goal by writing a function which would, for any given (sub)expression of the form f[___]...[___]
, efficiently determine f
(which, with a slight abuse of terminology, we may call a symbolic head for the expression). Here is the code:
ClearAll[shead];
SetAttributes[shead, HoldAllComplete];
shead[expr_] := Scan[Return, Unevaluated[expr], {-1}, Heads -> True];
Here is how it can be used (I will use the same set of tests as @Sasha):
In[105]:= Cases[{f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]}, x_ /; shead[x] === f]
Out[105]= {f[1], f[1, 2, 3][1], f[1][2][3][4]}
The pattern syntax
If you prefer to use the syntax suggested by @Sasha, that version would look like
Clear[headPattern];
headPattern[head_] := _?(Function[Null, shead[#] === head, HoldFirst]);
In[108]:= Cases[{f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]}, headPattern[f]]
Out[108]= {f[1], f[1, 2, 3][1], f[1][2][3][4]}
Further explanations and comments
How it works
Here are some hints for the logic that lead to this solution, and how it works. The solution will be most concise and efficient if we manage to leverage some of the built-in expression-traversal functions. Some that come to mind are Map
, Scan
,Cases
,MapIndexed
,Position
. Given that we need the heads, we'd need to pass the Heads->True
option. I used Scan
, since this one is easy to stop at any point (unlike other mentioned constructs, for which you'd typically need to throw an exception to stop them "in the middle", which is rather inelegant and induces some overhead as well) as soon as we find what we want. Our result will be the very first thing Scan
finds on its depth-first expression traversal, so it is expected to be very efficient (it does not traverse the entire expression).
Avoiding the evaluation leaks
Another comment is on evaluation. You can see that HoldAllComplete
attribute is used in shead
, and Unevaluated
is used in its body. These are very important - they serve to prevent possible evaluation of expressions passed to the function. It may matter in cases like this:
In[110]:= m = n = 0;
g[x_] := n++;
h[x_] := m++;
{Cases[Hold[f[g[1]][h[2]]], x_ /; shead[x] === f :> Hold[x], Infinity], {m, n}}
Out[113]= {{Hold[f[g[1]][h[2]]]}, {0, 0}}
Here, we see what we'd expect - even though Cases
has been traversing the entire expression and feeding its (sub)parts to shead
, no evaluation of sub-parts was triggered by shead
. Now we define a naive version of shead
which "leaks evaluation":
sheadEval[expr_] := Scan[Return, expr, {-1}, Heads -> True]
And now,
In[114]:= {Cases[Hold[f[g[1]][h[2]]], x_ /; sheadEval[x] === f :> Hold[x], Infinity], {m, n}}
Out[114]= {{Hold[f[g[1]][h[2]]]}, {2, 1}}
The latter behavior is unsatisfactory generally. The whole code-is-data paradigm, so useful in meta - programming, is very powerful in Mathematica because you can use rules to destructure code. Possible (unwanted) evaluation during the pattern- matching would greatly impair it. The whole problem is in the sub-parts. Wrapping Hold
only prevents the whole expression from evaluation. Functions like Cases
and other similar functions for code destructuring are so great because they don't evaluate sub-parts when doing the structural (syntactic) matching.
Comment on symbolic heads
The last comment here (mostly about definitions) is that the shead
function returns not exactly what is normally called symbolic head in Mathematica. The difference is for atomic expressions. For example, shead[f]
returns f
, while for atomic expressions, the true symbolic head should coincide with the head of an expression (Symbol
in this case). I have developed the symbolicHead
function with this behavior here, and that one can also be successfully used in place of shead
in the above, although shead
is more efficient.
A recursive matching strategy could be used here:
curried[head_] := _head | (x_[___] /; MatchQ[Hold[x], _[curried[head]]])
Usage:
In[26]:= $testCases = {f, f[1], g[f[1]], f[1,2,3][1], f[1][2][3][4]};
Cases[$testCases, curried[f]]
Out[27]= {f[1],f[1,2,3][1],f[1][2][3][4]}
Update
At Leonid's suggestion, Unevaluated
can be used as a clearer and faster way to avoid evaluation leaks in the pattern condition:
curried[head_] := _head | (x_[___] /; MatchQ[Unevaluated[x], curried[head]])
How about the following:
In[277]:=
ArbitrarilyDeepHeadPattern[
head_Symbol] := _?(Function[
MemberQ[
Position[#, _head, {0, Infinity}, Heads -> True], {0 ...}]])
In[281]:= Cases[{f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]},
ArbitrarilyDeepHeadPattern[f]]
Out[281]= {f[1], f[1, 2, 3][1], f[1][2][3][4]}
WReach's answer made me reexamine a recursive definition, which I tried yesterday but gave up on.
I realize now that what I had actually works, it just throws an error. It is a toy compared to Leonid's fine method, but I have a fondness for terse code, so I post it here for interest or amusement. Make sure you do not have $RecursionLimit
set to Infinity before running this.
Cases[
{f, f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]},
f // Blank@#|#0[#]@_&
]
Or even:
Cases[
{f, f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]},
p=_f|p@_
]
Here is an alternative version of @Leonid's shead
, to find the symbolic head of an expression. (You should use the rest of his solution as is.) My function doesn't involve any recursion, but instead it uses that Level has a special case where setting the levelspec to {-1}
returns all atomic expressions, the first of which is the head itself:
shead2[expr_] := First@Level[expr, {-1}, Heads -> True];
It works in pattern matching just the same as shead
does:
In[264]:= Cases[{f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]},
x_ /; shead2[x] === f]
Out[264]= {f[1], f[1, 2, 3][1], f[1][2][3][4]}
And to help understand how it works, here is how Level
behaves with levelspec set to {-1}
:
In[263]:=
Level[#, {-1}, Heads -> True] & /@ {f[1], g[f[1]], f[1, 2, 3][1],
f[1][2][3][4]}
Out[263]= {{f, 1}, {g, f, 1}, {f, 1, 2, 3, 1}, {f, 1, 2, 3, 4}}
精彩评论