Creating a valid function declaration from a complex tuple/list structure

Is there a generic way, given a complex object in Erlang, to come up with a valid function declaration for it besides eyeballing it? I'm maintaining some code previously written by someone who was a big fan of giant structures, and doing it manually is proving error-prone. I don't need to iterate over the whole thing, just grab the top level.

For example, I'm working on this right now -

[[["SIP",47,"2",46,"0"],32,"407",32,"Proxy Authentication Required","\r\n"],
 [{'Via',
        [{'via-parm',
              {'sent-protocol',"SIP","2.0","UDP"},
              {'sent-by',"172.20.10.5","5060"},
              [{'via-branch',"z9hG4bKb561e4f03a40c4439ba375b2ac3c9f91.0"}]}]},
  {'Via',
        [{'via-parm',
              {'sent-protocol',"SIP","2.0","UDP"},
              {'sent-by',"172.20.10.15","5060"},
              [{'via-branch',"12dee0b2f48309f40b7857b9c73be9ac"}]}]},
  {'From',
        {'from-spec',
             {'name-addr',
                  [[]],
                  {'SIP-URI',
                        [{userinfo,{user,"003018CFE4EF"},[]}],
                        {hostport,"172.20.10.11",[]},
                        {'uri-parameters',[]},
                        []}},
             [{tag,"b7226ffa86c46af7bf6e32969ad16940"}]}},
  {'To',
        {'name-addr',
             [[]],
             {'SIP-URI',
                  [{userinfo,{user,"3966"},[]}],
                  {hostport,"172.20.10.11",[]},
                  {'uri-parameters',[]},
                  []}},
        [{tag,"a830c764"}]},
  {'Call-ID',"90df0e4968c9a4545a009b1adf268605@172.20.10.15"},
  {'CSeq',1358286,"SUBSCRIBE"},
  ["date",'HCOLON',
    ["Mon",44,32,["13",32,"Jun",32,"2011"],32,["17",58,"03",58,"55"],32,"GMT"]],
  {'Contact',
        [[{'name-addr',
                [[]],
                {'SIP-URI',
                     [{userinfo,{user,"3ComCallProcessor"},[]}],
                     {hostport,"172.20.10.11",[]},
                     {'uri-parameters',[]},
                     []}},
          []],
         []]},
  ["expires",'HCOLON',3600],
  ["user-agent",'HCOLON',
    ["3Com",[]],
    [['LWS',["VCX",[]]],
     ['LWS',["7210",[]]],
     ['LWS',["IP",[]]],
     ['LWS',["CallProcessor",[['SLASH',"v10.0.8"]]]]]],
  ["proxy-authenticate",'HCOLON',
    ["Digest",'LWS',
     ["realm",'EQUAL',['SWS',34,"3Com",34]],
     [['COMMA',["domain",'EQUAL',['SWS',34,"3Com",34]]],
      ['COMMA',
        ["nonce",'EQUAL',
         ['SWS',34,"btbvbsbzbBbAbwbybvbxbCbtbzbubqbubsbqbtbsbqbtbxbCbxbsbybs",
          34]]],
      ['COMMA',["stale",'EQUAL',"FALSE"]],
      ['COMMA',["algorithm",'EQUAL',"MD5"]]]]],
  {'Content-Length',0}],
 "\r\n",
 ["\n"]]


Maybe kvc would help: https://github.com/etrepum/kvc


I noticed your clarifying comment. I'd prefer to add a comment myself, but don't have enough karma. Anyway, the trick I use for that is to experiment in the shell. I iterate a pattern against a sample data structure until I've found the simplest form. You can use the _ match-all variable. I use an Erlang shell inside an Emacs shell window.

First, bind a sample to a variable:

A = [{a,b},[{c,d}, {e,f}]].

Now match the original structure against the variable:

[{a,b},[{c,d},{e,f}]] = A.

If you hit enter, you'll see they match. Hit alt-p (M-p in Emacs terms, where Alt is the Meta key) to bring back the previous line. Replace some tuple or list item with an underscore:

[_,[{c,d},{e,f}]] = A.

Hit enter to make sure you did it right and they still match. This example is trivial, but for deeply nested, multiline structures it's trickier, so it's handy to be able to just quickly match to test. Sometimes you'll want to try to guess at whole huge swaths, like using an underscore to match a tuple list inside a tuple that's the third element of a list. If you place it right, you can match the whole thing at once, but it's easy to misread it.

Anyway, repeat to explore the essential shape of the structure and place real variables where you want to pull out values:

[_, [_, _]] = A.

[_, _] = A.

[_, MyTupleList] = A. %% let's grab this tuple list

[{MyAtom,b}, [{c,d}, MyTuple]] = A. %% or maybe we want this atom and tuple

That's how I efficiently dissect and pattern match complex data structures.
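
Once the shell has told you the shape, the pattern can go straight into a function head. Here is a minimal sketch for the SIP sample in the question, assuming its top level is always a four-element list (status line, header list, then two trailing line-break elements). The module and function names are made up for illustration:

```erlang
-module(sip_shape).
-export([split/1]).

%% Top-level shape found by matching in the shell:
%% [StatusLine, Headers, CRLF, Body]. Only the first two are bound;
%% the underscore-prefixed names just document the parts we ignore.
split([StatusLine, Headers, _CRLF, _Body]) when is_list(Headers) ->
    {ok, StatusLine, Headers};
split(_Other) ->
    {error, unexpected_shape}.
```

The catch-all clause is a design choice: it makes a structure with a different shape show up as an explicit error instead of a function_clause crash.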

However, I don't know what you're doing. I'd be inclined to have a wrapper function that uses KVC to pull out exactly what you need and then distributes to helper functions from there for each type of structure.


If I understand you correctly, you want to pattern match some large data structures of unknown format.

Example:

Input: {a, b} {a,b,c,d} {a,[],{},{b,c}}

function({A, B}) -> do_something;
function({A, B, C, D}) when is_atom(B) -> do_something_else;
function({A, B, C, D}) when is_list(B) -> more_doing.

The generic answer is, of course, that you can't decide from the data alone how that data should be categorized.

First you should probably be aware of iolists. They are created by functions such as io_lib:format/2 and in many other places in the code.

One example is that

 [["SIP",47,"2",46,"0"],32,"407",32,"Proxy Authentication Required","\r\n"]

will print as

 SIP/2.0 407 Proxy Authentication Required
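
To check this without printing, the nested list can be collapsed with the standard lists:flatten/1 or iolist_to_binary/1. A small sketch (the module name is only for illustration):

```erlang
-module(iolist_demo).
-export([status_line/0]).

%% The status line from the question, as a nested list of strings and
%% character codes (47 is "/", 46 is ".", 32 is a space).
status_line() ->
    IoList = [["SIP",47,"2",46,"0"],32,"407",32,
              "Proxy Authentication Required","\r\n"],
    %% lists:flatten/1 gives a plain string,
    %% iolist_to_binary/1 gives the same bytes as a binary.
    {lists:flatten(IoList), iolist_to_binary(IoList)}.
```

Both results still end with the "\r\n" that is part of the original list.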

So, I'd start with flattening all those lists, using a function such as

 flatten_io(List) when is_list(List) ->
     Flat = lists:map(fun flatten_io/1, List),
     maybe_flatten(Flat);
 flatten_io(Tuple) when is_tuple(Tuple) ->
     list_to_tuple([flatten_io(Element) || Element <- tuple_to_list(Tuple)]);
 flatten_io(Other) -> Other.

 %% Flatten L only if every element is a byte or a list of bytes.
 maybe_flatten(L) when is_list(L) ->
     case lists:all(fun(Ch) when is_integer(Ch), Ch > 0, Ch < 256 -> true;
                       (List) when is_list(List) ->
                            lists:all(fun(X) -> is_integer(X) andalso X > 0 andalso X < 256 end, List);
                       (_) -> false
                    end, L) of
         true -> lists:flatten(L);
         false -> L
     end.

(Caveat: completely untested and quite inefficient. Will also crash for improper lists, but you shouldn't have those in your data structures anyway.)

On second thought, I can't help you. Any data structure that uses the atom 'COMMA' for a comma in a string should be taken out and shot.

You should be able to flatten those things as well and start to get a view of what you are looking at.

I know that this is not a complete answer. Hope it helps.


It's hard to recommend something for handling this.

Transforming all the structures into a saner and more minimal format looks like it's worth it. This depends mainly on the similarities between these structures.

Rather than having a special function for each of the 100, there must be some automatic reformatting that can be done, maybe even putting the parts in records.

Once you have records it's much easier to write functions for them, since you don't need to know the actual number of elements in the record. More important: your code won't break when the number of elements changes.

To summarize: put a barrier between your code and the insanity of these structures by sanitizing them with the most generic code possible. It will probably be a mix of generic reformatting and structure-specific stuff.

As an example already visible in this struct: the 'name-addr' tuples look like they have a uniform structure. So you can recurse over your structures (over all elements of tuples and lists) and match for "things" that have a common structure like 'name-addr' and replace these with nice records.
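
A rough sketch of that recursion, assuming every 'name-addr' tuple has the shape seen in the sample. The record name and its fields are hypothetical (chosen here for illustration, not taken from the original code), as are the module and function names:

```erlang
-module(sip_records).
-export([sanitize/1]).

%% Hypothetical record; the field names are assumptions.
-record(name_addr, {display, user, host}).

%% Replace each 'name-addr' tuple of the shape seen in the sample
%% with a #name_addr{} record and recurse through everything else.
sanitize({'name-addr', Display,
          {'SIP-URI', [{userinfo, {user, User}, _}],
                      {hostport, Host, _}, _Params, _Headers}}) ->
    #name_addr{display = Display, user = User, host = Host};
sanitize(Tuple) when is_tuple(Tuple) ->
    list_to_tuple([sanitize(E) || E <- tuple_to_list(Tuple)]);
sanitize(List) when is_list(List) ->
    %% Strings are lists too; their characters pass through unchanged.
    [sanitize(E) || E <- List];
sanitize(Other) ->
    Other.
```

A 'name-addr' tuple that does not fit the first clause is left as a plain tuple, so unexpected variants stay visible instead of being silently dropped.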

To help with the eyeballing, you can write yourself helper functions along the lines of this example:

eyeball(List) when is_list(List) ->
    io:format("List with length ~b\n", [length(List)]);
eyeball(Tuple) when is_tuple(Tuple) ->
    io:format("Tuple with ~b elements\n", [tuple_size(Tuple)]).

So you would get output like this:

2> eyeball({a,b,c}).
Tuple with 3 elements
ok
3> eyeball([a,b,c]).
List with length 3
ok

Expanding this into a useful tool for your case is left as an exercise. You could handle multiple levels by recursing over the elements and indenting the output.
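
One possible way to do that recursion, as a sketch rather than a finished tool. It assumes proper lists (length/1 crashes otherwise) and treats any list of printable characters as a string; the module and function names are invented for this example:

```erlang
-module(eyeball2).
-export([outline/1, print/1]).

%% Print an indented outline of a nested term.
print(Term) ->
    io:put_chars(outline(Term)).

%% Build the outline as a deep character list, one line per node.
outline(Term) ->
    outline(Term, 0).

outline(Term, Depth) when is_tuple(Term) ->
    [line(Depth, "tuple/~b", [tuple_size(Term)]),
     [outline(E, Depth + 1) || E <- tuple_to_list(Term)]];
outline([], Depth) ->
    line(Depth, "[]", []);
outline(Term, Depth) when is_list(Term) ->
    case io_lib:printable_list(Term) of
        true  -> line(Depth, "\"~ts\"", [Term]);
        false -> [line(Depth, "list/~b", [length(Term)]),
                  [outline(E, Depth + 1) || E <- Term]]
    end;
outline(Term, Depth) ->
    line(Depth, "~p", [Term]).

%% Two spaces of indentation per nesting level.
line(Depth, Format, Args) ->
    [lists:duplicate(2 * Depth, $\s), io_lib:format(Format, Args), $\n].
```

Calling eyeball2:print/1 on one header from the sample shows each tuple and list with its size, and its elements one level further in.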


Use pattern matching and functions that work on lists to extract only what you need.

Look at http://www.erlang.org/doc/man/lists.html: keyfind, keyreplace, L = [H|T], ...
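
For instance, a sketch of pulling one header out of the header list with lists:keyfind/3. The header list in the sample mixes tagged tuples with plain lists; the type spec asks for a list of tuples, but as far as I know keyfind simply passes over non-tuple elements, so treat that as an assumption to verify on your OTP version. The module and function names are illustrative:

```erlang
-module(sip_headers).
-export([call_id/1]).

%% Headers is the second element of the parsed message.
%% Look up the {'Call-ID', Value} tuple by its first element.
call_id(Headers) ->
    case lists:keyfind('Call-ID', 1, Headers) of
        {'Call-ID', Id} -> {ok, Id};
        false           -> not_found
    end.
```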
