How to decode BERT when it arrives as a string binary
I have BERT passed in to Erlang via a query string. I'm reading it via gen_tcp with the http_bin option, so it arrives like this: <<"131,104,1,100,0,2,104,105">>. That is only almost right, because I want to decode it with binary_to_term/2, which wants a binary of raw bytes, not a binary holding the bytes written out as text (it wants <<131,104,1,100,0,2,104,105>>, not <<"131,104,1,100,0,2,104,105">>).
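To make the distinction concrete, here is a minimal sketch contrasting the two forms. The module name bert_forms and the function demo/0 are made up for illustration; the byte values are the ones from the question.

```erlang
-module(bert_forms).
-export([demo/0]).

%% Contrast the two forms from the question: raw bytes versus the same
%% bytes written out as decimal text.
demo() ->
    Bytes = <<131,104,1,100,0,2,104,105>>,     % what binary_to_term/2 expects
    Text  = <<"131,104,1,100,0,2,104,105">>,   % what arrives in the query string
    8  = byte_size(Bytes),
    25 = byte_size(Text),                      % one byte per character, commas included
    %% 131 = version tag, 104 = small tuple of arity 1,
    %% 100,0,2 = atom of length 2, 104,105 = "hi".
    {hi} = binary_to_term(Bytes, [safe]).
```

Calling bert_forms:demo() returns {hi}; passing Text to binary_to_term/2 instead raises badarg, because its first byte is the character $1 (49) rather than the version tag 131.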
I can parse it into the right form like this:
parse(Source) ->
    Bins = binary:split(Source, <<",">>, [global]),
    parse(Bins, []).

parse([H | T], Acc) ->
    parse(T, [list_to_integer(binary_to_list(H)) | Acc]);
parse([], Acc) ->
    list_to_binary(lists:reverse(Acc)).
But this seems convoluted and is slower than I hoped (about 5k conversions per second, with each input being 200 bytes). I also came up with something based on io_lib:fread/2, but it wasn't much better and still looks awkward.
Is there a BIF or NIF somewhere that might do this?
If not, is there a better way to do the above to speed it up?
For what it's worth, an alternative solution, presumably slower but possibly less ad hoc depending on taste, is to treat it as a problem of parsing a subset of Erlang, for which tools exist:
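parse(Source) when is_binary(Source) ->
    %% Wrap the comma-separated bytes in << >> so the text scans as a
    %% binary literal; erl_scan:string/1 needs a string, not a binary.
    Text = "<<" ++ binary_to_list(Source) ++ ">> .",
    case erl_scan:string(Text) of
        {ok, Tokens, _} ->
            case erl_parse:parse_term(Tokens) of
                {ok, Bin} when is_binary(Bin) -> % Only accept binary literals.
                    Bin;
                _ -> error(badarg)
            end;
        _ -> error(badarg)
    end.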
Possibly overkill in this context, but no more code than the original solution.
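For comparison, the split-and-convert approach from the question can be written more compactly with binary_to_integer/1 (available since OTP R16B) and a binary comprehension. This is only a sketch: the module name bert_text is made up, and it has not been benchmarked against the other versions here.

```erlang
-module(bert_text).
-export([parse/1]).

%% Hypothetical variant, not taken from the answers: convert each
%% comma-separated decimal field straight to one byte.
parse(Source) when is_binary(Source) ->
    Fields = binary:split(Source, <<",">>, [global]),
    %% binary_to_integer/1 raises badarg on a non-numeric field;
    %% values above 255 are silently truncated to their low 8 bits.
    << <<(binary_to_integer(F))>> || F <- Fields >>.
```

It avoids the intermediate list of integers and the lists:reverse/1 call, but it still creates one sub-binary per field via binary:split/3.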
Using this code you can parse up to 75 MB/s when compiled to native code (HiPE) and up to 17 MB/s as bytecode:
-module(str_to_bin).
-export([str_to_bin/1]).

str_to_bin(Bin) when is_binary(Bin) ->
    str_to_bin(Bin, <<>>).

%% ?D(X): guard tests that X is an ASCII digit.
-define(D(X), X >= $0, X =< $9 ).
%% ?C(X): numeric value of the ASCII digit X (its low four bits).
-define(C(X), (X band 2#1111)).

%% Match a three-digit field first, then two digits, then one.
str_to_bin(<<X,Y,Z,Rest/binary>>, Acc)
  when ?D(X), ?D(Y), ?D(Z) ->
    str_to_bin_(Rest, <<Acc/binary, (?C(X)*100 + ?C(Y)*10 + ?C(Z))>>);
str_to_bin(<<Y,Z,Rest/binary>>, Acc)
  when ?D(Y), ?D(Z) ->
    str_to_bin_(Rest, <<Acc/binary, (?C(Y)*10 + ?C(Z))>>);
str_to_bin(<<Z,Rest/binary>>, Acc)
  when ?D(Z) ->
    str_to_bin_(Rest, <<Acc/binary, ?C(Z)>>).

-compile({inline, [str_to_bin_/2]}).

%% After a field: either the input ends or a comma introduces the next field.
str_to_bin_(<<>>, Acc) -> Acc;
str_to_bin_(<<$,, Rest/binary>>, Acc) -> str_to_bin(Rest, Acc).
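Whichever converter is used, the last step is the decode the question was after. A minimal sketch, assuming a hypothetical wrapper module bert_decode whose decode/2 takes the converter as a fun so that any of the parsers above can be plugged in:

```erlang
-module(bert_decode).
-export([decode/2]).

%% Hypothetical wrapper: Convert turns the decimal text into raw bytes,
%% then the bytes are decoded as an external term.
decode(Convert, Text) when is_function(Convert, 1), is_binary(Text) ->
    Bytes = Convert(Text),
    %% [safe] refuses to create new atoms, which matters because the
    %% data comes from a query string.
    binary_to_term(Bytes, [safe]).
```

For example, bert_decode:decode(fun str_to_bin:str_to_bin/1, <<"131,104,1,100,0,2,104,105">>) should yield {hi}, provided the atom hi already exists in the running system; with [safe], an unknown atom makes binary_to_term/2 raise badarg instead.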