PHP preg_match_all expression

I have virtually no experience with regex, but I'm trying my best.

I have a string like this:

$fString = "Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]";

I want to get an array looking like this:

Array[0] = "Name=Sök"
Array[1] = "Value=2"
Array[2] = "Title=Combine me"
Array[3] = "Options=[Item1=1,Item2=2,Item3=3]"

What I have managed to do so far is:

preg_match_all("/[^,]*[\w\d]*=[^,]*/",$fString,$Data);

But I can't figure out how to fix the last item, "Options":

Array ( [0] => Array ( [0] => Name=Sök [1] => Value=2 [2] => Title=Combine me [3] => Options=[Item1=1 [4] => Item2=2 [5] => Item3=3] ) )

...and why is the result an array inside an array?!?


[EDIT]

I guess I need to explain the whole idea of what I'm trying to do here, I'm not sure I'm on the right track any more.

I have created some classes where I store all the "persistent" variables in an array. I have a function that serializes this array so it can be stored in a database.

I know all about the serialize() function, but I'm doing some filtering so I can't use it as it is, and I also prefer to have the output more readable for manual editing. The array can contain nested arrays, which need to be preserved. When I read it all back from the database, the original array must be recreated.

I had it all working with eval(), but ran into trouble with nested arrays because the " or ' characters were breaking the main outer string. So this approach was an attempt to serialize everything without nested strings that needed to be preserved.

So if I can solve the nested data with preg_match_all I'm there, otherwise I need to come up with another solution.

I guess the data needs to be escaped as well, such as the , and [ ]


Here is a function that will do basically what you need:

function explode_me($str) {
    $a = array();
    $v = "";
    $ignore = false;
    for ($i = 0; $i < strlen($str); $i++) {
        if ($str[$i] == ',' && !$ignore) {
            $a[] = $v;
            $v = "";
        }
        else if ($str[$i] == '[' && !$ignore) {
            $ignore = true;
            $v .= $str[$i];
        }
        else if ($str[$i] == ']' && $ignore) {
            $ignore = false;
            $v .= $str[$i];
        }
        else {
            $v .= $str[$i];
        }
    }
    $a[] = $v;
    return $a;
}

To test it:

$str = "Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]";
$a = explode_me($str);

print_r($a);

which prints:

Array
(
    [0] => Name=Sök
    [1] => Value=2
    [2] => Title=Combine me
    [3] => Options=[Item1=1,Item2=2,Item3=3]
)
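
The function above tracks a single on/off flag, so it handles one level of brackets. If the nested arrays can themselves contain arrays, a depth counter is one way to extend the idea. The following is only a sketch: the name explode_nested() is made up for illustration, and it assumes brackets are balanced and never appear as literal data.

```php
<?php
// Hypothetical variant of explode_me(): splits on top-level commas only,
// counting bracket depth so that nested [...] groups stay intact.
function explode_nested($str) {
    $parts = array();
    $current = "";
    $depth = 0;                      // number of unclosed "[" seen so far
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $c = $str[$i];
        if ($c === ',' && $depth === 0) {
            $parts[] = $current;     // a top-level comma ends the current item
            $current = "";
            continue;
        }
        if ($c === '[') {
            $depth++;
        } elseif ($c === ']' && $depth > 0) {
            $depth--;
        }
        $current .= $c;
    }
    $parts[] = $current;             // last item has no trailing comma
    return $parts;
}

print_r(explode_nested("A=1,B=[X=1,Y=[P=1,Q=2]],C=3"));
```

Because only ASCII delimiters are compared, iterating byte by byte is safe for UTF-8 values such as "Sök".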


(\w+)=(\[[^\]]+\]|[^,]+)

This breaks down as:

(\w+)        # a word (store in match group 1)
=            # the "=" character
(            # begin match group 2
  \[         #   a "[" character
  [^\]]+     #   anything but "]" character
  \]         #   a "]" character
  |          #   or...
  [^,]+      #   anything but a comma
)            # end match group 2

Apply with preg_match_all():

$fString = "Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]";

$matches = array();
preg_match_all("/(\\w+)=(\\[[^\\]]+\\]|[^,]+)/", $fString, $matches);

Which results in something even more detailed than you wanted to have:

Array
(
    [0] => Array
        (
            [0] => Name=Sök
            [1] => Value=2
            [2] => Title=Combine me
            [3] => Options=[Item1=1,Item2=2,Item3=3]
        )

    [1] => Array
        (
            [0] => Name
            [1] => Value
            [2] => Title
            [3] => Options
        )

    [2] => Array
        (
            [0] => Sök
            [1] => 2
            [2] => Combine me
            [3] => [Item1=1,Item2=2,Item3=3]
        )

)

$matches[0] is what you wanted. $matches[1] and $matches[2] hold the property names and values separately, so you can use them right away instead of taking an extra step to split things like "Options=[Item1=1,Item2=2,Item3=3]" at the correct =.
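
As for why the result is an array inside an array: by default preg_match_all() groups its results by capture group (the PREG_PATTERN_ORDER flag), with index 0 holding the full matches. Passing PREG_SET_ORDER groups them by match instead. A small sketch comparing the two, reusing the pattern from above (the variable names $pattern and $sets are chosen here for illustration):

```php
<?php
$fString = "Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]";
$pattern = '/(\w+)=(\[[^\]]+\]|[^,]+)/';

// Default (PREG_PATTERN_ORDER): $matches[0] = all full matches,
// $matches[1] = all keys, $matches[2] = all values.
preg_match_all($pattern, $fString, $matches);

// PREG_SET_ORDER: one sub-array per match.
preg_match_all($pattern, $fString, $sets, PREG_SET_ORDER);

foreach ($sets as $set) {
    // $set[0] = full match, $set[1] = key, $set[2] = value
    echo $set[1], " => ", $set[2], "\n";
}
```

Which layout is more convenient depends on whether you want to loop over pairs or grab a whole column at once.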


If you could change the separators between the items (where it says Item1=1,Item2=2,Item3=3 to something like Item1=1|Item2=2|Item3=3) you could easily use explode(',',$fString) to convert a string to an array.

I can also offer this piece of code that will change the separators, as I have no experience with regex:

$newstr = str_replace(',Item','|Item',$fString);
$newarray = explode(',',$newstr);

$newarray will look like this:

Array[0] = "Name=Sök"
Array[1] = "Value=2"
Array[2] = "Title=Combine me"
Array[3] = "Options=[Item1=1|Item2=2|Item3=3]"
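
To get at the individual options afterwards, the last entry can be split once more. This is a rough sketch under the same assumption as the str_replace() call, namely that every nested key starts with "Item"; the variable names $key, $value and $items are illustrative.

```php
<?php
$fString = "Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]";
$newarray = explode(',', str_replace(',Item', '|Item', $fString));

// Split the last entry at its first "=" only (limit 2), so the "=" signs
// inside the bracketed list are left alone.
list($key, $value) = explode('=', $newarray[3], 2);

// Drop the surrounding brackets and split the list on the new separator.
$items = explode('|', trim($value, '[]'));

print_r($items);
```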


This is a problem that lends itself more to parsing than to regex extraction. But you can treat the bracketed value as a special case to make it work:

preg_match_all("/(\w+)=( \w[^,]* | \[[^\]]+\] )/x", $str, $m);
$things = array_combine($m[1], $m[2]);

This gives you an array like the following (the unparsed strings are still available in $m[0]):

[Name] => Sök
[Value] => 2
[Title] => Combine me
[Options] => [Item1=1,Item2=2,Item3=3]

You can reapply the function on Options to explode that too.

The trick, again, is to differentiate between plain values, which start with a word character (\w), and the bracketed options (\[...\]). For the latter, [^\]]+ simply matches everything that is not a closing bracket.
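
Reapplying the same extraction to Options could look roughly like this. The wrapper name parse_pairs() is an assumption made for this sketch, and the value part is written as \w[^,]* so that single-character values such as 2 or 1 also match.

```php
<?php
// Hypothetical helper wrapping the pattern from this answer.
function parse_pairs($str) {
    // \w[^,]* : a plain value (one or more characters, no comma)
    // \[...\] : a bracketed list, kept as one value
    preg_match_all('/(\w+)=( \w[^,]* | \[[^\]]+\] )/x', $str, $m);
    return array_combine($m[1], $m[2]);
}

$str = "Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]";
$things = parse_pairs($str);

// Strip the surrounding brackets, then run the same function on the inner list.
$things['Options'] = parse_pairs(substr($things['Options'], 1, -1));

print_r($things);
```

This only goes one level deep, since [^\]]+ stops at the first closing bracket.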


So, here is another approach. It's a mini parser for nested structures. Adapt the regex if you need escape codes.

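function parse(&$s) {
    $r = array();   // result for the current nesting level
    $key = null;    // key that is still waiting for its value
    while (strlen($s) && preg_match("/^(.*?)([=,\[\]])/", $s, $m)) {
        // consume the token plus the delimiter that follows it
        $s = substr($s, 1 + strlen($m[1]));
        switch ($m[2]) {
            case "=":
               $key = $m[1];
               break;
            case ",":
               // skip if a nested array was already stored under this key
               if ($key !== null && !isset($r[$key])) {
                  $r[$key] = $m[1];
               }
               $key = null;
               break;
            case "[":
               $r[$key] = parse($s);
               break;
            case "]":
               // store the last value of the nested list before returning
               if ($key !== null && !isset($r[$key])) {
                  $r[$key] = $m[1];
               }
               return $r;
        }
    }
    if (strlen($s) && $key !== null) { $r[$key] = $s; } // remainder
    return $r;
}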
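
Since the original array has to be written out before it can be read back, here is a sketch of the opposite direction: turning a nested array into the same Key=Value,Key=[...] format. The function name to_pairs() is hypothetical, and it assumes that keys and values contain none of the characters = , [ ] because no escaping is attempted.

```php
<?php
// Hypothetical serializer for the "Key=Value,Key=[...]" format.
// Assumption: no key or value contains "=", ",", "[" or "]".
function to_pairs(array $data) {
    $parts = array();
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            // nested arrays are wrapped in brackets and serialized recursively
            $parts[] = $key . "=[" . to_pairs($value) . "]";
        } else {
            $parts[] = $key . "=" . $value;
        }
    }
    return implode(",", $parts);
}

$line = to_pairs(array(
    "Name" => "Sök",
    "Value" => 2,
    "Options" => array("Item1" => 1, "Item2" => 2),
));
echo $line, "\n";
```

Note that everything comes back as a string when parsed, so type information such as integer versus string is lost in the round trip.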