
Method (e.g. via bash script) to turn php array indexes currently using constants into array indexes using single quoted strings?

I have a huge pile of php scripts with lots of constants being used in place of proper single-quoted array strings.

For example:

$row_rsCatalogsItems[Name]

(bad)

instead of

$row_rsCatalogsItems['Name']

(good)

How would I create a script (bash, php, whatever is most usable) that I can run on scripts to convert them to the more sensible method?

Ideally it wouldn't just match the [something], but also the $variable_name[someIndex].

I'm actually wondering whether it's even viable considering the potential to screw up the interiors of strings or html... (maybe if I just use single quotes, it won't matter because they're interpolated anyway...)


This sounds like a job for the Tokenizer!

You can fetch all of the parsed tokens from a PHP source file using token_get_all. You can then go through the resulting array, evaluating each token one at a time. The token name comes back as a number you can look up using token_name.

A small demo at the PHP interactive prompt:

php > $str = '<?php echo $face[fire]; echo $face[\'fire\']; ?>';
php > $t = token_get_all($str);
php > foreach($t as $i => $j) { if(is_array($j)) $t[$i][0] = token_name($j[0]); }

And here's the output in a different code block, as it's a bit tall and it'll be good to reference the source string while scrolling through it.

php > print_r($t);
Array
(
    [0] => Array
        (
            [0] => T_OPEN_TAG
            [1] => <?php
            [2] => 1
        )

    [1] => Array
        (
            [0] => T_ECHO
            [1] => echo
            [2] => 1
        )

    [2] => Array
        (
            [0] => T_WHITESPACE
            [1] =>
            [2] => 1
        )

    [3] => Array
        (
            [0] => T_VARIABLE
            [1] => $face
            [2] => 1
        )

    [4] => [
    [5] => Array
        (
            [0] => T_STRING
            [1] => fire
            [2] => 1
        )

    [6] => ]
    [7] => ;
    [8] => Array
        (
            [0] => T_WHITESPACE
            [1] =>
            [2] => 1
        )

    [9] => Array
        (
            [0] => T_ECHO
            [1] => echo
            [2] => 1
        )

    [10] => Array
        (
            [0] => T_WHITESPACE
            [1] =>
            [2] => 1
        )

    [11] => Array
        (
            [0] => T_VARIABLE
            [1] => $face
            [2] => 1
        )

    [12] => [
    [13] => Array
        (
            [0] => T_CONSTANT_ENCAPSED_STRING
            [1] => 'fire'
            [2] => 1
        )

    [14] => ]
    [15] => ;
    [16] => Array
        (
            [0] => T_WHITESPACE
            [1] =>
            [2] => 1
        )

    [17] => Array
        (
            [0] => T_CLOSE_TAG
            [1] => ?>
            [2] => 1
        )

)

As you can see, our evil array indexes are a T_VARIABLE followed by an open bracket, then a T_STRING that is not quoted. Single-quoted indexes come through as T_CONSTANT_ENCAPSED_STRING, quotes and all.

With this knowledge in hand, you can go through the list of tokens and actually rewrite the source to eliminate all of the unquoted array indexes -- most of them should be pretty obvious. You can simply add single quotes around the string when you write the file back out.

Just keep in mind that you'll want to not quote any numeric indexes, as that will surely have undesirable side-effects. They come through as T_LNUMBER tokens rather than T_STRING, so they are easy to tell apart.
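To make the idea concrete, here is a minimal sketch of such a rewriter. The function name quoteBareIndexes() and its $keep parameter are made up for illustration. It deliberately only handles the exact sequence T_VARIABLE, [, T_STRING, ] with no whitespace in between, it leaves true/false/null alone, and it skips everything inside double-quoted strings, backticks and heredocs to stay on the safe side. Real constants that you actually use as indexes would have to be listed in $keep, since the tokenizer cannot tell them apart from bare words.

```php
<?php
// Sketch: quote bare-word array indexes such as $row[Name] -> $row['Name'].
// quoteBareIndexes() and $keep are hypothetical names, not part of any library.
function quoteBareIndexes(string $source, array $keep = []): string
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $out = '';
    $inString = false; // true while inside "...", `...` or a heredoc

    for ($i = 0; $i < $count; $i++) {
        $tok = $tokens[$i];
        $id = is_array($tok) ? $tok[0] : null;
        $text = is_array($tok) ? $tok[1] : $tok;

        // Track interpolated contexts; indexes in there are left untouched.
        if ($tok === '"' || $tok === '`') {
            $inString = !$inString;
        } elseif ($id === T_START_HEREDOC) {
            $inString = true;
        } elseif ($id === T_END_HEREDOC) {
            $inString = false;
        }

        // Match exactly: T_VARIABLE '[' T_STRING ']'
        $isBareIndex = !$inString
            && $id === T_STRING
            && $i >= 2 && $i + 1 < $count
            && is_array($tokens[$i - 2]) && $tokens[$i - 2][0] === T_VARIABLE
            && $tokens[$i - 1] === '['
            && $tokens[$i + 1] === ']'
            && !in_array(strtolower($text), ['true', 'false', 'null'], true)
            && !in_array($text, $keep, true); // real constants to leave alone

        $out .= $isBareIndex ? "'" . $text . "'" : $text;
    }

    return $out;
}

echo quoteBareIndexes('<?php echo $row_rsCatalogsItems[Name];');
// prints: <?php echo $row_rsCatalogsItems['Name'];
```

Because every token's text is written back unchanged unless it matches, anything the sketch does not recognise (expressions, chained indexes like $a[0][Name], indexes padded with spaces) simply passes through as it was.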

Also keep in mind that expressions are legal inside of indexes:

$pathological[ some_function('Oh gods', 'why me!?') . '4500' ] = 'Teh bad.';

You'll have a teeny tiny, slightly harder time dealing with these with an automated tool. By which I mean trying to handle them may cause you to fly into a murderous rage. I suggest only trying to fix the constant/string problem now. If done correctly, you should be able to get the Notice count down to a more manageable level.

(Also note that the Tokenizer deals with the curly string syntax as an actual token, T_CURLY_OPEN -- this should make those pesky inlined array indexes easier to deal with. Here's the list of all tokens once again, just in case you missed it.)
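A quick way to see the difference between the two string syntaxes is to dump the token names for each. This is only a small sketch; tokenNames() is a helper name invented here. The thing to watch for is that the plain "$face[fire]" form also produces a T_STRING for the index, but there the unquoted key is the correct syntax and adding quotes would be a parse error, so a rewriter has to leave it alone.

```php
<?php
// Hypothetical helper: list token names, or the literal character for one-char tokens.
function tokenNames(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $tok) {
        $names[] = is_array($tok) ? token_name($tok[0]) : $tok;
    }
    return $names;
}

// Curly syntax: the index is an ordinary quoted string after T_CURLY_OPEN.
$curly = tokenNames('<?php "{$face[\'fire\']}";');

// Simple syntax: the index shows up as T_STRING and must stay unquoted.
$simple = tokenNames('<?php "$face[fire]";');

print_r($curly);
print_r($simple);
```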


Here is the approach that I ended up taking, just for reference:

All common includes (header, footer, sidebars) get all their notices squashed, and receive elevated reporting settings (e.g. they log notices).

Main content that's old and has lots of notices already gets ignored, and notices are not displayed/logged.

New main content that I write will have elevated reporting settings.
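For anyone wanting to do the same, the split could look roughly like this. It is a sketch of the idea rather than my exact setup, and setNoticePolicy() is a name made up for the example. Also be aware that this only hides the problem on older PHP: from PHP 7.2 an undefined constant raises a warning instead of a notice, and from PHP 8.0 it throws an Error, so suppression stops working there.

```php
<?php
// Hypothetical helper: call once at the top of an include or page.
// $strict = true  -> cleaned-up or new code: report everything and log it.
// $strict = false -> old, noisy content: ignore notices, log nothing.
function setNoticePolicy(bool $strict): void
{
    error_reporting($strict ? E_ALL : E_ALL & ~E_NOTICE);
    ini_set('display_errors', '0');             // never show errors to visitors
    ini_set('log_errors', $strict ? '1' : '0'); // only strict code writes to the log
}

setNoticePolicy(true); // e.g. at the top of header.php or a newly written page
```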


I also inherited legacy PHP code, and I created a short PHP script that would take a source file, and replace unquoted array indexes. It essentially does what Charles suggests in the other answer.

Included in the comments is a bash command-line script that will invoke the array index fixer on all PHP source files in a folder and subfolders.
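If you would rather stay in PHP for the folder walk, something along these lines should work. To be clear, this is not the linked script or its bash one-liner, just a sketch of the same idea; applyToPhpFiles() is a made-up name, and $fix stands for whatever string-to-string rewriting function you use. It keeps a .bak copy of every file it changes.

```php
<?php
// Sketch: apply a rewriting callback to every .php file under $dir.
// applyToPhpFiles() is a hypothetical name; $fix is any function string -> string.
function applyToPhpFiles(string $dir, callable $fix): int
{
    // Collect the paths first so that writing .bak files cannot disturb the walk.
    $paths = [];
    $walker = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($walker as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === 'php') {
            $paths[] = $file->getPathname();
        }
    }

    $changed = 0;
    foreach ($paths as $path) {
        $old = file_get_contents($path);
        $new = $fix($old);
        if ($new !== $old) {
            copy($path, $path . '.bak'); // keep a backup before overwriting
            file_put_contents($path, $new);
            $changed++;
        }
    }

    return $changed; // number of files rewritten
}
```

Running it on a copy of the tree first, and diffing the result, is a good idea with any automated rewrite like this.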

You can get a copy of the script here:

https://github.com/GustavBertram/php-array-index-fixer/blob/master/aif.php
