PHP Tokens From a String
Let's say you have a string that looks like this:
token1 token2 tok3
And you want to get all of the tokens (specifically, the strings between the spaces), and also their position (offset) and length.
So I would want a result that looks something like this:
array(
    array(
        'value'=>'token1',
        'offset'=>0,
        'length'=>6
    ),
    array(
        'value'=>'token2',
        'offset'=>7,
        'length'=>6
    ),
    array(
        'value'=>'tok3',
        'offset'=>14,
        'length'=>4
    ),
)
I know that this can be done by looping through the characters of the string, and I can simply write a function to do this.
I am wondering, does PHP have anything built-in that will do this efficiently or at least help with part of this?
I am looking for suggestions and appreciate any help given. Thanks
You can use preg_match_all with the PREG_OFFSET_CAPTURE flag:
$str = 'token1 token2 tok3';
preg_match_all('/\S+/', $str, $matches, PREG_OFFSET_CAPTURE);
var_dump($matches);
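Then you just need to map the items in $matches[0] to the format you want, like this:
function update($match) {
    // With PREG_OFFSET_CAPTURE, each match is array(matched string, byte offset).
    return array('value' => $match[0], 'offset' => $match[1], 'length' => strlen($match[0]));
}
// array_map returns a new array; it does not modify $matches[0] in place.
$tokens = array_map('update', $matches[0]);
var_dump($tokens);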
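For reference, the preg_match_all approach can be packaged as a single function that returns the structure asked for in the question. This is only a sketch: the name tokens_with_offsets is made up for illustration. Note that both the offsets from PREG_OFFSET_CAPTURE and the lengths from strlen() are counted in bytes, so they differ from character positions if the string contains multibyte characters.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
// Returns each whitespace-delimited token with its byte offset and byte length.
function tokens_with_offsets($str)
{
    // \S+ matches each run of non-whitespace characters.
    preg_match_all('/\S+/', $str, $matches, PREG_OFFSET_CAPTURE);

    $tokens = array();
    foreach ($matches[0] as $match) {
        // $match[0] is the matched text, $match[1] is its byte offset in $str.
        $tokens[] = array(
            'value'  => $match[0],
            'offset' => $match[1],
            'length' => strlen($match[0]),
        );
    }
    return $tokens;
}

print_r(tokens_with_offsets('token1 token2 tok3'));
```

Because the pattern is \S+, tabs and newlines also count as separators here, not just spaces.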
There's a simpler way, in most respects. You'll have a more basic result, but with much less work put in.
Assuming you have tokena tokenb tokenc stored in $data:
$tokens = explode(' ', $data);
Now you have an array of tokens separated by spaces. They will be in order, so $tokens[0] = tokena, $tokens[1] = tokenb, etc. You can very easily get the length of any given item by doing strlen($tokens[$index]);
If you need to know how many tokens you were passed, use $token_count = count($tokens);
Not as sophisticated, but next to no work to get it.
You could use explode(), which will give you an array of tokens from the string, and strlen() to count the number of characters in each token. As far as I know, there isn't a PHP function to tell you where an element is in an array.
To get around the last problem, you could use a counter variable while looping through the explode()ed array (with foreach() or for()) and give each sub-array in the new data its position.
Someone please correct me if I'm wrong.
James
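The explode() plus counter idea could look roughly like the sketch below. The function name explode_with_offsets is hypothetical, and it assumes tokens are separated by space characters only (tabs and newlines are not treated as separators). The counter tracks the byte position of each piece in the original string.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
// Assumes the separator is a single space character.
function explode_with_offsets($data)
{
    $result = array();
    $offset = 0; // byte position of the current piece within $data

    foreach (explode(' ', $data) as $piece) {
        $length = strlen($piece);
        // Consecutive spaces produce empty pieces; skip them as tokens.
        if ($length > 0) {
            $result[] = array(
                'value'  => $piece,
                'offset' => $offset,
                'length' => $length,
            );
        }
        // Advance past this piece and the space that followed it.
        $offset += $length + 1;
    }
    return $result;
}

print_r(explode_with_offsets('token1 token2 tok3'));
```

Empty pieces still advance the counter by one, so offsets stay correct when the input has runs of several spaces.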
I like the first answer the most - to use PREG_OFFSET_CAPTURE. In case anyone else is interested, I ended up writing something that does this as well, although I am going to accept the first answer.
Thank you everybody for helping!
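function get_words($string) {
    $words = array();
    if ($string === '') return $words;
    $string_chars = str_split($string);
    $curr_offset = 0;
    $length = 0;
    $value_array = array();
    foreach ($string_chars as $offset => $char) {
        if ($char == ' ') {
            // A space ends the current word, if there is one.
            if ($length) $words[] = array('offset'=>$curr_offset,'length'=>$length,'value'=>implode($value_array));
            // The next word can start right after this space.
            $curr_offset = $offset + 1;
            $length = 0;
            $value_array = array();
        }
        else {
            $length++;
            $value_array[] = $char;
        }
    }
    // Emit the final word, which has no trailing space to close it.
    if ($length) $words[] = array('offset'=>$curr_offset,'length'=>$length,'value'=>implode($value_array));
    return $words;
}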