开发者

Measure string size in Bytes in php

I am doin开发者_如何学运维g a real estate feed for a portal and it is telling me the max length of a string should be 20,000 bytes (20kb), but I have never run across this before.

How can I measure byte size of a varchar string. So I can then do a while loop to trim it down.


You can use mb_strlen() to get the byte length using a encoding that only have byte-characters, without worring about multibyte or singlebyte strings. For example, as drake127 saids in a comment of mb_strlen, you can use '8bit' encoding:

<?php
    $string = 'Cién cañones por banda';
    echo mb_strlen($string, '8bit');
?>

You can have problems using strlen function since php have an option to overload strlen to actually call mb_strlen. See more info about it in http://php.net/manual/en/mbstring.overload.php

For trim the string by byte length without split in middle of a multibyte character you can use:

mb_strcut(string $str, int $start [, int $length [, string $encoding ]] )


You have to figure out if the string is ascii encoded or encoded with a multi-byte format.

In the former case, you can just use strlen.

In the latter case you need to find the number of bytes per character.

the strlen documentation gives an example of how to do it : http://www.php.net/manual/en/function.strlen.php#72274


Do you mean byte size or string length?

Byte size is measured with strlen(), whereas string length is queried using mb_strlen(). You can use substr() to trim a string to X bytes (note that this will break the string if it has a multi-byte encoding - as pointed out by Darhazer in the comments) and mb_substr() to trim it to X characters in the encoding of the string.


PHP's strlen() function returns the number of ASCII characters.

strlen('borsc') -> 5 (bytes)

strlen('boršč') -> 7 (bytes)

$limit_in_kBytes = 20000;

$pointer = 0;
while(strlen($your_string) > (($pointer + 1) * $limit_in_kBytes)){
    $str_to_handle = substr($your_string, ($pointer * $limit_in_kBytes ), $limit_in_kBytes);
    // here you can handle (0 - n) parts of string
    $pointer++;
}

$str_to_handle = substr($your_string, ($pointer * $limit_in_kBytes), $limit_in_kBytes);
// here you can handle last part of string

.. or you can use a function like this:

function parseStrToArr($string, $limit_in_kBytes){
    $ret = array();

    $pointer = 0;
    while(strlen($string) > (($pointer + 1) * $limit_in_kBytes)){
        $ret[] = substr($string, ($pointer * $limit_in_kBytes ), $limit_in_kBytes);
        $pointer++;
    }

    $ret[] = substr($string, ($pointer * $limit_in_kBytes), $limit_in_kBytes);

    return $ret;
}

$arr = parseStrToArr($your_string, $limit_in_kBytes = 20000);


Further to PhoneixS answer to get the correct length of string in bytes - Since mb_strlen() is slower than strlen(), for the best performance one can check "mbstring.func_overload" ini setting so that mb_strlen() is used only when it is really required:

$content_length = ini_get('mbstring.func_overload') ? mb_strlen($content , '8bit') : strlen($content);
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜