Fastest way to remove whitespace from a rendered PHP file

2023-02-08 06:58

I tried the performance-check tool "DOM Monster" to analyze my PHP site. One of its findings says "50% of nodes are whitespace-only text nodes". OK, I understand the problem, but what is the fastest way to clean up whitespace in PHP?

I think a good start is to use output control: call ob_start(), then strip the whitespace before releasing the buffer with ob_end_flush(). At the moment I do everything with echo, echo, ... I have never read much about the ob_* functions. Are they useful?

I guess using preg_replace() would be a performance killer for this job, wouldn't it? So what is the best practice here?

The fastest way to remove whitespace-only nodes is to not create them in the first place. Just remove all the whitespace immediately before and after each HTML tag.
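As a minimal sketch of that idea, the markup can be built so that nothing sits between the tags. The function name renderList() and the sample data are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: builds a list with no whitespace between tags,
// so the browser never creates whitespace-only text nodes for it.
function renderList(array $items): string
{
    $html = '<ul>';
    foreach ($items as $item) {
        // Tags are concatenated directly; no newlines or indentation are emitted.
        $html .= '<li>' . htmlspecialchars($item) . '</li>';
    }
    return $html . '</ul>';
}

echo renderList(['one', 'two']);
```

The PHP source stays readable because the indentation lives in the code, not in the strings that are sent to the browser.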

You certainly could remove the spaces from your code after the fact using an output handler (look at the callback bit in ob_start), but if your goal is performance, then that kind of defeats the purpose.
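For reference, the callback form of ob_start() might look like the sketch below. The callback name stripInterTagWhitespace() and the regular expression are assumptions made for illustration, not a vetted minifier.

```php
<?php
// Hypothetical output callback: removes whitespace that sits directly
// between a closing '>' and the next opening '<'.
function stripInterTagWhitespace(string $buffer): string
{
    $result = preg_replace('/>\s+</', '><', $buffer);
    // preg_replace() returns null on failure; fall back to the untouched buffer.
    return $result === null ? $buffer : $result;
}

// Register the callback before any output is produced.
ob_start('stripInterTagWhitespace');

echo "<ul>\n    <li>one</li>\n    <li>two</li>\n</ul>";

// Flushing passes the whole buffer through the callback and sends the result.
ob_end_flush();
```

Note that this pattern also removes the space between adjacent inline elements (for example between a closing b tag and an opening i tag), which can change how the text renders, and it does not know about pre or textarea blocks.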

A whitespace-only node is a text node in the DOM tree that the browser builds when it parses your HTML. It appears wherever there is an HTML tag, then nothing but whitespace, then another HTML tag. It's a waste of browser resources, but not a huge deal.

Wouldn't the trim() function solve your problem?

http://www.php.net/manual/en/function.trim.php
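One thing to keep in mind is that trim() only strips characters from the beginning and end of a string. A small sketch with made-up sample markup shows that whitespace between tags is left alone:

```php
<?php
// Sample markup (made up for illustration) with outer and inner whitespace.
$html = "  <ul>\n  <li>one</li>\n</ul>\n";

// trim() removes the leading spaces and the trailing newline only.
$trimmed = trim($html);

// The newline and indentation between <ul> and <li> are still present.
var_dump($trimmed === "<ul>\n  <li>one</li>\n</ul>");
```

So on its own it would not remove the whitespace-only nodes inside the document.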

Well, I guess you are talking about HTML, and HTML is by nature a markup language full of whitespace (attributes, text). Besides, you probably use newlines for readability.

I would rather advise you to compress your pages with deflate/gzip through web server rules, e.g. an .htaccess rule (Apache with mod_deflate enabled):

<FilesMatch "\\.(js|css|html|htm|php|xml)$">
SetOutputFilter DEFLATE
</FilesMatch>

You can also take a look at Tidy, a library that helps you check and clean up your HTML code.

preg_replace will of course slow things down a little, but it is probably the fastest way anyway. The bigger problem is that preg_replace may be unreliable, because it is very hard to write a regular expression that works in all possible cases. If you are creating XML/XHTML output, you could parse all your data with a fast stream parser (SAX-style or StAX-style; PHP usually has both built in, as the XML Parser extension and XMLReader), and then write the data back to the output without the whitespace. That's simple, effective, reliable and at least moderately fast. It still won't blow you away with speed.
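A rough sketch of the stream-parser idea, assuming well-formed XML/XHTML input and the XMLReader and XMLWriter extensions: the function name stripXmlWhitespace() is hypothetical, and the sketch ignores comments, processing instructions and the doctype.

```php
<?php
// Hypothetical helper: copies an XML document node by node and
// leaves out every whitespace-only text node.
function stripXmlWhitespace(string $xml): string
{
    $reader = new XMLReader();
    $reader->XML($xml);

    $writer = new XMLWriter();
    $writer->openMemory();

    while ($reader->read()) {
        switch ($reader->nodeType) {
            case XMLReader::ELEMENT:
                $writer->startElement($reader->name);
                // Remember this before moving the cursor onto the attributes.
                $isEmpty = $reader->isEmptyElement;
                while ($reader->moveToNextAttribute()) {
                    $writer->writeAttribute($reader->name, $reader->value);
                }
                if ($isEmpty) {
                    $writer->endElement(); // self-closing element
                }
                break;
            case XMLReader::END_ELEMENT:
                $writer->fullEndElement();
                break;
            case XMLReader::TEXT:
            case XMLReader::CDATA:
                $writer->text($reader->value);
                break;
            // XMLReader::WHITESPACE and XMLReader::SIGNIFICANT_WHITESPACE
            // are deliberately not copied.
        }
    }
    $reader->close();

    return $writer->outputMemory();
}

echo stripXmlWhitespace("<a>\n  <b>hi</b>\n  <c/>\n</a>");
```

Like any approach that drops whitespace-only nodes, this also removes a lone space between two inline elements, so it suits structural markup better than running text.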

Another option would be to just use gzip (ob_start('ob_gzhandler') is the call in PHP). This compresses your data, and compression works extremely well on data that repeats a lot within a document. It comes with a little performance penalty as well, but the reduced size of the output document may make up for it. Beware, though, that the output will not be sent to the browser before all of it is available. This makes partial loading of web pages much harder ;-).
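To get a feel for how well repeated whitespace compresses, the sizes can be compared with gzencode() from the zlib extension. The helper name gzipSizes() and the sample page are made up for this sketch; in a real request the built-in handler is registered with ob_start('ob_gzhandler') before any output.

```php
<?php
// In a web request the usual form is a single call before any output:
//     ob_start('ob_gzhandler');
// The hypothetical helper below only measures the effect offline.
function gzipSizes(string $html): array
{
    return [
        'raw'  => strlen($html),
        'gzip' => strlen(gzencode($html)),
    ];
}

// Made-up sample: heavily indented, repetitive markup.
$page  = str_repeat("<li>\n        item\n    </li>\n", 200);
$sizes = gzipSizes($page);

printf("raw: %d bytes, gzip: %d bytes\n", $sizes['raw'], $sizes['gzip']);
```

This only reduces the bytes on the wire; the browser still builds the whitespace-only text nodes after decompressing.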

The problem with using ob_* and then trimming whitespace is that you’ll have to make sure to not remove displayed whitespace like in <pre> tags or <textarea>s etc. You’ll need a syntactical parser which understands where it should not trim.
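A regex-based approximation of that idea, not a real parser, is to split the page around pre and textarea blocks and only touch the pieces in between. The function name stripOutsidePreformatted() is hypothetical, and the sketch assumes these blocks are not nested and are properly closed.

```php
<?php
// Hypothetical helper: strips whitespace between tags everywhere except
// inside <pre>...</pre> and <textarea>...</textarea> blocks.
function stripOutsidePreformatted(string $html): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the protected blocks land on odd indexes,
    // the text between them on even indexes.
    $parts = preg_split(
        '#(<pre\b.*?</pre>|<textarea\b.*?</textarea>)#is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );
    if ($parts === false) {
        return $html; // regex failure: leave the page untouched
    }

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $stripped = preg_replace('/>\s+</', '><', $part);
            $parts[$i] = $stripped === null ? $part : $stripped;
        }
    }

    return implode('', $parts);
}

echo stripOutsidePreformatted("<ul>\n  <li>a</li>\n</ul>\n<pre>\n  x\n</pre>");
```

Whitespace directly next to a protected block is left in place, which errs on the side of keeping output intact rather than removing every last node.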

With a (performance-)expensive parser, you should also cache the output where possible.
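A simple file-based cache could look roughly like this. The function name cachedRender(), the cache directory and the five-minute lifetime are all assumptions for the sake of the sketch.

```php
<?php
// Hypothetical helper: returns the cached page if it is fresh enough,
// otherwise runs the expensive $render callback and stores its result.
function cachedRender(string $key, callable $render, string $cacheDir, int $ttl = 300): string
{
    $file = $cacheDir . '/' . sha1($key) . '.html';

    // Serve from the cache while the file is younger than $ttl seconds.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);
    }

    // Cache miss: render (and clean up) the page once, then store it.
    $html = $render();
    file_put_contents($file, $html, LOCK_EX);

    return $html;
}

// Usage with a stand-in for the expensive rendering step.
echo cachedRender('home', function () {
    return '<ul><li>one</li></ul>';
}, sys_get_temp_dir());
```

The whitespace cleanup then runs once per cache lifetime instead of on every request.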

The following is code to remove all space characters but the first of a sequence of spaces. So 1 space will be kept, 3 spaces pruned to 1, etc.

At the top of your PHP file do

ob_start();

At the end do

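function StripExtraSpace($s)
{
  // Collapse every run of consecutive spaces into a single space.
  $len = strlen($s); // measure once instead of on every iteration
  $newstr = "";
  for($i = 0; $i < $len; $i++)
  {
    $newstr .= $s[$i];
    if($s[$i] == ' ')
      // Skip the remaining spaces of this run.
      while($i + 1 < $len && $s[$i + 1] == ' ')
        $i++;
  }

  return $newstr;
}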

$content = ob_get_clean();
echo StripExtraSpace($content);
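For comparison, the same collapsing of space runs can be written with a single preg_replace() call, which runs in compiled code rather than in a PHP-level loop. This is a swapped-in technique, not the method used above, and the function name collapseSpaces() is made up.

```php
<?php
// Hypothetical alternative: replace each run of two or more spaces
// with a single space. Tabs and newlines are left untouched.
function collapseSpaces(string $s): string
{
    $result = preg_replace('/ {2,}/', ' ', $s);
    // preg_replace() returns null on failure; fall back to the input.
    return $result === null ? $s : $result;
}

echo collapseSpaces('a   b  c');
```

Whether it is actually faster for a given page is best checked by measuring both versions on real output.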
