PHP: How are comments skipped?

Well, if I comment something out it's skipped in all languages, but how are comments skipped, and what is actually read?

Example:

// This is commented out

Now, does PHP read the whole comment to get to the next line, or does it just read the //?


The script is parsed and split into tokens.

You can actually try this out yourself on any valid PHP source code using token_get_all(), which uses PHP's native tokenizer.

The example from the manual shows how a comment is dealt with:

<?php
$tokens = token_get_all('<?php echo; ?>'); /* => array(
                                                  array(T_OPEN_TAG, '<?php'), 
                                                  array(T_ECHO, 'echo'),
                                                  ';',
                                                  array(T_CLOSE_TAG, '?>') ); */

/* Note in the following example that the string is parsed as T_INLINE_HTML
   rather than the otherwise expected T_COMMENT (T_ML_COMMENT in PHP <5).
   This is because no open/close tags were used in the "code" provided.
   This would be equivalent to putting a comment outside of <?php ?> 
   tags in a normal file. */

$tokens = token_get_all('/* comment */'); 
// => array(array(T_INLINE_HTML, '/* comment */'));
?>
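To see what happens to a comment that sits inside PHP tags, here is a minimal sketch along the same lines. The helper name describe_tokens() and the sample source string are made up for illustration; only token_get_all() and token_name() are PHP's own functions. The exact comment text can differ slightly between PHP versions (as far as I know, older versions included the trailing newline in the comment token and newer ones do not), so the sketch only looks at how the text starts.

```php
<?php
// Hypothetical helper: turn token_get_all() output into [name, text] pairs.
function describe_tokens(string $source): array {
    $out = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens are [id, text, line]; map the id to its name.
            $out[] = [token_name($token[0]), $token[1]];
        } else {
            // Single-character tokens such as ';' come back as plain strings.
            $out[] = ['CHAR', $token];
        }
    }
    return $out;
}

// Sample input: a single-line comment followed by a statement.
$source = "<?php\n// This is commented out\necho 1;\n";

foreach (describe_tokens($source) as [$name, $text]) {
    echo $name, ' ', json_encode($text), "\n";
}
```

The comment shows up as a single T_COMMENT token holding its full text, so the tokenizer did read every character of it. It is simply not turned into anything executable afterwards.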


There is a tokenization phase while compiling. During this phase, the tokenizer sees the // and then just ignores everything up to the end of the line. Compilers CAN get complicated, but for the most part they are pretty straightforward.
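A rough sketch of that "ignore everything to the end of the line" step, written by hand rather than taken from PHP's real lexer. The function name skip_line_comment() and the sample string are assumptions for illustration. It is also a simplification: in real PHP a single-line comment ends at a closing ?> tag as well as at a newline.

```php
<?php
// Hypothetical helper: $pos is the offset of a "//" in $src.
// Returns the offset of the newline that ends the comment,
// or strlen($src) if the comment runs to the end of the input.
function skip_line_comment(string $src, int $pos): int {
    $len = strlen($src);
    // Every character of the comment is still visited, one by one;
    // it is just discarded instead of being turned into a token.
    while ($pos < $len && $src[$pos] !== "\n") {
        $pos++;
    }
    return $pos;
}

$src   = "a = 1; // ignored text\nb = 2;";
$start = strpos($src, '//');
$end   = skip_line_comment($src, $start);

// Whatever follows the newline is what the scanner carries on with.
echo substr($src, $end + 1), "\n";
```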

http://compilers.iecc.com/crenshaw/


Your question doesn't make sense. Having read the '//', the compiler then has to keep reading up to the newline in order to find where the comment ends. There's no choice about this. There is no other way to find the newline.

Conceptually, compiling has several phases that are logically prior to parsing:

  1. Scanning.
  2. Screening.
  3. Tokenization.

(1) basically means reading the file character by character from left to right. (2) means throwing away things of no interest, e.g. collapsing multiple newline/whitespace sequences to a single space. (3) means combining what's left into tokens, e.g. identifiers, keywords, literals, punctuation.

Comments are screened out during (2). In modern compilers this is all done at once by a deterministic automaton.
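As an illustration of the three phases being done at once, here is a small hand-written scanner for a made-up toy language. It is not PHP's actual lexer, and the function name toy_tokenize() and the token names IDENT, NUMBER and PUNCT are invented for this sketch. Whitespace and // comments are screened out in the same left-to-right pass that builds the tokens.

```php
<?php
// Toy single-pass scanner: scanning, screening and tokenization together.
function toy_tokenize(string $src): array {
    $tokens = [];
    $i = 0;
    $n = strlen($src);
    while ($i < $n) {
        $c = $src[$i];
        // Screening: whitespace produces no token.
        if (ctype_space($c)) {
            $i++;
            continue;
        }
        // Screening: a "//" comment is consumed up to the newline.
        if ($c === '/' && $i + 1 < $n && $src[$i + 1] === '/') {
            while ($i < $n && $src[$i] !== "\n") {
                $i++;
            }
            continue;
        }
        // Tokenization: identifiers.
        if (ctype_alpha($c) || $c === '_') {
            $start = $i;
            while ($i < $n && (ctype_alnum($src[$i]) || $src[$i] === '_')) {
                $i++;
            }
            $tokens[] = ['IDENT', substr($src, $start, $i - $start)];
            continue;
        }
        // Tokenization: integer literals.
        if (ctype_digit($c)) {
            $start = $i;
            while ($i < $n && ctype_digit($src[$i])) {
                $i++;
            }
            $tokens[] = ['NUMBER', substr($src, $start, $i - $start)];
            continue;
        }
        // Anything else is treated as one-character punctuation.
        $tokens[] = ['PUNCT', $c];
        $i++;
    }
    return $tokens;
}

foreach (toy_tokenize("x = 1; // note\ny = x + 2;") as [$type, $text]) {
    echo $type, ' ', $text, "\n";
}
```

The comment text never appears in the result, yet the loop still had to step over each of its characters to find the newline.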
