
How to validate the number of opening and closing tags?

I thought of counting the matches of "/<[a-z0-9]+>/i" (for example with preg_match_all) and then checking that "/<\/[a-z0-9]+>/i" matches the same number of closing tags.

But I am not too sure. How would you count all the opening tags and check that each one has a closing tag?

PS: I don't need to handle attributes or XML-style self-closing tags (/>). I just need a count of plain, simple HTML tags.

Thanks


I wrote this handy function. I think it could be faster if I searched for both opening and closing tags within one preg_match_all, but this way it's more readable:

<?php

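//> Counts <[a-z]> and </[a-z]> tags and optionally validates their order.
//> The order check pairs the n-th closing tag with the n-th opening tag, so it assumes non-nested markup.
//> Note: write br as <br /> so that it is not counted as an opening tag.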
function validHTML($html,$checkOrder=true) {
    preg_match_all( '#<([a-z]+)>#i' , $html, $start, PREG_OFFSET_CAPTURE );
    preg_match_all( '#<\/([a-z]+)>#i' , $html, $end, PREG_OFFSET_CAPTURE );
    $start = $start[1];
    $end = $end[1];

    if (count($start) != count($end) )
        throw new Exception('Check numbers of tags');

    if ($checkOrder) { 
        $is = 0;
        foreach($end as $v){
            if ($v[0] != $start[$is][0] || $v[1] < $start[$is][1] )
                throw new Exception('End tag ['.$v[0].'] not opened');

            $is++;
        }
    }

    return true;
}

//> Usage:

try {
    validHTML('<p>hello</p><li></li></p><p>');

} catch (Exception $e) { 
    echo $e->getMessage();
}

Note: if you also need to catch h1 or any other tag whose name contains digits, add 0-9 to the character class in both patterns.
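The single preg_match_all idea mentioned above could look roughly like the sketch below. It is not the function from this answer: validNestedHTML is a hypothetical name, and it swaps the position-by-position comparison for a stack, so that nested tags such as <div><p></p></div> are accepted. It makes the same assumptions as the question (no attributes, no self-closing tags).

```php
<?php

// Hypothetical helper (not part of the original answer): one-pass, stack-based check.
function validNestedHTML(string $html): bool
{
    // Match opening (<p>) and closing (</p>) tags in document order.
    // Group 1 is the optional slash, group 2 is the tag name.
    preg_match_all('#<(/?)([a-z][a-z0-9]*)>#i', $html, $tags, PREG_SET_ORDER);

    $stack = [];
    foreach ($tags as $tag) {
        $name = strtolower($tag[2]);
        if ($tag[1] === '') {
            $stack[] = $name;   // opening tag: remember it
        } elseif (array_pop($stack) !== $name) {
            return false;       // closing tag does not match the last opened tag
        }
    }

    return count($stack) === 0; // every opened tag must have been closed
}

var_dump(validNestedHTML('<div><p>hello</p></div>'));      // bool(true)
var_dump(validNestedHTML('<p>hello</p><li></li></p><p>')); // bool(false)
```

Because the pattern only matches tags without attributes, something like <a href="..."> is ignored while its </a> is still seen, so this sketch would reject it.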


The proper way to validate HTML is to use an HTML parser. Using regexes to deal with HTML is very wrong; see "RegEx match open tags except XHTML self-contained tags".
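As a rough illustration of letting a parser do the work, the sketch below feeds the fragment to PHP's DOMDocument as XML. The function name isWellFormedFragment and the <root> wrapper are assumptions made for this example. It also assumes XHTML-style markup: void elements must be written as <br />, and HTML-only entities such as &nbsp; will be rejected.

```php
<?php

// Hypothetical helper: treats the fragment as XML, so it assumes XHTML-style markup.
function isWellFormedFragment(string $html): bool
{
    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // XML needs exactly one root element, so wrap the fragment in a dummy one.
    $ok = $doc->loadXML('<root>' . $html . '</root>');

    // Restore the previous error-handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

var_dump(isWellFormedFragment('<p>hello</p>'));                 // bool(true)
var_dump(isWellFormedFragment('<p>hello</p><li></li></p><p>')); // bool(false)
```

A strict XML parse is stricter than real HTML, which is why it catches unbalanced tags; for tag soup from the wild, an HTML-aware parser would be the better fit.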


My approach:

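function checkHtml($html) {
    $level = 0;    // current nesting depth
    $map = [];     // expected closing tags, keyed by '/name' . depth
    $length = strlen($html);
    $open = false; // true while reading a tag name
    $tag = '';
    for($i = 0; $i < $length; $i ++) {
        $c = substr($html, $i, 1);

        if($c == '<') {
            // start of a tag: begin collecting its name
            $open = true;
            $tag = '';
        }  else if($open && ($c == '>' || ord($c) == 32)) {
            // '>' or a space ends the tag name (attributes are skipped)
            $open = false;
            // void elements never need a closing tag
            if(in_array($tag, ['br', 'br/', 'hr/', 'img/', 'hr', 'img'])) {
                continue;
            }
            if(strpos($tag, '/') === 0) {
                // closing tag: it must match the tag opened one level up
                if(!isset($map[$tag.($level-1)])) {
                    return false;
                }
                $level --;
                unset($map[$tag.$level]); 
            } else {
                // opening tag: record the closing tag expected at this depth
                $map['/'.$tag.$level] = true;
                $level ++;
            } 
        } else if($open) {
           $tag .=  $c;
        }
    }
    // valid only if every opened tag has been closed
    return $level == 0;
}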


OK, one solution would be:

function open_tags($page) // $page: your html/xml/something content
{
    $arr = array();
    $i = 0;
    while (($i = strpos($page, '<', $i)) !== false) // position where the tag starts
    {
        $end = strpos($page, '>', $i); // position where the tag ends
        if ($end === false)
            return false; // the tag is never terminated by '>'

        $tag = substr($page, $i + 1, $end - $i - 1); // tag content without '<' and '>'
        if (strpos($tag, '/') === 0) // if it's an end tag
        {
            // pop the last value pushed onto the stack and check that it matches this one
            if (array_pop($arr) !== substr($tag, 1))
                return false;
        }
        else
        {
            array_push($arr, $tag); // push the new tag onto the stack
        }

        $i = $end + 1; // continue scanning after this tag
    }

    return $arr;
}

This returns the tags that are still open, in order (an empty array if everything is balanced), or false on error, so compare the result with === false.

Edit: this version also returns false when a new tag starts before the current one is closed:

function open_tags($page) // $page: your html/xml/something content
{
    $arr = array();
    $i = 0;
    while (($i = strpos($page, '<', $i)) !== false) // position where the tag starts
    {
        $end = strpos($page, '>', $i);      // position where the tag ends
        $next = strpos($page, '<', $i + 1); // position of the next '<', if any
        if ($end === false || ($next !== false && $next < $end))
            return false; // unterminated tag, or a new tag starts before this one ends

        $tag = substr($page, $i + 1, $end - $i - 1); // tag content without '<' and '>'
        if (strpos($tag, '/') === 0) // if it's an end tag
        {
            // pop the last value pushed onto the stack and check that it matches this one
            if (array_pop($arr) !== substr($tag, 1))
                return false;
        }
        else
        {
            array_push($arr, $tag); // push the new tag onto the stack
        }

        $i = $end + 1; // continue scanning after this tag
    }

    return $arr;
}