Regular expression for words between 3 and 16 chars in PHP

I'm completely new to regular expressions, and I need to extract all the words of at least 3 and at most 16 characters from a text (so I can enter them into a MySQL database).

Currently, everything works, except for the regular expression:

/^.{3,16}$/

(I constructed this from a tutorial found using Google ;-) )

Thanks! Yvan

Sample Data:

rjm1986 * SinuhePalma * excel2010 * Jimineedles * 209663603 * C6A7XR * Snojog * XmafiaX * Cival2 * HitmanPirrie * MAX * 4163016 * Dredd23 * Daddy420 * mattpauley * Mykillurdeath * 244833585 * KCKnight * Greystoke * Fatbastard * Fucku4 * Davkar * Banchy2 * ET187 * Slayr69 * Nik1236 * SeriousAl * 315791 * 216996334 * K1ra * Koops1 * LastFallout * zmileben * bismark * Krlssi * FuckOff1 * 1owni * Ulme * Rxtvjq * halfdeadman * Jamacola * LBTG1008 * toypark * Magicman6497 * Tyboe187 * Bob187 * Zetrox

PHP Code (yeah, I know - it's kind of sloppy - this is only used to generate the queries...)

<?php
    //regexer.php

    $text = @$_REQUEST['fText'];
    if ($text == '') {
?>
<form method="post" action="">
    <input type="text" name="regex" />
    <textarea name="fText"></textarea>
    <br />
    <input type="submit"></input>
</form>
<?php 
    } else {
        preg_match_all($_REQUEST['regex'], $_REQUEST['fText'], $matches);
        header ("Content-type: text/plain");
        foreach ($matches as $match) {
            //print_r($match);
            echo ("INSERT INTO maf_codes (Code, GameID) VALUES ('$match', %GAMEID%);\n");
        }
    }
?>

Found a solution: replacing $_REQUEST['regex'] with the regex itself worked ;)


Try this:

/\b\w{3,16}\b/

Explained:

  • \b matches a word boundary
  • \w matches a word character
  • {3,16} applies to the \w and it indicates that at least 3 and at most 16 characters should be matched.
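To make the length bounds concrete, here is a minimal sketch on a made-up sample string (the sample and the variable names are assumptions, not part of the question). It suggests that tokens shorter than 3 or longer than 16 characters are skipped entirely, rather than the long one being cut down to 16:

```php
<?php
// Hypothetical sample: a 2-char, a 3-char, a 16-char and a 17-char token.
$sample = 'ab abc abcdefghijklmnop abcdefghijklmnopq';

$found = array();
preg_match_all('/\b\w{3,16}\b/', $sample, $found);

// The trailing \b cannot match in the middle of the 17-character token,
// so that token yields no match at all instead of a 16-character prefix.
print_r($found[0]);
```

Only the 3-character and the 16-character tokens should appear in the output.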

FYI: I omitted the start anchor (^) and end anchor ($) from the regex in your question. It seems you want to find matches inside longer strings of text, and the anchors would restrict the matching to cases where the entire input string matches.

UPDATE:

Here is the proof that this regex works:

<?php

$input = 'rjm1986 * SinuhePalma * excel2010 * Jimineedles * 209663603 * C6A7XR * Snojog * XmafiaX * Cival2 * HitmanPirrie * MAX * 4163016 * Dredd23 * Daddy420 * mattpauley * Mykillurdeath * 244833585 * KCKnight * Greystoke * Fatbastard * Fucku4 * Davkar * Banchy2 * ET187 * Slayr69 * Nik1236 * SeriousAl * 315791 * 216996334 * K1ra * Koops1 * LastFallout * zmileben * bismark * Krlssi * FuckOff1 * 1owni * Ulme * Rxtvjq * halfdeadman * Jamacola * LBTG1008 * toypark * Magicman6497 * Tyboe187 * Bob187 * Zetrox';

$matches = array();

preg_match_all('/\b\w{3,16}\b/', $input, $matches);

print_r($matches);

?>

Outputs:

Array
(
    [0] => Array
        (
            [0] => rjm1986
            [1] => SinuhePalma
            [2] => excel2010
            [3] => Jimineedles
            [4] => 209663603
            [5] => C6A7XR
            [6] => Snojog
            [7] => XmafiaX
            [8] => Cival2
            [9] => HitmanPirrie
            [10] => MAX
            [11] => 4163016
            [12] => Dredd23
            [13] => Daddy420
            [14] => mattpauley
            [15] => Mykillurdeath
            [16] => 244833585
            [17] => KCKnight
            [18] => Greystoke
            [19] => Fatbastard
            [20] => Fucku4
            [21] => Davkar
            [22] => Banchy2
            [23] => ET187
            [24] => Slayr69
            [25] => Nik1236
            [26] => SeriousAl
            [27] => 315791
            [28] => 216996334
            [29] => K1ra
            [30] => Koops1
            [31] => LastFallout
            [32] => zmileben
            [33] => bismark
            [34] => Krlssi
            [35] => FuckOff1
            [36] => 1owni
            [37] => Ulme
            [38] => Rxtvjq
            [39] => halfdeadman
            [40] => Jamacola
            [41] => LBTG1008
            [42] => toypark
            [43] => Magicman6497
            [44] => Tyboe187
            [45] => Bob187
            [46] => Zetrox
        )

)


Can you tell us what exactly is not working? In any case, I think your regex should use the word boundary metacharacter \b:

/\b\w{3,16}\b/

Update: It works for me. This:

<?php
$a = array();

preg_match_all('/\b\w{3,16}\b/', "rjm1986 * SinuhePalma * excel2010 * Jimineedles * 209663603 * C6A7XR * Snojog * XmafiaX * Cival2 * HitmanPirrie * MAX * 4163016 * Dredd23 * Daddy420 * mattpauley * Mykillurdeath * 244833585 * KCKnight * Greystoke * Fatbastard * Fucku4 * Davkar * Banchy2 * ET187 * Slayr69 * Nik1236 * SeriousAl * 315791 * 216996334 * K1ra * Koops1 * LastFallout * zmileben * bismark * Krlssi * FuckOff1 * 1owni * Ulme * Rxtvjq * halfdeadman * Jamacola * LBTG1008 * toypark * Magicman6497 * Tyboe187 * Bob187 * Zetrox", $a);

print_r($a);

gives me:

Array
(
    [0] => Array
        (
            [0] => rjm1986
            [1] => SinuhePalma
            [2] => excel2010
            [3] => Jimineedles
            [4] => 209663603
            //.... lot more here...
            [45] => Bob187
            [46] => Zetrox
        )

)

Also note that the matches are in the first entry of the result array, so you have to do:

 foreach ($matches[0] as $match) {
        print_r($match);
        //...
 }

It is also good practice to initialise $matches before you use it (preg_match_all() fills it by reference, so this is not strictly required):

$matches = array();
preg_match_all($_REQUEST['regex'], $_REQUEST['fText'], $matches);
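Putting these two points together, the query-generating loop from the question could be sketched roughly as follows. The helper name buildInserts is hypothetical, the pattern is hard-coded instead of being read from the form, and %GAMEID% is left as the literal placeholder from the question:

```php
<?php
// Hypothetical helper: turns free text into one INSERT statement per word.
function buildInserts(string $text): array
{
    $matches = array();
    preg_match_all('/\b\w{3,16}\b/', $text, $matches);

    $queries = array();
    // $matches[0] holds the full-pattern matches; iterating $matches itself
    // would yield a single inner array instead of the individual words.
    foreach ($matches[0] as $code) {
        $queries[] = "INSERT INTO maf_codes (Code, GameID) VALUES ('$code', %GAMEID%);";
    }
    return $queries;
}

$queries = buildInserts('rjm1986 * MAX * K1ra');
echo implode("\n", $queries), "\n";
```

This only builds strings for inspection. If the statements were ever run against user-supplied text directly, prepared statements would be the safer route.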


As others have said, the following will do it. (PHP's preg functions have no g modifier; preg_match_all() already returns every match.)

/\b\w{3,16}\b/

Your original pattern (below) didn't work because:

/^.{3,16}$/
  1. Without the m modifier, ^ and $ anchor the match to the beginning and end of the whole input, so the pattern only matches if the entire text is 3 to 16 characters long. It looks like you want to extract words from within the text.
  2. The . matches any character except a newline, including spaces and special characters.
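A small comparison may help here. The two sample strings and the variable names below are made up for illustration; the patterns are the ones discussed in this thread:

```php
<?php
$short = 'ab cd ef';                              // 8 characters, spaces included
$long  = 'rjm1986 * SinuhePalma * excel2010';     // well over 16 characters

// Anchored pattern: the whole subject must be 3 to 16 characters long.
$anchoredShort = preg_match('/^.{3,16}$/', $short); // whole string matches, spaces and all
$anchoredLong  = preg_match('/^.{3,16}$/', $long);  // subject is too long, no match

// Unanchored pattern with word boundaries: individual words are picked out.
$words = array();
preg_match_all('/\b\w{3,16}\b/', $long, $words);

var_dump($anchoredShort, $anchoredLong, $words[0]);
```

The anchored pattern accepts the short string as a single match and rejects the long one outright, while the word-boundary pattern returns the three words from the long string.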


You can just use strlen().

$mystr="rjm1986 * SinuhePalma * excel2010 * Jimineedles * 209663603 * C6A7XR * Snojog * XmafiaX * Cival2 * HitmanPirrie * MAX * 4163016 * Dredd23 * Daddy420 * mattpauley * Mykillurdeath * 244833585 * KCKnight * Greystoke * Fatbastard * Fucku4 * Davkar * Banchy2 * ET187 * Slayr69 * Nik1236 * SeriousAl * 315791 * 216996334 * K1ra * Koops1 * LastFallout * zmileben * bismark * Krlssi * FuckOff1 * 1owni * Ulme * Rxtvjq * halfdeadman * Jamacola * LBTG1008 * toypark * Magicman6497 * Tyboe187 * Bob187 * Zetrox";
$s = explode(" ",$mystr);
foreach($s as $v){
    $len=strlen($v);
    if($len>=3 && $len<=16){
        echo "found: $v\n";
    }
}
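One caveat with splitting on a single space is that punctuation stays attached to a token and counts toward its length. A possible variant, sketched here on a made-up sample with hypothetical variable names, splits on runs of non-word characters instead and then applies the same strlen() check:

```php
<?php
$mystr = 'rjm1986 * MAX, K1ra; ab';   // hypothetical sample with punctuation

// Split on runs of non-word characters and drop empty pieces.
$tokens = preg_split('/\W+/', $mystr, -1, PREG_SPLIT_NO_EMPTY);

// Keep tokens of 3 to 16 bytes; array_values() reindexes the result from 0.
$kept = array_values(array_filter($tokens, function ($v) {
    $len = strlen($v);
    return $len >= 3 && $len <= 16;
}));

print_r($kept);
```

Note that strlen() counts bytes, so multibyte input would need mb_strlen() instead.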