
Amazon short URL regex... why can't I get this to work?

Here is a regex I got from a blog post by Noah Coad. I can't link to it because I am new here... just google "amazon short url" and click on his post.

According to that post, it is supposed to extract the unique product ID from any Amazon URL so you can shorten it, or use it to pull info from the Amazon APIs.

here is the sample code i am trying to use to get it to work:

$example_url = 'http://www.amazon.com/dp/1430219483/?tag=codinghorror-20';

$reg = '(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)';

echo 'test<br/>';

echo preg_match($reg,$example_url);

and here is my output:


Warning: preg_match() [function.preg-match]: Unknown modifier '(' in /Users/apple/Sites/amazon/asin_extract.php on line 14

Thanks so much! This is my first time posting on this site, where I have already found countless answers.

On second thought... I take back some of my thanks because of this painful first-time submission process. I had to trim this question since the site thinks my regex patterns are URLs.

Your regex needs delimiters: a character that is present at the beginning and at the end of the pattern.
This comment on the PHP manual is interesting on that point :-)

'/' is often used, but some people prefer '#', which is nicer for URLs because the slashes in the pattern don't have to be escaped.

So:

$reg = '#(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)#';

And here is the full code, slightly modified to capture the results:

$example_url = 'http://www.amazon.com/Professional-Visual-Studio-System-Programmer/dp/0764584367/ref=sr_1_1/104-4732806-7470339?ie=UTF8&s=books&qid=1179873697&sr=8-1';
$reg = '#(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)#';
echo 'test<br/>';

$matches = array();
echo preg_match($reg, $example_url, $matches);
var_dump($matches);

The output you get from the var_dump is:

  0 => string 'http://www.amazon.com/Professional-Visual-Studio-System-Programmer/dp/0764584367/ref=sr_1_1/104-4732806-7470339?ie=UTF8&s=books&qid=1179873697&sr=8-1' (length=149)
  1 => string '0764584367' (length=10)

And $matches[1] is 0764584367.
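To make the idea reusable, here is a minimal sketch that wraps the pattern above in a function. The name `extract_asin` and the choice to return `null` on no match are my own assumptions for illustration, not something from the blog post; the pattern itself is unchanged.

```php
<?php
// Hypothetical helper (name and return convention are assumptions):
// returns the product ID found after /dp/ or /gp/product/,
// or null when the URL does not match the pattern.
function extract_asin(string $url): ?string
{
    // '#' is the delimiter, so the slashes inside the pattern need no escaping.
    $reg = '#(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)#';

    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    if (preg_match($reg, $url, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

var_dump(extract_asin('http://www.amazon.com/dp/1430219483/?tag=codinghorror-20'));
var_dump(extract_asin('http://example.com/'));
```

The first call should give the ID from the URL in the question, and the second should give `null` because there is no Amazon product path in it.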

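Looks like the problem is that your pattern has no delimiters, so PHP is treating the opening parenthesis as the begin/end regular expression delimiter and reading what follows the matching close parenthesis as modifiers. Here's a sample from the man page: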

$pattern = '/^def/';

If you use a slash as your begin/end delimiter, it'll be rough to write your regular expression, because every slash in the URL has to be escaped. I suggest using the pound sign ('#') instead, as you'll have to escape fewer characters.

Here's what I ended up with:


$example_url = 'http://www.amazon.com/Server-Side-Programming-Techniques-Performance-Scalability/dp/0201704293';

$reg = "#(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)#";

echo 'test<br/>';

echo preg_match($reg, $example_url);
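To show what the delimiter choice costs in practice, here is a small comparison sketch. The variable names `$with_slash` and `$with_hash` are made up for this example; both patterns are the same expression, written once with '/' and once with '#' as the delimiter.

```php
<?php
$url = 'http://www.amazon.com/Server-Side-Programming-Techniques-Performance-Scalability/dp/0201704293';

// With '/' as the delimiter, every literal slash in the pattern must be escaped.
$with_slash = '/(?:http:\/\/(?:www\.){0,1}amazon\.com(?:\/.*){0,1}(?:\/dp\/|\/gp\/product\/))(.*?)(?:\/.*|$)/';

// With '#' as the delimiter, the same pattern can be written without escaping slashes.
$with_hash = '#(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)#';

preg_match($with_slash, $url, $a);
preg_match($with_hash, $url, $b);

// Both forms capture the same product ID in group 1.
var_dump($a[1] === $b[1]);
echo $b[1], "\n";
```

The two patterns behave identically; the '#' version is just easier to read and harder to get wrong.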






