Regular expression, matching anything or nothing?

I'm trying to do a regex for embedding YouTube videos.

Right now, this is the basic working thing I have:

http:\/\/www\.youtube\.com\/watch\?.*v=([a-z0-9-_]+)

It matches the standard YouTube URL and captures the unique video code. Unfortunately, this proves to be asking too much of my users. They enter the URL without the http://, without the www., or without both, and sometimes with /v/* in place of /watch?v=*. In every one of these cases the pattern fails to match.

What I'm trying to do is allow anything before and after 'youtube.com' so that it would be near perfect regardless of the input. Examples below:

http://www.youtube.com/watch?v=([a-z0-9-_]+) --- the normal, unmodified input
http://youtube.com/watch?v=([a-z0-9-_]+) --- missing WWW
www.youtube.com/watch?v=([a-z0-9-_]+) --- missing HTTP
youtube.com/watch?v=([a-z0-9-_]+) --- missing HTTP and WWW
http://www.youtube.com/v/([a-z0-9-_]+) --- /v/ substituted for watch?v=
http://youtube.com/v/([a-z0-9-_]+) --- /v/ substituted for watch?v= AND missing WWW
www.youtube.com/v/([a-z0-9-_]+) --- /v/ substituted for watch?v= AND missing HTTP
youtube.com/v/([a-z0-9-_]+) --- /v/ substituted for watch?v= AND missing HTTP and WWW

This is one alteration that I thought should work (allowing any character), but maybe I'm missing something?

[.]+\youtube\.com\/[.]+([a-z0-9-_]+)

I apologize if I'm vague or ignorant. I've tried several alterations and none of them work. Perhaps what I'm seeking is impossible. I've honestly tried to understand regex; maybe it's the hour or maybe it's just me, but I cannot decipher it. It's beyond cryptic from my point of view.

My sincere thanks to anyone who spares a minute.


This should do it...

(?:http://)?(?:www\.)?youtube\.com/(?:watch\?v=|v/)([\w-]+)

RegExr.

This will match the URLs, and place the YouTube video id into capturing group 1.

It matches an optional http://, then an optional www., then always youtube.com/, then either watch?v= or v/, and finally captures one or more characters that are either in the \w class (letters, digits, underscore) or a -.
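As a quick check of the pattern above, here is a minimal Python sketch. The helper name extract_video_id and the sample video codes are made up for illustration, and re.search is used on the assumption that the URL may sit anywhere inside the user's input.

```python
import re

# The pattern from the answer above, unchanged.
YOUTUBE_RE = re.compile(
    r"(?:http://)?(?:www\.)?youtube\.com/(?:watch\?v=|v/)([\w-]+)"
)

def extract_video_id(text):
    """Return the video code from capturing group 1, or None if no match.

    Hypothetical helper, not part of any library.
    """
    match = YOUTUBE_RE.search(text)
    return match.group(1) if match else None

# Made-up sample inputs covering the variants listed in the question.
samples = [
    "http://www.youtube.com/watch?v=abc-123_X",
    "http://youtube.com/watch?v=abc-123_X",
    "www.youtube.com/watch?v=abc-123_X",
    "youtube.com/watch?v=abc-123_X",
    "http://www.youtube.com/v/abc-123_X",
    "youtube.com/v/abc-123_X",
]
for sample in samples:
    print(sample, "->", extract_video_id(sample))
```

Because \w already covers upper- and lower-case letters, no case-insensitive flag is needed for the video code itself.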


In a character class, . is not a special character, it means literally a dot. [.]+ thus means "one or more dots". I don't know about any other problems you may have, but it should be .+ (or probably .*, since "youtube" can be the string's start) instead.
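The difference is easy to see in a short Python sketch. The sample strings and variable names are invented for illustration.

```python
import re

# Inside a character class, "." is a literal dot.
dots_only = re.compile(r"[.]+")
# Outside a class, "." matches any character (except a newline by default).
any_chars = re.compile(r".+")

dots_match_dots = bool(dots_only.fullmatch("..."))   # True
dots_match_text = bool(dots_only.fullmatch("www"))   # False
any_match_text = bool(any_chars.fullmatch("www"))    # True

# With nothing before "youtube.com", ".+" has no character to consume,
# while ".*" is happy to match the empty prefix.
url = "youtube.com/v/abc123"
plus_matches = bool(re.search(r".+youtube\.com", url))  # False
star_matches = bool(re.search(r".*youtube\.com", url))  # True

print(dots_match_dots, dots_match_text, any_match_text)
print(plus_matches, star_matches)
```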


Let's say I have something like this:

{e114dgfg084-4ddf1-21aea7}
and something like this:

{}

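To catch both, use * so that the part between the braces may be empty:

{[0-9a-z\-]*}

(The sample above contains the letter g, so a hex-only class such as [0-9a-f\-] would not match it.)

Another solution is to list the empty case as its own alternative:

({}|{[a-zA-Z0-9\-]+})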
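A minimal Python sketch comparing the two approaches, using the alphanumeric class from the second pattern in both. The braces are escaped here for clarity, and the variable names are made up.

```python
import re

# "*" lets the part between the braces be empty.
star_form = re.compile(r"\{[a-zA-Z0-9\-]*\}")
# Alternation: either literally "{}" or braces with at least one character.
alt_form = re.compile(r"(\{\}|\{[a-zA-Z0-9\-]+\})")

filled = "{e114dgfg084-4ddf1-21aea7}"
empty = "{}"

for pattern in (star_form, alt_form):
    print(bool(pattern.fullmatch(filled)), bool(pattern.fullmatch(empty)))
```

Both forms accept the same strings; the * version is simply shorter.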