Removing numbers and separators from a numbered list

I often deal with lists that users submit to a website. A list usually looks like this:

  1. Item
  2. Item

The pattern is usually a number followed by a separator (it can be "-", "\", ".", or any other typical separator). There can be one or more spaces between the number and the separator, and between the separator and the list item. Sometimes there is no number in front of the list item, and in that case nothing needs to be done. Sometimes there is a number but no separator.

Is there a way to strip out the number and/or the separator altogether using a regular expression?


This will match a number and an optional separator at the beginning of a line:

^\d+\s*[-\\.)]?\s+

Replace each match with an empty string (how you do that depends on the language you are using).

You might have to add more characters to the character class to match other possible separators.
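As a rough illustration of the replacement step, here is a minimal Python sketch. It assumes each list item arrives as its own line; the function name `strip_list_prefix` and the sample inputs are made up for this example.

```python
import re

# Pattern from the answer above: leading digits, an optional separator
# from the character class, then at least one whitespace character.
LIST_PREFIX = re.compile(r"^\d+\s*[-\\.)]?\s+")


def strip_list_prefix(line: str) -> str:
    """Remove a leading number and optional separator from a single line.

    Hypothetical helper for illustration; lines without a leading
    number are returned unchanged.
    """
    return LIST_PREFIX.sub("", line)


# Sample inputs (invented for this example).
lines = ["1. Item", "2 - Item", "3 Item", "Item"]
cleaned = [strip_list_prefix(line) for line in lines]
print(cleaned)  # ['Item', 'Item', 'Item', 'Item']
```

Lines that begin with whitespace, such as an indented "  1. Item", are not matched by the pattern as written. Calling `line.lstrip()` first, or starting the pattern with `^\s*`, is one way to handle that.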

A good source for learning regular expressions: http://www.regular-expressions.info/


(?=\d*\s*[-\\.]?\s*)([a-zA-Z\s*]+)

You can view the answer here: RegExr

Explanation:

(?=...) - a lookahead: checks that the enclosed pattern can follow, without consuming any characters
\d* - matches 0 or more digits
\s* - matches 0 or more whitespace characters after the number
[-\\.]? - matches 0 or 1 occurrence of '-', '.', or '\'
\s* - matches 0 or more whitespace characters after that
([a-zA-Z\s*]+) - matches and captures one or more letters, whitespace characters, or '*' (the item you need to extract)
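The capture-group idea can be sketched in Python as follows. This is a variant, not the exact pattern above: it assumes the number and separator are consumed as an optional prefix rather than placed in a lookahead, because a lookahead consumes nothing and the group would then only match from the first letter or whitespace character onward. The name `extract_item` and the sample inputs are hypothetical.

```python
import re

# Variant pattern (an assumption, not the answer's original):
#   ^\s*                      optional leading whitespace
#   (?:\d+\s*[-\\.)]?\s+)?    optional number, optional separator, whitespace
#   (.*)$                     the rest of the line, captured as the item
ITEM = re.compile(r"^\s*(?:\d+\s*[-\\.)]?\s+)?(.*)$")


def extract_item(line: str) -> str:
    """Return the item text of one line, without any number or separator.

    Hypothetical helper for illustration; if the pattern does not match
    (for example, multi-line input), the line is returned unchanged.
    """
    match = ITEM.match(line)
    return match.group(1) if match else line


for sample in ["1. Item", "2 - Item", "  3 Item two", "Item"]:
    print(repr(extract_item(sample)))
```

Capturing `.*` instead of `[a-zA-Z\s*]+` keeps items that contain digits or punctuation intact; whether that is desirable depends on what the submitted lists look like.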