How do I split between numbers and characters with regex?
I have a string containing weekdays and opening hours. How do I split it into lines using a regular expression? An example of such a string is:
Mån - Tor6:30 - 22:00Fre6:30 - 20:00Lör9:00 - 18:00Sön10:00 - 19:00
I want to split between a lowercase letter and a number, and between a number and a capital letter, so that the result is:
Mån - Tor
6:30 - 22:00
Fre
6:30 - 20:00
Lör
9:00 - 18:00
Sön
10:00 - 19:00
Thanks in advance!
Split on
(?<=\d)(?=\p{L})|(?<=\p{L})(?=\d)
For example, in C#:
splitArray = Regex.Split(subjectString, @"(?<=\d)(?=\p{L})|(?<=\p{L})(?=\d)");
or in PHP:
$result = preg_split('/(?<=\d)(?=\p{L})|(?<=\p{L})(?=\d)/u', $subject);
or in Java:
String[] splitArray = subjectString.split("(?<=\\d)(?=\\p{L})|(?<=\\p{L})(?=\\d)");
or in Perl:
@result = split(m/(?<=\d)(?=\p{L})|(?<=\p{L})(?=\d)/, $subject);
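The same lookaround split can be sketched in Python. This is an illustration under one assumption: Python's built-in re module does not support \p{L}, so the class [^\W\d_] stands in for "letter" here (it is slightly broader than \p{L}). The names LETTER, BOUNDARY and split_opening_hours are made up for this example.

```python
import re

# Assumption: [^\W\d_] ("word character that is neither a decimal digit nor
# an underscore") is used as a stand-in for \p{L}, which `re` lacks.
LETTER = r"[^\W\d_]"

# Zero-width boundary: a digit followed by a letter, or a letter followed by a digit.
BOUNDARY = re.compile(rf"(?<=\d)(?={LETTER})|(?<={LETTER})(?=\d)")


def split_opening_hours(text: str) -> list:
    """Split text at every digit/letter boundary.

    Splitting on empty matches requires Python 3.7 or later.
    """
    return BOUNDARY.split(text)


subject = "Mån - Tor6:30 - 22:00Fre6:30 - 20:00Lör9:00 - 18:00Sön10:00 - 19:00"
print("\n".join(split_opening_hours(subject)))
```

Because the pattern only matches empty positions, no characters are consumed and every piece of the original string survives the split.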
If and only if a number is a code point with the \pN property, then a nonnumber is any code point lacking said property, which one writes as \PN.
Some regex dialects pusillanimously insist on embracing those, as \p{N} or \P{N}, which is bunk, but you’re a prisoner of your language designer’s whims and foibles, insecurities or ignorance.
In those regex dialects of a more readable bent, you may write those in a more liberal and more legible fashion, as \p{Number} and \P{Number}, respectively.
If you mean a decimal number, which is not the same as a number, you may write that as \p{Nd}, with its complement therefore \P{Nd}. The legible versions of those are \p{Decimal_Number} and \P{Decimal_Number}. In some programming languages, this is what the \d regex convenience abbreviation stands for.
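To make the difference between "number" and "decimal number" concrete, here is a small Python sketch. Python's built-in re module has no \p{...} syntax, so the general category is looked up with unicodedata.category instead; the helper names is_number and is_decimal_number are invented for this example. For str patterns, Python documents \d as matching characters in category Nd.

```python
import re
import unicodedata

# A few sample characters, written as escapes so the code points are unambiguous.
SAMPLES = [
    "7",        # DIGIT SEVEN
    "\u0663",   # ARABIC-INDIC DIGIT THREE
    "\u2167",   # ROMAN NUMERAL EIGHT
    "\u00bd",   # VULGAR FRACTION ONE HALF
]


def is_number(ch: str) -> bool:
    """Rough equivalent of \\p{N}: general category Nd, Nl or No."""
    return unicodedata.category(ch).startswith("N")


def is_decimal_number(ch: str) -> bool:
    """Rough equivalent of \\p{Nd}."""
    return unicodedata.category(ch) == "Nd"


for ch in SAMPLES:
    print(
        unicodedata.name(ch),
        unicodedata.category(ch),
        "matches \\d" if re.fullmatch(r"\d", ch) else "does not match \\d",
    )
```

All four samples are numbers in the \p{N} sense, but only the first two are decimal numbers, and only those two match \d.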
There are four general categories related to numbers:
N Number
Nd Decimal_Number (also Digit)
Nl Letter_Number
No Other_Number
and there are numerous other categories related to numbers:
POSIX-style classes: Alnum, PosixAlnum, XPosixAlnum
General categories: Number, Decimal_Number, Letter_Number, Other_Number, General_Category:Number, General_Category:Decimal_Number, General_Category:Letter_Number, General_Category:Other_Number
Bidi classes: Bidi_Class:Arabic_Number, Bidi_Class:European_Number
Blocks: Block:Aegean_Numbers, Block:Ancient_Greek_Numbers, Block:Common_Indic_Number_Forms, Block:Counting_Rod_Numerals, Block:Cuneiform_Numbers_And_Punctuation, Block:Enclosed_Alphanumeric_Supplement, Block:Enclosed_Alphanumerics, Block:Mathematical_Alphanumeric_Symbols, Block:Number_Forms, Block:Rumi_Numeral_Symbols
Block shorthands: InAegeanNumbers, InAncientGreekNumbers, InCommonIndicNumberForms, InCountingRodNumerals, InCuneiformNumbersAndPunctuation, InEnclosedAlphanumerics, InEnclosedAlphanumericSupplement, InMathematicalAlphanumericSymbols, InNumberForms, InRumiNumeralSymbols
Line breaking: Line_Break:Infix_Numeric, Line_Break:Numeric, Line_Break:Postfix_Numeric, Line_Break:Prefix_Numeric
Sentence and word breaking: Sentence_Break:Numeric, Word_Break:ExtendNumLet, Word_Break:MidNum, Word_Break:MidNumLet, Word_Break:Numeric
Numeric types: Numeric_Type:De, Numeric_Type:Decimal, Numeric_Type:Di, Numeric_Type:Digit, Numeric_Type:None, Numeric_Type:Nu, Numeric_Type:Numeric
Numeric values (each written Numeric_Value:N): NaN; every integer from 0 to 50; 60, 70, 80, 90; 100, 200, 300, 400, 500, 600, 700, 800, 900; 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000; 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000; 100000; 100000000; 1000000000000
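Two of the property families listed above, Numeric_Type and Numeric_Value, can be explored from Python's unicodedata module. The sketch below is only an approximation: numeric_type and numeric_value are hypothetical helper names, and the mapping assumes that the module's decimal, digit and numeric lookups line up with the Decimal, Digit and Numeric types.

```python
import unicodedata


def numeric_type(ch: str) -> str:
    """Approximate Numeric_Type via the three unicodedata lookups.

    Each lookup returns the supplied default (None) when the character
    has no value of that kind, so the checks go from narrowest to widest.
    """
    if unicodedata.decimal(ch, None) is not None:
        return "Decimal"
    if unicodedata.digit(ch, None) is not None:
        return "Digit"
    if unicodedata.numeric(ch, None) is not None:
        return "Numeric"
    return "None"


def numeric_value(ch: str):
    """Numeric_Value as a float, or None when the character has none."""
    return unicodedata.numeric(ch, None)


# DIGIT SEVEN, SUPERSCRIPT TWO, VULGAR FRACTION ONE HALF, ROMAN NUMERAL EIGHT, LATIN SMALL LETTER X
for ch in ["7", "\u00b2", "\u00bd", "\u2167", "x"]:
    print(repr(ch), numeric_type(ch), numeric_value(ch))
```

The superscript two is a useful test case: it has a digit value but no decimal value, so it is a digit without being a decimal number.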
So... just which particular sort of “numbers” did you happen to be interested in? :)
This works in Ruby as a matching pattern (rather than a split) that captures each day label together with its hours: (\D+)(\d+:\d+ - \d+:\d+)
Your example on Rubular: http://rubular.com/r/0XqCYmNdnJ
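For comparison, the same matching pattern behaves like this in Python. It is a sketch, not part of the Ruby answer: re.findall returns one tuple per match, and the names PAIR and parse_opening_hours are invented for this example.

```python
import re

# Group 1: a run of non-digits (the day label).
# Group 2: an "H:MM - H:MM" time range.
PAIR = re.compile(r"(\D+)(\d+:\d+ - \d+:\d+)")


def parse_opening_hours(text: str) -> list:
    """Return a list of (days, hours) tuples found in text."""
    return PAIR.findall(text)


subject = "Mån - Tor6:30 - 22:00Fre6:30 - 20:00Lör9:00 - 18:00Sön10:00 - 19:00"
for days, hours in parse_opening_hours(subject):
    print(days)
    print(hours)
```

Unlike a plain split, this keeps each day label paired with its hours, at the cost of assuming the hours always follow the "H:MM - H:MM" shape.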
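If you search for ([a-z])(\d) and replace it with $1\n$2, it should handle the boundary between a lowercase letter and a number. The boundary between a number and a capital letter needs a second replacement of the same shape, searching for (\d)([A-Z]). Without knowing your programming language and environment, it's hard to give you a more direct answer.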
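As an illustration of the search-and-replace approach, here is a Python sketch. It assumes two passes, one per boundary type described in the question, and uses Python's \1 and \2 backreferences where other tools write $1 and $2. The function name insert_line_breaks is made up for this example.

```python
import re


def insert_line_breaks(text: str) -> str:
    """Insert newlines at lowercase->digit and digit->capital boundaries."""
    # Pass 1: ASCII lowercase letter directly followed by a digit.
    text = re.sub(r"([a-z])(\d)", r"\1\n\2", text)
    # Pass 2: digit directly followed by an ASCII capital letter.
    text = re.sub(r"(\d)([A-Z])", r"\1\n\2", text)
    return text


subject = "Mån - Tor6:30 - 22:00Fre6:30 - 20:00Lör9:00 - 18:00Sön10:00 - 19:00"
print(insert_line_breaks(subject))
```

Note that [a-z] and [A-Z] cover ASCII letters only. That is enough for the sample string, where every letter touching a digit is ASCII, but a day name ending in a letter such as ö directly before the hours would not be split.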