RegEx in cfml to match whole word in uppercase followed by line feed

2023-03-05 20:33

I've been struggling with this all day, as regular expressions aren’t my favourite topic.

I’m trying to find when the following happens:

Complete word that is in uppercase
Followed by a space
Followed by a line feed
Followed by another space
Followed by another word that starts with an uppercase letter

While testing I found that if I defined what the capital letter should be (in this case S):

[A-Z][A-Z]+ \n S

It would match, however if I change it to something like

[A-Z][A-Z]+ \n [A-Z]

It now picks up any text that contains a line feed, regardless of whether it is preceded by an uppercase word.

Am I missing something obvious?

Below is some sample text I’m using (hopefully it pastes ok without losing its line feeds). I’m trying to find the headings (in uppercase) so that I can make some changes to them.

 People who have a disability that would prevent them from performing required 
 basic life support skills are advised that they will not be able to achieve the 
 unit of competency. 
 ENROLLING IN FIRST AID UNITS OF COMPETENCY 
 If you are seeking to enrol in a First Aid unit of competency e.g. HLTFA301B 
 Apply first aid, you are advised that to complete the unit you must be able to 
 perform basic life support skills, for example control bleeding and perform 
 cardiopulmonary resuscitation (CPR). If you have a disability that would prevent 
 you from performing required basic life support skills you are advised that you 
 will not be able to achieve the unit of competency. 
 REQUIREMENTS AND ADVICE FOR STUDENTS PARTICIPATING IN WORK PLACEMENT 
 Some or all of the following advice will apply to you, depending on your course 
 and the type of organisation where you will be undertaking work placement.

Cheers Mark

There are two primary problems. First, the heading lines contain spaces and possibly other characters, so [A-Z] on its own is not enough. At a minimum you need to include a space in the set: [A-Z ]. If the headings can contain other characters, such as numbers or punctuation, you will need to add those to the set as well. Second, as karora mentioned, you will need to allow for variations in the line breaks.
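As a rough illustration of why the set has to cover every character a heading can contain, the sketch below tests a made-up heading that includes a digit. The heading text is hypothetical and is not taken from the sample; the anchors ^ and $ are used only to keep the comparison simple.

```cfml
<!--- Hypothetical heading containing a digit; not taken from the sample text --->
<cfset heading = "SECTION 2 REQUIREMENTS" />

<!--- reFind returns the position of the first match, or 0 if there is none --->
<cfset lettersOnly = reFind("^[A-Z ]+$", heading) />
<cfset withDigits = reFind("^[A-Z0-9 ]+$", heading) />

<cfoutput>
Letters and spaces only: #lettersOnly#<br />
Letters, digits and spaces: #withDigits#<br />
</cfoutput>
```

The first call should return 0 because the digit is not in the set, while the second should return 1.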

Here is an example that also uses a positive lookahead, so the text after the heading is checked but not included in the result. You can then probably just use the match results array directly in the next step of your code.

<cfset matches = reMatch(" [A-Z ]+(?= \r?\n [A-Z])", teststring) />
<cfdump var="#matches#" />
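To try this without the full document, here is a minimal self-contained sketch. The value of teststring is an assumption: three lines adapted from the sample in the question, joined with line feeds, each with a leading and trailing space.

```cfml
<!--- Assumed test data: three lines joined with LF, each ending in a space --->
<cfset lf = chr(10) />
<cfset teststring = " unit of competency. " & lf
    & " ENROLLING IN FIRST AID UNITS OF COMPETENCY " & lf
    & " If you are seeking to enrol in a First Aid unit " />

<!--- A space, then a run of capitals and spaces, but only when followed by
      space + line break + space + capital (checked by the lookahead, not consumed) --->
<cfset matches = reMatch(" [A-Z ]+(?= \r?\n [A-Z])", teststring) />

<!--- Each match keeps its leading space, so trim it before use --->
<cfloop array="#matches#" index="heading">
    <cfoutput>[#trim(heading)#]<br /></cfoutput>
</cfloop>
```

With this input the array should hold a single element, the ENROLLING heading. Note that a short run of capitals at the end of an ordinary line (an acronym, for example) would also match, so the results may need a sanity check.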

When you are matching a line break, make sure you consider that line feeds may (or may not) be preceded by a carriage return, especially in text files from Windows.

So you might want something like:

"[ ][A-Z]+\r?\n[A-Z]"

Make sure you don't leave random spaces in your regex, because these will very likely be treated as literal spaces. I've enclosed the (only) space in the expression above in [ ] to make it clearer that it's part of the regex, and I've enclosed the whole regex in " characters because you probably want that. The [ ] around that space should not be needed, though.

The ? following a match means "0 or 1 of the preceding", so in this case we want a \n optionally preceded by a \r.
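A small sketch of the difference the optional \r makes is below. The two input strings are made up for illustration: the same pair of lines joined Unix-style (LF) and Windows-style (CR LF), using the space, line break, space layout described in the question.

```cfml
<!--- Hypothetical inputs: identical text, different line endings --->
<cfset unixText = " SOME HEADING " & chr(10) & " Body text" />
<cfset windowsText = " SOME HEADING " & chr(13) & chr(10) & " Body text" />

<!--- One pattern requires a bare \n; the other allows an optional \r before it --->
<cfset strictPattern = "[A-Z]+ \n [A-Z]" />
<cfset tolerantPattern = "[A-Z]+ \r?\n [A-Z]" />

<cfoutput>
Strict, LF: #arrayLen(reMatch(strictPattern, unixText))#<br />
Strict, CRLF: #arrayLen(reMatch(strictPattern, windowsText))#<br />
Tolerant, LF: #arrayLen(reMatch(tolerantPattern, unixText))#<br />
Tolerant, CRLF: #arrayLen(reMatch(tolerantPattern, windowsText))#<br />
</cfoutput>
```

The strict pattern should find one match in the LF text and none in the CR LF text, while the tolerant pattern should find one match in both.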
