Regular Expressions Using Python's Re

2023-02-25 17:02

I have a file full of lines similar to this:

line = 'Weclome - MIsiti International,0,0,-9,0,'

I want to replace 'Weclome - MIsiti International' with the string '1'.

Here is my code:

import re
exp = re.compile(r"([\./A-Za-z\s\-]+)")
print(exp.sub("1", line))

Unfortunately I get the following output:

1,0,0,19,0,

Which is incorrect. I thought this would work:

exp = re.compile(r"([\./A-Za-z\s\-[^0-9]]+)")
print(exp.sub("1", line))

But it does not:

[]

Can someone tell me what I am doing wrong here?

Why do you need a regular expression?

>>> line = 'Weclome - MIsiti International,0,0,-9,0,'
>>> s=line.split(",")
>>> s[0]="1"
>>> ','.join(s)
'1,0,0,-9,0,'

exp=re.compile(r"([\./A-Za-z\s\-]+)")

There is no need to put '\' before '-' inside brackets if you place the '-' where it cannot be read as a range operator, for example first or last in the class.

Also, there is no need to put '\' before the dot '.' inside brackets, because a dot inside brackets loses its special meaning.

So, instead of exp=re.compile(r"([\./A-Za-z\s\-]+)"), write exp=re.compile(r"([./A-Za-z\s-]+)").

As for exp=re.compile(r"([\./A-Za-z\s\-[^0-9]]+)"), it doesn't match at all, because the same rule applies to '[' as to '-': placed where it can't have its special meaning, it loses that meaning and is treated as a literal character.

So the '[' before '^0-9]' is a literal bracket, not the beginning of a nested class (the '^' after it is literal too, since it only negates when it comes first in a class). Consequently, the ']' right after '^0-9' closes the class opened by the first '[' in '[\./A-Z...', and the last ']' followed by '+' means "the character ] one or more times".
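
A minimal sketch of how Python reads that pattern, assuming Python 3 (where the unescaped '[' inside a class triggers a FutureWarning, silenced here). The input "x]] y" is made up for illustration:

```python
import re
import warnings

line = 'Weclome - MIsiti International,0,0,-9,0,'

# Python 3.7+ warns about the unescaped '[' inside the class ("possible
# nested set"); the warning is silenced so only the matching behaviour shows.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    exp = re.compile(r"([\./A-Za-z\s\-[^0-9]]+)")

# The pattern is read as: one character from the class
# [\./A-Za-z\s\-[^0-9]  followed by one or more literal ']'.
unchanged = exp.sub("1", line)    # line contains no ']', so nothing matches
replaced = exp.sub("1", "x]] y")  # hypothetical input: "x]]" matches

print(unchanged == line)  # True
print(replaced)           # 1 y
```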

import re

line = 'Weclome - MIsiti International,0,0,-9,0,'

# anchor the match to the start of the line
exp = re.compile(r"(^[./A-Za-z\s-]+)")
print(exp.sub("1", line))

# or require a comma right after the match

exp = re.compile(r"([./A-Za-z\s-]+(?=,))")
print(exp.sub("1", line))

result

1,0,0,-9,0,
1,0,0,-9,0,

Character classes cannot be nested. The latter example will eat '[', '^', etc. Would it not work if you simply did r"(^[^,0-9]+)", i.e. anything at the start that is not a comma or 0-9?
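
A quick check of that suggestion, as a sketch only. The second input line is hypothetical and is there to show one limit of the approach: if the first field itself contains a digit, only the part before that digit is replaced.

```python
import re

line = 'Weclome - MIsiti International,0,0,-9,0,'
exp = re.compile(r"^[^,0-9]+")

first = exp.sub("1", line)

# Hypothetical line whose first field contains digits: the match stops
# at the first digit, so the rest of the field is left behind.
second = exp.sub("1", "Route 66 Diner,0,0,-9,0,")

print(first)   # 1,0,0,-9,0,
print(second)  # 166 Diner,0,0,-9,0,
```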

Your first regex is good, but you need to anchor it to the beginning of the line and add the 'm' (multiline) flag, like so:

import re
line = 'Weclome - MIsiti International,0,0,-9,0,'
exp = re.compile(r"^([./A-Za-z\s\-]+)", re.M)
print (exp.sub("1",line))

With re.M, ^ matches at the start of every line, so this solution fixes an entire file full of lines in one operation.
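
To illustrate the multi-line case, here is a small sketch with two lines standing in for the file contents. The second line is invented for the example.

```python
import re

# Two lines standing in for the file; the second one is made up.
text = (
    'Weclome - MIsiti International,0,0,-9,0,\n'
    'Another - Name,1,2,-3,4,\n'
)

exp = re.compile(r"^([./A-Za-z\s\-]+)", re.M)
fixed = exp.sub("1", text)

print(fixed)
# 1,0,0,-9,0,
# 1,1,2,-3,4,
```

One thing to keep in mind: \s also matches a newline, so this relies on every leading text field being followed by a comma. A line made of text only would be merged with the start of the next line.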

Most people are giving you answers <snark>often qualified with "Don't use regex! Regex is evil and comes from Perl! We Python users have transcended mere text manipulation!"</snark> but no one is explaining why you're experiencing this problem.

Your regex is working. It takes every run of letters, whitespace, dots, slashes, or hyphens and turns it into the number 1. The problem is that it thinks the negative sign in -9 is "evil text" to replace as well, which is why -9 becomes 19.
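
Listing the matches makes this visible. A small sketch using the pattern and line from the question:

```python
import re

line = 'Weclome - MIsiti International,0,0,-9,0,'
exp = re.compile(r"([\./A-Za-z\s\-]+)")

# Every run of allowed characters is a separate match, including the
# lone minus sign in front of the 9.
matches = [m.group() for m in exp.finditer(line)]
result = exp.sub("1", line)

print(matches)  # ['Weclome - MIsiti International', '-']
print(result)   # 1,0,0,19,0,
```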

One way to approach this is to provide an anchor for your regex: make it match the commas (or beginning/ending of the string) surrounding the text. So it would see ,text, and turn it into ,1, but would see ,-9, and know that it's not text.

Another approach is to filter based on "does it not contain digits" instead of "does it contain these things I need" - because what if, later, you need to filter out other punctuation marks? Using ,[^0-9,]+, would match "things that aren't digits or commas", which would turn ,text, into ,1, but keep ,-9, the same.

A third approach is to split the string on commas, then test and change each individual segment - probably to see if it contains digits - and then join them back together.
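
A minimal sketch of this split-and-test approach. The function name replace_text_fields and its replacement parameter are made up for the example, and "contains no digits" is assumed to be the test for a text field:

```python
def replace_text_fields(line, replacement="1"):
    """Replace every non-empty comma-separated field that has no digits."""
    fields = line.split(",")
    fixed = [
        replacement if field and not any(ch.isdigit() for ch in field) else field
        for field in fields
    ]
    return ",".join(fixed)


line = 'Weclome - MIsiti International,0,0,-9,0,'
result = replace_text_fields(line)
print(result)  # 1,0,0,-9,0,
```

The empty field after the trailing comma is left alone, so the line keeps its final comma.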

If you choose the first or second approaches, I leave it up to you to write a regex that either matches a leading comma or the beginning of a string (and a trailing comma or the end of the string - both are similar). It's not terribly difficult.
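
For reference, one possible way to write those two patterns. This is a sketch, not the only answer: it uses lookarounds instead of matching the commas themselves, and the names allowed_chars and no_digits are made up here.

```python
import re

line = 'Weclome - MIsiti International,0,0,-9,0,'

# (?:^|(?<=,))  the field starts at the beginning of the string or right
#               after a comma
# (?=,|$)       the field ends right before a comma or at the end of the string
allowed_chars = re.compile(r"(?:^|(?<=,))[./A-Za-z\s-]+(?=,|$)")  # first approach
no_digits = re.compile(r"(?:^|(?<=,))[^0-9,]+(?=,|$)")            # second approach

result_allowed = allowed_chars.sub("1", line)
result_no_digits = no_digits.sub("1", line)

print(result_allowed)    # 1,0,0,-9,0,
print(result_no_digits)  # 1,0,0,-9,0,
```

Lookarounds are used because they do not consume the commas. A pattern such as ,[^0-9,]+, swallows the comma that the next field would need as its own leading anchor, so two text fields in a row would not both be replaced.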
