Remove whitespace from a regex search in a file

I'm trying to strip all whitespace from each line before running a regex search on it. The code works, but it keeps raising an error that I'm not sure how to resolve.

elif searchType =='2':
      print "  Directory to be searched: c:\Python27 "
      directory = os.path.join("c:\\","SQA_log")
      userstring = raw_input("Enter a string name to search: ")
      userStrHEX = userstring.encode('hex')
      userStrASCII = ' '.join(str(ord(char)) for char in userstring)
      regex = re.compile(r"(%s|%s|%s)" % ( re.escape( userstring ), re.escape( userStrHEX ), re.escape( userStrASCII )))
      choice = raw_input("Type 1 to search with whitespace. Type 2 to search ignoring whitespace: ")
      if choice == '1':
           for root,dirname, files in os.walk(directory):
              for file in files:
                  if file.endswith(".log") or file.endswith(".txt"):
                     f=open(os.path.join(root, file))
                     for i,line in enumerate(f.readlines()):
                         result = regex.search(line)
                         if regex.search(line):
                            print " "
                            print "Line: " + str(i)
                            print "File: " + os.path.join(root,file)
                            print "String Type: " + result.group()
                            print " "
                     f.close()
      re.purge()
      if choice == '2':
         for root,dirname, files in os.walk(directory):
             for file in files:
                 if file.endswith(".log") or file.endswith(".txt"):
                    f=open(os.path.join(root, file))
                    for i,line in enumerate(f.readlines()):
                        result = regex.search(re.sub(r'\s', '',line))
                        if regex.search(line):
                           print " "
                           print "Line: " + str(i)
                           print "File: " + os.path.join(root,file)
                           print "String Type: " + result.group()
                           print " "
                    f.close()
      re.purge()

This is the error it returns:

Line: 9160
File: c:\SQA_log\13.00.log
String Type: Rozelle07

Line: 41
File: c:\SQA_log\NEWS.txt
String Type: 526f7a656c6c653037

Line: 430
File: c:\SQA_log\README.txt

Traceback (most recent call last):
  File "C:\SQA_log\cmd_simple.py", line 226, in <module>
    SQAST().cmdloop()
  File "C:\Python27\lib\cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "C:\Python27\lib\cmd.py", line 219, in onecmd
    return func(arg)
  File "C:\SQA_log\cmd_simple.py", line 147, in do_search
    print "String Type: " + result.group()
AttributeError: 'NoneType' object has no attribute 'group'


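It appears that regex.search fails on the line with whitespace stripped but succeeds on the original line. result holds the outcome of the first search while the if tests the second, so result can be None inside the if block. One likely cause is that the decimal alternative built with ' '.join(...) contains spaces, so it can never match a line whose whitespace has been removed. If you replace if regex.search(line) with if result: you shouldn't get that error.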

The reason for the error is that re.search returns a special value, None, when it doesn't find any match, instead of a match object. None always evaluates to False inside boolean expressions, so you can use it in an if statement, but it doesn't have any attributes, and that's why result.group() fails when result is None.
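
A minimal sketch of that behaviour, using a hypothetical pattern and made-up sample lines rather than the asker's log files (the print calls are written so they run on both Python 2.7 and Python 3):

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
regex = re.compile(re.escape("Rozelle07"))

hit = regex.search("user Rozelle07 logged in")
miss = regex.search("nothing to see here")

print(hit.group())    # the matched text
print(miss is None)   # search() returns None when nothing matches

# Guard on the result itself before calling .group().
for line in ["user Rozelle07 logged in", "nothing to see here"]:
    result = regex.search(line)
    if result:
        print("String Type: " + result.group())
```

Testing result directly means the if and the .group() call always refer to the same search.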

BTW: Python's re module has no gsub function. re.sub(r'\s', '', line) already removes every whitespace character in line, not just the first one, because re.sub replaces all occurrences unless you pass a count. Using r'\s+' gives the same result with fewer substitutions.
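
A quick check of how re.sub treats repeated matches, using a made-up sample line (the variable names are only for illustration):

```python
import re

# Made-up sample line containing spaces, a tab and a newline.
line = "52 6f 7a\t65 6c\n"

all_removed = re.sub(r'\s', '', line)           # count defaults to 0: replace every match
first_only = re.sub(r'\s', '', line, count=1)   # replace only the first match
runs_removed = re.sub(r'\s+', '', line)         # whole runs at once, same result

print(repr(all_removed))
print(repr(first_only))
print(all_removed == runs_removed)
```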

Fixed code:

for i,line in enumerate(f.readlines()):
    result = regex.search(re.sub(r'\s+', '', line))
    if result:
        print ...


I don't understand what you are doing here:

result = regex.search(re.sub(r'\s', '',line))

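You strip the whitespace from the line and search the stripped copy, but the if that follows tests the original line. So what are you actually searching for? The error message is quite clear: result is None, so there is nothing to call .group() on.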
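
One way the two searches can disagree, sketched with a shortened, hypothetical search string and a hypothetical helper named search_line: the decimal form built with ' '.join(...) contains spaces, so a line can match as written but not after its whitespace is removed.

```python
import re

def search_line(regex, line, ignore_whitespace=False):
    """Hypothetical helper: return the match object (or None) for one line."""
    if ignore_whitespace:
        line = re.sub(r'\s', '', line)
    return regex.search(line)

# Shortened stand-in for the user's search string (an assumption).
userstring = "Ro"
ascii_form = ' '.join(str(ord(ch)) for ch in userstring)  # "82 111"
regex = re.compile("(%s|%s)" % (re.escape(userstring), re.escape(ascii_form)))

line = "codes: 82 111\n"
original = search_line(regex, line)                          # matches "82 111"
stripped = search_line(regex, line, ignore_whitespace=True)  # "codes:82111" has no match

print(original is not None)
print(stripped is None)
```

Because both checks go through the same helper, the caller only ever tests the match object it is about to use.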
