
Count parentheses in a file with a Python program?

I want to write a function that counts how many times the characters (, ), [ and ] are used. If the count of ( equals the count of ) and the count of [ equals the count of ], then I have valid syntax!

My first (disappointing) try:

filename=input("Give a file name:")

def  parenthesis(filename):
    try:
        f=open(filename,'r')
    except (IOError):
        print("The file",filename,"does not exist!False! Try again!")
    else:
         while True:
            for line in filename:
                line=f.readline()
                if line=='(':
                     c1=line.count('(')
                elif line==')':
                     c2=line.count(')')
                elif line=='[':
                     c3=line.count('[')
                elif line==']':
                     c4=line.count(']')
                elif line=='':
                    break

                if c1==c2:
                     print("Line: Valid Syntax")
                elif c1!=c2:
                     print("Line: InValid Syntax")
                elif c3==c4:
                     print("Line: Valid Syntax")
                elif c3!=c4:
                     print("Line: InValid Syntax")
    finally:
        f.close()

parenthesis(filename)


Please excuse the length of this reply.

If I understand you correctly, you want to do simple syntax checking of parentheses to make sure they are balanced correctly. In your question you specify a test based on simple counting, but as others have pointed out, this does not catch things like "([)]".
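To make that limitation concrete, here is a small sketch (the helper name counts_balance is made up for illustration) showing that a pure counting test accepts the string "([)]" even though its brackets are interleaved incorrectly:

```python
def counts_balance(text):
    # Counting-only test, as described in the question: compare totals.
    return (text.count('(') == text.count(')')
            and text.count('[') == text.count(']'))

print(counts_balance('([)]'))  # True, although the nesting is wrong
print(counts_balance('(]'))    # False
```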

I'd also like to offer some constructive criticism on other aspects of your code.

To start with, it is better to get the filename from the command line, and not to prompt for it. This is so that you can easily run the program repeatedly, when developing it, without having to type in the filename all the time. This is your way:

$ python foo.py
Give a file name:data
[some output]
$ python foo.py
Give a file name:data
[some output]
$ python foo.py
Give a file name:data
[some output]

You need to type in the filename every time. You don't need to type in the command to run the program more than once. After the first time, you can use the arrow key to get it from the shell's command history. If you get the filename from the command line, you can do this instead:

$ python foo.py testfile
[some output]
$ python foo.py testfile
[some output]
$ python foo.py testfile
[some output]

This way, when you test the second time, you don't need to type more than the up arrow and Enter keys. This is a small convenience, but it is important: when you're developing software, even small things can start to annoy you. It's like a large grain of sand under your foot when going on a long walk: you won't even notice it for the first couple of kilometers, but after a few more, you're bleeding.

In Python, to access the command line arguments, you need the sys.argv list. The relevant changes to your program:

import sys
filename = sys.argv[1]
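
If the program is started without an argument, sys.argv[1] raises an IndexError. A minimal sketch of a guard, assuming a hypothetical helper named get_filename that is not part of the original program:

```python
def get_filename(argv):
    # argv[0] is the script name; the file to check is expected at argv[1].
    if len(argv) < 2:
        return None
    return argv[1]

# Lists standing in for sys.argv in the two cases.
print(get_filename(['foo.py']))          # None
print(get_filename(['foo.py', 'data']))  # data
```

In the real program you would pass sys.argv, and print a usage message to sys.stderr and exit with a non-zero status when the result is None.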

If you do want to prompt, you should use something other than the built-in input function. In Python 2 it interprets whatever the user types as a Python expression, and that causes all sorts of problems. You could read the name with sys.stdin.readline and strip the trailing newline (or use raw_input in Python 2; in Python 3, input itself returns the typed text as a string).

Anyway, we've now got the name of the file safely stored in the filename variable. It's time to do something with it. Your parenthesis function does pretty much everything, and experience has shown that that's often not the best way of doing things. Every function should, instead, do just one thing, but do that well.

I suggest that you should keep the parts of actually opening and closing a file separate from the actual counting. This will simplify the logic of the counting, since it does not need to worry about the rest. In code:

import sys

def check_parentheses(f):
    pass # we'll come to this later

def main():
    filename = sys.argv[1]
    try:
        f = open(filename)  # open() works in Python 2 and 3; file() is Python 2 only
    except IOError:
        sys.stderr.write('Error: Cannot open file %s\n' % filename)
        sys.exit(1)
    check_parentheses(f)
    f.close()

main()

I changed a couple of other things, too, in addition to rearranging things. First, I write the error message to the standard error output. This is the proper way to do it, and means fewer surprises for shell users who redirect the output. (If that doesn't make any sense to you, don't worry about it, just accept it as a given for now.)

Second, if there's an error, I exit the program with sys.exit(1). This tells whoever started the program that it failed. In Unix shell, this lets you do things like the following:

if python foo.py inputfile
then
    echo "inputfile is OK!"
else
    echo "inputfile is BAD!"
fi

The shell script might do something more interesting than just reporting success or failure, of course. It might, for example, remove all broken files, or e-mail whoever wrote them to ask them to fix them. The beauty is that you, who write the checker program, do not need to care. You just set the program exit code properly, and let whoever writes the shell script worry about the rest.

The next step is to actually read the contents of the file. This can be done in various ways. The easiest way is to do it line-by-line, like this:

for line in f:
    # do something with the line

We then need to look at each character in the line:

for line in f:
    for c in line:
        # do something with the character

We're now ready to actually start checking parentheses. As suggested by others, a stack is the appropriate data structure for this. A stack is basically a list (or array) where you add items to one end, and take them out in reverse order. Think of it as a stack of coins: you can add a coin to the top, and you can remove the topmost coin, but you can't remove one from the middle or bottom.

(Well, you can, and it's a neat trick if you do, but computers are simple beasts and get upset by magic tricks.)

We will use a Python list as a stack. To add an item, we use the list's append method, and to remove we use the pop method. An example:

stack = list()
stack.append('(')
stack.append('[')
stack.pop() # this will return '['
stack.pop() # this will return '('

To look at the topmost item in the stack, we use stack[-1] (in other words, the last item in the list).

We use the stack as follows: when we find an opening parenthesis ('('), bracket ('['), or brace ('{'), we put it on the stack. When we find a closing one, we check the topmost item on the stack, and make sure that it matches the closing one. If not, we print an error. Like this:

def check_parentheses(f):
    stack = list()
    for line in f:
        for c in line:
            if c == '(' or c == '[' or c == '{':
                stack.append(c)
            elif c == ')':
                # "not stack" guards against a closer when nothing is open
                if not stack or stack[-1] != '(':
                    print('Error: unmatched )')
                else:
                    stack.pop()
            elif c == ']':
                if not stack or stack[-1] != '[':
                    print('Error: unmatched ]')
                else:
                    stack.pop()
            elif c == '}':
                if not stack or stack[-1] != '{':
                    print('Error: unmatched }')
                else:
                    stack.pop()

This now does find unmatched parentheses of various kinds. We can improve it a little bit by reporting the line and column where we find the problem. We need a line number and column number counter.

def error(c, line_number, column_number):
    print('Error: unmatched %s line %d column %d' % (c, line_number, column_number))

def check_parentheses(f):
    stack = list()
    line_number = 0
    for line in f:
        line_number = line_number + 1
        column_number = 0
        for c in line:
            column_number = column_number + 1
            if c == '(' or c == '[' or c == '{':
                stack.append(c)
            elif c == ')':
                # "not stack" guards against a closer when nothing is open
                if not stack or stack[-1] != '(':
                    error(')', line_number, column_number)
                else:
                    stack.pop()
            elif c == ']':
                if not stack or stack[-1] != '[':
                    error(']', line_number, column_number)
                else:
                    stack.pop()
            elif c == '}':
                if not stack or stack[-1] != '{':
                    error('}', line_number, column_number)
                else:
                    stack.pop()

Note also how I added a helper function, error, to do the actual printing of the error message. If you want to change the error message, you now only need to do it in one place.

Another thing to notice is that the cases for handling the closing symbols are all very similar. We could turn that into a function, too.

def check(stack, wanted, c, line_number, column_number):
    if not stack or stack[-1] != wanted:  # an empty stack means nothing is open
        error(c, line_number, column_number)
    else:
        stack.pop()

def check_parentheses(f):
    stack = list()
    line_number = 0
    for line in f:
        line_number = line_number + 1
        column_number = 0
        for c in line:
            column_number = column_number + 1
            if c == '(' or c == '[' or c == '{':
                stack.append(c)
            elif c == ')':
                check(stack, '(', ')', line_number, column_number)
            elif c == ']':
                check(stack, '[', ']', line_number, column_number)
            elif c == '}':
                check(stack, '{', '}', line_number, column_number)

The program can be refined further, but this should suffice for now. I'll include the whole code at the end.
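
One possible refinement, sketched here under my own assumptions rather than taken from the program above: also report opening symbols that are never closed, and return the list of problems so that the caller can pick an exit code. The names PAIRS and find_problems are invented for this sketch, and io.StringIO stands in for an open file.

```python
import io

# Maps each closing symbol to the opening symbol it must match.
PAIRS = {')': '(', ']': '[', '}': '{'}

def find_problems(f):
    """Return a list of (symbol, line, column) for every unmatched symbol."""
    problems = []
    stack = []  # holds (symbol, line, column) for each unclosed opener
    for line_number, line in enumerate(f, 1):
        for column_number, c in enumerate(line, 1):
            if c in PAIRS.values():
                stack.append((c, line_number, column_number))
            elif c in PAIRS:
                if stack and stack[-1][0] == PAIRS[c]:
                    stack.pop()
                else:
                    problems.append((c, line_number, column_number))
    # Anything still on the stack was opened but never closed.
    problems.extend(stack)
    return problems

sample = io.StringIO(u"(a [b)\n]\n")
for symbol, line_number, column_number in find_problems(sample):
    print('Error: unmatched %s line %d column %d'
          % (symbol, line_number, column_number))
```

main could then call sys.exit(1) whenever the returned list is not empty, so that the shell if/else test shown earlier also reports files with unbalanced symbols as bad.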

Note that this program only cares about parentheses of various kinds. If you really want to check a whole Python program for syntactic correctness, you'll need to parse all of Python's syntax, and that is pretty complicated, and too much for one Stack Overflow answer. If that's what you really do want, please ask a followup question.

The whole program:

import sys

def error(c, line_number, column_number):
    print('Error: unmatched %s line %d column %d' % (c, line_number, column_number))

def check(stack, wanted, c, line_number, column_number):
    if not stack or stack[-1] != wanted:  # an empty stack means nothing is open
        error(c, line_number, column_number)
    else:
        stack.pop()

def check_parentheses(f):
    stack = list()
    line_number = 0
    for line in f:
        line_number = line_number + 1
        column_number = 0
        for c in line:
            column_number = column_number + 1
            if c == '(' or c == '[' or c == '{':
                stack.append(c)
            elif c == ')':
                check(stack, '(', ')', line_number, column_number)
            elif c == ']':
                check(stack, '[', ']', line_number, column_number)
            elif c == '}':
                check(stack, '{', '}', line_number, column_number)

def main():
    filename = sys.argv[1]
    try:
        f = open(filename)  # open() works in Python 2 and 3; file() is Python 2 only
    except IOError:
        sys.stderr.write('Error: Cannot open file %s\n' % filename)
        sys.exit(1)
    check_parentheses(f)
    f.close()

main()


Remove the 'while True' line and this bit:

elif line=='':
    break

and then replace this:

for line in filename:
    line=f.readline()

with this:

for line in f:

Now you'll be looping over the lines in the file.

Next, replace all of these sorts of things:

if line=='(':
    c1=line.count('(')

with:

    c1+=line.count('(')

The if and elif lines are just preventing you from counting when you should. If the line doesn't have what you're looking for the count will be 0, which is fine.

That should at least get you closer to a solution.
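
Put together, the counting approach might look like the sketch below. It assumes the counters are set to zero before the loop (the snippets above do not show that step) and compares the totals once, after the whole file has been read. The function name count_check is made up, and io.StringIO stands in for an open file.

```python
import io

def count_check(f):
    # Counters start at zero so that += works on the first line.
    c1 = c2 = c3 = c4 = 0
    for line in f:
        c1 += line.count('(')
        c2 += line.count(')')
        c3 += line.count('[')
        c4 += line.count(']')
    # Equal totals only; this does not check nesting order.
    return c1 == c2 and c3 == c4

print(count_check(io.StringIO(u"(a[0]\n+ b)\n")))  # True
print(count_check(io.StringIO(u"(a[0\n")))         # False
```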


I believe you are looking for a balanced-symbol checker. It is better to use a stack.

  1. Make an empty stack,

  2. For each symbol in the string:

    2.1 If the symbol is an opening symbol, push it on the stack.

    2.2 If it is a closing symbol, then

    2.2.1 If the stack is empty, return false.

    2.2.2 If the top of the stack does not match the closing symbol, return false. [check this step if you are checking for matched braces]

    2.2.3 Pop the stack.

  3. Return true if the stack is empty, otherwise false.
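
The steps above translate fairly directly into Python. This is only a sketch, with is_balanced and MATCHING as illustrative names:

```python
# Maps each closing symbol to its opening partner.
MATCHING = {')': '(', ']': '[', '}': '{'}

def is_balanced(text):
    stack = []                                 # step 1
    for symbol in text:                        # step 2
        if symbol in MATCHING.values():        # 2.1: opening symbol
            stack.append(symbol)
        elif symbol in MATCHING:               # 2.2: closing symbol
            if not stack:                      # 2.2.1
                return False
            if stack[-1] != MATCHING[symbol]:  # 2.2.2
                return False
            stack.pop()                        # 2.2.3
    return not stack                           # step 3

print(is_balanced('(this [is] foobar)'))  # True
print(is_balanced('[(])'))                # False
```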

hth.


I think if you change the:

            if line=='(':
                 c1=line.count('(')
            elif line==')':
                 c2=line.count(')')
            elif line=='[':
                 c3=line.count('[')
            elif line==']':
                 c4=line.count(']')
            elif line=='':
                break

to something like:

SearchFor = ['(', ')', '[', ']']
d = {}
for itm in SearchFor:
    d[itm] = line.count(itm)


# Then do the comparison
if d['['] == d[']'] and d['('] == d[')']:
    print("Valid Syntax")
else:
    print("Invalid Syntax")  # You could look at each count to find the exact cause.

and remove the while True:, as mentioned by others. I had missed that. :0)


Do you want to ensure that the parens and brackets are properly matched, so that "[(])" fails? If not, you're on the right path, except that you need to change your "=" to "+=". As it stands, you're discarding the counts from previous lines.


All of these answers are wrong and will not work in all cases, so either use a Python parser (e.g. the tokenize module) or just use this to count the matched pairs:

count = min(text.count("("), text.count(")"))


My solution will try to help you understand a somewhat more precise way of doing this, and hopefully you'll learn a bit about data structures in the process.

To properly do this, you're going to want to use a stack. You'll want to pull out all instances of (, ), [, and ] (perhaps using a regular expression... hint) and go through the array that that generates:

Say your file is like this:

(this [is] foobar)

Your regular expression will yield this array:

['(', '[', ']', ')'] 

You will pop(0) off of this array into a stack.

Algorithmically:

1) Put all tags {(,),[,]} in an array.

2) For each element in the array, pop(0) it from the array and push it onto your stack. Test it against the element pushed before it. If it closes that element, pop() twice from the stack (e.g., if you have '(' on the stack and the next element pushed is ')', then ')' closes '(', so you pop them both). If it doesn't, continue.

3) If your array is empty and your stack is empty when this is over, then your file is well formed. If it's not, then you have a poorly formed file {something like (foo[bar)] }.

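Bonus: regular expression: REGEX = re.compile(r"[()\[\]]"), then REGEX.findall(your string to search). The square brackets form a character class, so the pattern matches any single one of the four symbols. See the documentation of Python's re module for more about regexes.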
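
A sketch of this approach, assuming a character-class pattern that matches any single one of the four symbols; well_formed and CLOSES are made-up names. For simplicity it iterates over the extracted list instead of calling pop(0), and it cancels a matching pair by popping the opener rather than pushing the closer and popping twice, which has the same effect:

```python
import re

# Matches any single one of the four symbols.
REGEX = re.compile(r"[()\[\]]")
# Maps each closing symbol to the opener it cancels.
CLOSES = {')': '(', ']': '['}

def well_formed(text):
    stack = []
    for symbol in REGEX.findall(text):
        if symbol in CLOSES and stack and stack[-1] == CLOSES[symbol]:
            stack.pop()           # the closer cancels the opener before it
        else:
            stack.append(symbol)
    return not stack              # leftovers mean the text is poorly formed

print(REGEX.findall('(this [is] foobar)'))  # ['(', '[', ']', ')']
print(well_formed('(this [is] foobar)'))    # True
print(well_formed('(foo[bar)]'))            # False
```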
