Try Except in Python

I want to take a path for a file, open the file and read the data within it. Upon doing so, I would like to count the number of occurrences of each letter in the alphabet.

From what I have read and heard, using try/except would be best here. I've tried my best, but I only managed to count the occurrences of the letters in a string defined within the program, not the letters in the file.

I haven't a clue how to do this now, and my brain is starting to hurt. This is what I have so far:

import sys
print "Enter the file path:"
thefile = raw_input()
f = open(thefile, "r")
chars = {}
for c in f:
    try:
        chars[c]+=1
    except:
        chars[c]=1
print chars

Any help will be highly appreciated. Thank you.

EDIT: I forgot to say that the result I currently get treats the whole file as one character. The file consists of "abcdefghijklmnopqrstuvwxyz" and the resulting output is {'"abcdefghijklmnopqrstuvwxyz"\n': 1}, which is wrong.


A slightly more elegant approach is this:

from __future__ import with_statement

from collections import defaultdict

print "Enter the file path:"
thefile = raw_input()

with open(thefile, "r") as f:
    chars = defaultdict(int)

    for line in f:
        for c in line:
            chars[c] += 1

    print dict(chars)

This uses a defaultdict to simplify the counting process, uses two loops to make sure we read each character separately without needing to read the entire file into memory, and uses a with block to ensure that the file is closed properly.

Edit:

To compute a histogram of the letters, you can use this version:

from __future__ import with_statement

from string import ascii_letters

print "Enter the file path:"
thefile = raw_input()

chars = dict(zip(ascii_letters, [0] * len(ascii_letters)))

with open(thefile, "r") as f:

    for line in f:
        for c in line:
            if c in ascii_letters:
                chars[c] += 1

for c in ascii_letters:
    print "%s: %d" % (c, chars[c])

This uses the handy string.ascii_letters constant, and shows a neat way to build the empty dictionary using zip() as well.
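
As a side note, dict.fromkeys should build the same zeroed dictionary in one call. The sketch below compares the two; the variable names via_zip and via_fromkeys are made up for illustration. Sharing one default value across all keys is safe here because 0 is immutable.

```python
from string import ascii_letters

# The zip() construction from the answer above
via_zip = dict(zip(ascii_letters, [0] * len(ascii_letters)))

# dict.fromkeys maps every key in the iterable to the same starting value
via_fromkeys = dict.fromkeys(ascii_letters, 0)

same = (via_zip == via_fromkeys)   # True: 52 letter keys, each mapped to 0
```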


The for c in f: statement is processing your file line by line (that's what the for operation on a file object is designed to do). Since you want to process it character by character, try changing that to:

data = f.read()
for c in data:

The .read() method reads the entire contents of the file into one string, assigns it to data, then the for loop considers each individual character of that string.
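
A minimal sketch of the difference, using an in-memory io.StringIO object as a stand-in for the opened file (an assumption made so the example runs without a real file on disk; the contents mimic the file from the question):

```python
import io

# In-memory stand-in for the file; a real file opened in text mode behaves the same way
f = io.StringIO(u'"abc"\n')

lines = list(f)            # iterating the file object yields whole lines
f.seek(0)                  # rewind so the contents can be read again
chars = list(f.read())     # iterating the string from read() yields single characters

print(lines)   # one item: the entire line, quotes and newline included
print(chars)   # six items: '"', 'a', 'b', 'c', '"', '\n'
```

The first list is why the original loop produced a single dictionary key for the whole file.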


You're almost there, actually; the most important thing you're missing is that your c is not a character, instead it's a line: iterating through a Python file gives you a line at a time. You can solve the problem by adding another loop:

print "Enter the file path:"
thefile = raw_input()
f = open(thefile, "r")
chars = {}
for line in f:
    for c in line:
        try:
            chars[c]+=1
        except:
            chars[c]=1
print chars

(Reading the entire file into a string also works, as another answer mentions, if your file is small enough to fit in memory.)

While it does work in this case, it's not a good idea to use a bare except: unless you're actually trying to catch all possible errors, because it also hides unrelated bugs. Instead, use except KeyError:.
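
To illustrate the risk, here is a sketch with a deliberate typo. The function names count_bare and count_keyerror, and the typo itself, are invented for this example. With a catch-all except the bug is silently swallowed and the counts come out wrong; with except KeyError: only the "first time we see this character" case is handled.

```python
def count_bare(text):
    chars = {}
    for c in text:
        try:
            chrs[c] += 1       # deliberate typo: 'chrs' is undefined, raises NameError
        except:                # catch-all also swallows the NameError
            chars[c] = 1       # so every count is silently stuck at 1
    return chars


def count_keyerror(text):
    chars = {}
    for c in text:
        try:
            chars[c] += 1
        except KeyError:       # only handles a character not seen before
            chars[c] = 1
    return chars


wrong = count_bare("aab")      # 'a' is reported once although it occurs twice
right = count_keyerror("aab")  # 'a' is counted twice, 'b' once
```

Had the same typo been made in count_keyerror, the NameError would have surfaced immediately instead of producing wrong totals.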

What you're trying to do is pretty common, so there's a Python dictionary method and data type that can remove the try/except from your code entirely. Take a look at the setdefault method and the defaultdict type. With either, you can essentially specify that missing values start at 0.
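
A short sketch of both options, counting the characters of a sample string rather than a file (the string "hello" and the variable names are placeholders for illustration):

```python
from collections import defaultdict

text = "hello"

# Option 1: dict.setdefault inserts 0 only if the key is missing
counts = {}
for c in text:
    counts.setdefault(c, 0)
    counts[c] += 1

# Option 2: defaultdict(int) creates missing keys with int() == 0 on first access
dcounts = defaultdict(int)
for c in text:
    dcounts[c] += 1

print(counts == dict(dcounts))   # both approaches give the same counts
```

Neither loop needs a try/except, since a missing key never raises KeyError.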


Here is a more Pythonic way, using collections.Counter (available from Python 2.7):

import collections
with open(raw_input(), 'rb') as f:
    count = collections.Counter(f.read())
    print count

Batteries included! :)
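
Note that Counter counts every character it is given, including quotes and newlines. If only letters are wanted, a sketch along these lines might do; the helper name letter_counts and the sample text are assumptions for illustration. It takes a text string: on Python 3, a file opened with 'rb' yields integers when iterated, so the file would need to be opened in text mode there.

```python
import collections
import string


def letter_counts(text):
    """Count only ASCII letters, skipping quotes, digits and whitespace."""
    return collections.Counter(c for c in text if c in string.ascii_letters)


counts = letter_counts('"abcabca"\n')
top = counts.most_common(1)   # the single most frequent letter with its count

print(counts["a"])   # 3
print(top)           # [('a', 3)]
```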
