开发者

Is this an acceptable pythonic idiom?

I have a class that assists in importing a special type of file, and a 'factory' class that allows me to do these in batch. The factory class uses a generator so the client can iterate through the importers. My question is, did I use the iterator correctly? Is this an acceptable idiom? I've just started using Python.

class FileParser:
  """ uses an open filehandle to do stuff """

class BatchImporter:
  def __init__(self, files):
    self.files=files

  def parsers(self):
    for file in self.files:
      try:
        fh = open(开发者_运维知识库file, "rb")
        parser = FileParser(fh)
        yield parser
      finally:
        fh.close()

  def verifyfiles(
  def cleanup(

---

importer = BatchImporter(filelist)
for p in BatchImporter.parsers():
  p.method1()
  ...


You could make one thing a little simpler: Instead of try...finally, use a with block:

with open(file, "rb") as fh:
    yield FileParser(fh)

This will close the file for you automatically as soon as the with block is left.


It's absolutely fine to have a method that's a generator, as you do. I would recommend making all your classes new-style (if you're on Python 2, either set __metaclass__ = type at the start of your module, or add (object) to all your base-less class statements), because legacy classes are "evil";-); and, for clarity and conciseness, I would also recomment coding the generator differently...:

  def parsers(self):
    for afile in self.files:
        with open(afile, "rb") as fh:
            yield FileParser(fh)

but neither of these bits of advice condemns in any way the use of generator methods!-)

Note the use of afile in lieu of file: the latter is a built-in identifier, and as a general rule it's better to get used to not "hide" built-in identifiers with your own (it doesn't bite you here, but it will in many nasty ways in the future unless you get into the right habit!-).


The design is fine if you ask me, though using finally the way you use it isn't exactly idiomatic. Use catch and maybe re-raise the exception (using the raise keyword alone, otherwise you mess the stacktrace up), and for bonus points, don't catch: but catch Exception: (otherwise, you catch SystemExit and KeyboardInterrupt).

Or simply use the with-statement as shown by Tim Pietzcker.


In general, it isn't safe to close the file after you yield a parser object that will try to read it. Consider this code:

parsers = list(BatchImporter.parsers())
for p in parsers:
    # the file object that p holds will already be closed!

If you're not writing a long-running daemon process, most of the time you don't need to worry about closing files -- they will all get closed when your program exits, or when the file objects are garbage-collected. (And if you use CPython, that will happen as soon as all references to them are lost, since CPython uses reference counting.)

Nevertheless, taking care to free resources is a good habit to acquire, so I would probably write the FileParser class this way:

class FileParser:
    def __init__(self, file_or_filename, closing=False):
        if hasattr(file_or_filename, 'read'):
            self.f = file_or_filename
            self._need_to_close = closing
        else:
            self.f = open(file_or_filename, 'rb')
            self._need_to_close = True

    def close(self):
        if self._need_to_close:
            self.f.close()
            self._need_to_close = False

and then BatchImporter.parsers would become

    def parsers(self):
        for file in self.files:
            yield FileParser(file)

or, if you love functional programming

    def parsers(self):
        return itertools.imap(FileParser, self.files)

An aside: if you're new to Python, I recommend you take a look at the Python style guide (also known as PEP 8). Two-space indents look weird.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜