Should I use a class for this: reading an XML file using lxml

This question is a follow-up to my previous question, in which I asked about passing around an ElementTree.

I only need to read the XML files, and to solve this I decided to create a global ElementTree and then parse it wherever required.

My question is:

Is this an acceptable practice? I have heard that global variables are bad. If I don't make it global, I was advised to write a class instead. But do I really need to create a class? What benefits would I get from that approach? Note that I would be handling only one ElementTree instance per run, and the operations are read-only. If I don't use a class, how and where do I declare the ElementTree so that it is available globally? (Note that I would be importing this module.)

Please answer with the understanding that I am a beginner at development, and at this stage I can't figure out whether to use a class or just go with a functional programming style.


There are a few reasons that global variables are bad. First, using them gets you into the habit of declaring globals, which is not good practice, though in some cases globals make sense -- PI, for instance. Globals also create problems when you re-use the name locally, on purpose or by accident. Or worse, when you think you're using the name locally but in reality you're changing the global variable. This particular problem is language dependent, and Python handles it differently in different cases: assigning to a bare name inside a function creates a new local variable (unless the name is declared global), while assigning to an attribute of a global object changes that shared object.

class A:
    def __init__(self):
        self.name = 'hi'

x = 3
a = A()

def foo():
    a.name = 'Bedevere'  # changes an attribute of the global object a
    x = 9                # creates a local x; the global x is untouched

foo()
print(x, a.name)  # outputs 3 Bedevere
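For contrast, here is a minimal sketch of the case where a function really does rebind a module-level name: it has to say so with the global statement. The names counter, shadow and bump are made up for illustration.

```python
counter = 0  # module-level name (hypothetical, for illustration)

def shadow():
    # Plain assignment creates a new local name; the module-level
    # counter is left untouched.
    counter = 100
    return counter

def bump():
    # The global statement makes the assignment rebind the
    # module-level name instead of creating a local one.
    global counter
    counter += 1
    return counter

shadow_result = shadow()
after_shadow = counter   # still the module-level value
bump_result = bump()
after_bump = counter     # now changed by bump()
```

Calling shadow() leaves the module-level counter alone, while bump() changes it for every other function in the module, which is exactly the kind of hidden coupling that makes globals hard to reason about.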

The benefit of creating a class and passing an instance of it around is that you get defined, consistent behavior, especially since you should be calling the object's methods, which operate on the instance itself (self).

class Knights:
    def __init__(self, name='Bedevere'):
        self.name = name
    def knight(self):
        self.name = 'Sir ' + self.name
    def speak(self):
        print(self.name + ":", "Run away!")

class FerociousRabbit:
    def __init__(self):
        self.death = "awaits you with sharp pointy teeth!"
    def speak(self):
        print("Squeeeeeeee!")

def cave(thing):
    # Works for any object with a speak() method
    thing.speak()
    if isinstance(thing, Knights):
        thing.knight()

def scene():
    k = Knights()
    k2 = Knights('Launcelot')
    b = FerociousRabbit()
    for i in (b, k, k2):
        cave(i)

This example illustrates a few good principles. First, the strength of Python when calling functions: FerociousRabbit and Knights are two different classes, but they have the same method, speak(). In many statically typed languages, the two would at least have to share a base class or interface to be used this way. The reason you would want this is that it lets you write a function (cave) that can operate on any object that has a speak() method. You could create any other class with a speak() method and pass an instance of it to the cave function:

class Tim:
    def speak(self):
        print("Death awaits you with sharp pointy teeth!")

So in your case, when dealing with an ElementTree, say sometime down the road you also need to start parsing an Apache log. Well, if you're writing a purely function-based program, you're basically hosed. You can modify and extend your current program, but if you wrote your functions well, you could instead just add a new class to the mix and (technically) everything will be peachy keen.


Pragmatically, is your code expected to grow? Even though people herald OOP as the right way, I have found that it's sometimes better to weigh the costs against the benefits whenever you refactor a piece of code. If you expect this code to grow, then OOP is the better option, since you can extend and customise it for future use cases and save yourself time otherwise wasted on code maintenance. Otherwise, if it ain't broke, don't fix it, IMHO.


I generally find myself regretting it when I give in to the temptation to give a module, for example, a load_file() function that sets a global that the module's other functions can then use to find the file they're supposed to be talking about. It makes testing far more difficult, for example, and as soon as I need two XML files there is a problem. Plus, every single function needs to check whether the file has been loaded and give an error if it hasn't.

If I want to be functional, therefore, I simply have every function take the XML file as an argument.
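A minimal sketch of that functional style, assuming a small made-up document and hypothetical function names. It uses the standard library's xml.etree.ElementTree so that it runs without extra installs; lxml.etree also provides fromstring, findall and get, so swapping the import should work for this sketch.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree

# Hypothetical sample document, inlined so the sketch is self-contained.
SAMPLE = """<knights>
  <knight name="Bedevere" brave="yes"/>
  <knight name="Robin" brave="no"/>
</knights>"""

def load_tree(text):
    """Parse once and hand the root element back to the caller."""
    return ET.fromstring(text)

def knight_names(root):
    """Return every knight's name; the tree arrives as an argument."""
    return [k.get("name") for k in root.findall("knight")]

def brave_knights(root):
    """Return the names of the knights marked brave="yes"."""
    return [k.get("name") for k in root.findall("knight")
            if k.get("brave") == "yes"]

root = load_tree(SAMPLE)      # parsed in exactly one place
names = knight_names(root)    # passed explicitly, no global needed
brave = brave_knights(root)
```

Nothing here depends on module-level state, so a test can build its own tiny tree and pass it in, and a second XML file is just a second call to load_tree().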

If I want to be object oriented, I'll have a MyXMLFile class whose methods can just look at self.xmlfile or whatever.
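A rough sketch of what such a class might look like, again with a made-up document and hypothetical method names, and with the standard library's xml.etree.ElementTree standing in for lxml.etree (which offers the same fromstring, findall and get calls):

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree

class MyXMLFile:
    """Read-only wrapper around one parsed XML document."""

    def __init__(self, text):
        # Parse once; every method below reuses self.xmlfile.
        self.xmlfile = ET.fromstring(text)

    def knight_names(self):
        """Return the name attribute of every <knight> element."""
        return [k.get("name") for k in self.xmlfile.findall("knight")]

    def is_brave(self, name):
        """Return True if the named knight is marked brave="yes"."""
        for k in self.xmlfile.findall("knight"):
            if k.get("name") == name:
                return k.get("brave") == "yes"
        raise KeyError(name)

# Hypothetical sample data, inlined so the sketch is self-contained.
doc = MyXMLFile('<knights><knight name="Bedevere" brave="yes"/>'
                '<knight name="Robin" brave="no"/></knights>')
```

Callers create one MyXMLFile per document and pass that object around; the parsed tree lives on the instance instead of in a module-level global.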

The two approaches are more or less equivalent when there's just one single thing, like a file, to be passed around; but when the number of things in the "state" becomes larger than a few, then I find classes simpler because I can stick all of those things in the class.

(Am I answering your question? I'm still a bit vague on what kind of answer you want.)
