Object oriented design?
I'm trying to learn object oriented programming, but am having a hard time overcoming my structured programming background (mainly C, but many others over time). I thought I'd write a simple check register program as an exercise. I put something together pretty quickly (python is a great language), with my data in some global variables and with a bunch of functions. I can't figure out if this design can be improved by creating a number of classes to encapsulate some of the data and functions and, if so, how to change the design.
My data is basically a list of accounts ['checking', 'saving', 'Amex'], a list of categories ['food', 'shelter', 'transportation'] and lists of dicts that represent transactions [{'date':xyz, 'cat':xyz, 'amount':xyz, 'description':xyz}]. Each account has an associated list of dicts.
I then have functions at the account level (create-acct(), display-all-accts(), etc.) and the transaction level (display-entries-in-account(), enter-a-transaction(), edit-a-transaction(), display-entries-between-dates(), etc.)
The user sees a list of accounts, then can choose an account and see the underlying transactions, with ability to add, delete, edit, etc. the accounts and transactions.
I currently implement everything in one large class, so that I can use self.variable throughout, rather than explicit globals.
In short, I'm trying to figure out if re-organizing this into some classes would be useful and, if so, how to design those classes. I've read some oop books (most recently Object-Oriented Thought Process). I like to think my existing design is readable and does not repeat itself.
Any suggestions would be appreciated.
You don't have to throw out structured programming to do object-oriented programming. The code is still structured, it just belongs to the objects rather than being separate from them.
In classical programming, code is the driving force that operates on data, leading to a dichotomy (and the possibility that code can operate on the wrong data).
In OO, data and code are inextricably entwined - an object contains both data and the code to operate on that data (although technically the code (and sometimes some data) belongs to the class rather than an individual object). Any client code that wants to use those objects should do so only by using the code within that object. This prevents the code/data mismatch problem.
For a bookkeeping system, I'd approach it as follows:
- Low-level objects are accounts and categories (actually, in accounting, there's no difference between these; it's a false separation, exacerbated by Quicken et al, to separate balance sheet items from P&L - I'll refer to them as accounts only). An account object consists of (for example) an account code, name and starting balance, although in the accounting systems I've worked on, the starting balance is always zero - I've always used a "startup" transaction to set the balances initially.
- A transaction is a balanced object consisting of a group of accounts/categories with associated movements (changes in dollar value). By balanced, I mean the movements must sum to zero (this is the crux of double-entry accounting). This means a transaction is a date, a description and an array or vector of elements, each containing an account code and a value.
- The overall accounting "object" (the ledger) is then simply the list of all accounts and transactions.
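The three bullets above could be sketched roughly as follows. This is only one possible reading, not a definitive design: the class names (Account, Entry, Transaction, Ledger), the method names and the sample account codes are all assumptions made for illustration, and the sign convention is left to the caller.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Account:
    code: str   # hypothetical short identifier, e.g. "CHK"
    name: str


@dataclass(frozen=True)
class Entry:
    account_code: str
    amount: Decimal   # one movement; the sign convention is up to the caller


@dataclass(frozen=True)
class Transaction:
    when: date
    description: str
    entries: tuple    # tuple of Entry objects

    def __post_init__(self):
        # Double entry: the movements of one transaction must sum to zero.
        total = sum((e.amount for e in self.entries), Decimal("0"))
        if total != 0:
            raise ValueError(f"unbalanced transaction: off by {total}")


@dataclass
class Ledger:
    accounts: dict = field(default_factory=dict)       # code -> Account
    transactions: list = field(default_factory=list)

    def add_account(self, account):
        self.accounts[account.code] = account

    def post(self, transaction):
        # Refuse movements against accounts the ledger does not know.
        for entry in transaction.entries:
            if entry.account_code not in self.accounts:
                raise KeyError(f"unknown account: {entry.account_code}")
        self.transactions.append(transaction)

    def balance(self, code):
        # Sum every movement posted against one account.
        return sum(
            (e.amount for t in self.transactions for e in t.entries
             if e.account_code == code),
            Decimal("0"),
        )


ledger = Ledger()
ledger.add_account(Account("CHK", "checking"))
ledger.add_account(Account("FOOD", "food"))
ledger.post(Transaction(
    date(2024, 1, 5), "groceries",
    (Entry("CHK", Decimal("50.00")), Entry("FOOD", Decimal("-50.00"))),
))
print(ledger.balance("CHK"), ledger.balance("FOOD"))
```

Because the balance check lives in Transaction itself, client code cannot build an unbalanced transaction in the first place, which is the code/data pairing described earlier.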
Keep in mind that this is the "back-end" of the system (the data model). You will hopefully have separate classes for viewing the data (the view) which will allow you to easily change it, depending on user preferences. For example, you may want the whole ledger, just the balance sheet or just the P&L. Or you may want different date ranges.
One thing I'd stress for making a good accounting system: you do need to think like a bookkeeper. By that I mean lose the artificial difference between "accounts" and "categories", since it will make your system a lot cleaner. You need to be able to have transactions between two asset-class accounts (such as a bank transfer), and this won't work if every transaction needs a "category". The data model should reflect the data, not the view.
The only difficulty there is remembering that asset-class accounts have the opposite sign from what you expect (negative values for your cash-at-bank mean you have money in the bank and your very high positive value loan for that company sports car is a debt, for example). This will make the double-entry aspect work perfectly but you have to remember to reverse the signs of asset-class accounts (assets, liabilities and equity) when showing or printing the balance sheet.
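As a small illustration of that sign reversal, the flip can live entirely in the display code so the stored data never changes. The account codes, the BALANCE_SHEET_ACCOUNTS set and the display_amount function are hypothetical names invented for this sketch.

```python
from decimal import Decimal

# Hypothetical codes for the balance-sheet ("asset-class") accounts.
BALANCE_SHEET_ACCOUNTS = {"BANK", "CARLOAN"}


def display_amount(account_code, stored_balance):
    """Return a balance with the sign a reader expects to see."""
    if account_code in BALANCE_SHEET_ACCOUNTS:
        return -stored_balance   # reverse the sign only when showing it
    return stored_balance


stored = {
    "BANK": Decimal("-1200.00"),     # stored negative: money in the bank
    "CARLOAN": Decimal("30000.00"),  # stored positive: a debt
}
for code, balance in stored.items():
    print(code, display_amount(code, balance))
```

Keeping the reversal in the view means the double-entry arithmetic in the data model is untouched.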
Not a direct answer to your question but O'Reilly's Head First Object-Oriented Analysis and Design is an excellent place to start.
Followed by Head First Design Patterns
"My data is basically a list of accounts"
Account is a class.
"dicts that represent transactions"
Transaction appears to be a class. You happen to have elected to represent this as a dict.
That's your first pass at OO design. Focus on the Responsibilities and Collaborators.
You have at least two classes of objects.
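A first pass at those two classes might look like the sketch below. The field names come from the dict keys in the question; the method names (enter, entries_between, balance) are assumptions loosely modelled on the question's function list, not a prescribed design.

```python
import datetime
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Transaction:
    # Same four fields as the dict in the question.
    date: datetime.date
    cat: str
    amount: Decimal
    description: str


@dataclass
class Account:
    name: str
    transactions: list = field(default_factory=list)

    def enter(self, transaction):
        """Responsibility: own the list of transactions for this account."""
        self.transactions.append(transaction)

    def entries_between(self, start, end):
        """Return the transactions dated from start to end, inclusive."""
        return [t for t in self.transactions if start <= t.date <= end]

    def balance(self):
        return sum((t.amount for t in self.transactions), Decimal("0"))


checking = Account("checking")
checking.enter(Transaction(datetime.date(2024, 1, 5), "food",
                           Decimal("-42.50"), "groceries"))
checking.enter(Transaction(datetime.date(2024, 2, 1), "shelter",
                           Decimal("-900.00"), "rent"))
january = checking.entries_between(datetime.date(2024, 1, 1),
                                   datetime.date(2024, 1, 31))
print(len(january), checking.balance())
```

Here Account is responsible for its own transactions and Transaction is its collaborator; the account-level and transaction-level functions from the question become methods on whichever class owns the data they touch.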
There are many 'mindsets' that you could adopt to help in the design process (some of which point towards OO and some that don't). I think it is often better to start with questions rather than answers (i.e. rather than say, 'how can I apply inheritance to this' you should ask how this system might expect to change over time).
Here are a few questions to answer that might point you towards design principles:
- Are others going to use this API? Are they likely to break it? (info hiding)
- Do I need to deploy this across many machines? (state management, lifecycle management)
- Do I need to interoperate with other systems, runtimes, languages? (abstraction and standards)
- What are my performance constraints? (state management, lifecycle management)
- What kind of security environment does this component live in? (abstraction, info hiding, interoperability)
- How would I construct my objects, assuming I used some? (configuration, inversion of control, object decoupling, hiding implementation details)
These aren't direct answers to your question, but they might put you in the right frame of mind to answer it yourself. :)
Rather than using dicts to represent your transactions, a better container would be a namedtuple from the collections module. A namedtuple is a subclass of tuple which allows you to reference its items by name as well as by index number.
Since you may possibly have thousands of transactions in your journal lists, it pays to keep these items as small and light-weight as possible so that processing, sorting, searching, etc. is as fast and responsive as possible. A dict is a fairly heavy-weight object compared to a namedtuple, which takes up no more memory than an ordinary tuple. A namedtuple also keeps its items in a fixed order, which a dict did not guarantee in the Python versions current when this was written (dicts preserve insertion order only from Python 3.7 on).
>>> import sys
>>> from collections import namedtuple
>>> sys.getsizeof((1,2,3,4,5,6,7,8))
60
>>> ntc = namedtuple('ntc', 'one two three four five six seven eight')
>>> xnt = ntc(1,2,3,4,5,6,7,8)
>>> sys.getsizeof(xnt)
60
>>> xdic = dict(one=1, two=2, three=3, four=4, five=5, six=6, seven=7, eight=8)
>>> sys.getsizeof(xdic)
524
So you see that's almost a ninefold saving in memory for an eight-item transaction. I'm using Python 3.1, so your mileage may vary.
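Applied to the transactions in the question, that might look something like this. The field names are taken from the question's dict keys; the sample values are made up for illustration.

```python
from collections import namedtuple

# One namedtuple type with the same four fields as the question's dicts.
Transaction = namedtuple("Transaction", "date cat amount description")

t = Transaction("2010-03-01", "food", -42.50, "groceries")

by_name = t.amount    # access by field name
by_index = t[2]       # the same item, by position

# namedtuples are immutable, so "editing" means building a changed copy.
edited = t._replace(amount=-45.00)

# Convert to a dict when one is needed, e.g. for saving to disk.
as_dict = t._asdict()

print(by_name, by_index, edited.amount, as_dict["cat"])
```

One consequence of the immutability: an edit-a-transaction function has to put the copy returned by _replace back into the account's list, rather than changing the item in place as it could with a dict.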