python newbie question: converting code to classes

2023-01-09 10:24 问答作者：

i have this code:

import csv
import collections

def do_work():
      (data,counter)=get_file('thefile.csv')
      b=samples_subset1(data, counter,'/pythonwork/samples_subset3.csv',500)
      return

def get_fil开发者_运维知识库e(start_file):

        with open(start_file, 'rb') as f:
            data = list(csv.reader(f))
            counter = collections.defaultdict(int)

            for row in data:
              counter[row[10]] += 1
            return (data,counter)

def samples_subset1(data,counter,output_file,sample_cutoff):

      with open(output_file, 'wb') as outfile:
          writer = csv.writer(outfile)
          b_counter=0
          b=[]
          for row in data:
              if counter[row[10]] >= sample_cutoff:
                 b.append(row) 
                 writer.writerow(row)
                 b_counter+=1
      return (b)

i recently started learning python, and would like to start off with good habits. therefore, i was wondering if you can help me get started to turn this code into classes. i dont know where to start.

Per my comment on the original post, I don't think a class is necessary here. Still, if other Python programmers will ever read this, I'd suggest getting it inline with PEP8, the Python style guide. Here's a quick rewrite:

import csv
import collections

def do_work():
    data, counter = get_file('thefile.csv')
    b = samples_subset1(data, counter, '/pythonwork/samples_subset3.csv', 500)

def get_file(start_file):
    with open(start_file, 'rb') as f:
        counter = collections.defaultdict(int)
        data = list(csv.reader(f))

        for row in data:
            counter[row[10]] += 1

    return (data, counter)

def samples_subset1(data, counter, output_file, sample_cutoff):
    with open(output_file, 'wb') as outfile:
        writer = csv.writer(outfile)
        b = []
        for row in data:
            if counter[row[10]] >= sample_cutoff:
                b.append(row) 
                writer.writerow(row)

    return b

Notes:

No one uses more than 4 spaces to indent ever. Use 2 - 4. And all your levels of indentation should match.
Use a single space after the commas between arguments to functions ("F(a, b, c)" not "F(a,b,c)")
Naked return statements at the end of a function are meaningless. Functions without return statements implicitly return None
Single space around all operators (a = 1, not a=1)
Do not wrap single values in parentheses. It looks like a tuple, but it isn't.
b_counter wasn't used at all, so I removed it.
csv.reader returns an iterator, which you are casting to a list. That's usually a bad idea because it forces Python to load the entire file into memory at once, whereas the iterator will just return each line as needed. Understanding iterators is absolutely essential to writing efficient Python code. I've left data in for now, but you could rewrite to use an iterator everywhere you're using data, which is a list.

Well, I'm not sure what you want to turn into a class. Do you know what a class is? You want to make a class to represent some type of thing. If I understand your code correctly, you want to filter a CSV to show only those rows whose row[ 10 ] is shared by at least sample_cutoff other rows. Surely you could do that with an Excel filter much more easily than by reading through the file in Python?

What the guy in the other thread suggested is true, but not really applicable to your situation. You used a lot of global variables unnecessarily: if they'd been necessary to the code you should have put everything into a class and made them attributes, but as you didn't need them in the first place, there's no point in making a class.

Some tips on your code:

Don't cast the file to a list. That makes Python read the whole thing into memory at once, which is bad if you have a big file. Instead, simply iterate through the file itself: for row in csv.reader(f): Then, when you want to go through the file a second time, just do f.seek(0) to return to the top and start again.
Don't put return at the end of every function; that's just unnecessary. You don't need parentheses, either: return spam is fine.

Rewrite

import csv
import collections

def do_work():
    with open( 'thefile.csv' ) as f:
        # Open the file and count the rows.
        data, counter = get_file(f)
        
        # Go back to the start of the file.
        f.seek(0)

        # Filter to only common rows.
        b = samples_subset1(data, counter, 
            '/pythonwork/samples_subset3.csv', 500)
   
     return b

def get_file(f):
    counter = collections.defaultdict(int)
    data = csv.reader(f)
    
    for row in data:
        counter[row[10]] += 1

    return data, counter

def samples_subset1(data, counter, output_file, sample_cutoff):
    with open(output_file, 'wb') as outfile:
        writer = csv.writer(outfile)
        b = []
        for row in data:
            if counter[row[10]] >= sample_cutoff:
                b.append(row) 
                writer.writerow(row)

    return b

继续阅读：class csv python

python newbie question: converting code to classes

Rewrite

更多精彩内容

精彩评论

最新问答

产后抑郁症的症状有什么？

黑神话悟空黄风大圣是好人吗?？

喜境牛奶光美肤仪适合出差旅行使用吗？？

当贝f3放到4米之外会模糊吗？

永劫无间高价皮肤有哪些?？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？

Rewrite

更多精彩内容

精彩评论

最新问答

产后抑郁症的症状有什么？

黑神话悟空黄风大圣是好人吗?？

喜境牛奶光美肤仪适合出差旅行使用吗？？

当贝f3放到4米之外会模糊吗？

永劫无间高价皮肤有哪些?？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生 新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？