Python - making counters, making loops?

2023-01-02 12:54 问答作者：

I am having some trouble with a piece of code below:

Input: li is a nested list as below:

li = [['>0123456789 mouse gene 1\n', 'ATGTTGGGTT/CTTAGTTG\n', 'ATGGGGTTCCT/A\n'],   ['>9876543210 mouse gene 2\n', 'ATTTGGTTTCCT\n', 'ATTCAATTTTAAGGGGGGGG\n']]

Using the function below, my desired output is simply the 2nd to the 9th digits following '>' under the condition that the number of '/' present in the entire sublist is > 1.

Instead, my code gives the digits to all entries. Also, it gives them multiple times. I therefore assume something is wrong with my counter and my 开发者_C百科for loop. I can't quite figure this out.

Any help, greatly appreciated.

import os

cwd = os.getcwd()


def func_one():
    outp = open('something.txt', 'w')       #output file
    li = []
    for i in os.listdir(cwd):           
        if i.endswith('.ext'):
            inp = open(i, 'r').readlines()
            li.append(inp)
    count = 0
    lis = []
    for i in li:
        for j in i:
            for k in j[1:]          #ignore first entry in sublist
                if k == '/':
                    count += 1
                if count > 1:
                    lis.append(i[0][1:10])      
                    next_func(lis, outp)

Thanks, S :-)

Your indentation is possibly wrong, you should check count > 1 within the for j in i loop, not within the one that checks every single character in j[1:].

Also, here's a much easier way to do the same thing:

def count_slashes(items):
    return sum(item.count('/') for item in items)

for item in li:
    if count_slashes(item[1:]) > 1:
        print item[0][1:10]

Or, if you need the IDs in a list:

result = [item[0][1:10] for item in li if count_slashes(item[1:]) > 1]

Python list comprehensions and generator expressions are really powerful tools, try to learn how to use them as it makes your life much simpler. The count_slashes function above uses a generator expression, and my last code snippet uses a list comprehension to construct the result list in a nice and concise way.

Tamás has suggested a good solution, although it uses a very different style of coding than you do. Still, since your question was "I am having some trouble with a piece of code below", I think something more is called for.

How to avoid these problems in the future

You've made several mistakes in your approach to getting from "I think I know how to write this code" to having actual working code.

You are using meaningless names for your variables which makes it nearly impossible to understand your code, including for yourself. The thought "but I know what each variable means" is obviously wrong, otherwise you would have managed to solve this yourself. Notice below, where I fix your code, how difficult it is to describe and discuss your code.

You are trying to solve the whole problem at once instead of breaking it down into pieces. Write small functions or pieces of code that do just one thing, one piece at a time. For each piece you work on, get it right and test it to make sure it is right. Then go on writing other pieces which perhaps use pieces you've already got. I'm saying "pieces" but usually this means functions, methods or classes.

Fixing your code

That is what you asked for and nobody else has done so.

You need to move the count = 0 line to after the for i in li: line (indented appropriately). This will reset the counter for every sub-list. Second, once you have appended to lis and run your next_func, you need to break out of the for k in j[1:] loop and the encompassing for j in i: loop.

Here's a working code example (without the next_func but you can add that next to the append):

>>> li = [['>0123456789 mouse gene 1\n', 'ATGTTGGGTT/CTTAGTTG\n', 'ATGGGGTTCCT/A\n'],   ['>9876543210 mouse gene 2\n', 'ATTTGGTTTCCT\n', 'ATTCAATTTTAAGGGGGGGG\n']]
>>> lis = []
>>> for i in li:
        count = 0
        for j in i:
            break_out = False
            for k in j[1:]:
                if k == '/':
                    count += 1
                if count > 1:
                    lis.append(i[0][1:10])
                    break_out = True
                    break
            if break_out:
                break

>>> lis
['012345678']

Re-writing you code to make it readable

This is so you see what I meant in the beginning of my answer.

>>> def count_slashes(gene):
    "count the number of '/' character in the DNA sequences of the gene."
    count = 0
    dna_sequences = gene[1:]
    for sequence in dna_sequences:
        count += sequence.count('/')
    return count
>>> def get_gene_name(gene):
    "get the name of the gene"
    gene_title_line = gene[0]
    gene_name = gene_title_line[1:10]
    return gene_name
>>> genes = [['>0123456789 mouse gene 1\n', 'ATGTTGGGTT/CTTAGTTG\n', 'ATGGGGTTCCT/A\n'],   ['>9876543210 mouse gene 2\n', 'ATTTGGTTTCCT\n', 'ATTCAATTTTAAGGGGGGGG\n']]
>>> results = []
>>> for gene in genes:
        if count_slashes(gene) > 1:
            results.append(get_gene_name(gene))

>>> results
['012345678']
>>>

import itertools
import glob

lis = []
with open('output.txt', 'w') as outfile:
    for file in glob.iglob('*.ext'):
        content = open(file).read()
        if content.partition('\n')[2].count('/') > 1:
            lis.append(content[1:10])
            next_func(lis, outfile)

The reason you digits to all entries, is because you're not resetting the counter.

继续阅读：counter loops python

Python - making counters, making loops?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？