开发者

What's the best practice for handling single-value tuples in Python?

I am using a 3rd party library function which reads a set of keywords from a file, and is supposed to return a tuple of values. It does this correctly as long as there are at least two keywords. However, in the case where there is only one keyword, it returns a raw string, not a tuple of size one. This is particularly pernicious because when I try to do something like

for keyword in library.get_keywords():
    # Do something with keyword

, in the case of the single keyword, the for iterates over each character of the string in succession, which throws no exception, at run-time or otherwise, but is nevertheless completely useless to me.

My question is two-fold:

Clearly this is a bug in the library, which is out of my control. How can I best work around it?

Secondly, in general, if I am writing a function that returns a tuple, what is the best practice for ensuring tuples with one element are correctly generated? For example, if I have

def tuple_maker(values):
    my_tuple = (values)
    return my_tuple

for val in tuple_maker("a string"):
    print "Value was", val

for val in tuple_maker(["str1", "str2", "str3"]):
    print "Value was", val

I get

Value was a
Value was  
Value was s
Value was t
Value was r
Value was i
Value was n
Value was g
Value was str1
Value was str2
Value was str3

What is the best way to modify the function my_tuple to actually return a tuple when there is only a single element? Do I explicitly need to check whether the size is 1, and create the tuple seperately, using the (value,) syntax? This implies that any function that has the possibility of returning a single-valued tuple must do this, which seems hacky and repetitive.

Is there开发者_如何学C some elegant general solution to this problem?


You need to somehow test for the type, if it's a string or a tuple. I'd do it like this:

keywords = library.get_keywords()
if not isinstance(keywords, tuple):
    keywords = (keywords,) # Note the comma
for keyword in keywords:
    do_your_thang(keyword)


For your first problem, I'm not really sure if this is the best answer, but I think you need to check yourself whether the returned value is a string or tuple and act accordingly.

As for your second problem, any variable can be turned into a single valued tuple by placing a , next to it:

>>> x='abc'
>>> x
'abc'
>>> tpl=x,
>>> tpl
('abc',)

Putting these two ideas together:

>>> def make_tuple(k):
...     if isinstance(k,tuple):
...             return k
...     else:
...             return k,
... 
>>> make_tuple('xyz')
('xyz',)
>>> make_tuple(('abc','xyz'))
('abc', 'xyz')

Note: IMHO it is generally a bad idea to use isinstance, or any other form of logic that needs to check the type of an object at runtime. But for this problem I don't see any way around it.


Your tuple_maker doesn't do what you think it does. An equivalent definition of tuple maker to yours is

def tuple_maker(input):
    return input

What you're seeing is that tuple_maker("a string") returns a string, while tuple_maker(["str1","str2","str3"]) returns a list of strings; neither return a tuple!

Tuples in Python are defined by the presence of commas, not brackets. Thus (1,2) is a tuple containing the values 1 and 2, while (1,) is a tuple containing the single value 1.

To convert a value to a tuple, as others have pointed out, use tuple.

>>> tuple([1])
(1,)
>>> tuple([1,2])
(1,2)


The () have nothing to do with tuples in python, the tuple syntax uses ,. The ()-s are optional.

E.g.:

>>> a=1, 2, 3
>>> type(a)
<class 'tuple'>
>>> a=1,
>>> type(a)
<class 'tuple'>
>>> a=(1)
>>> type(a)
<class 'int'>

I guess this is the root of the problem.


There's always monkeypatching!

# Store a reference to the real library function
really_get_keywords = library.get_keywords

# Define out patched version of the function, which uses the real
# version above, adjusting its return value as necessary
def patched_get_keywords():
    """Make sure we always get a tuple of keywords."""
    result = really_get_keywords()
    return result if isinstance(result, tuple) else (result,)

# Install the patched version
library.get_keywords = patched_get_keywords

NOTE: This code might burn down your house and sleep with your wife.


Rather than checking for a length of 1, I'd use the isinstance built-in instead.

>>> isinstance('a_str', tuple)
False
>>> isinstance(('str1', 'str2', 'str3'), tuple)
True


Is it absolutely necessary that it returns tuples, or will any iterable do?

import collections
def iterate(keywords):
    if not isinstance(keywords, collections.Iterable):
        yield keywords
    else:
        for keyword in keywords:
            yield keyword


for keyword in iterate(library.get_keywords()):
    print keyword


for your first problem you could check if the return value is tuple using

type(r) is tuple
#alternative
isinstance(r, tuple)
# one-liner
def as_tuple(r): return [ tuple([r]), r ][type(r) is tuple]

the second thing i like to use tuple([1]). think it is a matter of taste. could probably also write a wrapper, for example def tuple1(s): return tuple([s])


There is an important thing to watch out for when using the tuple() constructor method instead of the default type definition for creating your single-string tuples. Here is a Nose2/Unittest script you can use to play with the problem:

#!/usr/bin/env python
# vim: ts=4 sw=4 sts=4 et
from __future__ import print_function
# global
import unittest
import os
import sys
import logging
import pprint
import shutil

# module-level logger
logger = logging.getLogger(__name__)

# module-global test-specific imports
# where to put test output data for compare.
testdatadir = os.path.join('.', 'test', 'test_data')
rawdata_dir = os.path.join(os.path.expanduser('~'), 'Downloads')
testfiles = (
    'bogus.data',
)
purge_results = False
output_dir = os.path.join('test_data', 'example_out')


def cleanPath(path):
    '''cleanPath
    Recursively removes everything below a path

    :param path:
    the path to clean
    '''
    for root, dirs, files in os.walk(path):
        for fn in files:
            logger.debug('removing {}'.format(fn))
            os.unlink(os.path.join(root, fn))
        for dn in dirs:
            # recursive
            try:
                logger.debug('recursive del {}'.format(dn))
                shutil.rmtree(os.path.join(root, dn))
            except Exception:
                # for now, halt on all.  Override with shutil onerror
                # callback and ignore_errors.
                raise


class TestChangeMe(unittest.TestCase):
    '''
        TestChangeMe
    '''
    testdatadir = None
    rawdata_dir = None
    testfiles   = None
    output_dir  = output_dir

    def __init__(self, *args, **kwargs):
        self.testdatadir = os.path.join(os.path.dirname(
            os.path.abspath(__file__)), testdatadir)
        super(TestChangeMe, self).__init__(*args, **kwargs)
        # check for kwargs
        # this allows test control by instance
        self.testdatadir = kwargs.get('testdatadir', testdatadir)
        self.rawdata_dir = kwargs.get('rawdata_dir', rawdata_dir)
        self.testfiles = kwargs.get('testfiles', testfiles)
        self.output_dir = kwargs.get('output_dir', output_dir)

    def setUp(self):
        '''setUp
        pre-test setup called before each test
        '''
        logging.debug('setUp')
        if not os.path.exists(self.testdatadir):
            os.mkdir(self.testdatadir)
        else:
            self.assertTrue(os.path.isdir(self.testdatadir))
        self.assertTrue(os.path.exists(self.testdatadir))
        cleanPath(self.output_dir)

    def tearDown(self):
        '''tearDown
        post-test cleanup, if required
        '''
        logging.debug('tearDown')
        if purge_results:
            cleanPath(self.output_dir)

    def tupe_as_arg(self, tuple1, tuple2, tuple3, tuple4):
        '''test_something_0
            auto-run tests sorted by ascending alpha
        '''
        # for testing, recreate strings and lens
        string1 = 'string number 1'
        len_s1 = len(string1)
        string2 = 'string number 2'
        len_s2 = len(string2)
        # run the same tests...
        # should test as type = string
        self.assertTrue(type(tuple1) == str)
        self.assertFalse(type(tuple1) == tuple)
        self.assertEqual(len_s1, len_s2, len(tuple1))
        self.assertEqual(len(tuple2), 1)
        # this will fail
        # self.assertEqual(len(tuple4), 1)
        self.assertEqual(len(tuple3), 2)
        self.assertTrue(type(string1) == str)
        self.assertTrue(type(string2) == str)
        self.assertTrue(string1 == tuple1)
        # should test as type == tuple
        self.assertTrue(type(tuple2) == tuple)
        self.assertTrue(type(tuple4) == tuple)
        self.assertFalse(type(tuple1) == type(tuple2))
        self.assertFalse(type(tuple1) == type(tuple4))
        # this will fail
        # self.assertFalse(len(tuple4) == len(tuple1))
        self.assertFalse(len(tuple2) == len(tuple1))

    def default_test(self):
        '''testFileDetection
        Tests all data files for type and compares the results to the current
        stored results.
        '''
        # test 1
        __import__('pudb').set_trace()
        string1 = 'string number 1'
        len_s1 = len(string1)
        string2 = 'string number 2'
        len_s2 = len(string2)
        tuple1 = (string1)
        tuple2 = (string1,)
        tuple3 = (string1, string2)
        tuple4 = tuple(string1,)
        # should test as type = string
        self.assertTrue(type(tuple1) == str)
        self.assertFalse(type(tuple1) == tuple)
        self.assertEqual(len_s1, len_s2, len(tuple1))
        self.assertEqual(len(tuple2), 1)
        # this will fail
        # self.assertEqual(len(tuple4), 1)
        self.assertEqual(len(tuple3), 2)
        self.assertTrue(type(string1) == str)
        self.assertTrue(type(string2) == str)
        self.assertTrue(string1 == tuple1)
        # should test as type == tuple
        self.assertTrue(type(tuple2) == tuple)
        self.assertTrue(type(tuple4) == tuple)
        self.assertFalse(type(tuple1) == type(tuple2))
        self.assertFalse(type(tuple1) == type(tuple4))
        # this will fail
        # self.assertFalse(len(tuple4) == len(tuple1))
        self.assertFalse(len(tuple2) == len(tuple1))
        self.tupe_as_arg(tuple1, tuple2, tuple3, tuple4)
# stand-alone test execution
if __name__ == '__main__':
    import nose2
    nose2.main(
        argv=[
            'fake',
            '--log-capture',
            'TestChangeMe.default_test',
        ])

You will notice that the (nearly) identical code calling tuple(string1,) shows as type tuple, but the length will be the same as the string length and all members will be single characters.

This will cause the assertions on lines #137, #147, #104 and #115 to fail, even though they are seemingly identical to the ones that pass.

(note: I have a PUDB breakpoint in the code at line #124, it's an excellent debug tool, but you can remove it if you prefer. Otherwise simply pip install pudb to use it.)

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜