What is a convenient way to store and retrieve boolean values in a CSV file?
If I store a boolean value using the csv module, it gets converted to the string True or False by the str() function. However, when I load those values, the string False evaluates to True because it is a non-empty string.
I can work around it by manually checking the string at read time with an IF statement to see what the string is, but it's somewhat less than elegant. Any better ideas, or is this just one of those things in the programming world?
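The behaviour described above can be reproduced in a few lines. This is a minimal sketch that uses an in-memory buffer (io.StringIO) instead of a real file; the column name flag is made up for illustration.

```python
import csv
import io

# Write a boolean with the csv module; it is stored via str().
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["flag"])
writer.writerow([False])

# Read it back: every field comes back as a plain string.
buffer.seek(0)
rows = list(csv.reader(buffer))
value = rows[1][0]

print(repr(value))  # the string 'False', not the bool False
print(bool(value))  # True, because any non-empty string is truthy
```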
Ways to store boolean values in CSV files
- Strings: Two common choices are true and false, or True and False, but I've also seen yes and no.
- Integers: 0 or 1
- Floats: 0.0 or 1.0
Let's compare the respective advantages / disadvantages:
- Strings:
  + A human can read it
  - CSV readers will have it as a string, and both values will evaluate to true when bool is applied to them
- Integers:
  + CSV readers might see that this column is integer, and bool(0) evaluates to false
  + A bit more space efficient
  - Not totally clear that it is boolean
- Floats:
  + CSV readers might see that this column is float, and bool(0.0) evaluates to false
  - Not totally clear that it is boolean
  + Possible to have null (as NaN)
The Pandas CSV reader shows the described behaviour.
Convert Bool strings to Bool values
Have a look at mpu.string.str2bool:
>>> str2bool('True')
True
>>> str2bool('1')
True
>>> str2bool('0')
False
which has the following implementation:
def str2bool(string_, default='raise'):
    """
    Convert a string to a bool.

    Parameters
    ----------
    string_ : str
    default : {'raise', False}
        Default behaviour if none of the "true" strings is detected.

    Returns
    -------
    boolean : bool

    Examples
    --------
    >>> str2bool('True')
    True
    >>> str2bool('1')
    True
    >>> str2bool('0')
    False
    """
    true = ['true', 't', '1', 'y', 'yes', 'enabled', 'enable', 'on']
    false = ['false', 'f', '0', 'n', 'no', 'disabled', 'disable', 'off']
    if string_.lower() in true:
        return True
    elif string_.lower() in false or (not default):
        return False
    else:
        raise ValueError('The value \'{}\' cannot be mapped to boolean.'
                         .format(string_))
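One way to use such a converter at read time is to apply it to the relevant column of each row. The sketch below relies only on the standard library: parse_bool is a hypothetical, trimmed-down stand-in for str2bool (it is not the mpu function), and the sample data and column names are invented.

```python
import csv
import io

TRUE_STRINGS = {"true", "t", "1", "y", "yes", "on"}
FALSE_STRINGS = {"false", "f", "0", "n", "no", "off"}


def parse_bool(text):
    """Hypothetical helper: map a CSV field to a bool or raise ValueError."""
    normalized = text.strip().lower()
    if normalized in TRUE_STRINGS:
        return True
    if normalized in FALSE_STRINGS:
        return False
    raise ValueError(f"cannot map {text!r} to a boolean")


# Invented sample data; a real script would open a file instead.
data = io.StringIO("name,is_cool\nalice,True\nbob,no\n")

people = []
for row in csv.DictReader(data):
    # Convert only the boolean column; other fields stay strings.
    row["is_cool"] = parse_bool(row["is_cool"])
    people.append(row)

print(people)
```

Raising on unknown spellings, rather than silently returning False, makes typos in the file visible instead of turning them into wrong data.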
There's almost always a simpler (and often more convenient) way, and most of the time it is already included in Python's amazing standard library.
In this case you can just use True and False, then parse them with the ast module's literal_eval function.
Let's say we have a file called test.csv with the following content:
nonetype,integer,boolean
None,123,False
We also have a test.py script to parse the file:
import ast
import csv

with open('test.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        # literal_eval turns each string back into the Python literal it spells
        nonetype = ast.literal_eval(row['nonetype'])
        integer = ast.literal_eval(row['integer'])
        boolean = ast.literal_eval(row['boolean'])
        print(type(nonetype), type(integer), type(boolean))
When we run the script we can see that the values get parsed as expected.
$ python3 test.py
<class 'NoneType'> <class 'int'> <class 'bool'>
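This method is also safe in the sense that it never executes arbitrary code, even with input from untrusted sources. The docs do warn that a sufficiently large or deeply nested string can still exhaust memory or crash the interpreter; see the ast.literal_eval documentation for more info.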
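One caveat: ast.literal_eval only accepts Python literals, so spellings such as false or yes raise ValueError, and 1 parses as an int rather than a bool. A minimal sketch of a stricter wrapper follows; to_bool is a hypothetical name, not part of the standard library.

```python
import ast


def to_bool(text):
    """Hypothetical wrapper: accept only the literals True and False."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        # Raised for anything that is not a valid Python literal.
        raise ValueError(f"not a Python literal: {text!r}") from exc
    if not isinstance(value, bool):
        raise ValueError(f"expected True or False, got {text!r}")
    return value


print(to_bool("True"))
print(to_bool("False"))

# Each of these is rejected instead of being silently misread.
for bad in ("false", "yes", "1", ""):
    try:
        to_bool(bad)
    except ValueError as exc:
        print("rejected:", exc)
```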
Use the int() function to convert the booleans to their integer values and then store those. That being said, eli-bendersky's comment above is worth noting.
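The integer approach can be sketched as a round trip through an in-memory buffer. The data and the column name flag are invented for illustration. Note that the string has to go through int() before bool(), since bool('0') on its own is still True.

```python
import csv
import io

flags = [True, False, True]

# Store each boolean as 0 or 1.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["flag"])
for flag in flags:
    writer.writerow([int(flag)])

# Read back: convert the string to int first, then to bool.
buffer.seek(0)
restored = [bool(int(row["flag"])) for row in csv.DictReader(buffer)]

print(restored)
```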
I'm not sure if answering your own question is bad form or not, but here's the solution I've come up with. It basically consists of hiving off that pesky if statement I was talking about into a function.
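# Convert a bool to the text stored in the CSV file.
def setYesNo(value):
    if value: return 'Yes'
    else: return 'No'

# Convert the stored text back to a bool; anything but 'Yes' is False.
def checkYesNo(text):
    if text == 'Yes': return True
    else: return False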
Then in my DictWriter do this:
for item in mylist:
    writer.writerow({'Is Cool': setYesNo(item.is_cool),
                     .....
                     })
And in the DictReader:
for line in reader:
    item = MyObject(is_Cool=checkYesNo(line['Is Cool']),
                    .....
                    )
    mylist.append(item)