SQLite ERD ( cost of cast vs union )
I have to save lot of (more than 50 000 ro开发者_JAVA百科ws) document - key - value
in a SQLite database.
My concern is: these values can be textual or numerical (there are dates, strings, numbers, etc).
I have to request my base in 2 ways:
- all values for document ####
- or with comparison operators
I thought about 2 possible solutions:
solution 1:
a single table:key(text) | type(one of text/date/float) | value(text)
and using explicit cast when I need comparison (for example,
SELECT * FROM mytable WHERE (CAST(value as float) < "2010-01-01 00:00:00") AND (type='date')
or
SELECT * FROM mytable WHERE (CAST(value as float) < 17.5) AND (type='float')
What I like in this solution:
- it is easy to get all values for
What I does not like in this solution:
- I have to make a lot of (potentially) expensive cast
solution 2:
three tables:- text values :
key(text) | value(text)
- float values :
key(text) | value(float)
- date values :
key(text) | value(datetime)
What I like in this solution:
- no more casts
What i does not like in this solution:
- getting all values for has the cost of 3 selects and a union
- I find this less elegant
the question
So comes the question which of the solutions do you recommend to me? why? do you have another solution to propose?From my experience, storing the value as a string is perfectly acceptable, and the most simple/straightforward, provided that...
- You never have to search based on the value itself (unable to use index)
- You don't process vast (thousands/millions) or records at once
The option of three tables and a UNION
has a limitation you may not have spotted: the value
column in the UNIONed view can't be different data-types. They will either get implicitly cast (based on the type of the field in the first SELECT of the UNION), or the view will simply fail to be created.
My variation on that idea would be to have three value fields in place of one. You can then add a seperate index for each, have just one table, avoid the need for CASTs and various other benefits. All at the expense of a little extra complexity in the table (I'd prefer it here rather than the queries using it anyway), but also at a significant increase in space used.
Key, Type, Value_INT, Value_FLOAT, Value_DATE
EDIT:
Oh, and a last option. Don't store the values as strings, but store them as floats. All three data-types that you mentioned can be stored as floats, allowing the use of an Index.
SELECT * FROM mytable WHERE (type='date') AND (value < CAST("2010-01-01 00:00:00" AS FLOAT))
or
SELECT * FROM mytable WHERE (type='float') AND (value < 17.5)
or
SELECT * FROM mytable WHERE (type='int') AND (value < 17)
精彩评论