SQLite vs. file-based data storage?
Say I have a class like so:
class User
  attr_accessor :name, :age
  def initialize(name, age)
    @name, @age = name, age
  end
end
Now, would it be faster to save the users as marshalled instances of the User class in individual files, or to use an SQLite database with an ORM? What are the disadvantages of a file-based data store?
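For concreteness, here is a rough sketch of the file-per-user approach I have in mind (the users/ directory and the file naming are just illustrative):

require 'fileutils'

FileUtils.mkdir_p('users')   # one directory holding one file per user

user = User.new('alice', 30)

# Write one marshalled instance per file, keyed by name.
File.open("users/#{user.name}.marshal", 'wb') { |f| f << Marshal.dump(user) }

# Read a single user back.
loaded = Marshal.load(File.binread('users/alice.marshal'))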
Here are the results of a benchmark, performed on an SSD. Interpret them as you will. For very simple queries and data, marshalling and loading the entire dataset into memory appears faster:
Rehearsal ---------------------------------------------------------------
Storing in DB                 0.080000   0.000000   0.080000 (  0.085909)
Marshalling to Disk           0.010000   0.000000   0.010000 (  0.004340)
Fetching marshal              0.000000   0.000000   0.000000 (  0.002288)
Fetching records from DB      5.530000   0.130000   5.660000 (  5.657053)
Fetching records from Array   0.350000   0.000000   0.350000 (  0.347798)
Find one record from DB       0.320000   0.020000   0.340000 (  0.336068)
Find one record from Array    0.260000   0.000000   0.260000 (  0.258766)
------------------------------------------------------ total: 6.700000sec

                                  user     system      total        real
Storing in DB                 0.080000   0.000000   0.080000 (  0.079717)
Marshalling to Disk           0.000000   0.000000   0.000000 (  0.002595)
Fetching marshal              0.000000   0.000000   0.000000 (  0.001466)
Fetching records from DB     10.830000   0.230000  11.060000 ( 11.041669)
Fetching records from Array   0.340000   0.000000   0.340000 (  0.335473)
Find one record from DB       0.320000   0.010000   0.330000 (  0.336917)
Find one record from Array    0.260000   0.000000   0.260000 (  0.255746)
Here's the benchmark:
require 'benchmark'
require 'sequel'

class User
  attr_reader :name, :age
  def initialize(name, age)
    @name, @age = name, age
  end
  def to_hash; {name:@name, age:@age}; end
end

db_array = 1000.times.map{ User.new "name#{rand 1000}", rand(1000) }
db_array << User.new( "unique", 42 )

DBFILE = 'users.db'; MARSHAL = 'users.marshal'
File.delete(DBFILE) if File.exists?(DBFILE)
DB = Sequel.sqlite(DBFILE)
DB.create_table(:users){ column(:name,:string); column(:age,:int) }
db_users = DB[:users]

Benchmark.bmbm do |x|
  x.report('Storing in DB'){ db_users.multi_insert db_array.map(&:to_hash) }
  x.report('Marshalling to Disk'){ File.open(MARSHAL, 'w'){ |f| f << Marshal.dump(db_array) } }
  x.report('Fetching marshal'){ db_array = Marshal.load(File.open(MARSHAL,'r'){ |f| f.read }) }
  query = db_users.select{ name > "name500" }
  x.report('Fetching records from DB'){ 1000.times{ query.all } }
  x.report('Fetching records from Array'){ 1000.times{ db_array.select{ |u| u.name > "name500" } } }
  x.report('Find one record from DB'){ 1000.times{ db_users[name:'unique'] } }
  x.report('Find one record from Array'){ 1000.times{ db_array.find{ |u| u.name == "unique" } } }
end
Storing marshalled objects has the disadvantage that the marshalled data may become incompatible with future changes to the Ruby class. So you would probably end up persisting basic structures like Hash or Array to the files, and once you are at that point, using SQLite is the better option.
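For example, a rough sketch of persisting plain hashes and rebuilding instances on load (using the User class from the question; the file name is just illustrative):

users = [User.new('alice', 30), User.new('bob', 25)]

# Dump plain hashes rather than User instances, so old dumps still load
# after the User class changes.
File.open('users.marshal', 'wb') do |f|
  f << Marshal.dump(users.map { |u| { name: u.name, age: u.age } })
end

# Rebuild instances from the plain hashes on load.
hashes = Marshal.load(File.binread('users.marshal'))
users  = hashes.map { |h| User.new(h[:name], h[:age]) }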
I think it depends on which operations you want to perform. If you only ever read everything from the file, without searching for or selecting individual instances, a file is fine: you just read it and rebuild the instances.
If you need any kind of access other than reading everything sequentially, use a database; databases are programs optimized to read and write data as fast as possible while also supporting exactly that kind of operation.
There is one more small point to consider: I don't know exactly how Ruby reads and parses files (loading from a file might be slower because of the parsing), so you could ask about that on a Ruby forum, but I expect that reading a file from beginning to end will not be a problem.
I would use SQLite with the DataMapper ORM ( http://datamapper.org/ ).
I think that storing your users in individual files would be difficult to manage. Querying an SQLite database with DataMapper is very simple.
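As a minimal sketch of what that looks like (adapted from the DataMapper getting-started docs; the database path and properties are just illustrative):

require 'data_mapper'   # meta-gem pulling in dm-core, dm-migrations and the sqlite adapter

DataMapper.setup(:default, "sqlite3://#{Dir.pwd}/users.db")

class User
  include DataMapper::Resource
  property :id,   Serial
  property :name, String
  property :age,  Integer
end

DataMapper.finalize
DataMapper.auto_upgrade!              # create/alter the users table to match the model

User.create(name: 'unique', age: 42)
User.first(name: 'unique')            # fetch one record
User.all(:age.gt => 21)               # simple query across all users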