开发者

Why use MySQL over flatfiles?

A friend and I were debating about whether he should use MySQL or a flatfile database for his website's backend. I told him to go with MySQL because it was structured, held records well, and was consistent. He开发者_JS百科 on the other hand said that he would rather go for speed. Reading files is a lot quicker than connecting to MySQL and it made me wonder whether he was right. For example, why not just create a folder for each table, like so: users/ groups/ posts/, within the folders have the files named by ID (1, 2, 3) and then for the data use a format like so: username: John\npassword: e2fc714c4727ee9395f324cd2e7f331f\nemail: example@example.com?

In other words, what are the advantages of MySQL over flatfiles?


In other words, what are the advantages of MySQL over flatfiles?

MySQL offers indexes and joins (for execution performance), transactions (for data integrity) and SQL (for development performance).

It your project involves just a 3-line self-sufficient text file, you don't need MySQL.


Reading files is a lot quicker than connecting to MySQL and it made me wonder whether he was right.

Hobcobbles. A database like mySQL stores its data in files as well, but features tons of optimizations, most obviously its indexing capabilities, allowing for huge performance increases compared to reading (or writing) a big flat file.

Flat files may be faster in certain very limited cases, but a database engine uses the experience of generations of developers working on making data access faster, and more reliable. Just think about race conditions and locking when two instances of your script try to write data into the database, for example.

If the amount of data used exceeds a few lines in a CSV file - or doesn't happen to be easily manageable in files like for example the pages of a Wiki - go with a database. It adds a layer of complication, but saves you a lot of headache.

Just think about doing a SELECT * FROM posts WHERE MONTH(post_date) = "2010-03-10" on a flat file quickly and what is necessary to write from scratch to achieve that.


What please is a "flatfile database"? A flat file is a flat file - nam it like this. SAying it is a flat file database makes you think it magically has some of the features of a database - which flat files per definition do not have.

what are the advantages of MySQL over flatfiles?

Skip MySQL here - the main question you ask is "why use a database at all".

I suggest you look into perfomance (sewarch operations - indices are there for a reason) and look up the term "ACID conditions" to get an even vague idea what a database actually DOES.

Flat files do not give you any guarantee, and decades of developers have prooven all the problems they have over and over again.


There is also the matter of security. If you don't properly protect the flat files, they can be much more easily exposed. Especially if you are storing user information, there is no barrier to entry around the flat files.

Assuming your website or application grows vertically, flat files also do not scale, because the larger the flat files get the longer they take to read.

And lastly, using flat files when it's already so easy to use databases is quite simply a hack. It's not doing things the "right way" in that EVERYONE ELSE uses databases, so I would argue the opposite: Why use flat files over MySQL? Is someone else coming into to maintain your application after the fact going to understand or agree with your decision to use flat files?


We need a bit more of context.

If your friend is reading complete pages (Stored ad "blobs" in the DB) then yes, using MySql is not much of help. If he has granular data (including, I don't know, blog posts, newsitems, images with metadata, order details) then unless the site is very skimpy and very static, a file-based approach will soon become too limited.

You proposed solution has two big drawbacks:

Using folders/filenames is the same as having just one index on each table (in this case, the filename) so searching for any other criteria will take ages. Not to mention the fact that having lots of files in a single directory will start taxing the OS.

On top of that, security-by-filename is a bit of a security risk, even if you use the hashed pwd as part of the URL.

I did some filesystem-based medium-size applications in the past (due to mismanaged requirements we couldn't use a DB) and this is fun, but really very limiting as soon you go over a few hundreds files. And even with small numbers, you have to start pulling tricks from the start to have any hope to keep the thing working.


Just an example: consider that you have 1,000,000 customers, with address info and you need to search and set of customers that live in NY. If you stored each customer in separate file, than you would need to read all 1,000,000 files and see if a customer belongs to the state. If you stored all records in one huge file - you would need to read whole file and iterate to find all customers from NY.

In both cases you loose.

In case of RDBMS like MySql - you would use so called "set" operation or SELECT statement, with addition of indexes, the engine would probably only read 10/20% more data than needed to find all customers from NY.

Hope this helps


Also, without storing all the user informaiton inside the Posts/ folder, how do you get all the posts written by John Doe (for example)? In SQL it's just a joined select statement. With flat files, you either have to store the information inside the actual post file, or write the code to perform the join & search operations on your own.


Data redundancy and a lack of atomicity are big problems in flat file databases which manifest exponentially the more data it is required to hold and introduce latency in queries and other problems such as update/ delete/ insertion anomalies.

The Relational Data Model with Normalisation helps to negate these problems, by ensuring atomicity and that each record is uniquely identifiable (First Normal Form), that each field in a table is functionally dependant on the primary key (Second Normal Form) and that non-key fields don't share transitive dependencies on other fields in the table (Third Normal Form).

The Relational Data Model is by no means the only way of doing it, perhaps not even the best, but it certainly attempts to address the problems of query latency and anomalies inherent in flat files.


Mysql has some advantage compare with flatfile, file structure is poor for query , but CRUD in file is rapid than mysql, you can use no-sql databases such as mongo db to have better structure and more speed , there are some difference between sql and no-sql databases but I think its better to use no-sql db instead of flatfile, also be aware if you work on bigdata no-sql db is better than sql surely..

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜