Product analytical data in TXT files (using YAML)
I am currently developing e-commerce software using PHP/MySQL for a big company. There are two options for me to get some specific data:
- DB (for large data sets, such as PRODUCTS, CATEGORIES, ORDERS, etc.)
- TXT (using YAML, for analytical data and some options)
For instance, when a user goes to the product details page, I need to read these TXT files:
- Product summary file (product_hit, quantity_sold, etc.), approximately 90 KB at most
- Language and settings file (such as company_name, translations for the template), approximately 300 KB at most
- Maybe one more file (I don't know right now), assume 100 KB
I want to do it this way because the data is easily readable by humans and portable between programming languages. In addition, if I use the DB, I need to join a couple of tables, whereas these files already hold the data together in one place.
My TXT file looks like this (YAML):
product_id: 1281
quantity_sold: 12 #item(s)
hit: 1105
hit_avarage: 92 #hit/quantity_sold
vote: 2
...
But I am still not sure about speed and performance. Is using TXT files a good idea? Should I really go this way instead of using the DB?
Since you can't partially include and parse a YAML file, you'll have to parse the file as a whole each time you need a value from it, which means a significant performance hit. You can compare this to selecting all rows from a database and then looping over them to find the one you're looking for, instead of just adding a WHERE condition. So yes, a database is much faster for what you ask.
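To make the comparison concrete, here is a minimal sketch in Python (not your PHP stack, just for illustration). It assumes one text file that holds several product records separated by `---`, and it uses a tiny hand-written parser for flat `key: value` lines instead of a real YAML library. The table name `product_summary` and the sample values for product 1280 are made up for the example.

```python
import sqlite3

# Hypothetical flat "key: value" text, one block per product, standing in
# for a YAML file. Real YAML would need a proper parser.
SUMMARY_TEXT = """\
product_id: 1280
quantity_sold: 3
hit: 240
---
product_id: 1281
quantity_sold: 12
hit: 1105
"""

def parse_all(text):
    """Parse every record in the text; there is no way to skip ahead."""
    records = []
    for block in text.split("---\n"):
        record = {}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            if key.strip():
                # Drop a trailing "#comment" before converting to int.
                record[key.strip()] = int(value.split("#")[0])
        records.append(record)
    return records

def find_in_file(text, product_id):
    # The whole text is parsed before the one wanted record is picked out.
    for record in parse_all(text):
        if record["product_id"] == product_id:
            return record
    return None

def find_in_db(conn, product_id):
    # The database is asked for exactly one row via a WHERE condition.
    row = conn.execute(
        "SELECT product_id, quantity_sold, hit "
        "FROM product_summary WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    return dict(zip(("product_id", "quantity_sold", "hit"), row)) if row else None

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_summary ("
    "product_id INTEGER PRIMARY KEY, quantity_sold INTEGER, hit INTEGER)"
)
conn.executemany(
    "INSERT INTO product_summary VALUES (:product_id, :quantity_sold, :hit)",
    parse_all(SUMMARY_TEXT),
)

print(find_in_file(SUMMARY_TEXT, 1281))
print(find_in_db(conn, 1281))
```

Both calls return the same record; the difference is that the file version touches every record to find one, while the query only asks for the row it needs.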
Please do take a look at document-oriented databases though; you don't necessarily have to use a relational database. In fact, looking at your example YAML file, I think a NoSQL database would be a better alternative.
Cheers.
I love YAML and think it's great for smaller amounts of data, but the dimensions you mention are better dealt with using a database. It's faster, and data can be indexed - in a file-based scenario, you would have to walk through the whole file to find something.
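As a rough illustration of what indexing buys you, here is a small Python/SQLite sketch. The table `product_summary`, the index name and the generated rows are assumptions for the demo, not your schema. It asks SQLite for its query plan before and after creating an index on `hit`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_summary ("
    "product_id INTEGER PRIMARY KEY, quantity_sold INTEGER, hit INTEGER)"
)
# Made-up rows, only there so the table is not empty.
conn.executemany(
    "INSERT INTO product_summary VALUES (?, ?, ?)",
    [(i, i % 50, i * 7 % 2000) for i in range(1, 5001)],
)

QUERY = "SELECT product_id FROM product_summary WHERE hit = ?"

def query_plan(conn, sql, params):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

# Without an index, the only option is to scan the whole table.
plan_without_index = query_plan(conn, QUERY, (1105,))

conn.execute("CREATE INDEX idx_product_summary_hit ON product_summary (hit)")

# With the index, SQLite can search it instead of scanning every row.
plan_with_index = query_plan(conn, QUERY, (1105,))

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The exact plan wording varies between SQLite versions, but the first plan should mention a scan and the second should name the index. A plain text file has no equivalent of the second case.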
Use the YAML approach. The data structure suggests that they are tantamount to fixed data / configuration settings. And if you cannot reasonably do the calculations within the database, then don't attempt to.
You could, however, convert your fixed data from YAML to CSV and import it into a temporary table in the database, if and only if calculating everything there is feasible.
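A minimal sketch of that CSV route, in Python with SQLite for brevity. It assumes the YAML has already been parsed into a list of dicts. The second record, the field list and the name `tmp_product_summary` are hypothetical.

```python
import csv
import io
import sqlite3

# Assumed to be the already-parsed content of the YAML files.
records = [
    {"product_id": 1281, "quantity_sold": 12, "hit": 1105},
    {"product_id": 1282, "quantity_sold": 4, "hit": 300},
]
FIELDS = ["product_id", "quantity_sold", "hit"]

def to_csv(records):
    """Serialize the records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

def load_temp_table(conn, csv_text):
    """Import CSV text into a temporary table that lives for this connection."""
    conn.execute(
        "CREATE TEMP TABLE tmp_product_summary ("
        "product_id INTEGER, quantity_sold INTEGER, hit INTEGER)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    # CSV values arrive as strings; the INTEGER column affinity converts them.
    conn.executemany(
        "INSERT INTO tmp_product_summary "
        "VALUES (:product_id, :quantity_sold, :hit)",
        reader,
    )

conn = sqlite3.connect(":memory:")
load_temp_table(conn, to_csv(records))

# Once the data is in a table, the calculation can be done in SQL
# (integer division here, so the result is truncated).
hits_per_sale = conn.execute(
    "SELECT product_id, hit / quantity_sold "
    "FROM tmp_product_summary ORDER BY product_id"
).fetchall()
print(hits_per_sale)
```

In MySQL the import step would more likely be a bulk load into a temporary table, but the shape of the approach is the same.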
I cannot say anything about performance. Technically, reading file data is as slow as having the database read disk sectors, and the difference between YAML parsing and column splitting might not be significant. You'll have to test that.
YAML is a 'human-readable data serialization format'.
Serialization is the process of converting in-memory structures into a format that can be written, possibly transmitted, and read back into in-memory structures.
Database management systems are programs that help control data management from creation through processing, including
- security
- scalability
- concurrency
- data integrity (atomicity, consistency, isolation and durability)
- performance
- availability
YAML does not provide tools or an integrated environment that take care of the above. If you want to use it as a principal data store, you either need to isolate this particular scenario from all of the above challenges, or reinvent the wheel to a certain extent, sooner or later.
I would imagine that no "e-commerce system for a big company" would want to sacrifice any of the above listed features for human readability.
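As one small example of what you would otherwise have to rebuild yourself, here is a sketch of atomicity using Python's built-in SQLite module. The table `product_summary`, the function `record_sale` and the simulated failure are hypothetical. Two counters are updated together, and a failure between the two updates leaves neither applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_summary ("
    "product_id INTEGER PRIMARY KEY, quantity_sold INTEGER, hit INTEGER)"
)
conn.execute("INSERT INTO product_summary VALUES (1281, 12, 1105)")
conn.commit()

def record_sale(conn, product_id, quantity, fail_midway=False):
    """Update both counters in one transaction (all or nothing)."""
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute(
            "UPDATE product_summary SET quantity_sold = quantity_sold + ? "
            "WHERE product_id = ?",
            (quantity, product_id),
        )
        if fail_midway:
            raise RuntimeError("simulated crash between the two updates")
        conn.execute(
            "UPDATE product_summary SET hit = hit + 1 WHERE product_id = ?",
            (product_id,),
        )

def current_state(conn, product_id):
    return conn.execute(
        "SELECT quantity_sold, hit FROM product_summary WHERE product_id = ?",
        (product_id,),
    ).fetchone()

try:
    record_sale(conn, 1281, 1, fail_midway=True)
except RuntimeError:
    pass
state_after_failure = current_state(conn, 1281)  # first update was rolled back

record_sale(conn, 1281, 1)
state_after_success = current_state(conn, 1281)  # both updates applied

print(state_after_failure, state_after_success)
```

With a plain YAML file, the same guarantee would need hand-written locking, temporary files and recovery logic, which is the wheel-reinvention mentioned above.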