File vs database for storage efficiency in chat app
I'm working on a simple AJAX chat add-on for my PHP application so I can provide real-time support to my users. I'm currently using a MySQL database storing the text, timestamp and user_id of the person who is chatting. I got to thinking about how I 开发者_JAVA技巧could optimize my chat and I got to thinking about removing the need for the SQL database.
My question, would it be more efficient to use fwrite()
to append extra data to a PHP file to store the same information rather than creating an SQL connection to retrieve new posts to the chat? I know how I would accomplish this effectively, I'm just trying to figure out which way would be more efficient.
I've looked a little bit at SQLite as well; would that be a better alternative than using a MySQL database?
Database Management Systems (DBMS) exists because it's not as easy as it seems to store and access data the right way.
Storing data in a file implies access concurrency issues. When the file will grow bigger, you will have to face important memory usage or write a lot of code to load just what you need. It will also be quite hard to do basic operations like filtering (SQL WHERE
clause) or updating a row. BTW, changing the data structure promises to be error-prone. I simpler words: you will have to write a lot of code and face a lot of bugs.
IMO, not using any kind of DBMS is recreating the wheel. However choosing the right one is important.
Ill throw my hat in the ring here even though this is an old question. For speed, reliability, and ease of use a DB is the obvious easy choice... With one major caveat that a lot of people are overlooking, and that is most shared hosts (the most common form of web hosting) only allow 15 or so connections at once, even VPS's usually only allow 100-200, dedicated being 500 or more. What that means is if you have (n) users pooling every (s) seconds those connections are going to get eaten up fast, even faster if you are also running any sort of CMS. Being in the middle of developing my own chat room code on a VPS I am also facing these issues myself.
So far my method has been this.
- Make sure to pass a lastMessageReceived variable to limit the response.
- If a public chatroom pass a timestamp filter as well as the above
- if at all possible use a DB caching engine such as MySQLnd with query caching enabled and TTL set to whatever your pooling rate is.
- Don't get crazy with your pooling rate 1-2 second intervals may seem neat and snappy, but it'll kill your connection count. Dropping that to 5s or even more wont really make a huge difference and users probably wont notice, and your server load will be much lighter. Even consider a variable pooling rate that raises itself during high load.
- Write your ajax to use timeout instead of interval for its pooling and place the timeout call in the ajax success callback, this prevents the requests from stacking up during peak ussage.
- And the biggest one, if using a shared chat room with many users, write your own code to cache the SQL query into a json file and serve that to ajax requests, and write some custom TTL code to check its age and re-populate it as needed during requests, CRON would be great here if your host allows. Age checking a file and redirecting an AJAX request to it is a higher level function with very little server overhead especially compared to querying a DB. And don't parse the file in PHP looking to filter out old messages, store the file with the first message in the filename such as
chat_243.json
and save it as already formatted json, then just serve up the entire file if a request comes into the php withlastMessageReceived = 243
. Since this will create several files you will need a function to clean up files older than (m) minutes, but this is also light duty work for the server.
There are also options like DB engines designed for chat and sockets (node.js), but those require more server tweaking than typical hosting accounts allow, for my purposes I have been writing my chat-room keeping in mind the idea that it may get deployed to a shared server at some point.
Here is the code I am using currently, and it has pretty much allowed me to expand my chat to an unlimited number of connection (within the servers bandwidth)
$cacheFile = 'cache/chat_'.$_GET['last'].'.json';
if (file_exists($cacheFile) && filemtime($cacheFile) + QUERY_REFRESH_RATE > time())
{
readfile($cacheFile);
} else {
require_once("../../../../wp-load.php");
$timestampMin = gmdate("Y-m-d H:i:s", (time() - 7200));
$sql= "/*qc=on*/" . "SELECT * FROM ". DB_TABLE ."chat_posts WHERE ID > ". $_GET['last'] . " AND timestamp > '".$timestampMin."' ORDER BY ID;";
$posts = $wpdb->get_results($sql);
$json = json_encode($posts);
echo $json;
file_put_contents($cacheFile,$json);
}
If you remove or archive old conversations, MySQL would be good choice. Dont forget place indexes on date and user, or session id, so that you can retrieve them fast.
I would stick with MySQL as it is more suited for multiple access from a web application than a simple file would be.
First, use persistent connections.
In using it, I would suggest using PDO if you are not already. (PDO option for persistent connections is PDO::ATTR_PERSISTENT => true)
Note: The default setting in PHP is to simply emulate a prepared statement. You want it to be a TRUE prepared statement so set PDO::ATTR_EMULATE_PREPARES to 0. (http://bugs.php.net/bug.php?id=54638)
Also: Prepared statements will not use the MySQL cache in versions before 5.1.17 and will not cache variable prepared statements prior to 5.1.21 so make sure you have a recent version of MySQL.
PDO provides more than a potential performance boost. You should look into if you have not already. More on PDO here: http://ca2.php.net/manual/en/book.pdo.php
Since the others have covered MySQL vs Text, I'd like to give you an alternative.
These kinds of applications are ideal for noSQL solutions like MongoDB. Since most likely you'll be polling the server over an interval, something with very fast read capability is a good idea.
You could also use something like Node.JS to tie it all together.
I had a similar consideration, and wrote this generic, simple and high performance solution, that you might like Rocket-store
Include the one file and the you can start inserting records right away.
It has three major methods: put, get, and delete (and some special methods and options settings)
Example:
// collection/table = "cars", key = "Mercedes"
$rs->post("cars", "Mercedes-Benz GT R", ["owner" => "Lisa Simpson",reg: "N3RD"]);
// Get all records from "cars" collection
$result = $rs->get("cars", "Mercedes*");
// Delete records in collection "cars" where keys match "*cede*"
$rs->delete("cars", ""*cede*"");
Rocket-store relies on the underlying filesystem cash, to optimize for speed, and can operate with millions of records.
Its easily 100 times faster than a relational database, on inserts. Typically 40.000 inserts/sec. You don't need to install anything but that one include file..
You get simple sequences, auto increment and other minor conveniences, but that about it. Nothing fancy.
I like it a lot, when ever I need a simple storage solutions, but I'm also very biased ;)
精彩评论