Testing, deploying and updating a PHP site
Quite soon I will be releasing my website project and, before it sees the light of day, I'd like to prepare some kind of "updating model". I am using Debian with Apache, PHP 5.3 and MySQL (the latest version, I think), installed separately rather than as one package.
I came up with a simple idea for carrying out the process, so please take a look and point out the weak spots:
- Testing - I read somewhere that it is common practice to deploy a test version of the site to a beta.mysite.com subdomain and carry out tests from there. The tests would use the same database as the live site. After the initial release, each new testing candidate would be a separate branch, merged upon deployment (although I still know nothing about branching).
- Deploying - If everything works correctly in the beta phase, copy the new version over the old version of the site.
Problems I can spot immediately:
- Tests would be fine as long as the database remains unchanged. What to do if it gets changed after all?
- I would like the update to be as transparent as possible, without any maintenance mode or anything like it, but I am afraid that copying the files will cause problems in that respect.
Is there anything else that could be a problem here, or is there a completely better way of doing this?
1. Testing
As a rule, you should not test in the production environment, and especially not against the production database, for three reasons:
Performance: testing may be CPU-intensive and waste other precious resources of your servers. Since you don't want to reduce the performance of the production environment during tests, you should not use production environment for that.
Data protection: you don't want to alter the data in your production database during tests. Not only does this limit the range of your tests (you will probably think twice before testing a bug that might destroy all the data related to your actual users, leaving that bug to be exploited later by a hacker), but you may also accidentally alter data by running untested code against your production database.
Security: if you work in a company with a team, the testing work is probably assigned to a dedicated tester. Giving this tester access to the production environment is not a good idea for security reasons.
1.1 Test environment
The test environment must be as similar as possible to the production environment. For example, if you test an application you ship for Windows XP with .NET Framework 3.5, you shouldn't test it under Windows 7 with .NET Framework 4, because things may work during tests and fail once your customers start using the application.
Example: once, I worked on an application which used NTFS Alternate Data Streams. Everything worked perfectly well both during development and during tests, where nobody thought about the fact that in 2009, FAT32 is still alive. Of course, once in production, a customer installed the application on a FAT32 formatted flash drive, and it crashed.
Note that this doesn't mean that you should use the same environment during development.
In the case of databases, things are different. The database engine and version must be the same, and the schema must be the same (same tables, same constraints, etc.), but the data should be different in most cases: the test database is filled with random data unrelated to the data you have in production.
1.2 Database: testing bottlenecks
For example, if a website was released recently, it does not have a large amount of data yet. If the database contains the list of registered users, you'll have only a few users at the beginning. On the other hand, you probably need to test not only whether your application works, but also whether its performance is acceptable and where the bottlenecks are. For that, you need to test with large amounts of data: you can have a few thousand users in the production environment and billions of randomly generated user accounts in the test environment.
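As a rough illustration of filling a test database with random accounts, here is a minimal Python sketch. It is not taken from the question's setup: it assumes an in-memory SQLite database as a stand-in for the real MySQL test database, and the `users` table, its columns and the helper names are hypothetical. The row count is kept small so the sketch runs quickly.

```python
import random
import sqlite3
import string
import time


def random_user(rng):
    """Return a (name, email) pair made of random lowercase letters."""
    name = "".join(rng.choices(string.ascii_lowercase, k=10))
    return name, name + "@example.com"


def fill_test_database(conn, count, seed=0):
    """Create a hypothetical users table and insert `count` random rows."""
    rng = random.Random(seed)  # fixed seed keeps test runs reproducible
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (name, email) VALUES (?, ?)",
        (random_user(rng) for _ in range(count)),
    )
    conn.commit()


def time_lookup(conn, name):
    """Return how long a lookup by name takes, in seconds."""
    start = time.perf_counter()
    conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()
    return time.perf_counter() - start


conn = sqlite3.connect(":memory:")  # stand-in for the real test database
fill_test_database(conn, 100_000)
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
elapsed = time_lookup(conn, "nonexistent")
print(f"{row_count} rows, lookup took {elapsed:.4f}s")
```

Running the same timing with and without an index on `name`, or with a larger row count, is one simple way to see where a query starts to become a bottleneck.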
1.3 Database: testing the correctness of the output
Another case where you want your test data to be different from the production data is checking for HTML injection and verifying that the output is correct. If you have an e-commerce website, you have an SQL table Products, and every product has a title which will be displayed on the website. In the test environment, you should have products with names like:
1. A very long name of a product goes here. Oh, this name is really huge!
2. javascript:alert('<a>&\é<%щ你好')
3.
4. '; delete * from Users
Those names will ensure that:
- Long names are displayed correctly,
- Names are escaped correctly, Unicode is supported and the encoding is correct,
- Empty names don't break the layout,
- SQL injection is prevented (in case the title can be changed through the website), without the SQL escaping showing up in the displayed output.
If you start filling the production database with this sort of thing, your users will probably think that your website is broken or hacked.
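The checks above can be sketched as a small script. This is only an illustration in Python, not the site's actual PHP code: it assumes an in-memory SQLite database in place of the real test database, and `render_title` is a hypothetical output helper. The test names are modelled on the list above, with the empty string standing for the empty name.

```python
import html
import sqlite3

# Test names modelled on the list above; "" is the empty-name case.
AWKWARD_NAMES = [
    "A very long name of a product goes here. Oh, this name is really huge!",
    "javascript:alert('<a>&\\é<%щ你好')",
    "",
    "'; delete * from Users",
]


def render_title(name):
    """Hypothetical output helper: escape the name for use inside HTML."""
    return "<h2>" + html.escape(name) + "</h2>"


conn = sqlite3.connect(":memory:")  # stand-in for the test database
conn.execute("CREATE TABLE Products (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT)")

# Placeholders keep the names out of the SQL text itself, so the values
# are stored unchanged and cannot be interpreted as SQL.
conn.executemany(
    "INSERT INTO Products (title) VALUES (?)",
    [(name,) for name in AWKWARD_NAMES],
)

stored = [row[0] for row in conn.execute("SELECT title FROM Products ORDER BY id")]
rendered = [render_title(title) for title in stored]

for line in rendered:
    print(line)
```

The names are stored exactly as entered and escaped only at output time, which is why no SQL escaping characters appear in the rendered titles.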
2. Deploying
Everything depends on the actual number of requests per second.
2.1 Small websites
If your website is small and not used too often, you really shouldn't worry about the update workflow. Copying changed source files can take less than a second, since those files are small. If it really bothers you, you can schedule the copying of files for a time of day when there are few visitors. For most small websites, updating source code in the middle of the night should be fine.
Also, there is nothing wrong with displaying a message that the server may go offline between 4 A.M. and 5 A.M. Working sometimes at night, I often see such messages, for example on my bank's website, where they probably need, for security reasons, to go completely offline once a week for maintenance or other scheduled tasks.
2.2 Server farms
If your website is large and has thousands of requests per second, you probably have a server farm. In this case, the update process shouldn't be a problem, since the servers will go offline one by one, update themselves, then return to the farm.
Using the same database is risky, as you might accidentally delete, modify or otherwise enter faulty information in the DB, which then ends up on your production site.
You are correct, copying the files over could take time and could potentially break the site while copying. A better option would be to use rsync so only changed files are copied, which would be quicker.
Even better, use symlinks. Point the web server at a symlink that points to the "production" directory. Do the same thing for the test/beta directory. When the testing is done, point the production symlink at the test directory, which then becomes your production directory.
You can name your production directories by date and/or version. If there is a problem, you can fall back by just repointing the symlink. Keep as many previous versions as you like. Swapping versions would be nearly instantaneous.
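A minimal sketch of the symlink swap, written in Python and assuming a POSIX system such as Debian. The directory names, the `current` link and the `point_symlink` helper are made up for illustration. Rather than deleting the old link and creating a new one, it creates the new link under a temporary name and renames it over the old one, so the path never disappears in between.

```python
import os
import tempfile


def point_symlink(link_path, target_dir):
    """Repoint link_path at target_dir by renaming a fresh symlink over it."""
    temp_link = link_path + ".new"  # hypothetical temporary name
    if os.path.lexists(temp_link):
        os.remove(temp_link)
    os.symlink(target_dir, temp_link)
    # On POSIX systems the rename replaces the old link in a single step.
    os.replace(temp_link, link_path)


# Demonstration in a throwaway directory; all names are made up.
root = tempfile.mkdtemp()
release_old = os.path.join(root, "release-2011-01-01")
release_new = os.path.join(root, "release-2011-02-01")
os.mkdir(release_old)
os.mkdir(release_new)
current = os.path.join(root, "current")  # the path the web server would serve

point_symlink(current, release_old)  # initial deployment
point_symlink(current, release_new)  # deploy the tested version
deployed = os.readlink(current)
point_symlink(current, release_old)  # fall back to the previous version
rolled_back = os.readlink(current)

print(deployed)
print(rolled_back)
```

Requests that are already being served while the link changes can still see a mix of old and new files, so this narrows the window for problems rather than removing it.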
Note, there are still possibilities for issues while repointing the symlink. The only way around this is to have multiple web servers in a load balanced setup.