Background processing of video uploads: what is an efficient way to do it in PHP?

I am developing a video upload site and I have run into a dilemma: uploaded videos need to be converted into the FLV format before they can be displayed to a visitor, but if I execute the command within the script, the script will hang for about 10-15 minutes while FFMPEG converts the video.

I had an idea to insert a record into the database indicating that the file needs to be processed, then use a cron job set to run every 5 minutes to select the records that need to be processed, process them, and update the database to show they have been processed. My worry is that this could start too many processes and crash the server under the strain. Has anyone got a solution to this, or a way to improve the process I have in mind?


Okay, this is what I now have in mind. The user uploads a video and a row is inserted into the database indicating that the video needs to be processed. A cron job set to run every 5 minutes checks what needs to be processed and what is currently being processed. Say I allow a maximum of five processes at one time: the script checks whether any video needs processing and how many videos are being processed, and if fewer than five are, it updates the record to show that the video is being processed. Once the video has been processed, it updates the record to show that, and the cron job starts again. Any thoughts?
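The check described above could be sketched roughly like this. It is only a sketch under assumptions: the videos table, its status column and the function names are hypothetical, and PDO is assumed to be in exception mode (the default since PHP 8).

```php
<?php
// Hypothetical schema: videos(id INTEGER PRIMARY KEY, path TEXT, status TEXT)
// where status is 'pending', 'processing' or 'done'.

// Claim as many pending videos as there are free slots and return their rows.
function claim_pending_videos(PDO $db, int $maxConcurrent = 5): array
{
    $db->beginTransaction();
    try {
        // How many conversions are already running?
        $running = (int) $db->query(
            "SELECT COUNT(*) FROM videos WHERE status = 'processing'"
        )->fetchColumn();
        $free = max(0, $maxConcurrent - $running);

        $claimed = [];
        if ($free > 0) {
            $select = $db->prepare(
                "SELECT id, path FROM videos WHERE status = 'pending' ORDER BY id LIMIT :n"
            );
            $select->bindValue(':n', $free, PDO::PARAM_INT);
            $select->execute();

            // The extra "AND status = 'pending'" guards against a second cron run
            // having claimed the same row in the meantime.
            $mark = $db->prepare(
                "UPDATE videos SET status = 'processing' WHERE id = :id AND status = 'pending'"
            );
            foreach ($select->fetchAll(PDO::FETCH_ASSOC) as $row) {
                $mark->execute([':id' => $row['id']]);
                if ($mark->rowCount() === 1) {
                    $claimed[] = $row;
                }
            }
        }
        $db->commit();
        return $claimed;
    } catch (Throwable $e) {
        $db->rollBack();
        throw $e;
    }
}

// Called once a conversion has finished.
function mark_video_done(PDO $db, int $id): void
{
    $db->prepare("UPDATE videos SET status = 'done' WHERE id = :id")
       ->execute([':id' => $id]);
}
```

The cron script would call claim_pending_videos() on each run, start a conversion for every returned row, and call mark_video_done() when each one completes. Because the number of rows in the 'processing' state is capped, the number of simultaneous ffmpeg processes is capped too.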


Gearman is a good solution for this kind of problem: it lets you dispatch a job instantly and have any number of workers (which may be on different servers) available to fulfill it.

To start with you can run a few workers on the same server, but if you start to run into load issues then you can just fire up another server with some more workers, so it's horizontally scalable.


If you're using PHP-FPM then you can make use of fastcgi_finish_request(), which is documented on PHP.net under FastCGI Process Manager (FPM):

fastcgi_finish_request() - special function to finish request and flush all data while continuing to do something time-consuming (video converting, stats processing etc.);
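As a rough illustration of that pattern, the sketch below sends the response, closes the connection when running under PHP-FPM, and only then runs the slow task. respond_then_run() is a hypothetical helper name, and the function_exists() guard is there because fastcgi_finish_request() is only defined under FPM.

```php
<?php
// Send the response to the visitor first, then keep working after the
// connection has been closed. Outside PHP-FPM the task simply runs inline.
function respond_then_run(string $responseBody, callable $slowTask): void
{
    ignore_user_abort(true); // keep running if the client disconnects
    set_time_limit(0);       // the conversion may take many minutes

    echo $responseBody;

    if (function_exists('fastcgi_finish_request')) {
        // Flushes all output and finishes the request; the script continues.
        fastcgi_finish_request();
    }

    $slowTask();
}
```

A call might look like respond_then_run('Upload received', function () { /* run ffmpeg here */ });. Keep in mind that the FPM worker stays occupied until the task ends, so many simultaneous uploads can still tie up the pool.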

If you're not using PHP-FPM, or want something more advanced, then you might consider using a queue manager like Gearman, which is perfectly suited to the scenario you're describing. The advantage of using Gearman over running a process with shell_exec is that you can see how many jobs are running, how many are left, and check their statuses. Scaling also becomes much easier, as it's now trivial to add job servers:

$worker->addServer("10.0.0.1"); 
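To make the Gearman suggestion a bit more concrete, here is a minimal sketch of both sides. It assumes the PECL gearman extension and a gearmand server on 127.0.0.1:4730; the job name convert_video, the payload fields and the helper function names are all made up for illustration.

```php
<?php
// Payload helpers: the job data travels through Gearman as a JSON string.
function encode_job(int $videoId, string $source, string $target): string
{
    return json_encode(['id' => $videoId, 'source' => $source, 'target' => $target]);
}

function decode_job(string $workload): array
{
    return json_decode($workload, true);
}

// Upload script: queue the job and return immediately with a job handle.
function dispatch_conversion(string $payload): string
{
    $client = new GearmanClient();
    $client->addServer('127.0.0.1', 4730);
    return $client->doBackground('convert_video', $payload);
}

// Worker script: run as many copies as the server can handle at once.
function run_worker(): void
{
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730);
    $worker->addFunction('convert_video', function (GearmanJob $job) {
        $p = decode_job($job->workload());
        $cmd = 'ffmpeg -y -i ' . escapeshellarg($p['source'])
             . ' ' . escapeshellarg($p['target']) . ' 2>&1';
        exec($cmd, $output, $exitCode);
        // Update the database row for $p['id'] here, based on $exitCode.
        return $exitCode === 0 ? 'done' : 'failed';
    });
    while ($worker->work()) {
        // Each iteration waits for one job and processes it.
    }
}
```

The number of worker processes you start is the concurrency limit, so there is no need to count running conversions by hand. The handle returned by doBackground() can be stored and later passed to GearmanClient::jobStatus() to see whether the job is still known to the server and running.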


I love this class (see the specific comment) in the PHP manual: http://www.php.net/manual/en/function.exec.php#88704

Basically, it lets you spin off a background process on *Nix systems. It returns a PID, which you can store in the session. When you reload the page to check on it, you simply recreate the ForkedProcess class with the saved PID, and you can check on its status. If it reports complete, the process should be done.

It doesn't allow for much error checking, but it's incredibly lightweight.


If you expect a lot of traffic you should seriously consider a dedicated server.

On a single server, you can use shell_exec along with the UNIX nohup command to start the process in the background and capture its PID.

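   // Start $Command in the background and return its PID.
   // $Command must already be shell-safe (build it with escapeshellarg()).
   function run_in_background($Command, $Priority = 0)
   {
       // stdout is redirected as well as stderr; otherwise shell_exec()
       // keeps waiting until the command has finished.
       if ($Priority)
           $PID = shell_exec("nohup nice -n " . (int) $Priority . " $Command > /dev/null 2>&1 & echo $!");
       else
           $PID = shell_exec("nohup $Command > /dev/null 2>&1 & echo $!");
       return (int) trim((string) $PID);
   }

   // ps prints a header line plus one line for the process if it exists.
   function is_process_running($PID)
   {
       exec("ps " . (int) $PID, $ProcessState);
       return (count($ProcessState) >= 2);
   }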

A full description of this technique is here: http://nsaunders.wordpress.com/2007/01/12/running-a-background-process-in-php/

You could perhaps put the list of PIDs in a MySQL table and then use your cron job every 5 mins to detect when a video is complete and update the relevant values in the database.
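The cron-side check could look something like the sketch below. The array of video id => PID pairs stands in for the rows you would read from the MySQL table, and both function names are hypothetical; the ps test is the same idea as is_process_running() above.

```php
<?php
// True while a process with this PID exists (ps prints a header line
// plus one line for the process).
function pid_is_running(int $pid): bool
{
    exec('ps ' . $pid, $lines);
    return count($lines) >= 2;
}

// $jobs maps a video id to the PID of its conversion process.
// Returns the ids whose process has exited, so the caller can update
// those rows in the database.
function finished_video_ids(array $jobs, ?callable $isRunning = null): array
{
    $isRunning = $isRunning ?? 'pid_is_running';
    $finished = [];
    foreach ($jobs as $videoId => $pid) {
        if (!$isRunning((int) $pid)) {
            $finished[] = $videoId;
        }
    }
    return $finished;
}
```

Note that a missing PID only tells you the process has ended, not that the conversion succeeded, so it is worth checking that the output file exists before marking the video as processed. PIDs can also be reused by the operating system over time.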


You can call ffmpeg using system(), redirect its output to /dev/null and append & to the command. The call then returns right away, and the conversion is effectively handled in the background.
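A small sketch of how such a command could be built safely, assuming a Unix shell. ffmpeg_background_command() is a hypothetical helper, and the paths in the usage line are placeholders.

```php
<?php
// Build an ffmpeg command line that runs detached from the PHP script.
// escapeshellarg() protects against spaces and shell metacharacters in
// user-supplied file names.
function ffmpeg_background_command(string $source, string $target): string
{
    return 'ffmpeg -y -i ' . escapeshellarg($source)
         . ' ' . escapeshellarg($target)
         . ' > /dev/null 2>&1 &';
}

// Usage (placeholder paths): returns immediately while ffmpeg keeps running.
// exec(ffmpeg_background_command('/uploads/in.avi', '/videos/out.flv'));
```

The output format is taken from the target file extension, so a target ending in .flv produces an FLV file. Since nothing limits how many of these get started, this approach on its own does not address the concern about too many conversions running at once.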


Spawn a couple of worker processes that consume messages from a message queue such as beanstalkd. This way you can control the number of concurrent tasks (conversions), and you also don't have to pay the price of spawning processes, because the workers keep running in the background.

I think it would be even faster if you wrote the worker in C and used Redis as your message queue. Redis has a very good C client library named Hiredis. I don't think this would be insanely difficult to accomplish.
