
Critical code parts in a PHP application?

Okay, in my head this is somewhat complicated and I hope I can explain it. If anything is unclear please comment, so I can refine the question.

I want to handle user file uploads to a third server.

So we have

  • the User
  • the website (the server the website runs on)
  • the storage server (which receives the file)

The flow should be like:

  1. The website requests an upload URL from the storage cloud's gateway that points directly to the final storage server (something like http://serverXY.mystorage.com/upload.php). Along with the request, a "target path" (website-specific and globally unique) and a redirect URL are sent.

  2. The website generates an upload form with the storage server's upload URL as its target; the user selects a file and clicks the submit button. The storage server handles the POST request, saves the file to a temporary location (which is '/tmp-directory/' . sha1(target-path-from-above)) and redirects the user back to the redirect URL specified by the website. The "target path" is passed along as well. (A minimal sketch of such a handler follows this list.)

  3. I do not want any "ghost files" to remain if the user cancels the process or the connection gets interrupted or something! Entries in the website's database that have not been correctly processed in the storage cloud, and are therefore broken, must also be avoided. That is the reason for this and the next step.

  4. These are the critical steps:

    • The website now writes an entry to its own database and issues a RESTful request to the storage API (signed; the website has to authenticate with a secret token) that
    • copies the file from its temporary location on the storage server to its final location (this should be fast because it's only a rename)
    • the same REST request also inserts a database row into the storage network's database, with the website's ID as owner (a sketch of such a signed request follows this list)
  5. All files in the tmp directory on the storage server that are older than 24 hours get deleted automatically.
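To make this more concrete, here is a rough sketch of what I imagine the storage server's upload.php (step 2) doing. I'm assuming here that the gateway embeds the target path and the redirect URL as query parameters in the upload URL it hands out; that detail is my own placeholder:

    <?php
    // upload.php on the storage server -- rough sketch of step 2, not production code.
    // Assumption: the gateway put target_path and redirect into the upload URL's query string.

    $targetPath  = isset($_GET['target_path']) ? $_GET['target_path'] : '';
    $redirectUrl = isset($_GET['redirect'])    ? $_GET['redirect']    : '';

    if ($targetPath === '' || !isset($_FILES['file']) || $_FILES['file']['error'] !== UPLOAD_ERR_OK) {
        http_response_code(400);
        exit('Invalid upload');
    }

    // Temporary location derived from the target path, exactly as described above.
    $tmpFile = '/tmp-directory/' . sha1($targetPath);
    move_uploaded_file($_FILES['file']['tmp_name'], $tmpFile);

    // Send the user back to the website and pass the target path along.
    header('Location: ' . $redirectUrl . '?target_path=' . urlencode($targetPath));
    exit;

And here is roughly how I picture the signed commit request from step 4 on the website's side. The /commit endpoint, the parameter names and the HMAC signing are just my sketch, not a finished API:

    <?php
    // Website-side sketch of the signed "commit" request from step 4.

    function commitUpload(string $apiUrl, string $websiteId, string $secret, string $targetPath): bool
    {
        $params = [
            'website_id'  => $websiteId,
            'target_path' => $targetPath,
            'timestamp'   => time(),
        ];
        // Sign the request so the storage API can authenticate the website.
        $params['signature'] = hash_hmac('sha256', http_build_query($params), $secret);

        $ch = curl_init($apiUrl . '/commit');
        curl_setopt_array($ch, [
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => http_build_query($params),
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT        => 30,
        ]);
        $response = curl_exec($ch);
        $status   = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        return $response !== false && $status === 200;
    }

On the storage side, the commit handler would essentially rename('/tmp-directory/' . sha1($targetPath), $finalPath) and insert the ownership row.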

If the user closes the browser window or the connection gets interrupted, the program flow on the server gets aborted too, right? Only destructors and registered shutdown functions are executed, correct?

Can I somehow make this code part "critical" so that the server, once it enters this part, executes it to the end regardless of whether the user aborts the page load or not?

(Of course I am aware that a server crash or an error may interrupt at any time, but my concerns are about the regular flow now)

One idea of mine was to have a flag and a timestamp in the website's database that marks the file as "completed", and to check in a cron job for old incomplete files and delete them from the storage cloud and then from the website's database, but I would really like to avoid this extra field and procedure.
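Roughly, that cleanup cron job would look something like this; the uploads table, its completed/created_at columns and the deleteFromStorage() helper are placeholders for illustration only:

    <?php
    // cleanup.php -- run via cron, e.g. hourly. Sketch only, with an assumed schema.

    $pdo = new PDO('mysql:host=localhost;dbname=website', 'user', 'pass');

    // Find uploads that never got their "completed" flag within 24 hours.
    $stmt = $pdo->query(
        'SELECT id, target_path FROM uploads
         WHERE completed = 0 AND created_at < NOW() - INTERVAL 24 HOUR'
    );

    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        // deleteFromStorage() would be a hypothetical signed REST delete call.
        if (deleteFromStorage($row['target_path'])) {
            $pdo->prepare('DELETE FROM uploads WHERE id = ?')->execute([$row['id']]);
        }
    }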

I want the storage api to be very generic and use it in many other future projects.

I had a look at Google Storage for Developers and Amazon S3.

They have the same problem, or even worse. In Amazon S3 you can "sign" your POST request, so the file gets uploaded by the user under your authority, is saved and stored directly, and you have to pay for it. If the connection gets interrupted and the user never comes back to your website, you don't even know about it. So you have to store all the upload URLs you sign, check them in a cron job, and delete everything that hasn't "reached its destination".

Any ideas or best practices for that problem?


If I'm reading this correctly, you're performing the critical operations in the script that is called when the storage service redirects the user back to your website.

I see two options for ensuring that the critical steps are performed in their entirety:

  1. Ensure that PHP is ignoring the connection status and runs the script through to completion using ignore_user_abort() (see the sketch after this list).
  2. Trigger some back-end process that performs the critical operations separately from the user-facing scripts. This could be as simple as dropping a job into the at queue if you're using a *NIX server (man at for more details) or as complex as having a dedicated queue management daemon, much like the one LrdCasimir suggested.
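Here's a minimal sketch of Option 1 for the script the storage service redirects the user back to; the two helpers stand in for your critical database write and REST call:

    <?php
    // Redirect target on the website -- Option 1 sketch.

    // Keep running even if the client disconnects, and don't let the
    // execution time limit cut the critical section short.
    ignore_user_abort(true);
    set_time_limit(0);

    $targetPath = isset($_GET['target_path']) ? $_GET['target_path'] : '';

    // --- critical section: must run to completion ---
    writeUploadRow($targetPath);        // hypothetical helper: insert the row in the website's DB
    commitUploadToStorage($targetPath); // hypothetical helper: signed REST call to the storage API
    // --- end critical section ---

    echo 'Upload completed.';

Keep in mind that PHP only notices an aborted connection when it actually tries to send output to the client, so keeping output out of the critical section helps as well; ignore_user_abort() just makes the behaviour explicit.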

Problems like this that I've faced have all had pretty time-consuming processes associated with their operation, so I've always gone with Option 2 to provide prompt responses to the browser and to free up the web server. Option 1 is easy to implement, but Option 2 is ultimately more fault-tolerant, as updates stay in the queue until they can be successfully communicated to the storage server.

The connection handling page in the PHP manual provides a lot of good insights into what happens during the HTTP connection.


I'm not certain I'd call this a "best practice", but here are a few ideas on a general approach for this kind of problem. One, of course, is to let the REST transaction with the storage server take place asynchronously, either through a daemonized process that listens for incoming requests (by watching a file for changes, a socket, shared memory, a database, or whatever you think is best for IPC in your environment) or through a very frequently running cron job that picks up and delivers the files. The benefit of this is that you can deliver a quick response to the user who uploaded the file, while the background process can try, and try again, if there's a connectivity issue with the REST service. You could even go as far as having some AJAX polling in place so the user gets a nice JS message displayed when the REST process completes.
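As a very rough illustration of the cron variant, assuming a pending_uploads queue table on the website side and a deliverToStorage() helper that wraps your signed REST call:

    <?php
    // deliver_uploads.php -- run frequently via cron; delivers queued uploads to the
    // storage API. Sketch only, with an assumed schema and a hypothetical helper.

    $pdo = new PDO('mysql:host=localhost;dbname=website', 'user', 'pass');

    $rows = $pdo->query(
        'SELECT id, target_path FROM pending_uploads ORDER BY id LIMIT 50'
    )->fetchAll(PDO::FETCH_ASSOC);

    foreach ($rows as $row) {
        // deliverToStorage() is a hypothetical wrapper around the signed REST request.
        if (deliverToStorage($row['target_path'])) {
            // Delivered: remove it from the queue (or flag it as completed instead).
            $pdo->prepare('DELETE FROM pending_uploads WHERE id = ?')->execute([$row['id']]);
        }
        // On failure the row simply stays queued and is retried on the next run.
    }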
