Load-balancing problem while implementing async HTTP with REST

I've read about asynchronous HTTP with REST: you POST to initiate a long-running task, the server returns a resource, and you then poll that resource to get the task's status. I wonder what happens if I have two load-balanced machines, where one receives the initial POST and the other a subsequent GET. The second will not be aware of what the first one started unless I use common storage to keep the task's state. How can I avoid that if I want to keep the state only on the client?


When you POST to initiate the long-running task, you really should return a URI in the Location header that points to the running task, e.g.

POST /LongTasks
=>
201 Created
Location: /RunningTask/233


GET /RunningTask/233
=> 
Content-Type: text/plain

InProgress

At that point, you have a URL for the resource that represents the running task, and the load-balancing issue is the same as for any other resource. You cannot create state on the server that is accessible to the client but has no URI.
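From the client's side, the exchange above might be driven by a small polling loop. This is only a sketch: `poll_task`, the injected `fetch` callable (anything that takes a URL and returns the response body), the polling interval, and the terminal status "Done" are all assumptions for illustration. Only "InProgress" comes from the example above.

```python
import time
from typing import Callable


def poll_task(location: str,
              fetch: Callable[[str], str],
              interval: float = 1.0,
              max_attempts: int = 30) -> str:
    """Poll the task URI until its status is no longer "InProgress".

    `fetch` is a hypothetical hook: it receives the URL taken from the
    Location header and returns the plain-text body of the GET response.
    """
    for _ in range(max_attempts):
        status = fetch(location).strip()
        if status != "InProgress":
            return status  # any other value is treated as a final status
        time.sleep(interval)
    raise TimeoutError(
        f"task at {location} still in progress after {max_attempts} polls"
    )
```

In a real client, `fetch` would wrap an HTTP GET; keeping it injectable means the loop itself does not care which server answers, only that the URL resolves to the task's state.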

However, as long as the two load-balanced servers are directly accessible, you could do:

POST http://example.org/LongTasks
=>
201 Created
Location: http://serverA.example.org/RunningTask/233


GET http://serverA.example.org/RunningTask/233
=> 
Content-Type: text/plain

InProgress
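One way a server could produce that host-specific Location value is to build an absolute URL from its own directly reachable hostname. A minimal sketch, assuming each server knows its public hostname; the function name `task_location` and the `scheme` parameter are made up for illustration:

```python
from urllib.parse import urlunsplit


def task_location(own_host: str, task_id: int, scheme: str = "http") -> str:
    """Build an absolute URL that pins the task resource to the server owning it.

    `own_host` is assumed to be this server's directly reachable hostname,
    not the hostname of the load balancer.
    """
    return urlunsplit((scheme, own_host, f"/RunningTask/{task_id}", "", ""))


# The server that accepted the POST would send this in the Location header.
print(task_location("serverA.example.org", 233))
# prints: http://serverA.example.org/RunningTask/233
```

Because the client follows the Location URL as-is, later GETs bypass the load balancer and reach the machine that holds the task state.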


What you are describing is generally called client affinity (also known as sticky sessions). Basically, the load balancer sets a cookie on the client so that it knows which application server in the farm should receive that client's requests. Since the client sends the cookie with both the POST and the subsequent GETs, all requests from one user always go to the same server. You can learn more about some of the configuration (using Apache as the reverse proxy to Tomcat) here: http://docs.codehaus.org/display/JETTY/Configuring+mod_proxy
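The routing decision behind cookie-based affinity can be sketched in a few lines. This is an illustration of the idea, not how mod_proxy implements it: the cookie name `backend`, the host `serverB.example.org`, and the round-robin fallback are all assumptions, and real balancers typically store an opaque route ID rather than a hostname.

```python
import itertools
from typing import Dict, Optional, Tuple

# Hypothetical backend pool; serverB is invented for the example.
BACKENDS = ["serverA.example.org", "serverB.example.org"]
_round_robin = itertools.cycle(BACKENDS)


def route(cookies: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Pick a backend for a request.

    Returns (backend, cookie_to_set). cookie_to_set is None when the
    request already carried a valid affinity cookie.
    """
    backend = cookies.get("backend")
    if backend in BACKENDS:
        return backend, None       # sticky: reuse the server chosen earlier
    backend = next(_round_robin)   # first request (or stale cookie): assign one
    return backend, backend        # caller sets the "backend" cookie to this
```

The first request (the POST) arrives without a cookie and is assigned a server; every later GET carries the cookie and lands on the same machine, so the task state never has to leave it.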

That said, using shared storage is often the lighter option if you don't have a significant farm of backend processors. For example, with a few machines, using memcached as lightweight storage for the status information is not a bad idea, and it is one I have used successfully for both session data and status data.
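A rough sketch of the shared-storage approach, under stated assumptions: `InMemoryStore` is a stand-in so the example runs without a cache server, and in a real deployment it would be replaced by a memcached client exposing similar `get`/`set` operations. The key format `task:<id>` and the function names are hypothetical.

```python
from typing import Dict, Optional


class InMemoryStore:
    """Stand-in for a shared cache such as memcached (assumption for the demo).

    In production every server would talk to the same networked cache.
    """

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def start_task(store: InMemoryStore, task_id: int) -> str:
    """Called by whichever server receives the POST; returns the Location path."""
    store.set(f"task:{task_id}", "InProgress")
    return f"/RunningTask/{task_id}"


def task_status(store: InMemoryStore, task_id: int) -> Optional[str]:
    """Called by whichever server receives the GET; None means unknown task."""
    return store.get(f"task:{task_id}")


shared = InMemoryStore()
location = start_task(shared, 233)   # handled by one server
status = task_status(shared, 233)    # handled by any other server
```

Since both handlers go through the store rather than local memory, it no longer matters which machine the load balancer picks for a given request.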

Note also that using a reverse proxy solves the SSL issue (with a hardware load balancer you can't see the cookie because of the encryption). The reverse proxy handles the encryption and decryption itself and proxies requests to a backend server. Apache's mod_proxy is a common choice, though nginx is up and coming. Alternatively, you can use IP-based affinity. However, I learned the hard way that this can be a bad idea, when I realized that the entirety of a very large urban school system appears as a single IP address because of its filtering system :)
