开发者

What sort of Java container for long-running processes on EC2?

I'm a long-time client-side (Swing) developer and I operated pretty much by myself in the same job for a long time. Working from home in a vacuum, I was pretty much completely isolated from the community. I recently took a position as a server-side Java guy for a startup, and I'm learning a ton of stuff but I'm the only Java person and am pretty much on my own again. Having never done server-side Java before, so much of this stuff is completely new and I feel like I have no idea what the normal best-practices are, or I don't have an intuitive feel for what tools to use for what jobs. I keep reading and reading various Internet sources (SO is awesome!) trying to bulk up my knowledg开发者_C百科e, but some things seem hard to search for because they don't have any obvious keywords. Hopefully some of you gurus here can point me in the right direction.

I'm in charge of implementing our backend REST service, which for now supports our website and an iPhone app. We're doing a social media site, eventually with many different clients. Currently the only clients of the service are our own website and our own iPhone app. I'm using Jersey, Spring, Tomcat, and RDS (Amazon's MySQL) on Amazon's EC2 platform. Our media storage is via S3. I've picked up all of these things pretty quickly and so far so good -- things are working fine with the website and the iPhone app. Cool.

Our next step is adding some long-running server-side processing. This processing is basically CPU-intensive stuff that doesn't involve any communication until it's done. I'm trying to figure out what the best way to handle this is. I'm thinking of using Amazon's SQS to queue up jobs in response to the REST events that should trigger them, but I can't figure out how I should handle the dequeuing and processing. I know I need some threads somewhere that take jobs off the SQS queue and process them, and then tell the REST service that the job is done. But where do these threads live?

  1. In a plain "java -jar jobconsumer.jar" process on another EC2 instance that starts a small thread pool. Maybe use Spring to wire up this piece and start it running?

  2. In a webapp deployed in a container like Tomcat on another EC2 instance? I don't really know what benefits I would get from this, but somehow running in a container like this seems more stable? Does this sort of container even really support long-running processing loops, or is it just good at responding to HTTP events?

Now that I write it out like that, I don't really see why I would want to use a container. It just seems like an over-complication. However, the Java community seems so centered on these types of containerized, "managed" environments that to not use a container seems somehow wrong. I feel like maybe I'm not understanding what some of the major benefits of these containers are? I mean, beyond the obvious benefits of the web-facing Servlet and JSP specs. Would any of the functionality of those specs help me out with something like this?


For a regular Java web app, you almost certainly want to be using one of the Servlet containers such as Tomcat - it takes care of accepting connections, parsing and serialising HTTP messages, JSPs, SSL, authentication, etc for you.

For a non-web app, the argument for using Tomcat (or similar) is weaker, but there are a few reasons to still consider it:

  • straightforward to add JSPs for querying and managing the app or add a web API in future
  • easy distribution of releases (one .war vs. an unholy mess of jars and config files)
  • hot deployment (although I've yet to see anyone using this for anything serious)

In terms of long-running processing loops, Servlet containers don't help you out beyond notifying your ServletContextListener when the app starts, so you can kick off any long-running tasks.

It's worth noting that if you're already using Spring, it's relatively easy to switch from a stand-alone app to a container using ContextLoaderListener, so it shouldn't be a problem if you decide later that you need the web stuff.


We recently faced a similar question, as we are hosting a large distributed service on EC2.

In short, we are very happy with Jetty 7 as a container. We use it for our user-facing-www, public-api, and internal-backend-api services. In some cases we use it for non-api services such as a workqueue, simply to expose a bit of status & health info for our monitoring.

The great thing about Jetty (any version) is that it can be configured in ~5 lines of code, with zero external config files etc. It's not a container specifically, but an http server that you can embed.

We use Guice for dependency injection, which also favors config-file-less implementations.

Long lived Java processes are nothing to worry about - you basically bring up your servers / threads / threadpools in your main method and don't call System.exit until you want to shutdown explicitly.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜