Numerical computing environment in the cloud? [Undergrad Project]

I am a computer science undergraduate in my final year. For my final-year project, I am thinking of building a MATLAB-like numerical computing environment as SaaS that supports matrix manipulation, plotting of functions and data, image processing operations, and so on. The project will be written in Java and Scala: Scala for the application's DSL, Java for the rest of the application.

I was thinking of implementing this system on Google App Engine so that we could parallelize various algorithms across a number of servers and thus obtain results faster. However, I have no prior experience with web development (except for some simple sites in PHP).

So I have the following key questions:

  1. First of all, does it make sense to host an application like MATLAB in the cloud?
  2. How easy or difficult would it be to write such an application on Google App Engine, considering my limited experience with web development?
  3. Can you please point me to some existing projects that parallelize mathematical, graph, and image processing algorithms?

I know the question is quite subjective, but I still ask you not to close it, as I am very confused about my project and need some expert advice.

Any help would be greatly appreciated!

Thanks!


About half a year ago I thought about making such a thing.

Nothing came of it except some code at http://code.google.com/p/metaplasm...

In fact, the tricky thing with GAE is that computation must be split into thirty-second slices with no shared memory (only memcache and the database). Once you have accomplished that, everything else will go smoothly :-)
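
The slicing idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not GAE code: `SlicedSum`, `Checkpoint`, and `runSlice` are hypothetical names, a step budget stands in for the time limit, and an in-memory map stands in for memcache or the database. The point is that each slice receives all of its state through the checkpoint and shares nothing else with the previous slice.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a long-running sum of squares split into resumable slices. */
class SlicedSum {

    /** State carried between slices; a real application would persist this externally. */
    static final class Checkpoint {
        final long next;      // next index to process
        final long end;       // exclusive upper bound
        final double partial; // running sum so far

        Checkpoint(long next, long end, double partial) {
            this.next = next;
            this.end = end;
            this.partial = partial;
        }

        boolean done() {
            return next >= end;
        }
    }

    /** Runs at most maxSteps iterations and returns the updated checkpoint. */
    static Checkpoint runSlice(Checkpoint cp, long maxSteps) {
        long stop = Math.min(cp.end, cp.next + maxSteps);
        double sum = cp.partial;
        for (long i = cp.next; i < stop; i++) {
            sum += (double) i * i;
        }
        return new Checkpoint(stop, cp.end, sum);
    }

    /** Drives slices to completion; the map is an assumed stand-in for a persistent store. */
    static double runToCompletion(long n, long maxSteps) {
        Map<String, Checkpoint> store = new HashMap<>();
        store.put("job", new Checkpoint(1, n + 1, 0.0));
        while (!store.get("job").done()) {
            Checkpoint cp = store.get("job");          // load state at the start of a slice
            store.put("job", runSlice(cp, maxSteps));  // save state at the end of a slice
        }
        return store.get("job").partial;
    }
}
```

In a real deployment each loop iteration would be a separate request or task, so the only thing that survives between iterations is whatever was written to the store.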


App Engine probably isn't the right platform for this. App Engine is targeted at web applications where each request does a modest amount of computation, but you need to service a lot of them - most traditional webapps, such as social networking sites, blogs, web-based games, and so on and so forth. It isn't targeted at services that need to do intensive computation for a single user request, and while it has services to do parallel background processing, they're asynchronous, which is probably also not what you want for your use-case.

What I would recommend is looking at other cloud environments, such as Amazon's EC2, for the processing power and parallelism you need. App Engine would still do an admirable job as a frontend for such a service, though! For example, you could use an App Engine app to manage jobs, dispatch them to backends, and turn up and down VM instances as required by load.
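
As a rough sketch of the "turn up and down VM instances" part, a frontend could derive a target instance count from the length of its job queue. Everything here is an assumption made for illustration: `InstancePlanner`, its parameters, and the sizing rule are hypothetical, and the calls that would actually start or stop VMs on a cloud provider are deliberately left out.

```java
/** Hypothetical sketch: decide how many backend instances a job queue needs. */
class InstancePlanner {

    /**
     * Returns the number of instances to run, assuming each instance can work
     * through jobsPerInstance queued jobs, clamped to [minInstances, maxInstances].
     */
    static int desiredInstances(int pendingJobs, int jobsPerInstance,
                                int minInstances, int maxInstances) {
        if (jobsPerInstance <= 0 || minInstances < 0 || maxInstances < minInstances) {
            throw new IllegalArgumentException("invalid planner configuration");
        }
        int pending = Math.max(0, pendingJobs);
        // Round up so that a partially filled instance is still counted.
        int needed = (int) Math.ceil(pending / (double) jobsPerInstance);
        return Math.max(minInstances, Math.min(maxInstances, needed));
    }
}
```

For example, with nine queued jobs, four jobs per instance, and limits of one to ten instances, this rule asks for three instances.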


This absolutely makes sense, and there are two existing projects that run numerical routines in the cloud:

Biocep (free, runs R and Scilab on EC2 or Eucalyptus) and Monkey Analytics (commercial, runs R, Octave, or Python on EC2).


Why not try BOINC, the open-source distributed computing system?

http://boinc.berkeley.edu/

It supports multiple platforms and multiple hosting environments, and it can serve all kinds of numerical computation jobs that benefit from a parallel environment.

Moreover, you don't need any web development knowledge. You just need to create a new project in BOINC and try running it in the existing volunteer computing environment.


You might encounter issues with this type of service on GAE, as it is quite restrictive about what you are allowed to do in the sandbox. From the GAE docs:

An App Engine application cannot:

  • spawn a sub-process or thread. A web request to an application must be handled in a single process within a few seconds. Processes that take a very long time to respond are terminated to avoid overloading the web server.

This could make it tricky to offer the types of services you describe. The scaling that GAE offers enables you to grow the number of requests you can handle but doesn't really offer you good tools for scaling the CPU resources for a single request.

Sounds like an interesting idea for a project though, good luck.


It makes little sense to me to write the rest in Java. That's precisely where I think Scala would make the most difference.


I'm hosting the online demo of my Java math project on Google App Engine. The demo is not parallelized, so it does hit the App Engine quota limits on time-expensive requests.

But with the help of the appengine-mapreduce library you can parallelize your mathematical algorithms and avoid these limits.
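
To illustrate the kind of decomposition involved, here is a minimal sketch of a matrix product split into independent per-row tasks. It does not use the appengine-mapreduce API: plain Java parallel streams stand in for the map and collect steps, and `ParallelMatMul` and its methods are hypothetical names chosen for this example.

```java
import java.util.stream.IntStream;

/** Hypothetical sketch: matrix multiplication decomposed into independent row tasks. */
class ParallelMatMul {

    /** "Map" step: computes one output row from one row of A and the whole of B. */
    static double[] multiplyRow(double[] row, double[][] b) {
        int cols = b[0].length;
        double[] out = new double[cols];
        for (int k = 0; k < row.length; k++) {
            for (int j = 0; j < cols; j++) {
                out[j] += row[k] * b[k][j];
            }
        }
        return out;
    }

    /** Maps every row of A in parallel, then collects the rows in their original order. */
    static double[][] multiply(double[][] a, double[][] b) {
        if (a.length == 0 || b.length == 0 || a[0].length != b.length) {
            throw new IllegalArgumentException("incompatible matrix dimensions");
        }
        return IntStream.range(0, a.length)
                .parallel()
                .mapToObj(i -> multiplyRow(a[i], b))
                .toArray(double[][]::new);
    }
}
```

Because each row task reads only its own inputs and writes only its own output row, the same split could in principle be distributed across separate workers rather than threads.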
