开发者

Why do web applications send HTML over the wire?

This question pertains to web applications. I have very little web app development experience, so might be missing some very obvious points/issues. Please point them out.

As I understand, in most web applications, a web server sends HTML over the wire to a client (browser). This happens every time a HTTP request is made. I feel this is very wasteful of bandwidth.

1) Since browsers can run JavaScript, why don't we just send a JavaScript program which can generate the webpage's HTML content (which the browser then renders).

2) Further a browser might cache the JavaScript program and next time the server only need send the data. The protocol might involve the browser sending the "program version" it has.

Consider an example of a relatively simple website Hacker News [http://news.ycombinator.com]. Let us separate the data (30 posts + their metadata) from its presentation. Assuming 1) above, the server can just send the data (say in JSON) + a JavaScript program to generate HTML. This gist shows the idea. The data for the 30 posts is in JSON [http://www.json.org/js.html] format. For this particular example the data transferred is cut in 1/2 (size of data+JavaScript / size of HTML). Further if browsers can do 2) above, it reduces the data transferred on each visit to 1/4 (size of data / size of HTML). [Note: this analysis is without considering compression; gzip,deflate is very successful in reducing the size of HTML. But isn't prevention better than cure?]

I see atleast the following advantages of this :-

* For most web pages, it will reduce the size of data transferred over the wire.

*开发者_开发技巧 Forces web apps to separate data from its presentation.

Disadvantages might include - more complex browsers, time to run the JavaScript program to generate HTML (this might get offset by the reduction in data size).

Now my question is - why are web applications not developed this way, or, why do web applications send HTML over the wire? Surely the web server (sending out HTML) doesn't care about HTML at all, so why should it, first, generate it, and then send it over the wire?


There are a few reasons, some of them historical this is by no means a complete list but just some of my experiences:

  1. HTML predates JS, and a lot of scripts and libraries predate JS
  2. Older browsers (think IE<=6) had rubbish, inconsistent JS engines, their rendering engines were much more consistent in how they treat HTML. So many more libraries and scripts predate consistent JS
  3. It is a nightmare to debug applications written as you suggest if they are not constructed right (we have one at my work, it takes 30 minutes to find where a piece of html is actually generated)
  4. It is a lot more work to do it right - why not use templates or static docs or something much simpler
  5. Its not really a problem - HTML compresses really well
  6. What you suggest is done - its called AJAX (OK, so ajax is more general than this but you all know what i mean)
  7. It simply doesn't work for most plain-text user agents including those used by most search engines. If this page is serving most of your content, its generally a good idea to make it easy for Google to parse


Well the obvious reason on why this is the case is that JavaScript wasn't around when we started sending HTML around, and HTML was an improvement to sending around plaintext documents.

The reason we don't do this now: we eschew complex solutions to problems that aren't really problems.

Average internet connections download nearly 1M bytes per second, and web browsers are quite adept at parsing and starting to render this HTML before it's even all ready to be. They're also great at parallelizing the downloading of resources on the page. If we want to save a few bytes at the cost of some compute cycles, we gzip content before sending it. Problem solved.

And for the record, we do this with AJAX in complex webpages (checkout Github's source browsing for a great example of how awesome this can be).


What you suggest can, and is, done. Remember, web pages used to be static documents. Full blown web-based applications are a relatively recent idea.

I might also suggest that it isn't necessarily more efficient, especially when your pages are sent gzipped.


What you suggest is basically what a JavaScript full stack framework like ExtJS does. You can create rich, data intensive applications without writing any HTML -- well, only enough to reference the necessary .js libraries. The complex DOM needed for layouts, grids, forms etc is all created by the framework.


The simple answer is that HTML is older. Why is C99 not fully implemented with a lot of compilers? They figure 1989 is new enough for them. Also, JavaScript exercises a lot more control over people's browsers than they seem to want. Conditional statements and encoded data pose a security concern, and some people want to keep that can of worms closed to begin with. True, HTML is a very inefficient markup, but the size is insignificant compared to the images you download from the internet. That favicon takes up as much data as the page itself, and it's only 16 pixels across.


A good reason that the server-side code of a web application might do lots of HTML template work on the server side is that in many server environments it's not made easy to bundle up server-side data structures (object graphs) for easy delivery to the client. There may be information kept in server-side data structures that really shouldn't be delivered out to the client. Thus in order to send out a "pure" data-only response, the server would have to trim off sensitive data before delivering out the JSON. That's not an unsolvable problem, but I don't know of many server frameworks that facilitate a solution.

The server has direct, unfettered access to the database and to everything else that makes an application work: user preferences, history, account details, system settings, etc. To build an application that's client-centric for rendering purposes would mean concocting ways of keeping all that information intact and up-to-date on the client. For a lot of applications, that might not be terribly easy.

Finally, it's only relatively recently that it would make sense to trust a browser to provide a stable enough platform for building a long-lived "application environment" as a continually-updating web page. By building a web app such that pages are sometimes completely reloaded, there are lots of little "reboots". That's a cheap and dumb way of keeping a lid on at least some kinds of memory leaks.


Most implementations of sites with heavy Javascript use won't start executing until the DOM has fully loaded; then you'll get every page with 'loading screens' when the page wrapper has downloaded, but none of the content has.

Also, do remember that not all users have Javascript enabled, and not all browsers support high-level Javascript (think mobiles).


I would send HTML in a response if I wanted my application to work without Javascript. I would write HTML rendering code in my server-side language (most of the time not Javascript), which could then be used for two purposes: serving whole HTML pages, and serving bits of HTML in response to XHRs.

If the Javascript code is restricted to things like reporting UI events and replacing innerHTML with server-generated code, I don't have to duplicate any of my application logic across languages/frameworks. This duplication problem is one of the reasons why server-side Javascript is getting people excited.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜