Quantify the semantic value of <p> as opposed to <div>

I'm transforming some XML, which I have no control over, to XHTML. The XML schema defines a <para> tag for paragraphs and <unordered-list> and <ordered-list> for lists.

Frequently in this XML, I find lists nested within paragraphs. So, a straightforward transformation causes <ul>s to get nested within <p>s, which is illegal in XHTML.

I've created a list of ways to deal with it and here are the most obvious:

  1. Just don't worry about it. The browsers will do fine. Who cares. (I don't like this option, but it's an option!)
  2. Write a fancy-pants component to my transform that makes sure all <para> tags get closed before unordered lists start, and re-opened afterward. (I like this option the most, but it's complicated due to multiple levels of nesting, and we may not have the budget for this)
  3. Just transform <para> to <div> and set the margins on the divs so it looks like a paragraph in the browser. This is the easiest solution that emits valid XHTML, but it takes away from the semantic value of the markup.

My questions are:

  • how much value do I lose if I go with option 3?
  • Does it really matter?
  • What is the actual effect on the user experience?
  • If you can cite references, please do (this is easy to speculate on). For example, I was thinking it might affect search results from a Google Search Appliance that we are using.
  • If search terms appear in divs, do they carry less weight?
  • Or is there less of an association between them and preceding header tags?

How can I find this out?


I've come up against this too.

Personally, I consider it a grave mistake on the part of the standard that a p cannot contain lists. A list inside a paragraph is typographically legitimate, so it should be legal in what was originally intended to be a markup language for text.

I may be flamed for this, but XHTML has crashed and burned in the real world, regardless of whether it was a good idea or not. The often horrible tag soup that is today's HTML markup will continue to survive for a goodly long time, if only because bad markup and lenient browsers will continue to perpetuate each other forever.

Thus, I tend to go with Option 1.

Option 3 is also viable, in my opinion. While I don't have proof, I'm pretty sure no search engine is crazy enough to actually put any trust in most of the formatting tags we apply to our HTML. meta and a tags are obvious exceptions, of course.


First of all, unless you set every CSS property available now plus every one possibly available in the future, then you can't guarantee your <div> will match up, WRT styles, with <p>. (Though I agree you can get close and this is probably good enough, but read on.) I don't know of any visual browsers or other tools that would seriously treat them differently, but this is just as much an artifact, IMHO, of the current widespread loose interpretation on the web, as it is of them being close in meaning.

Is <ul> the right transformation for every <unordered-list> in your source data? If they are always displayed as block-level content instead of 1) an, 2) inline, 3) list; then that's a safe bet. If so, you can break the paragraph into two (and wrap the whole thing in <div> if you like).

Example input:

<para>Yadda yadda: <unordered-list/> And so fin.</para>

Output:

<div>
<p>Yadda yadda:</p>
<ul/>
<p>And so fin.</p>
</div>
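
If it helps to see the splitting spelled out, here is a rough Python sketch of it using only the standard library's xml.etree.ElementTree. It is a sketch under assumptions, not a drop-in transform: the element name <item> for list entries is a guess (the question never names it), inline elements are copied through unchanged rather than mapped to XHTML, and the function names split_para and convert_list are made up for illustration. It recurses through lists and nested <para> elements, which is where the multiple levels of nesting get handled.

```python
import copy
import xml.etree.ElementTree as ET

# Source list tags mapped to their XHTML equivalents.
LIST_TAGS = {"unordered-list": "ul", "ordered-list": "ol"}
ITEM_TAG = "item"  # ASSUMPTION: the real schema's list-item name is not given


def split_para(para):
    """Return the XHTML blocks (<p>, <ul>, <ol>) equivalent to one <para>."""
    blocks = []
    current = None  # the <p> currently being filled, if any

    def open_p():
        nonlocal current
        if current is None:
            current = ET.Element("p")
            blocks.append(current)
        return current

    def close_p():
        # Trim trailing whitespace of the open <p>, then forget it.
        nonlocal current
        if current is not None:
            if len(current):
                if current[-1].tail:
                    current[-1].tail = current[-1].tail.rstrip()
            elif current.text:
                current.text = current.text.rstrip()
        current = None

    def add_text(text):
        if not text:
            return
        if current is None:
            text = text.lstrip()  # no leading whitespace in a fresh <p>
            if not text:
                return
        p = open_p()
        if len(p):  # text after an inline child lives in that child's tail
            p[-1].tail = (p[-1].tail or "") + text
        else:
            p.text = (p.text or "") + text

    add_text(para.text)
    for child in para:
        if child.tag in LIST_TAGS:
            close_p()  # end the paragraph before the list starts
            blocks.append(convert_list(child))
        elif child.tag == "para":
            close_p()  # a nested <para> becomes its own block(s)
            blocks.extend(split_para(child))
        else:
            inline = copy.deepcopy(child)  # inline markup copied as-is
            inline.tail = None
            open_p().append(inline)
        add_text(child.tail)  # text after the child re-opens a <p> if needed
    close_p()
    return blocks


def convert_list(src):
    """Turn a source list into <ul>/<ol>, splitting each item's content."""
    out = ET.Element(LIST_TAGS[src.tag])
    for item in src.findall(ITEM_TAG):
        li = ET.SubElement(out, "li")
        blocks = split_para(item)  # an item body is treated like a paragraph
        if len(blocks) == 1 and blocks[0].tag == "p":
            # Inline-only item: unwrap the <p> so the <li> holds the text.
            li.text = blocks[0].text
            li.extend(list(blocks[0]))
        else:
            li.extend(blocks)
    return out


source = ET.fromstring("<para>Yadda yadda: <unordered-list/> And so fin.</para>")
for block in split_para(source):
    print(ET.tostring(block, encoding="unicode"))
```

Run on the example input, this emits a paragraph, the list, and a second paragraph, in that order; wrapping them in a <div> is left to the caller. The same idea can be expressed in XSLT if that is what the rest of the transform uses.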


The good news is that any of these 3 options would work.

There are many, many people on SO who will tell you "if it works, forget semantics and do it." So Option 1 would probably be the site favorite if everyone here were asked.

Option 2 is my favorite and would be the best semantically. I would definitely do it if time/budget allows.

However, Option 3 is a close second and hopefully this will answer your question: The <div> element and the <p> element are near-identical. In fact, the biggest difference is semantics. In most browsers' default stylesheets both are simply display: block; the only extra rule <p> typically gets is a top and bottom margin, which is easy to reproduce on a <div>.
