Marking up a search result list with HTML5 semantics

2023-01-07 15:55 Q&A author:

Making a search result list (like Google's) is not very hard if you just need something that works. Now, however, I want to do it perfectly, using the benefits of HTML5 semantics. The goal is to define the de facto way of marking up a search result list, one that could potentially be used by any future search engine.

For each hit, I want to

order them by increasing number
display a clickable title
show a short summary
display additional data like categories, publishing date and file size

My first idea is something like this:

<ol>
  <li>
    <article>
      <header>
        <h1>
          <a href="url-to-the-page.html">
            The Title of the Page
          </a>
        </h1>
      </header>
      <p>A short summary of the page</p>
      <footer>
        <dl>
          <dt>Categories</dt>
          <dd>
            <nav>
               <ul>
                  <li><a href="first-category.html">First category</a></li>
                  <li><a href="second-category.html">Second category</a></li>
                </ul>
            </nav>
          </dd>
          <dt>File size</dt>
          <dd>2 kB</dd>
          <dt>Published</dt>
          <dd>
            <time datetime="2010-07-15T13:15:05-02:00" pubdate>Today</time>
          </dd>
        </dl>
      </footer>
    </article>
  </li>
  <li>
    ...
  </li>
  ...
</ol>

I am not really happy about the <article/> within the <li/>. First, the search result hit is not an article by itself, but just a very short summary of one. Second, I am not even sure you are allowed to put an article within a list.

Maybe the <details/> and <summary/> tags are more suitable than <article/>, but I don't know if I can add a <footer/> inside that?

All suggestions and opinions are welcome! I really want every single detail to be perfect.

1) I think you should stick with the article element, as

[t]he article element represents a self-contained composition in a document, page, application, or site and that is intended to be independently distributable or reusable [source]

You merely have a list of separate documents, so I think this is fully appropriate. The same is true for the front page of a blog, containing several posts with titles and outlines, each in a separate article element. Besides, if you intend to quote a few sentences of the articles (instead of providing summaries), you could even use blockquote elements, like in the example of a forum post showing the original posts a user is replying to.

2) If you're wondering if it's allowed to include article elements inside a li element, just feed it to the validator. As you can see, it is permitted to do so. Moreover, as the Working Draft says:

Contexts in which this element may be used:

Where flow content is expected.

3) I wouldn't use nav elements for those categories, as those links are not part of the main navigation of the page:

only sections that consist of major navigation blocks are appropriate for the nav element. In particular, it is common for footers to have a short list of links to various pages of a site, such as the terms of service, the home page, and a copyright page. The footer element alone is sufficient for such cases, without a nav element. [source]

4) Do not use the details and/or summary elements, as those are used as part of interactive elements and are not intended for plain documents.

UPDATE: Regarding whether it's a good idea to use an (un)ordered list to present search results:

The ul element represents a list of items, where the order of the items is not important — that is, where changing the order would not materially change the meaning of the document. [source]

As a list of search results actually is a list, I think this is the appropriate element to use; however, as it seems to me that the order is important (I expect the best matching result to be on top of the list), I think that you should use an ordered list (ol) instead:

The ol element represents a list of items, where the items have been intentionally ordered, such that changing the order would change the meaning of the document. [source]

Using CSS you can simply hide the numbers.
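A minimal sketch of that idea, assuming a hypothetical class name `search-results` on the list (the class is not part of the markup in the question). The list stays an ol in the markup, so the ordering is still expressed; only the visible markers are suppressed.

```html
<style>
  /* "search-results" is a hypothetical class name used for illustration */
  ol.search-results {
    list-style-type: none; /* hide the 1., 2., 3. markers */
    padding-left: 0;       /* drop the indent reserved for the markers */
  }
</style>

<ol class="search-results">
  <li>First result</li>
  <li>Second result</li>
</ol>
```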

EDIT: Whoops, I just realized you already use an ol (due to my fatigue, I thought you used a ul). I'll leave my ‘update’ as is; after all, it might be useful to someone.

I'd mark it up this way (without using any RDFa/microdata vocabularies or microformats, so only using what the plain HTML5 spec gives):

<ol start="1">

  <li id="1">
    <article>
     <h1><a href="url-to-the-page.html" rel="external">The Title of the Page</a></h1>
     <p>A short summary of the page</p>
     <footer>
       <dl>
         <dt>Categories</dt>
         <dd><a href="first-category.html">First category</a></dd>
         <dd><a href="second-category.html">Second category</a></dd>
         <dt>File size</dt>
         <dd>2 <abbr title="kilobyte">kB</abbr></dd>
         <dt>Published</dt>
         <dd><time datetime="2010-07-15T13:15:05-02:00">Today</time></dd>
        </dl>
      </footer>
    </article>
  </li>

  <li id="2">
    <article>
     …
    </article>
  </li>

</ol>

`start` attribute for `ol`

If the search engine uses pagination, you should give the start attribute to the ol, so that each li reflects the correct ranking position.
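For illustration, assuming ten results per page (a page size picked only for this sketch), the second page of results would begin at 11:

```html
<!-- Page 2 of the results, assuming 10 hits per page -->
<ol start="11">
  <li>…</li> <!-- numbered 11 -->
  <li>…</li> <!-- numbered 12 -->
</ol>
```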

`id` for each `li`

Each li should get an id attribute, so that you can link to it. The value should be the rank/position.

One could think that the id should be given to the article instead, but I think this would be wrong: the rank/order could change over time. You are not referring to a specific result but to a result position.
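A small sketch of what linking to a position could look like. The link text is made up for illustration; the fragment `#2` targets whichever result currently sits at position 2.

```html
<!-- Points at the current occupant of position 2, not at a specific page -->
<a href="#2">Jump to the second result</a>

<ol start="1">
  <li id="1">…</li>
  <li id="2">…</li>
</ol>
```

An id that starts with a digit is valid in HTML5 and works as a URL fragment, but it has to be escaped if it is ever used in a CSS selector.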

Remove the `header`

It is not needed if it contains only the heading (h1).

Add `rel="external"` to the link

The link to each search result is an external link (leading to a different website), so it should get the rel value external.

Remove `nav`

The category links are not navigation in the scope of the article, so remove the nav.

Each category in a `dd`

You used:

<dt>Categories</dt>
<dd>
 <ul>
  <li><a href="first-category.html">First category</a></li>
  <li><a href="second-category.html">Second category</a></li>
 </ul>
</dd>

Instead, you should list each category in its own dd and remove the ul:

<dt>Categories</dt>
<dd><a href="first-category.html">First category</a></dd>
<dd><a href="second-category.html">Second category</a></dd>

`abbr` for file size

The unit in "2 kB" should be marked up with abbr:

2 <abbr title="kilobyte">kB</abbr>

Remove `pubdate` attribute

It's not in the spec anymore.

Other things that could be done

Give the link an hreflang attribute if the linked result is in a different language than the search engine.
Give the link description and the summary a lang attribute if they are in a different language than the search engine.
Summary: use blockquote (with a cite attribute) instead of p if the search engine does not create the summary itself but uses the meta description or a snippet from the page.
Title/link description: use q (with a cite attribute) if the link description is exactly the title of the linked web page.
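These suggestions could be combined roughly as follows. This is only a sketch of a single list item: the example.com URL, the German title and summary, and the assumption that the search engine's own pages are in English are all invented for illustration.

```html
<li id="1">
  <article>
    <!-- hreflang: the linked page is German; the search page is assumed to be English -->
    <!-- q with cite: the link text is exactly the title of the linked page -->
    <h1>
      <a href="https://example.com/seite.html" rel="external" hreflang="de" lang="de">
        <q cite="https://example.com/seite.html">Der Titel der Seite</q>
      </a>
    </h1>
    <!-- blockquote with cite: the summary is a snippet taken from the page itself -->
    <blockquote cite="https://example.com/seite.html" lang="de">
      <p>Eine kurze Zusammenfassung der Seite</p>
    </blockquote>
  </article>
</li>
```

Note that browsers render quotation marks around q by default, which may or may not be wanted in a result title.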

Aiming for a 'perfect' HTML5 template is futile because the spec itself is far from perfect, with most of the prescribed use-cases for the new 'semantic' elements obscure at best. As long as your document is structured in a logical fashion, you won't have any problems with search engines (most of the new tags don't have the slightest impact). Indeed, following the HTML5 spec to the letter - for example, using <h1> tags within each new sectioning element - may make your site less accessible (to screen readers, for example). Don't strive for 'perfect' or close-to, because it doesn't exist - HTML5 is not thought-out well enough for that. Just concentrate on keeping your markup logical and uncluttered.

I found a good resource for HTML5 is HTML5Doctor. Check the article archive for practical implementations of the new tags. Not a complete reference mind you, but nice enough to ease into it :)

As shown by the Footer element page, sections can contain footers :)
