Human-readable URLs: preferably hierarchical too?

2023-01-21 17:44 问答作者：

In a now migrated question about human-readable URLs I allowed myself to elaborate a little hobby-horse of mine:

When I encounter URLs like http://www.example.com/product/123/subpage/456.html I always think that this is an attempt on creating meaningful hierarchical URLs which, however, is not entirely hierarchical. What I mean is, you should be able to slice off one level at a time. In the above, the URL has two violations on this principle:

/product/123 is one piece of information represented as two levels. It would be more correctly represented as /product:123 (or whatever delimiter you like)

/subpage is very likely not an entity in itself (i.e., you cannot go up one level from 456.html as http://www.example.com/product/123/subpage is "nothing").

Therefore, I find the following more correct:
http://www.example.com/product:123/456.html
Here, you can always navigate up one level at a time:

http://www.example.com/product:123/456.html — The subpage

http://www.example.com/product:123 — The product page

http://www.example.com/ — The root

Following the same philosophy, the following would make sense [and provide an additional link to the products listing]:
http://www.example.com/products/123/456.html
Where:

http://www.example.com/products/123/456.html — The subpage

http://www.example.com/products/123 — The product page

http://www.example.com/products — The list of products

http://www.example.com/ — The root

My primary motivation for this approach is that if every "path element" (delimited by /) is selfcontained¹, you will always be able to navigate to the "parent" by simply removing the last element of the URL. This is what I (sometimes) do in my file explorer when I want to go to the parent directory. Following the same line of logic the user (or a search engine / crawler) can do the same. Pretty smart, I think.

On the other hand (and this is the important bit of the question): While I can never prevent that a user tries to access a URL he himself has amputated, am I wrongfully asserting (and honouring) that a search engine might do the same? I.e., is it reasonable to expect that no search engine (or really: Google) would try to access http://www.example.com/product/123/subpage (point 2, above)? (Or am I really only taking the human factor into account here?)

This is not a question about personal preference. It's techical question about what I can expect of an crawler / indexer and to what extend I should take non-human URL manipulation into account when designing URLs.

Also, the structural "depth" of http://www.example.com/product/123/subpage/456.html is 4, where http://www.example.com/products/123/456.html is only 3. Rumour has it that this depth influences search engine ranking. At least, so I was told. (It is now evident that SEO is not what 开发者_运维问答I know most about.) Is this (still?) true: does the hierarchical depth (number of directories) influence search ranking?

So, is my "hunch" technically sound or should I spend my time on something else?

Example: Doing it (almost) right

Good ol' SO gets this almost right. Case in point: profiles, e.g., http://stackoverflow.com/users/52162:

http://stackoverflow.com/users/52162 — Single profile
http://stackoverflow.com/users — List of users
http://stackoverflow.com/ — Root

However, the canonical URL for the profile is actually http://stackoverflow.com/users/52162/jensgram which seems redundant (the same end-point represented on two hierarchical levels). Alternative: http://stackoverflow.com/users/52162-jensgram (or any other delimiter consistently used).

¹⁾ Carries a complete piece of information not dependent on "deeper" elements.

Hierarchical urls of this kind "http://www.example.com/product:123/456.html" are as useless as "http://www.example.com/product/123/subpage", because when users see your urls, they don't care about identifiers from your database, they want meaningful paths. This is why StackOverflow puts question titles into urls: "http://stackoverflow.com/questions/4017365/human-readable-urls-preferably-hierarchical-too".

Google advices against practice of replacing usual queries like "http://www.example.com/?product=123&page=456", because when every site develops it's own scheme, crawler doesn't know what each part means, if it's important or not. Google has invented sophisticated mechanisms to find important arguments and ignore unimportant, which means you'll get more pages into index and there will be less duplicates. But these algorithms often fail when web developers invent their own scheme.

If you care about both users and crawlers you should use urls like this instead:

http://www.example.com/products/greatest-keyboard/benefits — the subpage
http://www.example.com/products/greatest-keyboard — the product page
http://www.example.com/products — the list of products
http://www.example.com/ — the root

Also, search engines give higher rating to pages with keywords in the url.

继续阅读：hierarchical human-readable seo url-modification

Human-readable URLs: preferably hierarchical too?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？