Why does Chrome incorrectly determine page is in a different language and offer to translate?
The new Google Chrome auto-translation feature is tripping up on one page within one of our applications. Whenever we navigate to this particular page, Chrome tells us the page is in Danish and offers to translate. The page is in English, just like every other page in our app. This particular page is an internal testing page开发者_如何转开发 that has a few dozen form fields with English labels. I have no idea why Chrome thinks this page is Danish.
Does anyone have insights into how this language detection feature works and how I can determine what is causing Chrome to think the page is in Danish?
Update: according to Google
We don’t use any code-level language information such as lang attributes.
They recommend you make it obvious what your site's language is.
Use the following which seems to help although Content-Language
is deprecated and Google says they ignore lang
<html lang="en" xml:lang="en" xmlns= "http://www.w3.org/1999/xhtml">
<meta charset="UTF-8">
<meta name="google" content="notranslate">
<meta http-equiv="Content-Language" content="en">
If that doesn't work, you can always place a bunch of text (your "About" page for instance) in a hidden div. That might help with SEO as well.
EDIT (and more info)
The OP is asking about Chrome, so Google's recommendation is posted above. There are generally three ways to accomplish this for other browsers:
W3C recommendation: Use the
lang
and/orxml:lang
attributes in the html tag:<html lang="en" xml:lang="en" xmlns= "http://www.w3.org/1999/xhtml">
UPDATE: previously a Google recommendation now deprecated spec although it may still help with Chrome. :
meta http-equiv
(as described above):<meta http-equiv="Content-Language" content="en">
Use HTTP headers (not recommended based on cross-browser recognition tests):
HTTP/1.1 200 OK Date: Wed, 05 Nov 2003 10:46:04 GMT Content-Type: text/html; charset=iso-8859-1 Content-Language: en
Exit Chrome completely and restart it to ensure the change is detected. Chrome doesn't always pick up the new meta tag on tab refresh.
I added lang="en"
to the doctype declaration, added meta tags for charset utf-8 and Content-Langauge in the HTML header, specified charset as utf-8 and Content-Language as en
in the HTTP response headers and it did nothing to stop Chrome from declaring my page was in Portuguese. The only thing that fixed the problem was adding this to the HTML header:
<meta name="google" content="notranslate">
But now I've prevented users from translating my page that is clearly in English to their own language. Poor job, Chrome. You can be better than this.
Specify the default language for the document, then use the translate attribute and Google's notranslate
class per element/container, as in:
<html lang="en">
...
<span><a href="#" translate="no" class="notranslate">English</a></span>
Explanation:
The accepted answer presents a blanket solution, but does not address how to specify the language per element, which can fix the bug and ensure your page remains translatable.
Why is this better? This will cooperate with Google's internationalization versus shut it off. Referring back to the OP:
Why does Chrome incorrectly determine page is in a different language and offer to translate?
Answer: Google is trying to help you with internationalization, but we need to understand why this is failing. Building off of NinjaCat's answer, we assume that Google reads and predicts the language of your website using an N-gram algorithm -- so, we can't say exactly why Google wants to translate your page; we can only assume that:
- There are words on your page that belong to a different language.
- Marking the containing element as
translate="no"
andlang="en"
(or removing these words) will help Google to correctly predict the language of your page.
Unfortunately, most people reaching this post won't know what words are causing the trouble. Use Chrome's built-in "Translate to English" feature (in the Right-Click context menu) to see what gets translated, you may see unexpected translations like the following:
So, update your html with the appropriate translation tags until the Google Translation of your page changes nothing -- then we should expect the popup to go away for future visitors.
Won't it be a lot of work to add all these extra tags? Yes, very likely. If you are using Wordpress or another Content Management System then look in their documentation for quick ways to update your code!
Without knowing what the text was, perhaps the ngram detection is being tricked by the content of your page.
http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html
https://en.wikipedia.org/wiki/N-gram
Chromium thinks this page in Filipino: http://www.reyalvarado.com/portfolio/cuba/ Notes: There is pretty much no text on the page except for the owner's name and the menu items. Menu items are dynamically replaced with images by FLIR.
The HTML declares the page as US English:
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US">
Try including the property xml:lang=""
to the <html>
, if the other solutions don't work:
<html class="no-js" lang="pt-BR" dir="ltr" xml:lang="pt-BR">
精彩评论