Different encodings of LaTeX and BibTeX files [closed]
Does LaTeX handle the situation where a .bib file has a different encoding than the .tex file? For instance, the .tex file is in ISO-8859-2 and the .bib file is in UTF-8. Can the encoding be converted on the fly by LaTeX, or is the only way to do it manually?
First of all, according to the LyX wiki BibTeX can't use UTF-8:
BibTeX does not support files encoded in UTF-8 (i.e., Unicode), which is nowadays the default file encoding on most OSes. The reason is that current BibTeX (v. 0.99c) was released in 1988 and thus predates the advent of unicode. Unless the long-announced BibTeX v. 1.0 or one of the many planned potential successing applications are ready, latin1 (ISO-8859-1) or another 8-bit encoding has to be used for the bib file (this does not affect the LaTeX encoding, which still can be utf8).
Usually, whatever is inside a BibTeX file (book titles, authors, etc.) gets copied verbatim into the LaTeX source code, apart from some formatting and case changes.
So your BibTeX file encoding has to match the one used by your LaTeX file, otherwise non-ASCII characters come out garbled. You also can't use babel-provided shorthands in BibTeX (such as "a for ä, provided by the german and ngerman options) unless your document loads the right packages.
The canonical way is to make BibTeX files agnostic of any encoding or package issues by always specifying special characters with their appropriate commands.
This basically means that instead of writing ä you would have to use {\"a} if you want to be absolutely sure that it works. This seems to be fairly standard practice.
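As a minimal sketch of that practice, the document below keeps everything in plain ASCII, so it does not depend on any input encoding. The file name ascii-demo.bib, the key example, and the names in the entry are made-up placeholders, not a real reference.

```latex
% Sketch only: the .bib content is written out by filecontents* so the
% example is self-contained. All names and the key are hypothetical.
\begin{filecontents*}{ascii-demo.bib}
@misc{example,
  author = {J{\"u}rgen M{\"a}rz and Ren{\'e} Nov{\'a}k},
  title  = {A Placeholder Title with {\ss} and {\L}},
  year   = {2000}
}
\end{filecontents*}
\documentclass{article}
\begin{document}
% No inputenc needed: both files contain ASCII characters only.
See \cite{example}.
\bibliographystyle{plain}
\bibliography{ascii-demo}
\end{document}
```

Run pdflatex, then bibtex, then pdflatex twice. Because the accents are spelled as commands, the same .bib file can be shared between documents that use different encodings.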
The BibTeX manual BibTeXing by Oren Patashnik also details this:
BibTeX now handles accented characters. For example if you have an entry with the two fields
author = "Kurt G{\"o}del", year = 1931,
and if you're using the alpha bibliography style, then BibTeX will construct the label
[Göd31]
for this entry, which is what you'd want. To get this feature to work you must place the entire accented character in braces; in this case either {\"o} or {\"{o}} will do. Furthermore these braces must not themselves be enclosed in braces (other than the ones that might delimit the entire field or the entire entry); and there must be a backslash as the very first character inside the braces. Thus neither {G{\"{o}}del} nor {G\"{o}del} will work for this example. This feature handles all the accented characters and all but the nonbackslashed foreign symbols found in Tables 3.1 and 3.2 of the LaTeX book. This feature behaves similarly for "accents" you might define; we'll see an example shortly. For the purposes of counting letters in labels, BibTeX considers everything contained inside the braces as a single letter.
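If you want to check the brace rules from the manual yourself, something like the following should do. It is only a sketch: the file name label-demo.bib, the three keys, and the titles are hypothetical, and only the first entry uses the form the manual says produces the label [Göd31] with the alpha style.

```latex
% Sketch only: keys, titles, and the file name are placeholders.
\begin{filecontents*}{label-demo.bib}
@misc{good,
  author = "Kurt G{\"o}del",
  title  = "Placeholder title one",
  year   = 1931
}
@misc{nested,
  author = "Kurt {G{\"{o}}del}",
  title  = "Placeholder title two",
  year   = 1931
}
@misc{nobackslashfirst,
  author = "Kurt {G\"{o}del}",
  title  = "Placeholder title three",
  year   = 1931
}
\end{filecontents*}
\documentclass{article}
\begin{document}
\nocite{*}                 % list every entry without citing it in the text
\bibliographystyle{alpha}  % the style that builds labels from author names
\bibliography{label-demo}
\end{document}
```

After pdflatex, bibtex, pdflatex, compare the labels in the generated .bbl file. According to the manual, the second and third forms are the ones that do not get the special accent handling.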
You can change the input encoding on the fly:
\inputencoding{latin2}
\bibliography{mybib}
\inputencoding{utf8}
The \inputencoding command is provided by the inputenc package.
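In context, a complete document could look roughly like this. It is a sketch under assumptions: the main file is saved as UTF-8, mybib.bib is saved as ISO-8859-2 (latin2), and the key somekey is a placeholder for an entry in your own file.

```latex
% Sketch only: this file is assumed to be UTF-8, mybib.bib latin2.
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}   % encoding of the main document
\begin{document}
Text typed directly in UTF-8: Dvořák \cite{somekey}. % somekey is hypothetical

\bibliographystyle{plain}
% BibTeX copies the bytes of mybib.bib into the .bbl file, so switch
% to the encoding of the .bib file while the .bbl is being read.
\inputencoding{latin2}
\bibliography{mybib}
\inputencoding{utf8}          % back to the encoding of this file
\end{document}
```

This only changes how LaTeX interprets the bytes. BibTeX itself still treats the .bib file as 8-bit text, so sorting and label generation may not handle the non-ASCII characters the way you expect.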
BibTeX has serious problems with non-ASCII characters, even in the newest version. If you prefer a modern system, I recommend the combination of biblatex and biber. Both are still in beta stage, but they work quite well even in production environments. With this combination, most problems related to LaTeX bibliographies will vanish. As a side note, the biblatex documentation also contains a section about encoding issues with traditional BibTeX (§ 2.4.3).
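For the case in the question (.tex in ISO-8859-2, .bib in UTF-8), a biblatex setup might look like the sketch below. It assumes the main file is saved as latin2 and that mybib.bib is your UTF-8 bibliography; the bibencoding option tells biblatex and biber which encoding the .bib file uses.

```latex
% Sketch only: this file is assumed to be ISO-8859-2, mybib.bib UTF-8.
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[latin2]{inputenc}  % encoding of the main document
\usepackage[backend=biber,bibencoding=utf8]{biblatex}
\addbibresource{mybib.bib}     % note: the .bib extension is required here
\begin{document}
\nocite{*}                     % list every entry without citing it
\printbibliography
\end{document}
```

Compile with pdflatex, then biber, then pdflatex again. Biber reads the UTF-8 file and writes the .bbl in the encoding of the document, so no manual conversion should be needed.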
BibTeX's support for non-ASCII character encodings is unreliable: sometimes it works, most of the time it doesn't, and officially it is not supported.
Personally, in .bib files I stick to basic ASCII and LaTeX escapes like \"o. For .tex files, if I don't write in English, I use UTF-8 with \usepackage[utf8]{inputenc}.