How to set up vim properly for editing in utf-8
I've run into problems a few times because vim's encoding was set to latin1 by default and I didn't notice, assuming it was using utf-8. Now that I have noticed, I'd like to set up vim so that it will do the right thing in all obvious cases, and use utf-8 by default.
What I'd like to avoid:
- Forcing a file saved in some other encoding that would have worked before my changes to open as utf-8, resulting in gibberish.
- Forcing a terminal that doesn't support multibyte characters (like the Windows XP one) to try to display them anyway, resulting in gibberish.
- Interfering with other programs' ability to read or edit the files (I have a (perhaps unjustified) aversion to using a BOM by default because I am unclear on how likely it is to mess other programs up.)
- Other issues that I don't know enough about to guess at (but hopefully you do!)
What I've got so far:
if has("multi_byte")
  if &termencoding == ""
    let &termencoding = &encoding
  endif
  set encoding=utf-8                      " better default than latin1
  setglobal fileencoding=utf-8            " change default file encoding when writing new files
  "setglobal bomb                         " use a BOM when writing new files
  set fileencodings=ucs-bom,utf-8,latin1  " order to check for encodings when reading files
endif
This is taken and slightly modified from the vim wiki. I moved bomb out of the setglobal fileencoding line and into its own statement because otherwise it doesn't actually work. I also commented out that line because of my uncertainty about BOMs.
What I'm looking for:
- Possible pitfalls to avoid that I missed
- Problems with the existing code
- Links to anywhere this has been discussed / set out already
Ultimately, I'd like this to result in a no-thought-required copy/paste snippet that will set up vim for utf-8-by-default that will work across platforms.
EDIT: I've marked my own answer as accepted for now. As far as I can tell it works okay and accounts for everything it can reasonably account for. But it's not set in stone; if you have any new information, please feel free to answer!
In response to sehe, I'll give a go at answering my own question! I removed the updates I made to the original question and have moved them to this answer. This is probably the better way to do it.
The answer:
if has("multi_byte")
  if &termencoding == ""
    let &termencoding = &encoding
  endif
  set encoding=utf-8            " better default than latin1
  setglobal fileencoding=utf-8  " change default file encoding when writing new files
endif
I removed the bomb line because, according to the BOM Wikipedia page, it is not needed when using utf-8 and in fact defeats ASCII backwards compatibility. As long as ucs-bom is first in fileencodings, vim will be able to detect and handle existing files with BOMs, so it is not needed for that either.
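For anyone who wants to see this on a real file, here is a small sketch of commands typed at the vim command line. It assumes ucs-bom is first in fileencodings so that a BOM is detected on read; the cleanup step is just one possible way to strip a BOM, not something the setup above requires.

```vim
" Report whether the current buffer was read with a BOM (shows bomb or nobomb).
:setlocal bomb?

" Optional cleanup: drop the BOM from this buffer and write the file back.
:setlocal nobomb
:write
```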
I removed the fileencodings line because it is not needed in this case. From the Vim docs: "When 'encoding' is set to a Unicode encoding, and 'fileencodings' was not set yet, the default for 'fileencodings' is changed."
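If you would rather not depend on that implicit default, a minimal sketch of the explicit form is below. The value shown is what I understand the Unicode default to be; that is an assumption worth checking against :help 'fileencodings' for your vim version.

```vim
if has("multi_byte")
  " Assumed to match the default vim picks once 'encoding' is a Unicode
  " encoding and 'fileencodings' has not been set by the user.
  set fileencodings=ucs-bom,utf-8,default,latin1
endif
```

Running :set fileencodings? after startup shows the value actually in effect.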
I am using setglobal fileencoding (as opposed to set fileencoding) because, when reading a file, fileencoding will be automatically set based on fileencodings. So it only matters for new files. And according to the docs again: "For a new file the global value of 'fileencoding' is used."
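To check what vim decided for a given buffer, something like the following can be typed at the command line. This is only a sketch: latin1 stands in for whatever encoding the file really uses, and the last two commands are one way to convert a file, not part of the vimrc snippet.

```vim
" Encoding vim detected for the current buffer (set from 'fileencodings' on read).
:setlocal fileencoding?

" Global value, which is what a new file will use.
:setglobal fileencoding?

" If detection guessed wrong, re-read the file with an explicit encoding.
:edit ++enc=latin1

" Convert the buffer so that it is written as utf-8 on the next save.
:setlocal fileencoding=utf-8
:write
```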
I think it would suffice to have a vanilla vimrc plus fenc=utf-8.
The rest should be pretty decent out of the box.
I'd use the BOM only on Windows platforms with Microsoft tooling (although even some of these fail to always write a BOM; however it is the default for Notepad Unicode saving, .NET XmlWriter and other central points of the MS platform tools)
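A rough sketch of that idea in a vimrc, assuming you want the BOM on every new file created on Windows (which may be broader than you need if only some of your tools expect it):

```vim
if has("multi_byte") && (has("win32") || has("win64"))
  " Write a BOM for new files on Windows only; files that are read keep
  " whatever BOM state was detected for them.
  setglobal bomb
endif
```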