How to set up Vim properly for editing in UTF-8

I've run into problems a few times because vim's encoding defaulted to latin1 and I assumed, without checking, that it was using utf-8. Now that I've noticed, I'd like to set up vim so that it does the right thing in all obvious cases and uses utf-8 by default.

What I'd like to avoid:

  • Forcing a file saved in some other encoding that would have worked before my changes to open as utf-8, resulting in gibberish.
  • Forcing a terminal that doesn't support multibyte characters (like the Windows XP one) to try to display them anyway, resulting in gibberish.
  • Interfering with other programs' ability to read or edit the files. (I have a perhaps unjustified aversion to using a BOM by default, because I am unclear on how likely it is to mess other programs up.)
  • Other issues that I don't know enough about to guess at (but hopefully you do!)

What I've got so far:

if has("multi_byte")
  if &termencoding == ""
    let &termencoding = &encoding
  endif
  set encoding=utf-8                     " better default than latin1
  setglobal fileencoding=utf-8           " change default file encoding when writing new files
  "setglobal bomb                        " use a BOM when writing new files
  set fileencodings=ucs-bom,utf-8,latin1 " order to check for encodings when reading files
endif

This is taken and slightly modified from the vim wiki. I moved the bomb from setglobal fileencoding to its own statement because otherwise it doesn't actually work. I also commented out that line because of my uncertainty towards BOMs.

What I'm looking for:

  • Possible pitfalls to avoid that I missed
  • Problems with the existing code
  • Links to anywhere this has been discussed / set out already

Ultimately, I'd like this to result in a no-thought-required copy/paste snippet that will set up vim for utf-8-by-default that will work across platforms.

EDIT: I've marked my own answer as accepted for now; as far as I can tell, it works okay and accounts for everything it can reasonably account for. But it's not set in stone; if you have any new information, please feel free to answer!


In response to sehe, I'll give a go at answering my own question! I removed the updates I made to the original question and have moved them to this answer. This is probably the better way to do it.

The answer:

if has("multi_byte")
  if &termencoding == ""
    let &termencoding = &encoding
  endif
  set encoding=utf-8                     " better default than latin1
  setglobal fileencoding=utf-8           " change default file encoding when writing new files
endif

I removed the bomb line because according to the BOM Wikipedia page it is not needed when using utf-8 and in fact defeats ASCII backwards compatibility. As long as ucs-bom is first in fileencodings, vim will be able to detect and handle existing files with BOMs, so it is not needed for that either.
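To check how vim handled a particular file, something like the following can be typed at the command line after opening it. This is only a sketch for inspection; the commented-out lines assume you actually want to strip the BOM from that file.

```vim
" Show whether vim detected a BOM when it read the current buffer
setlocal bomb?
" Show which encoding vim settled on for this file
setlocal fileencoding?

" Assumption: you want this file written back without a BOM
" setlocal nobomb
" write
```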

I removed the fileencodings line because it is not needed in this case. From the Vim docs: "When 'encoding' is set to a Unicode encoding, and 'fileencodings' was not set yet, the default for 'fileencodings' is changed."
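If you want to confirm this on your own installation rather than take it on faith, a minimal check is to query the option after encoding has been set. The value in the comment is what the documentation lists as the Unicode default; your build or another script may differ.

```vim
set encoding=utf-8
" Print the current value and where it was last set.
" Documented default with a Unicode 'encoding': ucs-bom,utf-8,default,latin1
verbose set fileencodings?
```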

I am using setglobal fileencoding (as opposed to set fileencoding) because when reading a file, fileencoding is set automatically based on fileencodings, so the setting only matters for new files. And according to the docs again:

For a new file the global value of 'fileencoding' is used.
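The difference between the global and the buffer-local value can be seen by querying them separately. This is just an illustration, to be run in a buffer that was read from an existing file:

```vim
" Set only the global value, which is the one used for new files
setglobal fileencoding=utf-8

" Global value: utf-8
setglobal fileencoding?
" Buffer-local value: whatever vim detected when it read this file
setlocal fileencoding?
```

Using plain set here would also overwrite the local value for the current buffer, which is not what is wanted in a vimrc.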


I think it would suffice to have a vanilla vimrc + fenc=utf-8

The rest should be pretty decent out-of-the-box

I'd use the BOM only on Windows platforms with Microsoft tooling (although even some of these fail to always write a BOM; however it is the default for Notepad Unicode saving, .NET XmlWriter and other central points of the MS platform tools)
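One way to express that preference in a vimrc, as a rough sketch: it assumes that checking the platform is a good enough stand-in for "files that Microsoft tooling will read", which may not hold for every project.

```vim
if has("multi_byte") && (has("win32") || has("win64"))
  " Assumption: on Windows, new files should be written with a BOM
  setglobal bomb
endif
```

Existing files are not affected, since vim sets bomb per buffer according to what it finds when reading the file.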
