How to check if a .txt file is in ASCII or UTF-8 format in Windows environment?

I have converted a .txt file from ASCII to UTF-8 using UltraEdit. However, I am not sure how to verify that it is in UTF-8 format in a Windows environment.

Thank you!


Open the file in Notepad. Click 'Save As...'. In the 'Encoding:' combo box you will see the current file format.


Text files in Windows don't carry any record of their encoding. There's an unofficial convention that a file starting with the BOM code point (U+FEFF) encoded in UTF-8 is a UTF-8 file, but that convention isn't universally supported. That would be the 3-byte sequence "\xef\xbb\xbf", i.e. ï»¿ in the Latin-1 character set.
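Since the BOM is only a convention, a file without one can still be valid UTF-8. Here is a rough Python sketch of a check that looks at the bytes themselves instead of relying on the BOM alone. The function name and the returned labels are made up for illustration, and a file that fails the UTF-8 test could be in any legacy code page, which this does not try to identify.

```python
import codecs


def guess_text_encoding(data):
    """Classify raw file bytes as ASCII, UTF-8 (with or without BOM), or neither."""
    has_bom = data.startswith(codecs.BOM_UTF8)  # b"\xef\xbb\xbf"

    # Pure ASCII contains only bytes below 0x80 (a BOM is not ASCII).
    try:
        data.decode("ascii")
        return "ASCII"
    except UnicodeDecodeError:
        pass

    # Strict UTF-8 decoding rejects byte sequences that are not well-formed.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8"

    return "UTF-8 with BOM" if has_bom else "UTF-8 without BOM"
```

To use it, read the file in binary mode, for example open(path, "rb").read(), and pass the bytes to the function. Keep in mind that a pure ASCII file is also valid UTF-8, so "ASCII" here just means that no byte above 0x7F was found.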


Open the file in Notepad++ and look at the "Encoding" menu. It shows the current encoding and lets you convert the file to any of the other available encodings.


Open it in a hex editor and make sure that the first three bytes are a UTF-8 BOM (EF BB BF).
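If no hex editor is at hand, the same look at the first bytes can be done with a few lines of Python. This is only a sketch: the function name is hypothetical, and the separator argument to bytes.hex() needs Python 3.8 or later.

```python
def first_bytes_hex(path, count=3):
    """Return the first `count` bytes of a file as upper-case hex pairs."""
    # Binary mode matters: text mode would decode the bytes before we see them.
    with open(path, "rb") as f:
        return f.read(count).hex(" ").upper()
```

For a file saved as UTF-8 with a BOM, the result should be EF BB BF. Anything else means there is no UTF-8 BOM, although the file may still be UTF-8 without one.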


If you use Windows 10 and have Windows Subsystem for Linux (WSL) installed, this is easily done by typing "file" followed by the file name in the shell.

For example:

$ file code.cpp

code.cpp: C source, UTF-8 Unicode (with BOM) text, with CRLF line terminators


I had a directory of files that I wanted to check, so I created an Excel macro to tell ANSI from UTF-8. It only recognizes UTF-8 files that start with a BOM; anything else is reported as ANSI. This worked for me.

    Sub GetTextFileEncoding()
        Dim sFile As String
        Dim sPath As String
        Dim iRow As Long

        'Set defaults and initial values
        iRow = 1
        sPath = "C:textfiles\"
        sFile = Dir(sPath & "*.txt")

        Do While Len(sFile) > 0
            'Get file type
            'Debug.Print sFile & " - " & FileEncodeType(sPath & sFile)

            'Show file name and detected encoding on the active worksheet
            Cells(iRow, 1).Value = sFile
            Cells(iRow, 2).Value = FileEncodeType(sPath & sFile)

            'Get next file
            sFile = Dir

            'Increment row
            iRow = iRow + 1
        Loop
    End Sub

    Function FileEncodeType(sFile As String) As String
        Dim iFile As Integer
        Dim bBom(0 To 2) As Byte

        'Default when no UTF-8 BOM is found
        FileEncodeType = "ANSI"

        'Read raw bytes, so the result does not depend on the system code page
        iFile = FreeFile
        Open sFile For Binary Access Read As #iFile
            'Files shorter than three bytes cannot hold a BOM
            If LOF(iFile) >= 3 Then
                Get #iFile, 1, bBom
                'UTF-8 BOM is EF BB BF (239, 187, 191)
                If bBom(0) = 239 And bBom(1) = 187 And bBom(2) = 191 Then
                    FileEncodeType = "UTF-8"
                End If
            End If
        Close #iFile
    End Function
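
For anyone without Excel at hand, roughly the same directory scan can be sketched in Python. The function name and labels are made up for illustration. Like the macro, it only looks for the UTF-8 BOM, so a UTF-8 file saved without a BOM is reported as "no BOM".

```python
import codecs
from pathlib import Path


def scan_txt_files(folder):
    """Map each .txt file name in `folder` to 'UTF-8 (BOM)' or 'no BOM'."""
    results = {}
    for path in sorted(Path(folder).glob("*.txt")):
        # Only the first three bytes are needed to spot the BOM.
        with path.open("rb") as f:
            head = f.read(len(codecs.BOM_UTF8))
        results[path.name] = "UTF-8 (BOM)" if head == codecs.BOM_UTF8 else "no BOM"
    return results
```

Calling it with the folder path returns a dictionary that can be printed or written out one line per file.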