Check if file has a CSV format with Python

Could someone provide an effective way to check whether a file is in CSV format using Python?


You could try something like the following, but getting a dialect back from csv.Sniffer is not sufficient to guarantee that you have a valid CSV document.

import csv

# Open in text mode: on Python 3, csv.Sniffer expects str, not bytes.
csv_fileh = open(somefile, newline='')
try:
    dialect = csv.Sniffer().sniff(csv_fileh.read(1024))
    # Perform various checks on the dialect (e.g., lineterminator,
    # delimiter) to make sure it's sane

    # Don't forget to reset the read position back to the start of
    # the file before reading any entries.
    csv_fileh.seek(0)
except csv.Error:
    # File appears not to be in CSV format; move along
    pass
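
The "checks on the dialect" mentioned in the comments could look something like this. It is only a sketch: the function name sniff_sane_dialect and the ALLOWED_DELIMITERS whitelist are made up for illustration, and you would adjust the whitelist to the delimiters you actually expect.

```python
import csv

# Hypothetical whitelist (an assumption, not part of the csv module).
ALLOWED_DELIMITERS = {",", ";", "\t", "|"}


def sniff_sane_dialect(sample):
    """Return the sniffed dialect if its delimiter looks plausible, else None."""
    try:
        dialect = csv.Sniffer().sniff(sample)
    except csv.Error:
        # The sniffer could not settle on any delimiter.
        return None
    # The sniffer may pick an unlikely character (a letter, a space),
    # so reject anything outside the whitelist.
    if dialect.delimiter not in ALLOWED_DELIMITERS:
        return None
    return dialect
```

Passing the first kilobyte or so of the file as sample, as in the snippet above, should be enough for most inputs.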


Adding to the answer by gotgenes: I got good results by also checking for non-printable characters, which should(tm) not appear in CSV files.

import csv
import string

def is_csv(infile):
    try:
        with open(infile, newline='') as csvfile:
            start = csvfile.read(4096)

            # isprintable does not allow newlines, printable does not allow umlauts...
            if not all(c in string.printable or c.isprintable() for c in start):
                return False
            # Only the side effect matters: sniff() raises csv.Error on failure.
            csv.Sniffer().sniff(start)
            return True
    except (csv.Error, UnicodeDecodeError):
        # Could not decode the file or get a csv dialect -> probably not a csv.
        return False


Python has a csv module, so you could try parsing it under a variety of different dialects.
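
A rough sketch of that idea, using the delimiter as a stand-in for "dialect": parse a sample with each candidate and keep the ones that split every row into the same number of fields. The name plausible_delimiters and the CANDIDATE_DELIMITERS list are assumptions made for this example.

```python
import csv
import io

# Assumed set of delimiters worth trying; extend as needed.
CANDIDATE_DELIMITERS = [",", ";", "\t", "|"]


def plausible_delimiters(sample):
    """Return the candidates that give every row the same field count (> 1)."""
    matches = []
    for delimiter in CANDIDATE_DELIMITERS:
        reader = csv.reader(
            io.StringIO(sample, newline=""), delimiter=delimiter, strict=True
        )
        try:
            # Skip empty rows, which the reader yields for blank lines.
            rows = [row for row in reader if row]
        except csv.Error:
            continue
        widths = {len(row) for row in rows}
        if rows and len(widths) == 1 and widths.pop() > 1:
            matches.append(delimiter)
    return matches
```

An empty result suggests the sample is not delimited text under any of the candidates; more than one match means the sample is ambiguous and you need another tie-breaker.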


Try parsing it as CSV and see if you get an error.
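
For instance, something along these lines (parses_as_csv is a made-up name, and UTF-8 is an assumed default encoding). Note that the csv reader is very forgiving unless you pass strict=True, and even then almost any plain text file parses as a one-column CSV, so this mostly rules out binary files and badly quoted input.

```python
import csv


def parses_as_csv(path, encoding="utf-8"):
    """Return True if the whole file parses under the strict csv reader."""
    try:
        with open(path, newline="", encoding=encoding) as handle:
            # Consume every row; strict=True raises csv.Error on bad quoting.
            for _ in csv.reader(handle, strict=True):
                pass
    except (csv.Error, UnicodeDecodeError):
        return False
    return True
```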


You need to think clearly about what you consider a CSV file to be.

For example, what sort of characters can occur between the commas? Is it text-only? Can it contain Unicode characters as well? Should every line have the same number of commas?

There is no strict definition of a CSV file that I'm aware of. Usually it's ASCII text separated by commas and every line has the same number of commas and is terminated by your platform's line terminator.

Anyway, once you have answered the questions above, you'll be a bit further along in knowing how to detect whether a file is a CSV file.
