Identifying file type in Java

Please help me find out the type of a file that is being uploaded. I want to distinguish between Excel and CSV files.

The MIME type returned is the same for both of these files. Please help.


I use Apache Tika, which detects the MIME type from magic byte patterns and glob hints (the file extension). It also supports additional parsing of file contents (which I don't really use).

Here is a quick and dirty example of how Tika can be used to detect the file type without performing any additional parsing on the file:

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.HashMap;

import org.apache.tika.metadata.HttpHeaders;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.metadata.TikaMetadataKeys;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.xml.sax.helpers.DefaultHandler;

public class Detector {

    public static void main(String[] args) throws Exception {
        File file = new File("/path/to/file.xls");

        AutoDetectParser parser = new AutoDetectParser();
        parser.setParsers(new HashMap<MediaType, Parser>());

        Metadata metadata = new Metadata();
        metadata.add(TikaMetadataKeys.RESOURCE_NAME_KEY, file.getName());

        // try-with-resources closes the stream even if parsing throws
        try (InputStream stream = new FileInputStream(file)) {
            parser.parse(stream, new DefaultHandler(), metadata, new ParseContext());
        }

        String mimeType = metadata.get(HttpHeaders.CONTENT_TYPE);
        System.out.println(mimeType);
    }

}


I hope this helps. It is taken from an example that is not my own:

import javax.activation.MimetypesFileTypeMap;
import java.io.File;

class GetMimeType {
  public static void main(String args[]) {
    File f = new File("test.gif");
    System.out.println("Mime Type of " + f.getName() + " is " +
                         new MimetypesFileTypeMap().getContentType(f));
    // expected output :
    // "Mime Type of test.gif is image/gif"
  }

}

The same may work for Excel and CSV files, but I have not tested it.


I figured out a cheaper way of doing this with java.nio.file.Files:

public String getContentType(File file) throws IOException {
        return Files.probeContentType(file.toPath());
}

- or -

public String getContentType(Path filePath) throws IOException {
        return Files.probeContentType(filePath);
}
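
For completeness, here is a minimal runnable sketch of the same call with a fallback, since Files.probeContentType returns null when the content type cannot be determined. The class name ProbeExample, the helper probeOrDefault and the application/octet-stream fallback are my own illustrative choices, not part of the JDK. The result depends on the file type detectors available on the platform, so I have not listed an expected output.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical wrapper class, for illustration only.
class ProbeExample {

    // Returns the probed content type, or the given fallback when the
    // platform cannot determine one (probeContentType returns null then).
    static String probeOrDefault(Path file, String fallback) throws IOException {
        String type = Files.probeContentType(file);
        return type != null ? type : fallback;
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway CSV file so the example is self-contained.
        Path tmp = Files.createTempFile("upload-", ".csv");
        try {
            Files.write(tmp, "a,b\n1,2\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(probeOrDefault(tmp, "application/octet-stream"));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```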

Hope that helps.

Cheers.


A better way without using javax.activation.*:

URLConnection.guessContentTypeFromName(f.getAbsolutePath());
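
A small self-contained sketch of how that call might be wrapped. Keep in mind that it looks only at the file name, so a renamed upload will fool it, and it returns null for extensions the JDK's built-in table does not know. NameBasedGuess, guess and the fallback value are hypothetical names I made up for the example.

```java
import java.net.URLConnection;

// Hypothetical wrapper class, for illustration only.
class NameBasedGuess {

    // Guesses a MIME type from the file name alone; falls back to a
    // generic binary type when the JDK has no mapping for the extension.
    static String guess(String fileName) {
        String type = URLConnection.guessContentTypeFromName(fileName);
        return type != null ? type : "application/octet-stream";
    }

    public static void main(String[] args) {
        // Whether .xls and .csv are mapped depends on the JDK's own table.
        for (String name : new String[] {"report.xls", "report.csv", "test.gif"}) {
            System.out.println(name + " -> " + guess(name));
        }
    }
}
```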


If you are already using Spring, this works for CSV and Excel:


import org.springframework.mail.javamail.ConfigurableMimeFileTypeMap;

import javax.activation.FileTypeMap;
import java.io.IOException;

public class ContentTypeResolver {

    private FileTypeMap fileTypeMap;

    public ContentTypeResolver() {
        fileTypeMap = new ConfigurableMimeFileTypeMap();
    }

    public String getContentType(String fileName) throws IOException {
        if (fileName == null) {
            return null;
        }
        return fileTypeMap.getContentType(fileName.toLowerCase());
    }

}

Alternatively, with javax.activation you can update the mime.types file.


A CSV file starts with plain text, whereas an Excel file is most likely binary (a legacy .xls file is an OLE2 compound document and an .xlsx file is a ZIP archive).

However, the simplest approach is to try to load the file as an Excel document using POI. If that fails, try to load it as a CSV; if that also fails, it is probably neither type.
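
The text-versus-binary observation above can be sketched without any library by looking at the first bytes of the upload. This is only a rough heuristic under my own assumptions: SpreadsheetSniffer and its Kind values are hypothetical names, the 512-byte window is arbitrary, and the "no NUL bytes means text" rule assumes the CSV is not UTF-16 encoded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of any library.
class SpreadsheetSniffer {

    enum Kind { OLE2_CONTAINER, ZIP_CONTAINER, PROBABLY_TEXT, UNKNOWN }

    // Header of an OLE2 compound document (used by legacy .xls, but also .doc).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header (used by .xlsx, but also .docx, .jar, ...).
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    // Classifies the leading bytes of a file.
    static Kind sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return Kind.OLE2_CONTAINER;
        if (startsWith(head, ZIP_MAGIC)) return Kind.ZIP_CONTAINER;
        if (head.length == 0) return Kind.UNKNOWN;
        for (byte b : head) {
            // Assumption: NUL bytes do not occur in a non-UTF-16 CSV file.
            if (b == 0) return Kind.UNKNOWN;
        }
        return Kind.PROBABLY_TEXT;
    }

    // Reads up to 512 bytes from the file and classifies them (Java 9+).
    static Kind sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[512];
            int n = in.readNBytes(buf, 0, buf.length);
            return sniff(Arrays.copyOf(buf, n));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A container match only says the file is OLE2 or ZIP, not that it is really a spreadsheet, so a parser such as POI is still needed to confirm it.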
