Fastest way to list a directory and get the URL of every file in Java

I am planning to run a standard list command to get a vector or list of the contents of a directory.

I know this is easy by using

File f = new File("C:/testDir");
File[] files = f.listFiles();

The problem is that I need a list/array/vector of URLs, so my idea was to convert the files to URLs. With the org.apache.commons.io.FileUtils class from Apache Commons IO, this is possible with the following simple code:

URL[] urls = FileUtils.toURLs(files);

This does exactly what I need, but unfortunately it is very slow (especially for directories with thousands of files), even though it just loops over the array and calls the "toURL()" method on every single File object.

Does anyone know a faster way to do this?


The only simple optimization is to reduce object creation, which should give a modest improvement in performance. Instead of using listFiles(), which creates a whole slew of File objects, use list() to get a String array of just the file names (not the paths) and create the URLs directly. Creating and storing Strings has less object overhead in this case. The string manipulation could obviously be made faster and more robust as well, although it probably won't make a huge difference.

Something like:

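ArrayList<URL> urls = new ArrayList<URL>(); // or use an array if you prefer
// list() returns only the file names, or null if f is not a readable directory
for (String name : f.list())
    urls.add(new URL("file://" + f.getPath() + "/" + name)); // may throw MalformedURLException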


Creating a new URL object instead of invoking the toURL() method seems to be more efficient. I checked this with the following test:

    File parent=new File("./doc");
    File[] listado=parent.listFiles();
    long t0=0L;
    try {
       t0=System.currentTimeMillis();
       for(int k=0;k<10000;k++) {
        URL[] listaArchivos=new URL[listado.length];
        for (int i = 0; i < listado.length; i++) {
            listaArchivos[i]=listado[i].toURL();
        }
       } 
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("Files:"+listado.length+"; Time 1: "+(System.currentTimeMillis()-t0)+" ms");


    try {
        t0=System.currentTimeMillis();
        for(int k=0;k<10000;k++) {
            URL[] listaArchivos=new URL[listado.length];
            for (int i = 0; i < listado.length; i++) {
                listaArchivos[i]=new URL("file://"+listado[i].getAbsolutePath());
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }           
    System.out.println("Files:"+listado.length+"; Time 2: "+(System.currentTimeMillis()-t0)+" ms");

My output is:

Files:14; Time 1: 1985 ms
Files:14; Time 2: 516 ms


If you really have that many files, you might want to use several threads, with each of n threads converting 1/n of the files.

For this to pay off, you need a very large number of files.
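
As a rough sketch of that idea, the following splits the File array into contiguous chunks and converts each chunk on its own pool thread. The class name ParallelUrlConverter, the method names, and the thread count of 4 are illustrative assumptions, and the conversion uses File.toURI().toURL() rather than string concatenation.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelUrlConverter {

    // Converts files[from..to) into the matching slots of urls.
    static void convertRange(File[] files, URL[] urls, int from, int to) {
        for (int i = from; i < to; i++) {
            try {
                urls[i] = files[i].toURI().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalStateException("Cannot convert " + files[i], e);
            }
        }
    }

    // Splits the array into at most `threads` chunks and converts them concurrently.
    public static URL[] toUrls(File[] files, int threads) {
        URL[] urls = new URL[files.length];
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (files.length + threads - 1) / threads; // ceiling division
            List<Future<?>> pending = new ArrayList<>();
            for (int start = 0; start < files.length; start += chunk) {
                final int from = start;
                final int to = Math.min(start + chunk, files.length);
                pending.add(pool.submit(() -> convertRange(files, urls, from, to)));
            }
            for (Future<?> task : pending) {
                task.get(); // wait for the chunk and surface any failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return urls;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        File[] files = dir.listFiles();
        if (files == null) {
            System.err.println("Not a readable directory: " + dir);
            return;
        }
        URL[] urls = toUrls(files, 4);
        System.out.println("Converted " + urls.length + " files");
    }
}
```

Each thread writes to a disjoint range of the result array, so no locking is needed, and the output order matches the input order. The cost of starting the pool is likely to outweigh the gain for small directories.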


Your solution is fine, and you shouldn't worry about performance, unless you have tens of thousands of files in that directory.

A performance optimization might be to cache the array of URLs if this functionality is used a lot.
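
A minimal sketch of such a cache, assuming it is acceptable to serve a stale listing until the caller explicitly invalidates it. The class UrlListingCache and its method names are made up for illustration.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class UrlListingCache {

    private final Map<File, URL[]> cache = new ConcurrentHashMap<>();

    // Returns the cached listing, computing it on the first request for this directory.
    public URL[] urlsFor(File dir) {
        return cache.computeIfAbsent(dir, UrlListingCache::listUrls);
    }

    // Drops a cached listing, e.g. once the directory is known to have changed.
    public void invalidate(File dir) {
        cache.remove(dir);
    }

    static URL[] listUrls(File dir) {
        File[] files = dir.listFiles();
        if (files == null) {
            return new URL[0]; // not a directory, or an I/O error occurred
        }
        URL[] urls = new URL[files.length];
        for (int i = 0; i < files.length; i++) {
            try {
                urls[i] = files[i].toURI().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalStateException("Cannot convert " + files[i], e);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        UrlListingCache cache = new UrlListingCache();
        File dir = new File(".");
        URL[] first = cache.urlsFor(dir);
        URL[] second = cache.urlsFor(dir);
        System.out.println(first == second); // true: the second call is served from the cache
    }
}
```

Note that callers share the same array instance, so they should treat it as read-only, and files added or removed after the first call are not seen until invalidate() is called.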

That said, measure how long this takes on a directory with 2k files, and then optimize.


Other people have responded that constructing the URLs by string concatenation (e.g. "file://" + dirPath + "/" + file.getName()) is a lot faster than calling File.toURI().toString(). For instance, the OP reports a 5-fold speedup. I wondered why there is such a difference.

Apparently, one reason is that the toURI() method checks to see if this is a directory, and appends a / if it is. The corollary is that a URL for a directory produced by String concatenation won't have a trailing /.
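
To see the trailing-slash difference, here is a small sketch that prints both forms for an existing directory. It assumes the directory named by the java.io.tmpdir system property exists, and the class and method names are only illustrative.

```java
import java.io.File;

public class TrailingSlashDemo {

    // File.toURI() asks the file system whether the path is a directory.
    static String viaToUri(File file) {
        return file.toURI().toString();
    }

    // Plain concatenation never touches the file system.
    static String viaConcatenation(File file) {
        return "file://" + file.getAbsolutePath();
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(viaToUri(dir));         // ends with "/" for an existing directory
        System.out.println(viaConcatenation(dir)); // no trailing "/"
    }
}
```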

There's another caveat with creating "file:" URLs by string concatenation: if the names in the file's path contain reserved characters (per the URL / URI specs), string concatenation may produce a malformed URL / URI. The reserved characters typically need to be %-escaped. Furthermore, on Windows it is not entirely clear how drive letters should be represented in "file:" URLs.
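
The escaping problem is easy to reproduce with a file name containing a space and a "#". The sketch below uses a hypothetical file name (the file does not need to exist) and checks each result with java.net.URI, which rejects strings that are not valid URIs. The class and method names are illustrative.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class FileUrlEscaping {

    // Naive approach: glue "file://" onto the absolute path, with no escaping.
    static String byConcatenation(File file) {
        return "file://" + file.getAbsolutePath();
    }

    // Let java.io.File build the URI so that illegal characters are %-escaped.
    static URL byToUri(File file) {
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Cannot convert " + file, e);
        }
    }

    // True if the string parses as a URI according to java.net.URI.
    static boolean isValidUri(String candidate) {
        try {
            new URI(candidate);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        File file = new File("my report #1.txt"); // hypothetical name
        String naive = byConcatenation(file);
        String escaped = byToUri(file).toString();
        System.out.println(naive + " valid=" + isValidUri(naive));
        System.out.println(escaped + " valid=" + isValidUri(escaped));
    }
}
```

The concatenated string keeps the raw space and "#", so it is rejected, while the toURI() form encodes them as %20 and %23. That extra work is part of what the faster approach skips.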
