File.lastModified() painfully slow!

I'm doing a recursive copy of files and, like xcopy /D, I only want to copy files that are newer than the destination files (I cannot use xcopy directly since I need to alter some files in the copy process).

In Java I use lastModified() to check whether the destination file is older than the source file, and it's very slow.

  • Can I speed up the process (maybe using JNI??)?
  • Are there any other copy scripts that can do the job better (copy new files + regexp change some text files)?

Copying every file regardless is not an option, since that would take more time than checking the last-modified date (the copy goes over the network).
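For reference, the xcopy /D style check described above can be sketched as follows. The class and method names are hypothetical and not taken from the asker's program; this is only a minimal illustration of the comparison, and each call still costs a file system round trip per file.

```java
import java.io.File;

class NewerFileCheck {
    /**
     * Returns true when src should be copied over dst, in the spirit of
     * xcopy /D: copy when dst is missing or older than src.
     */
    static boolean needsCopy(File src, File dst) {
        // File.lastModified() returns 0L when the file does not exist,
        // so a missing destination always compares as older.
        return src.lastModified() > dst.lastModified();
    }
}
```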


You need to determine why it is so slow.

When you are running the program, what is the CPU utilisation of your process? If it is more than 50% user time, you should be able to optimise your program; if it is less than 20%, there isn't much you can do.

Usually this method is slow because the file you are examining is on disk rather than in memory. If this is the case you need to speed up how you access your disk, or get a faster drive. e.g. SSD can be 10-100x faster at doing this.

A bulk query might help. You can do this by using multiple threads to check the lastModified date. e.g. have a fixed size thread pool and add a task for each file. The size of the thread pool determines the number of files polled at once.

This allows the OS to re-order the requests to suit the layout on the disk. Note: this is fine in theory, but you have to test whether it makes things faster on your OS/hardware, as it's just as likely to make things slower. ;)
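A minimal sketch of the thread pool idea, assuming a plain fixed-size pool from java.util.concurrent. The class and method names are made up for illustration, and the pool size is something you would have to tune by measurement.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelLastModified {
    /** Queries lastModified() for every file, with at most poolSize lookups in flight. */
    static Map<File, Long> lastModifiedAll(List<File> files, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            // Submit one task per file; the pool size caps how many run at once.
            List<Future<Long>> futures = new ArrayList<>();
            for (File f : files) {
                Callable<Long> task = f::lastModified;
                futures.add(pool.submit(task));
            }
            // Collect the results in the same order the tasks were submitted.
            Map<File, Long> result = new HashMap<>();
            for (int i = 0; i < files.size(); i++) {
                result.put(files.get(i), futures.get(i).get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

Timing this against a plain loop on the real network share is the only way to tell whether it helps.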


So I ran across this on network drives. Painful. I had a directory with 17000+ files on it. On a local drive it took less than 2 seconds to check the last modified date. On a networked drive it took 58 seconds!!! Of course my app is an interactive app so I had some complaints.

After some research I decided it would be possible to write some JNI code that calls the Windows Kernel32 FindFirstFile/FindNextFile/FindClose functions to dramatically speed up the process, but then I would need 32-bit and 64-bit versions etc. (ugh) and would lose the cross-platform capability.

Although it's a bit of a nasty hack, here is what I did. My app runs mostly on Windows, but I didn't want to restrict it to Windows, so I did the following. Check whether I am running on Windows. If so, check whether the file is on a local hard disk. If it is not, use the hackish method: run dir on the directory, parse its output, and cache the dates.

I stored everything case-insensitively. That is probably not a great idea for other OSes, which may have a directory containing both 'ABC' and 'abc'. If you need to care about this, you can decide by creating a new File("ABC") and a new File("abc") and comparing them with the equals method. On case-insensitive systems like Windows it will return true, but on Unix systems it will return false.
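The equals check just described can be sketched like this (the class and method names are hypothetical). Note that File.equals reflects the platform's path comparison rules, not the behaviour of any particular mounted file system, so treat the result as a hint.

```java
import java.io.File;

class CaseSensitivityCheck {
    /** True when java.io.File treats names differing only in case as the same path. */
    static boolean pathsAreCaseInsensitive() {
        // No file is touched here: File.equals only compares the abstract pathnames.
        return new File("ABC").equals(new File("abc"));
    }
}
```

If this returns false, the cache keys should keep their original case instead of being lower-cased.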

Although it may be a little hackish the time it took went from 58 seconds to 1.6 seconds on a network drive so I can live with the hack.

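    import java.io.BufferedReader;
    import java.io.File;
    import java.io.InputStreamReader;
    import java.text.DateFormat;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.HashMap;
    import java.util.Map;
    import javax.swing.filechooser.FileSystemView;

    // The class name, the method name and the early cache lookup are assumptions added
    // to make the fragment self-contained; the original post showed only the method body.
    public class CachedLastModified
    {
        // Keyed by lower-cased file name only, so one instance is assumed to serve a single directory.
        private final Map<String, Long> dirCache = new HashMap<String, Long>();

        public long lastModified(File f)
        {
            Long cached = dirCache.get(f.getName().toLowerCase());
            if(cached != null)
                return cached;

            boolean useJavaDefaultMethod = true;

            if(System.getProperty("os.name").startsWith("Windows"))
            {
                // Walk up to the root of the path, then ask Windows what kind of drive it is.
                File f2 = f.getParentFile();
                while(f2 != null && f2.getParentFile() != null)
                    f2 = f2.getParentFile();
                if(f2 != null)
                {
                    FileSystemView fsv = FileSystemView.getFileSystemView();
                    // "Local Disk" is the description on an English-language Windows; other locales differ.
                    String s = fsv.getSystemTypeDescription(f2);
                    useJavaDefaultMethod = fsv.isDrive(f2) && "Local Disk".equalsIgnoreCase(s);
                }
            }
            if(!useJavaDefaultMethod)
            {
                try
                {
                    // Pass the directory as its own argument so paths containing spaces survive.
                    ProcessBuilder pb = new ProcessBuilder("cmd.exe", "/C", "dir", f.getParent());
                    pb.redirectErrorStream(true);
                    Process process = pb.start();
                    BufferedReader br = new BufferedReader(new InputStreamReader(process.getInputStream()));

                    String line;
                    // Must match the date format and column layout that dir prints on this machine.
                    DateFormat df = new SimpleDateFormat("dd-MMM-yy hh:mm a");
                    while((line = br.readLine()) != null)
                    {
                        try
                        {
                            Date filedate = df.parse(line);
                            String filename = line.substring(38);
                            dirCache.put(filename.toLowerCase(), filedate.getTime());
                        }
                        catch(Exception ex)
                        {
                            // Header, summary or blank line: not a file entry, skip it.
                        }
                    }
                    process.waitFor();

                    Long filetime = dirCache.get(f.getName().toLowerCase());
                    if(filetime != null)
                        return filetime;
                }
                catch(Exception ex)
                {
                    // Fall through to the plain Java lookup below.
                }
            }

            // this is SO SLOW on a networked drive!
            long lastModifiedDate = f.lastModified();
            dirCache.put(f.getName().toLowerCase(), lastModifiedDate);

            return lastModifiedDate;
        }
    }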


Unfortunately the way Java handles looking up lastModified is slow (basically it queries the underlying file system for each file as you request the information, there is no bulk loading of this data on listFiles or similar).

You could potentially invoke a more efficient native program to do this in bulk, but any such solution would be closely tied to the platform you deploy to.
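One thing that may be worth trying before going native, assuming Java 7 or later is available: Files.walkFileTree in java.nio.file hands each file's BasicFileAttributes to the visitor, so the timestamps arrive together with the directory walk. Whether that is actually faster than File.lastModified() on a given network share depends on the JDK and the platform, so measure it. The class and method names below are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashMap;
import java.util.Map;

class BulkLastModified {
    /** Walks root and returns the last-modified time (in ms) of each file, keyed by path relative to root. */
    static Map<Path, Long> scan(final Path root) throws IOException {
        final Map<Path, Long> times = new HashMap<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // attrs is supplied by the walk itself; no separate lastModified() call is made here.
                times.put(root.relativize(file), attrs.lastModifiedTime().toMillis());
                return FileVisitResult.CONTINUE;
            }
        });
        return times;
    }
}
```

Scanning the source and destination trees once each and comparing the two maps by relative path would then replace the per-file lastModified() calls.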


I imagine you are doing this over the network, otherwise there would be little point in the copy. Network directory operations are slow, bad luck. You could always just copy files below a certain size threshold without checking them, whatever makes the total operation take the least time.

I disagree with Kris here: there's nothing startlingly inefficient in the way Java does it, and in any case it really has to do it that way because you want the latest value.
