C# EnumerateFiles wildcard returning non-matches?
As a simplified example, I am executing the following:
IEnumerable<string> files = Directory.EnumerateFiles(path, @"2010*.xml",
SearchOption.TopDirectoryOnly).ToList();
In my result set I am getting a few files which do not match the file pattern. According to MSDN, the searchPattern wildcard means "Zero or more characters" and is not a regex. For example, I am getting a file name back such as "2004_someothername.xml".
For information, there are in excess of 25,000 files in the folder.
Does anyone have any idea what is going on?
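This is due to how Windows does wildcard matching: it also tests the pattern against the auto-generated 8.3 (short) file names, resulting in some surprising matches! In a folder with many similarly named files, the short name can contain generated hash characters, so a file such as 2004_someothername.xml may end up with a short name that happens to start with 2010.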
One way to get around this bug is to re-check every result that comes back from the OS wildcard match by manually comparing the wildcard against each (long) file name. Another way is to turn off 8.3 file names altogether via the registry. I have been burned by this on numerous occasions, including having important (non-matching) files deleted by a wildcard-based del command at the command prompt.
To summarize: be very careful, especially if you have many files in a directory, about making any critical production decision or taking any action based on an OS file/wildcard match without a secondary verification of the results.
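The re-check described above could be sketched roughly as follows. This is only a minimal sketch: the class and method names (StrictWildcard, ToRegex, IsMatch, EnumerateFilesStrict) are made up for illustration, and it assumes a deliberately strict reading of the wildcard ("*" is zero or more characters, "?" is exactly one character, compared case-insensitively against the long file name only), which is not identical to every quirk of Windows matching.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the .NET API.
public static class StrictWildcard
{
    // Translate a simple wildcard pattern into an anchored regex.
    // Regex.Escape turns "*" into "\*" and "?" into "\?", which are then
    // swapped for their regex equivalents.
    public static Regex ToRegex(string pattern)
    {
        string body = Regex.Escape(pattern)
                           .Replace(@"\*", ".*")
                           .Replace(@"\?", ".");
        return new Regex("^" + body + @"\z",
                         RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }

    // True only if the whole long file name matches the pattern.
    public static bool IsMatch(string fileName, string pattern)
    {
        return ToRegex(pattern).IsMatch(fileName);
    }

    // Let the OS do the coarse filtering, then verify each long name ourselves.
    public static IEnumerable<string> EnumerateFilesStrict(string directory, string pattern)
    {
        Regex regex = ToRegex(pattern);
        return Directory.EnumerateFiles(directory, pattern, SearchOption.TopDirectoryOnly)
                        .Where(path => regex.IsMatch(Path.GetFileName(path)));
    }
}
```

Calling StrictWildcard.EnumerateFilesStrict(path, "2010*.xml") in place of Directory.EnumerateFiles would then drop any file that was returned only because its short name matched.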
Here is an explanation of this bizarre behavior.
Another explanation from O'Reilly's site.
I can reproduce your problem with the following code (sorry, VB). It creates 55,000 zero-byte files named 2000_0001.xml through 2010_5000.xml. Then it looks for all of the files that start with 2010. On my machine (Windows 7 SP1 32-bit) it returns 5,174 files instead of just 5,000.
Option Explicit On
Option Strict On
Imports System.IO
Public Class Form1
    Private TempFolder As String = Path.Combine(My.Computer.FileSystem.SpecialDirectories.Desktop, "Temp")
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        CreateFiles()
        Dim Files = Directory.EnumerateFiles(TempFolder, "2010*.xml", SearchOption.TopDirectoryOnly).ToList()
        Using FS As New FileStream(Path.Combine(My.Computer.FileSystem.SpecialDirectories.Desktop, "Report.txt"), FileMode.Create, FileAccess.Write, FileShare.Read)
            Using SW As New StreamWriter(FS, System.Text.Encoding.ASCII)
                For Each F In Files
                    SW.WriteLine(F)
                Next
            End Using
        End Using
        DeleteFiles()
    End Sub
    Private Sub CreateFiles()
        If Not Directory.Exists(TempFolder) Then Directory.CreateDirectory(TempFolder)
        Dim Bytes() As Byte = {}
        Dim Name As String
        For Y = 2000 To 2010
            Trace.WriteLine(Y)
            For I = 1 To 5000
                Name = String.Format("{0}_{1}.xml", Y, I.ToString.PadLeft(4, "0"c))
                File.WriteAllBytes(Path.Combine(TempFolder, Name), Bytes)
            Next
        Next
    End Sub
    Private Sub DeleteFiles()
        Directory.Delete(TempFolder, True)
    End Sub
End Class
This does not fix the MS bug (which possibly uses Windows File Search underneath, which would be terrible for your results...), but as a workaround it gives you some extra leverage and control over the results:
var files = from file in
                Directory.EnumerateFiles(path, "*",
                    SearchOption.TopDirectoryOnly)
            let info = new FileInfo(file)
            // FileInfo.Extension includes the leading dot, e.g. ".xml"
            where info.Name.StartsWith("2010") &&
                  info.Extension == ".xml"
            select file;
I just tried your example and I can't see it doing anything wrong, so I guess there's more to your environment and/or the "non simplified" code that isn't covered here.
I've used this code:
Console.WriteLine("Starting...");
IEnumerable<string> files = Directory.EnumerateFiles("C:\\temp\\test\\2010", @"2010*.xml", SearchOption.TopDirectoryOnly).ToList();
foreach (string file in files)
{
Console.WriteLine("Found[{0}]", file);
}
Console.ReadLine();
In my folder structure I've created the following:
c:\temp\test\2010\2004_something.xml
c:\temp\test\2010\2010_abc.xml
c:\temp\test\2010\2010_def.xml
The output of the application is simply:
Starting...
Found[C:\temp\test\2010\2010_abc.xml]
Found[C:\temp\test\2010\2010_def.xml]
Can you provide some more feedback as to what is happening in your scenario, in the real app? Or can you reproduce the problem in a smaller app?
Having suffered the same problem and found this post, I thought I would post my solution:
IEnumerable<string> Files = Directory.EnumerateFiles(e.FileName, "*.xml").Where(File => File.EndsWith(".xml", StringComparison.InvariantCultureIgnoreCase));
This only tests the suffix, but it eliminates the matches on my backup files that end in .xml~.
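A variation on the same idea is to compare the extension itself rather than the end of the whole path. This is only a sketch: ExactExtension, HasExtension and EnumerateByExtension are hypothetical names, and it assumes the extension is passed with its leading dot (as Path.GetExtension returns it) and compared case-insensitively.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the .NET API.
public static class ExactExtension
{
    // True only if the file's extension is exactly the requested one,
    // so "backup.xml~" is rejected when ".xml" is requested.
    public static bool HasExtension(string path, string extension)
    {
        return string.Equals(Path.GetExtension(path), extension,
                             StringComparison.OrdinalIgnoreCase);
    }

    // Coarse filter through the OS wildcard, then an exact check on each result.
    public static IEnumerable<string> EnumerateByExtension(string directory, string extension)
    {
        return Directory.EnumerateFiles(directory, "*" + extension)
                        .Where(p => HasExtension(p, extension));
    }
}
```

This still checks only the extension, so any prefix condition such as "starts with 2010" would need its own test on the file name.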