How to sort by file name the same way Windows Explorer does?
This is the well-known problem of "ASCIIbetical" order versus "natural" order, as applied to PowerShell. To sort in PowerShell the same way Explorer does, you can use this wrapper over the StrCmpLogicalW API, which is what actually performs the natural sorting for Windows Explorer. This requires some plumbing, though.
However, this article suggests that there is a three-line implementation of the sort in Python. One would hope that the Get-ChildItem cmdlet, or at least the FileSystem provider, would have a built-in natural sorting option. Unfortunately, they do not.
So here is the question: what is the simplest implementation of this in PowerShell? By simple I mean the least amount of code to write, and preferably no third-party or external scripts or components. Ideally I want a short PowerShell function that does the sorting for me.
TL;DR
Get-ChildItem | Sort-Object { [regex]::Replace($_.Name, '\d+', { $args[0].Value.PadLeft(20) }) }
Here is some very short code (just the $ToNatural script block) that does the trick with a regular expression and a match evaluator that pads the numbers with spaces. We then sort the input by the padded strings as usual and get natural order as a result.
$ToNatural = { [regex]::Replace($_, '\d+', { $args[0].Value.PadLeft(20) }) }
'----- test 1 ASCIIbetical order'
Get-Content list.txt | Sort-Object
'----- test 2 input with padded numbers'
Get-Content list.txt | %{ . $ToNatural }
'----- test 3 Natural order: sorted with padded numbers'
Get-Content list.txt | Sort-Object $ToNatural
Output:
----- test 1 ASCIIbetical order
1.txt
10.txt
3.txt
a10b1.txt
a1b1.txt
a2b1.txt
a2b11.txt
a2b2.txt
b1.txt
b10.txt
b2.txt
----- test 2 input with padded numbers
                   1.txt
                  10.txt
                   3.txt
a                  10b                   1.txt
a                   1b                   1.txt
a                   2b                   1.txt
a                   2b                  11.txt
a                   2b                   2.txt
b                   1.txt
b                  10.txt
b                   2.txt
----- test 3 Natural order: sorted with padded numbers
1.txt
3.txt
10.txt
a1b1.txt
a2b1.txt
a2b2.txt
a2b11.txt
a10b1.txt
b1.txt
b2.txt
b10.txt
And finally we use this one-liner to sort files by names in natural order:
Get-ChildItem | Sort-Object { [regex]::Replace($_.Name, '\d+', { $args[0].Value.PadLeft(20) }) }
Output:
    Directory: C:\TEMP\_110325_063356
Mode    LastWriteTime       Length  Name
----    -------------       ------  ----
-a---   2011-03-25 06:34         8  1.txt
-a---   2011-03-25 06:34         8  3.txt
-a---   2011-03-25 06:34         8  10.txt
-a---   2011-03-25 06:34         8  a1b1.txt
-a---   2011-03-25 06:34         8  a2b1.txt
-a---   2011-03-25 06:34         8  a2b2.txt
-a---   2011-03-25 06:34         8  a2b11.txt
-a---   2011-03-25 06:34         8  a10b1.txt
-a---   2011-03-25 06:34         8  b1.txt
-a---   2011-03-25 06:34         8  b2.txt
-a---   2011-03-25 06:34         8  b10.txt
-a---   2011-03-25 04:54        99  list.txt
-a---   2011-03-25 06:05       346  sort-natural.ps1
-a---   2011-03-25 06:35        96  test.ps1
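Since the question asks for a short function, the one-liner above could be wrapped roughly as follows. This is only a sketch: the name Sort-NaturalByName is made up for illustration, and it assumes the piped objects are either strings or items with a Name property (such as the output of Get-ChildItem).

```powershell
# Hypothetical wrapper around the padding one-liner; the function name is illustrative.
function Sort-NaturalByName {
    # $input enumerates everything that was piped into the function.
    $input | Sort-Object {
        # Sort strings by their own value, other objects by their Name property.
        $key = if ($_ -is [string]) { $_ } else { $_.Name }
        # Left-pad every run of digits to 20 characters so shorter numbers sort first.
        [regex]::Replace($key, '\d+', { $args[0].Value.PadLeft(20) })
    }
}

# Usage: works on plain strings and on file system items alike.
'b10.txt', 'b2.txt', 'b1.txt' | Sort-NaturalByName
Get-ChildItem | Sort-NaturalByName
```

Note that the padding width caps the approach: runs of more than 20 digits are not padded and may sort incorrectly against each other.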
Allow me to copy and paste my answer from another question.
Powershell Sort-Object Name with numbers doesn't properly
Windows Explorer uses a legacy API from shlwapi.dll called StrCmpLogicalW, which is why you see different sorting results.
I didn't want to pad with zeros, so I wrote a script.
https://github.com/LarrysGIT/Powershell-Natural-sort
Since I am not a C# expert, pull requests are appreciated if it's not tidy.
The following PowerShell script uses the same API.
function Sort-Naturally
{
    PARAM(
        [System.Collections.ArrayList]$Array,
        [switch]$Descending
    )
    # Compile a small C# helper that exposes StrCmpLogicalW through an IComparer.
    Add-Type -TypeDefinition @'
using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.InteropServices;
namespace NaturalSort {
    public static class NaturalSort
    {
        [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
        public static extern int StrCmpLogicalW(string psz1, string psz2);
        public static System.Collections.ArrayList Sort(System.Collections.ArrayList foo)
        {
            foo.Sort(new NaturalStringComparer());
            return foo;
        }
    }
    public class NaturalStringComparer : IComparer
    {
        public int Compare(object x, object y)
        {
            return NaturalSort.StrCmpLogicalW(x.ToString(), y.ToString());
        }
    }
}
'@
    # Sort the list in place with the natural comparer.
    $Array.Sort((New-Object NaturalSort.NaturalStringComparer))
    if($Descending)
    {
        $Array.Reverse()
    }
    return $Array
}
Test results are below.
PS> # Natural sort
PS> . .\NaturalSort.ps1
PS> Sort-Naturally -Array @('2', '1', '11')
1
2
11
PS> # If regular sort is being used
PS> @('2', '1', '11') | Sort-Object
1
11
2
PS> # Not good
PS> $t = ls .\testfiles\*.txt
PS> $t | Sort-Object
1.txt
10.txt
2.txt
PS> # Good
PS> Sort-Naturally -Array $t
1.txt
2.txt
10.txt
I prefer @Larry Song's answer because it sorts exactly the way Windows Explorer does. I tried to simplify it a little to make it less intrusive.
Add-Type -TypeDefinition @"
using System.Runtime.InteropServices;
public static class NaturalSort
{
    [DllImport("Shlwapi.dll", CharSet = CharSet.Unicode)]
    private static extern int StrCmpLogicalW(string psz1, string psz2);
    public static string[] Sort(string[] array)
    {
        System.Array.Sort(array, (psz1, psz2) => StrCmpLogicalW(psz1, psz2));
        return array;
    }
}
"@
Then you can use it like:
$array = ('1.jpg', '10.jpg', '2.jpg')
[NaturalSort]::Sort($array)
which outputs:
1.jpg
2.jpg
10.jpg
Translation from Python to PowerShell works pretty well:
function sort-dir {
    param($dir)
    $toarray = {
        @($_.BaseName -split '(\d+)' | ?{$_} |
            % { if ([int]::TryParse($_,[ref]$null)) { [int]$_ } else { $_ } })
    }
    gci $dir | sort -Property $toarray
}
#try it
mkdir $env:TEMP\mytestsodir
1..10 + 100..105 | % { '' | Set-Content $env:TEMP\mytestsodir\$_.txt }
sort-dir $env:TEMP\mytestsodir
Remove-Item $env:TEMP\mytestsodir -Recurse
You can do it even better with the proxy function approach: add a -natur parameter to Sort-Object and you have a pretty elegant solution.
Update: At first I was quite surprised that PowerShell handles comparing arrays this way. After I tried creating test files with ("a0", "a100", "a2") + 1..10 + 100..105 | % { '' | Set-Content $env:TEMP\mytestsodir\$_.txt }, it turned out that it doesn't work. So I think there is no elegant solution like that, because PowerShell is static under the covers, whereas Python is dynamic.
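One way around the array-key limitation might be to compare chunk by chunk in an explicit comparison function instead of relying on Sort-Object to compare arrays. The sketch below is an assumption-laden illustration, not part of the original answer: Compare-Natural is a made-up name, it only treats ASCII digits 0-9 as numbers, it compares text chunks ordinally and case-insensitively, and it is not guaranteed to match Explorer's ordering in every case.

```powershell
# Hypothetical helper (name is illustrative): compares two strings chunk by chunk,
# treating runs of ASCII digits as numbers and everything else as text.
function Compare-Natural {
    param([string]$Left, [string]$Right)
    # Split into alternating text / digit chunks and drop the empty ones.
    $l = @($Left  -split '([0-9]+)' | Where-Object { $_ -ne '' })
    $r = @($Right -split '([0-9]+)' | Where-Object { $_ -ne '' })
    $count = [Math]::Min($l.Count, $r.Count)
    for ($i = 0; $i -lt $count; $i++) {
        $a = $l[$i]
        $b = $r[$i]
        if ($a -match '^[0-9]+$' -and $b -match '^[0-9]+$') {
            # Both chunks are numbers: compare numerically (bigint avoids overflow).
            $result = [bigint]::Parse($a).CompareTo([bigint]::Parse($b))
        } else {
            # Otherwise compare as text, ignoring case.
            $result = [string]::Compare($a, $b, [StringComparison]::OrdinalIgnoreCase)
        }
        if ($result -ne 0) { return $result }
    }
    # All shared chunks are equal: the string with fewer chunks sorts first.
    return $l.Count.CompareTo($r.Count)
}

# Usage: a script block can be cast to a Comparison delegate for List.Sort().
$names = [System.Collections.Generic.List[string]]@('a100', 'a2', 'a0', '10', '9')
$names.Sort([Comparison[string]]{ param($x, $y) Compare-Natural $x $y })
$names
```

The trade-off against the padding one-liner is more code in exchange for no fixed limit on the number of digits.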