Sorting integers and calculating the average of the top 20 in a batch file

Let's say that I have a batch file that reads arbitrary integers from a file. The file is structured such that each line contains one integer, like so:

24
17
43
103
...

I need to calculate the average of the top 20 numbers in the file. To do that, I need some sort of data structure that stores the top 20 numbers. However, as far as I know, there are no arrays in batch files. I may have to resort to temporary files or some other method that I am not aware of. So my ultimate goal is to determine the best approach for implementing some sort of sorting algorithm in the batch file and calculating the average of the top 20 integers.

There is one constraint I need to place on the problem. The file is fairly large (around 500 lines), so I would rather not use temporary files because of the large number of read/write operations involved (unless you can convince me otherwise, of course).


You can mimic arrays in batch. Take a look at Using Arrays in Batch Files.
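As a rough illustration of the idea, here is a minimal sketch that treats a family of similarly named variables as an array. The arr[index] naming and the sample values are assumptions made for this example and are not necessarily the convention used in the linked article.

```batch
@ECHO OFF
SETLOCAL EnableDelayedExpansion

REM Store three sample values in variables named arr[0], arr[1], arr[2].
REM The name "arr" and the values are hypothetical, chosen for illustration.
SET i=0
FOR %%V IN (24 17 43) DO (
  SET "arr[!i!]=%%V"
  SET /A i+=1
)

REM Read the elements back by building each variable name from the index.
SET /A last=i-1
FOR /L %%K IN (0,1,%last%) DO ECHO arr[%%K] = !arr[%%K]!
ENDLOCAL
```

Delayed expansion (the !name! syntax) is what allows the index to change inside the FOR body; with %name% the value would be fixed when the whole block is parsed.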


The following solution uses arrays as described in the article linked in @Stoney's answer. It also zero-pads the numbers so that they sort correctly, which is a nice idea from @jeb, although this solution doesn't use the sort command. Instead, the sorting is done automatically by the SET command, whose output is used to iterate over the 'array'.

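@ECHO OFF
REM Keep the __number.* variables local so that repeated runs start clean.
SETLOCAL
REM How many of the largest values to average.
SET top=5

SET cnt=0
REM Count the occurrences of each value in a pseudo-array.
FOR /F %%N IN (datafile) DO CALL :insert %%N
IF %cnt%==0 GOTO :EOF

REM The first 'threshold' items (in ascending order) are skipped.
IF %cnt% LSS %top% (SET threshold=0) ELSE SET /A threshold=cnt-top

SET s=0
SET i=0

REM SET lists the variables in alphabetical order, which is also numeric
REM order here because all keys have the same number of digits.
FOR /F "tokens=2 delims=.=" %%A IN ('SET __number.') DO CALL :calc %%A

REM Integer division; the 1000000 offset added in :insert is removed here.
SET /A res=s/(cnt-threshold)-1000000
ECHO Average is %res%
PAUSE
GOTO :EOF

:insert
REM Adding 1000000 gives every key the same width.
SET /A n=1000000+%1
SET /A __number.%n%+=1
SET /A cnt+=1
GOTO :EOF

:calc
SET /A i_prev=i
SET /A i+=__number.%1
IF %i% LEQ %threshold% GOTO :EOF
IF %i_prev% GEQ %threshold% (
  SET /A s+=%1*__number.%1
) ELSE (
  SET /A "s+=%1*(i_prev+__number.%1-threshold)"
)
GOTO :EOF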

Basically, the solution implements the following algorithm:

  1. Pick numbers from the file one by one:

    1.1. If the number is encountered for the first time, add it to the array with the count of 1.

    1.2. If the number is a duplicate of an already added number, increase the corresponding count value by 1.

    1.3. Increase the total count of numbers by 1.

  2. Calculate the threshold value, which is the total count minus the top quantity of numbers whose average is to be calculated.

  3. Iterate through the array items like this:

    3.1. Increase the index by the current number's count value.

    3.2. If the index exceeds the threshold:

    • If it has exceeded the threshold at the current iteration, increase the total sum by the product of the number and that part of its count that has exceeded the threshold.

    • If the threshold was exceeded earlier, increase the total sum by the product of the number and its count.

    3.3. If the index doesn't exceed the threshold, omit the item.

  4. Calculate the average as the total sum divided by the difference between total count and threshold.


You can use the sort command to sort the numbers, but there is a problem: sort performs a string sort, not a numeric one, so 2 appears to be greater than 10.
This can be solved by writing all the numbers to a temporary file, formatted to the same length.
So you get

024
017
043
103
...

Sort them with the /R (reverse) option so that the output begins with the largest number.

Then you can simply read the first 20 lines and sum them to compute the average.
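The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the input file name f1.txt, the temporary file name padded.tmp, the padding width of 6 digits, and the restriction to non-negative integers no larger than 999999 are all choices made for this example.

```batch
@ECHO OFF
SETLOCAL EnableDelayedExpansion
SET top=20

REM Pad every number to 6 digits so that string order equals numeric order.
REM Assumption: f1.txt holds non-negative integers no larger than 999999.
(FOR /F %%N IN (f1.txt) DO (
  SET "x=000000%%N"
  ECHO !x:~-6!
)) > padded.tmp

REM Sort in descending order and sum the first %top% lines.
REM Prefixing 1 and subtracting 1000000 keeps SET /A from reading the
REM leading zeros as octal notation.
SET sum=0
SET cnt=0
FOR /F %%N IN ('SORT /R padded.tmp') DO (
  IF !cnt! LSS %top% SET /A "sum+=1%%N-1000000, cnt+=1"
)
DEL padded.tmp

IF %cnt% GTR 0 (
  SET /A avg=sum/cnt
  ECHO Average is !avg!
)
ENDLOCAL
```

Note that SET /A performs integer division, so the average is truncated. For a file of about 500 lines, the single write and single read of the temporary file should be a modest cost.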


For small files:

@Echo oFF

for /f %%a in (f1.txt) do Call :Append %%a
call :sort %sort%
pause
goto :EOF

:Append
call set sort=%%sort%% %*
goto :eof

:Sort
Setlocal EnableDelayedExpansion
Set/A n=1,s=0,c=s,r=s
for %%: In (%*) do (
    Set /a c+=1
    Set "nm.!c!=%%:")
:LP.1
if %s% EQU %c% Set/A n+=1,s=0
    Set/A s+=1
    Call :SPL %n% %s%
If %n% LEQ %c% goto :LP.1
:LP.2
    Echo:!nm.%c%!
    Set/A c-=1
If %c% GTR 0 goto :LP.2 
Endlocal & goto :EOF
:SPL
 If !nm.%1! GTR !nm.%2! (
   Set "t=!nm.%2!"&Set "nm.%2=!nm.%1!"
   Set "nm.%1=!t!"
 ) 
goto :EOF

Or a variant:

 @echo off
 for /f %%# in (f1.txt) do (
    set x=##########%%#
    call set #%%x:~-0xa%%==)
 for /f "delims=#=" %%a in ('set ##') do echo(%%a
 pause 

There is also GNU sort.
