Extracting a list of numbers from a plain text file
I have a text file containing a series of numbers following a similar pattern:
<Lorepsum ipsum lores aus Lorep NUM="100" aus Lore>
<Lorepsum ipsum lores aus Lorpsum NUM="101" Lorepsum>
<Lorepsum ipsum lores aus Lorp77dsum NUM="102" ipsum lores aus>
<Lorepsum ipsum lores aus Lopsum NUM="103" lores aus>
Is it possible to write a Windows batch script to extract the numbers from the file and put them into a new file?
The output file should contain
100
101
102
103
Yes, but it's not very pretty. The obvious candidate for this would be regular expressions, which batch files only offer for matching (via findstr, and then only in a very limited form). If you used PowerShell, it would just be
Get-Content foo.txt | ForEach-Object {
    [Regex]::Match($_, 'NUM="(\d+)"').Groups[1].Value
}
But sadly, in a batch file this is a little more complicated.
You can, however, use for /f to parse the file and then examine the tokens. There is no easy way to walk through a line token by token, though, and tokenizing stops after 31 tokens (if I remember correctly). In any case, the following does work:
@echo off
for /f "delims=" %%f in (foo.txt) do call :parse "%%f"
goto :eof
:parse
setlocal enabledelayedexpansion
set i=0
:parseImpl
set /a i+=1
(
    for /f "tokens=%i% delims= " %%l in (%1) do (
        rem Jump out if no more tokens are there
        if "%%l"=="" goto :eof
        rem Remember the token
        set T=%%l
        if "!T:~0,4!"=="NUM=" (
            set N=!T:~4!
            rem add redirection here if needed
            echo !N:"=!
        )
    )
) || goto :eof
rem This above will cause the loop to stop once no more tokens are there.
rem The for loop will return a non-zero exit code then.
goto parseImpl
It's not too pretty, but fairly straightforward. Since each line can be used only once while reading the file, I delegate the work to a subroutine which goes over the line as often as necessary. For this, the variable i keeps track of the current token number. Another for loop then extracts the requested token from the string. If the token starts with NUM=, it is assumed to be the number you want; it is cleaned up and printed.
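If you want them written directly to a file, change the respective line to the following (using >> so that each number is appended instead of overwriting the file on every iteration):
>>out.txt echo !N:"=!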
The code can also be found in my SVN.
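The two string operations applied to each token (the substring test and the quote removal) can be tried in isolation. The following is only a minimal sketch: the value assigned to T is a hypothetical sample token, not something read from foo.txt, and the block assumes delayed expansion is enabled just as in the script above.

```batch
@echo off
setlocal enabledelayedexpansion
rem Hypothetical sample token, shaped like one space-separated piece of a line
set T=NUM="100"
rem Compare the first four characters of the token against NUM=
if "!T:~0,4!"=="NUM=" (
    rem Skip the first four characters, leaving the quoted number
    set N=!T:~4!
    rem Replace every double quote with nothing and print what remains
    echo !N:"=!
)
```

Run on its own, this should leave just the digits of the sample token, which is the same cleanup the full script performs for every token that starts with NUM=.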
This should get you started:
@echo off
set cnt=0
set max=9
:enter_loop
if %cnt% GTR %max% goto end_loop
echo NUM="%cnt%" >> output.txt
set /a cnt="cnt+1"
goto enter_loop
:end_loop
pause