vbscript regex replace headache

I have a text file that I'm trying to process with VBScript. It looks like this:

111 ,   ,       ,Yes    ,Yes
222 ,   ,       ,Yes    ,Yes
333 ,   ,       ,Yes    ,Yes
444 ,   ,       ,Yes    ,Yes
555 ,   ,       ,Yes    ,Yes
666 ,   ,       ,Yes    ,Yes

What I want is to remove the carriage returns and tabs, commas and 'yes' (or the regex "\t,\t,\t\t,Yes\t,Yes") to give this output:

('111','222','333','444','555','666')

I'm using this code:

Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile(filePath, ForReading)

strText = objFile.ReadAll
objFile.Close
'chr(010) = line feed chr(013) = carriage return
strNewText = Replace(strText, "\t,\t,\t\t,Yes\t,Yes" & chr(013) & chr(010), "','") 

Set objFile = objFSO.OpenTextFile(filePath, ForWriting)
objFile.WriteLine strNewText
objFile.Close

This isn't giving the desired output, however. If I take the "\t,\t,\t\t,Yes\t,Yes" & part out of the Replace call, it removes the carriage returns, which is fine, but I also need the commas, tabs and 'Yes' removed, and a (' at the start and a ') at the end. I'm guessing it's the way I've used the regex, but I haven't used much VBScript, so I'm not sure.


Instead of hunting down what you don't want, it's easier and less error-prone to concentrate on what you do want:

  Dim sExp   : sExp   = "('111','222','333','444','555','666')"
  Dim aLines : aLines = Array( _
      "111 ,   ,       ,Yes    ,Yes" _
    , "222 ,   ,       ,Yes    ,Yes" _
    , "333 ,   ,       ,Yes    ,Yes" _
    , "444 ,   ,       ,Yes    ,Yes" _
    , "555 ,   ,       ,Yes    ,Yes" _
    , "666 ,   ,       ,Yes    ,Yes" _
  )     
  Dim sAll : sAll = Join( aLines, vbCrLf )
  WScript.Echo sAll
  Dim reCut : Set reCut = New RegExp
  reCut.Global    = True
  reCut.MultiLine = True
  reCut.Pattern   = "^\d+"
  Dim oMTS : Set oMTS = reCut.Execute( sAll )
  If 0 = oMTS.Count Then
     WScript.Echo "Bingo A!"
  Else
     ReDim aNums( oMTS.Count - 1 )
     Dim nI
     For nI = 0 To UBound( aNums )
         aNums( nI ) = oMTS( nI ).Value
     Next
     Dim sRes : sRes = "('" & Join( aNums, "','" ) & "')"    
     If sRes = sExp Then
        WScript.Echo "QED:", sRes
     Else   
        WScript.Echo "Bingo B!"
     End If
  End If

output:

111 ,   ,       ,Yes    ,Yes
222 ,   ,       ,Yes    ,Yes
333 ,   ,       ,Yes    ,Yes
444 ,   ,       ,Yes    ,Yes
555 ,   ,       ,Yes    ,Yes
666 ,   ,       ,Yes    ,Yes
QED: ('111','222','333','444','555','666')

Annotations:

I use an array to build my string to process (sAll). Your string (strText) comes from a file. So:

  Dim sAll : sAll = Join( aLines, vbCrLf )
  ==>
  Dim sAll : sAll = objFile.ReadAll

The string is parsed by a RegExp (reCut). Its pattern ^\d+ looks for a sequence (+) of digits (\d) at the start (^) of a line (not of the whole string; that's why the MultiLine property is set to True). The result of .Execute is a Match collection (oMTS), containing Matches.

To make the concatenation of the expected result easier, the values of the Matches are copied to an array (aNums).

The "('" & Join( aNums, "','" ) & "')" expression combines the array's elements using the separator ','. To complete the result, we just need a suitable head (' and tail ').
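
For the file-based case from the question, the pieces above could be combined roughly as follows. This is only a sketch: BuildInList and RewriteFile are hypothetical helper names that do not appear in the original code, and the sketch assumes the file is non-empty and that overwriting it in place is what is wanted.

```vbscript
Const ForReading = 1
Const ForWriting = 2

' Hypothetical helper: collects the leading digits of every line and
' returns them as ('n1','n2',...), or "" when no line starts with digits.
Function BuildInList(sText)
    Dim reCut : Set reCut = New RegExp
    reCut.Global    = True
    reCut.MultiLine = True      ' ^ matches at the start of each line
    reCut.Pattern   = "^\d+"
    Dim oMTS : Set oMTS = reCut.Execute(sText)
    BuildInList = ""
    If oMTS.Count > 0 Then
        Dim aNums() : ReDim aNums(oMTS.Count - 1)
        Dim nI
        For nI = 0 To UBound(aNums)
            aNums(nI) = oMTS(nI).Value
        Next
        BuildInList = "('" & Join(aNums, "','") & "')"
    End If
End Function

' Hypothetical wrapper: reads the file, then overwrites it with the result.
' Assumption: the file is not empty (ReadAll fails on an empty file).
Sub RewriteFile(sPath)
    Dim objFSO  : Set objFSO = CreateObject("Scripting.FileSystemObject")
    Dim objFile : Set objFile = objFSO.OpenTextFile(sPath, ForReading)
    Dim sAll    : sAll = objFile.ReadAll
    objFile.Close
    Dim sRes : sRes = BuildInList(sAll)
    If sRes <> "" Then
        Set objFile = objFSO.OpenTextFile(sPath, ForWriting)
        objFile.WriteLine sRes
        objFile.Close
    End If
End Sub
```

With these in place, a call such as RewriteFile filePath (using the filePath variable from the question) would replace the file's contents with the single result line. Keeping the string handling in its own function makes it easy to check against a literal string before touching any file.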


Try this

(.*?)(?:\s*,){3}Yes\s*,Yes\r?

You need to take care of the line breaks; with Regexr, \r was fine. I put the line break into the regex and made it optional with the trailing ?. Otherwise the last row would not be replaced if it does not end with a line break.

and replace it with

'$1',

Here you will get an additional comma at the end. I am not sure at the moment how to handle this.

$1 is the content of the first capturing group, in your case the part before the first comma should be in it.
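
One possible way to apply this pattern in VBScript and deal with the extra comma is sketched below. It is an assumption about how the pieces could fit together, not something taken from the answer itself: the variable names (reRow, sRes) are made up for illustration, and the sample input is built in memory instead of being read from the file.

```vbscript
' Sample input as in the question; in the real script this would come
' from objFile.ReadAll.
Dim sAll : sAll = Join(Array( _
      "111 ,   ,       ,Yes    ,Yes" _
    , "222 ,   ,       ,Yes    ,Yes" _
    , "333 ,   ,       ,Yes    ,Yes" _
), vbCrLf)

Dim reRow : Set reRow = New RegExp
reRow.Global  = True   ' replace every row, not just the first
reRow.Pattern = "(.*?)(?:\s*,){3}Yes\s*,Yes\r?"

' Each row becomes 'nnn', ($1 is the text before the first comma).
Dim sRes : sRes = reRow.Replace(sAll, "'$1',")

' The pattern only consumes \r, so drop any remaining line breaks.
sRes = Replace(sRes, vbLf, "")
sRes = Replace(sRes, vbCr, "")

' Remove the additional trailing comma, then add the parentheses.
If Right(sRes, 1) = "," Then sRes = Left(sRes, Len(sRes) - 1)
sRes = "(" & sRes & ")"

WScript.Echo sRes
```

Stripping the last character with Left/Len is the simplest fix for the trailing comma; the first answer's Join-based approach avoids the issue altogether because the separator is only placed between elements.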

See it here on Regexr
