Parsing strings with Linux Bash - Delphi code to Bash code

Could someone tell me how I'd write the following code in a Linux Bash script?

procedure ParseLine(Line: String; var url, lang, Identifier: String);
var
  p1,p2: Integer;
Begin
  p1 := Pos(Char(VK_TAB),Line);
  p2 := PosEx(Char(VK_TAB),Line,p1+1);
  url := Copy(Line,1,p1-1);
  lang := Copy(Line,p1+1,p2 - (p1+1));
  Identifier := Copy(Line,p2+1,Length(Line));
  p1 := Pos('(',lang);
  lang := Copy(lang,1,p1-1);
End;

The line I need to parse looks something like this

XXXXX \tab XXXX(XXX) \tab XXXX

Thanks.


Here's a Bash script that works for your sample input. I didn't find a way to specify the tab character alone, so I used the [:blank:] class, which also includes space. If you really need to match only tab and not space as the delimiter, you can replace every [:blank:] occurrence with an actual tab character typed from your keyboard. I also didn't save the matched parts to global variables (as a Bash function normally would); I simply echo them.

#!/bin/bash

function split {
  # Prepare small parts of the future regex. Makes writing the actual regex
  # easier and provides a place to explain the regex
  blank="[[:blank:]]" # one blank character (tab or space). Uses the [:blank:] character class inside a bracket expression
  optional_blanks="${blank}*" # zero or more blank characters.
  mandatory_blanks="${blank}+" # one or more blank characters.
  non_blank="[^()[:blank:]]" # one character that is not a tab, space or parenthesis: this is the stuff we intend to capture.
  capture="(${non_blank}+)" # one or more non-blank, non-parenthesis characters in capturing parentheses.

  # Concatenate our regex building blocks into a big regex. Notice how I'm using ${optional_blanks} for maximum flexibility,
  # for example around the "(" and ")" tests.
  regex="${optional_blanks}${capture}${mandatory_blanks}${capture}${optional_blanks}\(${optional_blanks}${capture}${optional_blanks}\)${optional_blanks}${capture}${optional_blanks}"


  # The regex is applied using the =~ binary operator.
  if [[ $1 =~ $regex ]];
  then
    # We got a match, our capturing groups are saved into bash
    # variables ${BASH_REMATCH[n]}. We'll echo those, but in
    # real use the function would probably copy those values to
    # some global names to be easily used from outside the function.
    echo "${BASH_REMATCH[1]}"
    echo "${BASH_REMATCH[2]}"
    echo "${BASH_REMATCH[3]}"
    echo "${BASH_REMATCH[4]}"
  else
    # Oops, input doesn't match.
    echo not matched
  fi
}

# call our function with static input for testing
# purposes.
echo "Test 1 - tab separated fields without extra space"
split "1234     56(78)  90"

# Since we're using [:blank:] and that includes both space and tab
# this also works
echo "Test 2 - space separated fields with lots of meaningless space"
split "1234 56 (    78 )      90       "
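
For a tab-only split that stays closer to the Delphi procedure, here is a minimal sketch. It assumes bash, where the ANSI-C quoting form $'\t' expands to a single tab character, and it uses parameter expansion instead of a regex. The function name parse_line and the sample line are made up for illustration; the variable names url, lang and identifier come from the question.

```shell
#!/bin/bash

# parse_line LINE
# Splits LINE at its first two tab characters and stores the pieces in the
# global variables url, lang and identifier (names taken from the question).
# Returns 1 without setting them if LINE has fewer than two tabs.
parse_line() {
  local line=$1
  local tab=$'\t'   # ANSI-C quoting: a single literal tab character
  local rest

  # Require at least two tabs, i.e. three fields.
  [[ $line == *"$tab"*"$tab"* ]] || return 1

  url=${line%%"$tab"*}        # everything before the first tab
  rest=${line#*"$tab"}        # everything after the first tab
  lang=${rest%%"$tab"*}       # from there up to the second tab
  identifier=${rest#*"$tab"}  # everything after the second tab
  lang=${lang%%"("*}          # drop the first "(" and whatever follows it
}

# Hypothetical sample line in the "XXXXX <tab> XXXX(XXX) <tab> XXXX" shape.
if parse_line $'http://example.com/page\ten(US)\tid123'; then
  printf 'url=%s\nlang=%s\nidentifier=%s\n' "$url" "$lang" "$identifier"
else
  echo "not matched"
fi
```

Two choices in this sketch are worth checking against your data: spaces are treated as ordinary field content, and lang is left unchanged when it contains no "(".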