Powershell regex question involving balancing parenthesis
How can I convert a table column definition into an array of columns using regex's without taking formatting into account? I'm confused how to split on "," as they can also appear in the datatype definition part (between matching parenthesis).
Sample input:
CREATE table test ( DISTRICT VARCHAR(3开发者_StackOverflow) CHARACTER SET LATIN NOT CASESPECIFIC, CUSTOMER_ACCOUNT DECIMAL(8,0), CUSTOMER_SUB_ACCOUNT DECIMAL(3,0), SERVICE_SEQ_NUM DECIMAL(7,0), EFFECTIVE_DATE TIMESTAMP(0), SUBSCRIBER_SEQ_NUM DECIMAL(7,0) )
Thanks!
Frederik
Traditionally regular expressions are not capable of testing for/matching balanced parentheses. But several libraries have made extensions that allow some recursion. E.g. pcre with its ?R "quantifier". And Microsoft has added "balancing groups". You can find an example of that feature for matching (...) at http://oreilly.com/catalog/regex2/chapter/ch09.pdf
I think you could use negative look-behind for this scenario since the commas that are not of interest to you seem to be preceeded by a number:
$str = @'
CREATE table test ( DISTRICT VARCHAR(3) CHARACTER SET LATIN NOT CASESPECIFIC,
CUSTOMER_ACCOUNT DECIMAL(8,0),
CUSTOMER_SUB_ACCOUNT DECIMAL(3,0),
SERVICE_SEQ_NUM DECIMAL(7,0),
EFFECTIVE_DATE TIMESTAMP(0),
SUBSCRIBER_SEQ_NUM DECIMAL(7,0) )
'@
$str -split '(?<!\d),'
CREATE table test ( DISTRICT VARCHAR(3) CHARACTER SET LATIN NOT CASESPECIFIC
CUSTOMER_ACCOUNT DECIMAL(8,0)
CUSTOMER_SUB_ACCOUNT DECIMAL(3,0)
SERVICE_SEQ_NUM DECIMAL(7,0)
EFFECTIVE_DATE TIMESTAMP(0)
SUBSCRIBER_SEQ_NUM DECIMAL(7,0) )
Note this is using PowerShell 2.0's -split
operator.
For paren matching, you might try something like this:
$re = [regex]@'
(?x)
\(
(?>
[^()]+
|
\( (?<DEPTH>)
|
\) (?<-DEPTH>)
)*
(?(DEPTH)(?!))
\)
'@
if ($str -match $re) {
$matches[0]
}
Outputs:
(
DISTRICT VARCHAR(3) CHARACTER SET LATIN NOT CASESPECIFIC,
CUSTOMER_ACCOUNT DECIMAL(8,0),
CUSTOMER_SUB_ACCOUNT DECIMAL(3,0),
SERVICE_SEQ_NUM DECIMAL(7,0),
EFFECTIVE_DATE TIMESTAMP(0),
SUBSCRIBER_SEQ_NUM DECIMAL(7,0)
)
See this blog post for more help on paren matching.
精彩评论