ANTLR 3 parsing problem
I have written an ANTLR 3 grammar for parsing TaskJuggler III bookings files (see below).
On the line
project prj "Sample project" "1.0" 2010-10-24-00:00-+0200 - 2010-11-23-09:00-+0100 {
I'm getting the following errors:
line 1:42 mismatched character '-' expecting set '0'..'9'
line 1:48 mismatched character ':' expecting set '0'..'9'
line 1:67 mismatched character '-' expecting set '0'..'9'
line 1:73 mismatched character ':' expecting set '0'..'9'
Thereafter, an OutOfMemory error occurs.
Here is the relevant part of the grammar:
bookingsFile returns [DefaultBookingsFile bookingsFile]
: { bookingsFile = new DefaultBookingsFile(); } projectHeader projectIds (resourceDeclaration)* (task)* ( suppStmt=supplementStatement
{bookingsFile.addSupplementStatement( $suppStmt.suppStmt ); }
)* ;
projectHeader
: 'project prj "' ANY_TEXT '" "1.0"' TJ3_BOOKING_TIME '-'
TJ3_BOOKING_TIME '{'
'}' ;
TJ3_BOOKING_TIME
: DIGIT DIGIT DIGIT DIGIT '-' DIGIT DIGIT '-' DIGIT DIGIT
'-' DIGIT DIGIT ':' DIGIT DIGIT '-' TIMEZONE
;
TIMEZONE
: ('+'|'-')DIGIT DIGIT DIGIT DIGIT ;
Question: What am I doing wrong?
Thanks in advance
Dmitri
P. S.: Full version of the grammar is available at
http://bazaar.launchpad.net/~dp-sw-dev/pcc/prototype1/files/head%3A/src/main/java/at/silverstrike/pcc/impl/tj3bookingsparser/grammar/
and below
grammar Bookings;
options {
backtrack=true;
memoize=true;
}
@header {
package at.silverstrike.pcc.impl.tj3bookingsparser.grammar;
}
@lexer::header {
package at.silverstrike.pcc.impl.tj3bookingsparser.grammar;
}
bookingsFile returns [DefaultBookingsFile bookingsFile]
:
{
bookingsFile = new DefaultBookingsFile();
}
projectHeader
projectIds
(resourceDeclaration)*
(task)*
(
suppStmt=supplementStatement {bookingsFile.addSupplementStatement( $suppStmt.suppStmt ); }
)*
;
projectHeader
:
'project prj "' ANY_TEXT '" "1.0"' TJ3_BOOKING_TIME '-' TJ3_BOOKING_TIME '{'
'}'
;
projectIds
:
'projectids prj'
;
resourceDeclaration
:
'resource' TJ3_IDENTIFIER TJ3_STRING
;
task
:
'task' TJ3_IDENTIFIER TJ3_STRING '{' ANY_TEXT '}'
;
supplementStatement returns [DefaultSupplementStatement suppStmt]
:
{
suppStmt = new DefaultSupplementStatement();
}
'supplement task' taskId=TJ3_DOTTED_TASK_IDENTIFIER { suppStmt.setTaskId($taskId.text); }
'{'
(
bStmt=bookingStatement {suppStmt.addBookingStatement( $bStmt.stmt ); }
)*
ANY_TEXT
'}'
;
bookingStatement returns [DefaultBookingStatement stmt]
:
{
stmt = new DefaultBookingStatement();
}
TJ3_IDENTIFIER ':'
'booking'
resource=TJ3_IDENTIFIER { stmt.setResource($resource.text); }
ib1=indBooking { stmt.addIndBooking($ib1.indBooking); }
(
','
ib2=indBooking { stmt.addIndBooking($ib2.indBooking); }
)*
overTimeEtc
;
indBooking returns [DefaultIndBooking indBooking]
:
startTime=TJ3_BOOKING_START_TIME '+' duration=TJ3_DURATION 'h'
{
$indBooking = new DefaultIndBooking($startTime.text, $duration.text);
}
;
overTimeEtc
:
'{' ANY_TEXT '}'
;
TJ3_IDENTIFIER
: ('a'..'Z'|'A'..'Z') ('a'..'Z'|'A'..'Z'|'0'..'9'|'_')*
;
DIGIT
: '0'..'9'
;
TJ3_STRING
: '"' ('a'..'z'|'A'..'Z'|'0'..'9'|' '|'_')* '"'
;
ANY_TEXT
: ('a'..'z'|'A'..'Z'|'0'..'9'|' '|'_')*
;
TJ3_DOTTED_TASK_IDENTIFIER
: TJ3_IDENTIFIER ('.' TJ3_IDENTIFIER)*
;
TJ3_BOOKING_TIME
: DIGIT DIGIT DIGIT DIGIT '-' DIGIT DIGIT '-' DIGIT DIGIT '-' DIGIT DIGIT ':' DIGIT DIGIT '-' TIMEZONE
;
TJ3_BOOKING_START_TIME
: DIGIT DIGIT DIGIT DIGIT '-' DIGIT DIGIT '-' DIGIT DIGIT ('-' DIGIT DIGIT ':' DIGIT DIGIT)? (TIMEZONE)?;
TIMEZONE
: ('+'|'-')DIGIT DIGIT DIGIT DIGIT
;
TJ3_DURATION
: FP_VALUE ('min' | 'h' | 'd' | 'w' | 'm' | 'y')
;
FP_VALUE
: DIGIT+
| DIGIT* '.' DIGIT*
;
Your rule:
TJ3_BOOKING_START_TIME
: DIGIT DIGIT DIGIT DIGIT '-' DIGIT DIGIT '-' DIGIT DIGIT ('-' DIGIT DIGIT ':' DIGIT DIGIT)? (('+'|'-')DIGIT DIGIT DIGIT DIGIT)?
;
does not match this part of your input:
" ... 2010-10-25-00:00-+0200 ... "
// ^^
The -+ part is not accounted for in your rule.
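As a quick cross-check outside ANTLR, the shape of the timestamp can be mirrored with a plain java.util.regex pattern. This is only a sketch: the class name Tj3TimestampCheck and the method isBookingTime are hypothetical, and accepting '-' as well as '+' for the offset sign is an assumption (the sample input only shows '-+').

```java
import java.util.regex.Pattern;

public class Tj3TimestampCheck {
    // Hypothetical helper, not part of the grammar.
    // Shape: yyyy-MM-dd-HH:mm, then a literal '-', then a signed 4-digit offset.
    // Allowing a '-' sign for the offset is an assumption.
    private static final Pattern TIME =
        Pattern.compile("\\d{4}-\\d{2}-\\d{2}-\\d{2}:\\d{2}-[+-]\\d{4}");

    static boolean isBookingTime(String s) {
        // matches() requires the whole string to fit the pattern
        return TIME.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Timestamp exactly as it appears in the project header
        System.out.println(isBookingTime("2010-10-24-00:00-+0200"));
        // Same value without the '-' separator before the offset
        System.out.println(isBookingTime("2010-10-24-00:00+0200"));
    }
}
```

The first call should report a match and the second should not, which is the same distinction the lexer rule has to make: the '-' before the signed offset is part of the token.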
EDIT
Try something like this:
grammar Bookings;
bookingsFile
: Project Prj String String Time Hyphen Time OpenParen CloseParen EOF
;
Project
: 'project'
;
Prj
: 'prj'
;
OpenParen
: '{'
;
CloseParen
: '}'
;
Hyphen
: '-'
;
String
: '"' ~'"'* '"'
;
Time
: D D D D '-' D D '-' D D '-' D D ':' D D '-+' D D D D
;
fragment
D
: '0'..'9'
;
Space
: (' ' | '\t' | '\r'? '\n'){$channel=HIDDEN;}
;
Interpreting the source:
project prj "Sample project" "1.0" 2010-10-25-00:00-+0200-2010-11-24-09:00-+0100 {
}
yields the expected parse tree (the image is not reproduced here).
HTH