Finding words in treetop - some matches not being made
I've run into a bit of a strange situation.
I'm trying to parse measurements using treetop.
For instance: 6' of 1/2" Copper Pipe. Of course, the units can also be written as feet, Feet, inch, inches, Inch, Inches, etc.
So I have a rule:
    rule measurement
      (
        '\'' / 'Foot' / 'foot' / 'Feet' / 'feet' /
        '"' / 'Inches' / 'inches' / 'Inch' / 'inch' /
        'cm' / 'cms' / 'Centimeters' / 'centimeters' / 'Centimeter' / 'centimeter' /
        'm' / 'ms' / 'Meters' / 'meters' / 'Meter' / 'meter' /
        'lb' / 'lbs' / 'Pounds' / 'pounds' / 'Pound' / 'pound'
      ) (s? ')' / s) {
        def value
          [:measurement, text_value]
        end
      }
    end

    rule space
      [\s]+
    end
When I enter '6 inches', '6 pounds', '6 Meters', everything works great, and I get my number and measurement returned.
When I enter '6 meters', meters isn't parsed properly.
Most of the measurements work fine; only 'meters' and 'pound' are being missed among the measurements I've provided here (but I'm sure I'll be adding more measurements in the future).
Any ideas as to why I would be experiencing this?
As per request, a more 'pared down' version of the full grammar
    grammar FullMeasurements
      rule full_product
        measures s? alternate_measure product_name {
          def value
            [:full_product, text_value]
          end
        }
      end

      rule measures
        single_measure / dual_measure / quantity {
          def measures
            [:measures, text_value] unless text_value.blank?
          end
        }
      end

      rule dual_measure
        quantity s? single_measure {
          def value
            [:dual_measure, text_value] unless text_value.blank?
          end
        }
      end

      rule alternate_measure
        '(' s? single_measure {
          def value
            [:alternate_measure, text_value] unless text_value.blank?
          end
        }
      end

      rule single_measure
        (range_number / number) s? measurement optional_secondary_measurements {
          def value
            [:single_measure, text_value]
          end
        }
      end

      rule optional_secondary_measurements
        measurement? {
          def value
            [:optional_secondary_measurements, text_value]
          end
        }
      end

      rule quantity
        (range_number / number) s? divisor? {
          def value
            [:quantity, text_value]
          end
        }
      end

      rule measurement
        (
          '\'' / 'Foot' / 'foot' / 'Feet' / 'feet' /
          '"' / 'Inches' / 'inches' / 'Inch' / 'inch' /
          'cm' / 'cms' / 'Centimeters' / 'centimeters' / 'Centimeter' / 'centimeter' /
          'm' / 'ms' / 'Meters' / 'meters' / 'Meter' / 'meter' /
          'lb' / 'lbs' / 'Pounds' / 'pounds' / 'Pound' / 'pound'
        ) (s? ')' / s) {
          def value
            [:measurement, text_value]
          end
        }
      end

      rule divisor
        "x"
      end

      rule product_name
        !measures words+ {
          def value
            [:product_name, text_value]
          end
        }
      end

      rule number
        frac_number / regular_number optional_frac {
          def value
            [:number, text_value]
          end
        }
      end

      rule optional_frac
        frac_number? {
          def value
            [:optional_frac, text_value]
          end
        }
      end

      rule frac_number
        (s? regular_number '/' regular_number) {
          def value
            [:frac_number, text_value]
          end
        }
      end

      rule words
        [0-9a-zA-Z\-()&.%'*\s]+ {
          def value
            text_value
          end
        }
      end

      rule regular_number
        [0-9\.]+ {
          def value
            text_value
          end
        }
      end

      rule space
        [\s]+
      end
    end
Since PEGs are greedy and / is an ordered alternation, your measurement rule matches the literal text "meter" and then your grammar fails because it cannot find a following rule that matches the leftover "s". Unlike regular expressions, PEGs will not backtrack through previous successful matches when a later one fails.
Switch the order of items in your rule to have the plurals first, and you should be good to go.
Phrogz was on the right track, but it's not "meter" that gets matched first. It's 'm', which appears earlier in the list than 'meters' and 'meter', and once 'm' has matched, nothing is left to match the remaining "eter" or "eters".
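To make the ordered-choice behaviour concrete, here is a small plain-Ruby sketch. It is not Treetop and not the asker's grammar: `match_unit`, `ORIGINAL_ORDER` and `LONGEST_FIRST` are hypothetical names made up for illustration, and the check on the remaining input is only a rough stand-in for the `(s? ')' / s)` part of the rule. It commits to the first literal that matches, the way a PEG choice does, and never goes back to try a later one.

```ruby
# Stand-in for PEG ordered choice (an illustration, not Treetop itself).
# Tries each literal in order, commits to the first one that matches,
# then requires the rest of the input to be empty or whitespace.
def match_unit(input, alternatives)
  chosen = alternatives.find { |lit| input.start_with?(lit) }
  return nil unless chosen # no alternative matched at all

  rest = input[chosen.length..-1]
  # Committed to `chosen`: if the remainder is unacceptable, fail
  # without going back to try a later alternative.
  rest.match?(/\A\s*\z/) ? chosen : nil
end

# Metric length units in the same order as the rule in the question.
ORIGINAL_ORDER = ['cm', 'cms', 'm', 'ms', 'Meters', 'meters', 'Meter', 'meter']

# Same literals, longest first, so no literal is shadowed by its own prefix.
LONGEST_FIRST = ORIGINAL_ORDER.sort_by { |lit| -lit.length }

p match_unit('meters', ORIGINAL_ORDER) # 'm' is chosen, "eters" is left over
p match_unit('meters', LONGEST_FIRST)  # 'meters' is tried before 'm'
p match_unit('Meters', ORIGINAL_ORDER) # no earlier literal is a prefix of "Meters"
```

If the real grammar behaves like this sketch, the equivalent change is to list every literal before any of its prefixes (for example 'meters' / 'meter' / 'ms' / 'm'), which goes a step further than only putting plurals first.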