Simplest treetop grammar is returning a parse error, just learning
I'm trying to learn Treetop and took most of the code from https://github.com/survival/lordbishop, which parses names, intending to build from that.
My structure is a bit different because I'm building it in Rails rather than as a command-line Ruby script.
When I run a very simple parse, I get a parse error on a space (which should be one of the simpler things in my grammar). What am I doing wrong?
My code is fairly simple. In my model:
require 'treetop'
require 'polyglot'
require 'grammars/name'
class Name
  def self.parse(data)
    parser = FullNameParser.new
    tree = parser.parse(data)
    if tree.nil?
      return "Parse error at offset: #{parser.index}"
    end
    result_hash = {}
    tree.value.each do |node|
      result_hash[node[0]] = node[1].strip if node.is_a?(Array) && !node[1].blank?
    end
    return result_hash
  end
end
I've stripped most of the grammar down to just getting words and spaces:
grammar FullName
  rule word
    [^\s]+ {
      def value
        text_value
      end
    }
  end
  rule s
    [\s]+ {
      def value
        ""
      end
    }
  end
end
I'm trying to parse 'john smith'. I was hoping to just get back words and spaces and build my logic from there, but I'm stuck at even this simple level. Any suggestions?
AFAIK, Treetop starts parsing with the first rule in your grammar (the rule word, in your case!). Now, if your input is 'John Smith' (i.e. word, s, word), it stops parsing after matching the rule word for the first time, and produces an error when it encounters the first s, since word does not match s.
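To make the failure mode concrete, here is a minimal sketch that imitates the two situations in plain Ruby with StringScanner from the standard library. It is not Treetop and not how Treetop is implemented; the method names parse_word_only and parse_name are hypothetical, and the sketch assumes the parser insists on consuming the whole input, which is what the error on the space suggests.

```ruby
require 'strscan'

WORD            = /\S+/     # same idea as the grammar's `word` rule
SPACE_THEN_WORD = /\s+\S+/  # one `s word` pair

# Imitates a root rule that is just `word`: match a single word,
# then require that nothing is left over. Returns the text or nil.
def parse_word_only(input)
  scanner = StringScanner.new(input)
  matched = scanner.scan(WORD)
  return nil if matched.nil? || !scanner.eos?
  matched
end

# Imitates a root rule shaped like `word (s word)*`.
def parse_name(input)
  scanner = StringScanner.new(input)
  return nil unless scanner.scan(WORD)
  while scanner.scan(SPACE_THEN_WORD)
    # each iteration consumed one more "space + word" pair
  end
  scanner.eos? ? input : nil
end

p parse_word_only('john smith')  # => nil (stops at the space)
p parse_name('john smith')       # => "john smith"
```

The first method fails on 'john smith' for the same reason described above: one word matches, but the space and the second word are left unconsumed.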
You need to add a rule to the top of your grammar that describes an entire name: a word, followed by a space, followed by a word, and so on.
grammar FullName
  rule name
    word (s word)* {
      def value
        text_value
      end
    }
  end
  rule word
    [^\s]+ {
      def value
        text_value
      end
    }
  end
  rule s
    [\s]+ {
      def value
        text_value
      end
    }
  end
end
A quick test with the script:
#!/usr/bin/env ruby
require 'rubygems'
require 'treetop'
require 'polyglot'
require 'FullName'
parser = FullNameParser.new
name = parser.parse('John Smith').value
print name
will print:
John Smith