Parsing a template file with C++
Recently I've been busy with some PHP framework - completely off-topic by the way.
Anyhow, I've got specific HTML/template files I would like to parse with C++ (don't ask me why, it's just because I want to write it in C++). Besides that, it might actually be the first useful thing I would ever write in C++.
Anyway, to get back to the problem, imagine I have a file like the following:
<table>
<tr>
<th>ID</th>
<th>Title</th>
<th>Actions</th>
</tr>
{foreach from="$pages => $page"}
<tr>
<td>{$page.Id()}</td>
<td>{$page.Title()}</td>
<td><a href="page/edit/{$page.Id()}/">Edit</a> | <a href="page/delete/{$page.Id()}/">Delete</a></td>
</tr>
{foreachelse}
<tr>
<td colspan="3">There are no pages to be displayed</td>
</tr>
{/foreach}
</table>
And the output should be:
<table>
<tr>
<th>ID</th>
<th>Title</th>
<th>Actions</th>
</tr>
<?php if(count($pages) > 0): ?>
<?php foreach($pages as $page): ?>
<tr>
<td><?php echo $page->getId(); ?></td>
<td><?php echo $page->getTitle(); ?></td>
<td><a href="page/edit/<?php echo $page->getId(); ?>/">Edit</a> | <a href="page/delete/<?php echo $page->getId(); ?>/">Delete</a></td>
</tr>
<?php endforeach; ?>
<?php else: ?>
<tr>
<td colspan="3">There are no pages to be displayed</td>
</tr>
<?php endif; ?>
</table>
Why I am doing this might not be exactly clear to you, but the problem itself is general enough to apply elsewhere.
Anyhow, some forward and backward lookups and modifications in the output files are required. What is the right approach to this problem?
You can write a handcrafted parser, which might be nontrivial, depending on your actual requirements. Your next best bet is to use a BNF-like C++ parser library, e.g. boost::spirit, so you don't need to sweat over processing the parsing rules yourself. You will still need to write correct semantic actions to convert { ... } to PHP.
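To give an idea of what a handcrafted approach might look like, here is a minimal sketch that scans for `{...}` tags and translates them one at a time. The function names `translate_tag` and `compile_template` are made up for illustration, and the tag grammar is assumed to be exactly what the question shows (`{foreach from="$list => $item"}`, `{foreachelse}`, `{/foreach}` and `{$obj.Method()}`). A small stack remembers whether each open `{foreach}` has seen a `{foreachelse}`, so nesting and a missing else branch are both handled without looking backward in the output.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: translate the text between '{' and '}' into PHP.
// `open` holds one entry per unclosed {foreach}; true once {foreachelse} was seen.
std::string translate_tag(const std::string& tag, std::vector<bool>& open) {
    const std::string prefix = "foreach from=\"";
    if (tag.compare(0, prefix.size(), prefix) == 0) {
        // Assumed form: foreach from="$list => $item"
        std::size_t arrow = tag.find(" => ", prefix.size());
        std::size_t quote = tag.rfind('"');
        if (arrow == std::string::npos || quote < arrow + 4)
            throw std::runtime_error("malformed foreach: " + tag);
        std::string list = tag.substr(prefix.size(), arrow - prefix.size());
        std::string item = tag.substr(arrow + 4, quote - (arrow + 4));
        open.push_back(false);
        return "<?php if(count(" + list + ") > 0): ?>\n"
               "<?php foreach(" + list + " as " + item + "): ?>";
    }
    if (tag == "foreachelse") {
        if (open.empty() || open.back())
            throw std::runtime_error("unexpected {foreachelse}");
        open.back() = true;
        return "<?php endforeach; ?>\n<?php else: ?>";
    }
    if (tag == "/foreach") {
        if (open.empty())
            throw std::runtime_error("unexpected {/foreach}");
        bool had_else = open.back();
        open.pop_back();
        // Without an else branch the loop itself still has to be closed here.
        return had_else ? "<?php endif; ?>"
                        : "<?php endforeach; ?>\n<?php endif; ?>";
    }
    // Assumed form: $obj.Method()  ->  <?php echo $obj->getMethod(); ?>
    std::size_t dot = tag.find('.');
    if (tag.size() > 4 && tag[0] == '$' && dot != std::string::npos &&
        dot + 1 < tag.size() - 2 &&
        tag.compare(tag.size() - 2, 2, "()") == 0) {
        std::string object = tag.substr(0, dot);
        std::string method = tag.substr(dot + 1, tag.size() - dot - 3);
        return "<?php echo " + object + "->get" + method + "(); ?>";
    }
    // Unknown braces (e.g. inline CSS or JavaScript) pass through unchanged.
    return "{" + tag + "}";
}

// Hypothetical entry point: copy literal text, translate every {...} tag.
std::string compile_template(const std::string& src) {
    std::string out;
    std::vector<bool> open;
    std::size_t pos = 0;
    while (pos < src.size()) {
        std::size_t lbrace = src.find('{', pos);
        std::size_t rbrace = (lbrace == std::string::npos)
                                 ? std::string::npos
                                 : src.find('}', lbrace);
        if (rbrace == std::string::npos) {
            out.append(src, pos, std::string::npos);  // no more complete tags
            break;
        }
        out.append(src, pos, lbrace - pos);
        out += translate_tag(src.substr(lbrace + 1, rbrace - lbrace - 1), open);
        pos = rbrace + 1;
    }
    if (!open.empty())
        throw std::runtime_error("unclosed {foreach}");
    return out;
}
```

Calling `compile_template` on the contents of the template file would return the PHP text to write out. Anything beyond these four tag forms (expressions, nested braces inside a tag, escaping) would need a real grammar, which is where something like boost::spirit starts to pay off.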
The right approach, in my view, would be not to reinvent the wheel (i.e. writing your own parser) but rather to use an existing library that will make the job easier and less time-consuming for you. One such C++ library could be wxHtmlParser, which is part of wxHTML in wxWidgets.
For this type of problem I tend to lean towards regular expressions, using either boost::regex, the GNU regex classes, or any other library. Identifying those markers and converting them is mostly a regex search-and-replace job (with parameters for variable names, values, etc.), and you don't have to write code to actually parse the complete HTML and the special inserts.
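As a rough sketch of the search-and-replace idea, the following uses `std::regex` from the standard library (boost::regex has a very similar interface). The function name `convert_with_regex` is invented for this example, and the patterns assume the tags look exactly like the ones in the question. Note one simplifying assumption: every `{foreach}` is expected to have a `{foreachelse}`, because the `endforeach` is emitted when the else tag is replaced.

```cpp
#include <regex>
#include <string>

// Hypothetical converter: four independent search-and-replace passes.
// Assumes every {foreach} block contains a {foreachelse}.
std::string convert_with_regex(std::string text) {
    // The custom raw-string delimiter "re" is needed because the pattern contains )"
    static const std::regex foreach_open(
        R"re(\{foreach from="\$(\w+) => \$(\w+)"\})re");
    static const std::regex foreach_else(R"(\{foreachelse\})");
    static const std::regex foreach_close(R"(\{/foreach\})");
    static const std::regex method_call(R"(\{\$(\w+)\.(\w+)\(\)\})");

    // In the replacement string, "$$" is a literal '$' and "$1"/"$2" are captures.
    text = std::regex_replace(
        text, foreach_open,
        "<?php if(count($$$1) > 0): ?>\n<?php foreach($$$1 as $$$2): ?>");
    text = std::regex_replace(text, foreach_else,
                              "<?php endforeach; ?>\n<?php else: ?>");
    text = std::regex_replace(text, foreach_close, "<?php endif; ?>");
    text = std::regex_replace(text, method_call,
                              "<?php echo $$$1->get$2(); ?>");
    return text;
}
```

This stays short because each tag is rewritten in isolation. The trade-off is that the passes know nothing about each other, so a `{foreach}` without `{foreachelse}`, or mismatched tags, would silently produce invalid PHP.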