Hosts file ANTLR grammar
Is there an existing, working hosts file grammar on the web?
I checked the list at http://www.antlr.org/grammar/list, but I didn't find one there.
I also checked the hosts file entry in Wikipedia, and it referenced RFC 952, but I don't think that is the same format used by /windows/system32/drivers/etc/hosts.
Any grammar format is better than none, but I would prefer one in ANTLR format. This is the first time I've used any grammar generators, and I want to keep my learning curve low. I'm already planning to use ANTLR for consuming other files.
From a Microsoft page:
The HOSTS file format is the same as the format for host tables in the Version 4.3 Berkeley Software Distribution (BSD) UNIX /etc/hosts file.
And the /etc/hosts file is described here.
An example file:
#
# Table of IP addresses and hostnames
#
172.16.12.2 peanut.nuts.com peanut
127.0.0.1 localhost
172.16.12.1 almond.nuts.com almond loghost
172.16.12.4 walnut.nuts.com walnut
172.16.12.3 pecan.nuts.com pecan
172.16.1.2 filbert.nuts.com filbert
172.16.6.4 salt.plant.nuts.com salt.plant salt
A hosts file appears to be formatted like this:
- each table entry in /etc/hosts contains an IP address separated by whitespace from a list of hostnames associated with that address
- a table entry can end with zero or more aliases
- comments begin with #
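Before turning to a grammar, the three observations above can be sketched in plain Java. This is only an illustration under my own assumptions: the class HostsLineSketch and its parseLine method are hypothetical names, not part of ANTLR, and the sketch treats everything after a # as a comment and the first two whitespace-separated fields as address and host name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of ANTLR or of any grammar: splits hosts file lines by hand.
final class HostsLineSketch {

    // One table entry: address, host name, zero or more aliases.
    static final class Entry {
        final String address;
        final String hostName;
        final List<String> aliases;

        Entry(String address, String hostName, List<String> aliases) {
            this.address = address;
            this.hostName = hostName;
            this.aliases = aliases;
        }
    }

    // Returns null for blank lines, comment-only lines, and lines without a host name.
    static Entry parseLine(String line) {
        int hash = line.indexOf('#');
        // Assumption: a '#' starts a comment that runs to the end of the line.
        String content = (hash >= 0 ? line.substring(0, hash) : line).trim();
        if (content.isEmpty()) {
            return null;
        }
        String[] fields = content.split("\\s+");
        if (fields.length < 2) {
            return null; // an address with no host name is not a usable entry
        }
        List<String> aliases =
            new ArrayList<>(Arrays.asList(fields).subList(2, fields.length));
        return new Entry(fields[0], fields[1], aliases);
    }

    public static void main(String[] args) {
        String[] lines = {
            "# Table of IP addresses and hostnames",
            "127.0.0.1 localhost",
            "172.16.12.1 almond.nuts.com almond loghost"
        };
        for (String line : lines) {
            Entry e = parseLine(line);
            if (e != null) {
                System.out.println(e.address + " -> " + e.hostName + " " + e.aliases);
            }
        }
    }
}
```

A hand-written splitter like this is enough for a quick check, but it does no validation of the address or the names, which is where a grammar starts to pay off.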
The terms used in that list (table entry, address, host name, aliases, comment) become the rules of the ANTLR grammar, which may look like this:
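grammar Hosts;

// Entry point: any number of table entries up to the end of the input.
parse
    : tableEntry* EOF
    ;

// The action prints each entry; the aliases text is null when an entry has none.
tableEntry
    : address hostName aliases?
      {
        System.out.println("\n== Entry ==");
        System.out.println(" address : " + $address.text);
        System.out.println(" hostName : " + $hostName.text);
        System.out.println(" aliases : " + $aliases.text);
      }
    ;

address
    : Octet '.' Octet '.' Octet '.' Octet
    ;

hostName
    : Name
    ;

aliases
    : Name+
    ;

// Letters only, with labels separated by dots.
Name
    : Letter+ ('.' Letter+)*
    ;

// Comments and whitespace go to the hidden channel, so the parser never sees them.
Comment
    : '#' ~('\r' | '\n')* {$channel=HIDDEN;}
    ;

Space
    : (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;}
    ;

// One to three digits; the numeric range is not checked.
Octet
    : Digit Digit Digit
    | Digit Digit
    | Digit
    ;

fragment Letter
    : 'a'..'z'
    | 'A'..'Z'
    ;

fragment Digit
    : '0'..'9'
    ;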
which can be tested with the class:
import org.antlr.runtime.*;

public class Main {
    public static void main(String[] args) throws Exception {
        String source =
            "# \n" +
            "# Table of IP addresses and Hostnames \n" +
            "# \n" +
            "172.16.12.2 peanut.nuts.com peanut \n" +
            "127.0.0.1 localhost \n" +
            "172.16.12.1 almond.nuts.com almond loghost \n" +
            "172.16.12.4 walnut.nuts.com walnut \n" +
            "172.16.12.3 pecan.nuts.com pecan \n" +
            "172.16.1.2 filbert.nuts.com filbert \n" +
            "172.16.6.4 salt.plant.nuts.com salt.plant salt ";
        ANTLRStringStream in = new ANTLRStringStream(source);
        HostsLexer lexer = new HostsLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        HostsParser parser = new HostsParser(tokens);
        parser.parse();
    }
}
Generating the lexer and parser, compiling everything, and running the class produces the following output:
bart@hades:~/Programming/ANTLR/Demos/Hosts$ java -cp antlr-3.3.jar org.antlr.Tool Hosts.g
bart@hades:~/Programming/ANTLR/Demos/Hosts$ javac -cp antlr-3.3.jar *.java
bart@hades:~/Programming/ANTLR/Demos/Hosts$ java -cp .:antlr-3.3.jar Main
== Entry ==
address : 172.16.12.2
hostName : peanut.nuts.com
aliases : peanut
== Entry ==
address : 127.0.0.1
hostName : localhost
aliases : null
== Entry ==
address : 172.16.12.1
hostName : almond.nuts.com
aliases : almond loghost
== Entry ==
address : 172.16.12.4
hostName : walnut.nuts.com
aliases : walnut
== Entry ==
address : 172.16.12.3
hostName : pecan.nuts.com
aliases : pecan
== Entry ==
address : 172.16.1.2
hostName : filbert.nuts.com
aliases : filbert
== Entry ==
address : 172.16.6.4
hostName : salt.plant.nuts.com
aliases : salt.plant salt
Note that this is just a quick demo: host names can contain other characters than the ones I described, to name just one shortcoming.
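To make that shortcoming concrete, here is a small Java sketch that mirrors the Name and Octet rules as regular expressions and compares them with looser checks. The class HostsTokenChecks and the pattern LOOSER_NAME are my own hypothetical names, and the looser pattern (letters, digits, and inner hyphens per label) is an assumption about what real host names tend to look like, not a definitive rule.

```java
import java.util.regex.Pattern;

// Hypothetical checks, not part of the grammar: they only illustrate its limits.
final class HostsTokenChecks {

    // Same shape as the grammar's Name rule: letters only, labels joined by dots.
    static final Pattern GRAMMAR_NAME =
        Pattern.compile("[A-Za-z]+(\\.[A-Za-z]+)*");

    // Assumed looser shape: labels made of letters, digits, and inner hyphens.
    static final Pattern LOOSER_NAME = Pattern.compile(
        "[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?"
        + "(\\.[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?)*");

    // The Octet rule accepts any one to three digits; this also checks the 0..255 range.
    static boolean isOctet(String text) {
        if (!text.matches("[0-9]{1,3}")) {
            return false;
        }
        return Integer.parseInt(text) <= 255;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"peanut.nuts.com", "web-01.nuts.com"}) {
            System.out.println(name
                + " grammar=" + GRAMMAR_NAME.matcher(name).matches()
                + " looser=" + LOOSER_NAME.matcher(name).matches());
        }
        System.out.println("999 is an octet: " + isOctet("999"));
    }
}
```

Note that the looser pattern also matches strings such as 123 and even 172.16.12.2, so simply widening Name in the grammar would make it overlap with Octet and with whole addresses; the lexer rules would need to be rethought rather than just extended.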