开发者

Haskell Parsec and Unordered Properties

I am trying to use Parsec to parse something like this:

property :: CharParser SomeObject
property = do
    name
    parameters
    value
    return SomeO开发者_如何学PythonbjectInstance { fill in records here }

I am implementing the iCalendar spec and on every like there is a name:parameters:value triplet, very much like the way that XML has a name:attributes:content triplet. Infact you could very easily convert an iCalendar into XML format (thought I can't really see the advantages).

My point is that the parameters do not have to come in any order at all and each paramater may have a different type. One parameter may be a string while the other is the numeric id of another element. They may share no similarity yet, in the end, I want to place them correctly in the right record fields for whatever 'SomeObjectInstance' that I wanted the parser to return. How do I go about doing this sort of thing (or can you point me to an example of where somebody had to parse data like this)?

Thankyou, I know that my question is probably a little confused but that reflects my level of understanding of what I need to do.

Edit: I was trying to avoid giving the expected output (because it is large, not because it is hidden) but here is an example of an input file (from wikipedia):

BEGIN:VCALENDAR

VERSION:2.0

PRODID:-//hacksw/handcal//NONSGML v1.0//EN

BEGIN:VEVENT

UID:uid1@example.com

DTSTAMP:19970714T170000Z

ORGANIZER;CN=John Doe:MAILTO:john.doe@example.com

DTSTART:19970714T170000Z

DTEND:19970715T035959Z

SUMMARY:Bastille Day Party

END:VEVENT

END:VCALENDAR

As you can see it contains one VEvent inside a VCalendar, I have made data structures that represent them here.

I am trying to write a parser that parses that type of file into my data structures and I am stuck on the bit where I need to handle properties coming in any order with any type; date, time, int, string, uid, ect. I hope that makes more sense without repeating the entire iCalendar spec.


Parsec has the Parsec.Perm module precisely to parse unordered but linear (i.e. at the same level in the syntax tree) elements such as attribute tags in XML files.

Unfortunately the Perm module is mostly undocumented. The best reference is the Parsing Permutation Phrases paper which the Haddock doc page refers to, but even that is largely a description of the technique rather than how to use it.


Ok, so between BEGIN:VEVENT and END:VEVENT, you have many key value pairs. So write a rule keyValuePair that returns (key, value). Now inside the rule for VEVENT you do many KeyValuePair to get a list of pairs. Once you've done that you use a fold to populate a VEVENT record with the given values. In the function you give to fold, you use pattern matching to find out in which field to store the value. As the starting value for the accumulator you use a VEvent record where the optional fields are set to Nothing. Example:

pairs <- many keyValuePairs
vevent = foldr f (VEvent {sequence = Nothing}) pairs
    where f ("SUMMARY", v) ve = ve {summary = v}
          f ("DSTART", v) ve = ve {dstart = read v}

...and so on. Do the same for the other components.

Edit: Here's some runnable example code for the fold:

data VEvent = VEvent {
        summary :: String,
        dstart :: String,
        sequenceSt :: Maybe String
        } deriving Show

vevent pairs = foldr f (VEvent {sequenceSt = Nothing}) pairs
    where f ("SUMMARY", v) ve = ve {summary = v}
          f ("DSTART", v) ve = ve {dstart = v}
          f ("SEQUENCEST", v) ve = ve {sequenceSt = Just v}

main = do print $ vevent [("SUMMARY", "lala"), ("DSTART", "lulu")]
          print $ vevent [("SUMMARY", "lala"), ("DSTART", "lulu"), ("SEQUENCEST", "lili")]

Output:

VEvent {summary = "lala", dstart = "lulu", sequenceSt = Nothing}
VEvent {summary = "lala", dstart = "lulu", sequenceSt = Just "lili"}

Note that this will produce a warning when compiled. To avoid the warning, initialize all non-optional fields to undefined explicitly.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜