开发者

F# String Pattern-Matching with Wildcards

As part of a project I have assigned myself as a way of improving my knowledge of F# and functional programming in general, I am attempting to write a string pattern-matching algorithm from scratch without using any loops or variables (or regular expressions, or String.Replace and friends). As this is purely a learning project, I'm not interested in the best possible way to do it, just the best functional way to do it.

I'm trying to write a function that accepts a wildcard character, a pattern string, and an input string as parameters. If the pattern does not match the input, the function returns None. If the pattern does match the input, the function returns Some(str) where str is whatever part of the input string matched any wildcards that might have been present in the pattern string.

I have this mostly working, and I'll include the code in a moment. I've written a generic pattern-matching function that works on any generic list of anything that supports equality, and then a helper function that takes strings and passes lists of characters to the generic function. This all works, except for one thing: the support for开发者_如何转开发 multiple wildcards in the pattern string isn't very good - it takes the matches for each wildcard and concatenates them together into a single string in the output.

For example:

> strMatch '*' "foo" "bar";;
val it : string option = None

> strMatch '*' "test" "test";;
val it : string option = Some ""

> strMatch '*' "functional programming is *" "functional programming is fun";;
val it : string option = Some "fun"

> strMatch '*' "* and *" "you and me";;
val it : string option = Some "youme"

It's the last one that I'm trying to fix. Ideally I'd like to return a list of strings rather than a single string, with each element in the list being the string that matched one wildcard. Failing that, I can probably make do with a version that returns only the match for the first wildcard - it's the concatenated values from both wildcards that I need to get rid of. I'm just not quite sure how to approach it.

So if anyone can suggest how I can group my return values by which wildcard they matched, I would be grateful. I'm also interested in any other improvements to my code that you might want to suggest.

let rec doMatch (wildcard:'a) (pat:'a list) (input:'a list) : 'a list option =
    let singleMatch p i =
        match (p, i) with
        | phd :: ptl, ihd :: itl ->
            if phd = wildcard then
                match doMatch wildcard ptl itl with
                | None -> None
                | Some x -> Some(ihd :: x)
            else None
        | _ -> None

    let longerMatch p i =
        match (p, i) with
        | phd :: ptl, ihd :: itl ->
            if phd = wildcard then
                match doMatch wildcard p itl with
                | None -> None
                | Some x -> Some(ihd :: x)
            else None
        | _ -> None

    match (pat, input) with
    | [], [] -> Some([])
    | [], _::_ -> None
    | _::_, [] -> None
    | phd :: ptl, ihd :: itl ->
        if phd <> wildcard then
            if phd = ihd then doMatch wildcard ptl itl
            else None
        else
            match singleMatch pat input with
            | Some x -> Some(x)
            | None -> longerMatch pat input

let strMatch (wildcard:char) (pat:string) (input:string) =
    match doMatch wildcard (List.ofSeq pat) (List.ofSeq input) with
    | None -> None
    | Some x -> Some(new string(Array.ofList x))

You've probably guessed, but this is part of an Eliza chat-bot implementation in F#.


From a design point of view, I like the idea of returning an

'a list option

where e.g.

None              // it did not match
Some[]            // matched, input had 0 wildcards
Some["foo";"bar"] // matched, input has 2 wildcards, "foo" matched 1st, "bar" 2nd

That is, just guarantee that when 'Some' is returned, the length of the list equals the number of wildcards, and the elements of the list are the matches in order. This seems to me to be straightforward to implement as well as reasonable for client code to use/consume.

(I am unclear if there is any deeper question in your long post.)

Looks like fun stuff!

EDIT

Here's some updated code. My gut tells me it's not all correct, but it at least works on your examples. The key is to use

'a list list option

since 'a is a character, an 'a list is like a string, and we want a list of strings. singleMatch starts a new list of strings, whereas longerMatch is consing onto the front of the current string.

let rec doMatch (wildcard:'a) (pat:'a list) (input:'a list) 
           : 'a list list option =
    let singleMatch p i =
        match (p, i) with
        | phd :: ptl, ihd :: itl ->
            if phd = wildcard then
                match doMatch wildcard ptl itl with
                | None -> None
                | Some xs -> Some([ihd]::xs)
            else None
        | _ -> None

    let longerMatch p i =
        match (p, i) with
        | phd :: ptl, ihd :: itl ->
            if phd = wildcard then
                match doMatch wildcard p itl with
                | None -> None
                | Some ([]) -> Some([[ihd]])
                | Some (x::xs) -> Some((ihd :: x)::xs)
            else None
        | _ -> None

    match (pat, input) with
    | [], [] -> Some([])
    | [], _::_ -> None
    | _::_, [] -> None
    | phd :: ptl, ihd :: itl ->
        if phd <> wildcard then
            if phd = ihd then doMatch wildcard ptl itl
            else None
        else
            match singleMatch pat input with
            | Some x -> Some(x)
            | None -> longerMatch pat input

let strMatch (wildcard:char) (pat:string) (input:string) =
    match doMatch wildcard (List.ofSeq pat) (List.ofSeq input) with
    | None -> None
    | Some x -> Some(x|>List.map (fun chList -> new string(Array.ofList chList)))

printfn "%A" (strMatch '*' "foo" "bar")
printfn "%A" (strMatch '*' "test" "test")
printfn "%A" (strMatch '*' "functional programming is *" 
                           "functional programming is fun")
printfn "%A" (strMatch '*' "* and *" "you and me")
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜