Refactoring a small F# function

I've written the following F# function, which extracts a URL from the HTML contents of a web page:

let getPicUrl (urlContents : string) =
  let START_TOKEN = "jpg_url="
  let startIndex = urlContents.IndexOf(START_TOKEN)
  let endIndex = urlContents.IndexOf("&amp", startIndex)
  let s = startIndex + START_TOKEN.Length
  let l = endIndex-startIndex-START_TOKEN.Length

  urlContents.Substring(s, l)

What the last line, urlContents.Substring(s, l), actually needs is only s and l, so I was wondering whether I could refactor parts of this function into internal definitions to make my intentions clearer. Ideally, getPicUrl would have only two let bindings, s and l, and everything else would be defined inside those bindings. Whether that can be achieved at all is another story.

The only obvious improvement I can think of at the moment is to move endIndex inside the definition of l, so we'd have

let getPicUrl (urlContents : string) =
  let START_TOKEN = "jpg_url="
  let startIndex = urlContents.IndexOf(START_TOKEN)
  let s = startIndex + START_TOKEN.Length
  let l =
    let endIndex = urlContents.IndexOf("&amp", startIndex)
    endIndex-startIndex-START_TOKEN.Length

  urlContents.Substring(s, l)

but I keep wondering if there'd be a clearer way of organizing this function's let definitions.


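Firstly, your function is buggy. A non-matching string will make it grumpy: when the start token is missing, IndexOf returns -1 and the function throws instead of telling you that nothing was found.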
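
To make the failure concrete, here is a small sketch that calls the function from the question on two inputs that lack one of the tokens. The helper throwsOutOfRange and the sample strings are made up for illustration.

```fsharp
open System

// The function from the question, unchanged.
let getPicUrl (urlContents : string) =
  let START_TOKEN = "jpg_url="
  let startIndex = urlContents.IndexOf(START_TOKEN)
  let endIndex = urlContents.IndexOf("&amp", startIndex)
  let s = startIndex + START_TOKEN.Length
  let l = endIndex-startIndex-START_TOKEN.Length
  urlContents.Substring(s, l)

// Hypothetical helper: true if the call throws ArgumentOutOfRangeException.
let throwsOutOfRange (input : string) =
  try
    getPicUrl input |> ignore
    false
  with :? ArgumentOutOfRangeException -> true

// No start token: startIndex is -1, which is an invalid start for the second IndexOf.
let missingStart = throwsOutOfRange "<html>no picture here</html>"

// No end token: endIndex is -1, so the length passed to Substring is negative.
let missingEnd = throwsOutOfRange "jpg_url=http://example.com/a.jpg"
```

Both values should come out as true, which is why the versions below return an option instead.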

I like regexes for this sort of thing. With this active pattern:

open System.Text.RegularExpressions

let (|Regex|_|) pattern input =
  let m = Regex.Match(input, pattern)
  if m.Success then Some(List.tail [for g in m.Groups -> g.Value])
  else None

you can do:

let tryGetPicUrl = function
  | Regex @"jpg_url=([^&]+)&amp" [url] -> Some url
  | _ -> None
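
For reference, a self-contained sketch of how this might be used. The sample inputs are invented, and the active pattern is restated with the groups read by index, which is just my variation on the one above.

```fsharp
open System.Text.RegularExpressions

// Returns the captured groups (excluding group 0, the whole match) on success.
let (|Regex|_|) (pattern : string) (input : string) =
  let m = Regex.Match(input, pattern)
  if m.Success then Some [ for i in 1 .. m.Groups.Count - 1 -> m.Groups.[i].Value ]
  else None

let tryGetPicUrl = function
  | Regex @"jpg_url=([^&]+)&amp" [url] -> Some url
  | _ -> None

// Made-up inputs: one containing both tokens, one containing neither.
let found = tryGetPicUrl "x=1&jpg_url=http://example.com/a.jpg&amp;y=2"
let missing = tryGetPicUrl "<html>no picture here</html>"
```

Here found should be Some "http://example.com/a.jpg" and missing should be None, so the caller decides what to do when there is no URL.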

You could also turn your original approach into an active pattern:

let (|Between|_|) (prefix:string) (suffix:string) (value:string) =
  match value.IndexOf(prefix) with
  | -1 -> None
  | s ->
    // the wanted text starts right after the prefix
    let n = s + prefix.Length
    match value.IndexOf(suffix, n) with
    | -1 -> None
    | e -> Some (value.Substring(n, e - n))

and do:

let tryGetPicUrl = function
  | Between "jpg_url=" "&amp" url -> Some url
  | _ -> None
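
If you'd rather not introduce an active pattern, the same index-based idea can probably be written as an ordinary function that returns an option. This is only a sketch: tryBetween is a hypothetical name, and it assumes you pass the full token "jpg_url=" as the prefix so the slice starts right after it.

```fsharp
// Hypothetical helper: the text between the first `prefix` and the next `suffix`, if both exist.
let tryBetween (prefix : string) (suffix : string) (value : string) =
  match value.IndexOf(prefix) with
  | -1 -> None
  | p ->
    let start = p + prefix.Length
    match value.IndexOf(suffix, start) with
    | -1 -> None
    | stop -> Some (value.Substring(start, stop - start))

// Partial application leaves only the page contents as the remaining argument.
let tryGetPicUrl = tryBetween "jpg_url=" "&amp"

// Made-up inputs covering the three cases.
let found = tryGetPicUrl "x=1&jpg_url=http://example.com/a.jpg&amp;y=2"
let noPrefix = tryGetPicUrl "<html>no picture here</html>"
let noSuffix = tryGetPicUrl "jpg_url=http://example.com/a.jpg"
```

With these inputs, found should be Some "http://example.com/a.jpg", while noPrefix and noSuffix should both be None.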


You can write it this way:

let getPicUrl (urlContents : string) =
  let s =
    let START_TOKEN = "jpg_url="
    let startIndex = urlContents.IndexOf(START_TOKEN)
    startIndex + START_TOKEN.Length
  let l =
    let endIndex = urlContents.IndexOf("&amp", s)
    endIndex-s

  urlContents.Substring(s, l)


Another option would be to use the Split method of String (I hope the string is not too long, as that would hurt performance) and use the option type to indicate whether the URL was found.

open System

let getPicUrl (urlContents : string) =
    // split str on sep and return piece n, if there is one
    let splitAndGet n (sep:string) (str:string) =
        let spl = str.Split([|sep|],StringSplitOptions.None)
        match spl.Length with
        | x when x > n -> Some (spl.[n])
        | _ -> None
    match urlContents |> splitAndGet 1 "jpg_url=" with
    | Some str -> str |> splitAndGet 0 "&amp"
    | _ -> None
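
The final match can likely be expressed with Option.bind, which chains the two lookups. A sketch with splitAndGet hoisted to the top level and some invented sample inputs:

```fsharp
open System

// Same helper as in the answer: split on `sep` and take piece `n` if it exists.
let splitAndGet n (sep : string) (str : string) =
  let parts = str.Split([| sep |], StringSplitOptions.None)
  if parts.Length > n then Some (parts.[n]) else None

let getPicUrl (urlContents : string) =
  urlContents
  |> splitAndGet 1 "jpg_url="
  |> Option.bind (splitAndGet 0 "&amp")

// Made-up inputs covering the three cases.
let found = getPicUrl "x=1&jpg_url=http://example.com/a.jpg&amp;y=2"
let noStart = getPicUrl "<html>no picture here</html>"
let noEnd = getPicUrl "jpg_url=http://example.com/a.jpg"
```

One thing to be aware of with the split approach: when the end token is absent, piece 0 is simply the rest of the string, so noEnd should be Some "http://example.com/a.jpg" rather than None. found should be the same URL, and noStart should be None.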