First non-repeating char in a string ? in haskell or F#

2023-01-16 05:11 问答作者：

Given a sequence of char what is the most efficient way to find the first non repeating char Interested purely functional implementation开发者_Go百科 haskell or F# preffered.

A fairly straightforward use of Data.Set in combination with filter will do the job in an efficient one-liner. Since this seems homeworkish, I'm declining to provide the precise line in question :-)

The complexity should, I think, be O(n log m) where m is the number of distinct characters in the string and n is the total number of characters in the string.

A simple F# solution:

let f (s: string) =
  let n = Map(Seq.countBy id s)
  Seq.find (fun c -> n.[c] = 1) s

Here's an F# solution in O(n log n): sort the array, then for each character in the original array, binary search for it in the sorted array: if it's the only one of its kind, that's it.

open System
open System.IO
open System.Collections.Generic

let Solve (str : string) =
    let arrStr = str.ToCharArray()
    let sorted = Array.sort arrStr
    let len = str.Length - 1

    let rec Inner i = 
        if i = len + 1 then
            '-'
        else
            let index = Array.BinarySearch(sorted, arrStr.[i])
            if index = 0 && sorted.[index+1] <> sorted.[index] then 
                arrStr.[i]
            elif index = len && sorted.[index-1] <> sorted.[index] then 
                arrStr.[i]
            elif index > 0 && index < len && 
                 sorted.[index+1] <> sorted.[index] && 
                 sorted.[index-1] <> sorted.[index] then 
                arrStr.[i]
            else
                Inner (i + 1)

    Inner 0

let _ =
    printfn "%c" (Solve "abcdefabcf")

A - means all characters are repeated.

Edit: ugly hack with using the - for "no solution" as you can use Options, which I keep forgetting about! An exercise for the reader, as this does look like homework.

Here's a bit longish solution, but guaranteed to be worst-case O(n log n):

import List
import Data.Ord.comparing

sortPairs :: Ord a => [(a, b)]->[(a, b)]
sortPairs = sortBy (comparing fst)

index :: Integral b => [a] -> [(a, b)]
index = flip zip [1..]

dropRepeated :: Eq a => [(a, b)]->[(a, b)]
dropRepeated [] = []
dropRepeated [x] = [x]
dropRepeated (x:xs) | fst x == fst (head xs) =
                          dropRepeated $ dropWhile ((==(fst x)).fst) xs
                    | otherwise =
                          x:(dropRepeated xs)

nonRepeatedPairs :: Ord a => Integral b => [a]->[(a, b)]
nonRepeatedPairs = dropRepeated . sortPairs . index

firstNonRepeating :: Ord a => [a]->a
firstNonRepeating = fst . minimumBy (comparing snd) . nonRepeatedPairs

The idea is: sort the string lexicographically, so that it's easy to remove any repeated characters in linear time and find the first character which is not repeated. But in order to find it, we need to save information about characters' positions in text.

The speed on easy cases (like [1..10000]) is not perfect, but for something harder ([1..10000] ++ [1..10000] ++ [10001]) you can see the difference between this and a naive O(n^2).

Of course this can be done in linear time, if the size of alphabet is O(1), but who knows how large the alphabet is...

An alternate Haskell O(n log n) solution using Data.Map and no sorting:

module NonRepeat (
    firstNonRepeat
    )
    where

import Data.List (minimumBy)
import Data.Map (fromListWith, toList)
import Data.Ord (comparing)

data Occurance = Occ { first :: Int, count :: Int }
    deriving (Eq, Ord)

note :: Int -> a -> (a, Occurance)
note pos a = (a, Occ pos 1)

combine :: Occurance -> Occurance -> Occurance
combine (Occ p0 c0) (Occ p1 c1) = Occ (p0 `min` p1) (c0 + c1)

firstNonRepeat :: (Ord a) => [a] -> Maybe a
firstNonRepeat = fmap fst . findMinimum . occurances
    where occurances = toList . fromListWith combine . zipWith note [0..]
          findMinimum = safeMinimum . filter ((== 1).count.snd)
          safeMinimum [] = Nothing
          safeMinimum xs = Just $ minimumBy (comparing snd) xs

let firstNonRepeating (str:string) =
    let rec inner i cMap =
        if i = str.Length then
            cMap 
            |> Map.filter (fun c (count, index) -> count = 1) 
            |> Map.toSeq 
            |> Seq.minBy (fun (c, (count, index)) -> index)
            |> fst      
        else
            let c = str.[i]
            let value = if cMap.ContainsKey c then 
                            let (count, index) = cMap.[c]
                            (count + 1, index)
                        else 
                            (1, i)
            let cMap = cMap.Add(c, value)
            inner (i + 1) cMap 
    inner 0 (Map.empty)

Here is a simpler version that sacrifices speed.

let firstNonRepeating (str:string) =
    let (c, count) = str 
                     |> Seq.countBy (fun c -> c) 
                     |> Seq.minBy (fun (c, count) -> count)
    if count = 1 then Some c else None

How about something like this:

let firstNonRepeat s =
  let repeats = 
    ((Set.empty, Set.empty), s)
    ||> Seq.fold (fun (one,many) c -> Set.add c one, if Set.contains c one then Set.add c many else many)
    |> snd
  s
  |> Seq.tryFind (fun c -> not (Set.contains c repeats))

This is pure C# (so I assume there's a similar F# version), which will be efficient if GroupBy is efficient (which it ought to be):

static char FstNonRepeatedChar(string s)
{
    return s.GroupBy(x => x).Where(xs => xs.Count() == 1).First().First();
}

继续阅读：algorithm f#haskell

First non-repeating char in a string ? in haskell or F#

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？