Getting n smallest numbers in a sequence
What would be the most efficient way to take the n smallest items from a sequence,
[ [1 2 3] [9 2 1] [2 3 4] [5 6 7] ]
I would like to take the 2 smallest from the sequence, based on the first item of each,
[1 2 3] [2 3 4]
Currently I am sorting the whole list and then taking the first n items, but that probably is not the most efficient way to go; it is a big list and I need to do this frequently.
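For reference, a minimal sketch of the sort-then-take approach described above, assuming the ordering is by the first element of each vector:

;; Sort the whole collection by the first element, then take the first n items.
(def xs [[1 2 3] [9 2 1] [2 3 4] [5 6 7]])

(take 2 (sort-by first xs))
;; => ([1 2 3] [2 3 4])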
The Joy of Clojure, Chapter 6.4, describes a lazy sorting algorithm. The beauty of lazy sorting is that it only does as much work as necessary to find the first x values, so if x << n this algorithm is O(n). Here is a modified version of that algorithm that takes a comparison function.
(defn sort-parts
  "Lazy, incremental quicksort: works through a list of unsorted
   partitions interleaved with already-placed pivots, using the
   comparison function f."
  [work f]
  (lazy-seq
    (loop [[part & parts] work]
      (if-let [[pivot & xs] (seq part)]
        (let [psmaller? (partial f pivot)]
          (recur (list* (filter psmaller? xs)
                        pivot
                        (remove psmaller? xs)
                        parts)))
        (when-let [[x & parts] parts]
          (cons x
                (sort-parts parts f)))))))
(defn qsort [xs f] (sort-parts (list xs) f))
(defn cmp [[a _ _] [b _ _]] (> a b))
(def a [[1 2 3] [9 2 1] [2 3 4] [5 6 7]])
(take 2 (qsort a cmp))
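For the example above, (take 2 (qsort a cmp)) evaluates to ([1 2 3] [2 3 4]); because the sort is lazy, only enough partitioning is realized to produce those first two elements.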
As referenced, you can use the median-of-medians algorithm to select the kth smallest element in linear time, and then partition in linear time. This will provide you with the k smallest elements in O(n). The elements will, however, be unsorted, so if you want the k smallest elements sorted it will cost you another O(k log k).
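Not a full median-of-medians implementation, but a minimal Clojure sketch of the same select-and-partition idea, using a randomized pivot (the expected-O(n) variant mentioned in the notes below); the name k-smallest and the keyfn parameter are illustrative:

;; Expected O(n): repeatedly partition around a random pivot (quickselect)
;; and keep only the side that still contains the k smallest items.
;; The result is the k smallest items, in no particular order.
(defn k-smallest
  [k keyfn coll]
  (cond
    (<= k 0) '()
    (>= k (count coll)) (seq coll)
    :else
    (let [pivot     (keyfn (rand-nth (vec coll)))
          smaller   (filter #(neg? (compare (keyfn %) pivot)) coll)
          equal     (filter #(zero? (compare (keyfn %) pivot)) coll)
          larger    (filter #(pos? (compare (keyfn %) pivot)) coll)
          n-smaller (count smaller)
          n-equal   (count equal)]
      (cond
        (<= k n-smaller) (k-smallest k keyfn smaller)
        (<= k (+ n-smaller n-equal)) (concat smaller (take (- k n-smaller) equal))
        :else (concat smaller equal
                      (k-smallest (- k n-smaller n-equal) keyfn larger))))))

(k-smallest 2 first [[1 2 3] [9 2 1] [2 3 4] [5 6 7]])
;; => ([1 2 3] [2 3 4])   (order within the result is not guaranteed)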
A few important notes:
Firstly, although the complexity is O(n), small constants are not guaranteed and you might see minimal improvement, especially if your n is reasonably small. There are randomized linear selection algorithms that run faster in practice (usually the expected running time is O(n) with a worse worst case, but they have smaller constants than the deterministic ones).
Why can't you maintain the array in sorted order? That would probably be much more performant. You would only need to insert each new element in the correct place, which costs O(log n) to locate (plus the cost of shifting if it is a flat array), and finding the k smallest would then be O(1) (or O(k) if you have to build the result afresh).
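A minimal sketch of this idea in Clojure, assuming a sorted-set-by (a balanced tree with O(log n) insertion) stands in for the sorted array; the comparator breaks ties on the whole vector so items with equal first elements are not collapsed:

;; Keep the data permanently ordered by the first element.
(defn by-first [x y]
  (let [c (compare (first x) (first y))]
    (if (zero? c) (compare x y) c)))

(def data (into (sorted-set-by by-first)
                [[1 2 3] [9 2 1] [2 3 4] [5 6 7]]))

;; O(log n) insertion keeps the set ordered:
(def data' (conj data [0 9 9]))

;; The k smallest are just the first k entries:
(take 2 data')
;; => ([0 9 9] [1 2 3])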
If you decide against the above, an alternative is to keep the array sorted after every such query: append new elements to the end in O(1), and whenever you need the k smallest, sort only the new elements and merge them into the sorted part in linear time, in the spirit of merge sort. That costs O(m log m + n), where m is the number of elements added since the last sort.
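A sketch of that merge step, assuming the base collection is already sorted by the first element; merge-sorted is a hypothetical helper:

;; Merge two sequences that are each already sorted by keyfn, in linear time.
(defn merge-sorted
  [keyfn xs ys]
  (lazy-seq
    (cond
      (empty? xs) ys
      (empty? ys) xs
      (<= (compare (keyfn (first xs)) (keyfn (first ys))) 0)
        (cons (first xs) (merge-sorted keyfn (rest xs) ys))
      :else
        (cons (first ys) (merge-sorted keyfn xs (rest ys))))))

(def sorted-base [[1 2 3] [2 3 4] [5 6 7] [9 2 1]]) ; already sorted by first
(def additions   [[4 0 0] [0 8 8]])                 ; appended since the last sort

;; Sort only the m new items (O(m log m)), then merge in O(n):
(take 2 (merge-sorted first sorted-base (sort-by first additions)))
;; => ([0 8 8] [1 2 3])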
If n is small, you can create a second list of size n that you keep sorted, so you always have quick access to the largest element in it. Iterate through the big list, checking whether each item is less than the largest in the small list; if so, insert it into the small list and, if the small list is already full, drop the previous largest.
If n is less than 3 or 4, you can just brute force it. If n can be larger, you'll want to do a binary search to find the insertion point for each. If n can be very large, then a different mechanism may be in order.
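A sketch of this bounded-buffer idea, using a sorted set capped at n entries in place of the binary-searched small list (n-smallest is a hypothetical name; a mutable heap would also fit):

;; Single pass over the big list, keeping only the n smallest seen so far.
(defn n-smallest
  [n keyfn coll]
  (let [cmp (fn [x y]
              (let [c (compare (keyfn x) (keyfn y))]
                (if (zero? c) (compare x y) c)))]
    (reduce (fn [buf x]
              (let [buf (conj buf x)]
                (if (> (count buf) n)
                  (disj buf (last buf))   ; buffer over capacity: drop the current largest
                  buf)))
            (sorted-set-by cmp)
            coll)))

(n-smallest 2 first [[1 2 3] [9 2 1] [2 3 4] [5 6 7]])
;; => #{[1 2 3] [2 3 4]}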