Longest increasing subsequence
Given an input sequence, what is the best way to find the longest (not necessarily contiguous) increasing subsequence?
[0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15] # input
[1, 9, 13, 15] # an example of an increasing subsequence (not the longest)
[0, 2, 6, 9, 13, 15] # longest increasing subsequence (not a unique answer)
[0, 2, 6, 9, 11, 15] # another possible solution
I'm looking for the best algorithm. If there is code, Python would be nice, but anything is alright.
I just stumbled upon this problem, and came up with this Python 3 implementation:
def subsequence(seq):
    if not seq:
        return seq

    M = [None] * len(seq)    # offset by 1 (j -> j-1)
    P = [None] * len(seq)

    # Since we have at least one element in our list, we can start by
    # knowing that there's at least an increasing subsequence of length one:
    # the first element.
    L = 1
    M[0] = 0

    # Looping over the sequence starting from the second element
    for i in range(1, len(seq)):
        # Binary search: we want the largest j <= L
        #  such that seq[M[j]] < seq[i] (default j = 0),
        #  hence we want the lower bound at the end of the search process.
        lower = 0
        upper = L

        # Since the binary search will not look at the upper bound value,
        # we'll have to check that manually
        if seq[M[upper-1]] < seq[i]:
            j = upper
        else:
            # actual binary search loop
            while upper - lower > 1:
                mid = (upper + lower) // 2
                if seq[M[mid-1]] < seq[i]:
                    lower = mid
                else:
                    upper = mid

            j = lower    # this will also set the default value to 0

        P[i] = M[j-1]

        if j == L or seq[i] < seq[M[j]]:
            M[j] = i
            L = max(L, j+1)

    # Building the result: [seq[M[L-1]], seq[P[M[L-1]]], seq[P[P[M[L-1]]]], ...]
    result = []
    pos = M[L-1]
    for _ in range(L):
        result.append(seq[pos])
        pos = P[pos]

    return result[::-1]    # reversing
Since it took me some time to understand how the algorithm works, I was a little verbose with the comments, and I'll also add a quick explanation:
- seq is the input sequence.
- L is a number: it gets updated while looping over the sequence and it marks the length of the longest increasing subsequence found up to that moment.
- M is a list. M[j-1] will point to an index of seq that holds the smallest value that could be used (at the end) to build an increasing subsequence of length j.
- P is a list. P[i] will point to M[j-1], where i is the index of seq and j is the length of the subsequence that seq[i] extends. In a few words, it tells which is the previous element of the subsequence. P is used to build the result at the end.
How the algorithm works:
- Handle the special case of an empty sequence.
- Start with a subsequence of 1 element.
- Loop over the input sequence with index i.
- With a binary search find the largest j such that seq[M[j-1]] is < than seq[i] (j = 0 if there is none).
- Update P, M and L.
- Trace back the result and return it reversed.
Note: The only differences from the Wikipedia algorithm are the offset of 1 in the M list, and that X is here called seq. I also tested it with a slightly improved version of the unit test shown in Eric Gustavson's answer, and it passed all tests.
Example:
seq = [30, 10, 20, 50, 40, 80, 60]
0 1 2 3 4 5 6 <-- indexes
At the end we'll have:
M = [1, 2, 4, 6, None, None, None]
P = [None, None, 1, 2, 2, 4, 4]
result = [10, 20, 40, 60]
As you'll see, P is pretty straightforward. We have to look at it from the end, so it tells that before 60 there's 40, before 80 there's 40, before 40 there's 20, before 50 there's 20 and before 20 there's 10, stop.
The complicated part is M. At the beginning M was [0, None, None, ...] since the last element of the subsequence of length 1 (hence position 0 in M) was at the index 0: 30.
At this point we'll start looping on seq and look at 10. Since 10 is < than 30, M will be updated:
if j == L or seq[i] < seq[M[j]]:
    M[j] = i
So now M looks like: [1, None, None, ...]. This is a good thing, because 10 has more chances to create a longer increasing subsequence. (The new 1 is the index of 10.)
Now it's the turn of 20. With 10 and 20 we have a subsequence of length 2 (index 1 in M), so M will be: [1, 2, None, ...]. (The new 2 is the index of 20.)
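Now it's the turn of 50. With 10, 20 and 50 we have a subsequence of length 3 (index 2 in M), so M will be: [1, 2, 3, None, ...]. (The new 3 is the index of 50.)
Now it's the turn of 40. 40 cannot extend 10, 20, 50, but 10, 20 and 40 is a subsequence of length 3 with a smaller last element, so 40 replaces 50 at index 2 in M and M will be: [1, 2, 4, None, ...]. (The new 4 is the index of 40.)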
And so on...
For a complete walk through the code you can copy and paste it here :)
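To follow the evolution of M without a debugger, here is a minimal sketch that records the filled part of M after every element. It uses the standard bisect module instead of the hand-written binary search, and the helper name trace_M is hypothetical (it is not part of the answer's code). On inputs without duplicate values it should produce the same M and P as the function above.

```python
from bisect import bisect_left

def trace_M(seq):
    """Return (snapshots, P): the filled part of M after each element, plus P.

    M[j] is the index in seq of the smallest tail of an increasing
    subsequence of length j + 1 (same offset by 1 as in the answer).
    """
    M, P, snapshots = [], [None] * len(seq), []
    for i, value in enumerate(seq):
        tails = [seq[k] for k in M]        # values currently referenced by M
        j = bisect_left(tails, value)      # length of the subsequence we extend
        P[i] = M[j - 1] if j > 0 else None
        if j == len(M):
            M.append(i)                    # new longest subsequence
        else:
            M[j] = i                       # smaller (or equal) tail for length j + 1
        snapshots.append(list(M))
    return snapshots, P

snapshots, P = trace_M([30, 10, 20, 50, 40, 80, 60])
for step in snapshots:
    print(step)                            # last line printed: [1, 2, 4, 6]
print(P)                                   # [None, None, 1, 2, 2, 4, 4]
```

Rebuilding tails on every iteration keeps the sketch short but costs O(n) per step, so this is for inspection only, not for speed.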
Here is how to simply find longest increasing/decreasing subsequence in Mathematica:
LIS[list_] := LongestCommonSequence[Sort[list], list];
input={0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15};
LIS[input]
-1*LIS[-1*input]
Output:
{0, 2, 6, 9, 11, 15}
{12, 10, 9, 5, 3}
Mathematica also has a LongestIncreasingSubsequence function in the Combinatorica` library. If you do not have Mathematica, you can query WolframAlpha.
C++ O(nlogn) solution
There's also an O(n log n) solution based on some observations. Let $A_{i,j}$ be the smallest possible tail out of all increasing subsequences of length $j$ using elements $a_1, a_2, \ldots, a_i$. Observe that, for any particular $i$, $A_{i,1} < A_{i,2} < \ldots < A_{i,j}$. This suggests that if we want the longest subsequence that ends with $a_{i+1}$, we only need to look for a $j$ such that $A_{i,j} < a_{i+1} \le A_{i,j+1}$, and the length will be $j+1$. Notice that in this case $A_{i+1,j+1}$ will be equal to $a_{i+1}$, and all $A_{i+1,k}$ will be equal to $A_{i,k}$ for $k \ne j+1$. Furthermore, there is at most one difference between the set $A_i$ and the set $A_{i+1}$, which is caused by this search. Since $A$ is always ordered in increasing order, and the operation does not change this ordering, we can do a binary search for every single $a_1, a_2, \ldots, a_n$.
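As a rough Python sketch of this observation: the list tails below stands in for the A values of the text (tails[j-1] is the smallest tail of length j seen so far), and bisect_left finds the j described above. The function name lis_lengths_ending_at and the variable names are illustrative assumptions, not taken from the source. The assertion inside the loop checks the ordering claim after every update.

```python
from bisect import bisect_left

def lis_lengths_ending_at(a):
    """For each a[i], return the length of the longest strictly increasing
    subsequence that ends exactly at a[i]."""
    tails = []      # tails[j-1]: smallest tail of any increasing subsequence of length j
    lengths = []
    for x in a:
        j = bisect_left(tails, x)   # number of tails strictly smaller than x
        if j == len(tails):
            tails.append(x)         # x extends the longest subsequence so far
        else:
            tails[j] = x            # x becomes the smallest tail of length j + 1
        lengths.append(j + 1)
        # the invariant from the text: tails stays strictly increasing
        assert all(p < q for p, q in zip(tails, tails[1:]))
    return lengths

lengths = lis_lengths_ending_at([1, 9, 3, 8, 11, 4, 5, 6, 4, 19, 7, 1, 7])
print(lengths)        # [1, 2, 2, 3, 4, 3, 4, 5, 3, 6, 6, 1, 6]
print(max(lengths))   # 6
```

Only one slot of tails changes per element, which is why a single binary search per element is enough.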
Implementation C++ (O(nlogn) algorithm)
#include <vector>
using namespace std;

/* Finds longest strictly increasing subsequence. O(n log k) algorithm. */
void find_lis(vector<int> &a, vector<int> &b)
{
    vector<int> p(a.size());
    int u, v;

    if (a.empty()) return;

    b.push_back(0);

    for (size_t i = 1; i < a.size(); i++)
    {
        if (a[b.back()] < a[i])
        {
            p[i] = b.back();
            b.push_back(i);
            continue;
        }

        for (u = 0, v = b.size()-1; u < v;)
        {
            int c = (u + v) / 2;
            if (a[b[c]] < a[i]) u=c+1; else v=c;
        }

        if (a[i] < a[b[u]])
        {
            if (u > 0) p[i] = b[u-1];
            b[u] = i;
        }
    }

    for (u = b.size(), v = b.back(); u--; v = p[v]) b[u] = v;
}

/* Example of usage: */
#include <cstdio>
int main()
{
    int a[] = { 1, 9, 3, 8, 11, 4, 5, 6, 4, 19, 7, 1, 7 };
    vector<int> seq(a, a+sizeof(a)/sizeof(a[0]));
    vector<int> lis;
    find_lis(seq, lis);

    for (size_t i = 0; i < lis.size(); i++)
        printf("%d ", seq[lis[i]]);
    printf("\n");

    return 0;
}
Source: link
I rewrote the C++ implementation in Java a while ago, and can confirm it works. The Python alternative to vector is list. But if you want to test it yourself, here is a link to an online compiler with an example implementation loaded: link
Example data is: { 1, 9, 3, 8, 11, 4, 5, 6, 4, 19, 7, 1, 7 } and the answer is: 1 3 4 5 6 7.
Here is a pretty general solution that:
- runs in O(n log n) time,
- handles increasing, nondecreasing, decreasing and nonincreasing subsequences,
- works with any sequence objects, including list, numpy.array, str and more,
- supports lists of objects and custom comparison methods through the key parameter that works like the one in the builtin sorted function,
- can return the elements of the subsequence or their indices.
The code:
from bisect import bisect_left, bisect_right
from functools import cmp_to_key

def longest_subsequence(seq, mode='strictly', order='increasing',
                        key=None, index=False):

    bisect = bisect_left if mode.startswith('strict') else bisect_right

    # compute keys for comparison just once
    rank = seq if key is None else map(key, seq)
    if order == 'decreasing':
        rank = map(cmp_to_key(lambda x,y: 1 if x<y else 0 if x==y else -1), rank)
    rank = list(rank)

    if not rank: return []

    lastoflength = [0] # end position of subsequence with given length
    predecessor = [None] # penultimate element of l.i.s. ending at given position

    for i in range(1, len(seq)):
        # seq[i] can extend a subsequence that ends with a lesser (or equal) element
        j = bisect([rank[k] for k in lastoflength], rank[i])
        # update existing subsequence of length j or extend the longest
        try: lastoflength[j] = i
        except: lastoflength.append(i)
        # remember element before seq[i] in the subsequence
        predecessor.append(lastoflength[j-1] if j > 0 else None)

    # trace indices [p^n(i), ..., p(p(i)), p(i), i], where n=len(lastoflength)-1
    def trace(i):
        if i is not None:
            yield from trace(predecessor[i])
            yield i
    indices = trace(lastoflength[-1])

    return list(indices) if index else [seq[i] for i in indices]
I wrote a docstring for the function that I didn't paste above in order to show off the code:
"""
Return the longest increasing subsequence of `seq`.

Parameters
----------
seq : sequence object
  Can be any sequence, like `str`, `list`, `numpy.array`.
mode : {'strict', 'strictly', 'weak', 'weakly'}, optional
  If set to 'strict', the subsequence will contain unique elements.
  Using 'weak' an element can be repeated many times.
  Modes ending in -ly serve as a convenience to use with `order` parameter,
  because `longest_subsequence(seq, 'weakly', 'increasing')` reads better.
  The default is 'strictly'.
order : {'increasing', 'decreasing'}, optional
  By default return the longest increasing subsequence, but it is possible
  to return the longest decreasing sequence as well.
key : function, optional
  Specifies a function of one argument that is used to extract a comparison
  key from each list element (e.g., `str.lower`, `lambda x: x[0]`).
  The default value is `None` (compare the elements directly).
index : bool, optional
  If set to `True`, return the indices of the subsequence, otherwise return
  the elements. Default is `False`.

Returns
-------
elements : list, optional
  A `list` of elements of the longest subsequence.
  Returned by default and when `index` is set to `False`.
indices : list, optional
  A `list` of indices pointing to elements in the longest subsequence.
  Returned when `index` is set to `True`.
"""
Some examples:
>>> seq = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
>>> longest_subsequence(seq)
[0, 2, 6, 9, 11, 15]
>>> longest_subsequence(seq, order='decreasing')
[12, 10, 9, 5, 3]
>>> txt = ("Given an input sequence, what is the best way to find the longest"
" (not necessarily continuous) non-decreasing subsequence.")
>>> ''.join(longest_subsequence(txt))
' ,abdegilnorsu'
>>> ''.join(longest_subsequence(txt, 'weak'))
' ceilnnnnrsssu'
>>> ''.join(longest_subsequence(txt, 'weakly', 'decreasing'))
'vuutttttttssronnnnngeee.'
>>> dates = [
... ('2015-02-03', 'name1'),
... ('2015-02-04', 'nameg'),
... ('2015-02-04', 'name5'),
... ('2015-02-05', 'nameh'),
... ('1929-03-12', 'name4'),
... ('2023-07-01', 'name7'),
... ('2015-02-07', 'name0'),
... ('2015-02-08', 'nameh'),
... ('2015-02-15', 'namex'),
... ('2015-02-09', 'namew'),
... ('1980-12-23', 'name2'),
... ('2015-02-12', 'namen'),
... ('2015-02-13', 'named'),
... ]
>>> longest_subsequence(dates, 'weak')
[('2015-02-03', 'name1'),
('2015-02-04', 'name5'),
('2015-02-05', 'nameh'),
('2015-02-07', 'name0'),
('2015-02-08', 'nameh'),
('2015-02-09', 'namew'),
('2015-02-12', 'namen'),
('2015-02-13', 'named')]
>>> from operator import itemgetter
>>> longest_subsequence(dates, 'weak', key=itemgetter(0))
[('2015-02-03', 'name1'),
('2015-02-04', 'nameg'),
('2015-02-04', 'name5'),
('2015-02-05', 'nameh'),
('2015-02-07', 'name0'),
('2015-02-08', 'nameh'),
('2015-02-09', 'namew'),
('2015-02-12', 'namen'),
('2015-02-13', 'named')]
>>> indices = set(longest_subsequence(dates, key=itemgetter(0), index=True))
>>> [e for i,e in enumerate(dates) if i not in indices]
[('2015-02-04', 'nameg'),
('1929-03-12', 'name4'),
('2023-07-01', 'name7'),
('2015-02-15', 'namex'),
('1980-12-23', 'name2')]
This answer was in part inspired by a question over at Code Review and in part by a question asking about "out of sequence" values.
Here is some Python code with tests which implements the algorithm running in O(n*log(n)). I found this on the Wikipedia talk page about the longest increasing subsequence.
import unittest

def LongestIncreasingSubsequence(X):
    """
    Find and return longest increasing subsequence of X.
    If multiple increasing subsequences exist, the one that ends
    with the smallest value is preferred, and if multiple
    occurrences of that value can end the sequence, then the
    earliest occurrence is preferred.
    """
    n = len(X)
    X = [None] + list(X)  # Pad sequence so that it starts at X[1]
    M = [None]*(n+1)      # Allocate arrays for M and P
    P = [None]*(n+1)
    L = 0
    for i in range(1,n+1):
        if L == 0 or X[M[1]] >= X[i]:
            # there is no j s.t. X[M[j]] < X[i]
            j = 0
        else:
            # binary search for the largest j s.t. X[M[j]] < X[i]
            lo = 1      # largest value known to be <= j
            hi = L+1    # smallest value known to be > j
            while lo < hi - 1:
                mid = (lo + hi)//2
                if X[M[mid]] < X[i]:
                    lo = mid
                else:
                    hi = mid
            j = lo

        P[i] = M[j]
        if j == L or X[i] < X[M[j+1]]:
            M[j+1] = i
            L = max(L,j+1)

    # Backtrack to find the optimal sequence in reverse order
    output = []
    pos = M[L]
    while L > 0:
        output.append(X[pos])
        pos = P[pos]
        L -= 1

    output.reverse()
    return output

# Try small lists and check that the correct subsequences are generated.

class LISTest(unittest.TestCase):
    def testLIS(self):
        self.assertEqual(LongestIncreasingSubsequence([]),[])
        self.assertEqual(LongestIncreasingSubsequence(range(10,0,-1)),[1])
        # list(...) is needed on Python 3, where range is not a list
        self.assertEqual(LongestIncreasingSubsequence(range(10)),list(range(10)))
        self.assertEqual(LongestIncreasingSubsequence(\
            [3,1,4,1,5,9,2,6,5,3,5,8,9,7,9]), [1,2,3,5,8,9])

unittest.main()
int[] a = {1,3,2,4,5,4,6,7};
StringBuilder s1 = new StringBuilder();
for(int i : a){
    s1.append(i);
}
StringBuilder s2 = new StringBuilder();
int count = findSubstring(s1.toString(), s2);
System.out.println(s2.reverse());

public static int findSubstring(String str1, StringBuilder s2){
    StringBuilder s1 = new StringBuilder(str1);
    if(s1.length() == 0){
        return 0;
    }
    if(s2.length() == 0){
        s2.append(s1.charAt(s1.length()-1));
        findSubstring(s1.deleteCharAt(s1.length()-1).toString(), s2);
    } else if(s1.charAt(s1.length()-1) < s2.charAt(s2.length()-1)){
        char c = s1.charAt(s1.length()-1);
        return 1 + findSubstring(s1.deleteCharAt(s1.length()-1).toString(), s2.append(c));
    }
    else{
        char c = s1.charAt(s1.length()-1);
        StringBuilder s3 = new StringBuilder();
        for(int i=0; i < s2.length(); i++){
            if(s2.charAt(i) > c){
                s3.append(s2.charAt(i));
            }
        }
        s3.append(c);
        return Math.max(findSubstring(s1.deleteCharAt(s1.length()-1).toString(), s2),
                findSubstring(s1.deleteCharAt(s1.length()-1).toString(), s3));
    }
    return 0;
}
Here is the code and an explanation in Java; maybe I will add a Python version soon.
arr = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15}
- list = {0} - Initialize list with the first element
- list = {0, 8} - New largest LIS
- list = {0, 4} - Changed 8 to 4
- list = {0, 4, 12} - New largest LIS
- list = {0, 2, 12} - Changed 4 to 2
- list = {0, 2, 10} - Changed 12 to 10
- list = {0, 2, 6} - Changed 10 to 6
- list = {0, 2, 6, 14} - New largest LIS
- list = {0, 1, 6, 14} - Changed 2 to 1
- list = {0, 1, 6, 9} - Changed 14 to 9
- list = {0, 1, 5, 9} - Changed 6 to 5
- list = {0, 1, 5, 9, 13} - New largest LIS
- list = {0, 1, 3, 9, 13} - Changed 5 to 3
- list = {0, 1, 3, 9, 11} - Changed 13 to 11
- list = {0, 1, 3, 7, 11} - Changed 9 to 7
- list = {0, 1, 3, 7, 11, 15} - New largest LIS
So the length of the LIS is 6 (the size of list). Note that list itself is not necessarily a valid subsequence of arr: here 11 comes before 7 in the input, so only the length can be relied on.
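The trace above can be reproduced with a short Python sketch. This is an illustration under assumptions rather than a translation of the Java code: it uses bisect_right from the standard library to find the first stored value greater than the new element (the Java version finds the same slot with a linear scan), and the name trace_list is hypothetical.

```python
from bisect import bisect_right

def trace_list(seq):
    """Yield a copy of the working list after each element is processed."""
    lst = []
    for x in seq:
        j = bisect_right(lst, x)    # first position whose value is > x
        if j == len(lst):
            lst.append(x)           # "New largest LIS"
        else:
            lst[j] = x              # "Changed lst[j] to x"
        yield list(lst)

arr = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
for state in trace_list(arr):
    print(state)                    # last line printed: [0, 1, 3, 7, 11, 15]
```

Because the working list is always sorted, the binary search is safe, and it brings the cost per element down from O(n) to O(log n).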
import java.util.ArrayList;
import java.util.List;

public class LongestIncreasingSubsequence {
    public static void main(String[] args) {
        int[] arr = { 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15 };
        increasingSubsequenceValues(arr);
    }

    public static void increasingSubsequenceValues(int[] seq) {
        List<Integer> list = new ArrayList<Integer>();
        for (int i = 0; i < seq.length; i++) {
            int j = 0;
            boolean elementUpdate = false;
            for (; j < list.size(); j++) {
                if (list.get(j) > seq[i]) {
                    list.add(j, seq[i]);
                    list.remove(j + 1);
                    elementUpdate = true;
                    break;
                }
            }
            if (!elementUpdate) {
                list.add(j, seq[i]);
            }
        }
        System.out.println("Longest Increasing Subsequence" + list);
    }
}
Output for the above code: Longest Increasing Subsequence[0, 1, 3, 7, 11, 15]
Here's a more compact but still efficient Python implementation:
def longest_increasing_subsequence_indices(seq):
    from bisect import bisect_right

    if len(seq) == 0:
        return seq

    # m[j] in iteration i is the last index of the increasing subsequence of seq[:i]
    # that ends with the lowest possible value while having length j
    m = [None] * len(seq)
    predecessor = [None] * len(seq)
    best_len = 0

    for i, item in enumerate(seq):
        j = bisect_right([seq[k] for k in m[:best_len]], item)
        m[j] = i
        predecessor[i] = m[j-1] if j > 0 else None
        best_len = max(best_len, j+1)

    result = []
    i = m[best_len-1]
    while i is not None:
        result.append(i)
        i = predecessor[i]
    result.reverse()
    return result

def longest_increasing_subsequence(seq):
    return [seq[i] for i in longest_increasing_subsequence_indices(seq)]
There are several answers in code, but I found them a bit hard to understand, so here is an explanation of the general idea, leaving out all the optimizations. I will get to the optimizations later.
We will use the sequence 2, 8, 4, 12, 3, 10 and, to make it easier to follow, we will require the input sequence to not be empty and to not include the same number more than once.
We go through the sequence in order.
As we do, we maintain a set of sequences, the best sequences we have found so far for each length. After we find the first sequence of length 1, which is the first element of the input sequence, we are guaranteed to have a set of sequences for each possible length from 1 to the longest we have found so far. This is obvious, because if we have a sequence of length 3, then the first 2 elements of that sequence are a sequence of length 2.
So we start with the first element being a sequence of length 1 and our set looks like
1: 2
We take the next element of the sequence (8) and look for the longest sequence we can add it to. This is sequence 1, so we get
1: 2
2: 2 8
We take the next element of the sequence (4) and look for the longest sequence we can add it to. The longest sequence we can add it to is the one of length 1 (which is just 2). Here is what I found to be the tricky (or at least non-obvious) part. Because we could not add it to the end of the sequence of length 2 (2 8), that means it must be a better choice to end the length 2 candidate. If the element were greater than 8, it would have tacked on to the length 2 sequence and given us a new length 3 sequence. So we know that it is less than 8 and therefore replace the 8 with the 4.
Algorithmically, what we say is that whatever is the longest sequence we can tack the element onto, that sequence plus this element is the best candidate for a sequence of the resulting length. Note that every element we process must belong somewhere (because we ruled out duplicate numbers in the input). If it is smaller than the element in length 1, it is the new length 1, otherwise it goes on the end of some existing sequence. Here, the length 1 sequence plus the element 4 becomes the new length 2 sequence and we have:
1: 2
2: 2 4 (replaces 2 8)
The next element, 12, gives us a sequence of length 3 and we have
1: 2
2: 2 4
3: 2 4 12
The next element, 3, gives us a better sequence of length 2:
1: 2
2: 2 3 (replaces 2 4)
3: 2 4 12
Note that we cannot alter the sequence of length 3 (substituting the 3 for the 4) because they did not occur in that order in the input sequence. The next element, 10, takes care of this. Because the best we can do with 10 is add it on to 2 3, it becomes the new list of length 3:
1: 2
2: 2 3
3: 2 3 10 (replaces 2 4 12)
Note that in terms of the algorithm, we really don't care what comes before the last element on any of our candidate sequences, but of course we need to keep track so that at the end we can output the full sequence.
We keep processing input elements like this: just tack each one onto the longest sequence we can and make that the new candidate sequence for the resulting length, because it is guaranteed not to be worse than the existing sequence of that length. At the end, we output the longest sequence we have found.
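The unoptimized procedure described so far can be sketched in a few lines of Python. This is only an illustration of the idea, keeping every candidate sequence in full; the function name lis_keep_full_sequences and the list best are hypothetical names, not from any of the answers.

```python
def lis_keep_full_sequences(seq):
    """Unoptimized version: best[k-1] is the best candidate sequence of length k."""
    best = []
    for x in seq:
        # find the longest candidate whose last element is smaller than x
        k = 0
        while k < len(best) and best[k][-1] < x:
            k += 1
        # that candidate (or the empty sequence) plus x is the new candidate of length k + 1
        candidate = (best[k - 1] if k > 0 else []) + [x]
        if k == len(best):
            best.append(candidate)   # first sequence of this length
        else:
            best[k] = candidate      # replaces the old candidate of length k + 1
    return best[-1] if best else []

print(lis_keep_full_sequences([2, 8, 4, 12, 3, 10]))   # [2, 3, 10]
```

Storing whole sequences like this takes O(n^2) space in the worst case, and the linear scan for k takes O(n) time per element.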
Optimizations
One optimization is that we do not really need to store the entire sequence of each length. To do so would take space of O(n^2). For the most part, we can get away with just storing the last element of each sequence, since that is all we ever compare against. (I will get to why this is not entirely sufficient in a bit. See if you can figure out why before I get to it.)
So let's say we will store our set of sequences as an array M where M[x] holds the last element of the sequence of length x. If you think about it, you will realize that the elements of M are themselves in increasing order: they are sorted. If M[x+1] were less than M[x], it would have replaced M[x] instead.
Since M is sorted, the next optimization goes to something I totally glossed over above: how do we find the sequence to add on to? Well, since M is sorted, we can just do a binary search to find the largest M[x] less than the element to be added. That is the sequence we add on to.
This is great if all we want to do is find the length of the longest sequence. However, M is not sufficient to reconstruct the sequence itself. Remember, at one point our set looked like this:
1: 2
2: 2 3
3: 2 4 12
We cannot just output M itself as the sequence. We need more information in order to be able to reconstruct the sequence. For this, we make 2 more changes. First, we store the input sequence in an array seq and instead of storing the value of the element in M[x], we store the index of the element in seq, so the value is seq[M[x]].
We do this so that we can keep a record of the entire sequence by chaining subsequences. As you saw at the beginning, every sequence is created by adding a single element to the end of an already existing sequence. So, second, we keep another array P that stores the index (in seq) of the last element of the sequence we are adding on to. In order to make it chainable, since what we are storing in P is an index of seq, we have to index P itself by an index of seq.
The way this works is that when processing element i of seq, we find which sequence we are adding onto. Remember, we are going to tack seq[i] onto a sequence of length x to create a new sequence of length x+1 for some x, and we are storing i, not seq[i], in M[x+1]. Later, when we find that x+1 is the biggest length possible, we are going to want to reconstruct the sequence, but the only starting point we have is M[x+1].
What we do is set M[x+1] = i and P[i] = M[x] (which is identical to P[M[x+1]] = M[x]), which is to say that for every element i we add, we store i as the last element in the longest chain we can and we store the index of the last element of the chain we are extending in P[i]. So we have:
last element: seq[M[x]]
before that: seq[P[M[x]]]
before that: seq[P[P[M[x]]]]
etc...
And now we are done. If you want to compare this to actual code, you can look at the other examples. The main differences are they use j instead of x, may store the list of length j at M[j-1] instead of M[j] to avoid wasting the space at M[0], and may call the input sequence X instead of seq.
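The chaining through P can be sketched as a small helper. The function name reconstruct is hypothetical, and the M and P values below were filled in by hand for the example sequence 2, 8, 4, 12, 3, 10, following the rules M[x+1] = i and P[i] = M[x] with M[0] left unused.

```python
def reconstruct(seq, M, P, length):
    """Follow the P chain from M[length] back to the start of the sequence."""
    result = []
    i = M[length]
    while i is not None:
        result.append(seq[i])   # seq[M[x]], then seq[P[M[x]]], and so on
        i = P[i]
    result.reverse()            # the chain is walked from the last element backwards
    return result

seq = [2, 8, 4, 12, 3, 10]
# State after processing all of seq by hand:
M = [None, 0, 4, 5]             # M[0] is unused; M[3] = 5 points at the value 10
P = [None, 0, 0, 2, 0, 4]       # e.g. P[5] = 4: before 10 comes the value 3
print(reconstruct(seq, M, P, 3))   # [2, 3, 10]
```

Note that P[3] = 2 still records the discarded chain 2 4 12; stale entries like this are harmless because the walk only follows the chain that starts at M[length].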
The most efficient algorithm for this is O(N log N), outlined here.
Another way to solve this is to take the longest common subsequence (LCS) of the original array and its sorted version, which takes O(N^2) time.
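A minimal sketch of the LCS approach, assuming a strictly increasing result is wanted (so duplicates are removed from the sorted copy). The function name lis_via_lcs and the table name dp are illustrative; the LCS itself is the textbook dynamic programme.

```python
def lis_via_lcs(seq):
    """Longest strictly increasing subsequence as LCS(seq, sorted unique values)."""
    target = sorted(set(seq))   # duplicates removed so the result is strictly increasing
    n, m = len(seq), len(target)
    # dp[i][j] = length of the LCS of seq[i:] and target[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if seq[i] == target[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # walk the table forwards to recover one LCS
    result, i, j = [], 0, 0
    while i < n and j < m:
        if seq[i] == target[j]:
            result.append(seq[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return result

lis = lis_via_lcs([0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15])
print(len(lis))   # 6
```

The table has (n+1) x (m+1) entries, which is where the quadratic time and space come from.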
Here's a compact implementation using "enumerate":
def lis(l):
    # we will create a list of lists where each sub-list contains
    # the longest increasing subsequence ending at this index
    lis = [[e] for e in l]
    # start with just the elements of l as contents of the sub-lists

    # iterate over (index,value) of l
    for i, e in enumerate(l):
        # (index,value) tuples for elements b where b<e and a<i
        # (a list comprehension, since Python 3 has no tuple-unpacking lambdas)
        lower_tuples = [(a, b) for a, b in enumerate(l[:i]) if b < e]
        # if no such items, nothing to do
        if not lower_tuples: continue
        # keep the lis-es of such items
        lowerlises = [lis[a] for a, b in lower_tuples]
        # choose the longest one of those and add
        # to the current element's lis
        lis[i] = max(lowerlises, key=len) + [e]
    # return the longest of lis-es
    return max(lis, key=len)
Here is my C++ solution to the problem. The solution is simpler than all of those provided here so far, and it is fast: N*log(N) algorithmic time complexity. I submitted the solution at leetcode, where it ran in 4 ms, faster than 100% of the C++ solutions submitted.
The idea is (in my opinion) clear: traverse the given array of numbers from left to right. Additionally maintain an array of numbers (seq in my code) that holds an increasing subsequence. When the current number is bigger than all numbers that the subsequence holds, put it at the end of seq and increase the subsequence length counter by 1. When the number is smaller than the biggest number in the subsequence so far, put it in seq anyway, in the place where it belongs to keep the subsequence sorted, by replacing some existing number. The subsequence is initialized with the length of the original numbers array and with the initial value -inf, which means the smallest int on the given platform (numeric_limits<int>::min()).
Example:
numbers = { 10, 9, 2, 5, 3, 7, 101, 18 }
seq = {-inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf}
Here is how the sequence changes when we traverse the numbers from left to right:
seq = {10, -inf, -inf, -inf, -inf, -inf, -inf, -inf}
seq = {9, -inf, -inf, -inf, -inf, -inf, -inf, -inf}
seq = {2, -inf, -inf, -inf, -inf, -inf, -inf, -inf}
seq = {2, 5, -inf, -inf, -inf, -inf, -inf, -inf}
seq = {2, 3, -inf, -inf, -inf, -inf, -inf, -inf}
seq = {2, 3, 7, -inf, -inf, -inf, -inf, -inf}
seq = {2, 3, 7, 101, -inf, -inf, -inf, -inf}
seq = {2, 3, 7, 18, -inf, -inf, -inf, -inf}
The longest increasing subsequence for the array has length 4.
Here is the code:
#include <algorithm>  // std::lower_bound
#include <iterator>   // std::next
#include <limits>     // std::numeric_limits
#include <vector>
using namespace std;

int longestIncreasingSubsequence(const vector<int> &numbers){
    if (numbers.size() < 2)
        return numbers.size();
    vector<int>seq(numbers.size(), numeric_limits<int>::min());
    seq[0] = numbers[0];
    int len = 1;
    vector<int>::iterator end = next(seq.begin());
    for (size_t i = 1; i < numbers.size(); i++) {
        // first position in the used part of seq whose value is >= numbers[i]
        auto pos = std::lower_bound(seq.begin(), end, numbers[i]);
        if (pos == end) {
            *end = numbers[i];
            end = next(end);
            len++;
        }
        else
            *pos = numbers[i];
    }
    return len;
}
Well, so far so good, but how do we know that the algorithm computes the length of the longest (or one of the longest, as there may be several subsequences of the same size) subsequence? Here is my proof:
Let's assume that the algorithm does not compute the length of the longest subsequence. Then the original sequence must contain a number that the algorithm misses and that would make the subsequence longer. Let's say that for a subsequence x1, x2, ..., xn there exists a number y such that xk < y < xk+1, 1 <= k < n. To contribute to the subsequence, y must be located in the original sequence between xk and xk+1. But then we have a contradiction: when the algorithm traverses the original sequence from left to right, every time it meets a number bigger than any number in the current subsequence, it extends the subsequence by 1. By the time the algorithm meets such a number y, the subsequence has length k and contains the numbers x1, x2, ..., xk. Because xk < y, the algorithm extends the subsequence by 1 and includes y in the subsequence. The same logic applies when y is the smallest number of the subsequence and located to the left of x1, or when y is the biggest number of the subsequence and located to the right of xn. Conclusion: such a number y does not exist and the algorithm computes the longest increasing subsequence. I hope that makes sense.
As a final note, I would like to mention that the algorithm can be easily generalized to compute the longest decreasing subsequence as well, for any data type whose elements can be ordered. The idea is the same; here is the code:
template<typename T, typename cmp = std::less<T>>
size_t longestSubsequence(const vector<T> &elements)
{
    if (elements.size() < 2)
        return elements.size();
    vector<T>seq(elements.size(), T());
    seq[0] = elements[0];
    size_t len = 1;
    auto end = next(seq.begin());
    for (size_t i = 1; i < elements.size(); i++) {
        auto pos = std::lower_bound(seq.begin(), end, elements[i], cmp());
        if (pos == end) {
            *end = elements[i];
            end = next(end);
            len++;
        }
        else
            *pos = elements[i];
    }
    return len;
}
Examples of usage:
int main()
{
    vector<int> nums = { 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15 };
    size_t l = longestSubsequence<int>(nums); // l == 6 , longest increasing subsequence
    nums = { 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15 };
    l = longestSubsequence<int, std::greater<int>>(nums); // l == 5, longest decreasing subsequence
    vector<string> vstr = {"b", "a", "d", "bc", "a"};
    l = longestSubsequence<string>(vstr); // l == 2, increasing
    vstr = { "b", "a", "d", "bc", "a" };
    l = longestSubsequence<string, std::greater<string>>(vstr); // l == 3, decreasing
}
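# Note: this returns the longest contiguous non-decreasing run of arr,
# not a general (non-contiguous) increasing subsequence.
def longest_sub_seq(arr):
    main_arr = []
    sub_arr = []
    n = len(arr)
    for ind in range(n):
        if ind < n - 1 and arr[ind] <= arr[ind+1]:
            sub_arr.append(arr[ind])
        else:
            # the current run ends here: close it and start a new one
            sub_arr.append(arr[ind])
            main_arr.append(sub_arr)
            sub_arr = []
    return max(main_arr, key=len)

a = [3, 10, 3, 11, 4, 5, 6, 7, 8, 12, 1, 2, 3]
print(longest_sub_seq(a)) # op: [4, 5, 6, 7, 8, 12]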
The verbosity and complexity of other solutions made me uneasy.
My python answer:
def findLIS(s):
    lengths = [1] * len(s)
    for i in range(1, len(s)):
        for j in range(i):
            if s[i] > s[j] and lengths[i] <= lengths[j]:
                lengths[i] += 1
    return max(lengths)
FAQ
- We initialize the lengths list to [1, 1, 1, ..., 1] because the worst case is a length of 1: [5,4,3,2] will have result lengths [1,1,1,1], and we can take the max of that, i.e. 1.
- Algorithm: for every number, we try to see if this new number can make the subsequence longer. The most important part is if s[i] > s[j] and lengths[i] <= lengths[j]: we ensure this new number is bigger and its best subsequence is not longer. If so, this is a good number to add to the old subsequence.
- My answer actually gets the increasing subsequence length (the title of the question), which is different from the non-decreasing length (the question description). If you want to get the longest non-decreasing subsequence length, then just change s[i] > s[j] to s[i] >= s[j].
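If the subsequence itself is needed and not only its length, the same O(n^2) idea can be extended with a predecessor list. This is a sketch under assumptions, not the answer author's code: the names find_lis_sequence and prev are hypothetical, and the update uses the more common form lengths[j] + 1 > lengths[i].

```python
def find_lis_sequence(s):
    """O(n^2) dynamic programme that also rebuilds one longest increasing subsequence."""
    if not s:
        return []
    lengths = [1] * len(s)     # lengths[i]: longest increasing subsequence ending at s[i]
    prev = [None] * len(s)     # prev[i]: index of the element before s[i] in that subsequence
    for i in range(1, len(s)):
        for j in range(i):
            if s[i] > s[j] and lengths[j] + 1 > lengths[i]:
                lengths[i] = lengths[j] + 1
                prev[i] = j
    # start from the first position that reaches the maximum length
    i = lengths.index(max(lengths))
    result = []
    while i is not None:
        result.append(s[i])
        i = prev[i]
    return result[::-1]

lis = find_lis_sequence([0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15])
print(len(lis))   # 6
```

Unlike findLIS above, this version returns an empty list for empty input instead of raising an error from max().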