How to do run length encoding?
I have a long string for example it could be "aaaaaabbccc". Need to represent it as "a6b2c3". What's the best way to do this? I could do this in linear time by comparing characters and incrementing counts and then replacing the counts in the array, using two indexes in one pass. Can you guys think of a better way than t开发者_运维百科his? Are any of the encoding techniques going to work here?
The common solution for this is RLE - Run-length encoding, the Wikipedia article has sample implementation code.
I don't think there is a faster way to solve it.
Informally you can think that a sub-linear complexity implies to do less comparisons that the number of the characters in the string you want to compress. But with a number of comparisons such small you can't be sure of some character, you can't know what they contain because you don't have enough information.. this means that you can't obtain a loseless compression.
I think you're asking, "Is there a better than linear way to do run-length encoding"? If so, the answer is no.
I have a implemented an encoding for bytes though. Hope it helps.
public byte[] Encode(byte[] original)
{
// TODO: Write your encoder here
if (original==null || original.Count() == 0) // Check for invalid inputs
return new byte[0];
var encodedBytes = new List<byte>(); // Byte list to be returned
byte run = 0x01;
for (int i = 1; i < original.Length; i++)
{
if (original[i] == original[i - 1]) // Keep counting the occurences till this condition is true
run++;
else // Once false,
{
encodedBytes.Add(run); // add the total occurences followed by the
encodedBytes.Add(original[i - 1]); // actual element to the Byte List
run = 0x01; // Reset the Occurence Counter
}
if (i == original.Length - 1)
{
encodedBytes.Add(run);
encodedBytes.Add(original[i]);
}
}
return encodedBytes.Count()==0 ? new byte[0] : encodedBytes.ToArray<byte>();
}
var a = new byte[]{0x01, 0x02, 0x03, 0x04};
var b = new byte[]{0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04};
var EncodedA = Encode(a);
var isAEqualB = EncodedA.SequenceEqual(b); should return true
精彩评论