How could I encode a string of 1s and 0s for transport?

2023-01-19 20:16 问答作者：

For a genetic algorithm application, I'm using a whole load of binary strings. Most of the time they literally take the form of 01001010110, so that they can be mated, mutated and "crossed-over".

For transport and storage however, this se开发者_开发知识库ems wasteful. What's the simplest way to encode this as a shorter string?

I'm guessing this is pretty trivial, but I'm not sure where to start looking.

Update: I actually need to end up with another string: one of the transport requests will be GET requests.

The simplest would be to take each digit and treat it as a bit. Each group of 8 bits can be stored in a byte. Then you can send it as a stream of bytes. You will also need to store the length of the original string so that you can distinguish between "0" and "00".

Here is one way you could write the conversion from string to a byte array:

byte[] convertToBytes(string s)
{
    byte[] result = new byte[(s.Length + 7) / 8];

    int i = 0;
    int j = 0;
    foreach (char c in s)
    {
        result[i] <<= 1;
        if (c == '1')
            result[i] |= 1;
        j++;
        if (j == 8)
        {
            i++;
            j = 0;
        }
    }
    return result;
}

Reversing the operation is very similar.

If you need to transmit the data as a string you can base 64 encode the resulting byte array.

You may also want to consider keeping it in this form in memory too. This will be much more efficient than storing it as a string where each digit is stored as a 2 byte character. You are using roughly 16 times more memory than you need to for storing your data. The disadvtange is that it is slightly more difficult to use in this form, so if you have enough memory then what you are currently doing might be just fine.

What about converting it to it's base 10 integer equivalent?

int myBin = Convert.ToInt32("01001010110", 2);

Convert.ToInt32() documentation

I would just store them as an array of bytes and use a helper function to translate between the byte array version and the string version.

Or implement Run length encoding or Huffman coding. Both are fairly easy to implement. RLE is by far the easiest, but will in most cases have worse compression ratio. If your data typically has many consecutive characters of the same value, it could still provide a substantial improvement.

Abe Miessler's answer is a good one, but with caveat mentioned in comments.

If 64 bits is not enough to represent your string, then consider using a BigInt class http://www.codeproject.com/KB/cs/BigInt.aspx (you would probably want to add to/fromBinary() extension methods to it. Alternatively represent this as a ... linked list of bytes.

Either approach has the problem of discarding any leading zeroes, so you would want to store the original length as well.

继续阅读：binary encoding string

How could I encode a string of 1s and 0s for transport?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？