How do I encode a 4-byte string as a single 32-bit integer?

2022-12-12 00:26 问答作者：

First, a disclaimer. I'm not a CS grad nor a math major, so simplicity is important.

I have a four-character string (e.g. "isoy") that I need to pass as a single 32-bit integer field. Of course at the other end, I need to decode it back to a string. The string will only contain A-Z, and case is not important, if that helps.

The funny part is that I'm starting with PowerShell on the sending end and Linux at the receiving end. I can use Perl or Python there, with a preference for Python. I don't actually need answers in each language, 开发者_开发技巧I'm most interested in a PowerShell (C# also good) example for going both ways.

To 32-bit unsigned integer:

uint x = BitConverter.ToUInt32(Encoding.ASCII.GetBytes("isoy"), 0); // 2037347177

To string:

string s = Encoding.ASCII.GetString(BitConverter.GetBytes(x));      // "isoy"

BitConverter uses the native endianness of the machine.

For Python, struct.unpack does the job (to make a 4-byte string into an int -- struct.pack goes the other way):

>>> import struct
>>> struct.unpack('i', 'isoy')[0]
2037347177
>>> struct.pack('i', 2037347177)
'isoy'
>>>

(you can use different formats to ensure big-endian or little-endian encoding, if you need that -- '>i' and '<i' respectively -- instead of just plain 'i' which uses whatever encoding is native to the machine).

// string -> int    

uint ret = 0;
for ( int i = 0; i < 4; ++i )
{
  ret |= ( str[i] << ( i * 8 ) );
}

// int -> string
for ( int i = 0; i < 4; ++i )
{
  str[i] = ( ret >> ( i * 8 ) ) & 0xff;
}

Using PowerShell syntax you can do it this way (pretty much like dtb solution):

PS> $x = [BitConverter]::ToUInt32([byte[]][char[]]'isoy', 0)
PS> [char[]][BitConverter]::GetBytes($x) -join ''
isoy

You do have to watch out for endian-ness on the Linux side. If it is running on an Intel processor I believe should be fine (same endian-ness as the PowerShell side).

Please take a look at the struct standard library module in Python's Manual. It has two functions for this: struct.pack and struct.unpack. You can use the 'L' (unsigned long) format character for this.

Aside from byte packing, you can also consider that your 26-character alphabet can be encoded as 0-25 instead of A-Z.

So without worrying about big and little endians, you can go from "letters" to a number like this:

val=letter0+letter1*26+letter2*26*26+letter3*26*26*26;

to go from val back to letters, you do something like this:

letter0=val%26;
letter1=(val/26)%26;
letter2=(val/(26*26))%26;
letter3=(val/(26*26*26))%26;

where "%" is your language's modulus operator and "/" is an integer division.

You'll obviously need a way to get from 'A'-'Z' to 0-25 and back. That's language dependent.

You can easily put this into loops. I show the loops unrolled to make things a bit more obvious.

It's more common to pack letters into bytes, so you can use shift and and bitwise operations to encode and decode. But by doing it the way I show above, you could pack six letters into a 32-bit number, rather than just four. Which is nice, since you can hold things like stock market ticker symbols in a single 32-bit value (mutual funds ticker symbols are 5 characters).

继续阅读：algorithm python

How do I encode a 4-byte string as a single 32-bit integer?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？