How to map a long integer number to an N-dimensional vector of smaller integers (and fast inverse)?

2022-12-30 19:17 Q&A author:

Given an N-dimensional vector of small integers, is there any simple way to map it with one-to-one correspondence to a large integer number?

Say we have an N=3 vector space. Can we represent a vector X=[(int16)x1,(int16)x2,(int16)x3] using an integer (int48)y? The obvious answer is "Yes, we can". But the question is: "What is the fastest way to do this and its inverse operation?"

Will this new 1-dimensional space possess some very special useful properties?

For the above example you have 3 * 32 = 96 bits of information, so without any a priori knowledge you need 96 bits for the equivalent long integer.

However, if you know that your x1, x2, x3 values will always fit within, say, 16 bits each, then you can pack them all into a 48-bit integer.

In either case the technique is very simple: you just use shift, mask and bitwise-or operations to pack and unpack the values.

Just to make this concrete, if you have a 3-dimensional vector of 8-bit numbers, like this:

uint8_t vector[3] = { 1, 2, 3 };

then you can join them into a single 24-bit number like so:

uint32_t all = ((uint32_t)vector[0] << 16) | ((uint32_t)vector[1] << 8) | vector[2];

This number would, if printed using this statement:

printf("the vector was packed into %06x", (unsigned int) all);

produce the output

the vector was packed into 010203

The reverse operation would look like this:

uint8_t v2[3];

v2[0] = (all >> 16) & 0xff;
v2[1] = (all >> 8) & 0xff;
v2[2] = all & 0xff;

Of course this all depends on the size of the individual numbers in the vector and the length of the vector together not exceeding the size of an available integer type, otherwise you can't represent the "packed" vector as a single number.
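For the question's own case (three 16-bit values in a 48-bit number), the same shift-and-mask idea might look like the sketch below. The function names `pack3x16` and `unpack3x16` are made up for illustration, and a `uint64_t` stands in for the 48-bit type, since C has no native 48-bit integer. The values are assumed to be signed, so each one is cast to `uint16_t` before shifting to stop sign extension from spilling into the neighbouring fields.

```c
#include <stdint.h>

/* Pack three signed 16-bit values into the low 48 bits of a uint64_t.
   Casting to uint16_t first keeps a negative value from sign-extending
   over the fields above it. (Hypothetical helper, not from the thread.) */
uint64_t pack3x16(int16_t x1, int16_t x2, int16_t x3)
{
    return ((uint64_t)(uint16_t)x1 << 32) |
           ((uint64_t)(uint16_t)x2 << 16) |
            (uint64_t)(uint16_t)x3;
}

/* Inverse operation: shift each field down and truncate to 16 bits.
   Converting back to int16_t assumes the usual two's complement
   behaviour, which is implementation-defined before C23. */
void unpack3x16(uint64_t y, int16_t out[3])
{
    out[0] = (int16_t)(uint16_t)(y >> 32);
    out[1] = (int16_t)(uint16_t)(y >> 16);
    out[2] = (int16_t)(uint16_t)y;
}
```

Both directions are a handful of shifts and ORs with no branches, which is about as fast as this mapping gets on ordinary hardware.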

If you have sets Si, i=1..n, of size Ci = |Si|, then the Cartesian product set S = S1 x S2 x ... x Sn has size C = C1 * C2 * ... * Cn.

This motivates an obvious way to do the packing one-to-one. If you have elements e1,...,en, one from each set, each ei in the range 0 to Ci-1, then you give the element e=(e1,...,en) the value e1 + C1*(e2 + C2*(e3 + C3*(... + C(n-1)*en ...))).

You can do any permutation of this packing if you feel like it, but unless the values are perfectly correlated, the size of the full set must be the product of the sizes of the component sets.

In the particular case of three 32-bit integers, if they can take on any value, you should treat them as one 96-bit integer.

If you particularly want to, you can map small values to small values through any number of means (e.g. filling out spheres with the L1 norm), but you have to specify what properties you want to have.

(For example, one can map (n,m) to (max(n,m)-1)^2 + k where k=n if n<=m and k=2n-m if n>m; you can draw this as a picture of filling in a square like so:

1 2 5   | draw along the edge of the square this way
4 3 6   v
  8 7

if you start counting from 1 and only worry about positive values; for integers, you can spiral around the origin.)
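A minimal sketch of this square-filling map and its inverse follows. It assumes n is the row and m is the column of the picture, both counted from 1, and it uses k = 2n-m along the bottom edge so that the numbering agrees with the drawing (8 sits to the left of 7). The names `pair_index` and `pair_from_index` are illustrative only.

```c
#include <stdint.h>

/* Index of the cell in row n, column m (both >= 1), filling the square
   shell by shell: down the right-hand edge, then back along the bottom. */
uint64_t pair_index(uint32_t n, uint32_t m)
{
    uint64_t s = (n > m) ? n : m;                     /* shell number */
    uint64_t k = (n <= m) ? n : 2 * (uint64_t)n - m;  /* position in shell */
    return (s - 1) * (s - 1) + k;
}

/* Inverse: recover (n, m) from an index >= 1. */
void pair_from_index(uint64_t idx, uint32_t *n, uint32_t *m)
{
    uint64_t s = 1;
    while (s * s < idx)          /* find the shell: (s-1)^2 < idx <= s^2 */
        s++;
    uint64_t k = idx - (s - 1) * (s - 1);             /* 1 .. 2s-1 */
    if (k <= s) {                /* right-hand edge: column fixed at s */
        *n = (uint32_t)k;
        *m = (uint32_t)s;
    } else {                     /* bottom edge: row fixed at s */
        *n = (uint32_t)s;
        *m = (uint32_t)(2 * s - k);
    }
}
```

The linear search for the shell is only there to keep the sketch short; an integer square root would do the same job in constant time.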

I'm writing this without having time to check details, but I suspect the best way is to represent your long integer via modular arithmetic, using k different moduli that are pairwise coprime. The original integer can then be reconstructed using the Chinese remainder theorem. Sorry this is a bit sketchy, but I hope it helps.
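As a rough illustration of that idea, the sketch below splits an integer into residues and rebuilds it incrementally with the Chinese remainder theorem. The moduli 251, 253, 255 and 256 are an arbitrary choice made here, not something from the answer. They are pairwise coprime, each residue fits in a byte, and their product is 4145475840, so only integers below that bound round-trip. `to_residues`, `from_residues` and `inv_mod` are hypothetical names.

```c
#include <stdint.h>

#define K 4
/* Assumed pairwise-coprime moduli; their product is 4145475840. */
static const uint32_t MOD[K] = { 251, 253, 255, 256 };

/* Long integer -> vector of small residues. */
void to_residues(uint64_t x, uint32_t r[K])
{
    for (int i = 0; i < K; i++)
        r[i] = (uint32_t)(x % MOD[i]);
}

/* Modular inverse by brute force; adequate for moduli this small.
   Assumes gcd(a, m) == 1. */
static uint32_t inv_mod(uint32_t a, uint32_t m)
{
    for (uint32_t t = 1; t < m; t++)
        if ((uint64_t)a * t % m == 1)
            return t;
    return 0;
}

/* Vector of residues -> the unique integer below the product of MOD[]. */
uint64_t from_residues(const uint32_t r[K])
{
    uint64_t x = r[0];       /* solution modulo M so far */
    uint64_t M = MOD[0];
    for (int i = 1; i < K; i++) {
        uint32_t m = MOD[i];
        /* pick t so that x + M*t is congruent to r[i] modulo m */
        uint32_t diff = (uint32_t)((r[i] + m - x % m) % m);
        uint32_t t = (uint32_t)((uint64_t)diff * inv_mod((uint32_t)(M % m), m) % m);
        x += M * t;
        M *= m;
    }
    return x;
}
```

Splitting is just k modulo operations, but the reconstruction is noticeably more work than shifting and masking, so this representation is mainly attractive when you want to do arithmetic on the components independently.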

To expand on Rex Kerr's generalised form, in C you can pack the numbers like so:

X = e[n];

X *= MAX_E[n-1] + 1;
X += e[n-1];

/* ... */

X *= MAX_E[0] + 1;
X += e[0];

And unpack them with:

e[0] = X % (MAX_E[0] + 1);
X /= (MAX_E[0] + 1);

e[1] = X % (MAX_E[1] + 1);
X /= (MAX_E[1] + 1);

/* ... */

e[n] = X;

(Where MAX_E[n] is the greatest value that e[n] can have). Note that these maximum values are likely to be constants, and may be the same for every e, which will simplify things a little.

The shifting / masking implementations given in the other answers are a special case of this, for when the MAX_E + 1 values are all powers of 2 (and thus the multiplication and division can be done with a shift, the addition with a bitwise-or and the modulus with a bitwise-and).
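The unrolled steps above can also be written as loops. In this sketch `n` is the number of components (so the last index is n-1, unlike the fragment above), and `radix[i]` plays the role of MAX_E[i] + 1. The names `pack_mixed` and `unpack_mixed` are illustrative, and the caller is assumed to make sure the product of all radices fits in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack e[0..n-1] into one integer; e[0] ends up least significant.
   radix[i] is the number of distinct values e[i] can take. */
uint64_t pack_mixed(const uint32_t *e, const uint32_t *radix, size_t n)
{
    uint64_t x = 0;
    for (size_t i = n; i-- > 0; ) {
        assert(e[i] < radix[i]);   /* each component must be in range */
        x = x * radix[i] + e[i];
    }
    return x;
}

/* Inverse: peel the components off again, least significant first. */
void unpack_mixed(uint64_t x, uint32_t *e, const uint32_t *radix, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        e[i] = (uint32_t)(x % radix[i]);
        x /= radix[i];
    }
}
```

For example, with radices {24, 60, 60} the vector {5, 30, 12} packs to 5 + 24*(30 + 60*12) = 18005.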

There are some totally non-portable ways to make this really fast using packed unions and direct memory access. That you really need this kind of speed is suspicious. Methods using shifts and masks should be fast enough for most purposes. If not, consider using specialized processors such as GPUs, for which vector support is optimized (parallel).

This naive storage does not have any useful property that I can foresee, except that you can perform some computations (addition, subtraction, bitwise logical operators) on all three coordinates at once, as long as you use non-negative integers only and no coordinate overflows on addition or goes negative on subtraction.

You'd better be quite sure you won't overflow (or go negative on subtraction), or the vector will become garbage.
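To make the "all coordinates at once" remark concrete, here is a small sketch with three 8-bit lanes in a 32-bit word. `pack3x8` and `add3x8` are made-up names, and the lane width is an assumption chosen for readability.

```c
#include <stdint.h>

/* Pack three 8-bit lanes into the low 24 bits of a 32-bit word. */
uint32_t pack3x8(uint8_t a, uint8_t b, uint8_t c)
{
    return ((uint32_t)a << 16) | ((uint32_t)b << 8) | (uint32_t)c;
}

/* Add all three lanes with a single integer addition.
   Only correct while no individual lane sum exceeds 255; otherwise
   the carry leaks into the lane above. */
uint32_t add3x8(uint32_t p, uint32_t q)
{
    return p + q;
}
```

Adding the packed forms of (1, 2, 3) and (10, 20, 30) gives the packed form of (11, 22, 33). Adding (0, 0, 200) and (0, 0, 100), however, gives the packed form of (0, 1, 44): the lowest lane wrapped and corrupted its neighbour, which is exactly the garbage the answer warns about.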

#include <stdint.h> // for uint8_t
long x;
uint8_t *p = (uint8_t *)&x; // view the bytes of x; the cast is required

union X {
   long L;
   uint8_t A[sizeof(long)/sizeof(uint8_t)];
};

Either of these works if you don't care about endianness. In my experience compilers generate better code with the union, because it doesn't set off their "you took the address of this, so I must keep it in RAM" rules as quickly. Those rules will still be set off if you try to index the array with something the compiler can't optimize away.

If you do care about endianness, then you need to mask and shift.
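A sketch of the union approach applied to a vector of four 16-bit values is shown below. It swaps `long` for the fixed-width `uint64_t` so the sizes are known, and the names `packed64`, `pack_union` and `unpack_union` are invented for the example. Reading a union member other than the one last written is permitted in C but not in C++, and which element lands in the low bits depends on the machine's byte order.

```c
#include <stdint.h>

/* The same 8 bytes viewed either as one integer or as four 16-bit parts. */
union packed64 {
    uint64_t whole;
    uint16_t part[4];
};

/* Vector -> integer. The numeric value depends on endianness. */
uint64_t pack_union(const uint16_t v[4])
{
    union packed64 u;
    for (int i = 0; i < 4; i++)
        u.part[i] = v[i];
    return u.whole;
}

/* Integer -> vector. Round-trips with pack_union on the same machine. */
void unpack_union(uint64_t y, uint16_t v[4])
{
    union packed64 u;
    u.whole = y;
    for (int i = 0; i < 4; i++)
        v[i] = u.part[i];
}
```

The mapping is one-to-one on any given machine, but a packed value written on a little-endian host will unpack in a different order on a big-endian one.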

I think what you want can be solved using multi-dimensional space filling curves. The link gives a lot of references on this, which in turn give different methods and insights. Here's a specific example of an invertible mapping. It works for any dimension N.

As for useful properties, these mappings are related to Gray codes.
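One of the simplest space-filling curves is the Z-order (Morton) curve, which just interleaves the bits of the coordinates. The sketch below is not necessarily the mapping the answer links to, and it is not the Gray-code-based kind; it is only meant to show what an invertible curve-style mapping can look like for three 16-bit coordinates in 48 bits. `morton3_encode` and `morton3_decode` are illustrative names.

```c
#include <stdint.h>

/* Interleave the bits of x, y, z: bit i of x goes to bit 3i of the code,
   bit i of y to bit 3i+1, bit i of z to bit 3i+2. */
uint64_t morton3_encode(uint16_t x, uint16_t y, uint16_t z)
{
    uint64_t code = 0;
    for (int bit = 0; bit < 16; bit++) {
        code |= (uint64_t)((x >> bit) & 1u) << (3 * bit);
        code |= (uint64_t)((y >> bit) & 1u) << (3 * bit + 1);
        code |= (uint64_t)((z >> bit) & 1u) << (3 * bit + 2);
    }
    return code;
}

/* Inverse: pick every third bit back out into its coordinate. */
void morton3_decode(uint64_t code, uint16_t out[3])
{
    out[0] = out[1] = out[2] = 0;
    for (int bit = 0; bit < 16; bit++) {
        out[0] |= (uint16_t)(((code >> (3 * bit)) & 1u) << bit);
        out[1] |= (uint16_t)(((code >> (3 * bit + 1)) & 1u) << bit);
        out[2] |= (uint16_t)(((code >> (3 * bit + 2)) & 1u) << bit);
    }
}
```

Unlike plain concatenation, points that are close in all three coordinates tend to get nearby codes, which is the property that makes such curves useful for spatial indexing. The bit-by-bit loop is the readable version; faster variants use lookup tables or bit-spreading masks.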

Hard to say whether this was what you were looking for, or whether the "pack 3 16-bit ints into a 48-bit int" does the trick for you.
