Is it possible to store a 1 byte number in Postgres?

2023-01-04 02:18 问答作者：

I have 7 8-bit integer values per record that I want to store in Postgr开发者_如何学JAVAes. Pg doesn't offer a single byte integer type, SMALLINT, or 2 bytes, being the smallest integer datatype. Is there anyway I can store my 7 8-bit numbers and save on space?

Would an array type with a 7 element array be more compact? Or, should I make a binary representation of my 7 numbers (for example, using pack in Perl) and store that in a single bytea field?

Any other suggestions?

Given that the overhead for any row in PostgreSQL is 23 bytes (HeapTupleHeaderData), if you really care about small amounts of space this much you've probably picked the wrong way to store your data.

Regardless, since all the more complicated types have their own overhead (bytea adds four bytes of overhead for example, bit strings 5 to 8), the only way to accomplish what you're looking for is to use a bigint (8 bytes), numerically shifting each value and OR-ing together the result. You can do this using the bit string operations to make the code easier--compute as bit string, then cast to bigint before storing--or just manually multiply/add if you want speed to be better. For example, here's how you store two bytes together into a two byte structure and then get them back again:

int2 = 256 * byte1 + byte2
byte1 = int2 / 256
byte2 = int2 % 256

You can extend the same idea into storing 7 of them that way. Retrieval overhead is still going to be terrible, but you will have actually saved some space in the process. But not very much relative to just the row header.

`"char"`

This is a one-byte type in PostgreSQL that fits in the range of -128,127. From the docs,

The type "char" (note the quotes) is different from char(1) in that it only uses one byte of storage. It is internally used in the system catalogs as a simplistic enumeration type.

You can bias this towards [-128,127], by subtracting 128 from any input in the range of [0-255] before you write to the database, and adding it back on the output when you read from the database.

-- works
SELECT (-128)::"char", 127::"char";

-- generates out of range
SELECT (-128)::"char";
SELECT 128::"char";

-- Shifts to unsigned range.
-- If you're going to be using "char"
-- review the results of this query!
SELECT
  x::int AS "inputUnsigned",
  chr(x) AS "extendedASCII",
  -- this is the "char" types representation for that input.
  signed::"char" AS "charRepresentation",

  signed     AS "inputUnsignedToSigned",
  signed+128 AS "inputUnsignedToSignedToUnsigned"
FROM generate_series(1,255) AS gs(x)
-- Here we map the input in the range of [0,255] to [-128,127]
CROSS JOIN LATERAL ( VALUES (x::int-128) )
  AS v(signed);

Small excerpt of the output

 inputUnsigned | extendedASCII | charRepresentation | inputUnsignedToSigned | inputUnsignedToSignedToUnsigned 
---------------+---------------+--------------------+-----------------------+---------------------------------
....
           190 | ¾             | >                  |                    62 |                             190
           191 | ¿             | ?                  |                    63 |                             191
           192 | À             | @                  |                    64 |                             192
           193 | Á             | A                  |                    65 |                             193
           194 | Â             | B                  |                    66 |                             194
           195 | Ã             | C                  |                    67 |                             195
           196 | Ä             | D                  |                    68 |                             196
...

We use generate_series(1,255) only because chr(0) throws because you can't generate or output the ASCII NUL (PostgreSQL uses cstrings)

`pguint` Extension

Pguint is an extension that provides two one byte representations,

int1 (signed)
uint1 (unsigned)

There is pg_catalog.char ( another notation - "char" ) type which uses only 1 byte to store its value.

select pg_column_size( 'A' );
pg_column_size
----------------
              2
(1 row)

select pg_column_size( 'A'::"char" );
pg_column_size
----------------
              1
(1 row)

Will you ever lookup records using these values?

If yes - use normal datatypes like int4 (or even int8 if you're on 64bit architecture).

If not - first ask yourself - what is the point of storing this values in Pg? You can use bytea (complicated i/o), or bitstrings (even more complicated i/o) but what is the point? How many billion records you're going to have? Did you actually check that smaller datatype uses less space (hint: it doesn't, check it - there are data alignment issues involved)? Are you working under impression that smaller datatype is faster (it isn't. It's actually more complex to compare two int2 values than two int4 values on 32bit architecture).

You will want to look into the bytea data type referenced here: http://www.postgresql.org/docs/8.4/interactive/datatype-binary.html

There is also bit string data types: http://www.postgresql.org/docs/8.4/interactive/datatype-bit.html

First you asked about 7, but now 6 bytes. Six 8-bit values corresponds exactly to MAC address size and PostgreSQL's built in type macaddr. You can insert those bytes using MAC syntax f.i. A1-B2-C3-D4-E5-F6.

I haven't tested them myself, but there are extensions for this; e.g http://pgxn.org/dist/tinyint/ .

继续阅读：postgresql

Is it possible to store a 1 byte number in Postgres?

`"char"`

`pguint` Extension

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

"char"

pguint Extension

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

`"char"`

`pguint` Extension

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？