How does this not make varchar2 inefficient?

Suppose I have a table with a column name of type varchar(20), and I store a row with name = 'abcdef'.

INSERT INTO tab(id, name) values(12, 'abcdef');

How is the memory allocation for name done in this case?

There are two ways I can think of:

a)

20 bytes are allocated but only 6 are used. In this case varchar2 has no significant advantage over char in terms of memory allocation.

b)

Only 6 bytes are allocated. If this is the case, and I added a couple more rows after this one,

INSERT INTO tab(id, name) values(13, 'yyyy');
INSERT INTO tab(id, name) values(14, 'zzzz');

and then I do an UPDATE,

UPDATE tab SET name = 'abcdefghijkl' WHERE id = 12;

Where does the DBMS get the extra 6 bytes it needs? The next 6 bytes may not be free (if only 6 were allocated initially, the bytes that follow might already have been allotted to something else).

Is there any way other than shifting the row out to a new place? Even shifting would be a problem for index-organized tables (it might be okay for heap-organized tables).


There may be variations depending on the RDBMS you are using, but generally:

Only the actual data that you store in a varchar field is allocated. The declared size is only the maximum allowed; it is not how much is allocated.
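
For illustration (assuming Oracle, given the VARCHAR2 in the question title), VSIZE reports the number of bytes actually stored for a value, not the declared maximum:

SELECT name, VSIZE(name) FROM tab WHERE id = 12;
-- abcdef, 6   (only 6 bytes stored in a single-byte character set, not 20)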

I think that goes for char fields too, on some systems. Variable-size data types are handled efficiently enough that there is no longer any gain in allocating the maximum.

If you update a record so that it needs more space, the records inside the same allocation block are moved down, and if the records no longer fit in the block, another block is allocated and the records are distributed between the blocks. That means records are contiguous inside the allocation blocks, but the blocks don't have to be contiguous on disk.


It certainly doesn't allocate more space than needed; that would defeat the point of using a variable-length type.

In the case you mention, I would think the rows below would have to be moved down on the page; perhaps this is optimized somehow. I don't know the exact details, so perhaps someone else can comment further.


This is probably heavily database dependent.

A couple of points though: databases that use MVCC don't actually update data on disk or in the memory cache. They insert a new row with the updated data and mark the old row as deleted from a certain transaction onward. After a while the deleted row is no longer visible to any transaction and its space is reclaimed.
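
For example, in PostgreSQL (an MVCC database), you can watch this happen through the ctid system column, which shows the physical location of the current row version:

-- Assuming PostgreSQL: ctid is the (block, offset) of the current row version.
SELECT ctid, id, name FROM tab WHERE id = 12;    -- e.g. (0,1)

UPDATE tab SET name = 'abcdefghijkl' WHERE id = 12;

SELECT ctid, id, name FROM tab WHERE id = 12;    -- e.g. (0,4): a new row version was written;
                                                 -- the old one remains until VACUUM reclaims it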

As for the storage layout, a value is usually stored as 1-4 bytes of header + data (+ padding).

In the case of char, the data is padded to the declared length. In the case of varchar or text, the header stores the length of the data that follows.
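
As a quick illustration (assuming Oracle here, since the question title mentions VARCHAR2), DUMP shows the stored type, length, and bytes of a value:

SELECT DUMP(name) FROM tab WHERE id = 12;
-- Typ=1 Len=6: 97,98,99,100,101,102   (6 data bytes; the length is tracked alongside them)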


Edit: For some reason I thought this was tagged Microsoft SQL Server. I think the answer is still relevant, though.

That's why the official recommendation is

  • Use char when the sizes of the column data entries are consistent.
  • Use varchar when the sizes of the column data entries vary considerably.
  • Use varchar(max) when the sizes of the column data entries vary considerably, and the size might exceed 8,000 bytes.

It's a trade-off you need to consider when designing your table structure. You would probably need to factor in the frequency of updates vs. reads too.
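
As a rough sketch of that recommendation (a hypothetical SQL Server table; the column names are made up for illustration):

CREATE TABLE customer (
    id           int          NOT NULL PRIMARY KEY,
    country_code char(2)      NOT NULL,  -- entries are always the same size
    name         varchar(100) NULL,      -- entry sizes vary considerably
    notes        varchar(max) NULL       -- sizes vary and may exceed 8,000 bytes
);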

It's worth noting that for char, a NULL value still uses the full storage space. There is an add-in for Management Studio called SQL Internals Viewer that lets you easily see how your rows are stored.


Given the VARCHAR2 in the question title, I assume your question is focused on Oracle. In Oracle, you can reserve space for row expansion within a data block using the PCTFREE clause. That can help mitigate the effects of updates making rows longer.
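
For example (a sketch; the table definition is assumed from the question), this keeps 30% of each block free for future row growth instead of the default 10%:

CREATE TABLE tab (
    id   NUMBER PRIMARY KEY,
    name VARCHAR2(20)
) PCTFREE 30;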

However, if Oracle doesn't have enough free space within the block to write the row back, it does what is called row migration; it leaves the original address on disk alone (so it doesn't necessarily need to update indexes), but instead of storing the data in the original location, it stores a pointer to the row's new address.

This can cause performance problems where a table is heavily accessed through indexes and a significant number of rows have migrated, as it adds extra I/O to satisfy queries.
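
If you suspect this is happening, one way to check (a sketch; the CHAINED_ROWS table is created by Oracle's utlchain.sql script):

-- Populate CHAIN_CNT in the data dictionary, then read it back.
ANALYZE TABLE tab COMPUTE STATISTICS;
SELECT table_name, chain_cnt FROM user_tables WHERE table_name = 'TAB';

-- Or list the affected rows individually.
ANALYZE TABLE tab LIST CHAINED ROWS INTO chained_rows;
SELECT head_rowid FROM chained_rows WHERE table_name = 'TAB';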
