Store multiple datasets in single table

2022-12-14 00:30 问答作者：

I'm brainstorming an application to handle various data integrity checks. Each individual check could query a number of production tables, evaluate the results and report an error providing data relevant to the check. For example, one check would look for customers with a scheduled payment but no remaining balance; a different check might look for credit card transactions that have been authorized but not settled for more than 3 days. Two completely unrelated checks. The dataset from the first one would contain things like customer number, scheduled payment date, payoff date, etc. The second check would have transaction number, card type, last 4 digits of card, amount, etc.

I would like to store the result datasets in a common schema so I can query for any errors from Check A for a specific customer in the last 3 months. Or, how many times has Check B returned an error for distinct transactions. Other tables would also manage issue resolution and such. The only thing I've come up with so far is a table with ~20 columns, one column relating to the specific check, one for date/time, and the remaining would be some form of varchar capable of holding any type of data. There are any number of reasons why this makes me cringe, but performance 开发者_如何学Pythonranks pretty high up there. I'm hoping to avoid separate tables for each check, but combining that with a lookup table for secondary functionality may be the only way to go.

For the curious I'm trying to keep this in the Microsoft world (VB.NET and SQL Server), but I'm open to other ideas.

You can pull out those ~20 columns into one normalised "values" table. You could have a lookup table that marries which fields are expected for each check type.

To illustrate what I'm suggesting:

check_datasets

dataset_id (PK)
check_id (This would be the ID of the check that failed)
datetime

value_types

value_type_id (PK)
name

dataset_value_types

value_type_id (FK, PK)
dataset_id (FK, PK)

dataset_values

dataset_id (FK, PK)
value_type_id (FK, PK)
value (This column would represent the value of data)

----------

So to explain, in the above tables:

check_datasets is the table that holds each failed check (I am assuming that you are only recording failed checks - otherwise, you'd probably want to add a pass/successful flag to this table).
value_types is a simple lookup table to provide names for value fields.
dataset_value_types is used define which checks have which types of value fields (this is the many-to-many relationship between check_datasets table and value_types).
dataset_values is the table that holds the specific values for each check.

This design provides a normalised and easy way to query every value you want for each "dataset" that is returned.

The drawback to this design is that you still have mixed datatypes in the one field (value).

If you wanted to separate datatypes you could subtype dataset_values into seperate tables (ie: one table for integer values, one for varchars, LOB etc).

In the example above, this would mean creating as many sub-tables you want for datatypes (eg: dataset_values_int, dataset_values_text, etc). Each subtable need only have two columns - one holding the value and the second referencing the parent table.

(1) dataset_values would become a parent table and possibly could look like this:

dataset_values

dataset_value_id (PK)
dataset_id (FK)
value_type_id (FK)
datatype (Is this value an integer, varchar, etc. This column tells you which sub-table to check for the table)

(2) An example sub-table would be:

dataset_values_varchar

dataset_value_id (PK,FK)
value (In this table this would be a varchar - but would vary for the other subtables.)

Hmmmm. Just brainstorming here. Do you really need to store the check data? Or just a reference to be able to look it up? How about a table to store the types of checks and then a related table that stores a chekc id, the type of check, that date of the check, and any other data you might need. Then another related table that stores the check id and the record id for the records which failed the check.

An alternative is how we store our audit table data, a varchar (max for the actual data) a column that says what column it belonged to (and in your case what table it belonged to might be a separate column) and then a column that relates it to the check that was performed.

If you really feel you must store the data in one record, this might be a place to store an xml document.

继续阅读：database-design

Store multiple datasets in single table

----------

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

----------

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？