Is there a shortcut to normalizing a table where the columns=rows?

Asked 2023-01-09 11:17

Suppose you had a MySQL table describing whether two substances can be mixed:

Product   A    B    C
---------------------
A         y    n    y
B         n    y    n
C         y    n    y

The first step would be to transform it into one row per pair, like this:

P1   P2   ?
-----------
A    A    y
A    B    n
A    C    y
B    A    n
B    B    y
B    C    n
C    A    y
C    B    n
C    C    y
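A rough sketch of how that first step could be automated, rather than typed out by hand. It runs the SQL through Python's built-in sqlite3 module so it can be tried without a server; SQLite is only a stand-in for MySQL here, and the table name `mix` and the column name `can_mix` are made up for the example. The idea is one SELECT per matrix column, stacked with UNION ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical copy of the matrix table; names are assumptions, not from the question.
conn.execute("CREATE TABLE mix (Product TEXT, A TEXT, B TEXT, C TEXT)")
conn.executemany(
    "INSERT INTO mix VALUES (?, ?, ?, ?)",
    [("A", "y", "n", "y"), ("B", "n", "y", "n"), ("C", "y", "n", "y")],
)

# One SELECT per matrix column; UNION ALL stacks them into (P1, P2, can_mix) rows.
unpivot_sql = """
    SELECT Product AS P1, 'A' AS P2, A AS can_mix FROM mix
    UNION ALL
    SELECT Product, 'B', B FROM mix
    UNION ALL
    SELECT Product, 'C', C FROM mix
    ORDER BY P1, P2
"""
pairs = conn.execute(unpivot_sql).fetchall()
for row in pairs:
    print(row)
```

With many products you would probably generate the UNION ALL branches from the column list instead of writing them by hand.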

But then you have duplicate information (e.g., if A can mix with B, then B can mix with A), so you can remove several rows to get

P1   P2   ?
-----------
A    A    y
A    B    n
A    C    y
B    B    y
B    C    n
C    C    y

While the last step was pretty easy with a small table, doing it manually would take forever on a larger table. How would one go about automating the removal of rows with duplicate MEANING, but not identical content?

Thanks, I hope my question makes sense as I am still learning databases

If it's safe to assume that you're starting with all relationships doubled up, e.g.

If A B is in the table, then B A is guaranteed to be in the table.

Then all you have to do is remove all rows where P2 < P1:

DELETE FROM `table_name` WHERE `P2` < `P1`;

If this isn't the case, you can make it the case by going through the table and inserting all the duplicate rows if they don't already exist, then running this.
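Here is a small sketch of that two-step approach (fill in the missing mirrored rows, then delete where P2 < P1). It uses Python's sqlite3 module as a stand-in for MySQL so it can be run as-is; the table name `pairs`, the column name `can_mix`, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (P1 TEXT, P2 TEXT, can_mix TEXT)")
# Deliberately incomplete sample: (C, A) is present but its mirror (A, C) is missing.
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?, ?)",
    [("A", "A", "y"), ("A", "B", "n"), ("B", "A", "n"), ("C", "A", "y")],
)

# Step 1: add the mirror of every row whose mirror is not in the table yet.
conn.execute("""
    INSERT INTO pairs (P1, P2, can_mix)
    SELECT t.P2, t.P1, t.can_mix
    FROM pairs AS t
    WHERE NOT EXISTS (
        SELECT 1 FROM pairs AS m WHERE m.P1 = t.P2 AND m.P2 = t.P1
    )
""")

# Step 2: every pair is now doubled, so dropping P2 < P1 loses no information.
conn.execute("DELETE FROM pairs WHERE P2 < P1")

result = conn.execute(
    "SELECT P1, P2, can_mix FROM pairs ORDER BY P1, P2"
).fetchall()
print(result)
```

The pair (A, C) survives even though only (C, A) was in the original data, which is the point of the fill-in step.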

I don't think it's necessary in your situation, but as an intellectual exercise, you could build on Jamie Wong's solution and prevent rows that have no mirrored duplicate from being removed. An EXISTS subquery is the natural way to express that, but MySQL does not allow a DELETE to select from its own target table in a subquery, so a self-join does the same job. Something like this:

DELETE t1 FROM `table_name` AS t1
  JOIN `table_name` AS t2
    ON t1.`P1` = t2.`P2` AND t1.`P2` = t2.`P1`
  WHERE t1.`P2` < t1.`P1`;

It pretty much just makes sure that there's a duplicate before deleting anything.

(My MySQL syntax might be a little off; it's been a while.)
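To see the guard in action, here is a minimal sketch using Python's sqlite3 module. SQLite is a stand-in for MySQL and, unlike MySQL, accepts a correlated EXISTS subquery on the table being deleted from, so the condition can be written directly. The table name `pairs`, the column `can_mix`, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (P1 TEXT, P2 TEXT, can_mix TEXT)")
# (B, A) has a mirror (A, B); (C, A) has no mirror and must survive.
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?, ?)",
    [("A", "B", "n"), ("B", "A", "n"), ("B", "B", "y"), ("C", "A", "y")],
)

# Delete a "backwards" row only when its mirrored row is also present.
conn.execute("""
    DELETE FROM pairs
    WHERE P2 < P1
      AND EXISTS (
          SELECT 1 FROM pairs AS t2
          WHERE t2.P1 = pairs.P2 AND t2.P2 = pairs.P1
      )
""")

remaining = conn.execute(
    "SELECT P1, P2, can_mix FROM pairs ORDER BY P1, P2"
).fetchall()
print(remaining)
```

Only (B, A) is removed. (C, A) stays because deleting it would throw away the only record of that pair.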

Step 1 (as you've already done): Transform to Table2

P1   P2   ?
-----------
A    A    y
A    B    n
A    C    y
B    A    n
B    B    y
B    C    n
C    A    y
C    B    n
C    C    y

Step 2: Reorder Columns, Select Distinct

SELECT DISTINCT
   CASE WHEN P1 < P2 THEN P1 ELSE P2 END AS P1, -- this puts the smallest value in P1
   CASE WHEN P1 > P2 THEN P1 ELSE P2 END AS P2  -- this puts the largest value in P2
FROM Table2
WHERE P1 <> P2; -- (assuming records like A, A, y are not interesting)

Add the y/n column to the SELECT list if you want to carry it through. I'm not a MySQL guy, so you might need to double-check the CASE syntax, but this seems conceptually OK anyway.
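For what it's worth, the reorder-then-DISTINCT idea can be checked with a quick sketch like the one below. It runs through Python's sqlite3 module (SQLite standing in for MySQL), and it is a variant rather than the exact query above: it keeps the y/n flag and the self-pairs such as A, A so the result matches the table the question is aiming for. The names `pairs`, `can_mix`, `lo`, and `hi` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (P1 TEXT, P2 TEXT, can_mix TEXT)")
# Hypothetical fully doubled-up data: every (X, Y) also appears as (Y, X).
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?, ?)",
    [
        ("A", "A", "y"), ("A", "B", "n"), ("A", "C", "y"),
        ("B", "A", "n"), ("B", "B", "y"), ("B", "C", "n"),
        ("C", "A", "y"), ("C", "B", "n"), ("C", "C", "y"),
    ],
)

# Put the smaller name first, then let DISTINCT collapse the mirrored rows.
dedup_sql = """
    SELECT DISTINCT
        CASE WHEN P1 < P2 THEN P1 ELSE P2 END AS lo,
        CASE WHEN P1 < P2 THEN P2 ELSE P1 END AS hi,
        can_mix
    FROM pairs
    ORDER BY lo, hi
"""
unique_pairs = conn.execute(dedup_sql).fetchall()
for row in unique_pairs:
    print(row)
```

Note that DISTINCT only collapses a mirrored pair when both rows carry the same flag. If (A, B) said n and (B, A) said y, both would come out, which is a handy way to spot inconsistent data.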
