Trying to determine a unique identifier for a database table

2023-01-28 06:45

I have a database table with many columns and no specified primary key. There isn't a list of super keys either. Besides iteratively trying all candidate keys/columns, is there a way for me, using SQL, to figure out whether a subset of columns can make a unique identifier for my table?

For example, a table may have 4 columns (first name, last name, address and zip) and the data I see is:

John, Smith, 1 main st, 00001  
Mary, Smith, 1 main st, 00001  
Mary, Smith, 2 sub st, 00002

In this case, I'll need first, last and zip as my unique key.

John, Smith, 1 main st, 00001  
John, Smith, 1 main st, 00001

In this case, there is no unique key.

Please don't comment on my table construction and/or normalization of databases; I'm just trying to find a practical answer. Thanks.

This is my question: besides iteratively trying all candidate keys/columns, is there a way for me, using SQL, to figure out whether a subset of columns can make a unique identifier for my table?

Looking for a subset of unique values in this case seems so specific to the particular data set. What if you arrive at a subset today and find you can't insert a new row tomorrow?

Use an artificial key, like an auto-incrementing integer.

In short: no, there's no way to do this in T-SQL really.

My advice: just add an ID INT IDENTITY PRIMARY KEY column to the table. It's guaranteed to be unique, it will be filled automagically when you create it, and it's fast and easy, with no messy "is this really unique or are there any combinations of rows that violate the uniqueness" questions.

Just do it - it's the easiest way to go!!

You cannot find if a combination "can" make a primary key. You can find if one WILL make a good primary key for an existing set of data.

To find out whether a set of fields is a candidate key, count the distinct combinations of those fields (using GROUP BY with ROLLUP) and compare that with COUNT(*).
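As a rough illustration of that comparison, here is a minimal sketch that runs the SQL through Python's sqlite3 module against an in-memory SQLite database (a stand-in, not SQL Server). The table name `people`, the column names and the helper `is_unique` are made up for the example, and it uses a plain SELECT DISTINCT subquery rather than GROUP BY with ROLLUP.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (first_name TEXT, last_name TEXT, address TEXT, zip TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        ("John", "Smith", "1 main st", "00001"),
        ("Mary", "Smith", "1 main st", "00001"),
        ("Mary", "Smith", "2 sub st", "00002"),
    ],
)


def is_unique(conn, table, columns):
    """Return True if the column combination has no duplicates in the current data.

    Hypothetical helper: table and column names are interpolated into the SQL,
    so they must come from a trusted source.
    """
    cols = ", ".join(columns)
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    distinct = conn.execute(
        f"SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM {table}) AS d"
    ).fetchone()[0]
    return total == distinct


print(is_unique(conn, "people", ["first_name", "last_name"]))         # False
print(is_unique(conn, "people", ["first_name", "last_name", "zip"]))  # True
```

A True result only says the combination is unique in the rows that exist today; it does not guarantee that tomorrow's insert will respect it.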

There is a much faster method.

Enterprise DBMSs have had it for many years, and MS SQL Server 2005 (usable in 2008) and later provide the HashBytes() function. Convert the columns to CHAR() (VARCHAR on MS), concatenate them, hash them, then compare the hashes. You can compare the two tables in a single SELECT command. IIRC the maximum is 8000 characters per row.
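The hashing idea can be sketched outside the database as well. The snippet below is only an illustration in Python, with `hashlib.sha256` standing in for HashBytes(); the helper names `row_hash` and `duplicate_hashes` and the choice of separator are assumptions, not anything SQL Server provides. The separator matters: without it, the values ('ab', 'c') and ('a', 'bc') would concatenate to the same string.

```python
import hashlib
from collections import Counter

SEP = "\x1f"  # ASCII unit separator; assumed never to occur in the data itself


def row_hash(values):
    """Hash the values of one row, joined with a separator.

    Note: None is mapped to an empty string here, so NULL and '' hash the same.
    """
    joined = SEP.join("" if v is None else str(v) for v in values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def duplicate_hashes(rows, column_indexes):
    """Return {hash: count} for every hash that occurs more than once."""
    counts = Counter(row_hash([row[i] for i in column_indexes]) for row in rows)
    return {h: n for h, n in counts.items() if n > 1}


rows = [
    ("John", "Smith", "1 main st", "00001"),
    ("Mary", "Smith", "1 main st", "00001"),
    ("Mary", "Smith", "2 sub st", "00002"),
]

# first name + last name: the two "Mary Smith" rows share a hash.
print(len(duplicate_hashes(rows, [0, 1])))     # 1
# first name + last name + zip: every hash is distinct.
print(len(duplicate_hashes(rows, [0, 1, 3])))  # 0
```

The same per-row hashes could be computed for a second table and compared as sets to find rows that exist in one table but not the other.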

(If you use this answer, please undo and redo your Answer choice.)

If you are comparing two databases, you can check whether any duplicate rows exist in the source DB with a query like this:

select a, b, c, d
from mytable
group by a, b, c, d
having count(*) > 1

Include all columns in the GROUP BY, then use all columns as the 'row key' to see whether each row exists in the target system.
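For reference, here is a small runnable version of that duplicate check, again using Python's sqlite3 module with an in-memory SQLite database as a stand-in. The table `mytable` and columns a, b, c, d follow the query above; the sample rows and the extra `copies` column are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (a TEXT, b TEXT, c TEXT, d TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("John", "Smith", "1 main st", "00001"),
        ("John", "Smith", "1 main st", "00001"),  # exact duplicate row
        ("Mary", "Smith", "2 sub st", "00002"),
    ],
)

# Group on every column; any group with more than one row is a full duplicate.
duplicates = conn.execute(
    """
    SELECT a, b, c, d, COUNT(*) AS copies
    FROM mytable
    GROUP BY a, b, c, d
    HAVING COUNT(*) > 1
    """
).fetchall()

print(duplicates)  # [('John', 'Smith', '1 main st', '00001', 2)]
```

If this returns any rows, no subset of the columns can be a unique key until the duplicates are dealt with.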

There are update anomalies in this schema: you cannot add a person without knowing his address.

A better approach is to separate it into three tables: one for persons, one for addresses, and one (personaddress) linking the two.

> persons: id, firstname, lastname
> address: id, address
> personaddress: personid, addressid

> You cannot find if a combination "can" make a primary key.

I actually disagree with this. I think it is possible to write a query that will SELECT all possible permutations of columns from the table and combine each permutation into a single unique value (the simplest, crudest way is to CAST them all to VARCHAR and connect them with a spacer character - a better way would be some kind of hash function).

With a single pass you would then have a set of columns like P1, P12, P123, P13, P2, P23, P3 (in the case of three columns). Then you can do a query with COUNT(*) vs COUNT(DISTINCT) for each permutation column and you will see which permutations are unique.

Using dynamic SQL you could probably make it so that it would work on any table, although I don't know about the column limit for SQL Server.
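A minimal sketch of that dynamic-SQL idea, under some assumptions: it uses Python's sqlite3 module and SQLite syntax (`||` and `char(31)` as the spacer) rather than T-SQL, and the table `people`, its column names and the helper `unique_column_sets` are invented for illustration. It builds a single SELECT with one COUNT(DISTINCT ...) per column subset, so the table is queried once. The number of subsets is 2^n - 1, so this is only practical for a modest number of columns.

```python
import sqlite3
from itertools import combinations

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (first_name TEXT, last_name TEXT, address TEXT, zip TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        ("John", "Smith", "1 main st", "00001"),
        ("Mary", "Smith", "1 main st", "00001"),
        ("Mary", "Smith", "2 sub st", "00002"),
    ],
)


def unique_column_sets(conn, table, columns):
    """Return every column subset whose values are unique in the current data.

    Hypothetical helper: identifiers are interpolated into the SQL, so they
    must be trusted. NULLs are treated as empty strings.
    """
    subsets = [
        subset
        for size in range(1, len(columns) + 1)
        for subset in combinations(columns, size)
    ]
    exprs = []
    for subset in subsets:
        # Join the columns with a spacer character so ('ab', 'c') != ('a', 'bc').
        joined = " || char(31) || ".join(
            f"COALESCE(CAST({col} AS TEXT), '')" for col in subset
        )
        exprs.append(f"COUNT(DISTINCT {joined})")
    sql = f"SELECT COUNT(*), {', '.join(exprs)} FROM {table}"
    total, *distinct_counts = conn.execute(sql).fetchone()
    return [s for s, n in zip(subsets, distinct_counts) if n == total]


unique_sets = unique_column_sets(
    conn, "people", ["first_name", "last_name", "address", "zip"]
)
print(unique_sets[:2])  # [('first_name', 'address'), ('first_name', 'zip')]
```

On the question's three sample rows, the smallest unique sets are already two columns wide; every larger set that contains one of them is unique as well.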
