开发者

Database design - exposing Primary Key

Imagine that I have a Customer table.

And in this table is an ID column, Primary Key int, Identity, all that jazz. Our Customer also has a Name.

Now, if I was to send out information in a report, or an e-mail blast, where there was a link back into my website to uniquely identify that Customer, would it be good practice to use the Primary Key ID field?

All my instincts are telling me this is awful, and that surrogate IDs should PURELY be used for the database only开发者_运维知识库 - and never exposed to the outside world in this way.

But I'm looking for that killer argument that I could use in the future, if (when) presented with this case.

Thoughts?


A unique identifier is a unique identifier and if you already have on in the primary key for your database table, use it. Creating another surrogate key would be redundant unless you have GUIDs for example and that was too long for your output format.

--- EDIT ----

Just thinking about the mailing reference, having done this in the past I'd created a table that tracked specific mailings to build metrics around the responses. Each mailing record used a GUID as a primary key to keep it unique and that record linked back to the customer record, no details from the customer record were presented back to the viewer of the link.

If you are presenting customer details then there needs to be access control, even if you obfuscate the ID, anyone getting hold of that link will have access to that data and in most systems that would be unacceptable.

I think the decision starts with 'how secure does this data need to be?' and moves on from there.


I'd suggest not adding a new field, but also don't expose the Id.

Create a table to store emails that you send out. Include a GUID as the primary key and UserId as a foreign key. In the URL you are giving them, link back to the GUID. Then if a user clicks on a URL you can updated the record based on the GUID and indicate when the user accessed the URL.


Consider that end-users might find it more easy to navigate your site if they were able to change urls because they contained readable strings, and not just ids. This is a more rest-y approach.

On the other hand, why not expose the id? If your application is vulnerable simply because someone reasoned out a primary key, then you have problems.


It is ok to use a key in the form that you are using. However, in your particular case the problem exists that the url could be played with and people could see other peoples information. It depends on what is at that url as to whether or not this would be a problem for you or not. If it is then I suggest that you toss some salt in your url (a second number, a guid, the customers email hashed, etc.) that you could then match with the ID to verify that the URL is correct. Even using IDs that are not 1, 2, 3...instead using every 7th ID...is still guessable. The other side of that is that you could just has the ID if you were so inclined...or the whole URL for that matter. Your information generally has no business being consumed by the outside world unless you really want them too. In this case your instinct was right in that the data should not be consumed by humans. But that URL is part of your system so it is really just a matter of making the information useless to the people reading the url.


The way you describe it, it sounds as if a user could look into other users data just by modifying the primary key value in the link. If that's the case, generating a unique id (GUID for example) for external purposes would be the better approach.


Use A Surrogate Key because you don't want people to be able to guess(and with a incrementing PK it's not really guessing) someone else's ID and see their information. I'd recommend added a key to your customer table.

The following is example code to add a new column that generates a guid for each record.

BEGIN TRANSACTION
GO
ALTER TABLE dbo.Customers ADD
    PublicCustomerID uniqueidentifier NULL
GO
ALTER TABLE dbo.Customers ADD CONSTRAINT
    DF_Customers_PublicCustomerID DEFAULT newid() FOR PublicCustomerID
GO
ALTER TABLE dbo.Customers SET (LOCK_ESCALATION = TABLE)
GO
COMMIT

Surrogate keys also work nicely for an email blast for providing a reasonable amount of protection of non-critical customer information.However if it's really important it's needs to be protected by an authentication system and not just an obfuscated customerID. However if it's a sales email blast, you would kill the conversion rate if they had to click the link and login to see the page instead of just click the link


Create a new field, called CustomerNumber which is unique in the system, but unrelated to the ID/primary key.

Edit: if you're sending this type of information out by e-mail, use some kind of encoding or encryption so that a random user couldn't guess another customer's information. It's really immaterial what the source of the key is, as long as it's unique. My first suggestion was just a general one for use in a system that deals with customers.


A thing to consider is to have a primary key and a natural key in the database. Primary key uniquely identifies a row in the table, while natural key uniquely identifies a customer. If you decide to keep customer history, you can add a new row with a new primary key, but same natural key for the customer -- then you mark a row "expired" or "current" in a separate column. This is frequently used in data warehousing for customer dimensions. In this case you would expose only natural keys, not primary keys. Joins still happen on primary keys.


It is not a good idea to expose plain ID to the public. The way I work around it is by encrypting the query string parameter.


I'll start with this. Using your surrogate key as a link back to some database object is VERY common. In fact, many popular frameworks do exactly that out of the box (ASP MVC, Ruby on Rails, CodeIgniter).

However, you might look into using some natural key instead. You could also try Base64 encoding the ID field. Even small measures like this will help prevent someone from "exploring" your database.


You can implement a security mechanism without requiring the users to login. You should not simply rely on obscurity to secure this data.

Requiring the ID in the URL as well as some sort of password (a randomly assigned hash) could be all the security needed.


The strong argument against using the primary key is it's easily hackable, well, hardly even hacking - just a matter of guessing input by changing the number a little. You're even exposing yourself unintentionally to a miss-type if the user tries to enter the link somewhere else manually - say if they print the email off to give to their boss who then types in the code for customer B instead of A.

That alone should be sufficient. At the very least I'd add a check-digit to the end of the code you send out to the users - the usual recursively add the digits algorithm is cheap and helps (a little).

Myself however I'd err towards using a generated GUID. Thanks to Microsoft and many others the concept of entering a 36 character string isn't as offensive to users now as it once was, most people will just be hitting a link anyway, and you have a simple, to hand. function that guarantees uniqueness so you're never going to trip over yourself in future. Simple and robust.


Yes, it is purer / cleaner / nicer to use something like a username instead of the actual primary key for external links. However, if there are no "natural" identifier to use, then use the primary key. As long as you are using it for identifying a row, and not putting some semantic value in it, there's nothing wrong with that.


Exposing the PK is a security risk. Users can go on a fishing expedition, and just try out a few interesting keys, and see what they can see this way. Let's say the link you're sending out is http://www.example.org/ShowCustomer?id=4711 . By looking at this report, a user only finds out info about customer #4711. A competitor could easily try out all customer ids starting at 1: http://www.example.org/ShowCustomer?id=1 and thus have a list of all of your customers. And what about id=0? or id=-1? Or SQL injections like http://www.example.org/ShowCustomer?id=1;DROP%20TABLE%20CUSTOMERS ?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜