Secure hashing of just a few million items
I have a database with companies and their clients. The database needs to be able to answer the question 'which companies have a client living at address X?', which is normally very straightforward to implement of course.
What I want to avoid is that an attacker can somehow find all company-client relations. This database will reside on a webserver and queries to it will be limited to avoid brute-force downloading of all the data.
But what if the server is compromised and an attacker has access to the whole database and possibly even any private keys that are stored on the server? It's ok if the attacker can find the list of companies, or the list of clients, but he should not be able to find out which companies are related to which client and preferably he should not be able to retrieve the address of each client.
Clients are identified by their address, not by any kind of unique ID. In my country there are only about 5 million different addresses. I thought of using a secure hash to protect the address, but it's very easy to compute 5 million hashes and build a mapping from hash to address, even if the hash is salted.
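To make the concern concrete, here is a minimal sketch of the attack, assuming the attacker has the database dump, the salt, and a list of every valid address. The function names, the salt value, and the toy address list are made up for illustration; SHA-256 stands in for whatever hash is actually used.

```python
import hashlib


def salted_hash(salt: bytes, address: str) -> str:
    """Hash an address the way the server is assumed to store it."""
    return hashlib.sha256(salt + address.encode("utf-8")).hexdigest()


def build_reverse_table(salt: bytes, candidate_addresses):
    """Attacker's step: hash every known address once, map hash -> address."""
    return {salted_hash(salt, a): a for a in candidate_addresses}


# Hypothetical data: a site-wide salt found on the compromised server and
# a small stand-in for the ~5 million real addresses.
salt = b"site-wide-salt"
all_addresses = [f"{n} Main Street" for n in range(1, 1001)]

# A value as it would appear in the stolen database.
stored = salted_hash(salt, "42 Main Street")

# One pass over the address list is enough to invert any stored hash.
table = build_reverse_table(salt, all_addresses)
recovered = table[stored]
```

A per-record salt would not change the picture much: the attacker would repeat the loop for each record instead of building one table, which is still cheap when there are only a few million candidates.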
The only thing I can think of is security through obscurity: I make sure that the hashing function is not clearly recognizable and is in compiled code, and hope that the attacker isn't smart enough to figure this all out.
Is there any way at all to make this really secure?
EDIT: the comments of a3_nm and Nick Johnson are correct of course: if an attacker has access to all the data, it cannot possibly be made secure. Thanks for pointing out this (obvious) flaw.
So I need something that is not stored in the database. To make sure only companies and clients can access their own data, I can encrypt it with their own password. So a company's list of clients would be encrypted with the password of that company, which will never be stored on the server and would have to be sent along with each request. I think it's ok for me to assume that the attacker cannot intercept the requests which contain the passwords.
Or is there another (obvious?) flaw in this line of thinking as well?
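The password idea from the edit could look roughly like the sketch below. It is only an illustration under several assumptions: the names `derive_key`, `address_tag` and `has_client` are invented, the iteration count is arbitrary, and because Python's standard library has no symmetric cipher, it swaps encryption for a keyed hash (HMAC) whose key is derived from the company's password. That is enough to check whether an address is in a company's list, but not to read the list back.

```python
import hashlib
import hmac
import os


def derive_key(password: str, salt: bytes) -> bytes:
    """Stretch the company's password into a key (never stored on the server)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000)


def address_tag(key: bytes, address: str) -> str:
    """Keyed hash of an address; cannot be recomputed without the key."""
    return hmac.new(key, address.encode("utf-8"), hashlib.sha256).hexdigest()


def has_client(stored_tags, password: str, salt: bytes, address: str) -> bool:
    """Server-side check, using the password sent along with the request."""
    key = derive_key(password, salt)
    return address_tag(key, address) in stored_tags


# Set-up for one hypothetical company. The salt is stored next to the tags
# and is not secret; the password is supplied by the company on each request.
salt = os.urandom(16)
key = derive_key("correct horse", salt)
stored_tags = {address_tag(key, a) for a in ["42 Main Street", "7 Oak Lane"]}
```

With only `salt` and `stored_tags` in hand, the attacker can no longer build the hash-to-address table directly; he first has to guess the password, so the scheme is only as strong as that password. It also means the server cannot check a company's list without that company's password being supplied.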
I'm not sure it is possible to secure that. It looks like you would like an attacker with total access to the server to be unable to get answers to queries that the server is supposed to answer. If the attacker has server access, he can use the server to answer any query the server can answer--there's no way to work around that.
You should not store your database directly on the webserver. Put it on a server that isn't directly accessible from the web. That will make things much harder for any attacker. I don't have a ready-made solution for you, but here is a better starting point: credit card information faces the same problem. Search for database modeling for such cases and you will find a solution.