Secure file server

Posted 2023-02-05 13:28

Introduction

I want to create a Java web application for storing and backing up user files, similar to Dropbox. One interesting Dropbox feature is that it can detect whether a given file already exists on the server. For example, if one user uploads a file to the server, another user who tries to upload the same file does not need to send the content again; the server only needs to record that this user also has the file. This saves bandwidth and storage space and speeds up uploads.

The most basic solution to this problem is to identify each file by a hash string, e.g. SHA-1 or MD5. The client software checks whether a given hash exists on the server. If it does, the client can skip the upload and the server marks the user as having the same file.
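As a rough illustration of the client-side half of that check, here is a minimal sketch that computes a file's hash as a hex string. The class and method names are hypothetical, and SHA-256 is used here instead of the SHA-1/MD5 mentioned above purely as an assumption; any digest supported by `MessageDigest` would slot in the same way.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: computes the hex digest a client would send to the server.
final class FileFingerprint {
    static String sha256Hex(Path file) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            // Stream the file in chunks so large files are not loaded into memory.
            try (InputStream in = Files.newInputStream(file)) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    digest.update(buffer, 0, read);
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

The client would then ask the REST service whether this hex string is already known before deciding to upload.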

Problem

The web application is built on a REST architecture so that users can easily write their own client software to upload files. For security reasons, SSL is enabled for all transactions. My main security concern is that, if I use SHA-1 or any other standard hash algorithm, a user could claim to have a file without actually owning it. SSL and encryption cannot prevent this. If a user manages to get the hash string (the MD5 and SHA-1 hashes of many files can be found by googling), he can mark himself as having the file through the web application's REST service.

So one possible solution is for the server to request a set of random bytes from the file in addition to the hash of the whole file. Here are example steps:

The client checks whether a given hash exists on the server. If the file already exists, the server returns the positions of the random bytes it requires.
The client sends the bytes at the requested positions. Client software cannot respond correctly without having the actual file.

In this way, the scheme saves bandwidth while also ensuring that users own the files they want to upload.
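The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `ByteChallenge` and its methods are hypothetical names, and the file is held as an in-memory `byte[]` for brevity, whereas a real server would read the requested offsets from its stored copy.

```java
import java.security.MessageDigest;
import java.security.SecureRandom;

// Hypothetical sketch of the "random bytes" ownership check.
final class ByteChallenge {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Server side: choose `count` random offsets inside a file of the given length. */
    static int[] pickPositions(int fileLength, int count) {
        return RANDOM.ints(count, 0, fileLength).toArray();
    }

    /** Client side: read the bytes found at the requested offsets. */
    static byte[] answer(byte[] fileContents, int[] positions) {
        byte[] out = new byte[positions.length];
        for (int i = 0; i < positions.length; i++) {
            out[i] = fileContents[positions[i]];
        }
        return out;
    }

    /** Server side: compare the client's response with the bytes in the stored copy. */
    static boolean verify(byte[] storedContents, int[] positions, byte[] response) {
        return MessageDigest.isEqual(answer(storedContents, positions), response);
    }
}
```

The positions must be freshly chosen for every request; if they were fixed, the expected bytes could be published alongside the hash.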

Question

I am no expert in web security, so I have no idea whether this is a good idea. I have read articles saying that implementing your own fancy process can weaken security, because the scheme has not been tested and the extra information may give attackers a way in.

Does anyone have any comments on the process?

Will it reduce the security?

Does anyone have an idea to solve this problem differently?

I understand that there might not be an exact answer to this question, but I would like to hear if anyone has encountered the same problem and has a good solution to it.

Rather than asking the client to upload some random bytes of the file's contents, it may be better to ask the client to upload the hash of a random region of the file. That way the region you ask the client to verify can be of a much wider range of sizes.
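A minimal sketch of that region-hash variant, assuming SHA-256 and an in-memory `byte[]` standing in for the file; the class `RegionChallenge` and its methods are hypothetical names introduced only for illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Hypothetical sketch: the server names a random region, the client hashes it.
final class RegionChallenge {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Server side: returns {offset, length} of a random region inside the file. */
    static int[] pickRegion(int fileLength, int maxRegionLength) {
        int maxLength = Math.min(maxRegionLength, fileLength);
        int length = 1 + RANDOM.nextInt(maxLength);           // 1..maxLength
        int offset = RANDOM.nextInt(fileLength - length + 1); // region stays inside the file
        return new int[] {offset, length};
    }

    /** Both sides: hash only the bytes in [offset, offset + length). */
    static byte[] regionHash(byte[] fileContents, int offset, int length) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(fileContents, offset, length);
            return digest.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

The server would compute `regionHash` on its stored copy and compare it with the client's value, ideally with a constant-time comparison such as `MessageDigest.isEqual`.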

Better yet, though, may be to send the client a random number and require the client to compute an HMAC of the entire file's contents using that number as the key. This is more computationally expensive, since the server must compute the HMAC too, but it verifies that the client has the entire file, not just a small portion of it.
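One way this HMAC challenge might look in Java, using the standard `javax.crypto.Mac` API. The class `HmacChallenge`, the 32-byte challenge size, and the choice of HmacSHA256 are assumptions made for the sketch, and the file is again an in-memory `byte[]` rather than a stream.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: a fresh random key per request, HMAC over the whole file.
final class HmacChallenge {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Server side: generate a one-time random key to send to the client. */
    static byte[] newChallenge() {
        byte[] key = new byte[32];
        RANDOM.nextBytes(key);
        return key;
    }

    /** Client side (and server side, on its stored copy): HMAC of the full contents. */
    static byte[] respond(byte[] challenge, byte[] fileContents) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(challenge, "HmacSHA256"));
            return mac.doFinal(fileContents);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 not available", e);
        }
    }

    /** Server side: recompute over the stored copy and compare in constant time. */
    static boolean verify(byte[] challenge, byte[] storedContents, byte[] response) {
        return MessageDigest.isEqual(respond(challenge, storedContents), response);
    }
}
```

Because the key changes on every request, a response captured or published earlier is of no use for a later challenge.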

One unavoidable side effect of this hash feature, even with a verification scheme, is that it reveals that a copy of the file already exists somewhere on the server. That by itself may be sensitive information.

For the most stringent privacy protection, you should forgo this feature and make each user upload their own copy of the file. The server can still compare hashes to avoid storing multiple copies of the file, transparently to the clients.
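Server-side deduplication of this kind could be sketched as below. Everything here is illustrative: `DedupStore` is a hypothetical in-memory stand-in for real storage, and SHA-256 is an assumed choice of hash. Every user uploads the full content, and the server quietly keeps one copy per distinct hash.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory store: one blob per distinct hash, many owners per blob.
final class DedupStore {
    private final Map<String, byte[]> blobsByHash = new HashMap<>();
    private final Map<String, Set<String>> hashesByUser = new HashMap<>();

    /** Always receives the full contents; stores them only if the hash is new. */
    String upload(String user, byte[] contents) {
        String hash = sha256Hex(contents);
        blobsByHash.putIfAbsent(hash, contents.clone());
        hashesByUser.computeIfAbsent(user, u -> new HashSet<>()).add(hash);
        return hash;
    }

    boolean owns(String user, String hash) {
        return hashesByUser.getOrDefault(user, Set.of()).contains(hash);
    }

    int storedBlobCount() {
        return blobsByHash.size();
    }

    private static String sha256Hex(byte[] data) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-256").digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

This keeps the storage saving but gives up the bandwidth saving, which is the trade-off described above.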
