Techniques for never losing data
I'm curious about the techniques used to build a system where ensuring that no data is lost is of the utmost priority. For a simple example, what does a financial institution do to make sure that when money is transferred between accounts, once it is withdrawn from one account it is without a doubt deposited into the other? I'm not so much looking for particular techniques like database transactions, but for larger, more architectural concepts, like how the data is preserved if a server goes down, or a queue runs out of space, or whatever.
If someone could point me to books or articles on the subject, I'd be much obliged.
You should read about Automated Teller Machine, Online transaction processing, and other topics on data encryption; also consider using HTTPS if you are building web sites.
The basic technique is removing any single point of failure. Anything that can fail in your setup needs to have a backup, or multiple backups: multiple switches, servers, UPSs, hard drives, etc. Databases are constantly being replicated, and data is backed up and stored off site in case of a fire or other disaster that could compromise the building.
It can all really boil down to having the same data in two places: from code that holds a cache prior to committing data, all the way up to server redundancy.
The only way to make sure you don't lose something is to have multiple copies of it.
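To make that concrete, here is a minimal sketch of "multiple copies before acknowledging": a write is not reported as safe until every copy is on stable storage. The file paths are illustrative, not a real storage layout.

```python
import json, os

# Don't acknowledge a write until both copies are safely on disk.
def durable_write(record, paths=("copy_a.log", "copy_b.log")):
    line = json.dumps(record) + "\n"
    for path in paths:
        with open(path, "a") as f:
            f.write(line)
            f.flush()
            os.fsync(f.fileno())  # don't trust the OS cache with the only copy
    return True  # only now can the caller treat the data as safe

durable_write({"tx": 1, "amount": 100})
```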
In the case of the bank example, each bank would keep a record of every transaction, stating how much, from where, to where, and in what time order. Later, if there is a problem, you compare the two transaction logs; if they don't match, you can identify the missing transactions (a sketch of that comparison follows below). This also covers the problem that one bank can't trust another to keep records for it. Since they cross-check each other, this is almost a distributed transaction protocol.
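A hypothetical reconciliation sketch: each bank keeps its own log, and comparing the two pinpoints the transactions one side never recorded. The (id, from, to, amount) tuple format is made up for the example.

```python
from collections import Counter

def reconcile(our_log, their_log):
    # Multiset difference in each direction finds missing transactions.
    ours, theirs = Counter(our_log), Counter(their_log)
    return list((ours - theirs).elements()), list((theirs - ours).elements())

our_log   = [("tx1", "A", "B", 100), ("tx2", "A", "C", 50)]
their_log = [("tx1", "A", "B", 100)]

missing_at_them, missing_at_us = reconcile(our_log, their_log)
print(missing_at_them)  # [('tx2', 'A', 'C', 50)] -- tx2 was never received
```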
You might want to read up on XA or X/Open transactions, which can coordinate multiple systems, including databases and queues, into ACID, DB-like transactions.
I've not worked with it, but I've heard it can be expensive, both in latency and computationally. But then again, how much is your data integrity worth?
http://en.wikipedia.org/wiki/X/Open_XA
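The protocol at the heart of XA is a two-phase commit: nobody commits until everyone has promised they can. A toy sketch of the idea follows; `Participant` is a stand-in for a real resource manager (database, message queue), not a real XA interface.

```python
class Participant:
    def __init__(self, name):
        self.name = name
        self.staged = None

    def prepare(self, work):
        self.staged = work      # phase 1: durably stage the work, vote yes
        return True

    def commit(self):
        print(f"{self.name}: committed {self.staged!r}")

    def rollback(self):
        self.staged = None      # abort: discard the staged work

def two_phase_commit(participants, work):
    if all(p.prepare(work) for p in participants):  # phase 1: collect votes
        for p in participants:                      # phase 2: all commit
            p.commit()
        return True
    for p in participants:                          # any "no": all roll back
        p.rollback()
    return False

two_phase_commit([Participant("accounts-db"), Participant("audit-queue")],
                 "debit A 100 / credit B 100")
```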
As you've alluded to, there are various mechanisms (like transactions) for ensuring the software-based "handshake" is reliable and completes successfully.
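For your money-transfer example, that handshake at the database level means the debit and the credit either both happen or neither does. A minimal sketch using SQLite; the table and account names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")

with conn:  # opens a transaction: commit on success, rollback on any error
    conn.execute("UPDATE accounts SET balance = balance - 40 WHERE id = 'alice'")
    conn.execute("UPDATE accounts SET balance = balance + 40 WHERE id = 'bob'")

print(conn.execute("SELECT id, balance FROM accounts").fetchall())
# [('alice', 60), ('bob', 40)] -- never a state where the money is half-moved
```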
Architecturally, yes, having two copies of stuff gives you redundancy, which helps you not lose stuff. Beyond that:
- Clear processes: people need to know exactly where information is going, both in sunny-day scenarios and when the brown stuff hits the fan. Having the data but not being able to find it or recognise it is just as bad as losing it. The clearer (and better documented) your processes are, the better.
- Consistency: automated is obviously better than random human error.
- To specifically answer your question: the above points should be echoed in an architecture and design that is clear and that clearly separates concerns.
- Reduce points of failure as much as possible.
- Focus attention on higher risk areas.
- Use proven techniques (I guess that's what you're actually asking for).
- Keep things as simple as possible.
I worked on a solution architecture for an off-the-shelf document management system a while back; no loss of data was the big driver. The system was rolled out nationally, so it was multi-site, in terms of both 'regional' caches for servicing local users and actual 'data centers'. Some points of interest:
- All components (where possible) were deployed onto virtual boxes, which were backed by a SAN, so in the event of a physical host going down we could restore service faster. In terms of data loss, it means users are more likely to be able to use the protected system than to store stuff locally when the system is down.
- Also, the SAN was seen as safer than local disks.
- The above was part of the existing set-up, so nothing new for Ops to learn.
- Failover site, with replication. This wasn't real-time, and was augmented by the transaction logs on the databases (a sketch of the log-shipping idea follows this list).
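An illustrative log-shipping sketch, the idea behind augmenting non-real-time replication with transaction logs: every change is appended durably to a log, and a standby replays the log to rebuild state after failover. The file name and record format are assumptions for the example.

```python
import json, os

def append_change(log_path, record):
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # the change survives a crash once this returns

def replay(log_path):
    # The standby rebuilds state by applying every logged change, in order.
    state = {}
    with open(log_path) as f:
        for line in f:
            rec = json.loads(line)
            state[rec["acct"]] = state.get(rec["acct"], 0) + rec["delta"]
    return state

append_change("txn.log", {"acct": "alice", "delta": -40})
append_change("txn.log", {"acct": "bob", "delta": +40})
print(replay("txn.log"))  # {'alice': -40, 'bob': 40}
```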
I guess none of this is heavily software centered, but I do think that all the good software architecture / design principles "we" use helped guide my thinking.