Mnesia Fragmentation and replication: resultant availability and reliability
Following the answers to the question I asked recently about Mnesia fragmentation, I still have a number of challenges. Consider the following scenario (my questions are based on what follows below):
You have a data-driven enterprise application which should be highly available within the enterprise. If the internal information source is down for any reason, the enterprise applications must switch to fetching data from a recovery center which is offsite (remote).
You decide to have the database replicated onto two nodes within the enterprise (referred to as DB Side A and DB Side B). These two run on separate hardware but are linked together with, say, a Fast Ethernet or optical fibre link. Logically, you create some kind of tunnel or secure communication channel between these two Mnesia DBs. The two (A and B) should hold the same replica of the data and be in sync all the time.
Meanwhile, the recovery center must also hold the same copy of the data and stay in sync all the time, in case local data access is cut off due to an attack or hardware failure. So the same database schema must be replicated across the 3 sites (Side A, Side B and recovery center).
Now, within the enterprise, the application middleware is capable of switching data requests amongst the database sites. If A is down, then without the application realizing it, the request is re-routed to database B, and so on. The middleware layer can be configured to do load balancing (request multiplexing) or to apply flexible failover techniques.
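Further Analysis: At database/schema creation time, all involved nodes must be up and reachable (mnesia:create_schema/1 itself requires Mnesia to be stopped on them), and Mnesia must then be running on all of them before any table is created. To achieve this, you create, say, 'db_side_A@domain.com', 'db_side_B@domain.com' and finally 'db_recovery_center@domain.com'.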
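The schema step above could be sketched roughly as follows. This is a minimal sketch, not a definitive procedure: the node names come from the scenario, while the module name db_bootstrap and the function init_schema/0 are hypothetical. It assumes the three Erlang nodes are already started, share a cookie, and do not yet have a schema.

```erlang
-module(db_bootstrap).
-export([init_schema/0]).

%% Node names are taken from the scenario; everything else is illustrative.
-define(NODES, ['db_side_A@domain.com',
                'db_side_B@domain.com',
                'db_recovery_center@domain.com']).

%% Run once, from any one of the three nodes, while Mnesia is stopped everywhere.
init_schema() ->
    %% Crash early (badmatch) if any node is not reachable as an Erlang node.
    [] = [N || N <- ?NODES, net_adm:ping(N) =/= pong],
    %% Create a disc-resident schema on every listed node.
    ok = mnesia:create_schema(?NODES),
    %% Start Mnesia on all nodes so that tables can be created afterwards.
    %% Returns {ResultsPerNode, NodesThatCouldNotBeCalled}.
    rpc:multicall(?NODES, mnesia, start, []).
```

If init_schema/0 returns a non-empty second element, Mnesia did not start on those nodes and table creation with all three in the node_pool should not be attempted yet.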
Now, at table creation, you want to have your Mnesia tables fragmented, so you decide on the following parameters:
n_disc_only_copies =:= number of nodes involved in the pool =:= 3
Reason: You are following the documentation, which says that this parameter regulates how many disc_only_copies replicas each fragment should have. So you want each table to have each of its fragments on each Mnesia node.
node_pool =:= all nodes involved =:= ['db_side_A@domain.com', 'db_side_B@domain.com', 'db_recovery_center@domain.com']
All your tables are then created based on the following arrangement:
    Nodes = ['db_side_A@domain.com',
             'db_side_B@domain.com',
             'db_recovery_center@domain.com'],
    No_of_fragments = 16,
    {atomic, ok} = mnesia:create_table(TABLE_NAME, [
        {frag_properties, [
            {node_pool, Nodes},
            {n_fragments, No_of_fragments},
            {n_disc_only_copies, length(Nodes)}
        ]},
        {index, []},
        {attributes, record_info(fields, RECORD_NAME_HERE)}
    ]),
NOTE: In the syntax above, RECORD_NAME_HERE cannot be a variable in reality, since records must be known at compile time with Erlang.
From the installation, you see that for each table, every fragment, say table_name_frag2, appears on every node's file system.
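The same observation can be checked from the Erlang shell rather than from the file system. The sketch below is only illustrative: the module name frag_inspect and its function names are hypothetical, while the mnesia_frag access module and table_info items such as frag_names and frag_dist are the ones described in the Mnesia User's Guide chapter on table fragmentation.

```erlang
-module(frag_inspect).
-export([frag_info/2, read/2]).

%% Fragment-related table_info items (n_fragments, frag_names, frag_dist, ...)
%% are only available when the call runs under the mnesia_frag access module.
frag_info(Tab, Item) ->
    F = fun() -> mnesia:table_info(Tab, Item) end,
    mnesia:activity(sync_dirty, F, [], mnesia_frag).

%% Reads must also go through mnesia_frag so that the key is hashed
%% to the fragment that actually holds the record.
read(Tab, Key) ->
    F = fun() -> mnesia:read({Tab, Key}) end,
    mnesia:activity(transaction, F, [], mnesia_frag).
```

For example, frag_inspect:frag_info(table_name, frag_dist) should list each node of the pool together with the number of fragment replicas it holds, where table_name stands for whichever table was created.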
Challenges and arising Questions:
After following what is listed above, your first database start is okay since Mnesia is running on all nodes. Several challenges start to show up as the application runs, and I am listing them below:
Suppose you decide that all writes are first tried on DB Side A; if Side A is unavailable at that instant, the call is retried on DB Side B, and so on to the recovery center. If the call fails to return on all 3 database nodes, the application network middleware layer reports back that the database servers are all unavailable. (This decision could have been influenced by the fact that if you let applications write randomly to your Mnesia replicas, it is very possible to get inconsistent database errors if your Mnesia nodes lose their network connection with each other while writes are still being committed on each of them by different Erlang applications. If you decide on having master_nodes, then you could be at risk of losing data.) So, by behavior, you are forcing DB Side A to be the master. This leaves the other database nodes idle for as long as DB Side A is up and running: however many requests hit Side A, as long as it does not go down, no request will hit Side B or the recovery center at all.
Mnesia, on start, normally should see all involved nodes running (Mnesia must be running on all involved nodes) so that it can do its negotiations and consistency checks. It means that if Mnesia goes down on all nodes, Mnesia must be started on all nodes before it can fully initialize and load tables. It is even worse if the Erlang VM dies along with Mnesia on a remote site. Several tweaks and scripts here and there could help restart the entire VM plus the intended applications if it goes down.
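The ordered retry described in the first point could look roughly like this on the middleware side. It is a sketch under assumptions: the module name db_failover, the 5 second timeout and the return values are made up for illustration, and the middleware node is assumed to be connected to all three database nodes.

```erlang
-module(db_failover).
-export([write/1]).

%% Preference order from the scenario: Side A, then Side B, then recovery center.
-define(ORDER, ['db_side_A@domain.com',
                'db_side_B@domain.com',
                'db_recovery_center@domain.com']).
-define(TIMEOUT, 5000). %% milliseconds; an arbitrary illustrative value

%% Record is a tuple whose first element is the (fragmented) table name.
write(Record) ->
    try_nodes(?ORDER, Record).

try_nodes([], _Record) ->
    {error, all_db_nodes_unavailable};
try_nodes([Node | Rest], Record) ->
    %% An external fun (fun mnesia:write/1) is used so that no local
    %% closure has to be shipped to the remote node.
    case rpc:call(Node, mnesia, activity,
                  [transaction, fun mnesia:write/1, [Record], mnesia_frag],
                  ?TIMEOUT) of
        ok ->
            {ok, Node};
        {badrpc, _Reason} ->
            %% Node down, timeout, or aborted transaction: try the next node.
            try_nodes(Rest, Record)
    end.
```

Note that a timeout does not prove the write was not committed on that node, so a retry on the next node is only safe when writing the same record twice is harmless.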
To cut a long story short, let me go to the questions.
Questions:
What would a database administrator do if Mnesia generates events of inconsistent_database, starting to run database behind a partitioned network, in a situation where setting a Mnesia master node is not desirable (for fear of data loss)?
What is the consequence of the Mnesia event inconsistent_database, starting to run database behind a partitioned network as regards my application? What if I do not react to this event and let things continue the way they are? Am I losing data?
In large Mnesia clusters, what can one do if Mnesia goes down together with the Erlang VM on a remote site? Are there any known good methods of automatically handling this situation?
There are times when one or two nodes are unreachable due to network problems or failures, and Mnesia on the surviving node reports that a given file does not exist, especially in cases where you have indexes. So at run time, what would be the behavior of my application if some replicas go down? Would you advise me to have a master node within a Mnesia cluster?
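For reference, this is roughly how the events in question can be observed from inside the application. It is a minimal sketch: the module name db_event_logger is hypothetical, and it only logs the events; it does not resolve a partition. It relies on mnesia:subscribe(system) and on the documented event shape {inconsistent_database, Context, Node}, where Context is, for example, starting_partitioned_network or running_partitioned_network.

```erlang
-module(db_event_logger).
-export([start/0]).

%% Spawns a process that subscribes to Mnesia system events on the local node.
%% Mnesia must already be running here, otherwise the match below fails.
start() ->
    spawn(fun() ->
              {ok, _Node} = mnesia:subscribe(system),
              loop()
          end).

loop() ->
    receive
        {mnesia_system_event, {inconsistent_database, Context, Node}} ->
            error_logger:warning_msg(
              "Mnesia inconsistent_database (~p) detected against ~p~n",
              [Context, Node]),
            loop();
        {mnesia_system_event, {mnesia_down, Node}} ->
            error_logger:warning_msg("Mnesia went down on ~p~n", [Node]),
            loop();
        {mnesia_system_event, _Other} ->
            %% Other system events are ignored in this sketch.
            loop()
    end.
```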
As you answer the questions above, you could also comment on the layout described at the beginning and whether or not it would ensure availability. You can share your personal experiences of working with fragmented and replicated Mnesia databases in production. With reference to the linked question mentioned at the very beginning of this text, do suggest alternative settings that could offer more reliability at database creation, say in terms of the number of fragments, operating system dependencies, node pool size, table copy types, etc.