
offline limited multi-master in Postgres

Site A will be generating a set of records. Nightly they will back up their database and FTP it to Site B. Site B will not be modifying those records at all, but will be adding more records, and other tables there will have foreign keys referencing Site A's records.

So, essentially, I need to set up a system that takes all the incremental changes from Site A's dump (mostly inserts and updates, but some deletes are possible) and applies them at Site B.

At this point, we're using Postgres 8.3, but could upgrade if valuable.

I believe I can do this relatively straightforwardly with Bucardo, but I'd love to hear alternatives (or confirmation of Bucardo) before I set up a Linux box to test it out.


Almost any replication solution would do the trick; the Postgres Wiki has a chapter on the topic. But your case is simple enough that I would just use dblink.
This is generalized from a working implementation of mine:

  1. Create a view in the master db that returns updated rows.
    Let's call it myview.
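
For illustration, a minimal sketch of such a view, assuming the source table is named mytbl and carries a last-modified timestamp column updated_at (both names are assumptions, not given above):

CREATE VIEW myview AS
SELECT id, col_a, col_b, col_c
FROM   mytbl
WHERE  updated_at >= now() - interval '1 day';  -- rows touched since the last nightly sync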

  2. Create one function per table in the slave db that fetches rows via dblink:

CREATE OR REPLACE FUNCTION f_lnk_mytbl()
  RETURNS TABLE(id integer, col_a integer, col_b text, col_c text) AS
$func$
   -- Runs the query over the unnamed dblink connection opened by the caller.
   SELECT *
   FROM   public.dblink('SELECT id, col_a, col_b, col_c FROM myview')
                      AS b(id integer, col_a integer, col_b text, col_c text);
$func$  LANGUAGE sql SECURITY DEFINER;

REVOKE ALL ON FUNCTION f_lnk_mytbl() FROM public;
GRANT EXECUTE ON FUNCTION f_lnk_mytbl() TO my_user;
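
Note that the dblink module must be installed in the slave db first: on 8.3 that means running the SQL script shipped with the dblink contrib module; on 9.1 or later you can use the extension mechanism instead:

CREATE EXTENSION IF NOT EXISTS dblink;   -- PostgreSQL 9.1 or later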
  3. Use the above function in another function in the slave db that opens and closes the server connection.

CREATE OR REPLACE FUNCTION f_mysync()
  RETURNS void AS
$func$
BEGIN
   PERFORM dblink_connect(
          'hostaddr=123.45.67.89 port=5432 dbname=mydb user=postgres password=secret');

   -- Fetch data into local temporary table for fast processing.
   CREATE TEMP TABLE tmp_i ON COMMIT DROP AS
   SELECT * FROM f_lnk_mytbl();

   -- *Or* read local files into temp tables with COPY so you don't need dblink
   -- (see the sketch after this function).

   -- UPDATE rows that are already there (instead of DELETE + re-INSERT, to preserve FK integrity).
   UPDATE mytbl m
   SET   (  col_a,   col_b,   col_c) =
         (i.col_a, i.col_b, i.col_c)
   FROM   tmp_i i
   WHERE  m.id = i.id
   AND   (m.col_a, m.col_b, m.col_c) IS DISTINCT FROM
         (i.col_a, i.col_b, i.col_c);

   -- INSERT new rows
   INSERT INTO mytbl
   SELECT * FROM tmp_i i
   WHERE  NOT EXISTS (SELECT 1 FROM mytbl m WHERE m.id = i.id);

   -- DELETE anything? More tables?

   PERFORM dblink_disconnect();
END
$func$  LANGUAGE plpgsql SECURITY DEFINER;

REVOKE ALL ON FUNCTION f_mysync() FROM public;
GRANT EXECUTE ON FUNCTION f_mysync() TO my_user;
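
The comment in f_mysync() mentions loading local files with COPY instead of going through dblink. Since Site A ships a nightly dump anyway, here is a minimal sketch of that variant, assuming the exported rows arrive as a CSV file at a path of your choosing (the path below is an assumption):

CREATE OR REPLACE FUNCTION f_mysync_from_file()
  RETURNS void AS
$func$
BEGIN
   -- Stage the exported rows from a local CSV file instead of fetching via dblink.
   CREATE TEMP TABLE tmp_i ON COMMIT DROP AS
   SELECT * FROM mytbl LIMIT 0;   -- empty copy of the target table's structure

   -- Hypothetical file path; server-side COPY needs superuser rights (hence SECURITY DEFINER).
   COPY tmp_i FROM '/var/lib/postgresql/import/mytbl.csv' WITH CSV HEADER;

   -- Then UPDATE / INSERT from tmp_i exactly as in f_mysync() above.
   UPDATE mytbl m
   SET   (  col_a,   col_b,   col_c) =
         (i.col_a, i.col_b, i.col_c)
   FROM   tmp_i i
   WHERE  m.id = i.id
   AND   (m.col_a, m.col_b, m.col_c) IS DISTINCT FROM
         (i.col_a, i.col_b, i.col_c);

   INSERT INTO mytbl
   SELECT * FROM tmp_i i
   WHERE  NOT EXISTS (SELECT 1 FROM mytbl m WHERE m.id = i.id);
END
$func$  LANGUAGE plpgsql SECURITY DEFINER;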
  4. Now this call is all you need. Call it as superuser or as my_user, and schedule it with a cron job or similar.

SELECT f_mysync();
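
If you prefer scheduling from inside the database, the pg_cron extension (a separate extension, far newer than 8.3, and not mentioned above) could run the same call nightly; a sketch:

SELECT cron.schedule('nightly-sync', '0 3 * * *', 'SELECT f_mysync()');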

In PostgreSQL 9.1 or later there is also the new CREATE FOREIGN TABLE. Might be more elegant.
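
A rough sketch of that route, assuming the postgres_fdw foreign data wrapper that ships with PostgreSQL 9.3 or later (connection details and column names carried over from the example above):

CREATE EXTENSION postgres_fdw;

CREATE SERVER site_a
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host '123.45.67.89', port '5432', dbname 'mydb');

CREATE USER MAPPING FOR my_user
  SERVER site_a
  OPTIONS (user 'postgres', password 'secret');

CREATE FOREIGN TABLE myview_remote (
  id    integer,
  col_a integer,
  col_b text,
  col_c text
)
SERVER site_a
OPTIONS (schema_name 'public', table_name 'myview');

-- The sync function can then read from myview_remote directly, with no dblink calls.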
