How do I specify columns when loading new rows into PostgreSQL using pg_bulkload
I'm experimenting with the pg_bulkload project to import millions of rows of data into a database. However, none of the new rows have a primary key, and only two of the several columns are available in my input file. How do I tell pg_bulkload which columns I'm importing, and how do I generate the primary key field? Do I need to edit my import file to match exactly what the output of a COPY command would be and generate the id field myself?
For example, let's say my database columns are:
id title body published
The data that I have is limited to title and published, and is listed in a tab-delimited file. My .ctl file looks like this:
TABLE = posts
INFILE = stdin
TYPE = CSV
DELIMITER = "	"    # a double-quoted tab character
You can use the FILTER functionality of pg_bulkload. Something like this.
In the database:
CREATE FUNCTION pg_bulkload_filter(text, text) RETURNS record
AS $$
SELECT nextval('tablename_id_seq'), NULL, NULL, $1, $2, NULL
$$ LANGUAGE SQL;
And in the pg_bulkload control file:
FILTER = pg_bulkload_filter
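Adapted to the four-column posts table from the question, the filter might look like the sketch below. The column types are assumptions (the question only names the columns): id is taken to be a serial integer backed by the default sequence posts_id_seq, title and body are text, and published is a timestamp. The function name posts_filter is hypothetical. The idea is that the function takes one text argument per field in the input file and returns one value per table column, in table order, with types matching the table definition.

```sql
-- Assumed table definition (hypothetical types, not given in the question):
--   CREATE TABLE posts (id serial PRIMARY KEY, title text, body text, published timestamp);

-- One text argument per input-file field (title, published);
-- one returned value per table column, in table order.
CREATE FUNCTION posts_filter(text, text) RETURNS record
AS $$
    SELECT nextval('posts_id_seq')::integer,  -- id: generated from the assumed serial sequence
           $1,                                -- title: first field of the input file
           NULL::text,                        -- body: not present in the input file
           $2::timestamp                      -- published: second field, cast to the assumed type
$$ LANGUAGE SQL;
```

With this function in place, the control file would use FILTER = posts_filter. If the real column types differ, adjust the casts so that the returned record matches the table definition.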