how to tap into PostgreSQL DDL parser?
The DDL / SQL scripts used to create my PostgreSQL database are under version control. In theory, any change to the database model is tracked in the source code repo开发者_StackOverflow中文版sitory.
In practice however, it happens that the structure of a live database is altered 'on the fly' and if any client scripts fail to insert / select / etc. data, I am put in charge of fixing the problem.
It would help me enormously if I could run a quick test to verify the database still corresponds to the creation scripts in the repo, i.e. is still the 'official' version.
I started using pgTAP for that purpose and so far, it works great. However, whenever a controlled, approved change is done to the DB, the test scripts need changing, too.
Therefore, I considered creating the test scripts automatically. One general approach could be to
- run the scripts to create the DB
- access DB metadata on the server
- use that metadata to generate test code
I would prefer though not having to create the DB, but instead read the DB creation scripts directly. I tried to google a way to tap into the DDL parser and get some kind of metadata representation I could use, but so far, I have learned a lot about PostgreSQL internals, but couldn't really find a solution to the issue.
Can someone think of a way to have a PostgreSQL DDL script parsed ?
Here is my method for ensuring that the live database schema matches the schema definition under version control: As part of the "build" routine of your database schema, set up a temporary database instance, load in all the schema creation scripts the way it was intended, then run a pg_dump -s
off that, and compare that with a schema dump of your production database. Depending your exact circumstances, you might need to run a little bit of sed
over the final product to get an exact match, but it's usually possible.
You can automate this procedure completely. Run the database "build" on SCM checking (using a build bot, continuous integration server, or similar), and get the dumps from the live instance by a cron job. Of course, this way you'd get an alert every time someone checks in a database change, so you'll have to tweak the specifics a little.
There is no pgTAP involved there. I love pgTAP and use it for unit testing database functions and the like (also done on the CI server), but not for verifying schema properties, because the above procedure makes that unnecessary. (Also, generating tests automatically from what they are supposed to test seems a little bit like the wrong way around.)
There is a lot of database metadata to be concerned about here. I've been poking around the relevant database internals for a few years, and I wouldn't consider the project you're considering feasible to build without dumping a few man months of programming time just to get a rough alpha quality tool that handles some specific subset of changes you're concerned about supporting. If this were easy, there wouldn't be a long standing (as in: people have wanted it for a decade) open item to build DDL Triggers into the database, which is exactly the thing you'd like to have here.
In practice, there are two popular techniques people use to make this class of problem easier to deal with:
- Set log_statement to 'ddl' and try to parse the changes it records.
- Use pg_dump --schema-only to make a regular snapshot of the database structure. Put that under version control, and use changes in its diff to find the information you're looking for.
Actually taking either of these and deriving the pgTAP scripts you want directly is its own challenge. If the change made is small enough, you might be able to automate that to some degree. At least you'd be starting with a reasonably sized problem to approach from that angle.
精彩评论