XML output from MySQL query

Is there any chance of getting the output from a MySQL query directly as XML?

I'm referring to something like what MSSQL offers with its SQL-XML plugin, for example:

SELECT * FROM table WHERE 1 FOR XML AUTO

returns text (or, to be precise, the xml data type in MSSQL) containing XML markup generated from the columns of the table.

With SQL-XML there is also an option of explicitly defining the output XML structure like this:

SELECT
  1       AS tag,
  NULL    AS parent,
  emp_id  AS [employee!1!emp_id],
  cust_id    AS [customer!2!cust_id],
  region    AS [customer!2!region]
 FROM table
 FOR XML EXPLICIT

which generates XML like this:

<employee emp_id='129'>
   <customer cust_id='107' region='Eastern'/>
</employee>

Do you have any clues how to achieve this in MySQL?

Thanks in advance for your answers.


The mysql command can output XML directly, using the --xml option, which is available at least as far back as MySQL 4.1.

However, this doesn't allow you to customize the structure of the XML output. It will output something like this:

<?xml version="1.0"?>
<resultset statement="SELECT * FROM orders" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
    <field name="emp_id">129</field>
    <field name="cust_id">107</field>
    <field name="region">Eastern</field>
  </row>
</resultset>

And you want:

<?xml version="1.0"?>
<orders>
  <employee emp_id="129">
    <customer cust_id="107" region="Eastern"/>
  </employee>
</orders>

The transformation can be done with XSLT using a script like this:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="resultset">
    <orders>
      <xsl:apply-templates/>
    </orders>
  </xsl:template>

  <xsl:template match="row">
    <employee emp_id="{field[@name='emp_id']}">
      <customer
        cust_id="{field[@name='cust_id']}"
        region="{field[@name='region']}"/>
    </employee>
  </xsl:template>

</xsl:stylesheet>

This is obviously way more verbose than the concise MSSQL syntax, but on the other hand it is a lot more powerful and can do all sorts of things that wouldn't be possible in MSSQL.

If you use a command-line XSLT processor such as xsltproc or saxon, you can pipe the output of mysql directly into the XSLT program. For example (-X is the short form of --xml):

mysql -e 'select * from table' -X database | xsltproc script.xsl -
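
If no XSLT processor is at hand, the same reshaping can be sketched in Python with the standard library. This is only an illustration, not part of the answer above: the function name reshape is made up for this example, and the input is the sample --xml output shown earlier.

```python
import xml.etree.ElementTree as ET

# Sample of what `mysql --xml` prints, copied from the answer above.
MYSQL_XML = """<?xml version="1.0"?>
<resultset statement="SELECT * FROM orders" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
    <field name="emp_id">129</field>
    <field name="cust_id">107</field>
    <field name="region">Eastern</field>
  </row>
</resultset>
"""


def reshape(mysql_xml: str) -> ET.Element:
    """Turn <resultset>/<row>/<field> markup into <orders>/<employee>/<customer>.

    `reshape` is a hypothetical helper name used only for this sketch.
    """
    resultset = ET.fromstring(mysql_xml)
    orders = ET.Element("orders")
    for row in resultset.iter("row"):
        # Collect the fields of this row into a dict keyed by column name.
        fields = {f.get("name"): (f.text or "") for f in row.iter("field")}
        employee = ET.SubElement(orders, "employee", emp_id=fields["emp_id"])
        ET.SubElement(employee, "customer",
                      cust_id=fields["cust_id"], region=fields["region"])
    return orders


if __name__ == "__main__":
    print(ET.tostring(reshape(MYSQL_XML), encoding="unicode"))
```

ElementTree escapes attribute values on output, so values containing & or quotes should not break the markup.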


The article "Using XML with MySQL" seems to be a good place to start; it covers several different ways of getting from a MySQL query to XML.

From the article:

   use strict;
   use DBI;
   use XML::Generator::DBI;
   use XML::Handler::YAWriter;

   my $dbh = DBI->connect ("DBI:mysql:test",
                           "testuser", "testpass",
                           { RaiseError => 1, PrintError => 0});
   my $out = XML::Handler::YAWriter->new (AsFile => "-");
   my $gen = XML::Generator::DBI->new (
                                   Handler => $out,
                                   dbh => $dbh
                               );
   $gen->execute ("SELECT name, category FROM animal");
   $dbh->disconnect ();


Do you have any clue how to achieve this in MySQL?

Yes: do it by hand and build the XML yourself by concatenating strings with CONCAT. Try

SELECT concat('<orders><employee emp_id="', emp_id, '"><customer cust_id="', cust_id, '" region="', region, '"/></employee></orders>') FROM table

I took this from a 2009 answer to the question "How to convert a MySQL DB to XML?", and it still seems to work. It is not very handy, and if you have large trees per item, they all end up in one concatenated value for the root item, but it works. See this test with dummy values:

SELECT concat('<orders><employee emp_id="', 1, '"><customer cust_id="', 2, '" region="', 3, '"/></employee></orders>') FROM DUAL

gives

<orders><employee emp_id="1"><customer cust_id="2" region="3"/></employee></orders>

With this kind of "manual coding" you can get to the following structure:

<?xml version="1.0"?>
<orders>
  <employee emp_id="1">
    <customer cust_id="2" region="3" />
  </employee>
</orders>

I checked this with a larger tree per root item and it worked, but I had to run additional Python code over the output to remove the surplus opening and closing tags that are generated when the XML path contains intermediate-level nodes. This can be done with lists that look backward, together with entries in a temporary set. I got it working, although an object-oriented approach would be more professional: I simply dropped the last x items from the list as soon as a new head item was found, plus a few other tricks for nested branches. It worked.
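
As a rough alternative to cleaning up repeated tags afterwards, the nesting can be built directly from the flat rows. The sketch below is not the code described above: it assumes the rows arrive sorted by emp_id (for example via ORDER BY emp_id), and the sample rows and the name build_tree are invented for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter

# Hypothetical flat result set: one row per customer, the employee repeated.
# groupby only merges adjacent rows, so the rows must be sorted by emp_id.
ROWS = [
    {"emp_id": "129", "cust_id": "107", "region": "Eastern"},
    {"emp_id": "129", "cust_id": "108", "region": "Western"},
    {"emp_id": "130", "cust_id": "201", "region": "Eastern"},
]


def build_tree(rows):
    """Create one <employee> per emp_id and nest its <customer> rows inside."""
    orders = ET.Element("orders")
    for emp_id, group in groupby(rows, key=itemgetter("emp_id")):
        # A new parent element is opened once per distinct emp_id.
        employee = ET.SubElement(orders, "employee", emp_id=emp_id)
        for row in group:
            ET.SubElement(employee, "customer",
                          cust_id=row["cust_id"], region=row["region"])
    return orders


if __name__ == "__main__":
    print(ET.tostring(build_tree(ROWS), encoding="unicode"))
```

Because every parent element is created exactly once, no surplus opening or closing tags have to be removed later. Deeper nesting would need one more groupby level per intermediate node.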

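I puzzled out a regex that finds the text between tags:

import re
from xml.sax.saxutils import escape  # assumption: escape() is the XML escaper from the standard library

string = "     <some tag><another tag>test string</another tag></some tag>"
# Capture the text that sits between an opening and a closing tag.
pattern = r'(?:^\s*)?(?:(?:<[^\/]*?)>)?(.*?)?(?:(?:<\/[^>]*)>)?'
p = re.compile(pattern)
val = ''.join(p.findall(string))
val_escaped = escape(val)
if val_escaped != val:
    # str.replace returns a new string, so the result has to be assigned.
    string = string.replace(val, val_escaped)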

This regex gives you access to the text between the tags. If you are allowed to use CDATA, it is easiest to use it everywhere. Just wrap the content in a CDATA (character data) section already in MySQL:

<Title><![CDATA[', t.title, ']]></Title>

After that you will have no more issues, except for very unusual characters such as U+001A, which you should replace already in MySQL. You then do not need to escape or replace the remaining special characters at all. This worked for me on an XML file of 1 million lines with heavy use of special characters.
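
Two edge cases of the CDATA approach can be sketched in Python, in case the cleanup is not done in MySQL: control characters such as U+001A are not allowed in XML 1.0 at all, and the sequence ]]> would end a CDATA section early. The helper name to_cdata is made up for this example, and the character class only covers the forbidden C0 control characters, not every code point XML 1.0 excludes.

```python
import re
import xml.etree.ElementTree as ET

# C0 control characters that XML 1.0 forbids (tab, LF and CR are allowed).
# This includes U+001A. Other forbidden code points are not handled here.
_FORBIDDEN_CONTROLS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def to_cdata(text: str) -> str:
    """Wrap text in a CDATA section (hypothetical helper for this sketch)."""
    cleaned = _FORBIDDEN_CONTROLS.sub("", text)
    # "]]>" cannot appear inside CDATA, so split it across two sections.
    return "<![CDATA[" + cleaned.replace("]]>", "]]]]><![CDATA[>") + "]]>"


if __name__ == "__main__":
    fragment = "<Title>" + to_cdata("Tom & Jerry <1940> ]]> \x1a") + "</Title>"
    # Parsing the fragment back shows that the text survives unescaped.
    print(repr(ET.fromstring(fragment).text))
```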

However, you should still validate the file against the required XML schema file using Python's xmlschema module. It will alert you if you are not allowed to use the CDATA trick.

If you need fully UTF-8 encoded content without CDATA, which is often the task, you can achieve that even for a file of 1 million lines by validating the generated XML step by step against the target XML schema (XSD) file. It is fiddly work, but it can be done with some patience. Replacements are possible with:

  • MySQL using replace()
  • Python using string.replace()
  • Python using Regex replace (though I did not need it in the end, it would look like: re.sub(re.escape(val), 'xyz', i))
  • string.encode(encoding = 'UTF-8', errors = 'strict')

Note that encoding as UTF-8 is the most powerful step; it could even make the three other replacement methods above unnecessary. Note also that it turns the text into bytes (b'...'), so you then have to handle it as binary data and can write it to a file only in binary mode, using wb.
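
The encode-then-write step might look like the following minimal sketch. The function name write_xml_utf8 and the file name out.xml are placeholders chosen for this example.

```python
import os
import tempfile


def write_xml_utf8(xml_text: str, path: str) -> None:
    """Encode the XML text as UTF-8 and write the bytes in binary mode."""
    # errors="strict" raises UnicodeEncodeError for text that cannot be
    # encoded (for example lone surrogates) instead of silently altering it.
    data = xml_text.encode(encoding="UTF-8", errors="strict")
    with open(path, "wb") as fh:  # bytes require binary mode
        fh.write(data)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "out.xml")  # placeholder file name
        write_xml_utf8('<customer region="Z\u00fcrich"/>', target)
        with open(target, "rb") as fh:
            print(fh.read())
```

Opening the file in text mode with open(path, "w", encoding="utf-8") and writing the str directly should give the same bytes, apart from newline translation on some platforms.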

Finally, you may open the XML output in a normal browser such as Firefox for a final check and watch the XML at work, or check it in vscode/codium with an XML extension. These checks are not strictly needed; in my case the xmlschema module showed everything very well. Note also that vscode/codium can tolerate XML problems quite easily and still show a tree when Firefox cannot, so you need a validator or a browser to see all XML errors.

Quite a large project can be built with this XML-from-MySQL approach: in the end I had a triple-nested XML tree with many repeating tags inside parent nodes, all generated from two-dimensional MySQL output.
