开发者

XML/XSLT/Access/VBA: how can i merge all child elements (even unknown elements) into one, before importing to Access database?

CURRENT XML:

<?xml version="1.0"?>

<form1>
   <page1>
      <first_name></first_name>
      <last_name></last_name>
      .
      .
   </page1>
   <page2>
      <address></address>
      <phone_number></phone_number>
      .
      .
   </page2>
   <page3>
      <company_name></company_name>
      <job_title></job_title>
      .
      .
   </page3>
</form1>

DESIRED XML - i want to merge all child elements and rename the parent:

<?xml version="1.0"?>

<form>
   <page>
      <first_name></first_name>
      <last_name></last_name>
      .
      .
      <address></address>
      <phone_number></phone_number>
      .
      .
      <company_name></company_name>
      <job_title></job_title>
      .
      .
   </page>
</form>

then, since i have thousands of XML files with some unknown elements, i want to find all of them before bulk importing the XML into Access database, because any new elements in subsequent files will be dropped if they are not defined in the schema.

not all child elements are known. not all fil开发者_Python百科e names are known.

so, how can i check all files for all elements, fill the Access table with them all, then bulk import all the XML records to fit into the desired schema as shown above?

EDIT:

ok, i see - there are no attributes. what i meant was all child elements. thanks for pointing that out Oded, I updated the question with corrections.

this is the VBA code I am using in Access for bulk importing the files:

 Private Sub cmdImport_Click()
 Dim strFile As String 'Filename
 Dim strFileList() As String 'File Array
 Dim intFile As Integer 'File Number
 Dim strPath As String ' Path to file folder

 strPath = "C:\Users\Main\Desktop\XML-files"
 strFile = Dir(strPath & "*.XML")

 While strFile <> ""
      'add files to the list
     intFile = intFile + 1
     ReDim Preserve strFileList(1 To intFile)
     strFileList(intFile) = strFile
     strFile = Dir()
 Wend
 'see if any files were found
 If intFile = 0 Then
     MsgBox "No files found"
     Exit Sub
 End If

 'cycle through the list of files
 For intFile = 1 To UBound(strFileList)
     Application.ImportXML strPath & strFileList(intFile), acAppendData

 Next intFile
MsgBox "Import Completed"

End Sub

i can use the stylesheet to transform the XML as such:

  For intFile = 1 To UBound(strFileList)
     Application.TransformXML strPath & strFileList(intFile), _
     "C:\Users\Main\Desktop\stylesheet2.xslt", _
     "C:\Users\Main\Desktop\temp.xml", True
     Application.ImportXML "C:\Users\Main\Desktop\temp.xml", acAppendData
   Next intFile

 MsgBox "Import Completed"
End Sub

however, it does not merge all the file elements into one table. am i missing something? do i need to save a variable list? or create some attribute ids?

EDIT: From comments

my file names are 1.xml, 2.xml, 3.xml, 4.xml, etc. But like i said have thousands


Suppose this input documents:

1.xml

<form1>
   <page1>
        <first_name>D</first_name>
        <last_name>E</last_name>
   </page1>
   <page2>
        <address>F</address>
        <phone_number>1</phone_number>
   </page2>
   <page3>
        <company_name>G</company_name>
   </page3>
</form1>

2.xml

<form2>
   <page1>
        <first_name>A</first_name>
   </page1>
   <page2>
        <address>B</address>
   </page2>
   <page3>
        <company_name>C</company_name>
        <job_title>H</job_title>
   </page3>
</form2>

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kElementByName" match="/*/*/*" use="name()"/>
    <xsl:param name="pMaxFileNumber" select="2"/>
    <xsl:template match="/">
        <xsl:variable name="vFieldsNames">
            <xsl:call-template name="names">
                <xsl:with-param name="pFrom" select="1"/>
                <xsl:with-param name="pTo" select="$pMaxFileNumber"/>
                <xsl:with-param name="pFieldsNames" select="'|'"/>
            </xsl:call-template>
        </xsl:variable>
        <form>
            <xsl:call-template name="merge">
                <xsl:with-param name="pFrom" select="1"/>
                <xsl:with-param name="pTo" select="$pMaxFileNumber"/>
                <xsl:with-param name="pFieldsNames" select="$vFieldsNames"/>
            </xsl:call-template>
        </form>
    </xsl:template>
    <xsl:template name="names">
        <xsl:param name="pFrom"/>
        <xsl:param name="pTo"/>
        <xsl:param name="pFieldsNames"/>
        <xsl:choose>
            <xsl:when test="$pFrom = $pTo">
                <xsl:value-of select="$pFieldsNames"/>
                <xsl:apply-templates
                     select="document(concat($pFrom,'.xml'),/)/*/*/*
                                       [count(.|key('kElementByName',
                                                    name())[1])=1]
                                       [not(contains($pFieldsNames,
                                                     concat('|',name(),'|')))]"
                     mode="names"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:variable name="vNewTop"
                              select="floor(($pTo - $pFrom) div 2) + $pFrom"/>
                <xsl:variable name="vNewFieldsNames">
                    <xsl:call-template name="names">
                        <xsl:with-param name="pFrom" select="$pFrom"/>
                        <xsl:with-param name="pTo" select="$vNewTop"/>
                        <xsl:with-param name="pFieldsNames"
                                        select="$pFieldsNames"/>
                    </xsl:call-template>
                </xsl:variable>
                <xsl:call-template name="names">
                    <xsl:with-param name="pFrom" select="$vNewTop + 1"/>
                    <xsl:with-param name="pTo" select="$pTo"/>
                    <xsl:with-param name="pFieldsNames"
                                    select="$vNewFieldsNames"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template name="merge">
        <xsl:param name="pFrom"/>
        <xsl:param name="pTo"/>
        <xsl:param name="pFieldsNames"/>
        <xsl:choose>
            <xsl:when test="$pFrom = $pTo">
                <page>
                    <xsl:apply-templates
                     select="document(concat($pFrom,'.xml'),/)/*/*[1]/*[1]">
                        <xsl:with-param name="pFieldsNames"
                                        select="$pFieldsNames"/>
                    </xsl:apply-templates>
                </page>
            </xsl:when>
            <xsl:otherwise>
                <xsl:variable name="vNewTop"
                              select="floor(($pTo - $pFrom) div 2) + $pFrom"/>
                <xsl:call-template name="merge">
                    <xsl:with-param name="pFrom" select="$pFrom"/>
                    <xsl:with-param name="pTo" select="$vNewTop"/>
                    <xsl:with-param name="pFieldsNames" select="$pFieldsNames"/>
                </xsl:call-template>
                <xsl:call-template name="merge">
                    <xsl:with-param name="pFrom" select="$vNewTop + 1"/>
                    <xsl:with-param name="pTo" select="$pTo"/>
                    <xsl:with-param name="pFieldsNames" select="$pFieldsNames"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template match="/*/*">
        <xsl:param name="pFieldsNames"/>
        <xsl:apply-templates select="*[1]">
            <xsl:with-param name="pFieldsNames" select="$pFieldsNames"/>
        </xsl:apply-templates>
    </xsl:template>
    <xsl:template match="/*/*/*" name="copy">
        <xsl:param name="pFieldsNames"/>
        <xsl:copy>
            <xsl:value-of select="."/>
        </xsl:copy>
        <xsl:variable name="vName" select="concat('|',name(),'|')"/>
        <xsl:apply-templates select="following::*[1]">
            <xsl:with-param name="pFieldsNames"
                            select="concat(substring-before($pFieldsNames,
                                                            $vName),
                                           '|',
                                           substring-after($pFieldsNames,
                                                           $vName))"/>
        </xsl:apply-templates>
    </xsl:template>
    <xsl:template match="/*/*[last()]/*[last()]">
        <xsl:param name="pFieldsNames"/>
        <xsl:call-template name="copy"/>
        <xsl:variable name="vName" select="concat('|',name(),'|')"/>
        <xsl:call-template name="empty">
            <xsl:with-param name="pFieldsNames"
                            select="substring(
                                      concat(substring-before($pFieldsNames,
                                                              $vName),
                                             '|',
                                             substring-after($pFieldsNames,
                                                             $vName)),
                                      2)"/>
        </xsl:call-template>
    </xsl:template>
    <xsl:template match="/*/*/*" mode="names">
        <xsl:value-of select="concat(name(),'|')"/>
    </xsl:template>
    <xsl:template name="empty">
        <xsl:param name="pFieldsNames"/>
        <xsl:if test="$pFieldsNames!=''">
            <xsl:element name="{substring-before($pFieldsNames,'|')}"/>
            <xsl:call-template name="empty">
                <xsl:with-param name="pFieldsNames"
                           select="substring-after($pFieldsNames,'|')"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>

Output:

<form>
    <page>
        <first_name>D</first_name>
        <last_name>E</last_name>
        <address>F</address>
        <phone_number>1</phone_number>
        <company_name>G</company_name>
        <job_title />
    </page>
    <page>
        <first_name>A</first_name>
        <address>B</address>
        <company_name>C</company_name>
        <job_title>H</job_title>
        <last_name />
        <phone_number />
    </page>
</form>

Note: If this blows up your memory, then you need to split this in two stylesheets: first, output the names; second, merge. If you can't pass param with Application.TransformXML, then the max number of files is fixed. Also, there must not be any hole: if max number of files is 3, 2.xml can't be missed (this is because fn:document throws an error)

EDIT: For a two pass transformation.

This stylesheet with any input (not used):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:param name="pMaxFileNumber" select="2"/>
    <xsl:template match="/">
        <form>
            <xsl:call-template name="copy">
                <xsl:with-param name="pFrom" select="1"/>
                <xsl:with-param name="pTo" select="$pMaxFileNumber"/>
            </xsl:call-template>
        </form>
    </xsl:template>
    <xsl:template name="copy">
        <xsl:param name="pFrom"/>
        <xsl:param name="pTo"/>
        <xsl:choose>
            <xsl:when test="$pFrom = $pTo">
                <page>
                    <xsl:copy-of 
                     select="document(concat($pFrom,'.xml'),/)/*/*/*"/>
                </page>
            </xsl:when>
            <xsl:otherwise>
                <xsl:variable name="vMiddle"
                      select="floor(($pTo - $pFrom) div 2) + $pFrom"/>
                <xsl:call-template name="copy">
                    <xsl:with-param name="pFrom" select="$pFrom"/>
                    <xsl:with-param name="pTo" select="$vMiddle"/>
                </xsl:call-template>
                <xsl:call-template name="copy">
                    <xsl:with-param name="pFrom" select="$vMiddle + 1"/>
                    <xsl:with-param name="pTo" select="$pTo"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

Output:

<form>
    <page>
        <first_name>D</first_name>
        <last_name>E</last_name>
        <address>F</address>
        <phone_number>1</phone_number>
        <company_name>G</company_name>
    </page>
    <page>
        <first_name>A</first_name>
        <address>B</address>
        <company_name>C</company_name>
        <job_title>H</job_title>
    </page>
</form>

And this stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kElementByName" match="/*/*/*" use="name()"/>
    <xsl:variable name="vElements"
                  select="/*/*/*[count(.|key('kElementByName',name())[1])=1]"/>
    <xsl:template match="form">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="page">
        <xsl:copy>
            <xsl:apply-templates select="$vElements">
                <xsl:with-param name="pContext" select="."/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="/*/*/*">
        <xsl:param name="pContext"/>
        <xsl:element name="{name()}">
            <xsl:value-of select="$pContext/*[name()=name(current())]"/>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>

With previus output as input, result:

<form>
    <page>
        <first_name>D</first_name>
        <last_name>E</last_name>
        <address>F</address>
        <phone_number>1</phone_number>
        <company_name>G</company_name>
        <job_title></job_title>
    </page>
    <page>
        <first_name>A</first_name>
        <last_name></last_name>
        <address>B</address>
        <phone_number></phone_number>
        <company_name>C</company_name>
        <job_title>H</job_title>
    </page>
</form>


This stylesheet produces the output that you described.

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" />

<xsl:template match="/">
<!--generate standard document element and it's child element-->
<form>
    <page>
            <!--Apply templates to children of document element's, child element's, children-->
        <xsl:apply-templates select="/*/*/node()" />
    </page>
</form>
</xsl:template>

<!--Identity template copies all content forward-->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

If you just want to copy the elements under the page elements, rather than any node() (element, text, comment, or processing instruction), then you could change the XPATH from: /*/*/node() to: /*/*/*


If you really want to just copy all the contents of the page{N} elements, then this transformation is probably the shortest:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/">
   <form>
    <page>
      <xsl:copy-of select="/*/*/node()"/>
    </page>
   </form>
 </xsl:template>
</xsl:stylesheet>
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜