XML/XSLT/Access/VBA: how can i merge all child elements (even unknown elements) into one, before importing to Access database?
CURRENT XML:
<?xml version="1.0"?>
<form1>
<page1>
<first_name></first_name>
<last_name></last_name>
.
.
</page1>
<page2>
<address></address>
<phone_number></phone_number>
.
.
</page2>
<page3>
<company_name></company_name>
<job_title></job_title>
.
.
</page3>
</form1>
DESIRED XML - i want to merge all child elements and rename the parent:
<?xml version="1.0"?>
<form>
<page>
<first_name></first_name>
<last_name></last_name>
.
.
<address></address>
<phone_number></phone_number>
.
.
<company_name></company_name>
<job_title></job_title>
.
.
</page>
</form>
then, since i have thousands of XML files with some unknown elements, i want to find all of them before bulk importing the XML into Access database, because any new elements in subsequent files will be dropped if they are not defined in the schema.
not all child elements are known. not all fil开发者_Python百科e names are known.
so, how can i check all files for all elements, fill the Access table with them all, then bulk import all the XML records to fit into the desired schema as shown above?
EDIT:
ok, i see - there are no attributes. what i meant was all child elements. thanks for pointing that out Oded, I updated the question with corrections.
this is the VBA code I am using in Access for bulk importing the files:
Private Sub cmdImport_Click()
Dim strFile As String 'Filename
Dim strFileList() As String 'File Array
Dim intFile As Integer 'File Number
Dim strPath As String ' Path to file folder
strPath = "C:\Users\Main\Desktop\XML-files"
strFile = Dir(strPath & "*.XML")
While strFile <> ""
'add files to the list
intFile = intFile + 1
ReDim Preserve strFileList(1 To intFile)
strFileList(intFile) = strFile
strFile = Dir()
Wend
'see if any files were found
If intFile = 0 Then
MsgBox "No files found"
Exit Sub
End If
'cycle through the list of files
For intFile = 1 To UBound(strFileList)
Application.ImportXML strPath & strFileList(intFile), acAppendData
Next intFile
MsgBox "Import Completed"
End Sub
i can use the stylesheet to transform the XML as such:
For intFile = 1 To UBound(strFileList)
Application.TransformXML strPath & strFileList(intFile), _
"C:\Users\Main\Desktop\stylesheet2.xslt", _
"C:\Users\Main\Desktop\temp.xml", True
Application.ImportXML "C:\Users\Main\Desktop\temp.xml", acAppendData
Next intFile
MsgBox "Import Completed"
End Sub
however, it does not merge all the file elements into one table. am i missing something? do i need to save a variable list? or create some attribute ids?
EDIT: From comments
my file names are 1.xml, 2.xml, 3.xml, 4.xml, etc. But like i said have thousands
Suppose this input documents:
1.xml
<form1>
<page1>
<first_name>D</first_name>
<last_name>E</last_name>
</page1>
<page2>
<address>F</address>
<phone_number>1</phone_number>
</page2>
<page3>
<company_name>G</company_name>
</page3>
</form1>
2.xml
<form2>
<page1>
<first_name>A</first_name>
</page1>
<page2>
<address>B</address>
</page2>
<page3>
<company_name>C</company_name>
<job_title>H</job_title>
</page3>
</form2>
This stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="kElementByName" match="/*/*/*" use="name()"/>
<xsl:param name="pMaxFileNumber" select="2"/>
<xsl:template match="/">
<xsl:variable name="vFieldsNames">
<xsl:call-template name="names">
<xsl:with-param name="pFrom" select="1"/>
<xsl:with-param name="pTo" select="$pMaxFileNumber"/>
<xsl:with-param name="pFieldsNames" select="'|'"/>
</xsl:call-template>
</xsl:variable>
<form>
<xsl:call-template name="merge">
<xsl:with-param name="pFrom" select="1"/>
<xsl:with-param name="pTo" select="$pMaxFileNumber"/>
<xsl:with-param name="pFieldsNames" select="$vFieldsNames"/>
</xsl:call-template>
</form>
</xsl:template>
<xsl:template name="names">
<xsl:param name="pFrom"/>
<xsl:param name="pTo"/>
<xsl:param name="pFieldsNames"/>
<xsl:choose>
<xsl:when test="$pFrom = $pTo">
<xsl:value-of select="$pFieldsNames"/>
<xsl:apply-templates
select="document(concat($pFrom,'.xml'),/)/*/*/*
[count(.|key('kElementByName',
name())[1])=1]
[not(contains($pFieldsNames,
concat('|',name(),'|')))]"
mode="names"/>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="vNewTop"
select="floor(($pTo - $pFrom) div 2) + $pFrom"/>
<xsl:variable name="vNewFieldsNames">
<xsl:call-template name="names">
<xsl:with-param name="pFrom" select="$pFrom"/>
<xsl:with-param name="pTo" select="$vNewTop"/>
<xsl:with-param name="pFieldsNames"
select="$pFieldsNames"/>
</xsl:call-template>
</xsl:variable>
<xsl:call-template name="names">
<xsl:with-param name="pFrom" select="$vNewTop + 1"/>
<xsl:with-param name="pTo" select="$pTo"/>
<xsl:with-param name="pFieldsNames"
select="$vNewFieldsNames"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="merge">
<xsl:param name="pFrom"/>
<xsl:param name="pTo"/>
<xsl:param name="pFieldsNames"/>
<xsl:choose>
<xsl:when test="$pFrom = $pTo">
<page>
<xsl:apply-templates
select="document(concat($pFrom,'.xml'),/)/*/*[1]/*[1]">
<xsl:with-param name="pFieldsNames"
select="$pFieldsNames"/>
</xsl:apply-templates>
</page>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="vNewTop"
select="floor(($pTo - $pFrom) div 2) + $pFrom"/>
<xsl:call-template name="merge">
<xsl:with-param name="pFrom" select="$pFrom"/>
<xsl:with-param name="pTo" select="$vNewTop"/>
<xsl:with-param name="pFieldsNames" select="$pFieldsNames"/>
</xsl:call-template>
<xsl:call-template name="merge">
<xsl:with-param name="pFrom" select="$vNewTop + 1"/>
<xsl:with-param name="pTo" select="$pTo"/>
<xsl:with-param name="pFieldsNames" select="$pFieldsNames"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="/*/*">
<xsl:param name="pFieldsNames"/>
<xsl:apply-templates select="*[1]">
<xsl:with-param name="pFieldsNames" select="$pFieldsNames"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="/*/*/*" name="copy">
<xsl:param name="pFieldsNames"/>
<xsl:copy>
<xsl:value-of select="."/>
</xsl:copy>
<xsl:variable name="vName" select="concat('|',name(),'|')"/>
<xsl:apply-templates select="following::*[1]">
<xsl:with-param name="pFieldsNames"
select="concat(substring-before($pFieldsNames,
$vName),
'|',
substring-after($pFieldsNames,
$vName))"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="/*/*[last()]/*[last()]">
<xsl:param name="pFieldsNames"/>
<xsl:call-template name="copy"/>
<xsl:variable name="vName" select="concat('|',name(),'|')"/>
<xsl:call-template name="empty">
<xsl:with-param name="pFieldsNames"
select="substring(
concat(substring-before($pFieldsNames,
$vName),
'|',
substring-after($pFieldsNames,
$vName)),
2)"/>
</xsl:call-template>
</xsl:template>
<xsl:template match="/*/*/*" mode="names">
<xsl:value-of select="concat(name(),'|')"/>
</xsl:template>
<xsl:template name="empty">
<xsl:param name="pFieldsNames"/>
<xsl:if test="$pFieldsNames!=''">
<xsl:element name="{substring-before($pFieldsNames,'|')}"/>
<xsl:call-template name="empty">
<xsl:with-param name="pFieldsNames"
select="substring-after($pFieldsNames,'|')"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
Output:
<form>
<page>
<first_name>D</first_name>
<last_name>E</last_name>
<address>F</address>
<phone_number>1</phone_number>
<company_name>G</company_name>
<job_title />
</page>
<page>
<first_name>A</first_name>
<address>B</address>
<company_name>C</company_name>
<job_title>H</job_title>
<last_name />
<phone_number />
</page>
</form>
Note: If this blows up your memory, then you need to split this in two stylesheets: first, output the names; second, merge. If you can't pass param with Application.TransformXML
, then the max number of files is fixed. Also, there must not be any hole: if max number of files is 3, 2.xml
can't be missed (this is because fn:document
throws an error)
EDIT: For a two pass transformation.
This stylesheet with any input (not used):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:param name="pMaxFileNumber" select="2"/>
<xsl:template match="/">
<form>
<xsl:call-template name="copy">
<xsl:with-param name="pFrom" select="1"/>
<xsl:with-param name="pTo" select="$pMaxFileNumber"/>
</xsl:call-template>
</form>
</xsl:template>
<xsl:template name="copy">
<xsl:param name="pFrom"/>
<xsl:param name="pTo"/>
<xsl:choose>
<xsl:when test="$pFrom = $pTo">
<page>
<xsl:copy-of
select="document(concat($pFrom,'.xml'),/)/*/*/*"/>
</page>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="vMiddle"
select="floor(($pTo - $pFrom) div 2) + $pFrom"/>
<xsl:call-template name="copy">
<xsl:with-param name="pFrom" select="$pFrom"/>
<xsl:with-param name="pTo" select="$vMiddle"/>
</xsl:call-template>
<xsl:call-template name="copy">
<xsl:with-param name="pFrom" select="$vMiddle + 1"/>
<xsl:with-param name="pTo" select="$pTo"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Output:
<form>
<page>
<first_name>D</first_name>
<last_name>E</last_name>
<address>F</address>
<phone_number>1</phone_number>
<company_name>G</company_name>
</page>
<page>
<first_name>A</first_name>
<address>B</address>
<company_name>C</company_name>
<job_title>H</job_title>
</page>
</form>
And this stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="kElementByName" match="/*/*/*" use="name()"/>
<xsl:variable name="vElements"
select="/*/*/*[count(.|key('kElementByName',name())[1])=1]"/>
<xsl:template match="form">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="page">
<xsl:copy>
<xsl:apply-templates select="$vElements">
<xsl:with-param name="pContext" select="."/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<xsl:template match="/*/*/*">
<xsl:param name="pContext"/>
<xsl:element name="{name()}">
<xsl:value-of select="$pContext/*[name()=name(current())]"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
With previus output as input, result:
<form>
<page>
<first_name>D</first_name>
<last_name>E</last_name>
<address>F</address>
<phone_number>1</phone_number>
<company_name>G</company_name>
<job_title></job_title>
</page>
<page>
<first_name>A</first_name>
<last_name></last_name>
<address>B</address>
<phone_number></phone_number>
<company_name>C</company_name>
<job_title>H</job_title>
</page>
</form>
This stylesheet produces the output that you described.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" />
<xsl:template match="/">
<!--generate standard document element and it's child element-->
<form>
<page>
<!--Apply templates to children of document element's, child element's, children-->
<xsl:apply-templates select="/*/*/node()" />
</page>
</form>
</xsl:template>
<!--Identity template copies all content forward-->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
If you just want to copy the elements under the page elements, rather than any node() (element, text, comment, or processing instruction), then you could change the XPATH from: /*/*/node()
to: /*/*/*
If you really want to just copy all the contents of the page{N}
elements, then this transformation is probably the shortest:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<form>
<page>
<xsl:copy-of select="/*/*/node()"/>
</page>
</form>
</xsl:template>
</xsl:stylesheet>
精彩评论