开发者

how to convert ms excel xml using a simple stylesheet?

I need to convert a excel spreadsheet saved in xml format using a stylesheet...but I keep getting hung up on the many namespaces ms uses. I need an XSLT stylesheet that allows per element passing (the default seems to开发者_运维百科 pass ALL the text when I use a stylesheet that should otherwise only pass the element templates I am writing) of the input xml.

Here's the input xml:

http://apriority.dyndns.biz/im_cache/u_1/j_2390/2_9/tf/testexcel.xml

:I need to simply extract data from both sheets in columns, like:

Output:

Sheet 1
1) blah, blue , burn, baste, belly, belie, bestow, betrothed, bemoan
2) quack, quagmire, quick, quantum, quant
3) alimony, ashy, amber, absolute, astrology, alabaster, angry
4) cost, curry, candor, cabin, capability, castor, canada

Sheet 2
1) 3
2) 32
3) 322
....etc.

I have trouble with the stylesheet...something to do with the namespaces that MS uses in the input xml ...I seem to be getting in my output ALL of the data even if i only have one element template (say for extracting the rows of only the first sheet)...what stylesheet code will get the output above???

Here is the stylesheet I have so far:

http://apriority.dyndns.biz/im_cache/u_1/j_2390/2_9/tf/learningmap.xsl

File put on non https url so no virus risk plus I am using my real name, google me!

Here is the edited stylesheet after use of suggested code by Mads!

http://apriority.dyndns.biz/im_cache/u_1/j_2390/2_9/tf/learningmap_mod.xsl

I still have a problem where the actual text data is not being extracted in that follow up code, why am I not able to get the text data. I can output the first variable "snid" but all the text variables don't appear in the output even though I am selecting them and the original source xml has stuff in those entries. Any help on this new problem would be greatly appreciated!

Update Feb. 9:

I solved the issue with the mapping failure to the variables. It was a simple xpath mistake I was addressing nodes that didn't exist. There is a one to one mapping between the Cells and their rows so the variables should have extracted as Cell[k].Row[1]...Cell[k+1].Row[1]...etc.

The output transformation occurs as I need it to, thanks for the contributions toward the answer. This one is hard to judge which answer is accepted as both submissions were of aid but I'll have to give it to Mads Hanson this time. Thanks!


  • Namespaces are inherited from the parents, and the <Worksheet> element uses it without a namespace prefix, so it may not have been apparent that Worksheet, Data, Row, and Cell were all bound to the same namespace and needed ss: for each of the match criteria.
    • You had correctly declared the urn:schemas-microsoft-com:office:spreadsheet namespace, but were not using it to match all of the content.
  • In the template for ss:Worksheet you were not applying templates, so processing was stopping. I added an apply-templates for ss:Table/ss:Row
  • It looked as if you were creating text output
    • I changed the output method="xml" to method="text"
    • I replaced <br/> with &#xA; (a newline)

I have modified your stylesheet to produce the output that you had shown.

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" 
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
         xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" >
    <xsl:output method="text"/>

    <xsl:template match="ss:Workbook">
        <xsl:apply-templates select="ss:Worksheet"/>
    </xsl:template>

    <xsl:template match="ss:Worksheet">
        <xsl:value-of select="@ss:Name"/>
        <xsl:text>:&#xA;</xsl:text>
        <xsl:apply-templates select="ss:Table/ss:Row" />
    </xsl:template>

    <xsl:template match="ss:Row">
        <xsl:apply-templates select="ss:Cell"/>
    </xsl:template>

    <xsl:template match="ss:Cell">
        <xsl:apply-templates select="ss:Data"/>
    </xsl:template>

    <xsl:template match="ss:Data">
        <xsl:value-of select="count(../preceding-sibling::ss:Cell) + 1"/>
        <xsl:text>)</xsl:text>
        <xsl:value-of select="."/>
        <xsl:text>&#xA;</xsl:text>
    </xsl:template>
</xsl:stylesheet>


XSLT version to produce XML output:

<?xml version="1.0"?>

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
<xsl:output method="xml"/>

<xsl:template match="/">
    <Workbook>
        <xsl:for-each select="ss:Workbook/ss:Worksheet">
            <Worksheet>
                <Name><xsl:value-of select="@ss:Name"/></Name>
                <xsl:for-each select="ss:Table/ss:Row">
                    <Row>
                        <Number><xsl:number value="position()"/></Number>
                        <Data><xsl:for-each select="ss:Cell"><xsl:value-of select="ss:Data"/><xsl:if test="not(position()=last())">, </xsl:if></xsl:for-each></Data>
                    </Row>
                </xsl:for-each>
            </Worksheet>
        </xsl:for-each>
    </Workbook> 
</xsl:template>

</xsl:stylesheet>

The required namespace declaration was copied over from the Excel XML file. Note that elements in the output such as Workbook don't belong to any namespace


You'd try this good article Dive into SpreadsheetML (Part 2 of 2)

Anyway, very complex excel can generate very big xml files, so do some tests on real data before doing a full implementation

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜