Sometimes a source system produces thousands of source files. Opening and processing each of them individually takes a lot of time. It is often much faster to merge the files together and process fewer, larger files. Below is a sample input file, followed by an XSLT 2.0 stylesheet that merges such files in chunks.
<?xml version="1.0" encoding="UTF-8"?>
<comics>
<comic>
<name>Moomin</name>
<authors>
<author>Tove Jansson</author>
<author>Lars Jansson</author>
</authors>
<started>19470101</started>
<ended>19750101</ended>
<publisher>Associated Newspapers</publisher>
</comic>
</comics>
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output name="XML-format" method="xml" indent="yes" encoding="utf-8"/>
<!-- Patch size. We don't want to put all the files together; we want chunks of a controllable size. In this example a size of 5 is fine. -->
<xsl:param name="patchSize" select="5"/>
<!-- Source path variable with filemask. -->
<xsl:variable name="sourceFiles" select="collection('file:///C:/comic/input/?select=comic*.xml')"/>
<xsl:template match="/">
<!-- group-by with patch size -->
<xsl:for-each-group select="$sourceFiles/comics" group-by="(position() - 1) idiv $patchSize">
<xsl:variable name="patchID" select="position()"/>
<xsl:result-document format="XML-format" href="Comics_{$patchID}.xml"> <!-- Output file name-->
<comics>
<xsl:for-each select="current-group()">
<xsl:sequence select="*"/>
</xsl:for-each>
</comics>
</xsl:result-document>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
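The grouping key in the stylesheet above, (position() - 1) idiv $patchSize, is what cuts the input into chunks. The following is only a minimal sketch of that arithmetic, not part of the merge itself: it assumes 12 imaginary input files and uses a hypothetical named template called "main", so it needs no source document.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Same chunk size as in the merge stylesheet. -->
  <xsl:param name="patchSize" select="5"/>
  <!-- Hypothetical entry point; the name "main" is an assumption for this sketch. -->
  <xsl:template name="main">
    <!-- Pretend there are 12 input files and print the grouping key of each. -->
    <xsl:for-each select="1 to 12">
      <xsl:value-of select="concat('file ', ., ' -> key ', (. - 1) idiv $patchSize, '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Files 1 to 5 get key 0, files 6 to 10 get key 1, and files 11 and 12 get key 2. Because the merge stylesheet names its output after the position of each group, these three groups would end up in Comics_1.xml, Comics_2.xml and Comics_3.xml. With Saxon, a named template like this can usually be started with the -it option (for example -it:main) instead of a source file.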
Save the XSL file as: C:/comic/merge_comic_files.xsl
The input folder is: C:/comic/input
The output folder is: C:/comic/output/
Open a command prompt and run a command like this:
> java -jar "c:/saxonb9-1-0-8j/saxon9.jar" -s:"c:/comic/input/" -o:"c:/comic/output/" -xsl:"c:/comic/merge_comic_files.xsl"
The command runs Java (JRE) with the -jar parameter, followed by the location of the Saxon-B processor JAR file.
The -s parameter is the source, -o is the output, and -xsl is the path of the XSL file.