20210430

Merging XML files together with XSL using Saxon-B

Sometimes a source system produces thousands of source files. If you open and process each of them individually, it takes plenty of time. Sometimes it is much faster to merge the files together and process one larger file.

Here is an example. The source system produces tons of XML files about comics. A single XML file looks like this:

<?xml version="1.0" encoding="UTF-8"?>

<comics>
    <comic>
        <name>Moomin</name>
        <authors>
           <author>Tove Jansson</author>
           <author>Lars Jansson</author>
        </authors>
        <started>19470101</started>
        <ended>19750101</ended>
        <publisher>Associated Newspapers</publisher>
    </comic>
</comics>
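
To make the structure concrete, here is a minimal sketch in Python that reads the sample file above with the standard library. Python is not part of the Saxon-B workflow in this post; the function name read_comics is only an illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from above, kept as bytes so the XML declaration is honored.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<comics>
    <comic>
        <name>Moomin</name>
        <authors>
           <author>Tove Jansson</author>
           <author>Lars Jansson</author>
        </authors>
        <started>19470101</started>
        <ended>19750101</ended>
        <publisher>Associated Newspapers</publisher>
    </comic>
</comics>
"""


def read_comics(xml_bytes):
    """Return one dict per <comic> element (hypothetical helper, for illustration)."""
    root = ET.fromstring(xml_bytes)
    comics = []
    for comic in root.findall("comic"):
        comics.append({
            "name": comic.findtext("name"),
            "authors": [author.text for author in comic.findall("authors/author")],
            "started": comic.findtext("started"),
            "ended": comic.findtext("ended"),
            "publisher": comic.findtext("publisher"),
        })
    return comics


for comic in read_comics(SAMPLE):
    # prints: Moomin by Tove Jansson, Lars Jansson
    print(comic["name"], "by", ", ".join(comic["authors"]))
```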

The XSL that merges the files looks like this:


<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output name="XML-format" method="xml" indent="yes" encoding="utf-8"/>

    <!-- Patch size. We don't want to put all the files together. We want chunks of a controllable size. In this example a size of 5 is okay. -->
    <xsl:param name="patchSize" select="5"/>

    <!-- Source path variable with filemask. -->
    <xsl:variable name="sourceFiles" select="collection('file:///C:/comic/input/?select=comic*.xml')"/>

    <xsl:template match="/">

        <!-- group-by with patch size -->
        <xsl:for-each-group select="$sourceFiles/comics" group-by="(position() - 1) idiv $patchSize">
            <xsl:variable name="patchID" select="position()"/>
            <!-- Output file name -->
            <xsl:result-document format="XML-format" href="Comics_{$patchID}.xml">
                <comics>
                    <xsl:for-each select="current-group()">
                        <xsl:sequence select="*"/>
                    </xsl:for-each>
                </comics>
            </xsl:result-document>
        </xsl:for-each-group>

    </xsl:template>
</xsl:stylesheet>
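
The grouping key (position() - 1) idiv $patchSize is the heart of the stylesheet: it gives the first five documents key 0, the next five key 1, and so on. The following Python sketch only illustrates that arithmetic. The file names are made up, and the real order of the files in the collection is decided by Saxon, not by this code.

```python
def chunk_keys(count, patch_size):
    """Grouping key for each 1-based position, mirroring (position() - 1) idiv $patchSize."""
    return [(position - 1) // patch_size for position in range(1, count + 1)]


def chunk(items, patch_size):
    """Split items into consecutive groups of at most patch_size items."""
    groups = {}
    for position, item in enumerate(items, start=1):
        key = (position - 1) // patch_size
        groups.setdefault(key, []).append(item)
    # Dicts keep insertion order, so the groups come back in key order.
    return list(groups.values())


# Hypothetical input: twelve source files and a patch size of 5.
files = ["comic%d.xml" % number for number in range(1, 13)]

for patch_id, group in enumerate(chunk(files, 5), start=1):
    print("Comics_%d.xml <- %d files: %s" % (patch_id, len(group), ", ".join(group)))
```

With twelve files and a patch size of 5 this gives three groups: two full ones and a last one with the two remaining files.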


We will save the XSL file as: C:/comic/merge_comic_files.xsl

The input folder will be: C:/comic/input

The output folder will be: C:/comic/output/


Let's open a command prompt and give this kind of command:

> java -jar "c:/saxonb9-1-0-8j/saxon9.jar" -s:"c:/comic/input/" -o:"c:/comic/output/" -xsl:"c:/comic/merge_comic_files.xsl"

The command runs Java with the -jar parameter, followed by the location of the Saxon-B processor JAR file.
The -s parameter is the source, -o is the output and -xsl is the path of the XSL file.
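
After a run it is worth checking that no comics were lost in the merge. Here is a minimal Python sketch for that, assuming the folders and file masks used in this post. The helper name count_comics is made up for the example, and Python itself is not needed for the merge.

```python
import glob
import os
import xml.etree.ElementTree as ET


def count_comics(folder, pattern):
    """Count the <comic> elements in every file in folder that matches pattern."""
    total = 0
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        root = ET.parse(path).getroot()
        total += len(root.findall("comic"))
    return total


if __name__ == "__main__":
    # Folder names and file masks are the ones assumed in this post.
    source_total = count_comics("C:/comic/input", "comic*.xml")
    merged_total = count_comics("C:/comic/output", "Comics_*.xml")
    print("comics in input files: ", source_total)
    print("comics in merged files:", merged_total)
    if source_total != merged_total:
        print("The counts differ, so check the merge.")
```

If both numbers are the same, every comic from the source files ended up in one of the merged files.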

You can download example files here.

20210429

Saxon-B Free XSLT and XQuery Processor

I got an assignment to convert XML files to CSV and HTML. With freeware, of course. Actually, I was told to do the processing with the Saxon-B XSLT and XQuery Processor.

You can download Saxon from here.

Download file:

saxonb9-1-0-8j.zip

You will need a Java Runtime Environment (JRE) or SDK to run Saxon-B. It runs on pretty old Java versions but does not have problems with later ones either.

You can download Java JRE from here.

I will give some examples of using the Saxon-B processor in future posts.

Testing a connection string with PowerShell

Some systems want a connection string from you. When you give the connection string to them, they just tell you "ERROR!" or "No connection!". That is not helpful. What went wrong?


Sometimes I test connection strings with PowerShell. Write this kind of script with Notepad or a similar editor:

# Connection string to test: server IP and port, database name, user and password.
$connectionString = 'Data Source=123.123.123.123,1234;database=NameOfYourDatabase;User ID=migtyuser;Password=secret123!'

$sqlConnection = New-Object System.Data.SqlClient.SqlConnection $connectionString

$sqlConnection.Open()

$sqlConnection.Close()


Save it as connectiontest.ps1.

123.123.123.123 is the IP address of your server and 1234 is the port number (not mandatory).


Then start Windows PowerShell as an admin user and give the following commands:

> Set-ExecutionPolicy RemoteSigned

 - Answer Y when the system asks for permission.

Okay. Now we can use the connection tester. Give this PowerShell command:

> ./connectiontest.ps1 


If everything is OK, nothing happens. Hurray! But if there is something wrong with your connection string, PowerShell will give a nicely detailed error message. It is a much better error message than most other systems give.

When scheduled BAT files cascade

The server was acting slowly. What had happened? When I opened Task Manager, there were thousands of cmd programs open. I noticed there were so...