Parse XML using XmlParser or XmlSlurper

In Teneo Studio, you can use the XmlParser class or the XmlSlurper class to read XML, a common format for structured documents. Both classes are available in the Teneo Engine by default, so you can use them directly in any script node in Teneo Studio without an import declaration.

Suppose that you have the following XML saved in a string named sXml:

def sXml = '''
<recruit>
    <jobtitle>System Developer</jobtitle>
    <location>New York</location>
    <applicants>
        <applicant>
            <name>Tommy</name>
            <programming_languages>
                <programming_language>Groovy</programming_language>
                <programming_language>Python</programming_language>
                <programming_language>Java</programming_language>
            </programming_languages>
        </applicant>
        <applicant>
            <name>John</name>
            <programming_languages>
                <programming_language>Ruby</programming_language>
                <programming_language>php</programming_language>
            </programming_languages>
        </applicant>
    </applicants>
</recruit>
'''

Use the following code to parse this XML with XmlParser and save the result in an object called parsedXml:

def parsedXml = new XmlParser().parseText(sXml)

You can also use XmlSlurper to do the same thing:

def parsedXml = new XmlSlurper().parseText(sXml)

If you have the XML file uploaded in your Resource files, use the following code to parse it:

URL url = this.getClass().getClassLoader().getResource('your_file.xml')
URI uri = url.toURI()
File file = new File(uri)
def parsedXml = new XmlParser().parseText(file.getText()) // Or use new XmlSlurper().parseText(file.getText())
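
Both classes also have a parse(File) method, so reading the file into a string first is optional. Below is a minimal sketch of that variant. The helper name parseXmlFile and the temporary file are made up for illustration (the temporary file only stands in for a resource file), and the sketch assumes XmlParser resolves without an import, as it does in a Teneo script node. Note that getResource() returns null when the file is not found, so a guard like the one below can be worth adding.

```groovy
// Hypothetical helper: parse an XML file directly, returning null if the file is missing
def parseXmlFile(File file) {
    if (file == null || !file.exists()) {
        return null
    }
    // parse(File) reads and parses the file in one step
    return new XmlParser().parse(file) // new XmlSlurper().parse(file) works the same way
}

// Stand-in for a resource file, only so that this sketch runs on its own
def tmpFile = File.createTempFile('recruit', '.xml')
tmpFile.deleteOnExit()
tmpFile.text = '<recruit><jobtitle>System Developer</jobtitle></recruit>'

def parsedFromFile = parseXmlFile(tmpFile)
println parsedFromFile.jobtitle.text()
```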

The parseText() method of XmlParser returns an object of the groovy.util.Node class, while the parseText() method of XmlSlurper returns an object of the groovy.util.slurpersupport.NodeChild class (a subclass of GPathResult). You can find an article here that compares XmlParser and XmlSlurper.
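
To make the difference in return types concrete, here is a small sketch. It assumes an environment where XmlParser and XmlSlurper resolve without an import (a Teneo script node, or plain Groovy 2.x/3.x); on Groovy 4 you would likely need to import groovy.xml.XmlParser and groovy.xml.XmlSlurper. The shortened XML string and the variable names byParser and bySlurper are for illustration only.

```groovy
def sXmlShort = '<recruit><jobtitle>System Developer</jobtitle><location>New York</location></recruit>'

def byParser = new XmlParser().parseText(sXmlShort)
def bySlurper = new XmlSlurper().parseText(sXmlShort)

// XmlParser builds a tree of groovy.util.Node objects
assert byParser instanceof groovy.util.Node

// XmlSlurper returns a NodeChild; only the simple name is checked here
// because its package differs between Groovy versions
assert bySlurper.getClass().simpleName == 'NodeChild'

// Both results expose the root element name in the same way
assert byParser.name() == 'recruit'
assert bySlurper.name() == 'recruit'
```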

Although the two results belong to different classes, you can extract information from them and iterate through the data in the same way. For example, this code gets the job title from the parsed data parsedXml:

parsedXml.jobtitle.text()
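
The same dotted style works for other elements too. The sketch below runs a few lookups against both parsers to show that they give the same answers. The shortened XML and the names lParsers and lResults are illustrative, and the code again assumes that both classes resolve without an import.

```groovy
def sXmlShort = '''
<recruit>
    <jobtitle>System Developer</jobtitle>
    <location>New York</location>
    <applicants>
        <applicant><name>Tommy</name></applicant>
        <applicant><name>John</name></applicant>
    </applicants>
</recruit>
'''

def lParsers = [new XmlParser(), new XmlSlurper()]

// Run the same lookups with each parser and collect the answers
def lResults = lParsers.collect { parser ->
    def parsed = parser.parseText(sXmlShort)
    [
        location      : parsed.location.text(),                     // text of a child element
        applicantCount: parsed.applicants.applicant.size(),         // number of <applicant> elements
        firstApplicant: parsed.applicants.applicant[0].name.text()  // index into repeated elements
    ]
}

lResults.each { println it }
```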

The following code iterates through the applicants, extracts each applicant's name and the programming languages that the applicant knows, and puts the information in a list named lApplicantsInfo:

def lApplicantsInfo = []
parsedXml.applicants.applicant.each{
    def mApplicant = [:]
    mApplicant.name = it.name.text()
    mApplicant.programming_languages = []
    it.programming_languages.programming_language.each{
        mApplicant.programming_languages << it.text()
    }
    lApplicantsInfo << mApplicant
}
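
As an alternative to the loop above, the same list can be built with collect, which some readers may find more compact. This is only a sketch of one possible style, not the only way to do it. The variable lGroovyApplicants is added for illustration, and the code assumes XmlSlurper resolves without an import.

```groovy
def sXml = '''
<recruit>
    <applicants>
        <applicant>
            <name>Tommy</name>
            <programming_languages>
                <programming_language>Groovy</programming_language>
                <programming_language>Python</programming_language>
                <programming_language>Java</programming_language>
            </programming_languages>
        </applicant>
        <applicant>
            <name>John</name>
            <programming_languages>
                <programming_language>Ruby</programming_language>
                <programming_language>php</programming_language>
            </programming_languages>
        </applicant>
    </applicants>
</recruit>
'''

def parsedXml = new XmlSlurper().parseText(sXml) // new XmlParser() works the same way here

// Build one map per applicant; a named closure parameter avoids nested use of "it"
def lApplicantsInfo = parsedXml.applicants.applicant.collect { applicant ->
    [
        name                 : applicant.name.text(),
        programming_languages: applicant.programming_languages.programming_language.collect { it.text() }
    ]
}

// Once the data is in plain lists and maps, filtering is ordinary Groovy
def lGroovyApplicants = lApplicantsInfo.findAll { 'Groovy' in it.programming_languages }.collect { it.name }
```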

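The list lApplicantsInfo then has the following content:

[[name:Tommy, programming_languages:[Groovy, Python, Java]], [name:John, programming_languages:[Ruby, php]]]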

The code in this post is an example of how you can parse XML formatted data in your Teneo solution. You can find another post here that shows how to parse YAML formatted data, and one here on parsing and building JSON formatted data. I hope this post helps in your journey with Teneo Studio!
