Read Files from Resources

Every project has its own peculiarities, and it requires different resources to complement the functionality of Teneo Studio. These additional files will generally provide data for the solution, either to be used directly in the conversation, but also to provide the client-side configuration for APIs and other web-based services that the solution must consume in order to achieve its goals.

To provide a real-life example, a customer needed to be able to create a contract at the end of a conversation. However, each contract had to have different clauses depending on geography, business rules and other considerations. By loading an Excel file with all the possible clauses, which can be updated when the contract terms change, the bot was able to create ad hoc contracts for each user.

In order to access those files during runtime, they should be saved in the “/script_lib” path, otherwise they cannot be used in script nodes in the solution. Please be aware that when you click on the “save” button on the top right corner, you must have the file closed. For more details about Resource file manage in Studio, please click here.
image

Other than Excel files, the resources can also be used to store *.jar files, whose classes will then be loaded into the solution at startup, allowing for an easy extension of functionality while keeping control of the code. In the same vein, *.groovy or *.java files can be uploaded and used in the same manner. It has the same effect as defining the classes and methods in Globals > Scripts > Solution loaded but compared with thousands of lines with various classes and methods all together in the Solution loaded script, using a groovy file to define classes and then saving it in Resources > File is more user friendly in the maintenance phase.

If you upload text files or spreadsheets, the correct way to get the file address is:

Url url = this.getClass().getClassLoader().getResource(fileName);

In which “url” stands for the address of the file and “fileName” stands for the name of the file. Please remember that you must set up the Published Location as “/script_lib” otherwise the file won’t be accessible. After getting the URL of the file, you can use different kinds of reader to extract the information, for example BufferedReader for text and XSSFWorkbook for xlsx.

We have some scripts collected from old projects for reading txt files and spreadsheets. Feel free to copy them and use in your own project.

The following method reads a txt file and generates a map. Each line of the txt file should be a key and a value, separated with a certain separator (the default separator in the script is “,”). In order to use this method please copy it to Global script > Solution Loaded. No need to import other package for this method.

public static HashMap<String,String> createMapFromTxtFile(callingScript, String fileName, String separator = ",") {

	HashMap<String,String> map = new HashMap<String,String>();

	URL url = callingScript.getClass().getClassLoader().getResource(fileName);
	if (url==null) throw new RuntimeException("Failure to read file ["+fileName+"] as resource, no URL could be constructed for it");
	URI uri = url.toURI();
	File file = new File(uri);
	BufferedReader reader = new BufferedReader(new FileReader(file));
	String fileLine;
	try {
		while ((fileLine = reader.readLine()) != null) {
			String[] tt = fileLine.split(separator);
			String key = tt[0], value = tt[1];
			map.put(key,value);
		}
	} finally {
		try {
			reader.close();
		} catch (err) {}
	}

       return map;

}

In case of Excel files, you will need to import the jar files from Apache POI. The following class is used to extract information row to row from an xlsx file. To read Excel 2003 files, you can replace “xssf” with “hssf” in the script, although it is not recommended due to the inherent complexity of the old Excel format. It is preferred if possible to save the files in the 2007 format rather than legacy.

// START -- Read an xlsx resource file
// Class to import and read excel resource files
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

class ImportExcel {

    // Read excel sheet according to the file name + sheet number (starting from 0)
    // The content of the sheet is intended to be distributed in 2 columns, key and value

    public static List<List<String>> createTablefromExcelSheet(callingScript, String fileName, int readSheet, List<Integer> lRelevantColumns) {

        List<List<String>> llTable = [];

        URL url = callingScript.getClass().getClassLoader().getResource(fileName);
        if (url==null) throw new RuntimeException("Failure to read file ["+fileName+"] as resource, no URL could be constructed for it");
        URI uri = url.toURI();
        List<List<String>> dataList = new ArrayList<List<String>>();

        InputStream is = null;

        // Call the method "Read Excel file according to inputStream" in this class
        File file = new File(uri);
        is = new FileInputStream(file);
        dataList = readFile(is, readSheet);
        is.close();

        // Keep only columns you are interested in
        for(int i=0;i<dataList.size();i++){

            List<String> lRow = [];

            for(int j=0;j<lRelevantColumns.size();j++){

                lRow << dataList.get(i).get(lRelevantColumns.get(j));

            }

            llTable << lRow;

        }

        return llTable;

    }

    // Read Excel file according to inputStream
    public static List<List<String>> readFile(InputStream inputStream, int readSheet) {
        List<List<String>> dataList = null;
        try {
            // Choose the method for creating Workbook
            Workbook wb = null;
            wb = new XSSFWorkbook(inputStream);
            // sheet loop
            int sheetNum = sheetCirculation(wb);
            dataList = read(wb, readSheet);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return dataList;
    }

    // read data
    private static List<List<String>> read(Workbook wb,
            int sheets) {
        List<List<String>> dataList = new ArrayList<List<String>>();
        // total rows
        int totalRows = 0;
        // total cells
        int totalCells = 0;
        // first sheet
        Sheet sheet = wb.getSheetAt(sheets);
        // Number of row
        totalRows = sheet.getPhysicalNumberOfRows();
        // Number of cell
        if (totalRows >= 1 && sheet.getRow(0) != null) {
            totalCells = sheet.getRow(0).getPhysicalNumberOfCells();
        }

        // Iterate all rows
        for (int r = 0; r < totalRows; r++) {
            Row row = sheet.getRow(r);
            if (row == null) {
                continue;
            }
            List<String> rowLst = new ArrayList<String>();
            // Iterate all cells
            for (int c = 0; c < totalCells; c++) {
                Cell cell = row.getCell(c);
                String cellValue = "";
                if (null != cell) {
                    // Analyze the data type
                    switch (cell.getCellType()) {
                    case 'NUMERIC': // Number
                        cellValue = cell.getNumericCellValue() + "";
                        break;
                    case 'STRING': // String
                        cellValue = cell.getStringCellValue();
                        break;
                    case 'BOOLEAN': // Boolean
                        cellValue = cell.getBooleanCellValue() + "";
                        break;
                    case 'FORMULA': // Formula
                        cellValue = cell.getCellFormula() + "";
                        break;
                    case 'BLANK': // Blank
                        cellValue = "";
                        break;
                    case 'ERROR': // Error
                        cellValue = "Illegal character";
                        break;
                    default:
                        cellValue = "Unknown type";
                        break;
                    }
                }
                rowLst.add(cellValue);
            }
            // save the row r and the cell c
            dataList.add(rowLst);
        }
        return dataList;
    }

    private static int sheetCirculation(Workbook wb) {
        int sheetCount = -1;
        sheetCount = wb.getNumberOfSheets();
        return sheetCount;
    }
}

// END -- Read an xlsx resource file

The jar file needed are (taken from Apache POI 5.0.0, you can also use other versions):
image

To use this class, please import these jar and Excel files into Resources > File and set up the Published Location as “/script_lib” . Then copy the script to Global script > Solution Loaded and reload the engine in try out panel to check if there are any warnings. If everything goes well, you will be able to call the method ImportExcel.createTablefromExcelSheet with the following arguments: this, filename, sheet index, column index. The output of this method will be an array in which each item is a list containing the cell values of each row with the column specified in the argument. For example, if you run

ImportExcel.createTablefromExcelSheet(this, “filename.xlsx”, 0, [0,1,2,3]) 

(which standing for extracting the data of the first 4 columns of the first sheet of filename.xlsx) with the following spreadsheet:
image

You will get the following result:

[
    [Name, Type, Origen, Price], 
    [Apple, Fruit, Germany, 1.50/kg], 
    [Orange, Fruit, Spain, 0.80/kg], 
    [Kaki, Fruit, Spain, 2.60/kg], 
    [Kiwi, Fruit, New Zealand, 4.00/kg], 
    [Melon, Fruit, China, 2.00/unit], 
    [Mango, Fruit, Peru, 1.50/unit], 
    [Tomato, Fruit, Spain, 0.50/kg], 
    [Cucumber, Vegetable, France, 1.60/kg], 
    [Celery, Vegetable, Italy, 2.30/kg], 
    [Eggplant, Vegetable, England, 3.50/kg], 
    [Pea, Vegetable, Spain, 3.20/kg], 
    [Potato, Vegetable, Spain, 0.50/kg], 
    [Spinach, Vegetable, Spain, 1.00/kg]
]

Hope you enjoyed this article, feel free to ask here any questions you might have!

1 Like

This is good stuff! Resource files are a great way to store and use static information inside a solution - and all the same parsing techniques (and largely code) can be used for the results of API calls if the data can be made available dynamically via an endpoint.

I wanted to point out some interesting “Groovy” ways to handle some of these file (or in fact any collection / stream / etc) parsing / traversing actions too

[resource].withXXX

Groovy has a load of “withXXX” methods for handling the creation and cleanup of stream based resources (think of them as a specialised implementation of Java’s “try with resources”). They function like a try ... finally { close() } in that any resources they open will be closed / tidied when the closure returns OR throws an Exception

So the first example could remove most of the FileReader / BufferedReader and the try ... finally and instead use withReader:

new File(fileName).withReader('UTF-8') { reader ->
  // read file content
}

Additionally the Reader class provides some really useful closure based methods. In this case splitEachLine seems the perfect fit!

def dataMap = [:]
new File(fileName).withReader('UTF-8') { reader ->
  reader.splitEachLine(separator) { values -> {
    dataMap.put(values[0], values[1])
  }}
}

API calls?

Yes this works for API calls as well… assuming [endpoint path] points to a public endpoint returning the same data, the above example would work equally well with:

"[endpoint path]".toURL().withReader('UTF-8') { reader ->
  ...
}

.collect { ... } and withCloseable

The second Excel example could also be made “more Groovy” by using:

  • withCloseable {...} which will automatically close any resource it is called on at the end / failure of the closure - similar to withReader but called on any java.io.Closeable resource (here it is used for the XSSFWorkbook which is required to close and release the file)

  • collect { ... } which is similar to Javascript map() in that it allows the same action to be called on each entry in a collection to transform each entry into a result and the results of all these calls is returned as a single array

new XSSFWorkbook(new File(fileName)).withCloseable { workbook ->
    if (readSheet < 0 || readSheet >= workbook.numberOfSheets) {
        throw new RuntimeException("Invalid sheet number '${readSheet}' for resource [${fileName}]. Sheet Count: ${workbook.numberOfSheets}")
    }
    return workbook.getSheetAt(readSheet).collect { row ->
        lRelevantColumns.collect { column ->
            getCellValueAsString(row.getCell(column))
        }
    }
}

Bonus Excel FORMULA Values!

This is very specific to Excel parsing, not at all generic Groovy and in many cases irrelevant / not desired behaviour - depending on the Excel content. However you can get the most recent formula calculation from the Excel by checking for the cached data type and returning the typed value. Note this is the value as was last time it was calculated (most likely last save) so any formulas using “fluid” data such as today’s date will not have the same value as if the file was opened in Excel

private static String getCellValueAsString(Cell cell) {
    if (null != cell) {
        switch (cell.cellType) {
            case CellType.FORMULA: {
                // When the cell is of type formula
                // get the previously cached value (on save in Excel)
                switch (cell.cachedFormulaResultType) {
                    case CellType.NUMERIC:
                        return cell.numericCellValue
                    case CellType.STRING:
                        return cell.stringCellValue
                    case CellType.BOOLEAN:
                        return cell.booleanCellValue
                    case CellType.ERROR:
                        return "Error: ${FormulaError.forInt(cell.errorCellValue.string)}"
                }
            }
            case CellType.BLANK:
                return ''
            case CellType.ERROR:
                return "Error: ${FormulaError.forInt(cell.errorCellValue.string)}"
            default:
                return cell // All other types the toString is the value
        }
        return ''
    }
}
2 Likes