Obscuring Structured Data in Logs

Background

Many solutions need to work with sensitive data, which must therefore remain accessible for the lifetime of the session. Because the data is sensitive, every effort should be made to ensure that it is not persisted beyond the lifetime of the session.

There are two types of data within session logs (broadly speaking, and for the sake of this discussion at least!):

Unstructured Data

This is data such as inputs (and their parameters) and outputs (and their parameters) - anything which cannot be controlled by the solution because it is entered by the user, or which can contain data entered by the user, and which must be in plain text. This data cannot be controlled at design time, or necessarily even at runtime, so a different, post-processing approach is needed to control sensitive data found in unstructured data.

Structured Data

This is data which is fully under the control of the solution and does not need to be read outside the system in any given format (e.g. variable values). Here it is often possible to know at design time which variables will contain sensitive data, and so to prevent that data from entering the logs.

This document describes one approach to preventing the sensitive data in these known structured locations from reaching persisted storage (the logs).

Approach

The simplest approach is to create a single object to contain all of the project's sensitive data (or a number of different objects, if that is more appropriate for the solution scenario).

Scenario

For the sake of this example the solution is a very basic (contrived!) user welcome solution: the user tells the system their username, after which they can say hello and the system will greet them by name.

My username is JudeTheObscure

Hi JudeTheObscure

Hello

Passed secure data: SecureDataStore@d9d55dc
Username from secure data: JudeTheObscure

SecureDataStore Definition

In the Solution loaded global script, add the class definition:

class SecureDataStore {
    public String UserName;
}

The name is not important. SecureDataStore is used here because it describes the usage and helps anyone reading the logs (and warnings - see How it works / Things to note) understand why the content is not readable. In this example there is a single property, UserName, which is a String, but any number and type of properties can be included depending on the data that needs storing.
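As a rough sketch of a store holding more than one property, the class might look like the following. Every field other than UserName (AccountNumber, DateOfBirth, SecurityAnswers) is a hypothetical name chosen only for illustration, and the surrounding demo class is an assumption added so the sketch can be run as plain Java outside the engine.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Sketch only: every field except UserName is a hypothetical example.
class SecureDataStore {
    public String UserName;
    public String AccountNumber;                              // hypothetical
    public LocalDate DateOfBirth;                             // hypothetical
    public List<String> SecurityAnswers = new ArrayList<>();  // hypothetical
}

// Hypothetical demo class, not part of the solution itself.
public class SecureDataStoreFieldsDemo {
    public static void main(String[] args) {
        SecureDataStore store = new SecureDataStore();
        store.UserName = "JudeTheObscure";
        store.DateOfBirth = LocalDate.of(1990, 1, 1);
        store.SecurityAnswers.add("example answer");

        // Values are read back through the properties, exactly as with UserName.
        System.out.println(store.UserName + " has "
                + store.SecurityAnswers.size() + " stored answer(s)");
    }
}
```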

Instantiation

The simplest way to use this object is to define a Global Variable with an instance of the class - this ensures there is one per session and it is always available.

  1. Create a Global Variable: sessionDataStore
  2. Default value: new SecureDataStore()
  3. Done!

Usage

This sessionDataStore can now be used anywhere a global variable can be accessed.

Set in a Trigger

my >> username >> is >> (*)^{ sessionDataStore.UserName = _USED_WORDS }

my username is JudeTheObscure

This sets UserName on the Global Variable instance for this user session to “JudeTheObscure”.

Read in an Output

Session secure data: ${sessionDataStore}
Username from secure data: ${sessionDataStore.UserName}

Will be output as:

Session secure data: SecureDataStore@21e5e9df
Username from secure data: JudeTheObscure

The first line indicates what will be seen in the logs; the second shows the value as it can be accessed from within scripts, i.e. the data is still there, it is just not persisted as text when logged.
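The first line has that shape because the class does not define toString(), so the default text form (class name, "@", then the hash code in hexadecimal) is used whenever the object is turned into a string. A minimal Java sketch of the same behaviour outside the engine follows; the demo class name is an assumption for illustration, and the hash value will differ from run to run.

```java
// Same shape as the class from the solution's global script.
class SecureDataStore {
    public String UserName;
}

// Hypothetical demo class, not part of the solution itself.
public class ToStringDemo {
    public static void main(String[] args) {
        SecureDataStore store = new SecureDataStore();
        store.UserName = "JudeTheObscure";

        // Default Object.toString(): class name + "@" + hex hash code, no field values.
        System.out.println("Session secure data: " + store);

        // Reading the property directly still gives the real value.
        System.out.println("Username from secure data: " + store.UserName);
    }
}
```

One consequence worth keeping in mind: adding a toString() that prints the fields would presumably put the values back into any text built from the object, so it seems safest to leave it undefined.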

Check the logs

Initialisation

This element in the log file records the initialisation of the Global Variable value at the start of the session:

<element type="variable-change">
  <change-type>initialization</change-type>
  <variable name="SessionDataStore" scope="session">{"untraceable-object":"SecureDataStore","cause":"untraceable object type"}</variable>
</element>

You can see that this value does not expose the sensitive content, as it is held within a SecureDataStore object.

Search for “Jude”

A search of the raw session data for all occurrences of the un-obscured data “Jude” returns:

<request-text>my username is JudeTheObscure</request-text>
<value>my username is JudeTheObscure</value>
<response-text>Hi JudeTheObscure</response-text>
<sentence begin-index="0" text="my username is JudeTheObscure">
<word begin-index="15" original="JudeTheObscure" simplified="judetheobscure" final="judetheobscure"/>
<word>JudeTheObscure</word>
<output-text>Hi JudeTheObscure</output-text>
<response-text>&lt;br/&gt;Username from secure data: JudeTheObscure</response-text>
<output-text>&lt;br/&gt;Username from secure data: JudeTheObscure</output-text>

This shows that, as described in the introduction, the sensitive data still exists in unstructured data - inputs, outputs and input processing. Unstructured data is not the focus of this document. However, it is possible (if the frontend and solution are written to support it) to pass data in an encrypted form in input parameters. It is then important to handle that data with care within the solution, so that if it needs decrypting there, the unencrypted value is only ever stored or passed around as a SecureDataStore or a similarly non-serialisable object.

How it works / Things to note

The basic approach here is to store the data in an object which is not text serialisable. That is to say, when the engine attempts to write it to text in order to store it in the logs, the content is not written in a retrievable form. This has an impact on how the engine treats these objects internally, as it is not able to serialise them across boundaries (e.g. when passing them to a flow), so there will be warnings at the point that this is attempted:

Warning: 04/19/2018 10:51:52: Flow [Flow 2] (ID=346c3a52-91bc-4d4d-b364-4b71cca92483), vertex (ID=14188df9-e8b9-46d0-aa9b-0d138111b9ff), in-transfer variable mapping [sessionDataStore → sds]: Value of source variable [sessionDataStore] is not cloneable, transfering variable via reference

As described in the warning this results in the object being passed by reference - meaning all flows have the same instance of this variable.
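The effect of sharing one instance can be sketched in plain Java. The method flowSetsUserName is a hypothetical stand-in for a flow that receives the store through a variable mapping (sessionDataStore → sds, as in the warning); it is not an engine API.

```java
// Same shape as the class from the solution's global script.
class SecureDataStore {
    public String UserName;
}

// Hypothetical demo class, not part of the solution itself.
public class ReferenceDemo {
    // Hypothetical stand-in for a flow: it receives a reference to the
    // caller's object, not a copy, so its changes are visible to the caller.
    static void flowSetsUserName(SecureDataStore sds, String name) {
        sds.UserName = name;
    }

    public static void main(String[] args) {
        SecureDataStore sessionDataStore = new SecureDataStore();

        flowSetsUserName(sessionDataStore, "JudeTheObscure");

        // The change made inside the "flow" shows up on the original variable.
        System.out.println(sessionDataStore.UserName);
    }
}
```

In practice this means a flow cannot safely treat its mapped variable as a private copy: anything it changes on the store is changed for the whole session.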

If an object of this type is passed around (instead of being kept as a global object) then these warnings will occur in Tryout, but the logs will still not contain the contents of the object.

The logs will instead contain an element similar to the one logged at initialisation:

<element type="flow-trigger">
  <trigger-id>81795e1f-e666-402f-8056-e05c86f2ad7b</trigger-id>
  <variable name="SessionDataStore" scope="session">{"untraceable-object":"SecureDataStore","cause":"untraceable object type"}</variable>