Array Types in Turbo Integrator

Introduction

I noticed a 4-year-old post over on the TM1 Forum site about simulating array variables in Turbo Integrator.

It has been a persistent topic, so I went back through the thread and was surprised to see that there has been no mention of using character-delimited strings as array variables. I do this often, and it is one of the foundational concepts in libraries like Bedrock, so I thought it was worth covering in some detail.

In this article, I’ll also be using the concepts covered in my Returning Values from Turbo Integrator article, so be sure to read that too if you haven’t already.

Array Approaches

Most scripting and programming languages have support for arrays. Unfortunately, TI is not one of them, and there are many times when such a data type would be advantageous.

The thread on TM1 Forums covers a few approaches, but here are the main ones:

  1. Multiple variables
    This approach is as simple as it gets. Simply define variables with sequential names, such as V1, V2, V3, V4, etc., up to the maximum size you expect your array to be. You can iterate through these variables in a WHILE loop by incrementing an index variable, then having an IF statement for each array variable you have declared.

    You can immediately see how this can become highly cumbersome and unmaintainable for arrays of more than a few elements, as you need an IF statement for each one!
  2. Using a temporary dimension
    This approach is a bit smarter, as it avoids the multiple IF statements. You simply have your TI process create a temporary dimension and use that to store your array values. The element names can be strings, but if you need to store numbers, you can use elements named Item1, Item2, Item3, etc., and store the value in a numeric attribute.

    This can be a great approach, but it can be a bit messy and has concurrency issues. You need to create a temp dimension name that is unique each time the TI process is run, and if you’re running many TIs frequently, finding a unique name can be difficult. You also need to handle deleting the dimension once you’re done.

    It is also possible to use a permanent dimension for this purpose and just create and destroy attributes as needed. Again, this has concurrency issues, as you need unique attribute names, and can be confusing to the user.

    In both cases, it seems a very heavyweight solution for something as simple as arrays, and might not perform as well as a solution that avoids TM1 objects.

    You also need to worry about the maximum number of elements in the item dimension, as that forms the upper limit of the array.

  3. Using a temporary cube
    This approach is similar to the above, but the values are stored in a cube instead of element names or attributes. This has the same inherent caveats and benefits as the dimension approach, but is slightly more visible and accessible to a user, which can be an advantage in some cases.
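
To make the first approach concrete, here is a minimal sketch of a three-element "array" built from separate variables. The values, the vSize, vIndex and vCurrent names, and the output path are assumptions made purely for illustration.

```
# Approach 1 sketch: three separate variables standing in for an array.
# All names and values here are hypothetical.
V1 = 'North';
V2 = 'South';
V3 = 'East';
vSize = 3;
vOutputFile = 'C:\Temp\ArraySketch.txt';

vIndex = 1;
WHILE(vIndex <= vSize);
	# One branch per variable: this is the part that does not scale
	IF(vIndex = 1);
		vCurrent = V1;
	ELSEIF(vIndex = 2);
		vCurrent = V2;
	ELSEIF(vIndex = 3);
		vCurrent = V3;
	ENDIF;
	ASCIIOUTPUT(vOutputFile, vCurrent);
	vIndex = vIndex + 1;
END;
```

Every extra element means another variable and another ELSEIF branch, which is exactly the maintenance problem described in the first approach.
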
None of these solutions are perfect, but they will serve their purpose. There is a fourth option, however, which is the focus of this article.

Using Character-Delimited Strings

The concept behind this is very simple. Instead of using TM1 objects to store arrays of values, we use a string separated by a particular character. A simple example would be “V1,V2,V3”.

In Turbo Integrator, it’s possible to split this string using SCAN to find the locations of the commas and SUBST to split the list into its component values.
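
As a rough sketch of that idea, the loop below walks a comma-delimited string one item at a time. The variable names (vList, vDelim, vRemaining, vPos, vItem, vIndex, vDone) and the output path are hypothetical, and the code assumes a single-character delimiter with no quoting.

```
# Walk a delimited list using SCAN and SUBST (illustrative sketch only)
vList = 'V1,V2,V3';
vDelim = ',';
vOutputFile = 'C:\Temp\ListSketch.txt';

vRemaining = vList;
vIndex = 1;
vDone = 0;
WHILE(vDone = 0);
	# SCAN returns the position of the first delimiter, or 0 if there is none
	vPos = SCAN(vDelim, vRemaining);
	IF(vPos = 0);
		# No delimiter left: whatever remains is the final item
		vItem = vRemaining;
		vDone = 1;
	ELSE;
		# Everything before the delimiter is the current item
		vItem = SUBST(vRemaining, 1, vPos - 1);
		# Drop the item and its delimiter from the front of the working string
		vRemaining = SUBST(vRemaining, vPos + 1, LONG(vRemaining) - vPos);
	ENDIF;
	ASCIIOUTPUT(vOutputFile, '[' | TRIM(STR(vIndex, 4, 0)) | '] = ' | vItem);
	vIndex = vIndex + 1;
END;
```

For the list above, the loop visits V1, V2 and V3 in that order. Working on a shrinking copy (vRemaining) keeps the original list intact and means each pass only needs a single SCAN.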

This does not use any TM1 objects, and is appealing in that it’s a native TI solution. It should also be relatively fast, as SCAN and SUBST are very simple functions and well-optimized for performance. It is also very easy to pass values between processes using the technique detailed in my previous article Returning Values from Turbo Integrator.

There are a few drawbacks:

  • There is an upper limit to string length in TI, which you can hit with large lists. For TM1 10, the limit is 65k characters.
  • Storing numeric values involves a conversion from string to number, which can affect performance.
  • Coding such an approach is cumbersome, and you often find yourself writing the same or similar WHILE loops, which clutter up your TI processes.
For small lists, the first two points are not a major issue. However, the third point never goes away, and writing the same code over and over makes your processes hard to maintain and error-prone. Since debugging is very difficult in TI, you don’t want to write more code than is absolutely necessary or you can quickly find yourself chasing your tail!
To mitigate this, we need to use a best practice approach to re-usable code. This can be done in TI by creating a series of well-defined library processes. Combining this with the concept of returning values, we can encapsulate all of our array processing tasks in our library processes, avoiding the need to repeat the code patterns and making your code maintainable once more.
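
To give a feel for what such a library process might look like, here is a minimal sketch of a hypothetical process called 'Sample.List.ItemCount'. It is not the code from the library described in this article. It assumes two string parameters defined on the process, pItemList and pDelimiter, a single-character delimiter, and no quoting. The OutputItemCount global variable name is borrowed from the sample further down.

```
# Prolog of a hypothetical library process 'Sample.List.ItemCount'.
# Parameters (defined on the process, not in code): pItemList, pDelimiter.

# Declare the global variable the caller will read the result from
NumericGlobalVariable('OutputItemCount');

IF(pItemList @= '');
	# An empty string is treated here as an empty list
	OutputItemCount = 0;
ELSE;
	# A non-empty list has one more item than it has delimiters
	OutputItemCount = 1;
	vRemaining = pItemList;
	vPos = SCAN(pDelimiter, vRemaining);
	WHILE(vPos > 0);
		OutputItemCount = OutputItemCount + 1;
		# Discard everything up to and including the delimiter just found
		vRemaining = SUBST(vRemaining, vPos + 1, LONG(vRemaining) - vPos);
		vPos = SCAN(pDelimiter, vRemaining);
	END;
ENDIF;
```

A calling process declares the same global variable, runs the library process, and then reads the result:

```
NumericGlobalVariable('OutputItemCount');
ExecuteProcess('Sample.List.ItemCount', 'pItemList', 'A,B,C', 'pDelimiter', ',');
# OutputItemCount now holds 3
vItemCount = OutputItemCount;
```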

An Array Processing Library

Rather than leave the task of writing library functions as an exercise for the reader, I decided to take a shot at writing it myself.
The Flow array processing library contains a number of useful functions to assist in using delimited lists as arrays:
  • ItemRead: Reads an item from the list at a specified index
  • ItemCount: Returns the number of items in the list
  • ItemAdd: Adds an item to the list at the specified index
  • ItemRemove: Deletes an item from the list at the specified index
  • ItemReplace: Updates the value at a specified index with a new value
  • ItemFind: Locates a value in the list and returns the index of the value, if found
Using these basic functions, one can perform most array functions without worrying about the implementation details. I make no warranties about the quality of the library, but it is, at least, a great starting point for a more robust implementation.
It has the following features:
  • Value quoting: What if a value in the list needs to include the character that is being used as a delimiter? The library handles this with quoting. An example would be: “1,2,|3,4|,5”. If you specify the bar character (|) as your quote character, the library will retrieve the third value in the list as “|3,4|”.
  • Opening and closing quote characters: The library supports specifying an opening and closing quote character. This allows you to specify the above example as “1,2,[3,4],5”, which is much more readable. If the closing quote character is not specified, it is assumed to be the same as the opening quote character.
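
The quoting behaviour described above could be approximated along the following lines. This is only a sketch of the general technique, not the library's actual implementation. All variable names and the output path are hypothetical, and like the library it makes no attempt to handle quote characters appearing inside a quoted value.

```
# Split a list on commas, ignoring commas between the quote characters
vList = '1,2,[3,4],5';
vDelim = ',';
vOpenQuote = '[';
vCloseQuote = ']';
vOutputFile = 'C:\Temp\QuoteSketch.txt';

vItem = '';
vInQuote = 0;
vIndex = 1;
vCharPos = 1;
vLength = LONG(vList);
WHILE(vCharPos <= vLength);
	vChar = SUBST(vList, vCharPos, 1);
	IF(vInQuote = 1);
		# Inside quotes: keep every character and watch for the closing quote
		vItem = vItem | vChar;
		IF(vChar @= vCloseQuote);
			vInQuote = 0;
		ENDIF;
	ELSEIF(vChar @= vDelim);
		# Unquoted delimiter: the current item is complete
		ASCIIOUTPUT(vOutputFile, '[' | TRIM(STR(vIndex, 4, 0)) | '] = ' | vItem);
		vIndex = vIndex + 1;
		vItem = '';
	ELSE;
		# Ordinary character, or an opening quote that starts a quoted value
		vItem = vItem | vChar;
		IF(vChar @= vOpenQuote);
			vInQuote = 1;
		ENDIF;
	ENDIF;
	vCharPos = vCharPos + 1;
END;
# The last item has no trailing delimiter, so write it out after the loop
ASCIIOUTPUT(vOutputFile, '[' | TRIM(STR(vIndex, 4, 0)) | '] = ' | vItem);
```

With the list above, the sketch produces four items, the third being [3,4] with its quote characters retained. Because the closing quote is only checked while inside a quoted value, the same logic also works when both quote characters are set to the bar character.
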
Desired features:
  • Escaping quote characters: Currently the library does not support escaping the quote characters within the value. This means you cannot use either of your quote characters in the value, or you will get unpredictable results. Ideally, the library would detect quote characters within the list values and escape them automatically, and un-escape them when reading them back out.
  • Error Handling: At present the error detection and handling in the library is rudimentary. If a list is badly formed, it would be difficult to detect and resolve in code.
  • Performance: The current implementation reads the entire list multiple times, so the cost can grow much faster than the list itself in certain implementation patterns, for example when a library process is called once per item inside a loop. Ideally, the library would support ways to optimize algorithms and perhaps a caching option. However, as this library is designed for small lists, the law of diminishing returns may apply to such features.

Using the library

A sample process is included to demonstrate the basic functions of the library.

The following code:

pItemList = 'A,B,C,D,E,F';
pOutputFolder = 'C:\Temp';

NumericGlobalVariable('OutputItemCount');
StringGlobalVariable('OutputItem');
StringGlobalVariable('OutputItemList');
NumericGlobalVariable('OutputItemIndex');
NumericGlobalVariable('OutputItemLocation');
NumericGlobalVariable('OutputSearchItemIndex');

vProcessName = 'Flow.String.List.Sample1';
vOutputFile = pOutputFolder | '\' | vProcessName | '.Output.txt';

ASCIIOUTPUT(vOutputFile, 'Working on list [' | pItemList | ']');

ExecuteProcess('Flow.String.List.ItemCount', 'pItemList', pItemList);
vItemCount = OutputItemCount;
ASCIIOUTPUT(vOutputFile, 'Item count: ' | TRIM(STR(vItemCount, 4, 0)));

ExecuteProcess('Flow.String.List.ItemAdd', 'pItemList', pItemList, 'pItemIndex', 4, 'pNewItem', 'Added');
ASCIIOUTPUT(vOutputFile, 'Item added at index 4: [' | OutputItemList | ']');

ExecuteProcess('Flow.String.List.ItemRemove', 'pItemList', OutputItemList, 'pItemIndex', 2);
ASCIIOUTPUT(vOutputFile, 'Item removed at index 2: [' | OutputItemList | ']');

ExecuteProcess('Flow.String.List.ItemReplace', 'pItemList', OutputItemList, 'pItemIndex', 5, 'pNewItem', 'Replaced');
ASCIIOUTPUT(vOutputFile, 'Item replaced at index 5: [' | OutputItemList | ']');

ASCIIOUTPUT(vOutputFile, 'Finding index of item "F"...');
ExecuteProcess('Flow.String.List.ItemFind', 'pItemList', OutputItemList, 'pSearchItem', 'F');
ASCIIOUTPUT(vOutputFile, 'Index of search item: ' | TRIM(STR(OutputSearchItemIndex, 15, 0)) );

ASCIIOUTPUT(vOutputFile, 'Finding index of item "G"...');
ExecuteProcess('Flow.String.List.ItemFind', 'pItemList', OutputItemList, 'pSearchItem', 'G');
ASCIIOUTPUT(vOutputFile, 'Index of search item: ' | TRIM(STR(OutputSearchItemIndex, 15, 0)) );

ASCIIOUTPUT(vOutputFile, 'Listing all current items...');
vCurrentItemIndex = 1;
vCurrentItem = '';
WHILE(vCurrentItemIndex <= vItemCount);
	ExecuteProcess('Flow.String.List.ItemRead', 'pItemList', OutputItemList, 'pItemIndex', vCurrentItemIndex);
	vCurrentItem = OutputItem;
	ASCIIOUTPUT(vOutputFile, '[' | TRIM(STR(vCurrentItemIndex, 4, 0)) | ']' | ' = ' | vCurrentItem);
	vCurrentItemIndex = vCurrentItemIndex + 1;
END;

ASCIIOUTPUT(vOutputFile, 'Listing all original items...');
vCurrentItemIndex = 1;
vCurrentItem = '';
WHILE(vCurrentItemIndex <= vItemCount);
	ExecuteProcess('Flow.String.List.ItemRead', 'pItemList', pItemList, 'pItemIndex', vCurrentItemIndex);
	vCurrentItem = OutputItem;
	ASCIIOUTPUT(vOutputFile, '[' | TRIM(STR(vCurrentItemIndex, 4, 0)) | ']' | ' = ' | vCurrentItem);
	vCurrentItemIndex = vCurrentItemIndex + 1;
END;

Yields the following output file:

Working on list [A,B,C,D,E,F]
Item count: 6
Item added at index 4: [A,B,C,Added,D,E,F]
Item removed at index 2: [A,C,Added,D,E,F]
Item replaced at index 5: [A,C,Added,D,Replaced,F]
Finding index of item "F"...
Index of search item: 6
Finding index of item "G"...
Index of search item: 0
Listing all current items...
[1] = A
[2] = C
[3] = Added
[4] = D
[5] = Replaced
[6] = F
Listing all original items...
[1] = A
[2] = B
[3] = C
[4] = D
[5] = E
[6] = F

As you can see, the sample code is much cleaner and more maintainable than more ad-hoc implementations of the same technique.

Conclusion

There is no perfect solution to simulating array variables in Turbo Integrator, but there are some work-arounds that can work in various scenarios.

Hopefully the library I have provided will help other developers get started using the character-delimited string technique, and will at least serve as an example of reusability and maintainability in Turbo Integrator processes.

Flow.String.List.zip (8.03 kb)