Numeric vs String Attributes – which is more efficient?

Introduction

An interesting question was posted on the TM1Forum about the most efficient way to craft a rule. This raised the subsequent question of how TM1 attributes are stored behind the scenes, and the impact of that on performance.

The documentation is vague on this point, and, although the arguments presented in the TM1forum post made sense, there wasn’t a definitive answer.

I thought I’d put it to the test.

The Method

I set up a cube with a large item dimension (4,000 elements) and a measures dimension containing an element for each calculation technique I wanted to test. I then wrote a rule for each measure, using one of the methods suggested in the TM1Forum post, as follows:

  • Measure 1: Used an IF statement to test if the category name ends with a ‘1’ or a ‘2’, and returned 1 if it does.
  • Measure 2: Added a numeric attribute to the category dimension which would contain 1 if the rule should return 1.
  • Measure 3: Added a string attribute to the category dimension which would contain ‘1’ if the rule should return 1.
  • Measure 4: Used the string attribute, but converted it to a numeric value before checking if it was 1 or 0.

I crafted a cube view that contained roll-ups of all the dimensions except the measures. I then calculated the cube view for each measure and timed how long it took to get a result.

I repeated the test three times for each measure, arbitrarily altering the rule and re-saving to ensure TM1 wouldn’t use cached values.

The Results

Interestingly, the test seems to have confirmed the assertion on the TM1 forum post that numeric values are actually stored as strings and converted. Here are the numbers, in seconds and %, for the detail-oriented:

Measure 1

Measure 2

Measure 3

Measure 4

Total

                     39

                     91

                     88

                   108

           326

                     43

                     99

                     97

                   125

           364

                     43

                   100

                     97

                   125

           365

                     42

                     97

                     94

                   119

           352

12%

27%

27%

34%

100%

Note:  the discrepancy between the first test and the subsequent ones can be explained by the fact that my laptop was running on battery for the latter ones.

It is clear that using a simple IF statement, even one containing two SUBST calls in its conditional, is more than double as efficient as using an attribute to determine behaviour. Of course, you get many benefits from using the attribute approach, so in many cases, it is still worth the overhead.

Converting the attribute to its numeric equivalent using NUMBR within your rule is markedly slower than comparing the string without conversion, but there is very little difference between using a string attribute over a numeric one. One is roughly as efficient as the other when reading from the attribute cube.

This means that numeric attributes are indeed most likely stored as strings, but the overhead of the conversion is negligible, and does not operate in the same way as the NUMBR function.

I would guess that internally TM1 skips validation of the string during the conversion, as the UI makes it impossible to enter a non-numeric value into a numeric attribute. The NUMBR function can’t do this, as it can accept any string as input.

I found these results very interesting, and they give us a framework to determine which method is best practice.

Conclusion

So what can we to take away from this small test? Here are a few tips:

  1. Using hard-coded logic is likely to be more efficient. However, hard-coding always goes against maintainability, so only do this if you’re sure the logic will never change, nor will the elements involved! This is a big assumption to make, so be very careful with it! Even when you can be sure, only choose this method if speed is the #1 priority in the model.
  2. When flexibility is needed, use a numeric attribute as a flag to determine rule behaviour. Since there is no significant overhead to using a numeric attribute over a string, the best choice is to go with the numeric. TM1 could one day change the way it stores numeric attributes, which could result in this approach being more efficient than storing at a string. However, it is unlikely that any changes in TM1 will lead to a string attribute becoming more efficient to read than a numeric one.
  3. Never convert your attributes manually using the NUMBR function. There is no benefit to this method, and it is significantly slower, so steer well clear of it.
I hope this article helps you make the right call when writing dynamic rules in the future! Happy modelling!
I’ve attached the model I used for testing, and the spreadsheet containing the results I obtained.

TST Model Data.zip (69.00 kb)

TM1 Attribute Tests.xlsx (10.09 kb)