We put a fair amount of thought into how to calculate the index, guided by how we wanted the index to behave. Since we were going to track ‘going-up’, ‘going-down’ or ‘stayed-the-same’ for each of the questions, one obvious choice was a diffusion-index methodology (i.e., index value = [# gone-up + 50% × # stayed-the-same]) or some variation of it, such as a weighted average of the responses.
The issue with both of the above approaches was that the index would have stayed range-bound, i.e., theoretically it would always be in the 0-100 range (if diffusion), or in some similarly bounded range. This would have provided a good comparison to the previous month, but would not have allowed the kind of perspective we needed over time. In other words, the diffusion or weighted average approaches would have produced relative values (relative only to the individual previous month) and would not have provided an absolute perspective over time.
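To make the range-bound behaviour concrete, here is a minimal sketch of a diffusion-style calculation. It assumes the counts are normalised by the total number of responses so that the result is a percentage; the function name and the sample counts are purely illustrative and are not taken from the actual survey.

```python
def diffusion_index(gone_up: int, stayed_same: int, gone_down: int) -> float:
    """Share of 'gone-up' responses plus half the 'stayed-the-same' share, in percent.

    Assumption: counts are divided by the total number of responses.
    """
    total = gone_up + stayed_same + gone_down
    if total == 0:
        raise ValueError("need at least one response")
    return 100.0 * (gone_up + 0.5 * stayed_same) / total


# Whatever the responses are, the value can never leave the 0-100 range.
print(diffusion_index(10, 0, 0))  # every response 'gone-up'   -> 100.0
print(diffusion_index(0, 0, 10))  # every response 'gone-down' -> 0.0
print(diffusion_index(3, 4, 3))   # evenly split               -> 50.0
```

Because each month is computed only from that month's responses, nothing in the value carries over from earlier months.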
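For example, consider the following extreme sequence of events: the index reads +100 in December, -100 in every month from January through June, and +100 again in July.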
The issue is that the +100 in July is not the same as the +100 in the December of the prior year, because the July index value ignores the six months of continued decline that came before it.
Compared to a stock index, this would be akin to seeing the monthly returns and not the absolute level of the index. In other words, consecutive monthly returns of -20%, +15%, -10%, 0%, +20%, -5% amount to index values (assuming a starting value of 100) of 80, 92, 82.8, 82.8, 99.36, 94.39 respectively. The question we need to address is whether we want the returns of -20%, +15%, -10%, 0%, +20%, -5% to be the index, or the absolute values of 80, 92, 82.8, 82.8, 99.36, 94.39, which allow much better comparison over time.
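The stock-index analogy can be checked with a few lines of Python. This is only a sketch of ordinary (discrete) compounding; the function name `index_levels` is a hypothetical helper introduced for illustration.

```python
def index_levels(returns, start=100.0):
    """Compound a sequence of periodic returns into absolute index levels."""
    levels = []
    level = start
    for r in returns:
        level *= 1.0 + r  # discrete compounding: apply this period's return
        levels.append(level)
    return levels


monthly_returns = [-0.20, 0.15, -0.10, 0.0, 0.20, -0.05]
print([round(v, 2) for v in index_levels(monthly_returns)])
# [80.0, 92.0, 82.8, 82.8, 99.36, 94.39]
```

The list of returns only says how each month compared with the one before; the list of levels shows where the index stands relative to its starting point.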
We preferred the latter. To continue our example, with the diffusion index at +100 in Dec, -100 in each of Jan-Jun 2011, and +100 in July, and assuming we have set ‘gone-up’ = 15%, ‘stayed-the-same’ = 0%, and ‘gone-down’ = -15%, the absolute values would be 100 (beginning Dec), 115 (Dec end), 97.75 (Jan end), 83.09 (Feb end), 70.62 (Mar end), 60.03 (Apr end), 51.03 (May end), 43.37 (Jun end), 49.88 (Jul end).
This is more useful: one can see that the cumulative index value at the end of July is roughly half of what it was at the beginning of Dec. Even with this approach, the relative changes across months are not lost; they are trackable as the first derivative of the cumulative index.
We also made an additional refinement at this point, which was to use continuous compounding. Assume we start a month with an index value of 100, and we get one month of ‘gone-down’ immediately followed by a month of ‘gone-up’. We would like to come back to a value of 100 if this happens, as the ‘gone-down’ has been offset by the ‘gone-up’.
Without continuous compounding, the index will first go from 100 to 85 [=100*(1 – 15%)], and then to 97.75 [=85*(1 + 15%)]. This is not the result we want, because we want the index to come back up to 100. Therefore, we use continuously compounded rates and not discrete rates. If we do that, the index will go from 100 to 86.07 [=100*e^(-0.15), or 100*exp(-0.15) in Excel] and back to 100 [=86.07*exp(0.15)]. If you are still reading, thank you.
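The round trip is easy to verify numerically. The sketch below simply reproduces the arithmetic above for one ‘gone-down’ month followed by one ‘gone-up’ month; the variable names are illustrative.

```python
import math

RATE = 0.15   # the illustrative 15% rate used in the example
start = 100.0

# Discrete compounding: down 15%, then up 15%.
discrete_down = start * (1 - RATE)
discrete_back = discrete_down * (1 + RATE)

# Continuous compounding: multiply by exp(-0.15), then by exp(+0.15).
continuous_down = start * math.exp(-RATE)
continuous_back = continuous_down * math.exp(RATE)

print(round(discrete_down, 2), round(discrete_back, 2))      # 85.0 97.75
print(round(continuous_down, 2), round(continuous_back, 2))  # 86.07 100.0
```

The continuous version returns to 100 because exp(-0.15) * exp(0.15) = exp(0) = 1, whereas (1 - 0.15) * (1 + 0.15) = 0.9775.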
The ‘absolute’ version of the index provides another advantage, which is that the index is no longer predictable over time and follows a random walk based on a stationary process. This is very similar to tradable securities, whose price can move anywhere, depending only on its previous value. Each month’s ‘returns’ (i.e., the responses) are the stationary process, and their integration into the higher-level index is calculated as: ICS_{t} = ICS_{t-1} * e^{S_{t}}, where S_{t} represents the month’s responses converted to a percentage (explained below) and ICS_{t} is the ‘Index of Cyber Security’ at time t.
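As a rough illustration of the recurrence ICS_{t} = ICS_{t-1} * e^{S_{t}}, the sketch below applies it to the earlier extreme example (Dec up, Jan-Jun down, Jul up). The S_{t} values of +0.15 and -0.15 are the illustrative rates from that example, not actual survey results, and the function names are hypothetical.

```python
import math


def next_ics(previous_ics: float, s_t: float) -> float:
    """One step of the recurrence: ICS_t = ICS_{t-1} * exp(S_t)."""
    return previous_ics * math.exp(s_t)


def ics_series(monthly_s, start=100.0):
    """Apply the recurrence month by month and return every index level."""
    levels = []
    ics = start
    for s_t in monthly_s:
        ics = next_ics(ics, s_t)
        levels.append(ics)
    return levels


# Illustrative S_t values only: Dec up, Jan-Jun down, Jul up.
monthly_s = [0.15] + [-0.15] * 6 + [0.15]
levels = ics_series(monthly_s)
print(round(levels[-1], 2))  # 54.88
```

Since the exponents add, the final level equals the starting value times exp of the sum of all S_{t}; here the sum is -0.60, so each ‘gone-up’ month exactly cancels one ‘gone-down’ month.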
A higher index value indicates a perception of increasing risk, while a lower index value indicates the opposite.
To address all of the above, we decided on the following approach: