Technical

This page summarises the technical implementation of MoodSnap and its analytics framework. The full source code is available at www.github.com/peterrohde/MoodSnap.

Mathematical techniques

Below we summarise the mathematical definitions of the various statistical measures provided by MoodSnap.

Moving average

The moving average is calculated as a sliding simple moving average (SMA).

B(s,w)=\frac{1}{w}\sum_{i=s-w}^{s}x_i
Here s is the current point in time, w is the window size, and x_i are the data-points.

MoodSnap uses a default window size of two weeks (14 days). The moving average is a lagging indicator, meaning its value lags behind the current value because it is averaged over a past period. It can be interpreted as a trend.
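As a rough illustration of the lag, the following Python sketch computes a trailing SMA. It is not taken from the MoodSnap source: the function name `moving_average`, the sample data, and the convention that exactly the w most recent samples are averaged (with no value until a full window exists) are all assumptions made for the example.

```python
from typing import Optional, Sequence


def moving_average(x: Sequence[float], s: int, w: int = 14) -> Optional[float]:
    """Simple moving average over the w samples ending at index s.

    Assumption: the window is x[s-w+1] .. x[s], and None is returned
    until a full window of history is available.
    """
    if w <= 0:
        raise ValueError("window size must be positive")
    if s < w - 1 or s >= len(x):
        return None
    window = x[s - w + 1 : s + 1]
    return sum(window) / w


# Hypothetical daily mood levels, for illustration only.
levels = [0, 0, 0, 4, 4, 4, 4]
trend = [moving_average(levels, s, w=3) for s in range(len(levels))]
```

In this made-up series the level jumps to 4 at index 3, but the three-sample average only reaches 4 at index 5, which is the lag described above.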

Volatility

The volatility measure is calculated as a sliding standard deviation,

V(s,w)=\sqrt{\frac{1}{w}\sum_{i=s-w}^{s}(x_i-\mu)^2},

where \mu=B(s,w) is the mean within the window (SMA). Like the moving average, the volatility metric is a lagging indicator. It measures how stable a variable is within the window. If the variable remains constant, the volatility is zero. It reaches its maximum when the variable spends half of the window at each of the two opposing extremes, whether through a single sudden flip from one extreme to the other or through oscillations between them.
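The behaviour at the two limits can be checked with a small Python sketch. It assumes the same trailing window as the moving average and a population standard deviation; the function name `volatility` and the 0 to 4 value range in the sample data are hypothetical and chosen only for illustration.

```python
import math
from typing import Optional, Sequence


def volatility(x: Sequence[float], s: int, w: int = 14) -> Optional[float]:
    """Sliding population standard deviation over the w samples ending at s.

    Assumption: the window is x[s-w+1] .. x[s], and None is returned
    until a full window of history is available.
    """
    if w <= 0:
        raise ValueError("window size must be positive")
    if s < w - 1 or s >= len(x):
        return None
    window = x[s - w + 1 : s + 1]
    mu = sum(window) / w  # mean within the window (the SMA)
    return math.sqrt(sum((xi - mu) ** 2 for xi in window) / w)


# Hypothetical values on a 0..4 range, for illustration only.
constant = [2, 2, 2, 2]
flipped = [0, 0, 4, 4]       # one sudden flip between the extremes
oscillating = [0, 4, 0, 4]   # oscillation between the extremes

v_constant = volatility(constant, s=3, w=4)
v_flipped = volatility(flipped, s=3, w=4)
v_oscillating = volatility(oscillating, s=3, w=4)
```

The constant series gives zero, while the flipped and oscillating series both give 2, which is half the range and the largest value a standard deviation can take for data confined to that range.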

Transients

The transient plots are calculated as the SMA from a given central point of reference, going in both the forward and backward directions in time,

T(c,w)=\frac{1}{w}\sum_{i=c}^{c+w}x_i

where c is the centre-point, w is the window size, and x are the data-points. T(c,w) is then translated such that the y-intercept at w=0 is zero, providing a normalised transient,

T_0(c,w)=T(c,w)-T(c,0),

such that T_0(c,0)=0. Mood levels in the transient plot are therefore relative to the point of reference rather than absolute.

The above is for a single point of reference. This applies to events, which are defined as one-off occurrences. For points of reference that may occur multiple times, such as symptoms and activities, we average over all instances,

T_0(\vec{c},w)=\frac{1}{|\vec{c}|}\sum_{i=1}^{|\vec{c}|} T_0(c_i,w),

where \vec{c} is the vector of points of reference.
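The definitions above can be sketched in Python as follows. This is an illustrative reading, not the MoodSnap implementation: the function names and sample data are hypothetical, and the sketch assumes that the mean is taken over the |w| + 1 samples between c and c + w inclusive, so that T(c,0) is simply the value at the point of reference.

```python
from typing import Sequence


def transient(x: Sequence[float], c: int, w: int) -> float:
    """Mean of the samples between c and c + w inclusive (w may be negative).

    Assumption: |w| + 1 samples are averaged, so that T(c, 0) = x[c].
    """
    lo, hi = min(c, c + w), max(c, c + w)
    if lo < 0 or hi >= len(x):
        raise IndexError("window extends beyond the data")
    window = x[lo : hi + 1]
    return sum(window) / len(window)


def normalised_transient(x: Sequence[float], c: int, w: int) -> float:
    """T_0(c, w) = T(c, w) - T(c, 0), which is zero at w = 0."""
    return transient(x, c, w) - transient(x, c, 0)


def mean_transient(x: Sequence[float], cs: Sequence[int], w: int) -> float:
    """Average of the normalised transients over all points of reference."""
    if not cs:
        raise ValueError("at least one point of reference is required")
    return sum(normalised_transient(x, c, w) for c in cs) / len(cs)


# Hypothetical mood levels with points of reference at indices 2 and 6.
levels = [1, 1, 2, 3, 3, 1, 2, 4, 4]
curve = [mean_transient(levels, [2, 6], w) for w in range(-2, 3)]
```

Here `curve` holds the averaged transient for window sizes from -2 to 2, and its middle entry (w = 0) is zero by construction.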

The transient plot shows the window size w on the horizontal axis and the mean transient value T_0(c,w) on the vertical axis.

The transients have a time-symmetric interpretation and are therefore agnostic to cause and effect.

Influences

Influences show the relative change in average mood levels before and after a point of reference or set of points of reference. They are calculated as the difference between the rightmost and leftmost extremes of the transient plot,

I(c)=T(c,w_{max})-T(c,-w_{max}),

where w_{max} is the maximum window size used in the respective transient plot. In addition to being displayed in the influences section, these numbers are shown above the respective transient plot.
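A minimal Python sketch of this difference is given below. The names `window_mean` and `influence` and the sample data are hypothetical, and the windowed mean again assumes that |w| + 1 samples between c and c + w inclusive are averaged. Because the offset T(c,0) cancels in the subtraction, the result is the same whether the raw or the normalised transient is used.

```python
from typing import Sequence


def window_mean(x: Sequence[float], c: int, w: int) -> float:
    """Mean of x between c and c + w inclusive (w may be negative).

    Assumption: |w| + 1 samples are averaged, as a stand-in for T(c, w).
    """
    lo, hi = min(c, c + w), max(c, c + w)
    if lo < 0 or hi >= len(x):
        raise IndexError("window extends beyond the data")
    return sum(x[lo : hi + 1]) / (hi - lo + 1)


def influence(x: Sequence[float], cs: Sequence[int], w_max: int) -> float:
    """T(c, w_max) - T(c, -w_max), averaged over the points of reference."""
    if not cs:
        raise ValueError("at least one point of reference is required")
    diffs = [window_mean(x, c, w_max) - window_mean(x, c, -w_max) for c in cs]
    return sum(diffs) / len(diffs)


# Hypothetical mood levels with points of reference at indices 2 and 6.
levels = [1, 1, 2, 3, 3, 1, 2, 4, 4]
single = influence(levels, [2], w_max=2)
averaged = influence(levels, [2, 6], w_max=2)
```

A positive result, as in this made-up series, means the average level after the point of reference is higher than the average level before it.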

Correlations

In the health section of the insights sheet, the displayed correlation metric is the Pearson correlation coefficient, which measures how strongly mood is correlated with the respective health parameter and in which direction. It is defined as,

r_{xy} = \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_i (x_i-\bar{x})^2}\sqrt{\sum_i (y_i-\bar{y})^2}},

where x and y are the two variables.

A positive value indicates a positive correlation between the two variables, and a negative value indicates a negative correlation. The value ranges between -1 and +1, where the magnitude represents how closely correlated the two variables are. A value close to +1 or -1 implies a strong correlation, while values close to zero indicate a weak correlation. Note that this is not the slope of the line of best fit, but rather a measure of how closely the data points lie to that line.
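The distinction between slope and tightness can be seen in a short Python sketch of the formula. The function name `pearson` and all of the data are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4]
steep = [10, 20, 30, 40]        # exactly linear, large slope
shallow = [0.1, 0.2, 0.3, 0.4]  # exactly linear, small slope
noisy = [1, 3, 2, 4]            # increasing overall, but scattered
falling = [4, 3, 2, 1]          # exactly linear, negative slope

r_steep = pearson(x, steep)
r_shallow = pearson(x, shallow)
r_noisy = pearson(x, noisy)
r_falling = pearson(x, falling)
```

Both exactly linear increasing series give a coefficient of 1 (up to floating-point rounding) even though their slopes differ by a factor of 100, the scattered series gives 0.8, and the decreasing series gives -1.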

In the example below this means that active energy positively correlates with elevation relative to the average, but the correlation is not especially tight.