This page summarises the technical implementation of MoodSnap and its analytics framework. The full source code is available at www.github.com/peterrohde/MoodSnap.
Below we summarise the mathematical definitions of the various statistical measures provided by MoodSnap.
The moving average is calculated as a sliding simple moving average (SMA),
$$\mathrm{SMA}_t = \frac{1}{w}\sum_{i=t-w+1}^{t} x_i,$$
where $t$ is the current point in time, $w$ is the window size, and $x_i$ are the data points.
MoodSnap uses a default window size of two weeks (14 days). The moving average is a lagging indicator, meaning its value lags behind the current value as a result of being averaged over a past period. It can be interpreted as a trend.
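As an illustration of the lagging behaviour described above, here is a minimal Python sketch of a trailing SMA. It is not taken from the MoodSnap source; the function name `sliding_sma` is hypothetical, and the handling of the first few points (averaging over whatever data is available so far) is an assumption.

```python
def sliding_sma(values, window=14):
    """Trailing simple moving average over the last `window` points.

    Assumption: while fewer than `window` points are available, the
    average is taken over the points seen so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the point that has just left the window.
            running_sum -= values[i - window]
        count = min(i + 1, window)
        averages.append(running_sum / count)
    return averages


# A step from 0 to 4: the average only reaches 4 once the whole
# window lies after the step, which is the lag described above.
print(sliding_sma([0, 0, 0, 4, 4, 4], window=3))
```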
The volatility measure is calculated as a sliding standard deviation,
$$\sigma_t = \sqrt{\frac{1}{w}\sum_{i=t-w+1}^{t}\left(x_i - \mu_t\right)^2},$$
where $\mu_t$ is the mean within the window (the SMA at time $t$), $x_i$ are the data points, and $w$ is the window size. Like the moving average, the volatility metric is a lagging indicator. It can be interpreted as how stable a variable is within the window. If the variable remains constant, the volatility is zero. It reaches its maximum when the variable spends half of the window at each of the two opposing extremes. This could be the result of a sudden flip from one extreme to the other, or of oscillations between the extremes.
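The two limiting cases above can be checked with a short Python sketch. This is an illustration rather than MoodSnap's own code: the name `sliding_volatility` is hypothetical, the population form of the standard deviation (dividing by the number of points in the window) is an assumption, and the 0 to 4 scale in the example is chosen only for demonstration.

```python
import math


def sliding_volatility(values, window=14):
    """Trailing standard deviation over the last `window` points.

    Assumptions: population standard deviation, and shorter windows
    at the start of the series where less data is available.
    """
    results = []
    for i in range(len(values)):
        segment = values[max(0, i - window + 1): i + 1]
        mean = sum(segment) / len(segment)
        variance = sum((v - mean) ** 2 for v in segment) / len(segment)
        results.append(math.sqrt(variance))
    return results


constant = [2, 2, 2, 2]
flip = [0, 0, 4, 4]         # sudden flip between the extremes
oscillation = [0, 4, 0, 4]  # oscillation between the extremes

print(sliding_volatility(constant, window=4)[-1])
print(sliding_volatility(flip, window=4)[-1])
print(sliding_volatility(oscillation, window=4)[-1])
```

The flip and the oscillation give the same value over the full window, since the standard deviation depends only on which values occur in the window and not on their order.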
The transient plots are calculated as the SMA from a given central point of reference, going in both the forward and backward directions in time,
$$T_c(w) = \frac{1}{|w|+1}\sum_{i=\min(c,\,c+w)}^{\max(c,\,c+w)} x_i,$$
where $c$ is the centre-point, $w$ is the window size (positive going forward in time, negative going backward), and $x_i$ are the data-points. $T_c(w)$ is then translated such that the y-intercept at $w=0$ is zero, providing a normalised transient,
$$\tilde{T}_c(w) = T_c(w) - T_c(0),$$
such that $\tilde{T}_c(0) = 0$. The mood levels are therefore interpreted relative to the point of reference rather than in absolute terms.
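The above is for a single point of reference. This applies to events, which are defined as one-off occurrences. For points of reference that may occur multiple times, such as symptoms and activities, we average over all instances,
$$\bar{T}_{\vec{c}}(w) = \frac{1}{|\vec{c}|}\sum_{c\in\vec{c}} \tilde{T}_c(w),$$
where $\vec{c}$ is the vector of points of reference, $|\vec{c}|$ is its length, and $\tilde{T}_c(w)$ is the normalised transient about the single point of reference $c$ at window size $w$.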
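The construction of a normalised transient and its average over several points of reference can be sketched in Python as follows. This is a rough illustration under stated assumptions, not MoodSnap's implementation: the function names are hypothetical, each window is assumed to include the reference point itself (so the value at offset zero is simply the data point there), and windows running past either end of the data are assumed to be truncated.

```python
def transient(values, centre, max_window):
    """Normalised transient about index `centre`.

    Returns a dict mapping each offset in -max_window..max_window to
    the mean of the data between `centre` and `centre + offset`
    (inclusive), shifted so that the value at offset 0 is zero.
    """
    def mean_between(offset):
        lo, hi = sorted((centre, centre + offset))
        # Assumption: truncate windows that run past the data.
        lo, hi = max(lo, 0), min(hi, len(values) - 1)
        segment = values[lo:hi + 1]
        return sum(segment) / len(segment)

    baseline = values[centre]
    return {w: mean_between(w) - baseline
            for w in range(-max_window, max_window + 1)}


def mean_transient(values, centres, max_window):
    """Average the normalised transients over all points of reference."""
    curves = [transient(values, c, max_window) for c in centres]
    return {w: sum(curve[w] for curve in curves) / len(curves)
            for w in range(-max_window, max_window + 1)}


mood = [0, 0, 0, 2, 4, 4, 4]
single = transient(mood, centre=3, max_window=3)
print(single[-3], single[0], single[3])
print(mean_transient(mood, centres=[2, 4], max_window=2))
```

Negative offsets look backward in time and positive offsets look forward, which matches the horizontal axis of the transient plot.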
The transient plot shows the window size on the horizontal axis and the mean transient value on the vertical axis.
The transients have a time-symmetric interpretation and are therefore agnostic to cause and effect.
Influences show the relative change in average mood levels before and after a point of reference or set of points of reference. These are calculated as the difference between the rightmost and leftmost extremes of the transient plot,
$$I = \tilde{T}(W) - \tilde{T}(-W),$$
where $\tilde{T}(w)$ is the normalised transient at window size $w$ (averaged over all instances where there are several) and $W$ is the maximum window size used in the respective transient plot. In addition to being displayed in the influences section, these numbers are shown above the respective transient plot.
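As a small worked illustration, the influence can be read off a transient curve by subtracting its leftmost value from its rightmost value. The sketch below assumes the curve is stored as a mapping from window offsets (negative meaning before the point of reference) to transient values; both the representation and the name `influence` are hypothetical, and the numbers are made up for the example.

```python
def influence(transient_curve):
    """Rightmost minus leftmost value of a transient curve.

    `transient_curve` maps window offsets to transient values, with
    negative offsets before the point of reference.
    """
    leftmost = min(transient_curve)
    rightmost = max(transient_curve)
    return transient_curve[rightmost] - transient_curve[leftmost]


# Illustrative curve: mood averages lower before the point of
# reference and higher after it.
curve = {-2: -0.5, -1: -0.25, 0: 0.0, 1: 0.5, 2: 1.0}
print(influence(curve))
```

A positive result means average mood is higher after the point of reference than before it, and a negative result means the opposite.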
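In the health section of the insights sheet the displayed correlation metric corresponds to the Pearson correlation coefficient, a measure of how well correlated mood is with the respective health parameter and the direction of the correlation. This is defined as,
$$r_{xy} = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}},$$
where $x$ and $y$ are the two variables and $\bar{x}$ and $\bar{y}$ are their respective means.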
A positive value indicates a positive correlation between the two variables, and a negative value a negative correlation. The value ranges between $-1$ and $+1$, where the magnitude represents how closely correlated the two variables are. A value close to $-1$ or $+1$ implies a strong correlation. Values close to zero indicate weak correlations. Note that this is not the same as the slope of the line of best fit, but rather a measure of how closely the data points lie to a line of best fit.
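The distinction between correlation and slope can be made concrete with a short Python sketch of the Pearson coefficient. The function name `pearson` and the sample data are hypothetical and chosen only for illustration. This is not MoodSnap's own code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return covariance / math.sqrt(spread_x * spread_y)


xs = [1, 2, 3, 4]
steep = [2, 4, 6, 8]            # slope 2
shallow = [0.1, 0.2, 0.3, 0.4]  # slope 0.1
noisy = [1, 3, 2, 4]            # upward trend, but scattered

print(pearson(xs, steep), pearson(xs, shallow), pearson(xs, noisy))
```

The steep and shallow lines both give a coefficient of essentially 1 (up to floating-point rounding) because every point lies exactly on a line, whatever its slope. The scattered data gives a smaller positive value.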
In the example below this means that active energy positively correlates with elevation relative to the average, but the correlation is not especially tight.