Center a collection of covariance matrices

Center a dataset of covariance matrices by parallel transport

Usage

spd.center(d, covmats, group_by, from_constraints = "", to_constraints = "")

Arguments

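d: A data frame with one row per covariance matrix. See Details.
covmats: An optional list of covariance matrices, one per row of d.
group_by: Name of the variable in d to group by.
from_constraints: Logical expression, given as a character string, selecting the observations used to compute each group mean. The default "" uses all observations.
to_constraints: Logical expression, given as a character string, selecting the observations used to compute the target mean. The default "" uses all observations.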

Value

A named list containing the centered covariance matrices and the target mean covariance.

Details

A convenience wrapper that centers a dataset of covariance matrices by parallel transport. The typical use case is the following: we have measured multiple covariance matrices per test subject, and we would like to align the mean covariance of each subject to a common (overall) mean. This is accomplished by:

1. Computing the grand mean covariance
2. Computing the mean covariance for each subject
3. Projecting each subject's observations onto the tangent space at that subject's mean
4. Parallel transporting each of these tangent vectors along the geodesic from the subject mean to the grand mean
5. Mapping the transported vectors back onto the space of covariance matrices with the exponential map
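
Steps 3 to 5 can be sketched in base R for a single observation. This is an illustration only, not the package's implementation: the metric is not stated on this page, so the sketch assumes the affine-invariant Riemannian metric, and the helper names (sym_fun, log_map, transport, exp_map, center_one) are hypothetical. The subject mean M and grand mean G are taken as given.

```r
# Apply a scalar function to a symmetric matrix through its eigendecomposition.
sym_fun <- function(S, f) {
  e <- eigen((S + t(S)) / 2, symmetric = TRUE)
  e$vectors %*% diag(f(e$values), nrow = length(e$values)) %*% t(e$vectors)
}
inv_sqrt <- function(v) 1 / sqrt(v)

# Step 3: logarithmic map of X into the tangent space at M
# (assumes the affine-invariant metric).
log_map <- function(M, X) {
  Mh  <- sym_fun(M, sqrt)
  Mih <- sym_fun(M, inv_sqrt)
  Mh %*% sym_fun(Mih %*% X %*% Mih, log) %*% Mh
}

# Step 4: parallel transport of the tangent vector V from M to G
# along the geodesic joining them.
transport <- function(M, G, V) {
  Mh  <- sym_fun(M, sqrt)
  Mih <- sym_fun(M, inv_sqrt)
  E   <- Mh %*% sym_fun(Mih %*% G %*% Mih, sqrt) %*% Mih  # E = (G M^-1)^(1/2)
  E %*% V %*% t(E)
}

# Step 5: exponential map of the tangent vector W at G.
exp_map <- function(G, W) {
  Gh  <- sym_fun(G, sqrt)
  Gih <- sym_fun(G, inv_sqrt)
  Gh %*% sym_fun(Gih %*% W %*% Gih, exp) %*% Gh
}

# Steps 3 to 5 chained for one observation X.
center_one <- function(X, M, G) exp_map(G, transport(M, G, log_map(M, X)))

# Toy 2 x 2 example: M plays the subject mean, G the grand mean.
M  <- matrix(c(2, 0.5, 0.5, 1), 2)
G  <- matrix(c(1, 0.2, 0.2, 3), 2)
X  <- matrix(c(2.5, 0.3, 0.3, 1.2), 2)
Xc <- center_one(X, M, G)
```

A quick sanity check on the sketch: centering the subject mean itself, center_one(M, M, G), returns G up to numerical error, which is the alignment the procedure is meant to achieve.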

Consider a neuroimaging experiment in which we measure functional connectivity (a covariance matrix) in multiple conditions within each subject, including a baseline condition. Let covmats be a list of covariance matrices, and let d be a data frame (with nrow(d) = length(covmats)) containing columns for Subject and Condition. To center each subject's mean to the grand mean, the user can specify group_by = "Subject" and leave the constraints empty. To align the subject baselines to the grand mean baseline instead, the user can set from_constraints = "Condition == 'Baseline'", so that only the baseline condition is used to construct each subject mean, and set to_constraints the same way.
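
The two calls described above might look as follows. The toy data (three subjects, two conditions, random 4 x 4 sample covariances) are invented for illustration, and the calls are guarded with exists() so the snippet also runs when the package providing spd.center is not loaded. The names of the elements in the returned list are not given on this page, so the results are not unpacked here.

```r
set.seed(1)

# Toy design: 3 subjects x 2 conditions, one covariance matrix per row of d.
d <- expand.grid(Condition = c("Baseline", "Task"),
                 Subject   = c("s1", "s2", "s3"),
                 stringsAsFactors = FALSE)

# Random sample covariance matrices standing in for connectivity estimates.
covmats <- lapply(seq_len(nrow(d)),
                  function(i) cov(matrix(rnorm(50 * 4), 50, 4)))
stopifnot(nrow(d) == length(covmats))

if (exists("spd.center")) {
  # Align each subject's mean to the grand mean.
  centered <- spd.center(d, covmats, group_by = "Subject")

  # Align each subject's baseline mean to the grand baseline mean.
  centered_bl <- spd.center(d, covmats, group_by = "Subject",
                            from_constraints = "Condition == 'Baseline'",
                            to_constraints   = "Condition == 'Baseline'")
}
```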

More generally, the data are split according to group_by, and the group means are computed using only the observations satisfying from_constraints. The grand mean is then computed using all observations satisfying to_constraints.
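
To make the selection rule concrete, the sketch below works out which rows would contribute to each group mean and to the grand mean. It assumes that a constraint string is evaluated as an R expression within d and that an empty string keeps every row; how the function actually evaluates its constraints is not documented here, and rows_matching is a hypothetical helper.

```r
d <- data.frame(Subject   = c("s1", "s1", "s2", "s2"),
                Condition = c("Baseline", "Task", "Baseline", "Task"),
                stringsAsFactors = FALSE)

# Hypothetical helper: indices of the rows of d satisfying a constraint string.
# An empty string keeps every row (assumed meaning of the default "").
rows_matching <- function(d, constraint) {
  if (!nzchar(constraint)) return(seq_len(nrow(d)))
  which(eval(parse(text = constraint), envir = d))
}

from_rows <- rows_matching(d, "Condition == 'Baseline'")  # rows for group means
to_rows   <- rows_matching(d, "")                         # rows for grand mean

# Split the row indices by the group_by column, then keep only the rows
# that satisfy from_constraints within each group.
group_rows <- lapply(split(seq_len(nrow(d)), d[["Subject"]]),
                     intersect, from_rows)
```

With these inputs, each subject's mean would be built from its baseline row only, while the grand mean would use all four rows.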

Rather than passing a list of covariance matrices through covmats, the user may supply a data frame d that already includes a field named COVMATS containing such a list. In that case, these covariance matrices will be used. If both are provided, the field COVMATS in d will be overwritten.