
Center a dataset of covariance matrices by parallel transport

Usage

spd.center(d, covmats, group_by, from_constraints = "", to_constraints = "")

Arguments

d

A dataframe. See details.

covmats

An optional list of covariance matrices.

group_by

Variable in d to group by.

from_constraints

Logical expression, given as a string, selecting the observations used to compute each group mean. If left empty (the default), no constraint is applied.

to_constraints

Logical expression, given as a string, selecting the observations used to compute the target mean. If left empty (the default), no constraint is applied.

Value

A named list containing the centered covariance matrices and the target mean covariance.

Details

A convenience wrapper that centers a dataset of covariance matrices by parallel transport. The typical use case is the following: we have measured multiple covariance matrices per test subject, and we would like to align the mean covariance of each subject to a common (overall) mean. This is accomplished by

  1. Computing the grand mean covariance

  2. Computing the mean covariance for each subject

  3. Projecting each of the subjects' observations onto the tangent space around their mean

  4. Parallel transporting each of these tangent vectors along a geodesic to the grand mean

  5. Exponential mapping back onto the space of covariance matrices
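The five steps above can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes the affine-invariant Riemannian metric on symmetric positive-definite matrices and a Fréchet mean computed by fixed-point iteration, neither of which is confirmed by this page. All helper names (sym_fun, frechet_mean, log_map, transport, exp_map, center_sketch) and the names of the returned list elements are hypothetical.

```r
# Apply a scalar function to the eigenvalues of a symmetric matrix.
sym_fun <- function(S, f) {
  e <- eigen((S + t(S)) / 2, symmetric = TRUE)
  e$vectors %*% diag(f(e$values), nrow = length(e$values)) %*% t(e$vectors)
}
sqrtm  <- function(S) sym_fun(S, sqrt)
isqrtm <- function(S) sym_fun(S, function(x) 1 / sqrt(x))

# Frechet mean under the (assumed) affine-invariant metric, by fixed-point iteration.
frechet_mean <- function(mats, max_iter = 100, tol = 1e-12) {
  M <- Reduce(`+`, mats) / length(mats)  # start from the arithmetic mean
  for (i in seq_len(max_iter)) {
    Ms <- sqrtm(M)
    Mis <- isqrtm(M)
    # Average of the whitened matrix logarithms (tangent-space mean at M)
    L <- Reduce(`+`, lapply(mats, function(C) sym_fun(Mis %*% C %*% Mis, log))) /
      length(mats)
    M <- Ms %*% sym_fun(L, exp) %*% Ms
    if (norm(L, "F") < tol) break
  }
  (M + t(M)) / 2
}

# Step 3: logarithmic map of C into the tangent space at M.
log_map <- function(M, C) {
  Ms <- sqrtm(M)
  Mis <- isqrtm(M)
  Ms %*% sym_fun(Mis %*% C %*% Mis, log) %*% Ms
}

# Step 4: parallel transport of tangent vector V from M to G along the geodesic,
# using E = (G M^-1)^(1/2) = M^(1/2) (M^(-1/2) G M^(-1/2))^(1/2) M^(-1/2).
transport <- function(M, G, V) {
  Ms <- sqrtm(M)
  Mis <- isqrtm(M)
  E <- Ms %*% sqrtm(Mis %*% G %*% Mis) %*% Mis
  E %*% V %*% t(E)
}

# Step 5: exponential map of tangent vector V at G back onto the manifold.
exp_map <- function(G, V) {
  Gs <- sqrtm(G)
  Gis <- isqrtm(G)
  Gs %*% sym_fun(Gis %*% V %*% Gis, exp) %*% Gs
}

# Hypothetical driver: `group` is a vector with one label per matrix.
center_sketch <- function(covmats, group) {
  grand <- frechet_mean(covmats)                 # step 1
  out <- covmats
  for (g in unique(group)) {
    idx <- which(group == g)
    M <- frechet_mean(covmats[idx])              # step 2
    for (i in idx) {
      V <- log_map(M, covmats[[i]])              # step 3
      W <- transport(M, grand, V)                # step 4
      out[[i]] <- exp_map(grand, W)              # step 5
    }
  }
  list(centered = out, mean = grand)
}
```

Under these assumptions, the mean of each group after centering coincides with the grand mean, up to numerical error, which is the alignment the procedure is designed to achieve.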

Consider a neuroimaging experiment in which we measure functional connectivity (a covariance matrix) in multiple conditions within each subject, including a baseline condition. Let covmats be a list of covariance matrices, and let d be a dataframe (with nrow(d) = length(covmats)) containing columns for Subject and Condition. To center each subject's mean to the grand mean, the user can specify group_by = "Subject" and leave the other constraints empty. To align the subject baselines to the grand mean baseline instead, the user can set from_constraints = "Condition == 'Baseline'", so that only the baseline condition is used to construct each subject mean, and set to_constraints the same way.
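As an illustration of the scenario just described, the following sketch builds toy inputs and shows the two calls. The design (3 subjects, 2 conditions, 2 repeats), the Repeat column, and the random 4 x 4 matrices are made up for the example. The calls are guarded so that the block also runs when the package providing spd.center is not attached.

```r
set.seed(42)

# Hypothetical design: 3 subjects x 2 conditions x 2 repeats (12 rows)
d <- expand.grid(Repeat = 1:2,
                 Condition = c("Baseline", "Task"),
                 Subject = c("s1", "s2", "s3"),
                 stringsAsFactors = FALSE)

# One toy 4 x 4 covariance matrix per row of d
covmats <- lapply(seq_len(nrow(d)), function(i) cov(matrix(rnorm(200), 50, 4)))
stopifnot(nrow(d) == length(covmats))

if (exists("spd.center")) {
  # Align each subject's mean to the grand mean
  centered <- spd.center(d, covmats, group_by = "Subject")

  # Align each subject's baseline mean to the grand baseline mean
  centered_bl <- spd.center(d, covmats, group_by = "Subject",
                            from_constraints = "Condition == 'Baseline'",
                            to_constraints = "Condition == 'Baseline'")
}
```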

More generally, the data are split according to group_by, and the group means are computed using only the observations satisfying from_constraints. The grand mean is then computed using all observations satisfying to_constraints.
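The selection logic can be pictured with a small base R sketch. It assumes that a constraint string is evaluated as an R expression against the columns of d and that an empty string selects every row; this page does not confirm how the function evaluates constraints internally, and the helper rows_matching is hypothetical.

```r
# Hypothetical helper: logical vector marking the rows of d that satisfy a
# constraint string. An empty string selects every row.
rows_matching <- function(d, constraint) {
  if (!nzchar(constraint)) return(rep(TRUE, nrow(d)))
  keep <- eval(parse(text = constraint), envir = d, enclos = parent.frame())
  as.logical(keep) & !is.na(keep)
}

d <- data.frame(Subject = rep(c("s1", "s2"), each = 3),
                Condition = rep(c("Baseline", "Task", "Task"), 2),
                stringsAsFactors = FALSE)

from_rows <- rows_matching(d, "Condition == 'Baseline'")  # from_constraints
to_rows   <- rows_matching(d, "")                         # to_constraints

# Row indices feeding each group mean (split by the group_by column)
group_idx <- split(which(from_rows), d$Subject[from_rows])

# Row indices feeding the grand (target) mean
grand_idx <- which(to_rows)
```

Here each subject's mean would be built from its baseline row only, while the target mean would be built from all six rows.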

Rather than passing a separate list of covariance matrices, the user may include a field named COVMATS in d containing such a list, in which case those covariance matrices are used. If both are provided, the COVMATS field in d is overwritten by covmats.