stats module

Collection of tools for statistical analysis of neural data.

Author: Mehrdad Kashefi

class stats.GDecompose

Analysis tool for decomposing the structure of the data G matrix into a series of model G matrices built from features.

add_model_from_features(features, feature_names, normalize=True)

Add a feature and calculate its G matrix

Parameters:

features (np.array) – The feature matrix (num_conditions, num_features)
feature_names (str) – Name of the feature
normalize (bool) – If True, the G will be normalized by its trace
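As an illustration of what "calculate its G matrix" could mean, the sketch below builds a second-moment matrix from a feature matrix and normalizes it by its trace. The formula G = F Fᵀ and the helper name model_G_from_features are assumptions made for this example; the source does not state how the class computes G internally.

```python
import numpy as np

def model_G_from_features(features, normalize=True):
    """Hypothetical helper: second-moment matrix of a feature matrix.

    features has shape (num_conditions, num_features); the result has
    shape (num_conditions, num_conditions).
    """
    features = np.asarray(features, dtype=float)
    # Assumption: G is the outer product of the features across conditions.
    G = features @ features.T
    if normalize:
        # Scale so that the diagonal sums to one.
        G = G / np.trace(G)
    return G

# Four conditions that fall into two groups, coded as one-hot features.
group_features = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
G_group = model_G_from_features(group_features)
```

With trace normalization, model Gs built from feature sets of different scale become comparable in overall magnitude.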

fit(data, normalize_G_emp=True)

Fit the model Gs to the data G on every timepoint

Parameters:

data (np.array) – The hidden data (num_conditions, num_hidden_variables) or (num_conditions, num_timepoints, num_hidden_variables)
normalize_G_emp (bool) – If True, the empirical G will be normalized by its trace

Returns:

df_beta (pd.DataFrame) – A dataframe containing the beta values for each model at each timepoint
df_fit (pd.DataFrame) – A dataframe containing the fss and tss values for each timepoint
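The following is a minimal sketch of one way a per-timepoint fit of model Gs to an empirical G could work: each G is flattened into a vector and the empirical vector is regressed on the model vectors by ordinary least squares. The estimator, the function name fit_model_Gs, and the reading of fss and tss as fitted and total sums of squares are all assumptions; the actual method returns DataFrames and may use a different estimator.

```python
import numpy as np

def fit_model_Gs(data, model_Gs, normalize_G_emp=True):
    """Hypothetical sketch: least-squares fit of model Gs at each timepoint."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 2:
        # (num_conditions, num_hidden_variables) -> add a single timepoint axis
        data = data[:, None, :]
    # One column per model, one row per entry of the flattened G matrix.
    X = np.stack([np.asarray(G, dtype=float).ravel() for G in model_Gs], axis=1)
    num_timepoints = data.shape[1]
    beta = np.zeros((num_timepoints, X.shape[1]))
    fss = np.zeros(num_timepoints)
    tss = np.zeros(num_timepoints)
    for t in range(num_timepoints):
        Yt = data[:, t, :]
        G_emp = Yt @ Yt.T  # assumption: empirical G is the second-moment matrix
        if normalize_G_emp:
            G_emp = G_emp / np.trace(G_emp)
        g = G_emp.ravel()
        beta[t] = np.linalg.lstsq(X, g, rcond=None)[0]
        fss[t] = np.sum((X @ beta[t]) ** 2)  # assumed meaning: fitted sum of squares
        tss[t] = np.sum(g ** 2)              # assumed meaning: total sum of squares
    return beta, fss, tss

# Two model Gs from two one-hot feature sets over four conditions.
F1 = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
F2 = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)
model_Gs = [F @ F.T / np.trace(F @ F.T) for F in (F1, F2)]

# Toy data whose structure matches the first model at all three timepoints.
data = np.repeat(F1[:, None, :], 3, axis=1)
beta, fss, tss = fit_model_Gs(data, model_Gs)
```

In this toy case the data G equals the first model G, so the fit should place all weight on the first model.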

plot_models(what, w=300, h=300, colormap='inferno')

Plot the model Gs or RDMs

Parameters:

what (str) – ‘G’ or ‘RDM’
w (int) – Width of the figure, default is 300
h (int) – Height of the figure, default is 300
colormap (str) – Colormap for the figure, default is ‘inferno’
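For readers unfamiliar with the two views, a G matrix can be converted to an RDM of squared Euclidean distances with the standard identity d_ij = G_ii + G_jj - 2 G_ij. The sketch below shows that identity only; the function name G_to_RDM is hypothetical and whether plot_models uses exactly this conversion is an assumption.

```python
import numpy as np

def G_to_RDM(G):
    """Squared Euclidean distances implied by a second-moment matrix G."""
    G = np.asarray(G, dtype=float)
    d = np.diag(G)
    # d_ij = G_ii + G_jj - 2 * G_ij
    return d[:, None] + d[None, :] - 2.0 * G

# Two conditions with feature vectors (0, 0) and (3, 4).
features = np.array([[0.0, 0.0], [3.0, 4.0]])
rdm = G_to_RDM(features @ features.T)
```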

class stats.Model(name, M, fit_intercept, **kwargs)

Run RDM-like models on each time point

Parameters:

name (str) – Name of the model
M (np.array) – The model matrix (num_samples, num_features)
fit_intercept (bool) – If True, the model will fit an intercept

Kwargs:

feature_indicator (np.array): A binary array indicating the features that should be included in the model

fit(Y, method, **kwargs)

Fit the model to the data on each timepoint

Parameters:

Y (np.array) – The hidden data (num_conditions, num_timepoints, num_hidden_variables)
method (str) – The method of fitting the model

Kwargs:

n_kfold (int): Number of folds for cross-validation, default is 4
unit_eval (bool): If True, the evaluation will be done on each unit, default is False
n_kfold_in (int): Number of folds for inner cross-validation, default is 2
lambda_list (list): List of regularization parameters, default is [1e-2,1e-1,1,1e1,1e2, 1e3]
fit_score (str): The score for fitting the model, default is ‘r’
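The kwargs above suggest a nested cross-validation scheme: an outer loop for evaluation and an inner loop for choosing the regularization strength. The sketch below shows that pattern for a single timepoint with closed-form ridge regression and a Pearson-r score. It is an assumption-laden illustration, not the library's code: the function names are hypothetical, no intercept is fitted, and the real method may split folds or score differently.

```python
import numpy as np

def ridge_fit(M, Y, lam):
    """Closed-form ridge weights for Y ~ M @ W."""
    return np.linalg.solve(M.T @ M + lam * np.eye(M.shape[1]), M.T @ Y)

def pearson_r(a, b):
    """Pearson correlation between two arrays, pooled over all entries."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

def cv_score(M, Y, lam, n_fold):
    """Cross-validated Pearson r for one regularization value."""
    n = M.shape[0]
    pred = np.zeros_like(Y)
    for test in np.array_split(np.arange(n), n_fold):
        train = np.setdiff1d(np.arange(n), test)
        pred[test] = M[test] @ ridge_fit(M[train], Y[train], lam)
    return pearson_r(pred, Y)

def nested_cv_ridge(M, Y, n_kfold=4, n_kfold_in=2,
                    lambda_list=(1e-2, 1e-1, 1, 1e1, 1e2, 1e3)):
    """Outer folds evaluate; inner folds pick lambda on the training part only."""
    n = M.shape[0]
    pred = np.zeros_like(Y)
    for test in np.array_split(np.arange(n), n_kfold):
        train = np.setdiff1d(np.arange(n), test)
        inner = [cv_score(M[train], Y[train], lam, n_kfold_in)
                 for lam in lambda_list]
        best_lam = lambda_list[int(np.argmax(inner))]
        pred[test] = M[test] @ ridge_fit(M[train], Y[train], best_lam)
    return pearson_r(pred, Y)

# Synthetic single-timepoint data: 40 samples, 5 features, 8 units.
rng = np.random.default_rng(0)
M = rng.normal(size=(40, 5))
W = rng.normal(size=(5, 8))
Y = M @ W + 0.1 * rng.normal(size=(40, 8))
score = nested_cv_ridge(M, Y)
```

Choosing lambda inside the training folds keeps the outer test data out of model selection, which is the usual reason for a separate n_kfold_in.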

class stats.TimePointClassifier(num_fold=5, num_core=10, num_sampling_rep=30)

An analysis tool for classifying experimental conditions from continuous variables such as position, velocity, or average firing rate. The continuous data (X) should have the shape (num_conditions, num_timepoints, num_variables). The associated class values (y) should have the shape (num_conditions, ).

Example

TClassifier = ST.TimePointClassifier()
acc, acc_chance = TClassifier.fit(X, y)

Parameters:

num_fold (int) – Number of folds for cross-validation, default is 5
num_core (int) – Number of cores for parallel processing, default is 10
num_sampling_rep (int) – Number of sampling repetitions, default is 30

fit(X, y)

Run the classification models on every timepoint

Parameters:

X (np.array) – The continuous data (num_conditions, num_timepoints, num_variables)
y (np.array) – The associated class value (num_conditions, )

Returns:

acc (np.array) – Accuracy of the model
acc_chance (np.array) – Chance level accuracy
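To make the per-timepoint procedure concrete, here is a minimal sketch that runs a cross-validated classifier separately at each timepoint and estimates a chance level by refitting on shuffled labels. Everything about the internals is assumed: the source does not name the classifier (a nearest-centroid rule is swapped in here), and treating num_sampling_rep as the number of label shuffles is a guess made for illustration. Parallel processing is omitted.

```python
import numpy as np

def nearest_centroid_cv(X, y, num_fold, rng):
    """Cross-validated accuracy of a nearest-centroid rule (stand-in classifier)."""
    n = X.shape[0]
    order = rng.permutation(n)
    correct = 0
    for test in np.array_split(order, num_fold):
        train = np.setdiff1d(order, test)
        classes = np.unique(y[train])
        centroids = np.stack([X[train][y[train] == c].mean(axis=0)
                              for c in classes])
        # Squared distance from each test sample to each class centroid.
        dist = ((X[test][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        correct += np.sum(classes[dist.argmin(axis=1)] == y[test])
    return correct / n

def classify_timepoints(X, y, num_fold=5, num_sampling_rep=30, seed=0):
    """Accuracy and shuffled-label accuracy at every timepoint."""
    rng = np.random.default_rng(seed)
    num_timepoints = X.shape[1]
    acc = np.zeros(num_timepoints)
    acc_chance = np.zeros(num_timepoints)
    for t in range(num_timepoints):
        Xt = X[:, t, :]
        acc[t] = nearest_centroid_cv(Xt, y, num_fold, rng)
        # Assumption: chance level is the mean accuracy over label shuffles.
        acc_chance[t] = np.mean([
            nearest_centroid_cv(Xt, rng.permutation(y), num_fold, rng)
            for _ in range(num_sampling_rep)
        ])
    return acc, acc_chance

# Toy data: 40 conditions, 3 timepoints, 2 variables, 2 classes.
# Class information is present only at timepoints 1 and 2.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 3, 2))
X[y == 1, 1:, :] += 8.0
acc, acc_chance = classify_timepoints(X, y)
```

Comparing acc against acc_chance at each timepoint shows when the conditions become separable in the continuous variables.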

class stats.VarDecompose(Indicators, ortho_ineraction=1, verbose=1)

An analysis tool for decomposing the variance of a set of hidden variables into different components. The components are defined by a set of indicator variables. The hidden data (Y) should have the shape (num_conditions, num_timepoints, num_hidden_variables). Each indicator variable should have the shape (num_conditions, 1); conditions that share a level have the same value.

Parameters:

Indicators (dict) – A dictionary of indicator variables
ortho_ineraction (bool) – If True, the interaction terms will be orthogonalized
verbose (bool) – If True, the covariance matrices will be plotted

fit(Y)

Fit the model to the data

Parameters:

Y (np.array) – The hidden data (num_conditions, num_timepoints, num_hidden_variables)

Returns:

tss (np.array) – Total variance of hidden variables
fss (np.array) – Explained variance by each model
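A simplified sketch of the idea: at each timepoint, the total sum of squares of the centered data is compared with the sum of squares captured by the group means of each indicator. This is an assumption about the general approach, not the class's implementation; the function name decompose_variance is hypothetical, and interaction terms and their orthogonalization are left out.

```python
import numpy as np

def decompose_variance(Y, indicators):
    """Hypothetical sketch: per-timepoint total and per-indicator fitted sums of squares."""
    Y = np.asarray(Y, dtype=float)
    # Center each hidden variable across conditions at every timepoint.
    Yc = Y - Y.mean(axis=0, keepdims=True)
    tss = (Yc ** 2).sum(axis=(0, 2))
    names = list(indicators)
    fss = np.zeros((Y.shape[1], len(names)))
    for k, name in enumerate(names):
        labels = np.asarray(indicators[name]).ravel()
        fitted = np.zeros_like(Yc)
        for value in np.unique(labels):
            mask = labels == value
            # Replace each condition by the mean of its indicator group.
            fitted[mask] = Yc[mask].mean(axis=0, keepdims=True)
        fss[:, k] = (fitted ** 2).sum(axis=(0, 2))
    return tss, fss

# Balanced 2x2 design: factor A has an effect of size 2, factor B of size 1.
A = np.array([[0], [0], [1], [1]])
B = np.array([[0], [1], [0], [1]])
effect = 2.0 * (2 * A - 1) + 1.0 * (2 * B - 1)   # shape (4, 1)
Y = np.repeat(effect[:, None, :], 2, axis=1)      # shape (4, 2, 1)
tss, fss = decompose_variance(Y, {"A": A, "B": B})
```

Because the toy design is balanced and purely additive, the two main effects account for the full total at both timepoints.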

plot(**kwargs)

Plot the results

Parameters:

width (int) – Width of the figure, default is 15
height (int) – Height of the figure, default is 5
save_dir (str) – Directory to save the figures
name (str) – Name of the figures