CONCORD

`concord.Concord`

A contrastive learning framework for single-cell data analysis.

CONCORD performs dimensionality reduction, denoising, and batch correction in an unsupervised manner while preserving local and global topological structures.

Attributes:

Name	Type	Description
`adata`	`AnnData`	Input AnnData object.
`save_dir`	`Path`	Directory to save outputs and logs.
`config`	`Config`	Configuration object storing hyperparameters.
`model`	`ConcordModel`	The main contrastive learning model.
`trainer`	`Trainer`	Handles model training.
`loader`	`DataLoaderManager or ChunkLoader`	Data loading utilities.

`__init__(adata, save_dir='save/', inplace=True, verbose=False, **kwargs)`

Initializes the Concord framework.

Parameters:

Name	Type	Description	Default
`adata`	`AnnData`	Input single-cell data in AnnData format.	required
`save_dir`	`str`	Directory to save model outputs. Defaults to 'save/'.	`'save/'`
`inplace`	`bool`	If True, modifies `adata` in place. Defaults to True.	`True`
`verbose`	`bool`	Enable verbose logging. Defaults to False.	`False`
`**kwargs`		Additional configuration parameters.	`{}`

Raises:

Type	Description
`ValueError`	If `inplace` is set to True on a backed AnnData object.
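
The `inplace` restriction above can be illustrated with a small guard. This is a minimal sketch, not the library's code: `check_inplace` is a hypothetical helper, and the stand-in objects only mimic the `isbacked` property that AnnData exposes.

```python
from types import SimpleNamespace


def check_inplace(adata, inplace: bool) -> None:
    """Hypothetical guard: refuse in-place modification of a backed object."""
    # AnnData exposes `isbacked`; getattr keeps the sketch usable with stand-ins.
    if inplace and getattr(adata, "isbacked", False):
        raise ValueError(
            "inplace=True is not supported for a backed AnnData object; "
            "pass inplace=False or load the data into memory."
        )


# Stand-in objects used only for illustration (not real AnnData instances).
in_memory = SimpleNamespace(isbacked=False)
backed = SimpleNamespace(isbacked=True)

check_inplace(in_memory, inplace=True)  # accepted
check_inplace(backed, inplace=False)    # accepted

try:
    check_inplace(backed, inplace=True)
    raised = False
except ValueError:
    raised = True
```

With a backed object, passing `inplace=False` is the combination that this check accepts.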

`get_default_params()`

Returns the default hyperparameters used in CONCORD.

Returns:

Type	Description
`dict`	A dictionary containing default configuration values.

`setup_config(**kwargs)`

Sets up the configuration for training.

Parameters:

Name	Type	Description	Default
`**kwargs`		Key-value pairs to override default parameters.	`{}`

Raises:

Type	Description
`ValueError`	If an invalid parameter is provided.
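
The override behaviour described above can be sketched as a dictionary merge that rejects unknown keys. The helper `merge_config` and the parameter names `n_epochs` and `lr` are assumptions made for illustration; the real defaults are whatever `get_default_params()` returns.

```python
def merge_config(defaults: dict, **overrides) -> dict:
    """Return defaults updated with overrides, rejecting unknown keys."""
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise ValueError(f"Invalid parameter(s): {', '.join(unknown)}")
    # Later keys win, so overrides replace the matching defaults.
    return {**defaults, **overrides}


# Hypothetical defaults, used only to make the sketch runnable.
defaults = {"n_epochs": 10, "lr": 1e-3}

config = merge_config(defaults, lr=5e-4)
```

Rejecting unknown keys up front means a misspelled hyperparameter fails loudly instead of being silently ignored.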

`init_model()`

Initializes the CONCORD model and loads a pre-trained model if specified.

Raises:

Type	Description
`FileNotFoundError`	If the specified pre-trained model file is missing.
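
A check of this kind might look like the sketch below. It assumes the pre-trained model is given as a file path or `None`; `resolve_pretrained` is a hypothetical helper, not part of the documented API.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_pretrained(path: Optional[Union[str, Path]]) -> Optional[Path]:
    """Return the checkpoint path, or None when no pre-trained model is requested."""
    if path is None:
        return None
    checkpoint = Path(path)
    if not checkpoint.is_file():
        raise FileNotFoundError(f"Pre-trained model file not found: {checkpoint}")
    return checkpoint


# No pre-trained model requested: nothing to load.
no_checkpoint = resolve_pretrained(None)
```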

`init_trainer()`

Initializes the model trainer, setting up loss functions, optimizer, and learning rate scheduler.

`init_dataloader(input_layer_key='X_log1p', preprocess=True, train_frac=1.0, use_sampler=True)`

Initializes the data loader for training and evaluation.

Parameters:

Name	Type	Description	Default
`input_layer_key`	`str`	Key in `adata.layers` to use as input. Defaults to 'X_log1p'.	`'X_log1p'`
`preprocess`	`bool`	Whether to apply preprocessing. Defaults to True.	`True`
`train_frac`	`float`	Fraction of data to use for training. Defaults to 1.0.	`1.0`
`use_sampler`	`bool`	Whether to use the probabilistic sampler. Defaults to True.	`True`

Raises:

Type	Description
`ValueError`	If `train_frac < 1.0` and contrastive loss mode is 'nn'.
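
To make the `train_frac` constraint concrete, here is a minimal sketch of a split that refuses a held-out fraction in 'nn' mode. The function `split_indices`, the argument name `contrastive_mode`, and the placeholder value `'other'` are assumptions; only the 'nn' value and the error condition come from the table above.

```python
import random


def split_indices(n_cells, train_frac=1.0, contrastive_mode="other", seed=0):
    """Shuffle cell indices and split them into training and held-out parts."""
    if not 0.0 < train_frac <= 1.0:
        raise ValueError("train_frac must be in (0, 1].")
    if train_frac < 1.0 and contrastive_mode == "nn":
        raise ValueError("train_frac < 1.0 is not allowed when the contrastive loss mode is 'nn'.")
    indices = list(range(n_cells))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_cells * train_frac)
    return indices[:n_train], indices[n_train:]


train_idx, held_out_idx = split_indices(10, train_frac=0.8)
```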

`train(save_model=True, patience=2)`

Trains the model on the dataset.

Parameters:

Name	Type	Description	Default
`save_model`	`bool`	Whether to save the trained model. Defaults to True.	`True`
`patience`	`int`	Number of epochs to wait for improvement before early stopping. Defaults to 2.	`2`
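
The effect of `patience` can be sketched with a generic early-stopping loop. Which loss CONCORD monitors is not stated here, so the list of per-epoch losses below is purely illustrative, and `epochs_until_stop` is a hypothetical helper.

```python
def epochs_until_stop(losses, patience=2):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch  # stop early
    return len(losses)  # ran through every epoch


# The loss stops improving after epoch 2, so two stale epochs end training at epoch 4.
stop_epoch = epochs_until_stop([1.0, 0.8, 0.9, 0.85, 0.7], patience=2)
```

A larger `patience` tolerates longer plateaus at the cost of extra epochs; in this example, `patience=3` would let training reach the improvement at epoch 5.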

`predict(loader, sort_by_indices=False, return_decoded=False, decoder_domain=None, return_latent=False, return_class=True, return_class_prob=True)`

Runs inference on a dataset.

Parameters:

Name	Type	Description	Default
`loader`	`DataLoader or list`	Data loader or chunked loader for batch processing.	required
`sort_by_indices`	`bool`	Whether to return results in original cell order. Defaults to False.	`False`
`return_decoded`	`bool`	Whether to return decoded gene expression. Defaults to False.	`False`
`decoder_domain`	`str`	Specifies a domain for decoding. Defaults to None.	`None`
`return_latent`	`bool`	Whether to return latent variables. Defaults to False.	`False`
`return_class`	`bool`	Whether to return predicted class labels. Defaults to True.	`True`
`return_class_prob`	`bool`	Whether to return class probabilities. Defaults to True.	`True`

Returns:

Type	Description
`tuple`	Encoded embeddings, decoded matrix (if requested), class predictions, class probabilities, true labels, and latent variables.
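
When batches are drawn in shuffled order, restoring the original cell order amounts to sorting the outputs by the cell indices that accompanied them. The sketch below shows that idea in plain Python; `restore_order` is a hypothetical helper and does not reproduce how `predict` implements `sort_by_indices`.

```python
def restore_order(indices, rows):
    """Reorder rows so that row i corresponds to original cell i."""
    # Positions of `indices` sorted by the original cell index they carry.
    order = sorted(range(len(indices)), key=indices.__getitem__)
    return [rows[i] for i in order]


# Cells were processed in the order 2, 0, 3, 1.
indices = [2, 0, 3, 1]
embeddings = [[0.2], [0.0], [0.3], [0.1]]

restored = restore_order(indices, embeddings)
```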

`encode_adata(input_layer_key='X_log1p', output_key='Concord', preprocess=True, return_decoded=False, decoder_domain=None, return_latent=False, return_class=True, return_class_prob=True, save_model=True)`

Encodes an AnnData object using the CONCORD model.

Parameters:

Name	Type	Description	Default
`input_layer_key`	`str`	Input layer key. Defaults to 'X_log1p'.	`'X_log1p'`
`output_key`	`str`	Output key for storing results in AnnData. Defaults to 'Concord'.	`'Concord'`
`preprocess`	`bool`	Whether to apply preprocessing. Defaults to True.	`True`
`return_decoded`	`bool`	Whether to return decoded gene expression. Defaults to False.	`False`
`decoder_domain`	`str`	Specifies domain for decoding. Defaults to None.	`None`
`return_latent`	`bool`	Whether to return latent variables. Defaults to False.	`False`
`return_class`	`bool`	Whether to return predicted class labels. Defaults to True.	`True`
`return_class_prob`	`bool`	Whether to return class probabilities. Defaults to True.	`True`
`save_model`	`bool`	Whether to save the model after training. Defaults to True.	`True`
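
A typical call sequence, using only the constructor and `encode_adata` signatures documented above, might look like the following sketch. Two points are assumptions rather than documented behaviour: that the package is importable as `concord`, and that the embedding ends up in `adata.obsm[output_key]`. The wrapper `embed_with_concord` is hypothetical.

```python
def embed_with_concord(adata, save_dir="save/", output_key="Concord"):
    """Run CONCORD on `adata` and return the stored embedding (sketch)."""
    # Imported lazily so this snippet can be loaded without the package installed.
    import concord

    model = concord.Concord(adata=adata, save_dir=save_dir, inplace=True, verbose=False)
    model.encode_adata(input_layer_key="X_log1p", output_key=output_key)
    # Assumption: results are written to adata.obsm under `output_key`.
    return adata.obsm[output_key]
```

Additional hyperparameters would be passed to the constructor as `**kwargs`; the valid names are those returned by `get_default_params()`.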

`get_domain_embeddings()`

Retrieves domain embeddings from the trained model.

Returns:

Type	Description
`pd.DataFrame`	A DataFrame containing domain embeddings.

`get_covariate_embeddings()`

Retrieves covariate embeddings from the trained model.

Returns:

Type	Description
`dict`	A dictionary of DataFrames, each containing embeddings for a covariate.

`save_model(model, save_path)`

Saves the trained model to a file.

Parameters:

Name	Type	Description	Default
`model`	`Module`	The trained model.	required
`save_path`	`str or Path`	Path to save the model file.	required

Returns:

Type	Description
`None`	This method does not return a value.
