anneal.models package
Submodules
anneal.models.HmmMIxtureRNA module
anneal.models.HmmSimple module
-
class congas.models.HmmSimple.HmmSimple(data_dict)
Bases: congas.models.Model.Model
A simple HMM that models each CNV event as a Categorical variable. It does not cluster the data.
- Model parameters:
- T = max number of clusters (default = 6)
- init_probs = prior probs for initial state CNV probabilities (default = torch.tensor([0.1,0.1,0.2,0.3,0.2,0.1]))
- hidden_dim = hidden dimensions (should be len(probs))
- theta_scale = scale for the normalization factor variable (default = 3)
- theta_rate = rate for the normalization factor variable (default = 1)
- batch_size = batch size (default = None)
- t = probability of remaining in the same state (default = 0.1)
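To make the roles of `init_probs`, `hidden_dim` and `t` concrete, here is a minimal sketch in plain Python (lists instead of torch tensors). It is not the package's implementation: it assumes that `t` sits on the diagonal of the transition matrix and that the remaining probability is spread evenly over the other states. The helper name `build_transition_matrix` is hypothetical.

```python
def build_transition_matrix(hidden_dim, t):
    """Hypothetical helper: stay in the same state with probability t,
    move to each of the other states with equal probability."""
    off_diag = (1.0 - t) / (hidden_dim - 1)
    return [
        [t if i == j else off_diag for j in range(hidden_dim)]
        for i in range(hidden_dim)
    ]

# Defaults quoted in the parameter list above
init_probs = [0.1, 0.1, 0.2, 0.3, 0.2, 0.1]
hidden_dim = len(init_probs)  # hidden_dim should match the length of the prior
transition = build_transition_matrix(hidden_dim, t=0.1)

# Sanity checks: the prior sums to 1 and every row is a probability vector
assert abs(sum(init_probs) - 1.0) < 1e-9
assert all(abs(sum(row) - 1.0) < 1e-9 for row in transition)
```

Keeping `hidden_dim` derived from `len(init_probs)` avoids the two values drifting apart when the prior is changed.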
-
data_name = {'data', 'mu', 'pld', 'segments'}
-
params = {'batch_size': None, 'hidden_dim': 6, 'init_probs': tensor([0.1000, 0.1000, 0.2000, 0.3000, 0.2000, 0.1000]), 't': 0.1, 'theta_rate': 3, 'theta_scale': 9}
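The sketch below illustrates one plausible use of `data_name`: checking that a `data_dict` carries every key the model expects before it is passed to the constructor. That interpretation is an assumption, since this page does not say how `data_name` is used, and `check_data_dict` is a hypothetical helper, not part of the package.

```python
# Keys listed in HmmSimple.data_name above
REQUIRED_KEYS = {"data", "mu", "pld", "segments"}

def check_data_dict(data_dict, required=REQUIRED_KEYS):
    """Hypothetical helper: raise KeyError if any expected key is missing."""
    missing = sorted(required - set(data_dict))
    if missing:
        raise KeyError("data_dict is missing keys: " + ", ".join(missing))
    return True

# A dictionary with all expected keys passes the check
complete = {key: None for key in REQUIRED_KEYS}
check_data_dict(complete)
```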
anneal.models.MixtureCategorical module
anneal.models.MixtureDirichlet module
-
class congas.models.MixtureDirichlet.MixtureDirichlet(data_dict)
Bases: congas.models.Model.Model
-
data_name = {'data', 'mu', 'pld', 'segments'}
-
params = {'K': 2, 'batch_size': None, 'cnv_mean': 2, 'gamma_multiplier': 5, 'hidden_dim': 6, 'mixture': tensor([1., 1.]), 'probs': tensor([0.1000, 0.1000, 0.2000, 0.3000, 0.2000, 0.1000]), 'theta_rate': 1, 'theta_scale': 3}
anneal.models.MixtureGaussian module
-
class congas.models.MixtureGaussian.MixtureGaussian(data_dict)
Bases: congas.models.Model.Model
A simple mixture model for CNV inference. It assumes independence among the different segments and must be used after CNV regions have been called from bulk DNA or RNA. CNV events are modelled as LogNormal distributions.
- Model parameters:
- K = number of clusters (default = 2)
- cnv_var = var of the LogNorm prior (default = 0.6)
- theta_scale = scale for the normalization factor variable (default = 3)
- theta_rate = rate for the normalization factor variable (default = 1)
- batch_size = batch size (default = None)
- mixture = prior for the mixture weights (default = 1/torch.ones(K))
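As an illustration of how the documented defaults might be combined with user overrides, here is a minimal plain-Python sketch. It is not the package's code: `DEFAULTS` simply copies the values from the list above, `resolve_params` is a hypothetical helper, and treating `mixture=None` as "use a vector of K ones" (the plain-Python analogue of `1/torch.ones(K)`) is an assumption.

```python
# Defaults copied from the parameter list above
DEFAULTS = {
    "K": 2,
    "cnv_var": 0.6,
    "theta_scale": 3,
    "theta_rate": 1,
    "batch_size": None,
    "mixture": None,  # assumption: None means "fall back to a vector of ones"
}

def resolve_params(overrides):
    """Hypothetical helper: merge user overrides into the defaults."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise KeyError("unknown parameters: " + ", ".join(unknown))
    params = {**DEFAULTS, **overrides}
    if params["mixture"] is None:
        # 1 / ones(K) is again a vector of K ones
        params["mixture"] = [1.0] * params["K"]
    return params

params = resolve_params({"K": 3})
```

Resolving the `mixture` default after the merge keeps its length consistent with an overridden `K`.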
-
data_name = {'data', 'mu', 'pld'}
-
params = {'K': 2, 'assignments': None, 'batch_size': None, 'cnv_locs': None, 'cnv_sd': 0.1, 'kmeans': True, 'mixture': None, 'norm_factor': None, 'norm_init_factors': None, 'theta_rate': 1, 'theta_scale': 3}
anneal.models.MixtureGaussianDMP module
-
class congas.models.MixtureGaussianDMP.MixtureGaussianDMP(data_dict)
Bases: congas.models.Model.Model
-
data_name = {'data', 'mu', 'pld', 'segments'}
-
params = {'T': 6, 'alpha': 0.0001, 'batch_size': None, 'cnv_mean': 2, 'cnv_var': 0.6, 'gamma_multiplier': 5, 'mixture': tensor([1, 1]), 'theta_rate': 1, 'theta_scale': 3}
anneal.models.MixtureGaussianEXP module
-
class congas.models.MixtureGaussianEXP.MixtureGaussianEXP(data_dict)
Bases: congas.models.Model.Model
A simple mixture model for CNV inference. It assumes independence among the different segments and must be used after CNV regions have been called from bulk DNA or RNA. CNV events are modelled as LogNormal distributions.
- Model parameters:
- K = number of clusters (default = 2)
- cnv_var = var of the LogNorm prior (default = 0.6)
- theta_scale = scale for the normalization factor variable (default = 3)
- theta_rate = rate for the normalization factor variable (default = 1)
- batch_size = batch size (default = None)
- mixture = prior for the mixture weights (default = 1/torch.ones(K))
-
data_name = {'data', 'pld'}
-
params = {'K': 2, 'batch_size': None, 'cnv_sd': 0.6, 'kmeans': True, 'mixture': None, 'norm_init_factors': None, 'seg_sd': 0.1, 'theta_rate': 1, 'theta_scale': 1}
anneal.models.MixtureGaussianGenes module
anneal.models.Model module
-
class congas.models.Model.Model(data_dict, data_name)
Bases: abc.ABC
All the models in the package share broadly the same structure. Cells are assumed to come from different populations, distinguished by their copy-number profiles. Each model treats the CNV in a different way, but the common idea is to write an explicit formula for the counts of a gene or a genomic segment and to treat those counts as depending only on the CNV and a cell-specific factor (for genes there is also a gene-dependent effect).
Moreover, gene/segment counts are assumed to be independent along the genome (or to have at most a first-order dependency between consecutive positions). Under this assumption, the model can be written as a mixture that factorizes the CNV over segments.
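The factorization described above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's code: counts are taken to be Poisson with mean `theta * cnv * mu` (the Poisson likelihood, the function names and the toy numbers are all assumptions made here). Because segments are independent given the cluster, a cell's log-likelihood under one cluster is a sum over segments, and clusters are combined with a log-sum-exp.

```python
import math

def poisson_logpmf(x, rate):
    """Log-probability of count x under a Poisson with the given rate."""
    return x * math.log(rate) - rate - math.lgamma(x + 1)

def cluster_log_joint(counts, cnv_profile, mu, theta, log_weight):
    """log(weight) plus the per-segment log-likelihoods, summed
    because segments are assumed independent given the cluster."""
    return log_weight + sum(
        poisson_logpmf(x, theta * cnv * m)
        for x, cnv, m in zip(counts, cnv_profile, mu)
    )

def responsibilities(counts, cnv_profiles, mu, theta, weights):
    """Posterior probability of each cluster for a single cell."""
    log_joint = [
        cluster_log_joint(counts, profile, mu, theta, math.log(w))
        for profile, w in zip(cnv_profiles, weights)
    ]
    top = max(log_joint)  # subtract the max for numerical stability
    log_norm = top + math.log(sum(math.exp(v - top) for v in log_joint))
    return [math.exp(v - log_norm) for v in log_joint]

# Toy example: two clusters, three segments (all values are made up)
cnv_profiles = [[2, 2, 2], [2, 3, 1]]  # copy number per segment, per cluster
mu = [10.0, 10.0, 10.0]                # baseline expression per segment
weights = [0.5, 0.5]                   # mixture weights
cell_counts = [21, 29, 11]             # observed counts for one cell

post = responsibilities(cell_counts, cnv_profiles, mu, theta=1.0, weights=weights)
```

In this toy example the counts track the second CNV profile, so the second cluster receives most of the posterior mass.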