Abstract
 Here we will discuss unsupervised manifold learning and introduce Multi-OMICs Factor Analysis (MOFA) as an example of unsupervised data exploration. Further, we will perform integrative OMICs analysis of chromatin accessibility, DNA methylation and transcriptome profiling data from the scNMT-seq (single-cell nucleosome, methylation and transcription sequencing) using MOFA.
See the original version of this notebook from the 2021 Sweden course for background information. (Schedule page with all course materials from 2021 Sweden course)
In this section we will apply MOFA to the single cell multi-OMICs data set scNMT which profiles chromatine accessebility (scATACseq), DNA methylation (scBSseq) and gene expression (scRNAseq) information from the same biological cells belonging to two types: differentiating mouse embryonic stem cells (ESCs) and embrionic bodies (EBs).
Load R packages. Activate Python virtual environment (behind the scenes Python will be used to run MOFA2):
library(mixOmics)
## Loading required package: MASS
## Loading required package: lattice
## Loading required package: ggplot2
##
## Loaded mixOmics 6.24.0
## Thank you for using mixOmics!
## Tutorials: http://mixomics.org
## Bookdown vignette: https://mixomicsteam.github.io/Bookdown
## Questions, issues: Follow the prompts at http://mixomics.org/contact-us
## Cite us: citation('mixOmics')
library(MOFA2)
##
## Attaching package: 'MOFA2'
## The following object is masked from 'package:stats':
##
## predict
reticulate::use_virtualenv("venv", required = TRUE)
We will start with reading and having a look at the individual OMICs data sets:
Read single-cell RNAseq data and use a lookup table to match the IDs of the dataset with gene names. Display the first 5 rows and columns of the dataset.
scRNAseq<-read.delim("data/scRNAseq.txt", header = TRUE, check.names = FALSE, row.names = 1, sep="\t")
ens2genesymbol<-read.delim("data/ENSEMBLE_TO_GENE_SYMBOL_MOUSE.txt")
ens2genesymbol<-ens2genesymbol[match(colnames(scRNAseq),as.character(ens2genesymbol$ensembl_gene_id)),]
colnames(scRNAseq)<-ens2genesymbol$external_gene_name
scRNAseq<-as.data.frame(t(scRNAseq))
scRNAseq[1:5,1:5]
## ESC_A02 ESC_A03 ESC_A04 ESC_A05 ESC_A06
## Mrpl15 557.4201 130.8536 305.7356 417.1656 573.6504
## Lypla1 410.5958 138.8325 186.3237 106.0895 662.7008
## Tcea1 437.9119 356.3899 276.9121 161.8315 213.3068
## Atp6v1h 345.7199 500.5417 195.5884 133.0614 325.1376
## Oprk1 0.0000 0.0000 0.0000 0.0000 0.0000
Read two additional omics datasets: single-cell BSseq (bisulfite) and ATACseq. Transpose them so they are in the same format as the RNAseq. Display the first 5 rows and columns of each dataset.
scBSseq<-read.delim("data/scBSseq.txt", header = TRUE, check.names = FALSE, row.names = 1, sep="\t")
scBSseq<-as.data.frame(t(scBSseq))
scBSseq[1:5,1:5]
## ESC_A02 ESC_A03 ESC_A04 ESC_A05 ESC_A06
## 1_101606896 0.00000 0 0 0 0
## 1_101935054 88.88889 100 100 100 30
## 1_101935061 88.88889 100 100 100 90
## 1_102002184 100.00000 100 0 0 0
## 1_102238204 100.00000 100 100 100 100
scATACseq<-read.delim("data/scATACseq.txt", header = TRUE, check.names = FALSE, row.names = 1, sep="\t")
scATACseq<-as.data.frame(t(scATACseq))
scATACseq[1:5,1:5]
## ESC_A02 ESC_A03 ESC_A04 ESC_A05 ESC_A06
## 1_100392139 100 100 100 100 67
## 1_100392151 100 100 100 50 67
## 1_100668590 50 86 44 45 28
## 1_100738967 83 88 83 83 88
## 1_100994324 0 0 11 75 0
For the scRNAseq OMIC layer we will only select highly expressed genes (mean >= 1) in order to remove noisy features that might contaminate the further downstream analysis. We will also perform log-transform of the data which can be seen as a mild normalization. Look at the first 5 rows and columns of the updated dataset, and check its size.
scRNAseq<-scRNAseq[rowMeans(scRNAseq)>=1,]
scRNAseq<-log10(scRNAseq + 1)
scRNAseq[1:5,1:5]
## ESC_A02 ESC_A03 ESC_A04 ESC_A05 ESC_A06
## Mrpl15 2.746961 2.120092 2.486764 2.621348 2.759404
## Lypla1 2.614471 2.145608 2.272593 2.029747 2.821972
## Tcea1 2.642377 2.553142 2.443907 2.211738 2.331036
## Atp6v1h 2.539979 2.700307 2.293558 2.127304 2.513401
## Oprk1 0.000000 0.000000 0.000000 0.000000 0.000000
dim(scRNAseq)
## [1] 12145 113
Since scBSseq and scATACseq OMIC layers should be almost binary
(methylated vs unmethylated for scBSseq and open vs. close for
scATACseq) data sets, a good way to get rid of redundancy in the scBSseq
and scATACseq data (and thus perform feature pre-selection and reduce
dimensions) is to select only features with high variability. The
function nearZeroVar()
from the mixOmics
package is used for this.
my_nearZeroVar<-nearZeroVar(as.data.frame(t(scBSseq)))
head(my_nearZeroVar$Metrics)
## freqRatio percentUnique
## 1_101606896 112.00 1.7699115
## 1_105525627 35.00 5.3097345
## 1_105539120 26.25 5.3097345
## 1_10605572 0.00 0.8849558
## 1_107485472 0.00 0.8849558
## 1_109269874 20.00 7.9646018
dim(my_nearZeroVar$Metrics)
## [1] 3290 2
scBSseq <- scBSseq[-which(rownames(scBSseq)%in%rownames(my_nearZeroVar$Metrics)),]
scBSseq[1:5,1:5]
## ESC_A02 ESC_A03 ESC_A04 ESC_A05 ESC_A06
## 1_101935054 88.88889 100 100 100 30
## 1_101935061 88.88889 100 100 100 90
## 1_102002184 100.00000 100 0 0 0
## 1_102238204 100.00000 100 100 100 100
## 1_102255678 100.00000 100 100 100 0
dim(scBSseq)
## [1] 5285 113
my_nearZeroVar<-nearZeroVar(as.data.frame(t(scATACseq)), uniqueCut = 1)
head(my_nearZeroVar$Metrics)
## freqRatio percentUnique
## 1_10401731 0 0.8849558
## 1_128314944 0 0.8849558
## 1_13628215 0 0.8849558
## 1_178804829 0 0.8849558
## 1_183332775 0 0.8849558
## 1_183944178 0 0.8849558
dim(my_nearZeroVar$Metrics)
## [1] 91 2
scATACseq <- scATACseq[-which(rownames(scATACseq)%in%rownames(my_nearZeroVar$Metrics)),]
scATACseq[1:5,1:5]
## ESC_A02 ESC_A03 ESC_A04 ESC_A05 ESC_A06
## 1_100392139 100 100 100 100 67
## 1_100392151 100 100 100 50 67
## 1_100668590 50 86 44 45 28
## 1_100738967 83 88 83 83 88
## 1_100994324 0 0 11 75 0
dim(scATACseq)
## [1] 11709 113
Let us now have a look at the histograms of individual OMICs layers in order to decide what distribution they follow and how we should model these distributions with MOFA:
hist(rowMeans(scRNAseq),breaks = 100, main = "scRNAseq")
hist(rowMeans(scBSseq), breaks = 100, main = "scBSseq")
hist(rowMeans(scATACseq), breaks = 100, main = "scATACseq")
We conclude that while scRNAseq data looks fairly Gaussian (or at least exponential), we should probably model the scBSseq and scATACseq data following Bernoulli distribution. They look quite bimodal, indicating the binary nature of the data, i.e. methylated vs. unmethylated for scBSseq and open vs. close for scATACseq. To make the scBSseq and scATACseq data purely Bernoulli-like, we will make them binary by encoding values below 50 as 0 and above 50 as 1. Since binary data typically have vey low variation compared to continuous data, we need to remove low-variance features in this case again:
scBSseq<-ifelse(scBSseq<50,0,1)
scBSseq[1:5,1:5]
## ESC_A02 ESC_A03 ESC_A04 ESC_A05 ESC_A06
## 1_101935054 1 1 1 1 0
## 1_101935061 1 1 1 1 1
## 1_102002184 1 1 0 0 0
## 1_102238204 1 1 1 1 1
## 1_102255678 1 1 1 1 0
my_nearZeroVar<-nearZeroVar(as.data.frame(t(scBSseq)))
head(my_nearZeroVar$Metrics)
## freqRatio percentUnique
## 1_110547534 21.60 1.7699115
## 1_133172191 27.25 1.7699115
## 1_139117169 27.25 1.7699115
## 1_146093078 21.60 1.7699115
## 1_194260338 112.00 1.7699115
## 1_24611678 0.00 0.8849558
dim(my_nearZeroVar$Metrics)
## [1] 761 2
scBSseq <- as.data.frame(scBSseq[-which(rownames(scBSseq)%in%rownames(my_nearZeroVar$Metrics)),])
scBSseq[1:5,1:5]
## ESC_A02 ESC_A03 ESC_A04 ESC_A05 ESC_A06
## 1_101935054 1 1 1 1 0
## 1_101935061 1 1 1 1 1
## 1_102002184 1 1 0 0 0
## 1_102238204 1 1 1 1 1
## 1_102255678 1 1 1 1 0
dim(scBSseq)
## [1] 4524 113
scATACseq<-ifelse(scATACseq<50,0,1)
my_nearZeroVar<-nearZeroVar(as.data.frame(t(scATACseq)))
head(my_nearZeroVar$Metrics)
## freqRatio percentUnique
## 1_100392139 112.00 1.769912
## 1_100392151 27.25 1.769912
## 1_100738967 55.50 1.769912
## 1_101897864 55.50 1.769912
## 1_102283335 55.50 1.769912
## 1_102563492 112.00 1.769912
dim(my_nearZeroVar$Metrics)
## [1] 2238 2
scATACseq <- as.data.frame(scATACseq[-which(rownames(scATACseq)%in%rownames(my_nearZeroVar$Metrics)),])
scATACseq[1:5,1:5]
## ESC_A02 ESC_A03 ESC_A04 ESC_A05 ESC_A06
## 1_100668590 1 1 0 0 0
## 1_100994324 0 0 0 1 0
## 1_100994328 0 0 0 0 0
## 1_100994336 0 0 0 0 0
## 1_100994343 0 0 0 0 1
dim(scATACseq)
## [1] 9471 113
Now the data cleaning step is finished and the OMICs are prepared to be integrated in an unsupervised way with MOFA.
Let us continue by creating a MOFA object. Combine the three Omics datasets into a list and look at the size of each matrix. MOFA allows for a handy overview of the data by displaying dimensions of each OMIC. Missing data are shown in gray but there are no missing values in this example dataset.
omics<-list(scRNAseq=as.matrix(scRNAseq),scBSseq=as.matrix(scBSseq),scATACseq=as.matrix(scATACseq))
lapply(omics,dim)
## $scRNAseq
## [1] 12145 113
##
## $scBSseq
## [1] 4524 113
##
## $scATACseq
## [1] 9471 113
MOFAobject <- create_mofa_from_matrix(omics)
plot_data_overview(MOFAobject, )
Now we will set some model training parameters such as distribution types corresponding to each OMIC, number of iterations, number of factors to be computed, etc.
#DEFINE DATA OPTIONS
DataOptions <- get_default_data_options(MOFAobject)
DataOptions
## $scale_views
## [1] FALSE
##
## $scale_groups
## [1] FALSE
##
## $center_groups
## [1] TRUE
##
## $use_float32
## [1] TRUE
##
## $views
## [1] "scRNAseq" "scBSseq" "scATACseq"
##
## $groups
## [1] "group1"
#DEFINE MODEL OPTIONS
ModelOptions <- get_default_model_options(MOFAobject)
mydistr <- c("gaussian","bernoulli","bernoulli")
names(mydistr) <- c("scRNAseq","scBSseq","scATACseq")
ModelOptions$likelihoods <- mydistr
ModelOptions$num_factors <- 20
ModelOptions
## $likelihoods
## scRNAseq scBSseq scATACseq
## "gaussian" "bernoulli" "bernoulli"
##
## $num_factors
## [1] 20
##
## $spikeslab_factors
## [1] FALSE
##
## $spikeslab_weights
## [1] FALSE
##
## $ard_factors
## [1] FALSE
##
## $ard_weights
## [1] TRUE
#DEFINE TRAIN OPTIONS
TrainOptions <- get_default_training_options(MOFAobject)
TrainOptions$seed <- 2018
# Automatically drop factors that explain less than 3% of variance in all omics
TrainOptions$drop_factor_threshold <- 0.03
TrainOptions$maxiter <- 1000
TrainOptions$verbose <- TRUE
TrainOptions
## $maxiter
## [1] 1000
##
## $convergence_mode
## [1] "fast"
##
## $drop_factor_threshold
## [1] 0.03
##
## $verbose
## [1] TRUE
##
## $startELBO
## [1] 1
##
## $freqELBO
## [1] 5
##
## $stochastic
## [1] FALSE
##
## $gpu_mode
## [1] FALSE
##
## $gpu_device
## NULL
##
## $seed
## [1] 2018
##
## $outfile
## NULL
##
## $weight_views
## [1] FALSE
##
## $save_interrupted
## [1] FALSE
Finally, we are ready to run MOFA. This will take about 5 minutes.
MOFAobject <- prepare_mofa(MOFAobject, data_options = DataOptions, model_options = ModelOptions, training_options = TrainOptions)
## Checking data options...
## Checking training options...
## Checking model options...
MOFAobject <- run_mofa(MOFAobject, use_basilisk = FALSE)
## Connecting to the mofapy2 python package using reticulate (use_basilisk = FALSE)...
## Please make sure to manually specify the right python binary when loading R with reticulate::use_python(..., force=TRUE) or the right conda environment with reticulate::use_condaenv(..., force=TRUE)
## If you prefer to let us automatically install a conda environment with 'mofapy2' installed using the 'basilisk' package, please use the argument 'use_basilisk = TRUE'
##
## #########################################################
## ### __ __ ____ ______ ###
## ### | \/ |/ __ \| ____/\ _ ###
## ### | \ / | | | | |__ / \ _| |_ ###
## ### | |\/| | | | | __/ /\ \_ _| ###
## ### | | | | |__| | | / ____ \|_| ###
## ### |_| |_|\____/|_|/_/ \_\ ###
## ### ###
## #########################################################
##
##
##
## use_float32 set to True: replacing float64 arrays by float32 arrays to speed up computations...
##
## Successfully loaded view='scRNAseq' group='group1' with N=113 samples and D=12145 features...
## Successfully loaded view='scBSseq' group='group1' with N=113 samples and D=4524 features...
## Successfully loaded view='scATACseq' group='group1' with N=113 samples and D=9471 features...
##
##
## Model options:
## - Automatic Relevance Determination prior on the factors: False
## - Automatic Relevance Determination prior on the weights: True
## - Spike-and-slab prior on the factors: False
## - Spike-and-slab prior on the weights: False
## Likelihoods:
## - View 0 (scRNAseq): gaussian
## - View 1 (scBSseq): bernoulli
## - View 2 (scATACseq): bernoulli
##
##
##
## Dropping factors with minimum threshold of 0.03% variance explained
##
## Convergence mode: fast
##
##
##
## ######################################
## ## Training the model with seed 2018 ##
## ######################################
##
##
## ELBO before training:
## Z=-189.48 W=-150884.14 Tau=-64835.45 Y=-21753519.02 AlphaW=-1839.54
## Total: -21971267.63
##
## Iteration 1: time=2.39, ELBO=-3652316.68, deltaELBO=18318950.950 (83.37685042%), Factors=19
## - ELBO decomposition: Z=-8189.48 W=-123201.04 Tau=-91215.04 Y=-3427718.14 AlphaW=-1992.98
## - Time spent in ELBO computation: 9.1%
## - Variance explained: View 0: 14.06% View 1: 21.74% View 2: 15.73%
## - Fraction of zero weights: View 0: 5% View 1: 1% View 2: 1%
## - Maximum correlation between factors: 0.41
## - Factor norms: 4.68 7.61 3.71 3.60 2.03 2.41 4.10 2.24 1.64 2.12 2.08 1.69 2.12 2.14 1.81 2.06 1.81 2.06 2.04
## - Tau per view (average): View 0: 1.54 View 1: 0.10 View 2: 0.10
##
##
## Iteration 2: time=2.25, Factors=18
## - Variance explained: View 0: 17.93% View 1: 17.78% View 2: 12.64%
## - Fraction of zero weights: View 0: 3% View 1: 2% View 2: 2%
## - Maximum correlation between factors: 0.44
## - Factor norms: 2.49 3.87 1.35 1.11 0.64 0.49 1.89 0.37 0.29 0.31 0.33 0.22 0.30 0.25 0.24 0.22 0.26 0.23
## - Tau per view (average): View 0: 5.29 View 1: 0.23 View 2: 0.23
##
##
## Iteration 3: time=1.87, Factors=17
## - Variance explained: View 0: 18.76% View 1: 13.33% View 2: 8.95%
## - Fraction of zero weights: View 0: 3% View 1: 4% View 2: 3%
## - Maximum correlation between factors: 0.38
## - Factor norms: 2.54 3.42 1.23 0.84 0.72 0.43 1.89 0.28 0.21 0.23 0.16 0.21 0.17 0.13 0.14 0.13 0.12
## - Tau per view (average): View 0: 6.89 View 1: 0.24 View 2: 0.25
##
##
## Iteration 4: time=1.71, Factors=16
## - Variance explained: View 0: 19.11% View 1: 13.64% View 2: 8.40%
## - Fraction of zero weights: View 0: 3% View 1: 4% View 2: 3%
## - Maximum correlation between factors: 0.36
## - Factor norms: 2.52 3.73 1.25 0.83 0.75 0.43 1.93 0.26 0.19 0.22 0.16 0.21 0.15 0.10 0.10 0.10
## - Tau per view (average): View 0: 7.62 View 1: 0.25 View 2: 0.25
##
##
## Iteration 5: time=1.48, Factors=15
## - Variance explained: View 0: 19.41% View 1: 13.84% View 2: 7.94%
## - Fraction of zero weights: View 0: 3% View 1: 4% View 2: 3%
## - Maximum correlation between factors: 0.36
## - Factor norms: 2.49 3.84 1.26 0.83 0.77 0.43 1.95 0.25 0.23 0.18 0.24 0.15 0.11 0.09 0.09
## - Tau per view (average): View 0: 8.02 View 1: 0.25 View 2: 0.25
##
##
## Iteration 6: time=1.60, ELBO=-2192613.62, deltaELBO=1459703.064 (6.64369070%), Factors=14
## - ELBO decomposition: Z=-4473.63 W=-83246.20 Tau=-91280.55 Y=-2012144.62 AlphaW=-1468.62
## - Time spent in ELBO computation: 12.7%
## - Variance explained: View 0: 18.60% View 1: 13.10% View 2: 7.35%
## - Fraction of zero weights: View 0: 3% View 1: 4% View 2: 4%
## - Maximum correlation between factors: 0.38
## - Factor norms: 2.50 3.88 1.26 0.83 0.43 2.06 0.26 0.25 0.20 0.26 0.17 0.11 0.09 0.10
## - Tau per view (average): View 0: 8.00 View 1: 0.25 View 2: 0.25
##
##
## Iteration 7: time=1.36, Factors=13
## - Variance explained: View 0: 19.02% View 1: 13.00% View 2: 7.03%
## - Fraction of zero weights: View 0: 3% View 1: 5% View 2: 4%
## - Maximum correlation between factors: 0.37
## - Factor norms: 2.46 3.84 1.25 0.83 0.42 2.08 0.25 0.27 0.22 0.28 0.18 0.12 0.12
## - Tau per view (average): View 0: 8.13 View 1: 0.25 View 2: 0.25
##
##
## Iteration 8: time=1.32, Factors=12
## - Variance explained: View 0: 19.01% View 1: 12.84% View 2: 6.68%
## - Fraction of zero weights: View 0: 2% View 1: 5% View 2: 4%
## - Maximum correlation between factors: 0.37
## - Factor norms: 2.43 3.80 1.23 0.82 0.42 2.10 0.25 0.29 0.29 0.20 0.15 0.16
## - Tau per view (average): View 0: 8.12 View 1: 0.25 View 2: 0.25
##
##
## Iteration 9: time=1.18, Factors=11
## - Variance explained: View 0: 18.38% View 1: 11.78% View 2: 5.90%
## - Fraction of zero weights: View 0: 2% View 1: 5% View 2: 4%
## - Maximum correlation between factors: 0.34
## - Factor norms: 2.40 3.79 1.21 0.45 2.15 0.26 0.31 0.30 0.21 0.17 0.21
## - Tau per view (average): View 0: 7.99 View 1: 0.25 View 2: 0.25
##
##
## Iteration 10: time=1.12, Factors=10
## - Variance explained: View 0: 18.24% View 1: 11.46% View 2: 5.69%
## - Fraction of zero weights: View 0: 2% View 1: 5% View 2: 3%
## - Maximum correlation between factors: 0.34
## - Factor norms: 2.39 3.79 1.18 0.45 2.14 0.27 0.32 0.23 0.21 0.26
## - Tau per view (average): View 0: 7.84 View 1: 0.25 View 2: 0.25
##
##
## Iteration 11: time=1.15, ELBO=-2188167.82, deltaELBO=4445.801 (0.02023461%), Factors=9
## - ELBO decomposition: Z=-3118.52 W=-72739.71 Tau=-91277.77 Y=-2020087.81 AlphaW=-944.02
## - Time spent in ELBO computation: 17.6%
## - Variance explained: View 0: 17.95% View 1: 11.26% View 2: 5.47%
## - Fraction of zero weights: View 0: 2% View 1: 5% View 2: 3%
## - Maximum correlation between factors: 0.35
## - Factor norms: 2.37 3.78 1.16 0.45 2.12 0.28 0.24 0.23 0.30
## - Tau per view (average): View 0: 7.75 View 1: 0.25 View 2: 0.25
##
##
## Iteration 12: time=0.92, Factors=8
## - Variance explained: View 0: 16.51% View 1: 8.89% View 2: 5.05%
## - Fraction of zero weights: View 0: 2% View 1: 5% View 2: 3%
## - Maximum correlation between factors: 0.37
## - Factor norms: 2.38 3.99 0.46 2.14 0.31 0.26 0.25 0.33
## - Tau per view (average): View 0: 7.43 View 1: 0.25 View 2: 0.25
##
##
## Iteration 13: time=0.96, Factors=7
## - Variance explained: View 0: 16.24% View 1: 8.53% View 2: 4.99%
## - Fraction of zero weights: View 0: 2% View 1: 5% View 2: 4%
## - Maximum correlation between factors: 0.37
## - Factor norms: 2.35 3.98 0.47 2.12 0.33 0.28 0.26
## - Tau per view (average): View 0: 7.44 View 1: 0.25 View 2: 0.25
##
##
## Iteration 14: time=0.78, Factors=6
## - Variance explained: View 0: 15.42% View 1: 8.18% View 2: 4.83%
## - Fraction of zero weights: View 0: 2% View 1: 6% View 2: 3%
## - Maximum correlation between factors: 0.37
## - Factor norms: 2.37 4.00 2.11 0.36 0.32 0.28
## - Tau per view (average): View 0: 7.35 View 1: 0.25 View 2: 0.25
##
##
## Iteration 15: time=0.68, Factors=5
## - Variance explained: View 0: 14.84% View 1: 7.56% View 2: 4.75%
## - Fraction of zero weights: View 0: 2% View 1: 6% View 2: 3%
## - Maximum correlation between factors: 0.37
## - Factor norms: 2.33 4.05 2.08 0.36 0.29
## - Tau per view (average): View 0: 7.25 View 1: 0.25 View 2: 0.25
##
##
## Iteration 16: time=0.77, ELBO=-2194943.42, deltaELBO=-6775.600 (0.03083846%), Factors=4
## Warning, lower bound is decreasing...
## - ELBO decomposition: Z=-1617.85 W=-50180.78 Tau=-91267.70 Y=-2051457.50 AlphaW=-419.58
## - Time spent in ELBO computation: 22.7%
## - Variance explained: View 0: 14.43% View 1: 7.19% View 2: 4.67%
## - Fraction of zero weights: View 0: 2% View 1: 7% View 2: 2%
## - Maximum correlation between factors: 0.37
## - Factor norms: 2.31 4.07 2.05 0.41
## - Tau per view (average): View 0: 6.89 View 1: 0.25 View 2: 0.25
##
##
## Iteration 17: time=0.52, Factors=3
## - Variance explained: View 0: 13.32% View 1: 6.82% View 2: 4.54%
## - Fraction of zero weights: View 0: 2% View 1: 9% View 2: 2%
## - Maximum correlation between factors: 0.38
## - Factor norms: 2.28 4.10 2.02
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 18: time=0.43, Factors=3
## - Variance explained: View 0: 13.32% View 1: 6.69% View 2: 4.55%
## - Fraction of zero weights: View 0: 2% View 1: 9% View 2: 2%
## - Maximum correlation between factors: 0.38
## - Factor norms: 2.25 4.11 1.98
## - Tau per view (average): View 0: 6.67 View 1: 0.25 View 2: 0.25
##
##
## Iteration 19: time=0.49, Factors=3
## - Variance explained: View 0: 13.32% View 1: 6.59% View 2: 4.56%
## - Fraction of zero weights: View 0: 2% View 1: 9% View 2: 2%
## - Maximum correlation between factors: 0.38
## - Factor norms: 2.22 4.13 1.94
## - Tau per view (average): View 0: 6.67 View 1: 0.25 View 2: 0.25
##
##
## Iteration 20: time=0.42, Factors=3
## - Variance explained: View 0: 13.32% View 1: 6.52% View 2: 4.57%
## - Fraction of zero weights: View 0: 2% View 1: 10% View 2: 2%
## - Maximum correlation between factors: 0.39
## - Factor norms: 2.20 4.16 1.91
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 21: time=0.59, ELBO=-2198382.28, deltaELBO=-3438.861 (0.01565163%), Factors=3
## Warning, lower bound is decreasing...
## - ELBO decomposition: Z=-1289.07 W=-44235.69 Tau=-91265.34 Y=-2061277.48 AlphaW=-314.70
## - Time spent in ELBO computation: 24.2%
## - Variance explained: View 0: 13.32% View 1: 6.47% View 2: 4.58%
## - Fraction of zero weights: View 0: 2% View 1: 10% View 2: 2%
## - Maximum correlation between factors: 0.39
## - Factor norms: 2.17 4.18 1.88
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 22: time=0.42, Factors=3
## - Variance explained: View 0: 13.32% View 1: 6.42% View 2: 4.59%
## - Fraction of zero weights: View 0: 2% View 1: 11% View 2: 2%
## - Maximum correlation between factors: 0.39
## - Factor norms: 2.14 4.20 1.84
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 23: time=0.44, Factors=3
## - Variance explained: View 0: 13.32% View 1: 6.39% View 2: 4.60%
## - Fraction of zero weights: View 0: 1% View 1: 12% View 2: 2%
## - Maximum correlation between factors: 0.39
## - Factor norms: 2.12 4.22 1.81
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 24: time=0.43, Factors=3
## - Variance explained: View 0: 13.32% View 1: 6.35% View 2: 4.61%
## - Fraction of zero weights: View 0: 1% View 1: 12% View 2: 2%
## - Maximum correlation between factors: 0.39
## - Factor norms: 2.09 4.24 1.78
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 25: time=0.45, Factors=3
## - Variance explained: View 0: 13.32% View 1: 6.33% View 2: 4.62%
## - Fraction of zero weights: View 0: 2% View 1: 13% View 2: 2%
## - Maximum correlation between factors: 0.39
## - Factor norms: 2.07 4.26 1.75
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 26: time=0.55, ELBO=-2198013.73, deltaELBO=368.546 (0.00167740%), Factors=3
## - ELBO decomposition: Z=-1285.37 W=-44116.87 Tau=-91265.36 Y=-2061031.44 AlphaW=-314.69
## - Time spent in ELBO computation: 28.9%
## - Variance explained: View 0: 13.33% View 1: 6.30% View 2: 4.62%
## - Fraction of zero weights: View 0: 2% View 1: 14% View 2: 2%
## - Maximum correlation between factors: 0.39
## - Factor norms: 2.04 4.28 1.72
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 27: time=0.46, Factors=3
## - Variance explained: View 0: 13.33% View 1: 6.27% View 2: 4.63%
## - Fraction of zero weights: View 0: 1% View 1: 14% View 2: 3%
## - Maximum correlation between factors: 0.38
## - Factor norms: 2.02 4.29 1.70
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 28: time=0.44, Factors=3
## - Variance explained: View 0: 13.33% View 1: 6.25% View 2: 4.64%
## - Fraction of zero weights: View 0: 1% View 1: 15% View 2: 3%
## - Maximum correlation between factors: 0.38
## - Factor norms: 1.99 4.31 1.67
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 29: time=0.43, Factors=3
## - Variance explained: View 0: 13.33% View 1: 6.23% View 2: 4.64%
## - Fraction of zero weights: View 0: 1% View 1: 15% View 2: 3%
## - Maximum correlation between factors: 0.38
## - Factor norms: 1.97 4.32 1.64
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 30: time=0.45, Factors=3
## - Variance explained: View 0: 13.33% View 1: 6.20% View 2: 4.65%
## - Fraction of zero weights: View 0: 1% View 1: 16% View 2: 3%
## - Maximum correlation between factors: 0.38
## - Factor norms: 1.94 4.34 1.62
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 31: time=0.57, ELBO=-2197684.67, deltaELBO=329.066 (0.00149771%), Factors=3
## - ELBO decomposition: Z=-1281.44 W=-44008.81 Tau=-91265.15 Y=-2060814.57 AlphaW=-314.70
## - Time spent in ELBO computation: 28.2%
## - Variance explained: View 0: 13.33% View 1: 6.18% View 2: 4.65%
## - Fraction of zero weights: View 0: 1% View 1: 16% View 2: 3%
## - Maximum correlation between factors: 0.37
## - Factor norms: 1.92 4.35 1.59
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 32: time=0.51, Factors=3
## - Variance explained: View 0: 13.33% View 1: 6.15% View 2: 4.66%
## - Fraction of zero weights: View 0: 1% View 1: 17% View 2: 3%
## - Maximum correlation between factors: 0.37
## - Factor norms: 1.90 4.36 1.57
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 33: time=0.47, Factors=3
## - Variance explained: View 0: 13.33% View 1: 6.13% View 2: 4.67%
## - Fraction of zero weights: View 0: 1% View 1: 17% View 2: 3%
## - Maximum correlation between factors: 0.37
## - Factor norms: 1.88 4.37 1.55
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 34: time=0.50, Factors=3
## - Variance explained: View 0: 13.33% View 1: 6.11% View 2: 4.67%
## - Fraction of zero weights: View 0: 1% View 1: 18% View 2: 3%
## - Maximum correlation between factors: 0.36
## - Factor norms: 1.85 4.38 1.53
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 35: time=0.49, Factors=3
## - Variance explained: View 0: 13.33% View 1: 6.08% View 2: 4.67%
## - Fraction of zero weights: View 0: 1% View 1: 19% View 2: 4%
## - Maximum correlation between factors: 0.36
## - Factor norms: 1.83 4.39 1.50
## - Tau per view (average): View 0: 6.68 View 1: 0.25 View 2: 0.25
##
##
## Iteration 36: time=0.62, ELBO=-2197402.89, deltaELBO=281.776 (0.00128247%), Factors=3
## - ELBO decomposition: Z=-1277.43 W=-43897.97 Tau=-91265.39 Y=-2060647.39 AlphaW=-314.71
## - Time spent in ELBO computation: 28.1%
## - Variance explained: View 0: 13.33% View 1: 6.06% View 2: 4.68%
## - Fraction of zero weights: View 0: 2% View 1: 19% View 2: 4%
## - Maximum correlation between factors: 0.35
## - Factor norms: 1.81 4.39 1.48
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 37: time=0.52, Factors=3
## - Variance explained: View 0: 13.33% View 1: 6.03% View 2: 4.68%
## - Fraction of zero weights: View 0: 2% View 1: 19% View 2: 4%
## - Maximum correlation between factors: 0.35
## - Factor norms: 1.79 4.40 1.46
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 38: time=0.45, Factors=3
## - Variance explained: View 0: 13.33% View 1: 6.01% View 2: 4.69%
## - Fraction of zero weights: View 0: 2% View 1: 20% View 2: 4%
## - Maximum correlation between factors: 0.35
## - Factor norms: 1.77 4.41 1.44
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 39: time=0.54, Factors=3
## - Variance explained: View 0: 13.33% View 1: 5.99% View 2: 4.69%
## - Fraction of zero weights: View 0: 2% View 1: 20% View 2: 4%
## - Maximum correlation between factors: 0.34
## - Factor norms: 1.76 4.41 1.43
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 40: time=0.43, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.96% View 2: 4.69%
## - Fraction of zero weights: View 0: 2% View 1: 21% View 2: 4%
## - Maximum correlation between factors: 0.34
## - Factor norms: 1.74 4.41 1.41
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 41: time=0.55, ELBO=-2197170.73, deltaELBO=232.163 (0.00105666%), Factors=3
## - ELBO decomposition: Z=-1273.41 W=-43783.47 Tau=-91265.39 Y=-2060533.77 AlphaW=-314.70
## - Time spent in ELBO computation: 26.9%
## - Variance explained: View 0: 13.34% View 1: 5.94% View 2: 4.70%
## - Fraction of zero weights: View 0: 1% View 1: 21% View 2: 5%
## - Maximum correlation between factors: 0.33
## - Factor norms: 1.72 4.42 1.39
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 42: time=0.47, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.92% View 2: 4.70%
## - Fraction of zero weights: View 0: 2% View 1: 21% View 2: 5%
## - Maximum correlation between factors: 0.33
## - Factor norms: 1.70 4.42 1.37
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 43: time=0.49, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.90% View 2: 4.70%
## - Fraction of zero weights: View 0: 2% View 1: 22% View 2: 5%
## - Maximum correlation between factors: 0.32
## - Factor norms: 1.69 4.42 1.36
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 44: time=0.49, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.88% View 2: 4.71%
## - Fraction of zero weights: View 0: 2% View 1: 22% View 2: 5%
## - Maximum correlation between factors: 0.32
## - Factor norms: 1.67 4.42 1.34
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 45: time=0.53, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.85% View 2: 4.71%
## - Fraction of zero weights: View 0: 1% View 1: 23% View 2: 5%
## - Maximum correlation between factors: 0.31
## - Factor norms: 1.66 4.42 1.33
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 46: time=0.58, ELBO=-2196987.23, deltaELBO=183.497 (0.00083517%), Factors=3
## - ELBO decomposition: Z=-1269.42 W=-43668.31 Tau=-91265.52 Y=-2060469.30 AlphaW=-314.69
## - Time spent in ELBO computation: 25.6%
## - Variance explained: View 0: 13.34% View 1: 5.83% View 2: 4.71%
## - Fraction of zero weights: View 0: 1% View 1: 23% View 2: 6%
## - Maximum correlation between factors: 0.31
## - Factor norms: 1.64 4.42 1.31
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 47: time=0.52, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.81% View 2: 4.71%
## - Fraction of zero weights: View 0: 1% View 1: 23% View 2: 6%
## - Maximum correlation between factors: 0.30
## - Factor norms: 1.63 4.42 1.30
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 48: time=0.48, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.79% View 2: 4.71%
## - Fraction of zero weights: View 0: 1% View 1: 24% View 2: 6%
## - Maximum correlation between factors: 0.30
## - Factor norms: 1.61 4.42 1.29
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 49: time=0.48, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.77% View 2: 4.71%
## - Fraction of zero weights: View 0: 1% View 1: 24% View 2: 6%
## - Maximum correlation between factors: 0.30
## - Factor norms: 1.60 4.41 1.27
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 50: time=0.47, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.75% View 2: 4.71%
## - Fraction of zero weights: View 0: 1% View 1: 24% View 2: 7%
## - Maximum correlation between factors: 0.29
## - Factor norms: 1.59 4.41 1.26
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 51: time=0.54, ELBO=-2196845.56, deltaELBO=141.670 (0.00064480%), Factors=3
## - ELBO decomposition: Z=-1265.47 W=-43556.54 Tau=-91265.47 Y=-2060443.38 AlphaW=-314.70
## - Time spent in ELBO computation: 23.4%
## - Variance explained: View 0: 13.34% View 1: 5.73% View 2: 4.72%
## - Fraction of zero weights: View 0: 1% View 1: 25% View 2: 7%
## - Maximum correlation between factors: 0.29
## - Factor norms: 1.57 4.40 1.25
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 52: time=0.43, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.71% View 2: 4.72%
## - Fraction of zero weights: View 0: 1% View 1: 25% View 2: 7%
## - Maximum correlation between factors: 0.28
## - Factor norms: 1.56 4.40 1.24
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 53: time=0.44, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.69% View 2: 4.72%
## - Fraction of zero weights: View 0: 1% View 1: 25% View 2: 7%
## - Maximum correlation between factors: 0.28
## - Factor norms: 1.55 4.39 1.22
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 54: time=0.43, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.68% View 2: 4.72%
## - Fraction of zero weights: View 0: 1% View 1: 26% View 2: 7%
## - Maximum correlation between factors: 0.27
## - Factor norms: 1.54 4.39 1.21
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 55: time=0.44, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.66% View 2: 4.72%
## - Fraction of zero weights: View 0: 1% View 1: 26% View 2: 7%
## - Maximum correlation between factors: 0.27
## - Factor norms: 1.53 4.38 1.20
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 56: time=0.54, ELBO=-2196739.26, deltaELBO=106.298 (0.00048380%), Factors=3
## - ELBO decomposition: Z=-1261.57 W=-43451.99 Tau=-91265.48 Y=-2060445.52 AlphaW=-314.70
## - Time spent in ELBO computation: 26.1%
## - Variance explained: View 0: 13.34% View 1: 5.64% View 2: 4.72%
## - Fraction of zero weights: View 0: 1% View 1: 26% View 2: 8%
## - Maximum correlation between factors: 0.27
## - Factor norms: 1.52 4.38 1.19
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 57: time=0.47, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.62% View 2: 4.72%
## - Fraction of zero weights: View 0: 1% View 1: 26% View 2: 8%
## - Maximum correlation between factors: 0.26
## - Factor norms: 1.51 4.37 1.18
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 58: time=0.42, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.61% View 2: 4.72%
## - Fraction of zero weights: View 0: 1% View 1: 27% View 2: 8%
## - Maximum correlation between factors: 0.26
## - Factor norms: 1.50 4.36 1.17
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 59: time=0.42, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.59% View 2: 4.72%
## - Fraction of zero weights: View 0: 1% View 1: 27% View 2: 8%
## - Maximum correlation between factors: 0.25
## - Factor norms: 1.49 4.35 1.16
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 60: time=0.58, Factors=3
## - Variance explained: View 0: 13.34% View 1: 5.58% View 2: 4.71%
## - Fraction of zero weights: View 0: 1% View 1: 27% View 2: 8%
## - Maximum correlation between factors: 0.25
## - Factor norms: 1.48 4.34 1.15
## - Tau per view (average): View 0: 6.69 View 1: 0.25 View 2: 0.25
##
##
## Iteration 61: time=0.62, ELBO=-2196660.34, deltaELBO=78.928 (0.00035923%), Factors=3
## - ELBO decomposition: Z=-1257.77 W=-43357.37 Tau=-91265.59 Y=-2060464.91 AlphaW=-314.69
## - Time spent in ELBO computation: 25.2%
##
## Converged!
##
##
##
## #######################
## ## Training finished ##
## #######################
##
##
## Saving model in C:\Users\qdread\AppData\Local\Temp\RtmpI1ircz/mofa_20240107-060239.hdf5...
MOFAobject
## Trained MOFA with the following characteristics:
## Number of views: 3
## Views names: scRNAseq scBSseq scATACseq
## Number of features (per view): 12145 4524 9471
## Number of groups: 1
## Groups names: group1
## Number of samples (per group): 113
## Number of factors: 3
Let us check the covariance of the OMICs layers. The plots generated
by plot_variance_explained()
are probably the most
important outputs of MOFA.
#ANALYZE RESULTS OF MOFA INTEGRATION
head(get_variance_explained(MOFAobject)$r2_total[[1]])
## scRNAseq scBSseq scATACseq
## 13.339221 5.562496 4.713696
head(get_variance_explained(MOFAobject)$r2_per_factor[[1]])
## scRNAseq scBSseq scATACseq
## Factor1 6.923169 5.49092293 4.49206829
## Factor2 3.559494 0.03377795 0.16060472
## Factor3 3.318763 0.03517270 0.06973147
get_variance_explained(MOFAobject)
## $r2_total
## $r2_total$group1
## scRNAseq scBSseq scATACseq
## 13.339221 5.562496 4.713696
##
##
## $r2_per_factor
## $r2_per_factor$group1
## scRNAseq scBSseq scATACseq
## Factor1 6.923169 5.49092293 4.49206829
## Factor2 3.559494 0.03377795 0.16060472
## Factor3 3.318763 0.03517270 0.06973147
MOFAobject@cache$variance_explained
## $r2_total
## $r2_total$group1
## scRNAseq scBSseq scATACseq
## 13.339221 5.562496 4.713696
##
##
## $r2_per_factor
## $r2_per_factor$group1
## scRNAseq scBSseq scATACseq
## Factor1 6.923169 5.49092293 4.49206829
## Factor2 3.559494 0.03377795 0.16060472
## Factor3 3.318763 0.03517270 0.06973147
plot_list <- plot_variance_explained(MOFAobject, x='view', y='factor', plot_total = T)
plot_list[[2]] #total variance
plot_list[[1]] #variance by factor
We can see that scRNAseq provides the largest variation (13%) in the total integrative OMICs scNMT data set, and scBSseq and scATACseq contribute around 5% of variation. We also observe that MOFA selected 3 Latent Factors (LFs); scRNAseq contributes to all of them while scBSseq and scATACseq contribute only to the first LF. Now let us interpret the LFs by displaying the features associated with each LF:
NumFactors<-dim(get_factors(MOFAobject)$group1)[2]
plot_weights(MOFAobject, view = "scRNAseq", factor = 1)
plot_top_weights(object = MOFAobject, view = "scRNAseq", factor = 1, nfeatures = 10)
plot_data_heatmap(object = MOFAobject, view = "scRNAseq", factor = "Factor1", features = 10, transpose = F, show_colnames=TRUE, show_rownames=TRUE,cluster_cols = TRUE)
plot_weights(MOFAobject, view = "scBSseq", factor = 1)
plot_top_weights(object = MOFAobject, view = "scBSseq", factor = 1, nfeatures = 10)
plot_data_heatmap(object = MOFAobject, view = "scBSseq", factor = "Factor1", features = 10, transpose = F, show_colnames=TRUE, show_rownames=TRUE,cluster_cols = TRUE)
plot_weights(MOFAobject, view = "scATACseq", factor = 1)
plot_top_weights(object = MOFAobject, view = "scATACseq", factor = 1, nfeatures = 10)
plot_data_heatmap(object = MOFAobject, view = "scATACseq", factor = "Factor1", features = 10, transpose = F, show_colnames=TRUE, show_rownames=TRUE,cluster_cols = TRUE)
Above we ranked by their importance and visualized features from each OMIC that contribute the most to the Factor 1 where all three OMICs (scRNAseq, scBSseq and scATACseq) are strongly correlated. Therefore one can expect those features to have something in common, i.e. they should also be correlated or involved in the same biological pathway.
Let us also display the cells in the low-dimensional latent space. for this purpose we will input MOFA factors into tSNE and UMAP, which are machine-learning dimension reduction and visualization methods. Display the cells as a consensus plot for scRNAseq, scBSseq and scATACseq:
controls<-c("EB_P1D12","EB_P1E12","EB_P1F12","EB_P1G12","EB_P2B12","EB_P2D12","EB_P2E12","EB_P2F12","EB_P2G12",
"ESC_B12","ESC_C12","ESC_D12")
colnames(scRNAseq)<-ifelse(colnames(scRNAseq)%in%controls,"CONTROL_CELL",colnames(scRNAseq))
cell_types<-matrix(unlist(strsplit(colnames(scRNAseq),"_")),ncol=2,byrow=TRUE)[,1]
sample_metadata <- data.frame(sample = samples_names(MOFAobject)[[1]],condition = cell_types)
samples_metadata(MOFAobject) <- sample_metadata
head(samples_metadata(MOFAobject), n=3)
## sample condition group
## 1 ESC_A02 ESC group1
## 2 ESC_A03 ESC group1
## 3 ESC_A04 ESC group1
set.seed(12345)
MOFAobject <- run_umap(MOFAobject, n_neighbors = round(sqrt(length(cell_types)),0), min_dist = 1)
MOFAobject <- run_tsne(MOFAobject, perplexity = round(sqrt(length(cell_types)),0))
plot_dimred(MOFAobject, method = "TSNE", dot_size = 3, color_by = "condition")
plot_dimred(MOFAobject, method = "UMAP", dot_size = 3, color_by = "condition")
We conclude that the MOFA unsupervised integrative OMICs approach can distinguish between all three types of cells: ESCs, EBs and Controls. Now let us see what MOFA factors contribute to different clusters:
plot_dimred(MOFAobject, method = "UMAP", dot_size = 3, color_by = "Factor1")
plot_dimred(MOFAobject, method = "UMAP", dot_size = 3, color_by = "Factor2")
plot_dimred(MOFAobject, method = "UMAP", dot_size = 3, color_by = "Factor3")
We can see that the ESC cluster is almost entirely driven by the MOFA Factor 1 meaning that all three OMICs (scRNAseq, scBSseq and scATACseq) contribute more or less equally into this cluster. In contrast the Control cluster is driven by the factor 2, which is mainly due to scRNAseq contribution. Finally, Factor 3 contributes more or less equally to all the three clusters, implying that scRNAseq impacts all the three clusters.
sessionInfo()
## R version 4.3.1 (2023-06-16 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19045)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=English_United States.utf8
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## time zone: America/New_York
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] MOFA2_1.10.0 mixOmics_6.24.0 ggplot2_3.4.4 lattice_0.21-8
## [5] MASS_7.3-60
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.4 dir.expiry_1.8.0 ellipse_0.5.0
## [4] xfun_0.40 bslib_0.5.1 corrplot_0.92
## [7] ggrepel_0.9.3 rhdf5_2.44.0 rhdf5filters_1.12.1
## [10] vctrs_0.6.4 tools_4.3.1 generics_0.1.3
## [13] stats4_4.3.1 parallel_4.3.1 tibble_3.2.1
## [16] fansi_1.0.5 rARPACK_0.11-0 pkgconfig_2.0.3
## [19] pheatmap_1.0.12 Matrix_1.6-1.1 RColorBrewer_1.1-3
## [22] S4Vectors_0.38.1 lifecycle_1.0.3 farver_2.1.1
## [25] compiler_4.3.1 stringr_1.5.0 munsell_0.5.0
## [28] codetools_0.2-19 htmltools_0.5.6.1 sass_0.4.7
## [31] yaml_2.3.7 crayon_1.5.2 pillar_1.9.0
## [34] jquerylib_0.1.4 tidyr_1.3.0 uwot_0.1.16
## [37] BiocParallel_1.34.2 cachem_1.0.8 DelayedArray_0.26.7
## [40] abind_1.4-5 RSpectra_0.16-1 basilisk_1.12.1
## [43] tidyselect_1.2.0 digest_0.6.33 Rtsne_0.16
## [46] stringi_1.7.12 dplyr_1.1.3 reshape2_1.4.4
## [49] purrr_1.0.2 labeling_0.4.3 forcats_1.0.0
## [52] cowplot_1.1.1 fastmap_1.1.1 grid_4.3.1
## [55] colorspace_2.1-0 cli_3.6.1 magrittr_2.0.3
## [58] S4Arrays_1.0.6 utf8_1.2.4 corpcor_1.6.10
## [61] withr_2.5.1 filelock_1.0.2 scales_1.2.1
## [64] rmarkdown_2.25 matrixStats_1.0.0 igraph_1.5.1
## [67] reticulate_1.32.0.9001 gridExtra_2.3 png_0.1-8
## [70] HDF5Array_1.28.1 evaluate_0.22 knitr_1.44
## [73] IRanges_2.34.1 basilisk.utils_1.12.1 RcppAnnoy_0.0.21
## [76] rlang_1.1.1 Rcpp_1.0.11 glue_1.6.2
## [79] BiocGenerics_0.46.0 rstudioapi_0.15.0 jsonlite_1.8.7
## [82] Rhdf5lib_1.22.1 R6_2.5.1 plyr_1.8.9
## [85] MatrixGenerics_1.12.3