Last updated: 2018-04-25
library(mashr)
Loading required package: ashr
library(corrplot)
corrplot 0.84 loaded
source('../code/MashSource.R')
source('../code/sim_mean_sig.R')
In MeanSignal, the data was generated with signals in the first c conditions (\(c_{j,1}, \cdots, c_{j,c}\)). The contrast matrix L used there discards the last condition. The deviations are \(\hat{c}_{j,1} - \bar{\hat{c}_{j}}, \hat{c}_{j,2} - \bar{\hat{c}_{j}}, \cdots, \hat{c}_{j,R-1} - \bar{\hat{c}_{j}}\).
However, the contrast matrix L can discard any deviation from \(\hat{c}_{j,1} - \bar{\hat{c}_{j}}, \cdots, \hat{c}_{j,R} - \bar{\hat{c}_{j}}\). Let’s see whether the choice of the discarded deviation could influence the reuslt.
We run the same model with L that discard the first deviation.
set.seed(1)
R = 10
C = 2
data = sim.mean.sig(nsamp=10000, ncond=C)
L.1 = matrix(-1/R, R, R)
L.1[cbind(1:R,1:R)] = (R-1)/R
L.1 = L.1[2:R,]
row.names(L.1) = seq(2,R)
mash_data.1 = mash_set_data(Bhat=data$Chat, Shat=data$Shat)
mash_data_L.1 = mash_set_data_contrast(mash_data.1, L.1)
U.c = cov_canonical(mash_data_L.1)
# data driven
# select max
m.1by1 = mash_1by1(mash_data_L.1, alpha=1)
strong = get_significant_results(m.1by1,0.05)
# center Z
mash_data_L.center.1 = mash_data_L.1
mash_data_L.center.1$Bhat = mash_data_L.1$Bhat/mash_data_L.1$Shat # obtain z
mash_data_L.center.1$Shat = matrix(1, nrow(mash_data_L.1$Bhat),ncol(mash_data_L.1$Bhat))
mash_data_L.center.1$Bhat = apply(mash_data_L.center.1$Bhat, 2, function(x) x - mean(x))
U.pca = cov_pca(mash_data_L.center.1,2,strong)
U.ed = cov_ed(mash_data_L.center.1, U.pca, strong)
mashcontrast.model.1 = mash(mash_data_L.1, c(U.c, U.ed), algorithm.version = 'R', verbose = FALSE)
Using mashcommonbaseline
model, there are 283 discoveries. The covariance structure found here is: The correlation PCA 1 is:
mashcontrast.model.full.1 = mashcontrast.model.1
mashcontrast.model.full.1$result = mash_compute_posterior_matrices(g = mashcontrast.model.1, data = mash_data_L.1, algorithm.version = 'R', recover=TRUE)
There are 285 discoveries.
Indep.data.1 = mash_set_data(Bhat = mash_data_L.1$Bhat,
Shat = matrix(sqrt(0.5-1/(R*2)), nrow(data$Chat), R-1))
Indep.model.1 = mash(Indep.data.1, c(U.c, U.ed), algorithm.version = 'R', verbose = FALSE)
For mashIndep
model, there are 217 discoveries, which is less than the mashcommonbaseline
model. The covariance structure found here is:
The correlation for PCA2 and tPCA is:
Indep.model.full.1 = Indep.model.1
Indep.model.full.1$result = mash_compute_posterior_matrices(g = Indep.model.1, data = Indep.data.1, algorithm.version = 'R', recover=TRUE)
There are 219 discoveries.
The RRMSE plot:
We check the False Positive Rate and True Positive Rate. \[FPR = \frac{|N\cap S|}{|N|} \quad TPR = \frac{|CS\cap S|}{|T|} \]
Using this contrast L, the results from mashcommonbaseline
is better than mashIndep
.
The choice of L could influence the result.
sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mvtnorm_1.0-7 plyr_1.8.4 assertthat_0.2.0 corrplot_0.84
[5] mashr_0.2-6 ashr_2.2-7
loaded via a namespace (and not attached):
[1] Rcpp_0.12.16 knitr_1.20
[3] magrittr_1.5 REBayes_1.3
[5] MASS_7.3-49 doParallel_1.0.11
[7] pscl_1.5.2 SQUAREM_2017.10-1
[9] lattice_0.20-35 ExtremeDeconvolution_1.3
[11] foreach_1.4.4 stringr_1.3.0
[13] tools_3.4.4 parallel_3.4.4
[15] grid_3.4.4 rmeta_3.0
[17] htmltools_0.3.6 iterators_1.0.9
[19] yaml_2.1.18 rprojroot_1.3-2
[21] digest_0.6.15 Matrix_1.2-14
[23] codetools_0.2-15 evaluate_0.10.1
[25] rmarkdown_1.9 stringi_1.1.7
[27] compiler_3.4.4 Rmosek_8.0.69
[29] backports_1.1.2 truncnorm_1.0-8
This R Markdown site was created with workflowr