Last updated: 2018-05-12
workflowr checks: (Click a bullet for more information) ✔ R Markdown file: up-to-date 
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
 ✔ Environment: empty 
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
 ✔ Seed: 
set.seed(12345) 
The command set.seed(12345) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
 ✔ Session information: recorded 
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
 ✔ Repository version: ddf9062 
wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    analysis/.DS_Store
    Ignored:    analysis/BH_robustness_cache/
    Ignored:    analysis/FDR_Null_cache/
    Ignored:    analysis/FDR_null_betahat_cache/
    Ignored:    analysis/Rmosek_cache/
    Ignored:    analysis/StepDown_cache/
    Ignored:    analysis/alternative2_cache/
    Ignored:    analysis/alternative_cache/
    Ignored:    analysis/ash_gd_cache/
    Ignored:    analysis/average_cor_gtex_2_cache/
    Ignored:    analysis/average_cor_gtex_cache/
    Ignored:    analysis/brca_cache/
    Ignored:    analysis/cash_deconv_cache/
    Ignored:    analysis/cash_fdr_1_cache/
    Ignored:    analysis/cash_fdr_2_cache/
    Ignored:    analysis/cash_fdr_3_cache/
    Ignored:    analysis/cash_fdr_4_cache/
    Ignored:    analysis/cash_fdr_5_cache/
    Ignored:    analysis/cash_fdr_6_cache/
    Ignored:    analysis/cash_plots_cache/
    Ignored:    analysis/cash_sim_1_cache/
    Ignored:    analysis/cash_sim_2_cache/
    Ignored:    analysis/cash_sim_3_cache/
    Ignored:    analysis/cash_sim_4_cache/
    Ignored:    analysis/cash_sim_5_cache/
    Ignored:    analysis/cash_sim_6_cache/
    Ignored:    analysis/cash_sim_7_cache/
    Ignored:    analysis/correlated_z_2_cache/
    Ignored:    analysis/correlated_z_3_cache/
    Ignored:    analysis/correlated_z_cache/
    Ignored:    analysis/create_null_cache/
    Ignored:    analysis/cutoff_null_cache/
    Ignored:    analysis/design_matrix_2_cache/
    Ignored:    analysis/design_matrix_cache/
    Ignored:    analysis/diagnostic_ash_cache/
    Ignored:    analysis/diagnostic_correlated_z_2_cache/
    Ignored:    analysis/diagnostic_correlated_z_3_cache/
    Ignored:    analysis/diagnostic_correlated_z_cache/
    Ignored:    analysis/diagnostic_plot_2_cache/
    Ignored:    analysis/diagnostic_plot_cache/
    Ignored:    analysis/efron_leukemia_cache/
    Ignored:    analysis/fitting_normal_cache/
    Ignored:    analysis/gaussian_derivatives_2_cache/
    Ignored:    analysis/gaussian_derivatives_3_cache/
    Ignored:    analysis/gaussian_derivatives_4_cache/
    Ignored:    analysis/gaussian_derivatives_5_cache/
    Ignored:    analysis/gaussian_derivatives_cache/
    Ignored:    analysis/gd-ash_cache/
    Ignored:    analysis/gd_delta_cache/
    Ignored:    analysis/gd_lik_2_cache/
    Ignored:    analysis/gd_lik_cache/
    Ignored:    analysis/gd_w_cache/
    Ignored:    analysis/knockoff_10_cache/
    Ignored:    analysis/knockoff_2_cache/
    Ignored:    analysis/knockoff_3_cache/
    Ignored:    analysis/knockoff_4_cache/
    Ignored:    analysis/knockoff_5_cache/
    Ignored:    analysis/knockoff_6_cache/
    Ignored:    analysis/knockoff_7_cache/
    Ignored:    analysis/knockoff_8_cache/
    Ignored:    analysis/knockoff_9_cache/
    Ignored:    analysis/knockoff_cache/
    Ignored:    analysis/knockoff_var_cache/
    Ignored:    analysis/marginal_z_alternative_cache/
    Ignored:    analysis/marginal_z_cache/
    Ignored:    analysis/mosek_reg_2_cache/
    Ignored:    analysis/mosek_reg_4_cache/
    Ignored:    analysis/mosek_reg_5_cache/
    Ignored:    analysis/mosek_reg_6_cache/
    Ignored:    analysis/mosek_reg_cache/
    Ignored:    analysis/pihat0_null_cache/
    Ignored:    analysis/plot_diagnostic_cache/
    Ignored:    analysis/poster_obayes17_cache/
    Ignored:    analysis/real_data_simulation_2_cache/
    Ignored:    analysis/real_data_simulation_3_cache/
    Ignored:    analysis/real_data_simulation_4_cache/
    Ignored:    analysis/real_data_simulation_5_cache/
    Ignored:    analysis/real_data_simulation_cache/
    Ignored:    analysis/rmosek_primal_dual_2_cache/
    Ignored:    analysis/rmosek_primal_dual_cache/
    Ignored:    analysis/seqgendiff_cache/
    Ignored:    analysis/simulated_correlated_null_2_cache/
    Ignored:    analysis/simulated_correlated_null_3_cache/
    Ignored:    analysis/simulated_correlated_null_cache/
    Ignored:    analysis/simulation_real_se_2_cache/
    Ignored:    analysis/simulation_real_se_cache/
    Ignored:    analysis/smemo_2_cache/
    Ignored:    data/LSI/
    Ignored:    docs/.DS_Store
    Ignored:    docs/figure/.DS_Store
    Ignored:    output/fig/
Unstaged changes:
    Deleted:    analysis/cash_plots_fdp.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
Last updated: 2018-05-12
Code version: ddf9062
From previous simulation on FDR on correlated null we get a sense that Benjamini-Hochberg is surprisingly robust under correlation. At the same time, there are cases when BH erroneously produces a large number of false discoveries. Therefore, we take a close look at those data sets BH fails on.
z = read.table("../output/z_null_liver_777.txt")
p = read.table("../output/p_null_liver_777.txt")
n = dim(z)
pihat0 = read.table("../output/pihat0_z_null_liver_777.txt")
pihat0 = as.numeric(t(pihat0))
fd.bh = c()
for (i in 1 : n[1]) {
  p_BH = p.adjust(p[i, ], method = "BH")
  fd.bh[i] = sum(p_BH <= 0.05)
}
We take a look that all data sets where BH erroneously produces at least one false discovery (green), compared with \(\hat\pi_0\) produced by ash on these same data sets.
fdn = sum(fd.bh >= 1)
I = order(fd.bh, decreasing = TRUE)[1:fdn]
x = seq(-10, 10, 0.01)
y = dnorm(x)
k = 1
m = n[2]
for (j in I) {
  cat("N0.", k, ": Data Set", j, "; Number of False Discoveries:", fd.bh[j], "; pihat0 =", pihat0[j])
  hist(as.numeric(z[j, ]), xlab = "z scores", freq = FALSE, ylim = c(0, 0.45), nclass = 100, main = "10000 z scores, 100 bins")
  lines(x, y, col = "red")
  qqnorm(as.numeric(z[j, ]), main = "Normal Q-Q plot for z scores")
  abline(0, 1)
  plot(sort(as.numeric(p[j, ])), xlab = "Order", ylab = "Ordered p value", main = "All p values")
  abline(0, 1 / m, col = "blue")
  abline(0, 0.05 / m, col = "red")
  legend("top", lty = 1, col = c("blue", "red"), c("Uniform", "BH, FDR = 0.05"))
  
  pj = sort(as.numeric(p[j, ]))
  plot(pj[1:max(100, fd.bh[j])], xlab = "Order", ylab = "Ordered p value", main = "Zoom-in to 100 smallest p values", ylim = c(0, pj[max(100, fd.bh[j])]))
  abline(0, 1 / m, col = "blue")
  points(pj[1:fd.bh[j]], col = "green", pch = 19)
  abline(0, 0.05 / m, col = "red")
  legend("top", lty = 1, col = c("blue", "red"), c("Uniform", "BH, FDR = 0.05"))
  
  plot(pj[pj <= 0.01], xlab = "Order", ylab = "Ordered p value", main = "Zoom-in to p values <= 0.01", ylim = c(0, 0.01))
  abline(0, 1 / m, col = "blue")
  points(pj[1:fd.bh[j]], col = "green", pch = 19)
  abline(0, 0.05 / m, col = "red")
  legend("top", lty = 1, col = c("blue", "red"), c("Uniform", "BH, FDR = 0.05"))
  
  plot(pj[pj <= 0.05], xlab = "Order", ylab = "Ordered p value", main = "Zoom-in to p values <= 0.05", ylim = c(0, 0.05))
  abline(0, 1 / m, col = "blue")
  points(pj[1:fd.bh[j]], col = "green", pch = 19)
  abline(0, 0.05 / m, col = "red")
  legend("top", lty = 1, col = c("blue", "red"), c("Uniform", "BH, FDR = 0.05"))
  
  k = k + 1
}
N0. 1 : Data Set 355 ; Number of False Discoveries: 639 ; pihat0 = 0.04750946






N0. 2 : Data Set 327 ; Number of False Discoveries: 489 ; pihat0 = 0.1522796






N0. 3 : Data Set 23 ; Number of False Discoveries: 408 ; pihat0 = 0.03139009






N0. 4 : Data Set 122 ; Number of False Discoveries: 331 ; pihat0 = 0.04301549






N0. 5 : Data Set 783 ; Number of False Discoveries: 170 ; pihat0 = 0.06208376






N0. 6 : Data Set 749 ; Number of False Discoveries: 114 ; pihat0 = 0.02583276






N0. 7 : Data Set 724 ; Number of False Discoveries: 79 ; pihat0 = 0.01606004






N0. 8 : Data Set 56 ; Number of False Discoveries: 35 ; pihat0 = 0.04829662






N0. 9 : Data Set 840 ; Number of False Discoveries: 28 ; pihat0 = 0.02316832






N0. 10 : Data Set 858 ; Number of False Discoveries: 16 ; pihat0 = 0.05110069






N0. 11 : Data Set 771 ; Number of False Discoveries: 12 ; pihat0 = 0.04316169






N0. 12 : Data Set 389 ; Number of False Discoveries: 11 ; pihat0 = 0.03156173






N0. 13 : Data Set 485 ; Number of False Discoveries: 9 ; pihat0 = 0.04794909






N0. 14 : Data Set 77 ; Number of False Discoveries: 7 ; pihat0 = 0.05151585






N0. 15 : Data Set 503 ; Number of False Discoveries: 7 ; pihat0 = 0.1545398






N0. 16 : Data Set 984 ; Number of False Discoveries: 7 ; pihat0 = 0.04567195






N0. 17 : Data Set 360 ; Number of False Discoveries: 6 ; pihat0 = 0.0308234






N0. 18 : Data Set 522 ; Number of False Discoveries: 4 ; pihat0 = 0.02083846






N0. 19 : Data Set 51 ; Number of False Discoveries: 3 ; pihat0 = 0.2435437






N0. 20 : Data Set 316 ; Number of False Discoveries: 3 ; pihat0 = 0.2961358






N0. 21 : Data Set 663 ; Number of False Discoveries: 3 ; pihat0 = 0.2818026






N0. 22 : Data Set 274 ; Number of False Discoveries: 2 ; pihat0 = 0.08580661






N0. 23 : Data Set 901 ; Number of False Discoveries: 2 ; pihat0 = 0.9997959






N0. 24 : Data Set 912 ; Number of False Discoveries: 2 ; pihat0 = 0.106048






N0. 25 : Data Set 22 ; Number of False Discoveries: 1 ; pihat0 = 0.03271534






N0. 26 : Data Set 31 ; Number of False Discoveries: 1 ; pihat0 = 0.1874129






N0. 27 : Data Set 187 ; Number of False Discoveries: 1 ; pihat0 = 0.7071115






N0. 28 : Data Set 248 ; Number of False Discoveries: 1 ; pihat0 = 0.5095841






N0. 29 : Data Set 269 ; Number of False Discoveries: 1 ; pihat0 = 0.02787007






N0. 30 : Data Set 285 ; Number of False Discoveries: 1 ; pihat0 = 0.4624418






N0. 31 : Data Set 403 ; Number of False Discoveries: 1 ; pihat0 = 0.0531628






N0. 32 : Data Set 483 ; Number of False Discoveries: 1 ; pihat0 = 0.9998824






N0. 33 : Data Set 501 ; Number of False Discoveries: 1 ; pihat0 = 0.1090034






N0. 34 : Data Set 530 ; Number of False Discoveries: 1 ; pihat0 = 0.4743284






N0. 35 : Data Set 574 ; Number of False Discoveries: 1 ; pihat0 = 0.5260185






N0. 36 : Data Set 575 ; Number of False Discoveries: 1 ; pihat0 = 0.0946117






N0. 37 : Data Set 643 ; Number of False Discoveries: 1 ; pihat0 = 0.3321723






N0. 38 : Data Set 769 ; Number of False Discoveries: 1 ; pihat0 = 0.1187511






N0. 39 : Data Set 778 ; Number of False Discoveries: 1 ; pihat0 = 0.07619716






N0. 40 : Data Set 817 ; Number of False Discoveries: 1 ; pihat0 = 0.999989






N0. 41 : Data Set 837 ; Number of False Discoveries: 1 ; pihat0 = 0.8625374






N0. 42 : Data Set 897 ; Number of False Discoveries: 1 ; pihat0 = 0.8465903






N0. 43 : Data Set 923 ; Number of False Discoveries: 1 ; pihat0 = 0.03239883






N0. 44 : Data Set 955 ; Number of False Discoveries: 1 ; pihat0 = 0.999845






N0. 45 : Data Set 971 ; Number of False Discoveries: 1 ; pihat0 = 0.2387909






N0. 46 : Data Set 997 ; Number of False Discoveries: 1 ; pihat0 = 0.09560788






sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
loaded via a namespace (and not attached):
 [1] workflowr_1.0.1   Rcpp_0.12.16      digest_0.6.15    
 [4] rprojroot_1.3-2   R.methodsS3_1.7.1 backports_1.1.2  
 [7] git2r_0.21.0      magrittr_1.5      evaluate_0.10.1  
[10] stringi_1.1.6     whisker_0.3-2     R.oo_1.21.0      
[13] R.utils_2.6.0     rmarkdown_1.9     tools_3.4.3      
[16] stringr_1.3.0     yaml_2.1.18       compiler_3.4.3   
[19] htmltools_0.3.6   knitr_1.20       
This reproducible R Markdown analysis was created with workflowr 1.0.1