Last updated: 2018-05-12

workflowr checks:
  • R Markdown file: up-to-date

    Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

  • Environment: empty

    Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

  • Seed: set.seed(12345)

    The command set.seed(12345) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

  • Session information: recorded

    Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

  • Repository version: ddf9062

    Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

    Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
    
    Ignored files:
        Ignored:    .DS_Store
        Ignored:    .Rhistory
        Ignored:    .Rproj.user/
        Ignored:    analysis/.DS_Store
        Ignored:    analysis/BH_robustness_cache/
        Ignored:    analysis/FDR_Null_cache/
        Ignored:    analysis/FDR_null_betahat_cache/
        Ignored:    analysis/Rmosek_cache/
        Ignored:    analysis/StepDown_cache/
        Ignored:    analysis/alternative2_cache/
        Ignored:    analysis/alternative_cache/
        Ignored:    analysis/ash_gd_cache/
        Ignored:    analysis/average_cor_gtex_2_cache/
        Ignored:    analysis/average_cor_gtex_cache/
        Ignored:    analysis/brca_cache/
        Ignored:    analysis/cash_deconv_cache/
        Ignored:    analysis/cash_fdr_1_cache/
        Ignored:    analysis/cash_fdr_2_cache/
        Ignored:    analysis/cash_fdr_3_cache/
        Ignored:    analysis/cash_fdr_4_cache/
        Ignored:    analysis/cash_fdr_5_cache/
        Ignored:    analysis/cash_fdr_6_cache/
        Ignored:    analysis/cash_plots_cache/
        Ignored:    analysis/cash_sim_1_cache/
        Ignored:    analysis/cash_sim_2_cache/
        Ignored:    analysis/cash_sim_3_cache/
        Ignored:    analysis/cash_sim_4_cache/
        Ignored:    analysis/cash_sim_5_cache/
        Ignored:    analysis/cash_sim_6_cache/
        Ignored:    analysis/cash_sim_7_cache/
        Ignored:    analysis/correlated_z_2_cache/
        Ignored:    analysis/correlated_z_3_cache/
        Ignored:    analysis/correlated_z_cache/
        Ignored:    analysis/create_null_cache/
        Ignored:    analysis/cutoff_null_cache/
        Ignored:    analysis/design_matrix_2_cache/
        Ignored:    analysis/design_matrix_cache/
        Ignored:    analysis/diagnostic_ash_cache/
        Ignored:    analysis/diagnostic_correlated_z_2_cache/
        Ignored:    analysis/diagnostic_correlated_z_3_cache/
        Ignored:    analysis/diagnostic_correlated_z_cache/
        Ignored:    analysis/diagnostic_plot_2_cache/
        Ignored:    analysis/diagnostic_plot_cache/
        Ignored:    analysis/efron_leukemia_cache/
        Ignored:    analysis/fitting_normal_cache/
        Ignored:    analysis/gaussian_derivatives_2_cache/
        Ignored:    analysis/gaussian_derivatives_3_cache/
        Ignored:    analysis/gaussian_derivatives_4_cache/
        Ignored:    analysis/gaussian_derivatives_5_cache/
        Ignored:    analysis/gaussian_derivatives_cache/
        Ignored:    analysis/gd-ash_cache/
        Ignored:    analysis/gd_delta_cache/
        Ignored:    analysis/gd_lik_2_cache/
        Ignored:    analysis/gd_lik_cache/
        Ignored:    analysis/gd_w_cache/
        Ignored:    analysis/knockoff_10_cache/
        Ignored:    analysis/knockoff_2_cache/
        Ignored:    analysis/knockoff_3_cache/
        Ignored:    analysis/knockoff_4_cache/
        Ignored:    analysis/knockoff_5_cache/
        Ignored:    analysis/knockoff_6_cache/
        Ignored:    analysis/knockoff_7_cache/
        Ignored:    analysis/knockoff_8_cache/
        Ignored:    analysis/knockoff_9_cache/
        Ignored:    analysis/knockoff_cache/
        Ignored:    analysis/knockoff_var_cache/
        Ignored:    analysis/marginal_z_alternative_cache/
        Ignored:    analysis/marginal_z_cache/
        Ignored:    analysis/mosek_reg_2_cache/
        Ignored:    analysis/mosek_reg_4_cache/
        Ignored:    analysis/mosek_reg_5_cache/
        Ignored:    analysis/mosek_reg_6_cache/
        Ignored:    analysis/mosek_reg_cache/
        Ignored:    analysis/pihat0_null_cache/
        Ignored:    analysis/plot_diagnostic_cache/
        Ignored:    analysis/poster_obayes17_cache/
        Ignored:    analysis/real_data_simulation_2_cache/
        Ignored:    analysis/real_data_simulation_3_cache/
        Ignored:    analysis/real_data_simulation_4_cache/
        Ignored:    analysis/real_data_simulation_5_cache/
        Ignored:    analysis/real_data_simulation_cache/
        Ignored:    analysis/rmosek_primal_dual_2_cache/
        Ignored:    analysis/rmosek_primal_dual_cache/
        Ignored:    analysis/seqgendiff_cache/
        Ignored:    analysis/simulated_correlated_null_2_cache/
        Ignored:    analysis/simulated_correlated_null_3_cache/
        Ignored:    analysis/simulated_correlated_null_cache/
        Ignored:    analysis/simulation_real_se_2_cache/
        Ignored:    analysis/simulation_real_se_cache/
        Ignored:    analysis/smemo_2_cache/
        Ignored:    data/LSI/
        Ignored:    docs/.DS_Store
        Ignored:    docs/figure/.DS_Store
        Ignored:    output/fig/
    
    Unstaged changes:
        Deleted:    analysis/cash_plots_fdp.Rmd
    
    
    Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
Past versions:
    File  Version  Author   Date        Message
    rmd   cc0ab83  Lei Sun  2018-05-11  update
    html  cd5f166  LSun     2018-04-16  Build site.
    rmd   fb08738  Lei Sun  2018-04-15  bugs
    html  e6e20a4  LSun     2018-04-15  Build site.
    rmd   4e853c8  LSun     2018-04-15  wflow_publish("analysis/knockoff_9.rmd")
    html  d93edc5  LSun     2018-04-14  Build site.
    rmd   c991e63  Lei Sun  2018-04-13  s value
    rmd   c766b80  LSun     2018-04-13  add lfsr
    html  0582be3  LSun     2018-04-12  Build site.
    rmd   96579e4  LSun     2018-04-12  wflow_publish("analysis/knockoff_9.rmd")
    html  4b179a9  LSun     2018-04-05  Build site.
    rmd   20ea328  LSun     2018-04-05  wflow_publish(c("analysis/knockoff_7.rmd", "analysis/knockoff_8.rmd",
    rmd   c6211ab  Lei Sun  2018-04-03  knockoff vs ash

Introduction

The true \(\beta\) are simulated as \(\beta \sim \pi_0\delta_0 + (1 - \pi_0)N(0, \sigma_\beta^2)\).
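A minimal sketch of drawing coefficients from this spike-and-slab prior is given below. The values of `pi0` and `sigma_beta` are illustrative assumptions, not the settings used in this analysis, and the variable names are hypothetical.

```r
# Sketch: draw p coefficients from pi0 * delta_0 + (1 - pi0) * N(0, sigma_beta^2).
# pi0 and sigma_beta are assumed values chosen only for illustration.
set.seed(12345)
p          <- 1000
pi0        <- 0.8
sigma_beta <- 1

# Each coefficient is a signal (nonzero) with probability 1 - pi0.
is_signal <- rbinom(p, size = 1, prob = 1 - pi0) == 1

# Null coefficients stay exactly zero; signals are drawn from the normal slab.
beta <- numeric(p)
beta[is_signal] <- rnorm(sum(is_signal), mean = 0, sd = sigma_beta)

# Empirical proportion of exact zeros, which should be close to pi0.
mean(beta == 0)
```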

varbvs.get.lfsr <- function (fit) {

  # For each variable and each hyperparameter setting, get the
  # posterior probability that the regression coefficient is exactly
  # zero.
  p0 <- 1 - fit$alpha

  # For each variable and each hyperparameter setting, get the
  # posterior probability that the regression coefficient is negative.
  pn <- with(fit, alpha * pnorm(0, mu, sqrt(s)))

  # For each variable and each hyperparameter setting, compute the
  # local false sign rate (LFSR) following the formula given in
  # Matthew Stephens's Biostatistics paper, "False discovery rates: a new deal".
  p        <- nrow(fit$alpha)
  k        <- ncol(fit$alpha)
  lfsr     <- matrix(0, p, k)
  b        <- pn > 0.5 * (1 - p0)
  lfsr[b]  <- 1 - pn[b]
  lfsr[!b] <- p0[!b] + pn[!b]

  # Average the LFSR over the hyperparameter settings, weighted
  # by the probability of each hyperparameter setting.
  lfsr <- c(lfsr %*% fit$w)

  return(lfsr)
}
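The branching in the function above amounts to taking the smaller of the two sign-error probabilities, \(\min\{P(\beta \le 0), P(\beta \ge 0)\} = \min\{p_0 + p_n, 1 - p_n\}\), since \(p_n > (1 - p_0)/2\) exactly when \(p_0 + p_n > 1 - p_n\). The sketch below checks this on a toy object. The list `fit` is a made-up stand-in with the four fields the function reads (`alpha`, `mu`, `s`, `w`), not the output of a real model fit.

```r
# Toy stand-in for a fitted object: 3 variables, 2 hyperparameter settings.
# All numbers are invented for illustration.
fit <- list(
  alpha = matrix(c(0.9, 0.5, 0.1,
                   0.8, 0.4, 0.2), nrow = 3),  # posterior inclusion probabilities
  mu    = matrix(c(2.0, -1.0, 0.1,
                   1.5, -0.5, 0.0), nrow = 3), # posterior means given inclusion
  s     = matrix(0.25, nrow = 3, ncol = 2),    # posterior variances given inclusion
  w     = c(0.6, 0.4)                          # weights of the two settings
)

p0 <- 1 - fit$alpha                              # P(beta == 0)
pn <- fit$alpha * pnorm(0, fit$mu, sqrt(fit$s))  # P(beta < 0)

# LFSR per variable and setting: min(P(beta <= 0), P(beta >= 0)).
lfsr_by_setting <- pmin(p0 + pn, 1 - pn)

# Weighted average over the hyperparameter settings.
lfsr <- c(lfsr_by_setting %*% fit$w)
lfsr
```

Variables with a large inclusion probability and a posterior mean far from zero get a small LFSR. A variable whose posterior mass is mostly at zero gets an LFSR close to its posterior probability of being zero.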

\(n > p\)

n <- 2000
p <- 1000
k <- 200
m <- 100
q <- 0.1

Independent design

\(X_{n \times p}\) has entries simulated independently from \(N(0, (1/\sqrt n)^2)\), so each column has approximately unit Euclidean norm.
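As a rough sketch of this design (the variable names are hypothetical, and this is not necessarily the code used in the analysis), the matrix can be generated and its column norms checked as follows:

```r
# Sketch: independent design with i.i.d. N(0, 1/n) entries.
set.seed(12345)
n <- 2000
p <- 1000

# Standard deviation 1 / sqrt(n) gives variance 1 / n per entry.
X <- matrix(rnorm(n * p, mean = 0, sd = 1 / sqrt(n)), nrow = n, ncol = p)

# Squared column norms are sums of n terms with mean 1 / n each,
# so they concentrate around 1.
col_norms_sq <- colSums(X^2)
summary(col_norms_sq)
```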


Local correlation / AR model for \(X\)

The columns of \(X_{n \times p}\) have correlation matrix \(\Sigma\) with \(\Sigma_{ij} = \rho^{|i - j|}\). Each row is drawn independently from \(N(0, \frac1n\Sigma)\).
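One way to draw such rows is through a Cholesky factor of \(\Sigma\), sketched below. The value of `rho` is an assumption made for illustration, and the variable names are hypothetical.

```r
# Sketch: AR(1)-type design with Sigma_ij = rho^|i - j| and rows ~ N(0, Sigma / n).
set.seed(12345)
n   <- 2000
p   <- 1000
rho <- 0.5  # assumed value, for illustration only

# Toeplitz matrix whose (i, j) entry is rho^|i - j|.
Sigma <- toeplitz(rho^(0:(p - 1)))

# chol() returns an upper-triangular R with t(R) %*% R == Sigma.
R <- chol(Sigma)

# Rows of Z are N(0, I), so rows of Z %*% R have covariance Sigma;
# dividing by sqrt(n) rescales the covariance to Sigma / n.
Z <- matrix(rnorm(n * p), nrow = n, ncol = p)
X <- Z %*% R / sqrt(n)

# The sample correlation of adjacent columns should be close to rho.
cor(X[, 1], X[, 2])
```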


Factor Model for \(X\)


Factor Model for \(\hat\beta\)


Observation

  1. Model-\(X\) knockoff is very powerful.
  2. Using the estimated distribution of \(X\) rather than the true distribution hurts the power of Model-\(X\) knockoff.
  3. The power of Model-\(X\) knockoff using the estimated distribution of \(X\) is on par with that of ASH and BH, probably because the presence of small signals makes knockoff less powerful.
  4. Sometimes the equi-correlated construction (equi) is better than the semidefinite-programming construction (SDP) when generating knockoffs, as shown in previous simulations using the factor model for \(X\).

\(n < p\)

n <- 300
p <- 1000
k <- 200
m <- 100
q <- 0.1

Independent design

Cov.X <- diag(1 / n, p)


Local correlation design


Factor Model for \(X\)


Factor Model for \(\hat\beta\)


Session information

sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] workflowr_1.0.1   Rcpp_0.12.16      digest_0.6.15    
 [4] rprojroot_1.3-2   R.methodsS3_1.7.1 backports_1.1.2  
 [7] git2r_0.21.0      magrittr_1.5      evaluate_0.10.1  
[10] stringi_1.1.6     whisker_0.3-2     R.oo_1.21.0      
[13] R.utils_2.6.0     rmarkdown_1.9     tools_3.4.3      
[16] stringr_1.3.0     yaml_2.1.18       compiler_3.4.3   
[19] htmltools_0.3.6   knitr_1.20       



This reproducible R Markdown analysis was created with workflowr 1.0.1