<!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta charset="utf-8"> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <meta name="generator" content="pandoc" /> <title>Truncated Adaptive Shrinkage (truncash)</title> <script src="site_libs/jquery-1.11.3/jquery.min.js"></script> <meta name="viewport" content="width=device-width, initial-scale=1" /> <link href="site_libs/bootstrap-3.3.5/css/cosmo.min.css" rel="stylesheet" /> <script src="site_libs/bootstrap-3.3.5/js/bootstrap.min.js"></script> <script src="site_libs/bootstrap-3.3.5/shim/html5shiv.min.js"></script> <script src="site_libs/bootstrap-3.3.5/shim/respond.min.js"></script> <script src="site_libs/navigation-1.1/tabsets.js"></script> <link href="site_libs/highlightjs-1.1/textmate.css" rel="stylesheet" /> <script src="site_libs/highlightjs-1.1/highlight.js"></script> <link href="site_libs/font-awesome-4.5.0/css/font-awesome.min.css" rel="stylesheet" /> <style type="text/css">code{white-space: pre;}</style> <style type="text/css"> pre:not([class]) { background-color: white; } </style> <script type="text/javascript"> if (window.hljs && document.readyState && document.readyState === "complete") { window.setTimeout(function() { hljs.initHighlighting(); }, 0); } </script> <style type="text/css"> h1 { font-size: 34px; } h1.title { font-size: 38px; } h2 { font-size: 30px; } h3 { font-size: 24px; } h4 { font-size: 18px; } h5 { font-size: 16px; } h6 { font-size: 12px; } .table th:not([align]) { text-align: left; } </style> </head> <body> <style type = "text/css"> .main-container { max-width: 940px; margin-left: auto; margin-right: auto; } code { color: inherit; background-color: rgba(0, 0, 0, 0.04); } img { max-width:100%; height: auto; } .tabbed-pane { padding-top: 12px; } button.code-folding-btn:focus { outline: none; } </style> <style type="text/css"> /* padding for bootstrap navbar */ body { padding-top: 51px; padding-bottom: 40px; } /* offset scroll position for anchor 
links (for fixed navbar) */ .section h1 { padding-top: 56px; margin-top: -56px; } .section h2 { padding-top: 56px; margin-top: -56px; } .section h3 { padding-top: 56px; margin-top: -56px; } .section h4 { padding-top: 56px; margin-top: -56px; } .section h5 { padding-top: 56px; margin-top: -56px; } .section h6 { padding-top: 56px; margin-top: -56px; } </style> <script> // manage active state of menu based on current page $(document).ready(function () { // active menu anchor href = window.location.pathname href = href.substr(href.lastIndexOf('/') + 1) if (href === "") href = "index.html"; var menuAnchor = $('a[href="' + href + '"]'); // mark it active menuAnchor.parent().addClass('active'); // if it's got a parent navbar menu mark it active as well menuAnchor.closest('li.dropdown').addClass('active'); }); </script> <div class="container-fluid main-container"> <!-- tabsets --> <script> $(document).ready(function () { window.buildTabsets("TOC"); }); </script> <!-- code folding --> <div class="navbar navbar-default navbar-fixed-top" role="navigation"> <div class="container"> <div class="navbar-header"> <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar"> <span class="icon-bar"></span> <span class="icon-bar"></span> <span class="icon-bar"></span> </button> <a class="navbar-brand" href="index.html">truncash</a> </div> <div id="navbar" class="navbar-collapse collapse"> <ul class="nav navbar-nav"> <li> <a href="index.html">Home</a> </li> <li> <a href="about.html">About</a> </li> <li> <a href="license.html">License</a> </li> </ul> <ul class="nav navbar-nav navbar-right"> <li> <a href="https://github.com/LSun/truncash"> <span class="fa fa-github"></span> </a> </li> </ul> </div><!--/.nav-collapse --> </div><!--/.container --> </div><!--/.navbar --> <div class="fluid-row" id="header"> <h1 class="title toc-ignore">Truncated Adaptive Shrinkage (<code>truncash</code>)</h1> </div> <p><code>truncash</code> (Truncated ASH) is an 
exploratory project with Matthew, built on <a href="https://github.com/stephens999/ashr"><code>ashr</code></a>.</p> <ul> <li><a href="voom_null.html">Matthew’s initial observation on null, correlated data</a></li> </ul> <p>Matthew did a quick investigation of the p values and z scores obtained for simulated null data (using just voom transform, no correction) from real RNA-seq data of <a href="http://www.gtexportal.org/home/">GTEx</a>. Here is what he found.</p> <p>“I found something that I hadn’t realized, although is obvious in hindsight: although you sometimes see inflation under null of <span class="math inline">\(p\)</span>-values/<span class="math inline">\(z\)</span>-scores, the most extreme values are not inflated compared with expectations (and tend to be deflated). That is the histograms of <span class="math inline">\(p\)</span>-values that show inflation near <span class="math inline">\(0\)</span> (and deflation near <span class="math inline">\(1\)</span>) actually hide something different going on in the very left hand side near <span class="math inline">\(0\)</span>. The qq-plots are clearer… showing most extreme values are deflated, or not inflated. This is expected under positive correlation i think. For example, if all <span class="math inline">\(z\)</span>-scores were the same (complete correlation), then most extreme of n would just be <span class="math inline">\(N(0,1)\)</span>. but if independent the most extreme of n would have longer tails…”</p> <p>Matthew’s initial observation inspired this project. If under positive correlation, the most extreme tend to be not inflated, maybe we can use them to control the false discoveries. Meanwhile, if the moderate are more prone to inflation due to correlation, maybe it’s better to make only partial use of their information.</p> <ul> <li><a href="ExtremeOccurrence.html">Occurrence of extreme observations</a></li> </ul> <p>As <a href="https://galton.uchicago.edu/~stein/">Prof. 
Michael Stein</a> pointed out during a conversation with <a href="http://stephenslab.uchicago.edu/">Matthew</a>, if the marginal distribution is correct, then the expected number of observations exceeding any threshold should be correct. So if the tail is “usually” deflated, it must be that with some small probability there are many large <span class="math inline">\(z\)</span>-scores (even in the tail). Therefore, if “on average” we have the right number of large <span class="math inline">\(z\)</span>-scores/small <span class="math inline">\(p\)</span>-values, and “usually” we have too few, then “rarely” we should have too many. A simulation is run to check this intuition.</p> <ul> <li><a href="StepDown.html">Two FWER-controlling procedures on correlated null</a></li> </ul> <p>In order to understand the behavior of the <span class="math inline">\(p\)</span>-values of top expressed, correlated genes under the global null, simulated from GTEx data, we apply two FWER-controlling multiple comparison procedures: Holm’s “step-down” ([Holm 1979]) and Hochberg’s “step-up” ([Hochberg 1988]).</p> <ul> <li><p><a href="truncash.html"><code>truncash</code> Model and first simulations</a></p></li> <li><p><a href="nullpipeline.html">Pipeline for simulating null data</a></p></li> </ul> <p>We use a toy model to examine and document each step of the pipeline for simulating null summary statistics, including <code>edgeR::calcNormFactors</code>, <code>limma::voom</code>, <code>limma::lmFit</code>, and <code>limma::eBayes</code>.</p> <ul> <li><a href="FDR_Null.html">FDR on Null, Part 1</a></li> <li><a href="FDR_null_betahat.html">FDR on Null, Part 2</a></li> </ul> <p>We apply two FDR-controlling procedures, BH and BY, as well as two <span class="math inline">\(s\)</span> value models, <code>ash</code> and <code>truncash</code>, to the simulated, correlated null data, and compare the numbers of false discoveries (by definition, all discoveries should be false) obtained.
Part 1 uses <span class="math inline">\(z\)</span> scores only; Part 2 uses <span class="math inline">\(\hat \beta\)</span> and moderated <span class="math inline">\(\hat s\)</span>.</p> <ul> <li><a href="pihat0_null.html"><span class="math inline">\(\hat\pi_0\)</span> estimated in correlated global null</a></li> </ul> <p><span class="math inline">\(\hat\pi_0\)</span> estimated by <code>ash</code> and <code>truncash</code> with <span class="math inline">\(T = 1.96\)</span> on correlated global null data simulated from GTEx/Liver. Ideally the estimates should be close to <span class="math inline">\(1\)</span>.</p> <ul> <li><a href="cutoff_null.html">Ordered <span class="math inline">\(p\)</span> values vs critical values</a></li> </ul> <p>For various FWER / FDR controlling procedures, and for <code>truncash</code>, what matters most is the behavior of the most extreme observations. Here these extreme <span class="math inline">\(p\)</span> values are plotted along with common critical values used by various procedures, in order to shed light on their behavior.</p> <p>This is very exploratory, and may be related to Extreme Value Theory and Concentration of Measure. To be continued.</p> <ul> <li><a href="SingleExtOb.html">Single most extreme observation</a></li> </ul> <p>What will happen if we allow the threshold in <code>truncash</code> to depend on the data? Let’s start with the case where we only know the single most extreme observation.</p> <ul> <li><a href="t-likelihood.html">Handling <span class="math inline">\(t\)</span> likelihood</a></li> </ul> <p>When moving to the <span class="math inline">\(t\)</span> likelihood, or in other words, when taking the randomness of <span class="math inline">\(\hat s\)</span> into consideration, things get trickier.
Here we review several techniques currently used in Matthew’s lab, regarding the <span class="math inline">\(t\)</span> likelihood and uniform mixture priors, based on a discussion with Matthew.</p> <ul> <li><a href="correlated_z.html">Histogram of correlated <span class="math inline">\(z\)</span> scores, random data sets</a></li> <li><a href="correlated_z_2.html">Histogram of correlated <span class="math inline">\(z\)</span> scores, <code>ash</code>-hostile data sets</a></li> <li><a href="correlated_z_3.html">Histogram of correlated <span class="math inline">\(z\)</span> scores, <code>BH</code>-hostile data sets</a></li> </ul> <p>An implicit key question of this inquiry is: what is the behavior of <span class="math inline">\(z\)</span> scores under dependency? We take a look at their histograms: first for randomly sampled data sets, second for those most “hostile” to <code>ash</code>, and third for those most “hostile” to <code>BH</code>. The bottom line is that we are reproducing what Efron observed in microarray data, that “the theoretical null may fail” in different ways. Now the key questions are</p> <ol style="list-style-type: decimal"> <li><p>Why may the theoretical null fail? What do we mean by <strong>correlation</strong>?</p></li> <li><p>Can <code>truncash</code> make <code>ash</code> more robust against some of the foes that make the theoretical null fail?</p></li> <li><p>Generally, how robust is empirical Bayes?
Is empirical Bayes robust or non-robust to certain kinds of correlation?</p></li> </ol> <ul> <li><a href="gaussian_derivatives.html">Fitting empirical null with Gaussian derivatives: Theory</a></li> <li><a href="gaussian_derivatives_2.html">Fitting empirical null with Gaussian derivatives: Examples</a></li> <li><a href="gaussian_derivatives_3.html">Fitting empirical null with Gaussian derivatives: Numerical issues</a></li> <li><a href="gaussian_derivatives_4.html">Fitting empirical null with Gaussian derivatives: Large correlations</a></li> </ul> <p>Inspired by <a href="http://amstat.tandfonline.com/doi/abs/10.1198/jasa.2010.tm10237">Schwartzman 2010</a>, we experiment with a new way to tackle the “empirical null.”</p> <!-- The goal of this new template is to simplify the setup and maintenance of a research website. --> <!-- Specifically, --> <!-- * Easier to build and extend the website using the new tools in the [rmarkdown][] package and [latest RStudio release][rstudio] --> <!-- * Easier to deploy the website with Git and GitHub by only using one branch --> <!-- [rmarkdown]: http://rmarkdown.rstudio.com/rmarkdown_websites.htm --> <!-- [rstudio]: https://www.rstudio.com/products/rstudio/download/preview/ --> <hr> <p> This <a href="http://rmarkdown.rstudio.com">R Markdown</a> site was created with <a href="https://github.com/jdblischak/workflowr">workflowr</a> </p> <hr> <!-- To enable disqus, uncomment the section below and provide your disqus_shortname --> <!-- disqus <div id="disqus_thread"></div> <script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'rmarkdown'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js'; (document.getElementsByTagName('head')[0] ||
document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript> <a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a> --> </div> <script> // add bootstrap table styles to pandoc tables function bootstrapStylePandocTables() { $('tr.header').parent('thead').parent('table').addClass('table table-condensed'); } $(document).ready(function () { bootstrapStylePandocTables(); }); </script> <!-- dynamically load mathjax for compatibility with self-contained --> <script> (function () { var script = document.createElement("script"); script.type = "text/javascript"; script.src = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"; document.getElementsByTagName("head")[0].appendChild(script); })(); </script> </body> </html>