<!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta charset="utf-8" /> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <meta name="generator" content="pandoc" /> <meta name="author" content="Zhengrong Xing, Peter Carbonetto and Matthew Stephens" /> <title>Gaussian variance estimation in simulated data sets</title> <script src="site_libs/jquery-1.11.3/jquery.min.js"></script> <meta name="viewport" content="width=device-width, initial-scale=1" /> <link href="site_libs/bootstrap-3.3.5/css/readable.min.css" rel="stylesheet" /> <script src="site_libs/bootstrap-3.3.5/js/bootstrap.min.js"></script> <script src="site_libs/bootstrap-3.3.5/shim/html5shiv.min.js"></script> <script src="site_libs/bootstrap-3.3.5/shim/respond.min.js"></script> <script src="site_libs/navigation-1.1/tabsets.js"></script> <link href="site_libs/highlightjs-9.12.0/textmate.css" rel="stylesheet" /> <script src="site_libs/highlightjs-9.12.0/highlight.js"></script> <style type="text/css">code{white-space: pre;}</style> <style type="text/css"> pre:not([class]) { background-color: white; } </style> <script type="text/javascript"> if (window.hljs) { hljs.configure({languages: []}); hljs.initHighlightingOnLoad(); if (document.readyState && document.readyState === "complete") { window.setTimeout(function() { hljs.initHighlighting(); }, 0); } } </script> <style type="text/css"> h1 { font-size: 34px; } h1.title { font-size: 38px; } h2 { font-size: 30px; } h3 { font-size: 24px; } h4 { font-size: 18px; } h5 { font-size: 16px; } h6 { font-size: 12px; } .table th:not([align]) { text-align: left; } </style> </head> <body> <style type = "text/css"> .main-container { max-width: 940px; margin-left: auto; margin-right: auto; } code { color: inherit; background-color: rgba(0, 0, 0, 0.04); } img { max-width:100%; height: auto; } .tabbed-pane { padding-top: 12px; } .html-widget { margin-bottom: 20px; } button.code-folding-btn:focus { outline: none; } </style> <style type="text/css"> /* padding for bootstrap navbar */ body { padding-top: 51px; padding-bottom: 40px; } /* offset scroll position for anchor links (for fixed navbar) */ .section h1 { padding-top: 56px; margin-top: -56px; } .section h2 { padding-top: 56px; margin-top: -56px; } .section h3 { padding-top: 56px; margin-top: -56px; } .section h4 { padding-top: 56px; margin-top: -56px; } .section h5 { padding-top: 56px; margin-top: -56px; } .section h6 { padding-top: 56px; margin-top: -56px; } </style> <script> // manage active state of menu based on current page $(document).ready(function () { // active menu anchor href = window.location.pathname href = href.substr(href.lastIndexOf('/') + 1) if (href === "") href = "index.html"; var menuAnchor = $('a[href="' + href + '"]'); // mark it active menuAnchor.parent().addClass('active'); // if it's got a parent navbar menu mark it active as well menuAnchor.closest('li.dropdown').addClass('active'); }); </script> <div class="container-fluid main-container"> <!-- tabsets --> <script> $(document).ready(function () { window.buildTabsets("TOC"); }); </script> <!-- code folding --> <div class="navbar navbar-default navbar-fixed-top" role="navigation"> <div class="container"> <div class="navbar-header"> <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar"> <span class="icon-bar"></span> <span class="icon-bar"></span> <span class="icon-bar"></span> </button> <a class="navbar-brand" href="index.html">smash</a> </div> <div id="navbar" class="navbar-collapse collapse"> <ul class="nav navbar-nav"> <li> <a href="index.html">Overview</a> </li> </ul> <ul class="nav navbar-nav navbar-right"> <li> <a href="https://github.com/stephenslab/smash-paper">source</a> </li> </ul> </div><!--/.nav-collapse --> </div><!--/.container --> </div><!--/.navbar --> <!-- Add a small amount of space between sections. --> <style type="text/css"> div.section { padding-top: 12px; } </style> <!-- Add a small amount of space between sections. --> <style type="text/css"> div.section { padding-top: 12px; } </style> <div class="fluid-row" id="header"> <h1 class="title toc-ignore">Gaussian variance estimation in simulated data sets</h1> <h4 class="author"><em>Zhengrong Xing, Peter Carbonetto and Matthew Stephens</em></h4> </div> <p><strong>Last updated:</strong> 2018-11-09</p> <strong>workflowr checks:</strong> <small>(Click a bullet for more information)</small> <ul> <li> <p><details> <summary> <strong style="color:blue;">✔</strong> <strong>R Markdown file:</strong> up-to-date </summary></p> <p>Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.</p> </details> </li> <li> <p><details> <summary> <strong style="color:blue;">✔</strong> <strong>Environment:</strong> empty </summary></p> <p>Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.</p> </details> </li> <li> <p><details> <summary> <strong style="color:blue;">✔</strong> <strong>Seed:</strong> <code>set.seed(1)</code> </summary></p> <p>The command <code>set.seed(1)</code> was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.</p> </details> </li> <li> <p><details> <summary> <strong style="color:blue;">✔</strong> <strong>Session information:</strong> recorded </summary></p> <p>Great job! Recording the operating system, R version, and package versions is critical for reproducibility.</p> </details> </li> <li> <p><details> <summary> <strong style="color:blue;">✔</strong> <strong>Repository version:</strong> <a href="https://github.com/stephenslab/smash-paper/tree/8214f9d8ecafb2007fc8be76d43e6cbabe8d95b7" target="_blank">8214f9d</a> </summary></p> Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated. <br><br> Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use <code>wflow_publish</code> or <code>wflow_git_commit</code>). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated: <pre><code> Ignored files: Ignored: dsc/code/Wavelab850/MEXSource/CPAnalysis.mexmac Ignored: dsc/code/Wavelab850/MEXSource/DownDyadHi.mexmac Ignored: dsc/code/Wavelab850/MEXSource/DownDyadLo.mexmac Ignored: dsc/code/Wavelab850/MEXSource/FAIPT.mexmac Ignored: dsc/code/Wavelab850/MEXSource/FCPSynthesis.mexmac Ignored: dsc/code/Wavelab850/MEXSource/FMIPT.mexmac Ignored: dsc/code/Wavelab850/MEXSource/FWPSynthesis.mexmac Ignored: dsc/code/Wavelab850/MEXSource/FWT2_PO.mexmac Ignored: dsc/code/Wavelab850/MEXSource/FWT_PBS.mexmac Ignored: dsc/code/Wavelab850/MEXSource/FWT_PO.mexmac Ignored: dsc/code/Wavelab850/MEXSource/FWT_TI.mexmac Ignored: dsc/code/Wavelab850/MEXSource/IAIPT.mexmac Ignored: dsc/code/Wavelab850/MEXSource/IMIPT.mexmac Ignored: dsc/code/Wavelab850/MEXSource/IWT2_PO.mexmac Ignored: dsc/code/Wavelab850/MEXSource/IWT_PBS.mexmac Ignored: dsc/code/Wavelab850/MEXSource/IWT_PO.mexmac Ignored: dsc/code/Wavelab850/MEXSource/IWT_TI.mexmac Ignored: dsc/code/Wavelab850/MEXSource/LMIRefineSeq.mexmac Ignored: dsc/code/Wavelab850/MEXSource/MedRefineSeq.mexmac Ignored: dsc/code/Wavelab850/MEXSource/UpDyadHi.mexmac Ignored: dsc/code/Wavelab850/MEXSource/UpDyadLo.mexmac Ignored: dsc/code/Wavelab850/MEXSource/WPAnalysis.mexmac Ignored: dsc/code/Wavelab850/MEXSource/dct_ii.mexmac Ignored: dsc/code/Wavelab850/MEXSource/dct_iii.mexmac Ignored: dsc/code/Wavelab850/MEXSource/dct_iv.mexmac Ignored: dsc/code/Wavelab850/MEXSource/dst_ii.mexmac Ignored: dsc/code/Wavelab850/MEXSource/dst_iii.mexmac Unstaged changes: Deleted: code/mfvb.functions.R </code></pre> Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes. </details> </li> </ul> <details> <summary> <small><strong>Expand here to see past versions:</strong></small> </summary> <ul> <table style="border-collapse:separate; border-spacing:5px;"> <thead> <tr> <th style="text-align:left;"> File </th> <th style="text-align:left;"> Version </th> <th style="text-align:left;"> Author </th> <th style="text-align:left;"> Date </th> <th style="text-align:left;"> Message </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Rmd </td> <td style="text-align:left;"> <a href="https://github.com/stephenslab/smash-paper/blob/8214f9d8ecafb2007fc8be76d43e6cbabe8d95b7/analysis/gaussvarest.Rmd" target="_blank">8214f9d</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> wflow_publish(“gaussvarest.Rmd”) </td> </tr> <tr> <td style="text-align:left;"> html </td> <td style="text-align:left;"> <a href="https://cdn.rawgit.com/stephenslab/smash-paper/5173e56fe17f17ffcb5e6917084d992b53aedcf4/docs/gaussvarest.html" target="_blank">5173e56</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> Fixed error in generating the summary at the end of the gaussvarest </td> </tr> <tr> <td style="text-align:left;"> Rmd </td> <td style="text-align:left;"> <a href="https://github.com/stephenslab/smash-paper/blob/684ee02303e0f84dc66fd5dc8098e4e415ebf1ac/analysis/gaussvarest.Rmd" target="_blank">684ee02</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> wflow_publish(“gaussvarest.Rmd”) </td> </tr> <tr> <td style="text-align:left;"> html </td> <td style="text-align:left;"> <a href="https://cdn.rawgit.com/stephenslab/smash-paper/692edf4dbf47501380d09c8eae7b77442bac545c/docs/gaussvarest.html" target="_blank">692edf4</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> Success in running full gaussvarest analysis with n=10 simulations. </td> </tr> <tr> <td style="text-align:left;"> Rmd </td> <td style="text-align:left;"> <a href="https://github.com/stephenslab/smash-paper/blob/4bca2bee9d4fa23d3e3edfaa4ce1c9b35bde4f42/analysis/gaussvarest.Rmd" target="_blank">4bca2be</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> wflow_publish(“gaussvarest.Rmd”) </td> </tr> <tr> <td style="text-align:left;"> html </td> <td style="text-align:left;"> <a href="https://cdn.rawgit.com/stephenslab/smash-paper/593d682fabe4847c9577f4cc289d4f19ee8292a5/docs/gaussvarest.html" target="_blank">593d682</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> More testing of gaussvarest analysis with n=10 simulations. </td> </tr> <tr> <td style="text-align:left;"> Rmd </td> <td style="text-align:left;"> <a href="https://github.com/stephenslab/smash-paper/blob/3bcc70bdc3ea585ebb5232ae099519a9203246b4/analysis/gaussvarest.Rmd" target="_blank">3bcc70b</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> wflow_publish(“gaussvarest.Rmd”) </td> </tr> <tr> <td style="text-align:left;"> Rmd </td> <td style="text-align:left;"> <a href="https://github.com/stephenslab/smash-paper/blob/e1ebc43e2b3f375763904fe66c7ba9f47614279a/analysis/gaussvarest.Rmd" target="_blank">e1ebc43</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> wflow_publish(“gaussvarest.Rmd”) </td> </tr> <tr> <td style="text-align:left;"> html </td> <td style="text-align:left;"> <a href="https://cdn.rawgit.com/stephenslab/smash-paper/613f9b22da6bfe78fab4b9477869fc891876ad71/docs/gaussvarest.html" target="_blank">613f9b2</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> First workflowr build of the gaussvarest analysis. </td> </tr> <tr> <td style="text-align:left;"> Rmd </td> <td style="text-align:left;"> <a href="https://github.com/stephenslab/smash-paper/blob/2c6154dfd11ef6987d66fa68db3dd2a4306d6995/analysis/gaussvarest.Rmd" target="_blank">2c6154d</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> wflow_publish(“gaussvarest.Rmd”) </td> </tr> <tr> <td style="text-align:left;"> Rmd </td> <td style="text-align:left;"> <a href="https://github.com/stephenslab/smash-paper/blob/fb432d7638d08f6672f25626415ed184f0abb2cd/analysis/gaussvarest.Rmd" target="_blank">fb432d7</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> <td style="text-align:left;"> wflow_publish(“gaussvarest.Rmd”) </td> </tr> <tr> <td style="text-align:left;"> Rmd </td> <td style="text-align:left;"> <a href="https://github.com/stephenslab/smash-paper/blob/049dcbb8e4f4bbfd2b42a4def9a91b3a67cff8b5/analysis/gaussvarest.Rmd" target="_blank">049dcbb</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-08 </td> <td style="text-align:left;"> Moved around some files and revised TOC in home page. </td> </tr> </tbody> </table> </ul> <p></details></p> <hr /> <p>This analysis implements the “Gaussian variance estimation” simulation experiments in the paper. In particular, we compare the Mean Field Variational Bayes (MFVB) method against SMASH in two scenarios. The figure and table generated at the end of this script should match up with the figure and table shown in the paper.</p> <p>Running the code could take several hours to complete as it runs the two methods on 100 simulated data sets for each of the two scenarios.</p> <p>We thank M. Menictas & M. Wand for generously sharing code that was used to implement these experiments.</p> <div id="initial-setup-instructions" class="section level2"> <h2>Initial setup instructions</h2> <p>To run this example on your own computer, please follow these setup instructions. These instructions assume you already have R and/or RStudio installed on your computer.</p> <p>First, download or clone the <a href="https://github.com/stephenslab/smash-paper">git repository</a> on your computer.</p> <p>Launch R, and change the working directory to be the “analysis” folder inside your local copy of the git repository.</p> <p>Finally, install the smashr package from GitHub:</p> <pre class="r"><code>devtools::install_github("stephenslab/smashr")</code></pre> <p>See the “Session Info” at the bottom for the versions of the software and R packages that were used to generate the results shown below.</p> </div> <div id="set-up-r-environment" class="section level2"> <h2>Set up R environment</h2> <p>Load the smashr package, as well as some functions used in the analysis below.</p> <pre class="r"><code>library(smashr) source("../code/mfvb.R")</code></pre> </div> <div id="analysis-settings" class="section level2"> <h2>Analysis settings</h2> <p>Specify the number of data sets simulated in the first and second simulation scenarios.</p> <pre class="r"><code>nsim1 <- 100 nsim2 <- 100</code></pre> <p>Next, specify the hyperparameters used in running the MFVB method.</p> <pre class="r"><code>Au.hyp <- 1e5 Av.hyp <- 1e5 sigsq.gamma <- 1e10 sigsq.beta <- 1e10</code></pre> <p>These variables specify some colours used in the plots.</p> <pre class="r"><code>mainCol <- "darkslateblue" ptCol <- "paleturquoise3" lineCol <- "skyblue" axisCol <- "black"</code></pre> <p>These are additional plotting parameters.</p> <pre class="r"><code>cex.pt <- 0.75 cex.mainVal <- 1.7 cex.labVal <- 1.3 xlabVal <- "x"</code></pre> </div> <div id="plot-mean-and-variance-functions-used-to-simulate-data" class="section level2"> <h2>Plot mean and variance functions used to simulate data</h2> <p>Compare this plot against the one shown in Fig. 4 of the paper.</p> <pre class="r"><code>xgrid <- (0:10000)/10000 plot(xgrid,fTrue(xgrid),type = "l",ylim = c(-5,5),ylab = "y",xlab = "X", lwd = 2) lines(xgrid,fTrue(xgrid) + 2*sqrt(gTrue(xgrid)),col = "darkorange",lwd = 2) lines(xgrid,fTrue(xgrid) - 2*sqrt(gTrue(xgrid)),col = "darkorange",lwd = 2)</code></pre> <p><img src="figure/gaussvarest.Rmd/plot-mean-and-variance-1.png" width="576" style="display: block; margin: auto;" /></p> <details> <summary><em>Expand here to see past versions of plot-mean-and-variance-1.png:</em></summary> <table style="border-collapse:separate; border-spacing:5px;"> <thead> <tr> <th style="text-align:left;"> Version </th> <th style="text-align:left;"> Author </th> <th style="text-align:left;"> Date </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> <a href="https://github.com/stephenslab/smash-paper/blob/613f9b22da6bfe78fab4b9477869fc891876ad71/docs/figure/gaussvarest.Rmd/plot-mean-and-variance-1.png" target="_blank">613f9b2</a> </td> <td style="text-align:left;"> Peter Carbonetto </td> <td style="text-align:left;"> 2018-11-09 </td> </tr> </tbody> </table> <p></details></p> </div> <div id="first-simulation-scenario-unevenly-spaced-data" class="section level2"> <h2>First simulation scenario: unevenly spaced data</h2> <p>In the first scenario, we simulate data sets with 500 unevenly spaced data points, and assess accuracy, separately for the mean and variance estimates) by computing the mean of the squared errors (MSE) evaluated at 201 equally spaced points.</p> <pre class="r"><code>mse.mu.uneven.mfvb <- 0 mse.mu.uneven.smash <- 0 mse.sd.uneven.mfvb <- 0 mse.sd.uneven.smash <- 0</code></pre> <p>Run the SMASH and MFVB methods for each simulated data set.</p> <pre class="r"><code>cat(sprintf("Running %d simulations: ",nsim1)) for (j in 1:nsim1) { cat(sprintf("%d ",j)) # SIMULATE DATA set.seed(3*j) n <- 500 xOrig <- runif(n) set.seed(3*j) yOrig <- fTrue(xOrig) + sqrt(exp(loggTrue(xOrig)))*rnorm(n) aOrig <- min(xOrig) bOrig <- max(xOrig) mean.x <- mean(xOrig) sd.x <- sd(xOrig) mean.y <- mean(yOrig) sd.y <- sd(yOrig) a <- (aOrig - mean.x)/sd.x b <- (bOrig - mean.x)/sd.x x <- (xOrig - mean.x)/sd.x y <- (yOrig - mean.y)/sd.y numIntKnotsU <- 17 intKnotsU <- quantile(x,seq(0,1,length=numIntKnotsU+2)[-c(1,numIntKnotsU+2)]) Zu <- ZOSull(x,intKnots=intKnotsU,range.x=c(a,b)) numKnotsU <- ncol(Zu) numIntKnotsV <- numIntKnotsU intKnotsV <- quantile(x,seq(0,1,length = numIntKnotsV + 2)[-c(1,numIntKnotsV+2)]) Zv <- ZOSull(x,intKnots=intKnotsV,range.x=c(a,b)) numKnotsV <- ncol(Zv) # RUN MEAN FIELD VARIATIONAL BAYES X <- cbind(rep(1,n),x) Cumat <- cbind(X,Zu) Cvmat <- cbind(X,Zv) ncX <- ncol(X) ncZu <- ncol(Zu) ncZv <- ncol(Zv) ncCu <- ncol(Cumat) ncCv <- ncol(Cvmat) MFVBfit <- meanVarMFVB(y,X,ncZu,ncZv,Au.hyp,Av.hyp, sigsq.gamma,sigsq.beta) ng <- 201 xgOrig <- seq(aOrig,bOrig,length=ng) xg <- (xgOrig - mean.x)/sd.x Xg <- cbind(rep(1,ng),xg) Zug <- ZOSull(xg,intKnots=intKnotsU,range.x=c(a,b)) Cug <- cbind(Xg,Zug) Zvg <- ZOSull(xg,intKnots=intKnotsV,range.x=c(a,b)) Cvg <- cbind(Xg,Zvg) mu.q.nu <- MFVBfit$mu.q.nu mu.q.omega <- MFVBfit$mu.q.omega Sigma.q.nu <- MFVBfit$Sigma.q.nu Sigma.q.omega <- MFVBfit$Sigma.q.omega fhatMFVBg <- Cug%*%mu.q.nu fhatMFVBgOrig <- fhatMFVBg*sd.y + mean.y logghatMFVBg <- Cvg%*%mu.q.omega logghatMFVBgOrig <- logghatMFVBg + 2*log(sd.y) sdloggMFVBgOrig <- sqrt(diag(Cvg%*%Sigma.q.omega%*%t(Cvg))) credLowloggMFVBgOrig <- logghatMFVBgOrig - qnorm(0.975)*sdloggMFVBgOrig credUpploggMFVBgOrig <- logghatMFVBgOrig + qnorm(0.975)*sdloggMFVBgOrig sqrtghatMFVBg <- exp(0.5*Cvg %*% mu.q.omega + 0.125*diag(Cvg%*%Sigma.q.omega%*%t(Cvg))) sqrtghatMFVBgOrig <- sqrtghatMFVBg*sd.y # RUN SMASH x.mod <- unique(sort(xOrig)) y.mod <- 0 for(i in 1:length(x.mod)) y.mod[i] <- median(yOrig[xOrig == x.mod[i]]) y.exp <- c(y.mod,y.mod[length(y.mod):(2*length(y.mod)-2^9+1)]) y.final <- c(y.exp,y.exp[length(y.exp):1]) mu.est <- smash.gaus(y.final,filter.number=1,family="DaubExPhase") var.est <- smash.gaus(y.final,v.est=TRUE) mu.est <- mu.est[1:500] var.est <- var.est[1:500] mu.est.inter <- approx(x.mod,mu.est,xgOrig,'linear')$y var.est.inter <- approx(x.mod,var.est,xgOrig,'linear')$y mse.mu.uneven.mfvb[j]<-mean((fhatMFVBgOrig - fTrue(xgOrig))^2) mse.sd.uneven.mfvb[j]<-mean((sqrtghatMFVBgOrig-exp((loggTrue(xgOrig))/2))^2) mu.est <- smash.gaus(y.final,filter.number=8,family="DaubLeAsymm") var.est <- smash.gaus(y.final,v.est=TRUE,v.basis=TRUE,filter.number=8, family="DaubLeAsymm") mu.est <- mu.est[1:500] var.est <- var.est[1:500] mu.est.inter <- approx(x.mod,mu.est,xgOrig,'linear')$y var.est.inter <- approx(x.mod,var.est,xgOrig,'linear')$y mse.mu.uneven.smash[j] <- mean((mu.est.inter-fTrue(xgOrig))^2) mse.sd.uneven.smash[j] <- mean((sqrt(var.est.inter)-exp((loggTrue(xgOrig))/2))^2) } # Running 100 simulations: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100</code></pre> </div> <div id="second-simulation-scenario-evenly-spaced-points" class="section level2"> <h2>Second simulation scenario: evenly spaced points</h2> <p>In this scenario, we simulate data sets with 1,024 evenly spaced data points. We assess accuracy separately for the mean and standard deviation as the mean of the MSEs evaluated at each of the locations.</p> <pre class="r"><code>mse.mu.even.mfvb <- 0 mse.mu.even.smash <- 0 mse.sd.even.mfvb <- 0 mse.sd.even.smash <- 0</code></pre> <p>Run the SMASH and MFVB methods for each simulated data set.</p> <pre class="r"><code>cat(sprintf("Running %d simulations: ",nsim2)) for (j in 1:nsim2) { cat(sprintf("%d ",j)) # SIMULATE DATA n <- 2^10 xOrig <- (1:n)/n set.seed(30*j) yOrig <- fTrue(xOrig) + sqrt(exp(loggTrue(xOrig)))*rnorm(n) aOrig <- min(xOrig) bOrig <- max(xOrig) mean.x <- mean(xOrig) sd.x <- sd(xOrig) mean.y <- mean(yOrig) sd.y <- sd(yOrig) a <- (aOrig - mean.x)/sd.x b <- (bOrig - mean.x)/sd.x x <- (xOrig - mean.x)/sd.x y <- (yOrig - mean.y)/sd.y numIntKnotsU <- 17 intKnotsU <- quantile(x,seq(0,1,length=numIntKnotsU+2)[-c(1,numIntKnotsU+2)]) Zu <- ZOSull(x,intKnots=intKnotsU,range.x=c(a,b)) numKnotsU <- ncol(Zu) numIntKnotsV <- numIntKnotsU intKnotsV <- quantile(x,seq(0,1,length=numIntKnotsV+2)[-c(1,numIntKnotsV+2)]) Zv <- ZOSull(x,intKnots=intKnotsV,range.x=c(a,b)) numKnotsV <- ncol(Zv) # RUN MEAN FIELD VARIATIONAL BAYES X <- cbind(rep(1,n),x) Cumat <- cbind(X,Zu) Cvmat <- cbind(X,Zv) ncX <- ncol(X) ncZu <- ncol(Zu) ncZv <- ncol(Zv) ncCu <- ncol(Cumat) ncCv <- ncol(Cvmat) MFVBfit <- meanVarMFVB(y,X,ncZu,ncZv,Au.hyp,Av.hyp, sigsq.gamma,sigsq.beta) ng <- 2^10 xgOrig <- seq(aOrig,bOrig,length=ng) xg <- (xgOrig - mean.x)/sd.x Xg <- cbind(rep(1,ng),xg) Zug <- ZOSull(xg,intKnots=intKnotsU,range.x=c(a,b)) Cug <- cbind(Xg,Zug) Zvg <- ZOSull(xg,intKnots=intKnotsV,range.x=c(a,b)) Cvg <- cbind(Xg,Zvg) mu.q.nu <- MFVBfit$mu.q.nu mu.q.omega <- MFVBfit$mu.q.omega Sigma.q.nu <- MFVBfit$Sigma.q.nu Sigma.q.omega <- MFVBfit$Sigma.q.omega # Get the mean function estimate. fhatMFVBg <- Cug %*% mu.q.nu fhatMFVBgOrig <- fhatMFVBg*sd.y + mean.y logghatMFVBg <- Cvg%*%mu.q.omega logghatMFVBgOrig <- logghatMFVBg + 2*log(sd.y) sdloggMFVBgOrig <- sqrt(diag(Cvg%*%Sigma.q.omega%*%t(Cvg))) credLowloggMFVBgOrig <- logghatMFVBgOrig - qnorm(0.975)*sdloggMFVBgOrig credUpploggMFVBgOrig <- logghatMFVBgOrig + qnorm(0.975)*sdloggMFVBgOrig sqrtghatMFVBg <- exp(0.5*Cvg%*%mu.q.omega + 0.125*diag(Cvg%*%Sigma.q.omega%*%t(Cvg))) sqrtghatMFVBgOrig <- sqrtghatMFVBg*sd.y # RUN SMASH mu.est <- smash.gaus(yOrig,filter.number=1,family="DaubExPhase") var.est <- smash.gaus(yOrig,v.est=TRUE) mse.mu.even.mfvb[j] <- mean((fhatMFVBgOrig-fTrue(xgOrig))^2) mse.sd.even.mfvb[j] <- mean((sqrtghatMFVBgOrig-exp((loggTrue(xgOrig))/2))^2) mu.est <- smash.gaus(yOrig,filter.number=8,family="DaubLeAsymm") var.est <- smash.gaus(yOrig,v.est=TRUE,v.basis=TRUE,filter.number=8, family = "DaubLeAsymm") mse.mu.even.smash[j] <- mean((mu.est - fTrue(xgOrig))^2) mse.sd.even.smash[j] <- mean((sqrt(var.est)-exp((loggTrue(xgOrig))/2))^2) } # Running 100 simulations: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100</code></pre> </div> <div id="summarize-results-of-simulations" class="section level2"> <h2>Summarize results of simulations</h2> <p>The following two tables show the mean squared error (MSE) averaged over the 100 simulations in each of the scenarios. Compare these results with Table 1 in the paper.</p> <pre class="r"><code>mse.table1 <- rbind(c(mean(mse.mu.uneven.mfvb),mean(mse.sd.uneven.mfvb)), c(mean(mse.mu.uneven.smash),mean(mse.sd.uneven.smash))) mse.table2 <- rbind(c(mean(mse.mu.even.mfvb),mean(mse.sd.even.mfvb)), c(mean(mse.mu.even.smash),mean(mse.sd.even.smash))) rownames(mse.table1) <- c("MFVB","SMASH") colnames(mse.table1) <- c("mean","sd") rownames(mse.table2) <- c("MFVB","SMASH") colnames(mse.table2) <- c("mean","sd") cat(sprintf("MSE averaged across %d simulations in Scenario 1:\n",nsim1)) print(mse.table1) cat("\n") cat(sprintf("MSE averaged across %d simulations in Scenario 2:\n",nsim2)) print(mse.table2) # MSE averaged across 100 simulations in Scenario 1: # mean sd # MFVB 0.03302778 0.01991272 # SMASH 0.03343833 0.01869602 # # MSE averaged across 100 simulations in Scenario 2: # mean sd # MFVB 0.01721438 0.008471983 # SMASH 0.01582035 0.006513889</code></pre> <p>In Scenario 1, the data are not equally spaced, and the number of data points is not a power of 2; in this setting, SMASH is more accurate in estimating both the mean and s.d.</p> <p>In Scenario 2, the data are equally spaced, and the number of data points is a power of 2; SMASH again outperforms MFVB in both mean and s.d. estimation.</p> </div> <div id="session-information" class="section level2"> <h2>Session information</h2> <pre class="r"><code>sessionInfo() # R version 3.4.3 (2017-11-30) # Platform: x86_64-apple-darwin15.6.0 (64-bit) # Running under: macOS High Sierra 10.13.6 # # Matrix products: default # BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib # LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib # # locale: # [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 # # attached base packages: # [1] splines stats graphics grDevices utils datasets methods # [8] base # # other attached packages: # [1] smashr_1.2-0 # # loaded via a namespace (and not attached): # [1] Rcpp_0.12.19 compiler_3.4.3 git2r_0.23.0 # [4] workflowr_1.1.1 R.methodsS3_1.7.1 R.utils_2.6.0 # [7] bitops_1.0-6 iterators_1.0.9 tools_3.4.3 # [10] digest_0.6.17 evaluate_0.11 lattice_0.20-35 # [13] Matrix_1.2-12 foreach_1.4.4 yaml_2.2.0 # [16] parallel_3.4.3 stringr_1.3.1 knitr_1.20 # [19] caTools_1.17.1 REBayes_1.3 rprojroot_1.3-2 # [22] grid_3.4.3 data.table_1.11.4 rmarkdown_1.10 # [25] ashr_2.2-23 magrittr_1.5 whisker_0.3-2 # [28] backports_1.1.2 codetools_0.2-15 htmltools_0.3.6 # [31] MASS_7.3-48 assertthat_0.2.0 wavethresh_4.6.8 # [34] stringi_1.2.4 Rmosek_8.0.69 doParallel_1.0.11 # [37] pscl_1.5.2 truncnorm_1.0-8 SQUAREM_2017.10-1 # [40] R.oo_1.21.0</code></pre> </div> <script type="text/x-mathjax-config"> MathJax.Hub.Config({ "HTML-CSS": { availableFonts: ["TeX"] } }); </script> <!-- Adjust MathJax settings so that all math formulae are shown using TeX fonts only; see http://docs.mathjax.org/en/latest/configuration.html. This will make the presentation more consistent at the cost of the webpage sometimes taking slightly longer to load. Note that this only works because the footer is added to webpages before the MathJax javascript. --> <script type="text/x-mathjax-config"> MathJax.Hub.Config({ "HTML-CSS": { availableFonts: ["TeX"] } }); </script> <hr> <p> This reproducible <a href="http://rmarkdown.rstudio.com">R Markdown</a> analysis was created with <a href="https://github.com/jdblischak/workflowr">workflowr</a> 1.1.1 </p> <hr> </div> <script> // add bootstrap table styles to pandoc tables function bootstrapStylePandocTables() { $('tr.header').parent('thead').parent('table').addClass('table table-condensed'); } $(document).ready(function () { bootstrapStylePandocTables(); }); </script> <!-- dynamically load mathjax for compatibility with self-contained --> <script> (function () { var script = document.createElement("script"); script.type = "text/javascript"; script.src = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"; document.getElementsByTagName("head")[0].appendChild(script); })(); </script> </body> </html>