Last updated: 2018-12-14
workflowr checks: (Click a bullet for more information) ✔ R Markdown file: up-to-date
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
✔ Environment: empty
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
✔ Seed:
set.seed(12345)
The command set.seed(12345)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
✔ Session information: recorded
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
✔ Repository version: 73ed3b7
wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: analysis/.DS_Store
Ignored: analysis/BH_robustness_cache/
Ignored: analysis/FDR_Null_cache/
Ignored: analysis/FDR_null_betahat_cache/
Ignored: analysis/Rmosek_cache/
Ignored: analysis/StepDown_cache/
Ignored: analysis/alternative2_cache/
Ignored: analysis/alternative_cache/
Ignored: analysis/ash_gd_cache/
Ignored: analysis/average_cor_gtex_2_cache/
Ignored: analysis/average_cor_gtex_cache/
Ignored: analysis/brca_cache/
Ignored: analysis/cash_deconv_cache/
Ignored: analysis/cash_fdr_1_cache/
Ignored: analysis/cash_fdr_2_cache/
Ignored: analysis/cash_fdr_3_cache/
Ignored: analysis/cash_fdr_4_cache/
Ignored: analysis/cash_fdr_5_cache/
Ignored: analysis/cash_fdr_6_cache/
Ignored: analysis/cash_paper_fig1_cache/
Ignored: analysis/cash_paper_fig_avgecdf_cache/
Ignored: analysis/cash_paper_fig_deconv_cache/
Ignored: analysis/cash_paper_fig_leukemia_cache/
Ignored: analysis/cash_plots_2_cache/
Ignored: analysis/cash_plots_3_cache/
Ignored: analysis/cash_plots_4_cache/
Ignored: analysis/cash_plots_5_cache/
Ignored: analysis/cash_plots_cache/
Ignored: analysis/cash_sim_1_cache/
Ignored: analysis/cash_sim_2_cache/
Ignored: analysis/cash_sim_3_cache/
Ignored: analysis/cash_sim_4_cache/
Ignored: analysis/cash_sim_5_cache/
Ignored: analysis/cash_sim_6_cache/
Ignored: analysis/cash_sim_7_cache/
Ignored: analysis/correlated_z_2_cache/
Ignored: analysis/correlated_z_3_cache/
Ignored: analysis/correlated_z_cache/
Ignored: analysis/create_null_cache/
Ignored: analysis/cutoff_null_cache/
Ignored: analysis/design_matrix_2_cache/
Ignored: analysis/design_matrix_cache/
Ignored: analysis/diagnostic_ash_cache/
Ignored: analysis/diagnostic_correlated_z_2_cache/
Ignored: analysis/diagnostic_correlated_z_3_cache/
Ignored: analysis/diagnostic_correlated_z_cache/
Ignored: analysis/diagnostic_plot_2_cache/
Ignored: analysis/diagnostic_plot_cache/
Ignored: analysis/efron_leukemia_cache/
Ignored: analysis/fitting_normal_cache/
Ignored: analysis/gaussian_derivatives_2_cache/
Ignored: analysis/gaussian_derivatives_3_cache/
Ignored: analysis/gaussian_derivatives_4_cache/
Ignored: analysis/gaussian_derivatives_5_cache/
Ignored: analysis/gaussian_derivatives_cache/
Ignored: analysis/gd-ash_cache/
Ignored: analysis/gd_delta_cache/
Ignored: analysis/gd_lik_2_cache/
Ignored: analysis/gd_lik_cache/
Ignored: analysis/gd_w_cache/
Ignored: analysis/knockoff_10_cache/
Ignored: analysis/knockoff_2_cache/
Ignored: analysis/knockoff_3_cache/
Ignored: analysis/knockoff_4_cache/
Ignored: analysis/knockoff_5_cache/
Ignored: analysis/knockoff_6_cache/
Ignored: analysis/knockoff_7_cache/
Ignored: analysis/knockoff_8_cache/
Ignored: analysis/knockoff_9_cache/
Ignored: analysis/knockoff_cache/
Ignored: analysis/knockoff_var_cache/
Ignored: analysis/marginal_z_alternative_cache/
Ignored: analysis/marginal_z_cache/
Ignored: analysis/mosek_reg_2_cache/
Ignored: analysis/mosek_reg_4_cache/
Ignored: analysis/mosek_reg_5_cache/
Ignored: analysis/mosek_reg_6_cache/
Ignored: analysis/mosek_reg_cache/
Ignored: analysis/pihat0_null_cache/
Ignored: analysis/plot_diagnostic_cache/
Ignored: analysis/poster_obayes17_cache/
Ignored: analysis/real_data_simulation_2_cache/
Ignored: analysis/real_data_simulation_3_cache/
Ignored: analysis/real_data_simulation_4_cache/
Ignored: analysis/real_data_simulation_5_cache/
Ignored: analysis/real_data_simulation_cache/
Ignored: analysis/rmosek_primal_dual_2_cache/
Ignored: analysis/rmosek_primal_dual_cache/
Ignored: analysis/seqgendiff_cache/
Ignored: analysis/simulated_correlated_null_2_cache/
Ignored: analysis/simulated_correlated_null_3_cache/
Ignored: analysis/simulated_correlated_null_cache/
Ignored: analysis/simulation_real_se_2_cache/
Ignored: analysis/simulation_real_se_cache/
Ignored: analysis/smemo_2_cache/
Ignored: data/LSI/
Ignored: docs/.DS_Store
Ignored: docs/figure/.DS_Store
Ignored: output/fig/
Ignored: output/paper/
Unstaged changes:
Modified: analysis/cash_plots_2.rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
File | Version | Author | Date | Message |
---|---|---|---|---|
rmd | c87b8d2 | Lei Sun | 2018-11-11 | figures |
z.sel <- readRDS("../output/paper/simulation/z.sel.rds")
source('../code/count_to_summary.R')
source('../code/gdash_lik.R')
Loading required package: EQL
Loading required package: ttutils
Loading required package: SQUAREM
Loading required package: REBayes
Loading required package: Matrix
Warning: package 'Matrix' was built under R version 3.4.4
Loading required package: CVXR
Attaching package: 'CVXR'
The following object is masked from 'package:stats':
power
Loading required package: PolynomF
Loading required package: Rmosek
Loading required package: ashr
library(ggplot2)
Warning: package 'ggplot2' was built under R version 3.4.4
CDF.KW <- function(h, interp = FALSE, eps = 0.001, bw = 0.7){
#Wasserstein distance: ||G-H||_W
if(interp == "biweight"){
yk = h$x
for (j in 1:length(yk))
yk[j] = sum(biweight(h$x[j], h$x, bw = bw)*h$y/sum(h$y))
H <- cumsum(yk)
H <- H/H[length(H)]
}
else {
H <- cumsum(h$y)
H <- H/H[length(H)]
}
return(H)
}
G <- function (t) {
0.6 * pnorm(t, 0, 0) + 0.3 * pnorm(t, 0, 1) + 0.1 * pnorm(t, 0, 3)
}
set.seed(777)
theta <- sample(c(
rnorm(6e3, 0, 0),
rnorm(3e3, 0, 1),
rnorm(1e3, 0, 3)
))
x.plot <- seq(-6, 6, by = 0.1)
G.plot <- G(x.plot)
r <- readRDS("../data/liver.rds")
nsamp <- 5
ngene <- 1e4
Y = lcpm(r)
subset = top_genes_index(ngene, Y)
r = r[subset,]
set.seed(777)
counts <- r[, sample(ncol(r), 2 * nsamp)]
design <- model.matrix(~c(rep(0, nsamp), rep(1, nsamp)))
summary <- count_to_summary(counts, design)
s <- summary$sebetahat
s <- s / sqrt(mean(s^2))
noise.label <- c(
'a',
'b',
'c',
'd',
'e'
)
deconv.list <- list()
for (i in 1 : 5) {
if (i <= 4) {
Z <- z.sel[i, ]
} else {
set.seed(777)
Z <- rnorm(1e4)
}
X <- theta + s * Z
z <- theta + Z
## Truth
true.data <- cbind.data.frame(
method = "True g",
x = x.plot,
cdfhat = G.plot
)
## ashr
fit.ashr <- ashr::ash(X, s, method = "fdr", mixcompdist = "normal")
ashr.plot <- as.numeric(ashr::mixcdf(ashr::get_fitted_g(fit.ashr), x.plot))
ashr.data <- cbind.data.frame(
method = "ashr",
x = x.plot,
cdfhat = ashr.plot
)
## cashr
fit.cashr <- gdash(X, s)
cashr.plot <- as.numeric(ashr::mixcdf(ashr::get_fitted_g(fit.cashr), x.plot))
cashr.data <- cbind.data.frame(
method = "cashr",
x = x.plot,
cdfhat = cashr.plot
)
## deconvolveR
fit.deconvolveR <- deconvolveR::deconv(tau = x.plot, X = z, family = "Normal", deltaAt = 0)
deconvolveR.data <- cbind.data.frame(
method = "deconvolveR",
x = fit.deconvolveR$stats[, 1],
cdfhat = fit.deconvolveR$stats[, 4]
)
## Kiefer-Wolfowitz's NPMLE (1956)
## implemented by Koenker-Mizera-Gu's REBayes (2016)
v = seq(-6.025, 6.025, by = 0.05)
fit.REBayes <- REBayes::GLmix(x = X, v = v, sigma = s)
REBayes.plot <- CDF.KW(fit.REBayes)
REBayes.data <- cbind.data.frame(
method = "REBayes",
x = fit.REBayes$x,
cdfhat = REBayes.plot
)
## EbayesThresh
fit.EbayesThresh <- EbayesThresh::ebayesthresh(X, sdev = s, verbose = TRUE, prior = "laplace", a = NA)
EbayesThresh.plot <- (1 - fit.EbayesThresh$w) * (x.plot >= 0) + fit.EbayesThresh$w * rmutil::plaplace(x.plot, m = 0, s = 1 / fit.EbayesThresh$a)
EbayesThresh.data <- cbind.data.frame(
method = "EbayesThresh",
x = x.plot,
cdfhat = EbayesThresh.plot
)
deconv.list[[i]] <- cbind.data.frame(
noise = noise.label[i],
rbind.data.frame(
true.data,
EbayesThresh.data,
REBayes.data,
ashr.data,
deconvolveR.data,
cashr.data
)
)
}
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
Warning in stats::nlm(f = loglik, p = aStart, gradtol = 1e-10, ...): NA/Inf
replaced by maximum positive value
deconv.ggdata <- do.call(rbind.data.frame, deconv.list)
noise.level <- c("(a)",
"(b)",
"(c)",
"(d)",
"(e)")
deconv.ggdata$noise <- plyr::mapvalues(deconv.ggdata$noise,
from = noise.label,
to = noise.level
)
deconv.ggdata$noise <- factor(deconv.ggdata$noise,
levels = levels(deconv.ggdata$noise)[c(1, 2, 5, 3, 4)]
)
method.col <- c("black", scales::hue_pal()(5)[c(1, 3, 4, 2, 5)])
method.linetype <- rep(1, 6)
#method.linetype <- c(1, 2, 4, 5, 6)
## plotting
deconv.plot <- ggplot(data = deconv.ggdata, aes(x = x, y = cdfhat, col = method, linetype = method)) +
geom_line(size = 1) +
facet_wrap(~noise, nrow = 2) +
xlim(-5, 5) +
scale_linetype_manual(values = method.linetype
#, labels = method.name
#, guide = guide_legend(nrow = 1)
) +
scale_color_manual(values = method.col
#, labels = method.name
#, guide = guide_legend(nrow = 1)
) +
labs(y = expression(paste("CDF of (estimated) g"))
#, x = expression(theta)
#, title = expression(g == 0.6~delta[0] + 0.3~N(0, 1) + 0.1~N(0, 3^2))
) +
theme(plot.title = element_text(size = 15, hjust = 0.5),
axis.title.x = element_blank(),
axis.text.x = element_text(size = 10),
axis.title.y = element_text(size = 15),
axis.text.y = element_text(size = 10),
strip.text = element_text(size = 15),
legend.position = c(0.85, 0.25),
legend.title = element_blank(),
#legend.background = element_rect(color = "grey"),
legend.text = element_text(size = 15))
ggsave("../output/paper/deconv.pdf", height = 6, width = 10)
Warning: Removed 140 rows containing missing values (geom_path).
q <- 0.1
p.val.3 <- pnorm(-abs((theta + s * z.sel[3, ]) / s)) * 2
fit.BH.3 <- p.adjust(p.val.3, method = "BH")
sum(fit.BH.3 <= q)
[1] 3218
sum(theta[fit.BH.3 <= q] == 0)
[1] 1484
sum(theta[fit.BH.3 <= q] == 0) / sum(fit.BH.3 <= q)
[1] 0.461156
fit.qvalue.3 <- qvalue::qvalue(p.val.3)
fit.qvalue.3$pi0
[1] 0.3909627
sum(fit.qvalue.3$qvalues <= q)
[1] 5021
sum(theta[fit.qvalue.3$qvalues <= q] == 0)
[1] 2689
sum(theta[fit.qvalue.3$qvalues <= q] == 0) / sum(fit.qvalue.3$qvalues <= q)
[1] 0.5355507
sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_3.1.0 ashr_2.2-7 Rmosek_8.0.69
[4] PolynomF_1.0-2 CVXR_0.95 REBayes_1.3
[7] Matrix_1.2-14 SQUAREM_2017.10-1 EQL_1.0-0
[10] ttutils_1.0-1
loaded via a namespace (and not attached):
[1] qvalue_2.10.0 locfit_1.5-9.1 reshape2_1.4.3
[4] splines_3.4.3 lattice_0.20-35 colorspace_1.3-2
[7] htmltools_0.3.6 yaml_2.1.19 gmp_0.5-13.1
[10] rlang_0.3.0.1 R.oo_1.22.0 pillar_1.2.2
[13] glue_1.2.0 Rmpfr_0.7-0 withr_2.1.2
[16] R.utils_2.6.0 bit64_0.9-7 bindrcpp_0.2.2
[19] scs_1.1-1 foreach_1.4.4 plyr_1.8.4
[22] bindr_0.1.1 stringr_1.3.1 munsell_0.4.3
[25] gtable_0.2.0 workflowr_1.1.1 R.methodsS3_1.7.1
[28] codetools_0.2-15 evaluate_0.10.1 labeling_0.3
[31] knitr_1.20 doParallel_1.0.11 pscl_1.5.2
[34] parallel_3.4.3 Rcpp_0.12.16 edgeR_3.20.9
[37] backports_1.1.2 scales_0.5.0 limma_3.34.9
[40] truncnorm_1.0-8 bit_1.1-13 digest_0.6.15
[43] stringi_1.2.2 dplyr_0.7.4 grid_3.4.3
[46] rprojroot_1.3-2 ECOSolveR_0.4 tools_3.4.3
[49] magrittr_1.5 lazyeval_0.2.1 tibble_1.4.2
[52] whisker_0.3-2 pkgconfig_2.0.1 MASS_7.3-50
[55] assertthat_0.2.0 rmarkdown_1.9 iterators_1.0.9
[58] R6_2.2.2 git2r_0.23.0 compiler_3.4.3
This reproducible R Markdown analysis was created with workflowr 1.1.1