... a few things done by Liang

Computation

Lately, I've been working to convince myself that everything is a computation.

-Rudy Rucker

Listed here are a few technical reports and manuscripts related to statistical computing.

  • geoCount: an R Package for Bayesian Estimation and Model Checking for Generalized Linear Spatial Models

Manuscript [in preparation]
Keywords: hierarchical models, robust MCMC, latent variables, parallel computing, R/C++ API, Bayesian model checking, transformed residuals
Abstract: Model fitting and checking remain difficult for hierarchical models with complex structure, such as generalized linear spatial models. Because of the multi-layer hierarchy and the large number of unobserved latent variables, posterior sampling for latent variables and hyper-parameters usually converges poorly. In this package, we apply up-to-date robust MCMC algorithms which provide not only fast mixing and quick convergence of the chains but also low correlation between posterior samples of different variables and parameters. Furthermore, we speed up the generation of Markov chains with parallel computing techniques. Bayesian model checking methods, including a new method based on transformed residuals, are also implemented ...
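
The model-checking idea can be illustrated with a generic posterior predictive check in base R; everything below (the normal model, the max statistic) is a stand-in for illustration, not the geoCount API:

```r
# Generic posterior predictive check (illustrative only; not geoCount code).
# Replicated data sets are drawn from the posterior, and a test statistic
# on the replicates is compared with its observed value.
set.seed(1)
y <- rnorm(100, mean = 2, sd = 1)                    # "observed" data
n <- length(y); draws <- 1000
# Joint posterior of (mu, sigma2) for a normal model under a flat prior:
sigma2_post <- (n - 1) * var(y) / rchisq(draws, n - 1)
mu_post     <- rnorm(draws, mean(y), sqrt(sigma2_post / n))
# Posterior predictive distribution of the test statistic T(y) = max(y)
T_obs <- max(y)
T_rep <- vapply(seq_len(draws), function(i)
  max(rnorm(n, mu_post[i], sqrt(sigma2_post[i]))), numeric(1))
p_val <- mean(T_rep >= T_obs)   # values near 0 or 1 signal model misfit
```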

  • Communications between R and C++

Report [pdf]
Code [Rcpp.R] [Rinside.cpp] [module.cpp]
Keywords: forthcoming ...
Abstract: forthcoming ...
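
Pending the report, here is a minimal sketch of one such channel, inline compilation of C++ with the Rcpp package; sumC is an illustrative name, not taken from the linked code files:

```r
# Compile and call a small C++ function from R with Rcpp.
# sumC is an illustrative name, not part of the report's code.
library(Rcpp)

cppFunction('
  double sumC(NumericVector x) {
    double total = 0;
    for (int i = 0; i < x.size(); ++i) total += x[i];
    return total;
  }
')

sumC(c(1, 2, 3))   # matches sum(c(1, 2, 3))
```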

  • Parallel Computing with R and How to Use it on High Performance Computing Clusters

Report [pdf]
Code [snow.R] [snow_CV.R] [snow_BS.R]
[snowfall.R] [multicore.R] [parallel.job] [array.job]
Keywords: Rmpi, snow, snowfall, multicore, wrapper function, HPC, job script
Abstract: Methodological advances have led to much greater computational demands in statistical computing, for example from Markov chain Monte Carlo (MCMC) algorithms, bootstrapping, cross-validation, and Monte Carlo simulation, and many areas of statistical application are experiencing rapid growth in the size of data sets. Parallel computing is a common approach to these problems. In this paper, four of the most promising packages for parallel computing with R are introduced with implementation examples, and the procedure for running R in parallel on a high performance computing cluster is illustrated.
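
As a flavor of the wrapper-function style these packages share, here is a minimal bootstrap sketch using the base parallel package (which absorbed the snow and multicore interfaces); the bootstrap task itself is a placeholder for any embarrassingly parallel job:

```r
# Bootstrap standard error of the mean, parallelized with the base
# 'parallel' package. Each worker runs the same wrapper function on
# its share of the replicates.
library(parallel)
cl <- makeCluster(2)                      # socket cluster, as with snow
clusterSetRNGStream(cl, 1)                # reproducible parallel RNG streams
set.seed(3)
x <- rnorm(200)
boot_means <- parSapply(cl, 1:500, function(i, x) {
  mean(sample(x, replace = TRUE))         # one bootstrap replicate
}, x = x)
stopCluster(cl)
se_boot <- sd(boot_means)                 # bootstrap SE of the sample mean
```

On a cluster, the same code scales by pointing makeCluster at the nodes listed by the scheduler instead of local cores.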

  • Improved Robust MCMC Algorithm for Hierarchical Models

Report [pdf]
Keywords: Hastings-within-Gibbs, slow mixing, group updating, Langevin algorithm, data-corrected parameterization, flat prior
Abstract: In this paper, three important techniques are discussed in detail: the group updating scheme, the Langevin algorithm, and data-corrected parameterization. They substantially improve the performance of the Hastings-within-Gibbs algorithm, as illustrated through an application to a hierarchical model for the Rongelap data.
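
To fix ideas, here is a generic sketch of the Langevin (MALA) update on a one-dimensional standard normal target; this is the textbook algorithm, not the report's hierarchical-model implementation:

```r
# Metropolis-adjusted Langevin algorithm (MALA) on a standard normal
# target. The gradient of the log-density drifts proposals toward
# high-probability regions, improving mixing over a plain random walk.
set.seed(42)
log_p <- function(x) -x^2 / 2             # log target, up to a constant
grad  <- function(x) -x                   # gradient of log_p
h <- 0.5                                  # step size (needs tuning in practice)
n <- 5000
x <- numeric(n)
for (t in 2:n) {
  cur  <- x[t - 1]
  prop <- cur + h * grad(cur) / 2 + sqrt(h) * rnorm(1)  # Langevin proposal
  # Hastings correction: the Langevin proposal is not symmetric
  log_q <- function(to, from) -(to - from - h * grad(from) / 2)^2 / (2 * h)
  log_alpha <- log_p(prop) - log_p(cur) + log_q(cur, prop) - log_q(prop, cur)
  x[t] <- if (log(runif(1)) < log_alpha) prop else cur
}
```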

  • Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Models

Report [pdf]
Keywords: MCMC, Gibbs sampler, Metropolis-Hastings, "fix-scan", slow mixing
Abstract: In this paper, common MCMC algorithms are introduced, including the Hastings-within-Gibbs algorithm, which is then applied to a hierarchical model with a simulated data set. The "fix-scan" technique is used to update the latent variables in the model, and the results are studied to explore the weaknesses of the algorithm.
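
A generic Hastings-within-Gibbs sketch on a toy normal model (not the report's model or data): the mean gets a conjugate Gibbs update, while the variance gets a random-walk Hastings step on the log scale.

```r
# Hastings-within-Gibbs on y_i ~ N(mu, s2) with flat prior p(mu) and
# p(s2) proportional to 1/s2. mu | s2 is sampled exactly (Gibbs);
# s2 is updated by a multiplicative random-walk Metropolis-Hastings step.
set.seed(1)
y <- rnorm(50, mean = 3, sd = 2)
n <- length(y); iters <- 2000
mu <- s2 <- numeric(iters)
mu[1] <- 0; s2[1] <- 1
log_post_s2 <- function(v, mu) {
  -(n / 2) * log(v) - sum((y - mu)^2) / (2 * v) - log(v)  # includes prior 1/v
}
for (t in 2:iters) {
  # Gibbs step: mu | s2, y ~ N(ybar, s2 / n)
  mu[t] <- rnorm(1, mean(y), sqrt(s2[t - 1] / n))
  # Hastings step: random walk on log(s2)
  cur  <- s2[t - 1]
  prop <- cur * exp(rnorm(1, 0, 0.5))
  # proposal is asymmetric: q(cur | prop) / q(prop | cur) = prop / cur
  log_alpha <- log_post_s2(prop, mu[t]) - log_post_s2(cur, mu[t]) +
    log(prop) - log(cur)
  s2[t] <- if (log(runif(1)) < log_alpha) prop else cur
}
```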

  • Nested Sampling: Introduction and Implementation

Report [pdf]
Code [nested_norm.R] [nested_exp.R] [nested_growth.R]
Keywords: marginal density of the data, Monte Carlo methods, cumulative prior mass, parameter space, shrinkage of the likelihood, termination condition, wrap-around and reflection technique
Abstract: Nested Sampling is a new technique to calculate the evidence (alternatively the marginal likelihood, marginal density of the data, or the prior predictive) in a way that uses Monte Carlo methods ...
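
The core loop can be sketched in a few lines of R for a toy model whose evidence is known in closed form; this is a bare-bones illustration, separate from the linked code files:

```r
# Minimal nested sampling on a toy model: y_i ~ N(theta, 1) with a
# Uniform(-5, 5) prior on theta. New live points are drawn by simple
# rejection from the prior under the likelihood constraint, which is
# workable in one dimension; real implementations use smarter
# constrained moves (e.g. the wrap-around and reflection technique).
set.seed(7)
y <- rnorm(10, mean = 1, sd = 1)
loglik <- function(th) sum(dnorm(y, th, 1, log = TRUE))
N <- 100                                  # number of live points
live <- runif(N, -5, 5)
ll   <- sapply(live, loglik)
logZ <- -Inf; logX <- 0                   # log evidence, log prior mass left
for (i in 1:500) {
  worst <- which.min(ll)
  logX_new <- -i / N                      # expected shrinkage X_i = exp(-i/N)
  logw <- log(exp(logX) - exp(logX_new))  # width of the discarded shell
  logZ <- log(exp(logZ) + exp(logw + ll[worst]))
  logX <- logX_new
  repeat {                                # prior draw with L > L_worst
    cand <- runif(1, -5, 5)
    if (loglik(cand) > ll[worst]) {
      live[worst] <- cand; ll[worst] <- loglik(cand); break
    }
  }
}
logZ <- log(exp(logZ) + exp(logX) * mean(exp(ll)))  # remaining live-point mass
```

For this conjugate setup the evidence integrates analytically, so the estimate can be checked directly.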

  • MCMC Algorithms for LMMs and GLMMs: Implementation in R and Comparison with WinBUGS

Report [pdf]
Code [caseI.R] [caseII.R] [caseIII.R] [caseI_c.R]
Keywords: mixed models, Bayesian analysis, hierarchical structure, Gibbs sampler, Metropolis-Hastings algorithm
Abstract: Although WinBUGS is already available for posterior sampling, all computation in WinBUGS runs in a "black box," and it is almost impossible to control or modify the internal algorithms for specific needs. On the other hand, generalized linear mixed models (GLMMs) are growing in popularity because of their capability to handle an extraordinary range of complications in regression analysis ...
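
As a flavor of hand-coding such samplers outside WinBUGS, here is a Gibbs sketch for a toy random-intercept model with conjugate updates (an illustration, not the code from the linked case studies):

```r
# Gibbs sampler for a random-intercept model, written directly in R:
#   y[j, i] = b_i + e[j, i],  b_i ~ N(0, tau2),  e ~ N(0, s2),
# with conjugate inverse-gamma IG(0.01, 0.01) priors on s2 and tau2.
set.seed(2)
m <- 20; k <- 10                          # groups, observations per group
b_true <- rnorm(m, 0, 1.5)
y <- matrix(rnorm(m * k, rep(b_true, each = k), 1), nrow = k)  # col = group
ybar <- colMeans(y)
iters <- 2000
b <- matrix(0, iters, m); s2 <- tau2 <- numeric(iters)
s2[1] <- tau2[1] <- 1
for (t in 2:iters) {
  # b_i | rest: normal with precision k/s2 + 1/tau2
  prec <- k / s2[t - 1] + 1 / tau2[t - 1]
  b[t, ] <- rnorm(m, (k * ybar / s2[t - 1]) / prec, sqrt(1 / prec))
  # s2 | rest and tau2 | rest: conjugate inverse-gamma updates
  s2[t]   <- 1 / rgamma(1, 0.01 + m * k / 2,
                        0.01 + sum((y - rep(b[t, ], each = k))^2) / 2)
  tau2[t] <- 1 / rgamma(1, 0.01 + m / 2, 0.01 + sum(b[t, ]^2) / 2)
}
```

Every conditional here is available in closed form, so no Metropolis step is needed; the same skeleton extends to GLMMs by replacing the closed-form updates with Hastings steps.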


Contact ljing918@gmail.com with any questions.