Title: | Complete Environment for Bayesian Inference |
---|---|
Description: | Provides a complete environment for Bayesian inference using a variety of different samplers (see ?LaplacesDemon for an overview). |
Authors: | Byron Hall [aut], Martina Hall [aut], Statisticat, LLC [aut], Eric Brown [ctb], Richard Hermanson [ctb], Emmanuel Charpentier [ctb], Daniel Heck [ctb], Stephane Laurent [ctb], Quentin F. Gronau [ctb], Henrik Singmann [cre] |
Maintainer: | Henrik Singmann <[email protected]> |
License: | MIT + file LICENSE |
Version: | 16.1.6 |
Built: | 2024-11-12 04:55:49 UTC |
Source: | https://github.com/laplacesdemonr/laplacesdemon |
The DESCRIPTION file:
Package: | LaplacesDemon |
Version: | 16.1.6 |
Title: | Complete Environment for Bayesian Inference |
Authors@R: | c(person("Byron", "Hall", role = "aut"), person("Martina", "Hall", role = "aut"), person(family="Statisticat, LLC", role = "aut"), person(given="Eric", family="Brown", role = "ctb"), person(given="Richard", family="Hermanson", role = "ctb"), person(given="Emmanuel", family="Charpentier", role = "ctb"), person(given="Daniel", family="Heck", role = "ctb"), person(given="Stephane", family="Laurent", role = "ctb"), person(given="Quentin F.", family="Gronau", role = "ctb"), person(given="Henrik", family="Singmann", email="[email protected]", role="cre")) |
Depends: | R (>= 3.0.0) |
Imports: | parallel, grDevices, graphics, stats, utils |
Suggests: | KernSmooth |
ByteCompile: | TRUE |
Description: | Provides a complete environment for Bayesian inference using a variety of different samplers (see ?LaplacesDemon for an overview). |
License: | MIT + file LICENSE |
URL: | https://github.com/LaplacesDemonR/LaplacesDemon |
BugReports: | https://github.com/LaplacesDemonR/LaplacesDemon/issues |
Repository: | https://laplacesdemonr.r-universe.dev |
RemoteUrl: | https://github.com/laplacesdemonr/laplacesdemon |
RemoteRef: | HEAD |
RemoteSha: | de9107d46c215a9db57ad6e9c95a9ebcaf75ef25 |
Author: | Byron Hall [aut], Martina Hall [aut], Statisticat, LLC [aut], Eric Brown [ctb], Richard Hermanson [ctb], Emmanuel Charpentier [ctb], Daniel Heck [ctb], Stephane Laurent [ctb], Quentin F. Gronau [ctb], Henrik Singmann [cre] |
Maintainer: | Henrik Singmann <[email protected]> |
Index of help topics:
ABB Approximate Bayesian Bootstrap AcceptanceRate Acceptance Rate BMK.Diagnostic BMK Convergence Diagnostic BayesFactor Bayes Factor BayesTheorem Bayes' Theorem BayesianBootstrap The Bayesian Bootstrap BigData Big Data Blocks Blocks CSF Cumulative Sample Function CenterScale Centering and Scaling Combine Combine Demonoid Objects Consort Consort with Laplace's Demon Cov2Prec Precision ESS Effective Sample Size due to Autocorrelation GIV Generate Initial Values GaussHermiteQuadRule Math Utility Functions Gelfand.Diagnostic Gelfand's Convergence Diagnostic Gelman.Diagnostic Gelman and Rubin's MCMC Convergence Diagnostic Geweke.Diagnostic Geweke's Convergence Diagnostic Hangartner.Diagnostic Hangartner's Convergence Diagnostic Heidelberger.Diagnostic Heidelberger and Welch's MCMC Convergence Diagnostic IAT Integrated Autocorrelation Time Importance Variable Importance IterativeQuadrature Iterative Quadrature Juxtapose Juxtapose MCMC Algorithm Inefficiency KLD Kullback-Leibler Divergence (KLD) KS.Diagnostic Kolmogorov-Smirnov Convergence Diagnostic LML Logarithm of the Marginal Likelihood LPL.interval Lowest Posterior Loss Interval LaplaceApproximation Laplace Approximation LaplacesDemon Laplace's Demon LaplacesDemon-package Complete Environment for Bayesian Inference LaplacesDemon.RAM LaplacesDemon RAM Estimate Levene.Test Levene's Test LossMatrix Loss Matrix MCSE Monte Carlo Standard Error MISS Multiple Imputation Sequential Sampling MinnesotaPrior Minnesota Prior Mode The Mode(s) of a Vector Model.Spec.Time Model Specification Time PMC Population Monte Carlo PMC.RAM PMC RAM Estimate PosteriorChecks Posterior Checks Raftery.Diagnostic Raftery and Lewis's diagnostic RejectionSampling Rejection Sampling SIR Sampling Importance Resampling SensitivityAnalysis Sensitivity Analysis Stick Truncated Stick-Breaking Thin Thin Validate Holdout Validation VariationalBayes Variational Bayes WAIC Widely Applicable Information Criterion as.covar Proposal Covariance as.indicator.matrix Matrix Utility Functions as.initial.values Initial Values as.parm.names Parameter Names as.ppc As Posterior Predictive Check burnin Burn-in caterpillar.plot Caterpillar Plot cloglog The log-log and complementary log-log functions cond.plot Conditional Plots dStick Truncated Stick-Breaking Prior Distribution dalaplace Asymmetric Laplace Distribution: Univariate dallaplace Asymmetric Log-Laplace Distribution daml Asymmetric Multivariate Laplace Distribution dbern Bernoulli Distribution dcat Categorical Distribution dcrmrf Continuous Relaxation of a Markov Random Field Distribution ddirichlet Dirichlet Distribution de.Finetti.Game de Finetti's Game deburn De-Burn delicit Prior Elicitation demonchoice Demon Choice Data Set demonfx Demon FX Data Set demonsessions Demon Sessions Data Set demonsnacks Demon Snacks Data Set demontexas Demon Space-Time Data Set dgpd Generalized Pareto Distribution dgpois Generalized Poisson Distribution dhalfcauchy Half-Cauchy Distribution dhalfnorm Half-Normal Distribution dhalft Half-t Distribution dhs Horseshoe Distribution dhuangwand Huang-Wand Distribution dhyperg Hyperprior-g Prior and Zellner's g-Prior dinvbeta Inverse Beta Distribution dinvchisq (Scaled) Inverse Chi-Squared Distribution dinvgamma Inverse Gamma Distribution dinvgaussian Inverse Gaussian Distribution dinvmatrixgamma Inverse Matrix Gamma Distribution dinvwishart Inverse Wishart Distribution dinvwishartc Inverse Wishart Distribution: Cholesky Parameterization dlaplace Laplace Distribution: Univariate Symmetric dlaplacem Mixture of 
Laplace Distributions dlaplacep Laplace Distribution: Precision Parameterization dlasso LASSO Distribution dllaplace Log-Laplace Distribution: Univariate Symmetric dlnormp Log-Normal Distribution: Precision Parameterization dmatrixgamma Matrix Gamma Distribution dmatrixnorm Matrix Normal Distribution dmvc Multivariate Cauchy Distribution dmvcc Multivariate Cauchy Distribution: Cholesky Parameterization dmvcp Multivariate Cauchy Distribution: Precision Parameterization dmvcpc Multivariate Cauchy Distribution: Precision-Cholesky Parameterization dmvl Multivariate Laplace Distribution dmvlc Multivariate Laplace Distribution: Cholesky Parameterization dmvn Multivariate Normal Distribution dmvnc Multivariate Normal Distribution: Cholesky Parameterization dmvnp Multivariate Normal Distribution: Precision Parameterization dmvnpc Multivariate Normal Distribution: Precision-Cholesky Parameterization dmvpe Multivariate Power Exponential Distribution dmvpec Multivariate Power Exponential Distribution: Cholesky Parameterization dmvpolya Multivariate Polya Distribution dmvt Multivariate t Distribution dmvtc Multivariate t Distribution: Cholesky Parameterization dmvtp Multivariate t Distribution: Precision Parameterization dmvtpc Multivariate t Distribution: Precision-Cholesky Parameterization dnorminvwishart Normal-Inverse-Wishart Distribution dnormlaplace Normal-Laplace Distribution: Univariate Asymmetric dnormm Mixture of Normal Distributions dnormp Normal Distribution: Precision Parameterization dnormv Normal Distribution: Variance Parameterization dnormwishart Normal-Wishart Distribution dpareto Pareto Distribution dpe Power Exponential Distribution: Univariate Symmetric dsdlaplace Skew Discrete Laplace Distribution: Univariate dsiw Scaled Inverse Wishart Distribution dslaplace Skew-Laplace Distribution: Univariate dst Student t Distribution: Univariate dstp Student t Distribution: Precision Parameterization dtrunc Truncated Distributions dwishart Wishart Distribution dwishartc Wishart Distribution: Cholesky Parameterization dyangberger Yang-Berger Distribution interval Constrain to Interval is.appeased Appeased is.bayesfactor Logical Check of Classes is.bayesian Logical Check of a Bayesian Model is.constant Logical Check of a Constant is.constrained Logical Check of Constraints is.data Logical Check of Data is.model Logical Check of a Model is.proper Logical Check of Propriety is.stationary Logical Check of Stationarity joint.density.plot Joint Density Plot joint.pr.plot Joint Probability Region Plot logit The logit and inverse-logit functions p.interval Probability Interval plot.bmk Plot Hellinger Distances plot.demonoid Plot samples from the output of Laplace's Demon plot.demonoid.ppc Plots of Posterior Predictive Checks plot.importance Plot Variable Importance plot.iterquad Plot the output of 'IterativeQuadrature' plot.iterquad.ppc Plots of Posterior Predictive Checks plot.juxtapose Plot MCMC Juxtaposition plot.laplace Plot the output of 'LaplaceApproximation' plot.laplace.ppc Plots of Posterior Predictive Checks plot.miss Plot samples from the output of MISS plot.pmc Plot samples from the output of PMC plot.pmc.ppc Plots of Posterior Predictive Checks plot.vb Plot the output of 'VariationalBayes' plot.vb.ppc Plots of Posterior Predictive Checks plotMatrix Plot a Numerical Matrix plotSamples Plot Samples predict.demonoid Posterior Predictive Checks predict.iterquad Posterior Predictive Checks predict.laplace Posterior Predictive Checks predict.pmc Posterior Predictive Checks predict.vb 
Posterior Predictive Checks print.demonoid Print an object of class 'demonoid' to the screen. print.heidelberger Print an object of class 'heidelberger' to the screen. print.iterquad Print an object of class 'iterquad' to the screen. print.laplace Print an object of class 'laplace' to the screen. print.miss Print an object of class 'miss' to the screen. print.pmc Print an object of class 'pmc' to the screen. print.raftery Print an object of class 'raftery' to the screen. print.vb Print an object of class 'vb' to the screen. server_Listening Server Listening summary.demonoid.ppc Posterior Predictive Check Summary summary.iterquad.ppc Posterior Predictive Check Summary summary.laplace.ppc Posterior Predictive Check Summary summary.miss MISS Summary summary.pmc.ppc Posterior Predictive Check Summary summary.vb.ppc Posterior Predictive Check Summary
The goal of LaplacesDemon, often referred to as LD, is to provide a complete and self-contained Bayesian environment within R. For example, this package includes dozens of MCMC algorithms, Laplace Approximation, iterative quadrature, variational Bayes, parallelization, big data, PMC, over 100 examples in the “Examples” vignette, dozens of additional probability distributions, numerous MCMC diagnostics, Bayes factors, posterior predictive checks, a variety of plots, elicitation, parameter and variable importance, Bayesian forms of test statistics (such as Durbin-Watson, Jarque-Bera, etc.), validation, and numerous additional utility functions, such as functions for multimodality, matrices, or timing your model specification. Other vignettes include an introduction to Bayesian inference, as well as a tutorial.
No further development of this package is currently being done as the original maintainer has stopped working on the package. Contributions to this package are welcome at https://github.com/LaplacesDemonR/LaplacesDemon.
The main function in this package is the LaplacesDemon
function, and the best place to start is probably with the
LaplacesDemon Tutorial vignette.
This function performs multiple imputation (MI) with the Approximate Bayesian Bootstrap (ABB) of Rubin and Schenker (1986).
ABB(X, K=1)
X |
This is a vector or matrix of data that must include both
observed and missing values. When |
K |
This is the number of imputations. |
The Approximate Bayesian Bootstrap (ABB) is a modified form of the
BayesianBootstrap
(Rubin, 1981) that is used for
multiple imputation (MI). Imputation is a family of statistical
methods for replacing missing values with estimates. Introduced by
Rubin and Schenker (1986) and Rubin (1987), MI is a family of
imputation methods that includes multiple estimates, and therefore
includes variability of the estimates.
The data, X, are assumed to be independent and identically distributed (IID), contain both observed and missing values, and its missing values are assumed to be ignorable (meaning enough information is available in the data that the missingness mechanism can be ignored, if the information is used properly) and Missing Completely At Random (MCAR). When
ABB
is used in
conjunction with a propensity score (described below), missing values
may be Missing At Random (MAR).
ABB does not add auxiliary information, but performs imputation with two sampling (with replacement) steps. First, \(\textbf{X}^\star_{obs}\) is sampled from \(\textbf{X}_{obs}\). Then, \(\textbf{X}^\star_{mis}\) is sampled from \(\textbf{X}^\star_{obs}\). The result is a sample of the posterior predictive distribution of \((\textbf{X}_{mis} | \textbf{X}_{obs})\). The first sampling step is also known as hotdeck imputation, and the second sampling step changes the variance. Since auxiliary information is not included, ABB is appropriate for missing values that are ignorable and MCAR.
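As a rough, hedged illustration of these two steps (not the internal code of ABB), consider a single hypothetical vector x with missing values:

#Illustrative only; use ABB() in practice. x is a hypothetical vector.
x <- c(2.1, NA, 3.7, 5.0, NA, 4.2)
x.obs <- x[!is.na(x)]
n.mis <- sum(is.na(x))
x.obs.star <- sample(x.obs, length(x.obs), replace=TRUE) #Step 1: resample observed values
x.mis.star <- sample(x.obs.star, n.mis, replace=TRUE)    #Step 2: draw imputations
x.imp <- x
x.imp[is.na(x)] <- x.mis.star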
Auxiliary information may be included in the process of imputation by
introducing a propensity score (Rosenbaum and Rubin, 1983; Rosenbaum
and Rubin, 1984), which is an estimate of the probability of
missingness. The propensity score is often the result of a binary
logit model, where missingness is predicted as a function of other
variables. The propensity scores are discretized into quantile-based
groups, usually quintiles. Each quintile must have both observed and
missing values. ABB
is applied to each quintile. This is called
within-class imputation. It is assumed that the missingness mechanism depends only on the variables used to estimate the propensity score.
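A hedged sketch of within-class imputation follows, where p.score stands in for fitted probabilities of missingness from a hypothetical binary logit model:

library(LaplacesDemon)
set.seed(1)
x <- rnorm(100)
x[sample(100, 15)] <- NA
p.score <- runif(100) #Stand-in for fitted missingness probabilities
grp <- cut(p.score, quantile(p.score, 0:5/5), include.lowest=TRUE, labels=FALSE)
x.imp <- x
for (g in 1:5) {
     sub <- x[grp == g]
     if (any(is.na(sub)) && any(!is.na(sub))) {
          sub[is.na(sub)] <- unlist(ABB(sub, K=1)) #ABB within each quintile
          x.imp[grp == g] <- sub}}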
With K=1, ABB may be used in MCMC, such as in LaplacesDemon, more commonly along with a propensity score for missingness. MI is performed, despite K=1, because imputation occurs at each MCMC iteration. The practical advantage of this form of imputation is the ease with which it may be implemented. For example, full-likelihood imputation should perform better, but requires a chain to be updated for each missing value.
An example of a limitation of ABB
with propensity scores is to
consider imputing missing values of income from age in a context where
age and income have a positive relationship, and where the highest
incomes are missing systematically. ABB
with propensity scores
should impute these highest missing incomes given the highest observed
ages, but is unable to infer beyond the observed data.
ABB has been extended (Parzen et al., 2005) to reduce bias, by
introducing a correction factor that is applied to the MI variance
estimate. This correction may be applied to output from ABB
.
This function returns a list with K components, one for each set
of imputations. Each component contains a vector of imputations equal
in length to the number of missing values in the data.
ABB
does not currently return the mean of the imputations, or
the between-imputation variance or within-imputation variance.
Statisticat, LLC [email protected]
Parzen, M., Lipsitz, S.R., and Fitzmaurice, G.M. (2005). "A Note on Reducing the Bias of the Approximate Bayesian Bootstrap Imputation Variance Estimator". Biometrika, 92, 4, p. 971–974.
Rosenbaum, P.R. and Rubin, D.B. (1983). "The Central Role of the Propensity Score in Observational Studies for Causal Effects". Biometrika, 70, p. 41–55.
Rosenbaum, P.R. and Rubin, D.B. (1984). "Reducing Bias in Observational Studies Using Subclassification in the Propensity Score". Journal of the American Statistical Association, 79, p. 516–524.
Rubin, D.B. (1981). "The Bayesian Bootstrap". Annals of Statistics, 9, p. 130–134.
Rubin, D.B. (1987). "Multiple Imputation for Nonresponse in Surveys". John Wiley and Sons: New York, NY.
Rubin, D.B. and Schenker, N. (1986). "Multiple Imputation for Interval Estimation from Simple Random Samples with Ignorable Nonresponse". Journal of the American Statistical Association, 81, p. 366–374.
BayesianBootstrap
,
LaplacesDemon
, and
MISS
.
library(LaplacesDemon)
### Create Data
J <- 10 #Number of variables
m <- 20 #Number of missings
N <- 50 #Number of records
mu <- runif(J, 0, 100)
sigma <- runif(J, 0, 100)
X <- matrix(0, N, J)
for (j in 1:J) X[,j] <- rnorm(N, mu[j], sigma[j])
### Create Missing Values
M1 <- rep(0, N*J)
M2 <- sample(N*J, m)
M1[M2] <- 1
M <- matrix(M1, N, J)
X <- ifelse(M == 1, NA, X)
### Approximate Bayesian Bootstrap
imp <- ABB(X, K=1)
### Replace Missing Values in X (when K=1)
X.imp <- X
X.imp[which(is.na(X.imp))] <- unlist(imp)
X.imp
The AcceptanceRate
function calculates the acceptance rate per
chain from a matrix of posterior MCMC samples.
AcceptanceRate(x)
x |
This required argument accepts a |
The acceptance rate of an MCMC algorithm is the percentage of iterations in which the proposals were accepted.
Optimal Acceptance Rates
The optimal acceptance rate varies with the number of parameters and
by algorithm. Algorithms with componentwise Gaussian proposals have an
optimal acceptance rate of 0.44, regardless of the number of
parameters. Algorithms that update with multivariate Gaussian
proposals tend to have an optimal acceptance rate that ranges from
0.44 for one parameter (one IID Gaussian target distribution) to 0.234
for an infinite number of parameters (IID Gaussian target
distributions), and 0.234 is approached quickly as the number of
parameters increases. The AHMC, HMC, and THMC algorithms have an
optimal acceptance rate of 0.67, except with the algorithm
specification L=1
, where the optimal acceptance rate is
0.574. The target acceptance rate is specified in HMCDA and NUTS, and
the recommended rate is 0.65 and 0.60 respectively. Some algorithms
have an acceptance rate of 1, such as AGG, ESS, GG, GS (MISS only),
SGLD, or Slice.
Global and Local Acceptance Rates
LaplacesDemon
reports the global acceptance rate for the
un-thinned chains. However, componentwise algorithms make a proposal
per parameter, and therefore have a local acceptance rate for each
parameter. Since only the global acceptance rate is reported, the
AcceptanceRate
function may be used to calculate the local
acceptance rates from a matrix of un-thinned posterior samples.
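As a hedged illustration of what a local acceptance rate measures (this mirrors, but is not copied from, the internal calculation), a per-parameter rate can be approximated as the proportion of iterations in which a chain changed value:

library(LaplacesDemon)
local.rate <- function(x) apply(x, 2, function(col) mean(diff(col) != 0))
x <- matrix(rnorm(5000), 1000, 5) #Toy "posterior samples"
local.rate(x)     #Near 1 for continuous random draws
AcceptanceRate(x) #Compare with the package function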
Thinning
Thinned samples tend to have higher local acceptance rates than
un-thinned samples. With enough samples and enough thinning, local
acceptance rates approach 1. Local acceptance rates do not need to
approach the optimal acceptance rates above. Conversely, local
acceptance rates do not need to approach 1, because too much
information may possibly be discarded by thinning. For more
information on thinning, see the Thin
function.
Diagnostics
The AcceptanceRate
function may be used to calculate local
acceptance rates on a matrix of thinned or un-thinned samples. Any
chain with a local acceptance rate that is an outlier may be studied
for reasons that may cause the outlier. A local acceptance rate
outlier does not violate theory and is often acceptable, but may
indicate a potential problem. Only some of the many potential problems
include: identifiability, model misspecification, multicollinearity,
multimodality, choice of prior distributions, or becoming trapped in a
low-probability space. The solution to local acceptance rate outliers
tends to be either changing the MCMC algorithm or re-specifying the
model or priors. For example, an MCMC algorithm that makes
multivariate Gaussian proposals for a large number of parameters may
have low global and local acceptance rates when far from the target
distributions.
The AcceptanceRate
function returns a vector of acceptance
rates, one for each chain.
Statisticat, LLC. [email protected]
LaplacesDemon
,
MISS
,
PosteriorChecks
, and
Thin
.
library(LaplacesDemon)
AcceptanceRate(matrix(rnorm(5000),1000,5))
This function returns the most recent covariance matrix or a list of
blocking covariance matrices from an object of class demonoid
,
the most recent covariance matrix from iterquad
,
laplace
, or vb
, the most recent covariance matrix from
the chain with the lowest deviance in an object of class
demonoid.hpc
, and a number of covariance matrices of an object
of class pmc
equal to the number of mixture components. The
returned covariance matrix or matrices are intended to be the initial
proposal covariance matrix or matrices for future updates. A variance
vector from an object of class demonoid
or demonoid.hpc
is converted to a covariance matrix.
as.covar(x)
x |
This is an object of class |
Unless it is known beforehand how many iterations are required for
iterative quadrature, Laplace Approximation, or Variational Bayes to
converge, MCMC to appear converged, or the normalized perplexity to
stabilize in PMC, multiple updates are necessary. An additional
update, however, should not begin with the same proposal covariance
matrix or matrices as the original update, because it will have to
repeat the work already accomplished. For this reason, the
as.covar
function may be used at the end of an update to change
the previous proposal covariance matrix or matrices to the latest values.
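For example, a hedged sketch of chaining two MCMC updates (Model, MyData, and Initial.Values are assumed to exist, e.g. from the LaplacesDemon examples):

#Fit1 <- LaplacesDemon(Model, Data=MyData, Initial.Values)
#Covar <- as.covar(Fit1)
#Fit2 <- LaplacesDemon(Model, Data=MyData, as.initial.values(Fit1),
#     Covar=Covar)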
The as.covar
function is most helpful with objects of class
pmc
that have multiple mixture components. For more
information, see PMC
.
The returned value is a matrix (or array in the case of PMC with multiple mixture components) of the latest observed or proposal covariance, which may now be used as an initial proposal covariance matrix or matrices for a future update.
Statisticat, LLC [email protected]
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
LaplacesDemon.hpc
,
PMC
, and
VariationalBayes
.
This function returns the most recent posterior samples from an object
of class demonoid
or demonoid.hpc
, the posterior means
of an object of class iterquad
, the posterior modes of an
object of class laplace
or vb
, the posterior means of an
object of class pmc
with one mixture component, or the latest
means of the importance sampling distribution of an object of class
pmc
with multiple mixture components. The returned values are
intended to be the initial values for future updates.
as.initial.values(x)
x |
This is an object of class |
Unless it is known beforehand how many iterations are required for
IterativeQuadrature
, LaplaceApproximation
,
or VariationalBayes
to converge, MCMC in
LaplacesDemon
to appear converged, or the normalized
perplexity to stabilize in PMC
, multiple updates are
necessary. An additional update, however, should not begin with the
same initial values as the original update, because it will have to
repeat the work already accomplished. For this reason, the
as.initial.values
function may be used at the end of an update
to change the previous initial values to the latest values.
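With parallel chains, a hedged sketch of the recommended order is to extract initial values before combining (FitHPC is assumed to be an existing object of class demonoid.hpc; see ?Combine for its exact arguments):

#Initial.Values <- as.initial.values(FitHPC) #One row of values per chain
#Fit <- Combine(FitHPC, Data=MyData)         #Now of class demonoid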
When using LaplacesDemon.hpc
, as.initial.values
should be used when the output is of class demonoid.hpc
, before
the Combine
function is used to combine the multiple
chains for use with Consort
and other functions, because
the Combine
function returns an object of class
demonoid
, and the number of chains will become unknown. The
Consort
function may suggest using
as.initial.values
, but when applied to an object of class
demonoid
, it will return the latest values as if there were
only one chain.
The returned value is a vector (or matrix in the case of an object of
class demonoid.hpc
, or pmc
with multiple mixture
components) of the latest values, which may now be used as initial
values for a future update.
Statisticat, LLC. [email protected]
Combine
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
LaplacesDemon.hpc
,
PMC
, and
VariationalBayes
.
This function creates a vector of parameter names from a list of parameters, and the list may contain any combination of scalars, vectors, matrices, upper-triangular matrices, and arrays.
as.parm.names(x, uppertri=NULL)
x |
This required argument is a list of named parameters. The list may contain scalars, vectors, matrices, and arrays. The value of the named parameters does not matter here, though they are usually set to zero. However, if a missing value occurs, then the associated element is omitted in the output. |
uppertri |
This optional argument must be a vector with a length
equal to the number of named parameters. Each element in
|
Each model
function for IterativeQuadrature
,
LaplaceApproximation
, LaplacesDemon
,
PMC
, or VariationalBayes
requires a vector
of parameters (specified at first as Initial.Values
) and a list
of data. One component in the list of data must be named
parm.names
. Each element of parm.names
is a name
associated with the corresponding parameter in Initial.Values
.
The parm.names
vector is easy to program explicitly for a simple
model, but can require considerably more programming effort for more
complicated models. The as.parm.names
function is a utility
function designed to minimize programming by the user.
For example, a simple model may only require parm.names <-
c("alpha", "beta[1]", "beta[2]", "sigma")
. A more complicated model
may contain hundreds of parameters that are a combination of scalars,
vectors, matrices, upper-triangular matrices, and arrays, and is the
reason for the as.parm.names
function. The code for the above
is as.parm.names(list(alpha=0, beta=rep(0,2), sigma=0))
.
In the case of an upper-triangular matrix, simply pass the full matrix
to as.parm.names
and indicate that only the upper-triangular
will be used via the uppertri
argument. For example,
as.parm.names(list(beta=rep(0,J),U=diag(K)), uppertri=c(0,1))
creates parameter names for a vector of parameters of length J and an upper-triangular matrix U of dimension K.
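For instance, the call above with J=2 and K=2 might be expected to return names along these lines (the exact output shown in the comment is illustrative):

library(LaplacesDemon)
as.parm.names(list(beta=rep(0,2), U=diag(2)), uppertri=c(0,1))
#Expected along the lines of: "beta[1]" "beta[2]" "U[1,1]" "U[1,2]" "U[2,2]"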
Numerous examples may be found in the accompanying “Examples” vignette.
This function returns a vector of parameter names.
Statisticat, LLC. [email protected]
IterativeQuadrature
LaplaceApproximation
,
LaplacesDemon
,
PMC
, and
VariationalBayes
.
library(LaplacesDemon)
N <- 100
J <- 5
y <- rnorm(N,0,1)
X <- matrix(runif(N*J,-2,2),N,J)
S <- diag(J)
T <- diag(2)
mon.names <- c("LP","sigma")
parm.names <- as.parm.names(list(log.sigma=0, beta=rep(0,J), S=diag(J),
     T=diag(2)), uppertri=c(0,0,0,1))
MyData <- list(J=J, N=N, S=S, T=T, X=X, mon.names=mon.names,
     parm.names=parm.names, y=y)
MyData
This function converts an object of class demonoid.val
to an
object of class demonoid.ppc
.
as.ppc(x, set=3)
x |
This is an object of class |
set |
This is an integer that indicates which list component is
to be used. When |
After using the Validate
function for holdout
validation, it is often suggested to perform posterior predictive
checks. The as.ppc
function converts the output object of
Validate
, which is an object of class
demonoid.val
, to an object of class demonoid.ppc
. The
returned object is the same as if it were created with the
predict.demonoid
function, rather than the
Validate
function.
After this conversion, the user may use posterior predictive checks,
as usual, with the summary.demonoid.ppc
function.
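A hedged sketch of the workflow (the Validate call is abbreviated; see ?Validate for its required arguments):

#V <- Validate(...)      #Holdout validation; returns class demonoid.val
#ppc <- as.ppc(V, set=3)
#summary(ppc)            #Dispatches to summary.demonoid.ppc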
The returned object is an object of class demonoid.ppc
.
Statisticat, LLC. [email protected]
predict.demonoid
,
summary.demonoid.ppc
, and
Validate
.
This function calculates Bayes factors for two or more fitted objects
of class demonoid
, iterquad
, laplace
, pmc
,
or vb
that were estimated respectively with the
LaplacesDemon
, IterativeQuadrature
,
LaplaceApproximation
, PMC
, or
VariationalBayes
functions, and indicates the strength
of evidence in favor of the hypothesis (that each model, \(\mathcal{M}_i\), is better than another model, \(\mathcal{M}_j\)).
BayesFactor(x)
x |
This is a list of two or more fitted objects of class
|
Introduced by Harold Jeffreys, a 'Bayes factor' is a Bayesian
alternative to frequentist hypothesis testing that is most often used
for the comparison of multiple models by hypothesis testing, usually
to determine which model better fits the data (Jeffreys, 1961). Bayes
factors are notoriously difficult to compute, and the Bayes factor is
only defined when the marginal density of the data, y, under
each model is proper (see
is.proper
). However, the Bayes
factor is easy to approximate with the Laplace-Metropolis estimator
(Lewis and Raftery, 1997) and other methods of approximating the
logarithm of the marginal likelihood (for more information, see
LML
).
Hypothesis testing with Bayes factors is more robust than frequentist hypothesis testing, since the Bayesian form avoids model selection bias, evaluates evidence in favor of the null hypothesis, includes model uncertainty, and allows non-nested models to be compared (though of course the model must have the same dependent variable). Also, frequentist significance tests become biased in favor of rejecting the null hypothesis with sufficiently large sample size.
The Bayes factor for comparing two models may be approximated as the ratio of the marginal likelihood of the data in model 1 and model 2. Formally, the Bayes factor in this case is

\[B = \frac{p(\textbf{y} | \mathcal{M}_1)}{p(\textbf{y} | \mathcal{M}_2)}\]

where \(p(\textbf{y} | \mathcal{M}_1)\) is the marginal likelihood of the data in model 1.
The IterativeQuadrature
,
LaplaceApproximation
, LaplacesDemon
,
PMC
, and VariationalBayes
functions each
return the LML
, the approximate logarithm of the
marginal likelihood of the data, in each fitted object of class
iterquad
, laplace
, demonoid
, pmc
, or
vb
. The BayesFactor
function calculates matrix B
,
a matrix of Bayes factors, where each element of matrix B
is a
comparison of two models. Each Bayes factor is calculated as the
exponentiated difference of LML
of model 1 (\(\mathcal{M}_1\)) and LML of model 2 (\(\mathcal{M}_2\)), and the hypothesis for each element of
matrix
B
is that the model associated with the row is greater
than the model associated with the column. For example, element
B[3,2]
is the Bayes factor that model 3 is greater than model
2. The 'Strength of Evidence' aids in the interpretation (Jeffreys,
1961).
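As a small numeric sketch with hypothetical LML values:

LML1 <- -43.2 #Hypothetical log marginal likelihood of model 1
LML2 <- -45.9 #Hypothetical log marginal likelihood of model 2
B12 <- exp(LML1 - LML2) #About 14.9, evidence in favor of model 1
log(B12)                #Logged Bayes factor, symmetric about zero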
A table for the interpretation of the strength of evidence for Bayes factors is available at https://web.archive.org/web/20150214194051/http://www.bayesian-inference.com/bayesfactors.
Each Bayes factor, B
, is the posterior odds in favor of the
hypothesis divided by the prior odds in favor of the hypothesis, where
the hypothesis is usually \(\mathcal{M}_i > \mathcal{M}_j\). For example, when B[3,2]=2, the data favor \(\mathcal{M}_3\) over \(\mathcal{M}_2\) with 2:1 odds.
It is also popular to consider the natural logarithm of the Bayes factor. The scale of the logged Bayes factor is the same above and below one, which is more appropriate for visual comparisons. For example, when comparing two Bayes factors at 0.5 and 2, the logarithm of these Bayes factors is -0.69 and 0.69.
Gelman finds Bayes factors generally to be irrelevant, because they
compute the relative probabilities of the models conditional on one
of them being true. Gelman prefers approaches that measure the
distance of the data to each of the approximate models (Gelman et al.,
2004, p. 180), such as with posterior predictive checks (see the
predict.iterquad
function regarding iterative
quadrature, predict.laplace
function in the context of
Laplace Approximation, predict.demonoid
function in the
context of MCMC, predict.pmc
function in the context
of PMC, or predict.vb
function in the context of
Variational Bayes). Kass and Raftery (1995) assert this can be done
without assuming one model is the true model.
BayesFactor
returns an object of class bayesfactor
that
is a list with the following components:
B |
This is a matrix of Bayes factors. |
Hypothesis |
This is the hypothesis, and is stated as 'row > column', indicating
that the model associated with the row of an element in matrix
|
Strength.of.Evidence |
This is the strength of evidence in favor of the hypothesis. |
Posterior.Probability |
This is a vector of the posterior probability of each model, given flat priors. |
Statisticat, LLC.
Gelman, A., Carlin, J., Stern, H., and Rubin, D. (2004). "Bayesian Data Analysis, Texts in Statistical Science, 2nd ed.". Chapman and Hall, London.
Jeffreys, H. (1961). "Theory of Probability, Third Edition". Oxford University Press: Oxford, England.
Kass, R.E. and Raftery, A.E. (1995). "Bayes Factors". Journal of the American Statistical Association, 90(430), p. 773–795.
Lewis, S.M. and Raftery, A.E. (1997). "Estimating Bayes Factors via Posterior Simulation with the Laplace-Metropolis Estimator". Journal of the American Statistical Association, 92, p. 648–655.
is.bayesfactor
,
is.proper
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
LML
,
PMC
,
predict.demonoid
,
predict.iterquad
,
predict.laplace
,
predict.pmc
,
predict.vb
, and
VariationalBayes
.
# The following example fits a model as Fit1, then adds a predictor, and
# fits another model, Fit2. The two models are compared with Bayes
# factors.
library(LaplacesDemon)
############################## Demon Data ###############################
data(demonsnacks)
J <- 2
y <- log(demonsnacks$Calories)
X <- cbind(1, as.matrix(log(demonsnacks[,10]+1)))
X[,2] <- CenterScale(X[,2])
######################### Data List Preparation #########################
mon.names <- "LP"
parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0))
pos.beta <- grep("beta", parm.names)
pos.sigma <- grep("sigma", parm.names)
PGF <- function(Data) {
     beta <- rnorm(Data$J)
     sigma <- runif(1)
     return(c(beta, sigma))
     }
MyData <- list(J=J, PGF=PGF, X=X, mon.names=mon.names,
     parm.names=parm.names, pos.beta=pos.beta, pos.sigma=pos.sigma, y=y)
########################## Model Specification ##########################
Model <- function(parm, Data)
     {
     ### Parameters
     beta <- parm[Data$pos.beta]
     sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf)
     parm[Data$pos.sigma] <- sigma
     ### Log-Prior
     beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE))
     sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE)
     ### Log-Likelihood
     mu <- tcrossprod(Data$X, t(beta))
     LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE))
     ### Log-Posterior
     LP <- LL + beta.prior + sigma.prior
     Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP,
          yhat=rnorm(length(mu), mu, sigma), parm=parm)
     return(Modelout)
     }
############################ Initial Values #############################
Initial.Values <- GIV(Model, MyData, PGF=TRUE)
######################## Laplace Approximation ##########################
Fit1 <- LaplaceApproximation(Model, Initial.Values, Data=MyData,
     Iterations=10000)
Fit1
############################## Demon Data ###############################
data(demonsnacks)
J <- 3
y <- log(demonsnacks$Calories)
X <- cbind(1, as.matrix(demonsnacks[,c(7,8)]))
X[,2] <- CenterScale(X[,2])
X[,3] <- CenterScale(X[,3])
######################### Data List Preparation #########################
mon.names <- c("sigma","mu[1]")
parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0))
pos.beta <- grep("beta", parm.names)
pos.sigma <- grep("sigma", parm.names)
PGF <- function(Data) return(c(rnormv(Data$J,0,10), rhalfcauchy(1,5)))
MyData <- list(J=J, PGF=PGF, X=X, mon.names=mon.names,
     parm.names=parm.names, pos.beta=pos.beta, pos.sigma=pos.sigma, y=y)
############################ Initial Values #############################
Initial.Values <- GIV(Model, MyData, PGF=TRUE)
######################## Laplace Approximation ##########################
Fit2 <- LaplaceApproximation(Model, Initial.Values, Data=MyData,
     Iterations=10000)
Fit2
############################# Bayes Factor ##############################
Model.list <- list(M1=Fit1, M2=Fit2)
BayesFactor(Model.list)
This function performs the Bayesian bootstrap of Rubin (1981), returning either bootstrapped weights or statistics.
BayesianBootstrap(X, n=1000, Method="weights", Status=NULL)
X |
This is a vector or matrix of data. When a matrix is supplied, sampling is based on the first column. |
n |
This is the number of bootstrapped replications. |
Method |
When |
Status |
This determines the periodicity of status messages. When
|
The term, ‘bootstrap’, comes from the German novel Adventures of Baron Munchausen by Rudolph Raspe, in which the hero saves himself from drowning by pulling on his own bootstraps. The idea of the statistical bootstrap is to evaluate properties of an estimator through the empirical, rather than theoretical, CDF.
Rubin (1981) introduced the Bayesian bootstrap. In contrast to the frequentist bootstrap which simulates the sampling distribution of a statistic estimating a parameter, the Bayesian bootstrap simulates the posterior distribution.
The data, X, are assumed to be independent and identically distributed (IID), and to be a representative sample of the larger (bootstrapped) population. Given that the data has N rows in one bootstrap replication, the row weights are sampled from a Dirichlet distribution with all N concentration parameters equal to one (a uniform distribution over an open standard (N-1)-simplex). The distributions of a parameter inferred from considering many samples of weights are interpretable as posterior distributions on that parameter.
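A hedged sketch of one replication of such weights, generated here as the gaps between sorted uniform draws (equivalent to a Dirichlet draw with unit concentration parameters):

N <- 10
u <- sort(runif(N - 1))
w <- diff(c(0, u, 1)) #N nonnegative weights that sum to 1
sum(w)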
The Bayesian bootstrap is useful for estimating marginal posterior
covariance and standard deviations for the posterior modes of
LaplaceApproximation
, especially when the model
dimension (the number of parameters) is large enough that estimating
the Hessian
matrix of second partial derivatives is too
computationally demanding.
Just as with the frequentist bootstrap, inappropriate use of the Bayesian bootstrap can lead to inappropriate inferences. The Bayesian bootstrap violates the likelihood principle, because the evaluation of a statistic of interest depends on data sets other than the observed data set. For more information on the likelihood principle, see https://web.archive.org/web/20150213002158/http://www.bayesian-inference.com/likelihood#likelihoodprinciple.
The BayesianBootstrap
function has many uses, including
creating test statistics on the population data given the observed
data (supported here), imputation (with this variation:
ABB
), validation, and more.
When Method="weights"
, this function returns a
matrix of weights, where the number of rows
is equal to the number of rows in
X
.
For statistics, a matrix or array is returned, depending on the number of dimensions. The replicates are indexed by row in a matrix or in the first dimension of the array.
Bogumil Kaminski, [email protected] and Statisticat, LLC.
Rubin, D.B. (1981). "The Bayesian Bootstrap". The Annals of Statistics, 9(1), p. 130–134.
ABB
,
Hessian
,
LaplaceApproximation
, and
LaplacesDemon
.
library(LaplacesDemon)
#Example 1: Samples
x <- 1:2
BB <- BayesianBootstrap(X=x, n=100, Method="weights"); BB
#Example 2: Mean, Univariate
x <- 1:2
BB <- BayesianBootstrap(X=x, n=100, Method=weighted.mean); BB
#Example 3: Mean, Multivariate
data(demonsnacks)
BB <- BayesianBootstrap(X=demonsnacks, n=100,
     Method=function(x,w) apply(x, 2, weighted.mean, w=w)); BB
#Example 4: Correlation
dye <- c(1.15, 1.70, 1.42, 1.38, 2.80, 4.70, 4.80, 1.41, 3.90)
efp <- c(1.38, 1.72, 1.59, 1.47, 1.66, 3.45, 3.87, 1.31, 3.75)
X <- matrix(c(dye,efp), length(dye), 2)
colnames(X) <- c("dye","efp")
BB <- BayesianBootstrap(X=X, n=100,
     Method=function(x,w) cov.wt(x, w, cor=TRUE)$cor); BB
#Example 5: Marginal Posterior Covariance
#The following example is commented out due to package build time.
#To run the following example, use the code from the examples in
#the LaplaceApproximation function for the data, model specification
#function, and initial values. Then perform the Laplace
#Approximation as below (with CovEst="Identity" and sir=FALSE) until
#convergence, set the latest initial values, then use the Bayesian
#bootstrap on the data, run the Laplace Approximation again to
#convergence, save the posterior modes, and repeat until S samples
#of the posterior modes are collected. Finally, calculate the
#parameter covariance or standard deviation.
#Fit <- LaplaceApproximation(Model, Initial.Values, Data=MyData,
#     Iterations=1000, Method="SPG", CovEst="Identity", sir=FALSE)
#Initial.Values <- as.initial.values(Fit)
#S <- 100 #Number of bootstrapped sets of posterior modes (parameters)
#Z <- rbind(Fit$Summary1[,1]) #Bootstrapped parameters collected here
#N <- nrow(MyData$X) #Number of records
#MyData.B <- MyData
#for (s in 1:S) {
#     cat("\nIter:", s, "\n")
#     BB <- BayesianBootstrap(MyData$y, n=N)
#     z <- apply(BB, 2, function(x) sample.int(N, size=1, prob=x))
#     MyData.B$y <- MyData$y[z]
#     MyData.B$X <- MyData$X[z,]
#     Fit <- LaplaceApproximation(Model, Initial.Values, Data=MyData.B,
#          Iterations=1000, Method="SPG", CovEst="Identity", sir=FALSE)
#     Z <- rbind(Z, Fit$Summary1[,1])}
#cov(Z) #Bootstrapped marginal posterior covariance
#sqrt(diag(cov(Z))) #Bootstrapped marginal posterior standard deviations
Bayes' theorem shows the relation between two conditional
probabilities that are the reverse of each other. This theorem is
named after Reverend Thomas Bayes (1702-1761), and is also referred to
as Bayes' law or Bayes' rule (Bayes and Price, 1763). Bayes' theorem
expresses the conditional probability, or ‘posterior probability’, of an event A after B is observed in terms of the prior probability of A, the prior probability of B, and the conditional probability of B given A. Bayes' theorem is
valid in all common interpretations of probability. This function
provides one of several forms of calculations that are possible with
Bayes' theorem.
BayesTheorem(PrA, PrBA)
PrA |
This required argument is the prior probability of A, or Pr(A). |
PrBA |
This required argument is the conditional probability of B given A, or Pr(B|A). |
Bayes' theorem provides an expression for the conditional probability of A given B, which is equal to

\[\Pr(A | B) = \frac{\Pr(B | A)\,\Pr(A)}{\Pr(B)}\]

For example, suppose one asks the question: what is the probability of going to Hell, conditional on consorting (or given that a person consorts) with Laplace's Demon. By replacing A with Hell and B with Consort, the question becomes

\[\Pr(\mathrm{Hell} | \mathrm{Consort}) = \frac{\Pr(\mathrm{Consort} | \mathrm{Hell})\,\Pr(\mathrm{Hell})}{\Pr(\mathrm{Consort})}\]

Note that a common fallacy is to assume that \(\Pr(A | B) = \Pr(B | A)\), which is called the conditional probability fallacy.
Another way to state Bayes' theorem (and this is the form in the provided function) is

\[\Pr(A_i | B) = \frac{\Pr(B | A_i)\,\Pr(A_i)}{\sum_j \Pr(B | A_j)\,\Pr(A_j)}\]

Let's examine our burning question, by replacing A_i with Hell or Heaven, and replacing B with Consort.
Laplace's Demon was conjured and asked for some data. He was glad to oblige.
6 people consorted out of 9 who went to Hell.
5 people consorted out of 7 who went to Heaven.
75% of the population goes to Hell.
25% of the population goes to Heaven.
Now, Bayes' theorem is applied to the data. Four pieces are worked out as follows

\[\Pr(\mathrm{Consort} | \mathrm{Hell}) = 6/9 \approx 0.666\]
\[\Pr(\mathrm{Consort} | \mathrm{Heaven}) = 5/7 \approx 0.714\]
\[\Pr(\mathrm{Hell}) = 0.75\]
\[\Pr(\mathrm{Heaven}) = 0.25\]

Finally, the desired conditional probability, \(\Pr(\mathrm{Hell} | \mathrm{Consort})\), is calculated using Bayes' theorem

\[\Pr(\mathrm{Hell} | \mathrm{Consort}) = \frac{0.666 \times 0.75}{0.666 \times 0.75 + 0.714 \times 0.25} \approx 0.737\]
The probability of someone consorting with Laplace's Demon and going to Hell is 73.7%, which is less than the prevalence of 75% in the population. According to these findings, consorting with Laplace's Demon does not increase the probability of going to Hell.
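The same arithmetic may be checked directly in R (the quantities match the example at the end of this help topic):

p.hell <- 0.75
p.heaven <- 0.25
p.consort.hell <- 6/9
p.consort.heaven <- 5/7
(p.consort.hell * p.hell) /
     (p.consort.hell * p.hell + p.consort.heaven * p.heaven) #About 0.737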
For an introduction to model-based Bayesian inference, see the accompanying vignette entitled “Bayesian Inference” or https://web.archive.org/web/20150206004608/http://www.bayesian-inference.com/bayesian.
The BayesTheorem
function returns the conditional probability
of A given B, known in Bayesian inference as the
posterior. The returned object is of class
bayestheorem
.
Statisticat, LLC.
Bayes, T. and Price, R. (1763). "An Essay Towards Solving a Problem in the Doctrine of Chances". By the late Rev. Mr. Bayes, communicated by Mr. Price, in a letter to John Canton, M.A. and F.R.S. Philosophical Transactions of the Royal Society of London, 53, p. 370–418.
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
PMC
, and
VariationalBayes
.
# Pr(Hell|Consort)
PrA <- c(0.75,0.25)
PrBA <- c(6/9, 5/7)
BayesTheorem(PrA, PrBA)
This function enables Bayesian inference with data that is too large for computer memory (RAM) with the simplest method: reading in batches of data (where each batch is a section of rows), applying a function to the batch, and combining the results.
BigData(file, nrow, ncol, size=1, Method="add", CPUs=1, Type="PSOCK", FUN, ...)
file |
This required argument accepts a path and filename that must refer to a .csv file, and that must contain only a numeric matrix without a header, row names, or column names. |
nrow |
This required argument accepts a scalar integer that indicates the number of rows in the big data matrix. |
ncol |
This required argument accepts a scalar integer that indicates the number of columns in the big data matrix. |
size |
This argument accepts a scalar integer that specifies the number of rows of each batch. The last batch is not required to have the same number of rows as the other batches. The largest possible size, and therefore the fewest number of batches, should be preferred. |
Method |
This argument accepts a scalar string, defaults to
"add", and alternatively accepts "rbind". When
|
CPUs |
This argument accepts an integer that specifies the number
of central processing units (CPUs) of the multicore computer or
computer cluster. This argument defaults to |
Type |
This argument specifies the type of parallel processing to
perform, accepting either |
FUN |
This required argument accepts a user-specified function that will be performed on each batch. The first argument in the function must be the data. |
... |
Additional arguments are used within the user-specified function. Additional arguments often refer to parameters. |
Big data is defined loosely here as data that is too large for
computer memory (RAM). The BigData
function uses the
split-apply-combine strategy with a big data set. The unmanageable
big data set is split into smaller, manageable pieces (batches),
a function is applied to each batch, and results are combined.
Each iteration, the BigData
function opens a connection to a
big data set and keeps the connection open while the scan
function reads in each batch of data (elsewhere, batches are often referred to as chunks). A user-specified function is applied to each
batch of data, the results are combined together, the connection is
closed, and the results are returned.
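A hedged, base-R sketch of this split-apply-combine idea (not the package's internal code; batch.apply is a hypothetical helper) for a headerless numeric .csv and Method="add"-style combining:

batch.apply <- function(file, nrow, ncol, size, FUN, ...) {
     con <- file(file, open="r")
     on.exit(close(con))
     total <- 0
     rows.read <- 0
     while (rows.read < nrow) {
          n.this <- min(size, nrow - rows.read)
          batch <- matrix(scan(con, sep=",", nlines=n.this, quiet=TRUE),
               n.this, ncol, byrow=TRUE)
          total <- total + FUN(batch, ...) #Combine by addition
          rows.read <- rows.read + n.this}
     return(total)}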
As an introductory example, suppose a statistician updates a linear
regression model, but the design matrix, X, is too
large for computer memory. Suppose the design matrix has 100 million
rows, and the statistician specifies
size=1e6
. The statistician
combines the dependent variable, y, with the design matrix, X. Each iteration in
IterativeQuadrature
,
LaplaceApproximation
, LaplacesDemon
,
PMC
, or VariationalBayes
, the
BigData
function sequentially reads in one million rows of the
combined data, calculates the expectation vector, mu, and finally returns the sum of the log-likelihood. The sum
of the log-likelihood is added together for all batches, and returned.
There are many limitations with this function.
This function is not fast, in the sense that the entire big data set is processed in batches, each iteration. With iterative methods, this may perform well, albeit slowly.
There are many functions that cannot be performed on batches, though most models in the Examples vignette may easily be updated with big data.
Large matrices of samples are unaddressed, only the data.
Although many (but not all) models may be estimated, many additional
functions in this package will not work when applied after the model
has updated. Instead, a batch or random sample of data (see the
read.matrix
function for sampling from big data) should
be used in the usual way, in the Data
argument, and the
Model
function coded in the usual way without the
BigData
function.
Parallel processing may be performed when the user specifies
CPUs
to be greater than one, implying that the specified number
of CPUs exists and is available. Parallelization may be performed on a
multicore computer or a computer cluster. Either a Simple Network of
Workstations (SNOW) or Message Passing Interface (MPI) is used. Each
call to BigData
establishes and closes the parallelization,
which is costly, and unfortunately results in copious output to the
console. With small data sets, parallel processing may be slower, due
to computer network communication. With larger data sets, the user
should experience a faster run-time.
There have been several alternative approaches suggested for big data.
Huang and Gelman (2005) propose that the user creates batches by sampling from big data, updating a separate Bayesian model on each batch, and combining the results into a consensus posterior. This many-mini-model approach may be faster when feasible, because multiple models may be updated in parallel, say one per CPU. Such results will work with all functions in this package. With the many-mini-model approach, several methods are proposed for combining posterior samples from batch-level models, such as by using a normal approximation, updating from prior to posterior sequentially (the posterior from the last batch becomes the prior of the next batch), sample from the full posterior via importance sampling from the batched posteriors, and more.
Scott et al. (2013) propose a method that they call Consensus Monte Carlo, which consists of breaking the data down into chunks, calling each chunk a shard, and use a many-mini-model approach as well, but propose their own method of weighting the posteriors back together.
Balakrishnan and Madigan (2006) introduced a Sequential Monte Carlo (SMC) sampler, a refinement of an earlier proposal, that was designed for big data. It makes one pass through the massive data set, after an initial MCMC estimation on a small sample. Each particle is updated for each record, resulting in numerous evaluations per record.
Welling and Teh (2011) proposed a new class of MCMC sampler in which
only a random sample of big data is used each iteration. The
stochastic gradient Langevin dynamics (SGLD) algorithm is available
in the LaplacesDemon
function.
An important alternative to consider is using the ff
package,
where "ff" stands for fast access file. The ff
package has been
tested successfully with updating a model in LaplacesDemon
.
Once the big data set, say X, is an object of
class
ff_matrix
, simply include it in the list of data as
usual, and modify the Model
specification function
appropriately. For example, change mu <- tcrossprod(X, t(beta))
to mu <- tcrossprod(X[], t(beta))
. The ff
package is
not included as a dependency in the LaplacesDemon
package, so
it must be installed and activated.
The BigData
function returns output that is the result of
performing a user-specified function on batches of big data. Output is
a matrix, and may have one or more column vectors.
Statisticat, LLC [email protected]
Balakrishnan, S. and Madigan, D. (2006). "A One-Pass Sequential Monte Carlo Method for Bayesian Analysis of Massive Datasets". Bayesian Analysis, 1(2), p. 345–362.
Huang, Z. and Gelman, A. (2005) "Sampling for Bayesian Computation with Large Datasets". SSRN eLibrary.
Scott, S.L., Blocker, A.W. and Bonassi, F.V. (2013). "Bayes and Big Data: The Consensus Monte Carlo Algorithm". In Bayes 250.
Welling, M. and Teh, Y.W. (2011). "Bayesian Learning via Stochastic Gradient Langevin Dynamics". Proceedings of the 28th International Conference on Machine Learning (ICML), p. 681–688.
IterativeQuadrature, LaplaceApproximation, LaplacesDemon, LaplacesDemon.RAM, PMC, PMC.RAM, read.matrix, and VariationalBayes.
### Below is an example of a linear regression model specification
### function in which BigData reads in a batch of 1,000 records of
### Data$N records from a data set that is too large to fully open
### in memory. The example simulates on 10,000 records, which is
### not big data; it's just a toy example. The data set is file X.csv,
### and the first column of matrix X is the dependent variable y. The
### user supplies a function to BigData along with parameters beta and
### sigma. When each batch of 1,000 records is read in,
### mu = XB is calculated, and then the LL is calculated as
### y ~ N(mu, sigma^2). These results are added together from all
### batches, and returned as LL.
library(LaplacesDemon)
N <- 10000
J <- 10 #Number of predictors, including the intercept
X <- matrix(1,N,J)
for (j in 2:J) {X[,j] <- rnorm(N,runif(1,-3,3),runif(1,0.1,1))}
beta.orig <- runif(J,-3,3)
e <- rnorm(N,0,0.1)
y <- as.vector(tcrossprod(beta.orig, X) + e)
mon.names <- c("LP","sigma")
parm.names <- as.parm.names(list(beta=rep(0,J), log.sigma=0))
PGF <- function(Data) return(c(rnormv(Data$J,0,0.01),
     log(rhalfcauchy(1,1))))
MyData <- list(J=J, PGF=PGF, N=N, mon.names=mon.names,
     parm.names=parm.names) #Notice that X and y are not included here
filename <- tempfile("X.csv")
write.table(cbind(y,X), filename, sep=",", row.names=FALSE,
     col.names=FALSE)
Model <- function(parm, Data)
     {
     ### Parameters
     beta <- parm[1:Data$J]
     sigma <- exp(parm[Data$J+1])
     ### Log(Prior Densities)
     beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE))
     sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE)
     ### Log-Likelihood
     LL <- BigData(file=filename, nrow=Data$N, ncol=Data$J+1, size=1000,
          Method="add", CPUs=1, Type="PSOCK",
          FUN=function(x, beta, sigma) sum(dnorm(x[,1], tcrossprod(x[,-1],
          t(beta)), sigma, log=TRUE)), beta, sigma)
     ### Log-Posterior
     LP <- LL + beta.prior + sigma.prior
     Modelout <- list(LP=LP, Dev=-2*LL, Monitor=c(LP,sigma),
          yhat=0, #rnorm(length(mu), mu, sigma),
          parm=parm)
     return(Modelout)
     }
### From here, the user may update the model as usual.
The Blocks
function returns a list of blocks of
parameters, for use with some MCMC algorithms in the
LaplacesDemon
function. Blocks may be created either
sequentially, or from a hierarchical clustering of the posterior
correlation matrix.
Blocks(Initial.Values, N, PostCor=NULL)
Initial.Values |
This required argument is a vector of initial values. |
N |
This optional argument indicates the desired number of
blocks. If omitted, then the truncated square root of the number of
initial values is used. If a posterior correlation matrix is
supplied to |
PostCor |
This optional argument defaults to |
Usually, there is more than one target distribution in MCMC, in which case it must be determined whether it is best to sample from target distributions individually, in groups, or all at once. Blockwise sampling (also called block updating) refers to splitting a multivariate vector into groups called blocks, and each block is sampled separately. A block may contain one or more parameters.
Parameters are usually grouped into blocks such that parameters within
a block are as correlated as possible, and parameters between blocks
are as independent as possible. This strategy retains as much of the
parameter correlation as possible for blockwise sampling, as opposed
to componentwise sampling where parameter correlation is ignored.
The PosteriorChecks
function can be used on the output
of previous runs to find highly correlated parameters. See examples
below.
Advantages of blockwise sampling are that a different MCMC algorithm may be used for each block (or parameter, for that matter), creating a more specialized approach (though different algorithms by block are not supported here), the acceptance of a newly proposed state is likely to be higher than sampling from all target distributions at once in high dimensions, and large proposal covariance matrices can be reduced in size, which is most helpful again in high dimensions.
Disadvantages of blockwise sampling are that correlations probably exist between parameters between blocks, and each block is updated while holding the other blocks constant, ignoring these correlations of parameters between blocks. Without simultaneously taking everything into account, the algorithm may converge slowly or never arrive at the proper solution. However, there are instances when it may be best when everything is not taken into account at once, such as in state-space models. Also, as the number of blocks increases, more computation is required, which slows the algorithm. In general, blockwise sampling allows a more specialized approach at the expense of accuracy, generalization, and speed. Blockwise sampling is offered in the following algorithms: Adaptive-Mixture Metropolis (AMM), Adaptive Metropolis-within-Gibbs (AMWG), Automated Factor Slice Sampler (AFSS), Elliptical Slice Sampler (ESS), Hit-And-Run Metropolis (HARM), Metropolis-within-Gibbs (MWG), Random-Walk Metropolis (RWM), Robust Adaptive Metropolis (RAM), Slice Sampler (Slice), and the Univariate Eigenvector Slice Sampler (UESS).
Large-dimensional models often require blockwise sampling. For
example, with thousands of parameters, a componentwise algorithm
must evaluate the model specification function once per parameter per
iteration, resulting in an algorithm that may take longer than is
acceptable to produce samples. Algorithms that require derivatives,
such as the family of Hamiltonian Monte Carlo (HMC), require even more
evaluations of the model specification function per iteration, and
quickly become too costly in large dimensions. Finally, algorithms
with multivariate proposals often have difficulty producing an
accepted proposal in large-dimensional models. The most practical solution is to group parameters into blocks, so that each iteration the algorithm evaluates the model specification function once per block, each time with a reduced set of parameters.
The Blocks
function performs either a sequential assignment of
parameters to blocks when posterior correlation is not supplied, or
uses hierarchical clustering to create blocks based on posterior
correlation. If posterior correlation is supplied, then the user may specify a range for the number of blocks to consider, and the optimal number of blocks is taken to be the one that maximizes the mean silhouette width across the candidate clusterings. Silhouette width is calculated as in the cluster package. Hierarchical
clustering is performed on the distance matrix calculated from the
dissimilarity matrix (1 - abs(PostCor)) of the posterior correlation
matrix. With sequential assignment, the number of parameters per
block is approximately equal. With hierarchical clustering, the
number of parameters per block may vary widely. Creating blocks from
hierarchical clustering performs well in practice, though there are
many alternative methods the user may consider outside of this
function, such as using factor analysis, model-based clustering, or
other methods.
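The clustering step can be sketched outside of the Blocks function with base R. This is only a rough illustration of the dissimilarity-and-cluster idea described above, not the Blocks internals; the simulated samples and the fixed cut at three clusters are assumptions for the example.
## Rough sketch: cluster parameters on 1 - abs(posterior correlation).
set.seed(1)
P <- matrix(rnorm(1000*6), 1000, 6)       #stand-in posterior samples
P[,2] <- P[,1] + rnorm(1000, 0, 0.2)      #make parameters 1-2 correlated
P[,5] <- P[,4] + rnorm(1000, 0, 0.2)      #make parameters 4-5 correlated
rho <- cor(P)
D <- as.dist(1 - abs(rho))                #dissimilarity matrix
fit <- hclust(D)                          #hierarchical clustering
split(seq_len(ncol(P)), cutree(fit, k=3)) #parameter indices per block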
Aside from sequentially-assigned blocks, or blocks based on posterior
correlation, it is also common to group parameters with similar uses,
such as putting regression effects parameters into one block, and
autocorrelation parameters into another block. Another popular way to
group parameters into blocks is by time-period for some time-series
models. These alternative blocking strategies are unsupported in the
Blocks
function, and best left to user discretion.
Some MCMC algorithms that accept blocked parameters also require blocked variance-covariance matrices. The Blocks function does not return these matrices, because they may not be necessary, and when they are, the user may prefer identity matrices, scaled identity matrices, or matrices with explicitly-defined elements.
If the user is looking for a place to begin with blockwise sampling,
then the recommended, default approach (when blocked parameters by
time-period are not desired in a time-series) is to begin with a
trial run of the adaptive, unblocked HARM algorithm (since covariance
matrices are not required) for the purposes of obtaining a posterior
correlation matrix. Next, create blocks with the Blocks
function based on the posterior correlation matrix obtained from the
trial run. Finally, run the desired, blocked algorithm with the newly
created blocks (and possibly user-specified covariance matrices),
beginning where the trial run ended.
If hierarchical clustering is used, then it is important to note that hierarchical clustering has no idea that the user intends to perform blockwise sampling in MCMC. If hierarchical clustering returns numerous small blocks, then the user may consider combining some or all of those blocks. For example, if several 1-parameter blocks are returned, then blockwise sampling will equal componentwise sampling for those blocks, which will iterate slower. Conversely, if hierarchical clustering returns one or more big blocks, each with enough parameters that multivariate sampling will have difficulty getting an accepted proposal, or an accepted proposal that moves more than a small amount, then the user may consider subdividing these big blocks into smaller, more manageable blocks, though with the understanding that more posterior correlation is unaccounted for.
The Blocks
function returns an object of class blocks
,
which is a list. Each component of the list is a block of parameters,
and parameters are indicated by their position in the initial values
vector.
Statisticat, LLC. [email protected]
LaplacesDemon and PosteriorChecks.
library(LaplacesDemon)
### Create the default number of sequentially assigned blocks:
Initial.Values <- rep(0,1000)
MyBlocks <- Blocks(Initial.Values)
MyBlocks
### Or, a pre-specified number of sequentially assigned blocks:
#Initial.Values <- rep(0,1000)
#MyBlocks <- Blocks(Initial.Values, N=20)
### If scaled diagonal covariance matrices are desired:
#VarCov <- list()
#for (i in 1:length(MyBlocks))
#     VarCov[[i]] <- diag(length(MyBlocks[[i]]))*2.38^2/length(MyBlocks[[i]])
### Or, determine the number of blocks in the range of 2 to 50 from
### hierarchical clustering on the posterior correlation matrix of an
### object, say called Fit, output from LaplacesDemon:
#MyBlocks <- Blocks(Initial.Values, N=c(2,50),
#     PostCor=cor(Fit$Posterior1))
#lapply(MyBlocks, length) #See the number of parameters per block
### Or, create a pre-specified number of blocks from hierarchical
### clustering on the posterior correlation matrix of an object,
### say called Fit, output from LaplacesDemon:
#MyBlocks <- Blocks(Initial.Values, N=20, PostCor=cor(Fit$Posterior1))
### Posterior correlation from a previous trial run could be obtained
### with either method below (though cor() will be fastest because
### additional checks are not calculated for the parameters):
#rho <- cor(Fit$Posterior1)
#rho <- PosteriorChecks(Fit)$Posterior.Correlation
Given a matrix of posterior samples from MCMC, the
BMK.Diagnostic
function calculates Hellinger distances between
consecutive batches for each chain. This is useful for monitoring
convergence of MCMC chains.
BMK.Diagnostic(X, batches=10)
X |
This required argument accepts a matrix of posterior
samples or an object of class |
batches |
This is the number of batches on which the convergence
diagnostic will be calculated. The |
Hellinger distance is used to quantify dissimilarity between two probability distributions. It is based on the Hellinger integral, introduced by Hellinger (1909). Traditionally, Hellinger distance is bounded on the interval [0,1], though another popular form occurs on the interval [0, sqrt(2)]. A higher value of Hellinger distance is associated with more dissimilarity between the distributions.
Convergence is assumed when Hellinger distances are below a threshold, indicating that posterior samples are similar between consecutive batches. If all Hellinger distances beyond a given batch of samples are below the threshold, then burnin is suggested to occur immediately before the first batch of satisfactory Hellinger distances.
As an aid to interpretation, consider a matrix of 1,000 posterior samples from three chains: beta[1], beta[2], and beta[3]. With 10 batches, the column names are: 100, 200, ..., 900. The Hellinger distance for chain beta[1] at column 100 is the Hellinger distance between two batches: samples 1-100 and samples 101-200.
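A discrete version of this calculation can be sketched by hand with base R; the bin count, batch size, and simulated chain below are illustrative choices, and the BMK.Diagnostic function uses its own internal estimate.
## Hellinger distance between two consecutive batches, using shared bins.
set.seed(1)
chain <- cumsum(rnorm(1000))          #a deliberately non-stationary chain
batch1 <- chain[1:100]
batch2 <- chain[101:200]
breaks <- seq(min(chain), max(chain), length.out=21)
p <- hist(batch1, breaks=breaks, plot=FALSE)$counts / length(batch1)
q <- hist(batch2, breaks=breaks, plot=FALSE)$counts / length(batch2)
H <- sqrt(max(0, 1 - sum(sqrt(p*q)))) #bounded in [0,1]; larger = more dissimilar
H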
A benefit to using BMK.Diagnostic
is that the resulting
Hellinger distances may easily be plotted with the plotMatrix
function, allowing the user to see quickly which consecutive batches
of which chains were dissimilar. This makes it easier to find
problematic chains.
The BMK.Diagnostic
is calculated automatically in the
LaplacesDemon
function, and is one of the criteria in
the Consort
function regarding the recommendation of
when to stop updating the Markov chain Monte Carlo (MCMC) sampler in
LaplacesDemon
.
For more information on the related topics of burn-in and
stationarity, see the burnin
and is.stationary
functions, and the accompanying vignettes.
The BMK.Diagnostic function returns an object of class bmk that is a matrix of Hellinger distances between consecutive batches of posterior samples, with one row per parameter. The number of columns is equal to the number of batches minus one.
The BMK.Diagnostic
function is similar to the
bmkconverge
function in package BMK.
Boone, E.L., Merrick, J.R. and Krachey, M.J. (2013). "A Hellinger Distance Approach to MCMC Diagnostics". Journal of Statistical Computation and Simulation, in press.
Hellinger, E. (1909). "Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen" (in German). Journal für die reine und angewandte Mathematik, 136, p. 210–271.
burnin, Consort, is.stationary, and LaplacesDemon.
library(LaplacesDemon)
N <- 1000 #Number of posterior samples
J <- 10 #Number of parameters
Theta <- matrix(runif(N*J),N,J)
colnames(Theta) <- paste("beta[", 1:J, "]", sep="")
for (i in 2:N) {Theta[i,1] <- Theta[i-1,1] + rnorm(1)}
HD <- BMK.Diagnostic(Theta, batches=10)
plot(HD, title="Hellinger distance between batches")
The burnin
function estimates the duration of burn-in in
iterations for one or more Markov chains. “Burn-in” refers to the
initial portion of a Markov chain that is not stationary and is still
affected by its initial value.
burnin(x, method="BMK")
x |
This is a vector or matrix of posterior samples for which the number of burn-in iterations will be estimated. |
method |
This argument defaults to |
Burn-in is a colloquial term for the initial iterations in a Markov chain prior to its convergence to the target distribution. During burn-in, the chain is not considered to have “forgotten” its initial value.
Burn-in is not a theoretical part of MCMC, but its use is the norm because of the need to limit the number of posterior samples due to computer memory. If burn-in were retained rather than discarded, then more posterior samples would have to be retained. If a Markov chain starts anywhere close to the center of its target distribution, then burn-in iterations do not need to be discarded.
In the LaplacesDemon
function, stationarity is estimated
with the BMK.Diagnostic
function on all thinned
posterior samples of each chain, beginning at cumulative 10% intervals
relative to the total number of samples, and the lowest number in
which all chains are stationary is considered the burn-in.
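As an illustrative sketch, a chain that begins away from its target region and then settles should yield a larger burn-in estimate than pure white noise; the simulated drift and noise level below are arbitrary choices, and the estimate depends on the chosen method and its thresholds.
library(LaplacesDemon)
set.seed(42)
x <- c(seq(from=10, to=0, length.out=200) + rnorm(200, 0, 0.5),
     rnorm(800, 0, 0.5))
burnin(x)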
The term, “burn-in”, originated in electronics regarding the initial testing of component failure at the factory to eliminate initial failures (Geyer, 2011). Although “burn-in” has been the standard term for decades, some now refer to these as “warm-up” iterations.
The burnin
function returns a vector equal in length to the
number of MCMC chains in x
, and each element indicates the
maximum iteration in burn-in.
Statisticat, LLC. [email protected]
Geyer, C.J. (2011). "Introduction to Markov Chain Monte Carlo". In S Brooks, A Gelman, G Jones, and M Xiao-Li (eds.), "Handbook of Markov Chain Monte Carlo", p. 3–48. Chapman and Hall, Boca Raton, FL.
BMK.Diagnostic, deburn, Geweke.Diagnostic, KS.Diagnostic, and LaplacesDemon.
library(LaplacesDemon)
x <- rnorm(1000)
burnin(x)
A caterpillar plot is a horizontal plot of 3 quantiles of selected
distributions. This may be used to produce a caterpillar plot of
posterior samples (parameters and monitored variables) from an object
either of class demonoid
, demonoid.hpc
, iterquad
,
laplace
, pmc
, vb
, or a matrix.
caterpillar.plot(x, Parms=NULL, Title=NULL)
x |
This required argument is an object of class |
Parms |
This argument accepts a vector of quoted strings to be matched for
selecting parameters and monitored variables for plotting (though
all parameters are selected when a generic matrix is supplied). This
argument defaults to |
Title |
This argument accepts a title for the plot. |
Caterpillar plots are popular plots in Bayesian inference for
summarizing the quantiles of posterior samples. A caterpillar plot is
similar to a horizontal boxplot, though without quartiles, making it
easier for the user to study more distributions in a single plot. The
following quantiles are plotted as a line for each parameter: 0.025 and
0.975, with the exception of a generic matrix, where unimodal 95% HPD
intervals are estimated (for more information, see
p.interval
). A vertical, gray line is included at zero.
For all but class demonoid.hpc
, the median appears as a black
dot, and the quantile line is black. For class demonoid.hpc
, the
color of the median and quantile line differs by chain; the first
chain is black and additional chains appear beneath.
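Since the example below defers to the LaplacesDemon function, a minimal hedged sketch with a generic matrix may help; the simulated values and column names are arbitrary.
library(LaplacesDemon)
set.seed(1)
theta <- cbind(beta1=rnorm(1000, 0.5, 0.2), beta2=rnorm(1000, -1, 0.3),
     beta3=rnorm(1000, 0, 0.1))
caterpillar.plot(theta, Title="Simulated posterior samples")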
Statisticat, LLC. [email protected]
IterativeQuadrature, LaplaceApproximation, LaplacesDemon, LaplacesDemon.hpc, PMC, p.interval, SIR, and VariationalBayes.
#An example is provided in the LaplacesDemon function.
This function either centers and scales a continuous variable and provides options for binary variables, or returns an untransformed variable from a centered and scaled variable.
CenterScale(x, Binary="none", Inverse=FALSE, mu, sigma, Range, Min)
x |
This is a vector to be centered and scaled, or to be
untransformed if |
Binary |
This argument indicates how binary variables will be
treated, and defaults to |
Inverse |
Logical. If |
mu , sigma , Range , Min
|
These arguments are required only when
|
Gelman (2008) recommends centering and scaling continuous predictors
to facilitate MCMC convergence and enable comparisons between
coefficients of centered and scaled continuous predictors with
coefficients of untransformed binary predictors. A continuous
predictor is centered and scaled as follows: x.cs <- (x -
mean(x)) / (2*sd(x))
. This is an improvement over the usual
practice of standardizing predictors, which is x.z <- (x -
mean(x)) / sd(x)
, where coefficients cannot be validly compared
between binary and continuous predictors.
In MCMC, such as in LaplacesDemon
, a centered and
scaled predictor often results in a higher effective sample size
(ESS
), and therefore the chain mixes better. Centering
and scaling is a method of re-parameterization to improve mixing.
Griffin and Brown (2013) also assert that the user may not want to scale predictors that are measured on the same scale, since scaling in this case may increase noisy, low signals. In this case, centering (without scaling) is recommended. To center a predictor, subtract its mean.
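A one-line base-R sketch of centering without scaling (the simulated predictor is arbitrary):
x <- rnorm(100, 10, 1)
x.c <- x - mean(x) #centered only; CenterScale(x) would also divide by 2*sd(x)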
The CenterScale
function returns a centered and scaled vector,
or the untransformed vector.
Gelman, A. (2008). "Scaling Regression Inputs by Dividing by Two Standard Deviations". Statistics in Medicine, 27, p. 2865–2873.
Griffin, J.E. and Brown, P.J. (2013) "Some Priors for Sparse Regression Modelling". Bayesian Analysis, 8(3), p. 691–702.
ESS, IterativeQuadrature, LaplaceApproximation, LaplacesDemon, and PMC.
### See the LaplacesDemon function for an example in use.
library(LaplacesDemon)
x <- rnorm(100,10,1)
x.cs <- CenterScale(x)
x.orig <- CenterScale(x.cs, Inverse=TRUE, mu=mean(x), sigma=sd(x))
This function combines objects of class demonoid
.
Combine(x, Data, Thinning=1)
x |
This is a list of objects of class |
Data |
This is the data, and must be identical to the data used
to create the |
Thinning |
This is the amount of thinning to apply to the
posterior samples after appending them together. |
The purpose of the Combine
function is to enable a user to
combine objects of class demonoid
for one of three
reasons. First, parallel chains from LaplacesDemon.hpc
may be combined after convergence is assessed with
Gelman.Diagnostic
. Second, consecutive updates of single
chains from LaplacesDemon
or parallel chains from
LaplacesDemon.hpc
may be combined when the computer has
insufficient random-access memory (RAM) for the user to update once
with enough iterations. Third, consecutive single-chain or
parallel-chain updates may be combined when it seems that the
logarithm of the joint posterior distribution, LP
, seems to be
oscillating up and down, which is described in more detail below.
The most common use regards the combination of parallel chains output
from LaplacesDemon.hpc
. Typically, a user with parallel
chains examines them graphically with the
caterpillar.plot
and plot
(actually,
plot.demonoid
) functions, and assesses convergence
with the Gelman.Diagnostic
function. Thereafter, the
parallel chain output in the object of class demonoid.hpc
should be combined into a single object of class demonoid
,
before doing posterior predictive checks and making inferences. In
this case, the Thinning
argument usually is recommended to
remain at its default.
It is also common with a high-dimensional model (a model with a large
number of parameters) to need more posterior samples than allowed by
the random-access memory (RAM) of the computer. In this case, it is
best to use the LaplacesDemon.RAM
function to estimate
the amount of RAM that a given model will require with a given number
of iterations, and then update LaplacesDemon
almost as
much as RAM allows, and save the output object of class
demonoid
. Then, the user is advised to continue onward with a
consecutive update (after using as.initial.values
and
anything else appropriate to prepare for the consecutive
update). Suppose a user desires to update a gigantic model with
thousands of parameters, and with the aid of
LaplacesDemon.RAM
, estimates that they can safely update
only 100,000 iterations, and that 150,000 iterations would exceed RAM
and crash the computer. The patient user can update several
consecutive models, each retaining only 1,000 thinned posterior
samples, and combine them later with the Combine
function, by
placing multiple objects into a list, as described below. In this way,
it is possible for a user to update models that otherwise far exceed
computer RAM.
Less commonly, multiple updates of single-chain objects should be combined into a single object of class demonoid. This is most useful in complicated models that are run for large numbers of iterations, where it may be suspected that stationarity has been achieved but that thinning is insufficient, so the samples may be combined and thinned again. Followed repeatedly, this advice could continue indefinitely, and the unnormalized logarithm of the joint posterior density, LP, may seem to oscillate, improving in some updates and worsening in others. For this purpose, the prior covariance matrix of the last model is retained (rather than combining the matrices). This may be an unpleasant surprise when combining parallel updates, so be aware of it.
In these cases, which usually involve complicated models with high
autocorrelation in the chains, the user may opt to use parallel
processing with the LaplacesDemon.hpc
function, or may
use the LaplacesDemon
function as follows. The user
should save (meaning, not overwrite) each object of class
demonoid
, place multiple objects into a list, and use the
Combine
function to combine these objects.
For example, suppose a user names the object Fit, as in the
LaplacesDemon
example. Now, rather than overwriting
object Fit, object Fit is renamed, after updating a million
iterations, to Fit1. As suggested by Consort
, another
million iterations are used, but now to create object Fit2. Further
suppose this user specified Thinning=1000
in
LaplacesDemon
, meaning that the million iterations are
thinned by 1,000, so only 1,000 iterations are retained in each
object, Fit1 and Fit2. In this case, Combine
combines the
information in Fit1 and Fit2, and returns an object the user names
Fit3. Fit3 has only 1,000 iterations, which is the result of appending
the iterations in Fit1 and Fit2, and thinning by 2. If 2,000,000
iterations were updated from the beginning, and were thinned by 2,000,
then the same information exists now in Fit3. The Consort
function can now be applied to Fit3, to see if stationarity is found.
If not, then more objects of class demonoid
can be collected and
combined.
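As a hedged sketch of that workflow, assuming Fit1 and Fit2 are the consecutive updates described above and MyData is the data list used for both:
#library(LaplacesDemon)
#Fit3 <- Combine(list(Fit1, Fit2), Data=MyData, Thinning=2)
#Consort(Fit3)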
This function returns an object of class demonoid
. For more
information on an object of class demonoid
, see the
LaplacesDemon
function.
Statisticat, LLC. [email protected]
caterpillar.plot, Gelman.Diagnostic, LaplacesDemon, LaplacesDemon.hpc, and Thin.
This function provides several styles of conditional plots with base graphics.
cond.plot(x, y, z, Style="smoothscatter")
x |
This required argument accepts a numeric vector. |
y |
This argument accepts a numeric vector, and is only used with some styles. |
z |
This required argument accepts a discrete vector. |
Style |
This argument specifies the style of plot, and accepts "boxplot", "densover" (density overlay), "hist", "scatter", or "smoothscatter". |
The cond.plot
function provides simple conditional plots with
base graphics. All plot styles are conditional upon z
. Up to
nine conditional plots are produced in a panel.
Plots include:
boxplot: y ~ x | z
densover: f(x | z)
hist: x | z
scatter: x, y | z
smoothscatter: x, y | z
The cond.plot
function is not intended to try to compete with
some of the better graphics packages, but merely to provide simple
functionality.
Conditional plots are returned.
Statisticat, LLC. [email protected]
joint.density.plot and joint.pr.plot.
library(LaplacesDemon)
x <- rnorm(1000)
y <- runif(1000)
z <- rcat(1000, rep(1/4,4))
cond.plot(x, y, z, Style="smoothscatter")
This may be used to consort with Laplace's Demon regarding an object
of class demonoid
. Laplace's Demon will offer suggestions.
Consort(object)
object |
This required argument is an object of class
|
First, Consort
calls print.demonoid
, which prints most
of the components to the screen from the supplied object of class
demonoid
.
Second, Laplace's Demon considers a combination of five conditions when making the largest part of its suggestion. These conditions are: the algorithm, acceptance rate, MCSE, ESS, and stationarity. Other things are considered as well, such as the recommended thinning value (which is used to suggest a new number of iterations), how fast the algorithm is expected to be, and whether the condition of diminishing adaptation (also called the vanishing adaptation condition) was met (for an adaptive algorithm). Diminishing adaptation occurs only when the absolute values of the proposed variances trend downward (toward zero) over the course of all adaptations. When an algorithm is adaptive and does not have diminishing adaptations, the Consort function will suggest a different adaptive algorithm. The Periodicity argument is suggested to be set equal to the value of Rec.Thinning.
Appeasement applies only when all parameters are continuous. The Hangartner.Diagnostic should be considered for discrete parameters.
Appeasement Conditions
Algorithm: The final algorithm must be non-adaptive, so that the Markov property holds. This is conservative. A user may have an adaptive (non-final) algorithm in which adaptations in the latest update are stationary, or no longer diminishing. Laplace's Demon is unaware of previous updates, and conservatively interprets this as failing to meet the condition of diminishing adaptation, when the output may be satisfactory. On the other hand, if the adaptive algorithm has essentially stopped adapting, and if there is a non-adaptive version, then the user should consider switching to the non-adaptive algorithm. User discretion is advised.
Acceptance Rate: The acceptance rate is considered satisfactory if it is within the interval [15%,50%] for most algorithms. Some algorithms have different recommended intervals.
MCSE: The Monte Carlo Standard Error (MCSE) is considered
satisfactory for each target distribution if it is less than 6.27%
of the standard deviation of the target distribution. This allows
the true mean to be within 5% of the area under a Gaussian
distribution around the estimated mean. The MCSE
function is used. Toft et al. (2007) propose a stricter criterion of
5%. The criterion of 6.27% for this stopping rule is arbitrary,
and may be too lenient or strict, depending on the needs of the
user. Nonetheless, it has performed well, and this type of stopping
rule has been observed to perform better than MCMC convergence
diagnostics (Flegal et al., 2008).
ESS: The effective sample size (ESS) is considered
satisfactory for each target distribution if it is at least 100,
which is usually enough to describe 95% probability intervals (see
p.interval
and LPL.interval
for more
information). The ESS
function is used. When this
criterion is unmet, the name of the worst mixing chain in Summary1
appears.
Stationarity: Each target distribution is considered
satisfactory if it is estimated to be stationary with the
BMK.Diagnostic
function.
Bear in mind that the MCSE, ESS, and stationarity criteria are all univariate measures applied to each marginal posterior distribution. Multivariate forms are not included. By chance alone due to multiple independent tests, 5% of these diagnostics should indicate non-convergence when 'convergence' exists. In contrast, even one non-convergent nuisance parameter is associated with non-convergence in all other parameters. Assessing convergence is difficult.
If all five conditions are satisfactory, then Laplace's Demon is appeased. Otherwise, Laplace's Demon will suggest and supply R code that is ready to be copy/pasted and executed.
To visualize the MCSE-based stopping rule, run the following code:
x <- seq(from=-3, to=3, by=0.1);
plot(x, dnorm(x,0,1), type="l");
abline(v=-0.0627); abline(v=0.0627);
abline(v=2*-0.0627, col="red"); abline(v=2*0.0627, col="red")
The black vertical lines show the standard error, and the red vertical lines show the 95% interval.
If the user has an object of class demonoid.hpc
, then the
Consort
function may be still be applied, but a particular
chain in the object must be specified as a component in a list. For
example, with an object called Fit
and a goal of consorting
over the second chain, the code would be: Consort(Fit[[2]])
.
The Demonic Suggestion is usually very helpful, but should not be
followed blindly. Do not let it replace critical thinking. For
example, Consort
may find that diminishing adaptation is unmet,
and recommend a different algorithm. However, the user may be
convinced that the current algorithm is best, and believe instead that
MCMC found a local solution, and is leaving it to find the global
solution, in which case adaptations may increase again. Diminishing
adaptation may have occurred in a previous run, and is not found in
the current run because adaptation is essentially finished. If either
of these is true, then it may be best to ignore the newly suggested
algorithm, and continue with the current algorithm. The suggested code
may be helpful, but it is merely a suggestion.
If achieving the appeasement of Laplace's Demon is difficult, consider ignoring the MCSE criterion and terminate when all other criteria have been met, placing special emphasis on ESS.
Statisticat, LLC. [email protected]
Flegal, J.M., Haran, M., and Jones, G.L. (2008). "Markov chain Monte Carlo: Can We Trust the Third Significant Figure?". Statistical Science, 23, p. 250–260.
Toft, N., Innocent, G., Gettinby, G., and Reid, S. (2007). "Assessing the Convergence of Markov Chain Monte Carlo Methods: An Example from Evaluation of Diagnostic Tests in Absence of a Gold Standard". Preventive Veterinary Medicine, 79, p. 244–256.
BMK.Diagnostic, ESS, Hangartner.Diagnostic, LaplacesDemon, LaplacesDemon.hpc, LPL.interval, MCSE, and p.interval.
The Cumulative Sample Function (CSF) is a visual MCMC diagnostic in
which the user may select a measure (such as a variable, summary
statistic, or other diagnostic), and observe a plot of how the measure
changes over cumulative posterior samples from MCMC, such as the
output of LaplacesDemon
. This may be considered to be a
generalized extension of the cumuplot
in the coda package,
which is a more restrictive form of the cusum diagnostic introduced by
Yu and Mykland (1998).
Yu and Mykland (1998) suggest that CSF plots should be examined after
traditional trace plots seem convergent, and assert that faster mixing
chains (which are more desirable) result in CSF plots that are more
‘hairy’ (as opposed to smooth), though this is subjective and has been
debated. The LaplacesDemon
package neither supports nor
contradicts the suggestion of mixing and ‘hairiness’, but suggests
that CSF plots may be used to provide additional information about a
chain. For example, a user may decide on a practical burnin based on the point at which a conditional mean attains a certain standard error.
CSF(x, name, method="Quantiles", quantiles=c(0.025,0.500,0.975), output=FALSE)
x |
This is a vector of posterior samples from MCMC. |
name |
This is an optional name for vector |
method |
This is a measure that will be observed over the course
of cumulative samples of |
quantiles |
This optional argument applies only when
|
output |
Logical. If |
When method="ESS"
, the effective sample size (ESS) is observed
as a function of the cumulative samples of x
. For more
information, see the ESS
function.
When method="Geweke.Diagnostic"
, the Z-score output of the
Geweke diagnostic is observed as a function of the cumulative samples
of x
. For more information, see the
Geweke.Diagnostic
function.
When method="HPD"
, the Highest Posterior Density (HPD) interval
is observed as a function of the cumulative samples of x
. For
more information, see the p.interval
function.
When method="is.stationary"
, stationarity is logically
tested and the result is observed as a function of the cumulative
samples of x
. For more information, see the
is.stationary
function.
When method="Kurtosis"
, kurtosis is observed as a function of
the cumulative samples of x
.
When method="MCSE"
, the Monte Carlo Standard Error (MCSE)
estimated with the IMPS
method is observed as a function of
the cumulative samples of x
. For more information, see the
MCSE
function.
When method="MCSE.bm"
, the Monte Carlo Standard Error (MCSE)
estimated with the batch.means
method is observed as a
function of the cumulative samples of x
. For more information,
see the MCSE
function.
When method="MCSE.sv"
, the Monte Carlo Standard Error (MCSE)
estimated with the sample.variance
method is observed as a
function of the cumulative samples of x
. For more information,
see the MCSE
function.
When method="Mean"
, the mean is observed as a function of
the cumulative samples of x
.
When method="Mode"
, the estimated mode is observed as a
function of the cumulative samples of x
. For more information,
see the Mode
function.
When method="N.Modes"
, the estimated number of modes is
observed as a function of the cumulative samples of x
. For
more information, see the Modes
function.
When method="Precision"
, the precision (inverse variance) is
observed as a function of the cumulative samples of x
.
When method="Quantiles"
, the quantiles selected with the
quantiles
argument are observed as a function of the
cumulative samples of x
.
When method="Skewness"
, skewness is observed as a function of
the cumulative samples of x
.
Statisticat, LLC. [email protected]
Yu, B. and Mykland, P. (1998). "Looking at Markov Samplers through Cusum Path Plots: A Simple Diagnostic Idea". Statistics and Computing, 8(3), p. 275–286.
burnin, ESS, Geweke.Diagnostic, is.stationary, LaplacesDemon, MCSE, Mode, Modes, and p.interval.
#Commented-out because of run-time for package builds
#library(LaplacesDemon)
#x <- rnorm(1000)
#CSF(x, method="ESS")
#CSF(x, method="Geweke.Diagnostic")
#CSF(x, method="HPD")
#CSF(x, method="is.stationary")
#CSF(x, method="Kurtosis")
#CSF(x, method="MCSE")
#CSF(x, method="MCSE.bm")
#CSF(x, method="MCSE.sv")
#CSF(x, method="Mean")
#CSF(x, method="Mode")
#CSF(x, method="N.Modes")
#CSF(x, method="Precision")
#CSF(x, method="Quantiles")
#CSF(x, method="Skewness")
This data set is for discrete choice models and consists of the choice of commuting route to school: arterial, two-lane, or freeway. There were 151 Pennsylvania commuters who started from a residential complex in State College, PA, and commute to downtown State College.
data(demonchoice)
This data frame contains 151 rows of individual choices and 9 columns. The following data dictionary describes each variable or column.
Choice
This is the route choice: four-lane arterial (35 MPH speed limit), two-lane highway (35 MPH speed limit, with one lane in each direction), or a limited-access four-lane freeway (55 MPH speed limit).
HH.Income
This is an ordinal variable of annual household income of the commuter in USD. There are four categories: 1 is less than 20,000 USD, 2 is 20,000-29,999 USD, 3 is 30,000-39,999 USD, and 4 is 40,000 USD or greater.
Vehicle.Age
This is the age in years of the vehicle of the commuter.
Stop.Signs.Arterial
This is the number of stop signs along the arterial route.
Stop.Signs.Two.Lane
This is the number of stop signs along the two-lane route.
Stop.Signs.Freeway
This is the number of stop signs along the freeway route.
Distance.Arterial
This is distance in miles of the arterial route.
Distance.Two.Lane
This is the distance in miles of the two-lane route.
Distance.Freeway
This is the distance in miles of the freeway route.
Washington, S., Congdon, P., Karlaftis, M., and Mannering, F. (2009). "Bayesian Multinomial Logit: Theory and Route Choice Example". Transportation Research Record, 2136, p. 28–36.
This data set consists of daily currency pair prices from 2010 through 2014. Each currency pair has a close, high, and low price.
data(demonfx)
This data frame contains 1,301 rows as time-periods (with row names) and 39 columns of currency pair prices. The following data dictionary describes each time-series or column.
EURUSD.Close
This is the currency pair closing price.
EURUSD.High
This is the currency pair high price.
EURUSD.Low
This is the currency pair low price.
USDJPY.Close
This is the currency pair closing price.
USDJPY.High
This is the currency pair high price.
USDJPY.Low
This is the currency pair low price.
USDCHF.Close
This is the currency pair closing price.
USDCHF.High
This is the currency pair high price.
USDCHF.Low
This is the currency pair low price.
GBPUSD.Close
This is the currency pair closing price.
GBPUSD.High
This is the currency pair high price.
GBPUSD.Low
This is the currency pair low price.
USDCAD.Close
This is the currency pair closing price.
USDCAD.High
This is the currency pair high price.
USDCAD.Low
This is the currency pair low price.
EURGBP.Close
This is the currency pair closing price.
EURGBP.High
This is the currency pair high price.
EURGBP.Low
This is the currency pair low price.
EURJPY.Close
This is the currency pair closing price.
EURJPY.High
This is the currency pair high price.
EURJPY.Low
This is the currency pair low price.
EURCHF.Close
This is the currency pair closing price.
EURCHF.High
This is the currency pair high price.
EURCHF.Low
This is the currency pair low price.
AUDUSD.Close
This is the currency pair closing price.
AUDUSD.High
This is the currency pair high price.
AUDUSD.Low
This is the currency pair low price.
GBPJPY.Close
This is the currency pair closing price.
GBPJPY.High
This is the currency pair high price.
GBPJPY.Low
This is the currency pair low price.
CHFJPY.Close
This is the currency pair closing price.
CHFJPY.High
This is the currency pair high price.
CHFJPY.Low
This is the currency pair low price.
GBPCHF.Close
This is the currency pair closing price.
GBPCHF.High
This is the currency pair high price.
GBPCHF.Low
This is the currency pair low price.
NZDUSD.Close
This is the currency pair closing price.
NZDUSD.High
This is the currency pair high price.
NZDUSD.Low
This is the currency pair low price.
https://www.global-view.com/forex-trading-tools/forex-history/index.html
These are the monthly number of user sessions at https://web.archive.org/web/20141224051720/http://www.bayesian-inference.com/index by continent. Additional data may be added in the future.
data(demonsessions)
This data frame contains 26 rows (with row names) and 6 columns. The following data dictionary describes each variable or column.
Africa
This is the African continent.
Americas
This is North and South America.
Asia
This is the Asian continent.
Europe
This is Europe as a continent.
Oceania
This is Oceania, such as Australia.
Not.Set
This includes sessions in which the continent was not set, or is unknown.
https://web.archive.org/web/20141224051720/http://www.bayesian-inference.com/index
Late one night, after witnessing Laplace's Demon in action, I followed him back to what seemed to be his lair. Minutes later, he left again. I snuck inside and saw something labeled 'Demon Snacks'. Hurriedly, I recorded the 39 items, each with a name and 10 nutritional attributes.
data(demonsnacks)
This data frame contains 39 rows (with row names) and 10 columns. The following data dictionary describes each variable or column.
Serving.Size
This is serving size in grams.
Calories
This is the number of calories.
Total.Fat
This is total fat in grams.
Saturated.Fat
This is saturated fat in grams.
Cholesterol
This is cholesterol in milligrams.
Sodium
This is sodium in milligrams.
Total.Carbohydrate
This is the total carbohydrates in grams.
Dietary.Fiber
This is dietary fiber in grams.
Sugars
This is sugar in grams.
Protein
This is protein in grams.
This data was obtained from the lair of Laplace's Demon!
This data set is for space-time models that require latitude and longitude, or coordinates. This data set consists of the minimum, mean, and maximum temperatures in Texas for 13 months.
data(demontexas)
This data frame contains 369 rows of sites in Texas and 43 columns. The following data dictionary describes each variable or column.
Elevation
This is the elevation of the site.
Latitude
This is the latitude of the site.
Longitude
This is the longitude of the site.
Gulf
This is a gulf indicator of the site.
Max1
This is the maximum temperature in month 1.
Max2
This is the maximum temperature in month 2.
Max3
This is the maximum temperature in month 3.
Max4
This is the maximum temperature in month 4.
Max5
This is the maximum temperature in month 5.
Max6
This is the maximum temperature in month 6.
Max7
This is the maximum temperature in month 7.
Max8
This is the maximum temperature in month 8.
Max9
This is the maximum temperature in month 9.
Max10
This is the maximum temperature in month 10.
Max11
This is the maximum temperature in month 11.
Max12
This is the maximum temperature in month 12.
Max13
This is the maximum temperature in month 13.
Mean1
This is the mean temperature in month 1.
Mean2
This is the mean temperature in month 2.
Mean3
This is the mean temperature in month 3.
Mean4
This is the mean temperature in month 4.
Mean5
This is the mean temperature in month 5.
Mean6
This is the mean temperature in month 6.
Mean7
This is the mean temperature in month 7.
Mean8
This is the mean temperature in month 8.
Mean9
This is the mean temperature in month 9.
Mean10
This is the mean temperature in month 10.
Mean11
This is the mean temperature in month 11.
Mean12
This is the mean temperature in month 12.
Mean13
This is the mean temperature in month 13.
Min1
This is the minimum temperature in month 1.
Min2
This is the minimum temperature in month 2.
Min3
This is the minimum temperature in month 3.
Min4
This is the minimum temperature in month 4.
Min5
This is the minimum temperature in month 5.
Min6
This is the minimum temperature in month 6.
Min7
This is the minimum temperature in month 7.
Min8
This is the minimum temperature in month 8.
Min9
This is the minimum temperature in month 9.
Min10
This is the minimum temperature in month 10.
Min11
This is the minimum temperature in month 11.
Min12
This is the minimum temperature in month 12.
Min13
This is the minimum temperature in month 13.
http://www.stat.ufl.edu/~winner/datasets.html
The de.Finetti.Game
function estimates the interval of a
subjective probability regarding a possible event in the near future.
de.Finetti.Game(width)
width |
This is the maximum acceptable width of the interval for the returned subjective probability. The user must specify a width between 0 and 1. |
This function is a variation on the game introduced by de Finetti, who is one of the main developers of subjective probability, along with Ramsey and Savage. In the original context, de Finetti proposed a gamble regarding life on Mars one billion years ago.
The frequentist interpretation of probability defines the probability of an event as the limit of its relative frequency in a large number of trials. Frequentist inference is undefined, for example, when there are no trials from which to calculate a probability. By defining probability relative to frequencies of physical events, frequentists attempt to objectify probability. However, de Finetti asserts that the frequentist (or objective) interpretation always reduces to a subjective interpretation of probability, because probability is a human construct and does not exist independently of humans in nature. Therefore, probability is a degree of belief, and is called subjective or personal probability.
The de.Finetti.Game
function returns a vector of length
two. The respective elements are the lower and upper bounds of the
subjective probability of the participant regarding the possible event
in the near future.
Statisticat, LLC. [email protected]
The deburn
function discards or removes a user-specified number
of burn-in iterations from an object of class demonoid
.
deburn(x, BurnIn=0)
x |
This is an object of class |
BurnIn |
This argument defaults to |
Documentation for the burnin
function provides an
introduction to the concept of burn-in as it relates to Markov chains.
The deburn
function discards a number of the first posterior
samples, as specified by the BurnIn
argument. Stationarity is
not checked, because it is assumed the user has a reason for using the
deburn
function, rather than using the results from the object
of class demonoid
. Therefore, the posterior samples in
Posterior1
and Posterior2
are identical, as are
Summary1
and Summary2
.
The deburn
function returns an object of class demonoid
.
Statisticat, LLC. [email protected]
burnin and LaplacesDemon.
### Assuming the user has Fit which is an object of class demonoid:
#library(LaplacesDemon)
#Fit2 <- deburn(Fit, BurnIn=100)
These functions provide the density, distribution function, quantile
function, and random generation for the univariate, asymmetric Laplace
distribution with location parameter location
, scale parameter
scale
, and asymmetry or skewness parameter kappa
.
dalaplace(x, location=0, scale=1, kappa=1, log=FALSE) palaplace(q, location=0, scale=1, kappa=1) qalaplace(p, location=0, scale=1, kappa=1) ralaplace(n, location=0, scale=1, kappa=1)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
location |
This is the location parameter |
scale |
This is the scale parameter |
kappa |
This is the asymmetry or skewness parameter
|
log |
Logical. If |
Application: Continuous Univariate
Density: p(theta) = (sqrt(2)/lambda) * (kappa/(1+kappa^2)) * exp(-(sqrt(2)*kappa/lambda)*(theta - mu)) for theta >= mu, and p(theta) = (sqrt(2)/lambda) * (kappa/(1+kappa^2)) * exp(-(sqrt(2)/(lambda*kappa))*(mu - theta)) for theta < mu
Inventor: Kotz, Kozubowski, and Podgorski (2001)
Notation 1: theta ~ AL(mu, lambda, kappa)
Notation 2: p(theta) = AL(theta; mu, lambda, kappa)
Parameter 1: location parameter mu
Parameter 2: scale parameter lambda > 0
Parameter 3: skewness parameter kappa > 0
Mean: E(theta) = mu + lambda*(1/kappa - kappa)/sqrt(2)
Variance: var(theta) = lambda^2*(1 + kappa^4)/(2*kappa^2)
Mode: mode(theta) = mu
The asymmetric Laplace of Kotz, Kozubowski, and Podgorski (2001), also referred to as AL, is an extension of the univariate, symmetric Laplace distribution to allow for skewness. It is parameterized according to three parameters: location parameter mu, scale parameter lambda, and asymmetry or skewness parameter kappa. The special case of kappa=1 is the symmetric Laplace distribution. Values of kappa in the intervals (0,1) and (1, infinity) correspond to positive (right) and negative (left) skewness, respectively. The AL distribution is leptokurtic, and its kurtosis ranges from 3 to 6 as kappa ranges from 1 to infinity. The skewness of the AL has been useful in engineering and finance. As an example, the AL distribution has been used as a replacement for Gaussian-distributed GARCH residuals. There is also an extension to the asymmetric multivariate Laplace distribution.
The asymmetric Laplace distribution is demonstrated in Kozubowski and Podgorski (2001) to be well-suited for financial modeling, specifically with currency exchange rates.
These functions are similar to those in the VGAM
package.
dalaplace gives the density, palaplace gives the distribution function, qalaplace gives the quantile function, and ralaplace generates random deviates.
Kotz, S., Kozubowski, T.J., and Podgorski, K. (2001). "The Laplace Distribution and Generalizations: a Revisit with Applications to Communications, Economics, Engineering, and Finance". Boston: Birkhauser.
Kozubowski, T.J. and Podgorski, K. (2001). "Asymmetric Laplace Laws and Modeling Financial Data". Mathematical and Computer Modelling, 34, p. 1003-1021.
dlaplace and dallaplace.
library(LaplacesDemon)
x <- dalaplace(1,0,1,1)
x <- palaplace(1,0,1,1)
x <- qalaplace(0.5,0,1,1)
x <- ralaplace(100,0,1,1)
#Plot Probability Functions
x <- seq(from=-5, to=5, by=0.1)
plot(x, dalaplace(x,0,1,0.5), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dalaplace(x,0,1,1), type="l", col="green")
lines(x, dalaplace(x,0,1,5), type="l", col="blue")
legend(1, 0.9, expression(paste(mu==0, ", ", lambda==1, ", ", kappa==0.5),
     paste(mu==0, ", ", lambda==1, ", ", kappa==1),
     paste(mu==0, ", ", lambda==1, ", ", kappa==5)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, distribution function, quantile
function, and random generation for the univariate, asymmetric,
log-Laplace distribution with location parameter ,
scale parameter
, and asymmetry or skewness
parameter
.
dallaplace(x, location=0, scale=1, kappa=1, log=FALSE)
pallaplace(q, location=0, scale=1, kappa=1)
qallaplace(p, location=0, scale=1, kappa=1)
rallaplace(n, location=0, scale=1, kappa=1)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
location |
This is the location parameter |
scale |
This is the scale parameter |
kappa |
This is the asymmetry or skewness parameter
|
log |
Logical. If |
Application: Continuous Univariate
Density 1:
Density 2:
Inventor: Pierre-Simon Laplace
Notation 1:
Notation 2:
Parameter 1: location parameter
Parameter 2: scale parameter
Mean:
Variance:
Mode:
The univariate, asymmetric log-Laplace distribution is derived from the Laplace distribution. Multivariate and symmetric versions also exist.
These functions are similar to those in the VGAM
package.
dallaplace
gives the density,
pallaplace
gives the distribution function,
qallaplace
gives the quantile function, and
rallaplace
generates random deviates.
Kozubowski, T. J. and Podgorski, K. (2003). "Log-Laplace Distributions". International Mathematical Journal, 3, p. 467–495.
dalaplace
,
dexp
,
dlaplace
,
dlaplacep
,
dllaplace
,
dmvl
,
dnorm
,
dnormp
,
dnormv
.
library(LaplacesDemon)
x <- dallaplace(1,0,1,1)
x <- pallaplace(1,0,1,1)
x <- qallaplace(0.5,0,1,1)
x <- rallaplace(100,0,1,1)
#Plot Probability Functions
x <- seq(from=0.1, to=10, by=0.1)
plot(x, dallaplace(x,0,1,0.5), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dallaplace(x,0,1,1), type="l", col="green")
lines(x, dallaplace(x,0,1,5), type="l", col="blue")
legend(5, 0.9, expression(paste(mu==0, ", ", lambda==1, ", ", kappa==0.5),
     paste(mu==0, ", ", lambda==1, ", ", kappa==1),
     paste(mu==0, ", ", lambda==1, ", ", kappa==5)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density and random generation for the
asymmetric multivariate Laplace distribution with location and skew
parameter and covariance
.
daml(x, mu, Sigma, log=FALSE)
raml(n, mu, Sigma)
x |
This is a |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mu |
This is the location and skew parameter |
Sigma |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Kotz, Kozubowski, and Podgorski (2003)
Notation 1:
Notation 2:
Parameter 1: location-skew parameter
Parameter 2: positive-definite covariance matrix
Mean: Unknown
Variance: Unknown
Mode:
The asymmetric multivariate Laplace distribution of Kotz, Kozubowski,
and Podgorski (2003) is a multivariate extension of the univariate,
asymmetric Laplace distribution. It is parameterized according to
two parameters: location-skew parameter and positive-definite
covariance matrix
. Location and skew occur in the
same parameter. When
, the density is the (symmetric)
multivariate Laplace of Anderson (1992). As each location deviates from
zero, the marginal distribution becomes more skewed. Since location and
skew are combined, it is appropriate for zero-centered variables, such
as a matrix of centered and scaled dependent variables in cluster
analysis, factor analysis, multivariate regression, or multivariate
time-series.
The asymmetric multivariate Laplace distribution is also discussed earlier in Kozubowski and Podgorski (2001), and is well-suited for financial modeling via multivariate regression, specifically with currency exchange rates. Cajigas and Urga (2005) fit the residuals of a multivariate GARCH model of stocks and bonds with the asymmetric multivariate Laplace distribution, and find that it "overwhelmingly outperforms" normality.
daml
gives the density, and
raml
generates random deviates.
Anderson, D.N. (1992). "A Multivariate Linnik Distribution". Statistical Probability Letters, 14, p. 333–336.
Cajigas, J.P. and Urga, G. (2005) "Dynamic Conditional Correlation Models with Asymmetric Laplace Innovations". Centre for Economic Analysis: Cass Business School.
Kotz, S., Kozubowski, T.J., and Podgorski, K. (2003). "An Asymmetric Multivariate Laplace Distribution". Working Paper.
Kozubowski, T.J. and Podgorski, K. (2001). "Asymmetric Laplace Laws and Modeling Financial Data". Mathematical and Computer Modelling, 34, p. 1003–1021.
library(LaplacesDemon)
x <- daml(c(1,2,3), c(0,1,2), diag(3))
X <- raml(1000, c(0,1,2), diag(3))
joint.density.plot(X[,1], X[,2], color=FALSE)
These functions provide the density, distribution function, quantile function, and random generation for the Bernoulli distribution.
dbern(x, prob, log=FALSE)
pbern(q, prob, lower.tail=TRUE, log.p=FALSE)
qbern(p, prob, lower.tail=TRUE, log.p=FALSE)
rbern(n, prob)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations. If |
prob |
This is the probability of success on each trial. |
log , log.p
|
Logical. if |
lower.tail |
Logical. if |
Application: Discrete Univariate
Density: ,
Inventor: Jacob Bernoulli
Notation 1:
Notation 2:
Parameter 1: probability parameter
Mean:
Variance:
Mode:
The Bernoulli distribution is a binomial distribution with
, and one instance of a Bernoulli distribution is called a
Bernoulli trial. One coin flip is a Bernoulli trial, for example. The
categorical distribution is the generalization of the Bernoulli
distribution for variables with more than two discrete values. The
beta distribution is the conjugate prior distribution of the Bernoulli
distribution. The geometric distribution is the number of Bernoulli
trials needed to get one success.
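As a minimal sketch of the beta-Bernoulli conjugacy noted above (the data are generated with rbern from this package; the posterior summary uses qbeta from base R):
library(LaplacesDemon)
set.seed(1)
y <- rbern(50, prob=0.7)                    #50 Bernoulli trials
a <- 1; b <- 1                              #Beta(1,1) prior on prob
a.post <- a + sum(y)                        #conjugate update: add successes
b.post <- b + sum(1 - y)                    #conjugate update: add failures
qbeta(c(0.025, 0.5, 0.975), a.post, b.post) #posterior interval for prob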
dbern
gives the density,
pbern
gives the distribution function,
qbern
gives the quantile function, and
rbern
generates random deviates.
library(LaplacesDemon)
dbern(1, 0.7)
rbern(10, 0.5)
This is the density and random deviates function for the categorical
distribution with probabilities parameter .
dcat(x, p, log=FALSE)
qcat(pr, p, lower.tail=TRUE, log.pr=FALSE)
rcat(n, p)
x |
This is a vector of discrete data with |
n |
This is the number of observations, which must be a positive
integer that has length 1. When |
p |
This is a vector of length |
pr |
This is a vector of probabilities, or log-probabilities. |
log |
Logical. If |
log.pr |
Logical. if |
lower.tail |
Logical. if |
Application: Discrete Univariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Parameter 1: probabilities
Mean: = Unknown
Variance: = Unknown
Mode: = Unknown
Also called the discrete distribution, the categorical distribution
describes the result of a random event that can take on one of
possible outcomes, with the probability
of each outcome
separately specified. The vector
of probabilities for each
event must sum to 1. The categorical distribution is often used, for
example, in the multinomial logit model. The conjugate prior is the
Dirichlet distribution.
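As a minimal sketch of the Dirichlet conjugacy mentioned above, the observed category counts of data generated with rcat are simply added to the Dirichlet shape vector (rdirichlet is documented later in this manual):
library(LaplacesDemon)
set.seed(1)
x <- rcat(100, p=c(0.2, 0.3, 0.5))          #categorical data
alpha <- c(1, 1, 1)                         #symmetric Dirichlet prior
alpha.post <- alpha + tabulate(x, nbins=3)  #conjugate update: add counts
colMeans(rdirichlet(1000, alpha.post))      #posterior means, near the true p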
dcat
gives the density and
rcat
generates random deviates.
Statisticat, LLC. [email protected]
as.indicator.matrix
,
ddirichlet
, and
dmultinom
.
library(LaplacesDemon)
dcat(x=1, p=c(0.3,0.3,0.4))
rcat(n=10, p=c(0.1,0.3,0.6))
This is the density function and random generation from the continuous relaxation of a Markov random field (MRF) distribution.
dcrmrf(x, alpha, Omega, log=FALSE)
rcrmrf(n, alpha, Omega)
x |
This is a vector of length |
n |
This is the number of random deviates to generate. |
alpha |
This is a vector of length |
Omega |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Zhang et al. (2012)
Notation 1:
Notation 2:
Parameter 1: shape vector
Parameter 2: positive-definite matrix
Mean:
Variance:
Mode:
It is often easier to solve or optimize a problem with continuous variables rather than a problem that involves discrete variables. A continuous variable may also have a gradient, contour, and curvature that may be useful for optimization or sampling. Continuous MCMC samplers are far more common.
Zhang et al. (2012) introduced a generalized form of the Gaussian integral trick from statistical physics to transform a discrete variable so that it may be estimated with continuous variables. An auxiliary Gaussian variable is added to a discrete Markov random field (MRF) so that discrete dependencies cancel out, allowing the discrete variable to be summed away, and leaving a continuous problem. The resulting continuous representation of the problem allows the model to be updated with a continuous MCMC sampler, and may benefit from a MCMC sampler that uses derivatives. Another advantage of continuous MCMC is that stationarity of discrete Markov chains is problematic to assess.
A disadvantage of solving a discrete problem with continuous parameters is that the continuous solution requires more parameters.
dcrmrf
gives the density and
rcrmrf
generates random deviates.
Zhang, Y., Ghahramani, Z., Storkey, A.J., and Sutton, C.A. (2012). "Continuous Relaxations for Discrete Hamiltonian Monte Carlo". Advances in Neural Information Processing Systems, 25, p. 3203–3211.
library(LaplacesDemon)
x <- dcrmrf(rnorm(5), rnorm(5), diag(5))
x <- rcrmrf(10, rnorm(5), diag(5))
This is the density function and random generation from the Dirichlet distribution.
ddirichlet(x, alpha, log=FALSE)
rdirichlet(n, alpha)
x |
This is a vector containing a single deviate or matrix containing one random deviate per row. Each vector, or matrix row, must sum to 1. |
n |
This is the number of random deviates to generate. |
alpha |
This is a vector or matrix of shape parameters. |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Johann Peter Gustav Lejeune Dirichlet (1805-1859)
Notation 1:
Dirichlet(
)
Notation 2: Dirichlet(
)
Notation 3:
Notation 4:
Parameter: 'prior sample sizes'
Mean:
Variance:
Covariance:
Mode:
The Dirichlet distribution is the multivariate generalization of the
univariate beta distribution. Its probability density function returns
the belief that the probabilities of rival events are
given that each event has been observed
times.
The Dirichlet distribution is commonly used as a prior distribution in Bayesian inference. The Dirichlet distribution is the conjugate prior distribution for the parameters of the categorical and multinomial distributions.
A very common special case is the symmetric Dirichlet distribution,
where all of the elements in parameter vector have
the same value. Symmetric Dirichlet distributions are often used as
vague or weakly informative Dirichlet prior distributions, so that one
component is not favored over another. The single value that is entered
into all elements of
is called the concentration
parameter.
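As a minimal sketch of the concentration parameter described above, a small common shape value scatters draws toward the corners of the simplex, while a large value concentrates them near equal probabilities:
library(LaplacesDemon)
set.seed(1)
apply(rdirichlet(10000, rep(0.1, 3)), 2, sd) #diffuse: components vary widely
apply(rdirichlet(10000, rep(100, 3)), 2, sd) #concentrated: components near 1/3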
ddirichlet
gives the density and
rdirichlet
generates random deviates.
dbeta
,
dcat
,
dmvpolya
,
dmultinom
, and
TransitionMatrix
.
library(LaplacesDemon)
x <- ddirichlet(c(.1,.3,.6), c(1,1,1))
x <- rdirichlet(10, c(1,1,1))
These are the density and random generation functions for the generalized Pareto distribution.
dgpd(x, mu, sigma, xi, log=FALSE)
rgpd(n, mu, sigma, xi)
x |
This is a vector of data. |
n |
This is a positive scalar integer, and is the number of observations to generate randomly. |
mu |
This is a scalar or vector location parameter
|
sigma |
This is a positive-only scalar or vector of scale
parameters |
xi |
This is a scalar or vector of shape parameters
|
log |
Logical. If |
Application: Continuous Univariate
Density: where
Inventor: Pickands (1975)
Notation 1:
Notation 2:
Parameter 1: location , where
when
, and
when
Parameter 2: scale
Parameter 3: shape
Mean: when
Variance: when
Mode:
The generalized Pareto distribution (GPD) is a more flexible extension
of the Pareto (dpareto
) distribution. It is equivalent to
the exponential distribution when both and
, and it is equivalent to the Pareto
distribution when
and
.
The GPD is often used to model the tails of another distribution, and
the shape parameter relates to
tail-behavior. Distributions with tails that decrease exponentially are
modeled with shape
. Distributions with tails that
decrease as a polynomial are modeled with a positive shape
parameter. Distributions with finite tails are modeled with a negative
shape parameter.
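As a minimal sketch of the exponential special case noted above (location and shape both zero), dgpd can be compared with dexp from base R; with scale sigma, the matching exponential rate should be 1/sigma:
library(LaplacesDemon)
x <- seq(0, 5, by=0.5)
#With mu=0 and xi=0, the GPD density should match an exponential density
cbind(gpd=dgpd(x, mu=0, sigma=2, xi=0), exp=dexp(x, rate=1/2))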
dgpd
gives the density, and
rgpd
generates random deviates.
Pickands J. (1975). "Statistical Inference Using Extreme Order Statistics". The Annals of Statistics, 3, p. 119–131.
library(LaplacesDemon)
x <- dgpd(0,0,1,0,log=TRUE)
x <- rgpd(10,0,1,0)
The density function is provided for the univariate, discrete,
generalized Poisson distribution with location parameter
and scale parameter
.
dgpois(x, lambda=0, omega=0, log=FALSE)
x |
This is a vector of quantiles. |
lambda |
This is the parameter |
omega |
This is the parameter |
log |
Logical. If |
Application: Discrete Univariate
Density:
Inventor: Consul (1989) and Ntzoufras et al. (2005)
Notation 1:
Notation 2:
Parameter 1: location parameter
Parameter 2: scale parameter
Mean:
Variance:
The generalized Poisson distribution (Consul, 1989) is also called the Lagrangian Poisson distribution. The simple Poisson distribution is a special case of the generalized Poisson distribution. The generalized Poisson distribution is used in generalized Poisson regression as an extension of Poisson regression that accounts for overdispersion.
The dgpois
function is parameterized according to Ntzoufras et
al. (2005), which is easier to interpret and estimates better with MCMC.
Valid values for omega are in the interval [0,1) for positive counts.
For , the generalized Poisson reduces to a
simple Poisson with mean
. Note that it is possible
for
, but this implies underdispersion in
count data, which is uncommon. The
dgpois
function returns
warnings or errors, so omega should be non-negative here.
The dispersion index (DI) is a variance-to-mean ratio, and is . A simple Poisson has DI=1.
When DI is far from one, the assumption that the variance equals the
mean of a simple Poisson is violated.
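As a minimal sketch of the special case noted above, with omega equal to zero the generalized Poisson density should agree with the simple Poisson density from base R:
library(LaplacesDemon)
x <- 0:10
all.equal(dgpois(x, lambda=3, omega=0), dpois(x, lambda=3))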
dgpois
gives the density.
Consul, P. (1989). "Generalized Poisson Distribution: Properties and Applications". Marcel Dekker: New York, NY.
Ntzoufras, I., Katsis, A., and Karlis, D. (2005). "Bayesian Assessment of the Distribution of Insurance Claim Counts using Reversible Jump MCMC", North American Actuarial Journal, 9, p. 90–108.
library(LaplacesDemon)
y <- rpois(100, 5)
lambda <- rpois(100, 5)
x <- dgpois(y, lambda, 0.5)
#Plot Probability Functions
x <- seq(from=0, to=20, by=1)
plot(x, dgpois(x,1,0.5), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dgpois(x,1,0.6), type="l", col="green")
lines(x, dgpois(x,1,0.7), type="l", col="blue")
legend(2, 0.9, expression(paste(lambda==1, ", ", omega==0.5),
     paste(lambda==1, ", ", omega==0.6),
     paste(lambda==1, ", ", omega==0.7)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, distribution function, quantile function, and random generation for the half-Cauchy distribution.
dhalfcauchy(x, scale=25, log=FALSE)
phalfcauchy(q, scale=25)
qhalfcauchy(p, scale=25)
rhalfcauchy(n, scale=25)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
scale |
This is the scale parameter |
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Derived from Cauchy
Notation 1:
Notation 2:
Parameter 1: scale parameter
Mean: = does not exist
Variance: = does not exist
Mode:
The half-Cauchy distribution with scale is a
recommended, default, weakly informative prior distribution for a scale
parameter. Otherwise, the scale,
, is recommended to
be set to be just a little larger than the expected standard deviation,
as a weakly informative prior distribution on a standard deviation
parameter.
The Cauchy distribution is known as a pathological distribution because its mean and variance are undefined, and it does not satisfy the central limit theorem.
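As a minimal sketch of the undefined moments mentioned above, the running mean of half-Cauchy random deviates fails to settle down even for large samples:
library(LaplacesDemon)
set.seed(1)
x <- rhalfcauchy(1e5, scale=25)
plot(cumsum(x)/seq_along(x), type="l",
     xlab="sample size", ylab="running mean")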
dhalfcauchy
gives the density,
phalfcauchy
gives the distribution function,
qhalfcauchy
gives the quantile function, and
rhalfcauchy
generates random deviates.
library(LaplacesDemon)
x <- dhalfcauchy(1,25)
x <- phalfcauchy(1,25)
x <- qhalfcauchy(0.5,25)
x <- rhalfcauchy(1,25)
#Plot Probability Functions
x <- seq(from=0, to=20, by=0.1)
plot(x, dhalfcauchy(x,1), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dhalfcauchy(x,5), type="l", col="green")
lines(x, dhalfcauchy(x,10), type="l", col="blue")
legend(2, 0.9, expression(alpha==1, alpha==5, alpha==10),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, distribution function, quantile function, and random generation for the half-normal distribution.
dhalfnorm(x, scale=sqrt(pi/2), log=FALSE)
phalfnorm(q, scale=sqrt(pi/2), lower.tail=TRUE, log.p=FALSE)
qhalfnorm(p, scale=sqrt(pi/2), lower.tail=TRUE, log.p=FALSE)
rhalfnorm(n, scale=sqrt(pi/2))
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
scale |
This is the scale parameter |
log , log.p
|
Logical. If |
lower.tail |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Derived from the normal or Gaussian
Notation 1:
Notation 2:
Parameter 1: scale parameter
Mean:
Variance:
Mode:
The half-normal distribution is recommended as a weakly informative prior distribution for a scale parameter that may be useful as an alternative to the half-Cauchy, half-t, or vague gamma.
dhalfnorm
gives the density,
phalfnorm
gives the distribution function,
qhalfnorm
gives the quantile function, and
rhalfnorm
generates random deviates.
library(LaplacesDemon)
x <- dhalfnorm(1)
x <- phalfnorm(1)
x <- qhalfnorm(0.5)
x <- rhalfnorm(10)
#Plot Probability Functions
x <- seq(from=0.1, to=20, by=0.1)
plot(x, dhalfnorm(x,0.1), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dhalfnorm(x,0.5), type="l", col="green")
lines(x, dhalfnorm(x,1), type="l", col="blue")
legend(2, 0.9, expression(sigma==0.1, sigma==0.5, sigma==1),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, distribution function, quantile function, and random generation for the half-t distribution.
dhalft(x, scale=25, nu=1, log=FALSE)
phalft(q, scale=25, nu=1)
qhalft(p, scale=25, nu=1)
rhalft(n, scale=25, nu=1)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
scale |
This is the scale parameter |
nu |
This is the scalar degrees of freedom parameter, which is
usually represented as |
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Derived from the Student t
Notation 1:
Notation 2:
Parameter 1: scale parameter
Parameter 2: degrees of freedom parameter
Mean: = unknown
Variance: = unknown
Mode:
The half-t distribution is derived from the Student t distribution, and
is useful as a weakly informative prior distribution for a scale
parameter. It is more adaptable than the default recommended
half-Cauchy, though it may also be more difficult to estimate due to its
additional degrees of freedom parameter, . When
, the density is proportional to a proper half-Cauchy
distribution. When
, the density becomes an improper,
uniform prior distribution. For more information on propriety, see
is.proper
.
Wand et al. (2011) demonstrated that the half-t distribution may be represented as a scale mixture of inverse-gamma distributions. This representation is useful for conjugacy.
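As a minimal sketch of the special case noted above, the ratio of the half-t density with one degree of freedom to the half-Cauchy density with the same scale should be (approximately) constant in x:
library(LaplacesDemon)
x <- seq(0.5, 20, by=0.5)
range(dhalft(x, scale=25, nu=1) / dhalfcauchy(x, scale=25))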
dhalft
gives the density,
phalft
gives the distribution function,
qhalft
gives the quantile function, and
rhalft
generates random deviates.
Wand, M.P., Ormerod, J.T., Padoan, S.A., and Fruhwirth, R. (2011). "Mean Field Variational Bayes for Elaborate Distributions". Bayesian Analysis, 6: p. 847–900.
dhalfcauchy
,
dst
,
dt
,
dunif
, and
is.proper
.
library(LaplacesDemon)
x <- dhalft(1,25,1)
x <- phalft(1,25,1)
x <- qhalft(0.5,25,1)
x <- rhalft(10,25,1)
#Plot Probability Functions
x <- seq(from=0.1, to=20, by=0.1)
plot(x, dhalft(x,1,-1), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dhalft(x,1,0.5), type="l", col="green")
lines(x, dhalft(x,1,500), type="l", col="blue")
legend(2, 0.9, expression(paste(alpha==1, ", ", nu==-1),
     paste(alpha==1, ", ", nu==0.5),
     paste(alpha==1, ", ", nu==500)),
     lty=c(1,1,1), col=c("red","green","blue"))
This is the density function and random generation from the horseshoe distribution.
dhs(x, lambda, tau, log=FALSE)
rhs(n, lambda, tau)
n |
This is the number of draws from the distribution. |
x |
This is a location vector at which to evaluate density. |
lambda |
This vector is a positive-only local parameter
|
tau |
This scalar is a positive-only global parameter
|
log |
Logical. If |
Application: Multivariate Scale Mixture
Density: (see below)
Inventor: Carvalho et al. (2008)
Notation 1:
Notation 2:
Parameter 1: local scale
Parameter 2: global scale
Mean:
Variance:
Mode:
The horseshoe distribution (Carvalho et al., 2008) is a heavy-tailed mixture distribution that can be considered a variance mixture, and it is in the family of multivariate scale mixtures of normals.
The horseshoe distribution was proposed as a prior distribution, and recommended as a default choice for shrinkage priors in the presence of sparsity. Horseshoe priors are most appropriate in large-p models where dimension reduction is necessary to avoid overly complex models that predict poorly, and also perform well in estimating a sparse covariance matrix via Cholesky decomposition (Carvalho et al., 2009).
When the number of parameters in variable selection is assumed to be sparse, meaning that most elements are zero or nearly zero, a horseshoe prior is a desirable alternative to the Laplace-distributed parameters in the LASSO, or the parameterization in ridge regression. When the true value is far from zero, the horseshoe prior leaves the parameter unshrunk. Yet, the horseshoe prior is accurate in shrinking parameters that are truly zero or near-zero. Parameters near zero are shrunk more than parameters far from zero. Therefore, parameters far from zero experience less shrinkage and are closer to their true values. The horseshoe prior is valuable in discriminating signal from noise.
By replacing the Laplace-distributed parameters in LASSO with horseshoe-distributed parameters and including a global scale, the result is called horseshoe regression.
dhs
gives the density and
rhs
generates random deviates.
Carvalho, C.M., Polson, N.G., and Scott, J.G. (2008). "The Horseshoe Estimator for Sparse Signals". Discussion Paper 2008-31. Duke University Department of Statistical Science.
Carvalho, C.M., Polson, N.G., and Scott, J.G. (2009). "Handling Sparsity via the Horseshoe". Journal of Machine Learning Research, 5, p. 73–80.
library(LaplacesDemon)
x <- rnorm(100)
lambda <- rhalfcauchy(100, 5)
tau <- 5
x <- dhs(x, lambda, tau, log=TRUE)
x <- rhs(100, lambda=lambda, tau=tau)
plot(density(x))
These are the density and random generation functions for the Huang-Wand prior distribution for a covariance matrix.
dhuangwand(x, nu=2, a, A, log=FALSE)
dhuangwandc(x, nu=2, a, A, log=FALSE)
rhuangwand(nu=2, a, A)
rhuangwandc(nu=2, a, A)
x |
This is a |
nu |
This is a scalar degrees of freedom parameter
|
a |
This is a positive-only vector of scale parameters
|
A |
This is a positive-only vector of of scale hyperparameters
|
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Huang and Wand (2013)
Notation 1:
Notation 2:
Parameter 1: degrees of freedom
Parameter 2: scale
Parameter 3: scale
Mean:
Variance:
Mode:
Huang and Wand (2013) proposed a prior distribution for a covariance matrix that uses a hierarchical inverse Wishart. This is a more flexible alternative to the inverse Wishart distribution, and the Huang-Wand prior retains conjugacy. The Cholesky parameterization is also provided here.
The Huang-Wand prior distribution alleviates two main limitations of an inverse Wishart distribution. First, the uncertainty in the diagonal variances of a covariance matrix that is inverse Wishart distributed is represented with only one degrees of freedom parameter, which may be too restrictive. The Huang-Wand prior overcomes this limitation. Second, the inverse Wishart distribution imposes a dependency between variance and correlation. The Huang-Wand prior lessens, but does not fully remove, this dependency.
The standard deviations of a Huang-Wand distributed covariance matrix
are half-t distributed, as . This is in accord with modern assumptions about distributions of
scale parameters, and is also useful for sparse covariance matrices.
The rhuangwand
function allows either a
or A
to be
missing. When a
is missing, the covariance matrix is generated
from the hyperparameters. When A
is missing, the covariance
matrix is generated from the parameters.
dhuangwand
and dhuangwandc
give the density, and
rhuangwand
and rhuangwandc
generate random deviates.
Huang, A. and Wand, M.P. (2013). "Simple Marginally Noninformative Prior Distributions for Covariance Matrices". Bayesian Analysis, 8, p. 439–452.
dhalft
and
dinvwishart
library(LaplacesDemon)
dhuangwand(diag(3), nu=2, a=runif(3), A=rep(1e6,3), log=TRUE)
rhuangwand(nu=2, A=rep(1e6, 3)) #Missing a
rhuangwand(nu=2, a=runif(3)) #Missing A
This is the density function and random generation from the inverse beta distribution.
dinvbeta(x, a, b, log=FALSE)
rinvbeta(n, a, b)
n |
This is the number of draws from the distribution. |
x |
This is a location vector at which to evaluate density. |
a |
This is the scalar shape parameter |
b |
This is the scalar shape parameter |
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Dubey (1970)
Notation 1:
Notation 2:
Parameter 1: shape
Parameter 2: shape
Mean: , for
Variance:
Mode:
The inverse-beta, also called the beta prime distribution, applies to variables that are continuous and positive. The inverse beta is the conjugate prior distribution of a parameter of a Bernoulli distribution expressed in odds.
The inverse-beta distribution has also been extended to the generalized beta prime distribution, though it is not (yet) included here.
dinvbeta
gives the density and
rinvbeta
generates random deviates.
Dubey, S.D. (1970). "Compound Gamma, Beta and F Distributions". Metrika, 16, p. 27–31.
library(LaplacesDemon)
x <- dinvbeta(5:10, 2, 3)
x <- rinvbeta(10, 2, 3)
#Plot Probability Functions
x <- seq(from=0.1, to=20, by=0.1)
plot(x, dinvbeta(x,2,2), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dinvbeta(x,2,3), type="l", col="green")
lines(x, dinvbeta(x,3,2), type="l", col="blue")
legend(2, 0.9, expression(paste(alpha==2, ", ", beta==2),
     paste(alpha==2, ", ", beta==3),
     paste(alpha==3, ", ", beta==2)),
     lty=c(1,1,1), col=c("red","green","blue"))
This is the density function and random generation for the (scaled) inverse chi-squared distribution.
dinvchisq(x, df, scale, log=FALSE)
rinvchisq(n, df, scale=1/df)
x |
This is a vector of quantiles. |
n |
This is the number of observations. If |
df |
This is the degrees of freedom parameter, usually
represented as |
scale |
This is the scale parameter, usually represented as
|
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Derived from the chi-squared distribution
Notation 1:
Notation 2:
Parameter 1: degrees of freedom parameter
Parameter 2: scale parameter
Mean: = unknown
Variance: = unknown
Mode:
The inverse chi-squared distribution, also called the
inverted chi-square distribution, is the multiplicate inverse of the
chi-squared distribution. If has the chi-squared distribution
with
degrees of freedom, then
has the
inverse chi-squared distribution with
degrees of freedom,
and
has the inverse chi-squared distribution with
degrees of freedom.
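As a minimal sketch of the reciprocal relationship described above, reciprocals of chi-squared draws can be compared with rinvchisq draws at the default scale of 1/df; the quantiles should agree closely:
library(LaplacesDemon)
set.seed(1)
df <- 5
q <- c(0.25, 0.5, 0.75)
round(rbind(reciprocal = quantile(1/rchisq(1e5, df), q),
            rinvchisq  = quantile(rinvchisq(1e5, df), q)), 3)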
These functions are similar to those in the geoR package.
dinvchisq
gives the density and
rinvchisq
generates random deviates.
library(LaplacesDemon)
x <- dinvchisq(1,1,1)
x <- rinvchisq(10,1)
#Plot Probability Functions
x <- seq(from=0.1, to=5, by=0.01)
plot(x, dinvchisq(x,0.5,1), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dinvchisq(x,1,1), type="l", col="green")
lines(x, dinvchisq(x,5,1), type="l", col="blue")
legend(3, 0.9, expression(paste(nu==0.5, ", ", lambda==1),
     paste(nu==1, ", ", lambda==1),
     paste(nu==5, ", ", lambda==1)),
     lty=c(1,1,1), col=c("red","green","blue"))
This is the density function and random generation from the inverse gamma distribution.
dinvgamma(x, shape=1, scale=1, log=FALSE)
rinvgamma(n, shape=1, scale=1)
n |
This is the number of draws from the distribution. |
x |
This is the scalar location to evaluate density. |
shape |
This is the scalar shape parameter |
scale |
This is the scalar scale parameter |
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Parameter 1: shape
Parameter 2: scale
Mean: , for
Variance:
Mode:
The inverse-gamma is the conjugate prior distribution for the normal
or Gaussian variance, and has been traditionally specified as a vague
prior in that application. The density is always finite; its integral is
finite if . Prior information decreases as
.
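As a minimal sketch, and as a self-consistency check using only the two functions documented here, random deviates from rinvgamma can be compared against the dinvgamma density curve:
library(LaplacesDemon)
set.seed(1)
x <- rinvgamma(1e5, shape=3, scale=2)
hist(x[x < 5], breaks=100, freq=FALSE, main="Inverse-Gamma(3, 2)", xlab="x")
curve(dinvgamma(x, shape=3, scale=2), from=0.01, to=5, add=TRUE, col="red")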
These functions are similar to those in the MCMCpack
package.
dinvgamma
gives the density and
rinvgamma
generates random deviates. The parameterization
is consistent with the Gamma Distribution in the stats package.
dgamma
,
dnorm
,
dnormp
, and
dnormv
.
library(LaplacesDemon)
x <- dinvgamma(4.3, 1.1)
x <- rinvgamma(10, 3.3)
#Plot Probability Functions
x <- seq(from=0.1, to=20, by=0.1)
plot(x, dinvgamma(x,1,1), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dinvgamma(x,1,0.6), type="l", col="green")
lines(x, dinvgamma(x,0.6,1), type="l", col="blue")
legend(2, 0.9, expression(paste(alpha==1, ", ", beta==1),
     paste(alpha==1, ", ", beta==0.6),
     paste(alpha==0.6, ", ", beta==1)),
     lty=c(1,1,1), col=c("red","green","blue"))
This is the density function and random generation from the inverse gaussian distribution.
dinvgaussian(x, mu, lambda, log=FALSE)
rinvgaussian(n, mu, lambda)
n |
This is the number of draws from the distribution. |
x |
This is the scalar location to evaluate density. |
mu |
This is the mean parameter, |
lambda |
This is the inverse-variance parameter,
|
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Schrodinger (1915)
Notation 1:
Notation 2:
Parameter 1: shape
Parameter 2: scale
Mean:
Variance:
Mode:
The inverse-Gaussian distribution, also called the Wald distribution, is
used when modeling dependent variables that are positive and
continuous. When
(or variance
to zero), the inverse-Gaussian distribution becomes similar to a normal
(Gaussian) distribution. The name, inverse-Gaussian, is misleading,
because it is not the inverse of a Gaussian distribution, which is
obvious from the fact that
must be positive.
dinvgaussian
gives the density and
rinvgaussian
generates random deviates.
Schrodinger, E. (1915). "Zur Theorie der Fall- und Steigversuche an Teilchen mit Brownscher Bewegung". Physikalische Zeitschrift, 16, p. 289–295.
library(LaplacesDemon)
x <- dinvgaussian(2, 1, 1)
x <- rinvgaussian(10, 1, 1)
#Plot Probability Functions
x <- seq(from=1, to=20, by=0.1)
plot(x, dinvgaussian(x,1,0.5), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dinvgaussian(x,1,1), type="l", col="green")
lines(x, dinvgaussian(x,1,5), type="l", col="blue")
legend(2, 0.9, expression(paste(mu==1, ", ", sigma==0.5),
     paste(mu==1, ", ", sigma==1),
     paste(mu==1, ", ", sigma==5)),
     lty=c(1,1,1), col=c("red","green","blue"))
This function provides the density for the inverse matrix gamma distribution.
dinvmatrixgamma(X, alpha, beta, Psi, log=FALSE)
X |
This is a |
alpha |
This is a scalar shape parameter (the degrees of freedom),
|
beta |
This is a scalar, positive-only scale parameter,
|
Psi |
This is a |
log |
Logical. If |
Application: Continuous Multivariate Matrix
Density:
Inventors: Unknown
Notation 1:
Notation 2:
Parameter 1: shape
Parameter 2: scale
Parameter 3: positive-definite scale
matrix
Mean:
Variance:
Mode:
The inverse matrix gamma (IMG) distribution, also called the inverse
matrix-variate gamma distribution, is a generalization of the inverse gamma
distribution to positive-definite matrices. It is a more general and
flexible version of the inverse Wishart distribution
(dinvwishart
), and is a conjugate prior of the covariance
matrix of a multivariate normal distribution (dmvn
) and
matrix normal distribution (dmatrixnorm
).
The compound distribution resulting from compounding a matrix normal with an inverse matrix gamma prior over the covariance matrix is a generalized matrix t-distribution.
The inverse matrix gamma distribution is identical to the inverse
Wishart distribution when and
.
dinvmatrixgamma
gives the density.
Statisticat, LLC. [email protected]
dinvgamma
dmatrixnorm
,
dmvn
, and
dinvwishart
library(LaplacesDemon)
k <- 10
dinvmatrixgamma(X=diag(k), alpha=(k+1)/2, beta=2, Psi=diag(k), log=TRUE)
dinvwishart(Sigma=diag(k), nu=k+1, S=diag(k), log=TRUE)
These functions provide the density and random number generation for the inverse Wishart distribution.
dinvwishart(Sigma, nu, S, log=FALSE)
rinvwishart(nu, S)
Sigma |
This is the symmetric, positive-definite
|
nu |
This is the scalar degrees of freedom, |
S |
This is the symmetric, positive-semidefinite
|
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: John Wishart (1928)
Notation 1:
Notation 2:
Parameter 1: degrees of freedom
Parameter 2: symmetric, positive-semidefinite
scale matrix
Mean:
Variance:
Mode:
The inverse Wishart distribution is a probability distribution defined on
real-valued, symmetric, positive-definite matrices, and is used as the
conjugate prior for the covariance matrix, , of a
multivariate normal distribution. The inverse-Wishart density is always
finite, and the integral is always finite. A degenerate form occurs when
.
When applicable, the alternative Cholesky parameterization should be
preferred. For more information, see dinvwishartc
.
The inverse Wishart prior lacks flexibility, having only one parameter,
, to control the variability for all
elements. Popular choices for the scale matrix
include an identity matrix or sample covariance matrix. When the model
sample size is small, the specification of the scale matrix can be
influential.
The inverse Wishart distribution has a dependency between variance and correlation, although its relative for a precision matrix (inverse covariance matrix), the Wishart distribution, does not have this dependency. This relationship becomes weaker with more degrees of freedom.
Due to these limitations (lack of flexibility, and dependence between
variance and correlation), alternative distributions have been
developed. Alternative distributions that are available here include
Huang-Wand (dhuangwand
), inverse matrix gamma
(dinvmatrixgamma
), Scaled Inverse Wishart
(dsiw
), and Yang-Berger (dyangberger
).
These functions are parameterized as per Gelman et al. (2004).
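As a minimal sketch of the conjugate role described above, and assuming the usual update for a covariance matrix with a known mean under the Gelman et al. (2004) parameterization (the degrees of freedom increase by the sample size and the scale matrix accumulates the squared deviations), a posterior draw may be obtained as follows:
library(LaplacesDemon)
set.seed(1)
n <- 50
Y <- matrix(rnorm(2*n), n, 2)             #data with known mean c(0,0)
nu <- 3; S <- diag(2)                     #prior degrees of freedom and scale
SS <- crossprod(Y)                        #sum of squared deviations from the mean
Sigma.post <- rinvwishart(nu + n, S + SS) #one posterior draw of the covariance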
dinvwishart
gives the density and
rinvwishart
generates random deviates.
Gelman, A., Carlin, J., Stern, H., and Rubin, D. (2004). "Bayesian Data Analysis, Texts in Statistical Science, 2nd ed.". Chapman and Hall, London.
Wishart, J. (1928). "The Generalised Product Moment Distribution in Samples from a Normal Multivariate Population". Biometrika, 20A(1-2), p. 32–52.
dhuangwand
,
dinvmatrixgamma
,
dinvwishartc
,
dmvn
,
dsiw
,
dwishart
, and
dyangberger
.
library(LaplacesDemon)
x <- dinvwishart(matrix(c(2,-.3,-.3,4),2,2), 3, matrix(c(1,.1,.1,1),2,2))
x <- rinvwishart(3, matrix(c(1,.1,.1,1),2,2))
These functions provide the density and random number generation for the inverse Wishart distribution with the Cholesky parameterization.
dinvwishartc(U, nu, S, log=FALSE)
rinvwishartc(nu, S)
U |
This is the upper-triangular |
nu |
This is the scalar degrees of freedom, |
S |
This is the symmetric, positive-semidefinite
|
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: John Wishart (1928)
Notation 1:
Notation 2:
Parameter 1: degrees of freedom
Parameter 2: symmetric, positive-semidefinite
scale matrix
Mean:
Variance:
Mode:
The inverse Wishart distribution is a probability distribution defined on
real-valued, symmetric, positive-definite matrices, and is used as the
conjugate prior for the covariance matrix, , of a
multivariate normal distribution. In this parameterization,
has been decomposed to the upper-triangular Cholesky
factor
, as per
chol
. The
inverse-Wishart density is always finite, and the integral is always
finite. A degenerate form occurs when .
In practice, is fully unconstrained for proposals
when its diagonal is log-transformed. The diagonal is exponentiated
after a proposal and before other calculations. Overall, the
Cholesky parameterization is faster than the traditional
parameterization. Compared with
dinvwishart
, dinvwishartc
must additionally matrix-multiply the Cholesky back to the covariance
matrix, but it does not have to check the covariance matrix for, or correct
it to, positive-semidefiniteness, which is the slower of the two operations. Compared
with rinvwishart
, rinvwishartc
must additionally
calculate a Cholesky decomposition, and is therefore slower.
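If, as the passage above suggests, dinvwishartc simply rebuilds the covariance matrix from its Cholesky factor and evaluates the same density, then the two parameterizations should return the same value; a minimal sketch of that check:
library(LaplacesDemon)
Sigma <- matrix(c(2,-.3,-.3,4), 2, 2)
S <- matrix(c(1,.1,.1,1), 2, 2)
c(dinvwishart(Sigma, 3, S, log=TRUE),
  dinvwishartc(chol(Sigma), 3, S, log=TRUE)) #the two log-densities should agree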
The inverse Wishart prior lacks flexibility, having only one parameter,
, to control the variability for all
elements. Popular choices for the scale matrix
include an identity matrix or sample covariance matrix. When the model
sample size is small, the specification of the scale matrix can be
influential.
The inverse Wishart distribution has a dependency between variance and correlation, although its relative for a precision matrix (inverse covariance matrix), the Wishart distribution, does not have this dependency. This relationship becomes weaker with more degrees of freedom.
Due to these limitations (lack of flexibility, and dependence between
variance and correlation), alternative distributions have been
developed. Alternative distributions that are available here include the
inverse matrix gamma (dinvmatrixgamma
), Scaled Inverse
Wishart (dsiw
) and Huang-Wand (dhuangwand
).
Huang-Wand is recommended.
dinvwishartc
gives the density and
rinvwishartc
generates random deviates.
Wishart, J. (1928). "The Generalised Product Moment Distribution in Samples from a Normal Multivariate Population". Biometrika, 20A(1-2), p. 32–52.
chol
,
Cov2Prec
,
dhuangwand
,
dinvmatrixgamma
,
dmvn
,
dmvnc
,
dmvtc
,
dsiw
,
dwishart
,
dwishartc
, and
dyangbergerc
.
library(LaplacesDemon)
Sigma <- matrix(c(2,-.3,-.3,4),2,2)
U <- chol(Sigma)
x <- dinvwishartc(U, 3, matrix(c(1,.1,.1,1),2,2))
x <- rinvwishartc(3, matrix(c(1,.1,.1,1),2,2))
These functions provide the density, distribution function, quantile
function, and random generation for the univariate, symmetric, Laplace
distribution with location parameter and scale
parameter
.
dlaplace(x, location=0, scale=1, log=FALSE)
plaplace(q, location=0, scale=1)
qlaplace(p, location=0, scale=1)
rlaplace(n, location=0, scale=1)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
location |
This is the location parameter |
scale |
This is the scale parameter |
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Pierre-Simon Laplace (1774)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: location parameter
Parameter 2: scale parameter
Mean:
Variance:
Mode:
The Laplace distribution (Laplace, 1774) is also called the double
exponential distribution, because it looks like two exponential
distributions back to back with respect to location . It is
also called the “First Law of Laplace”, just as the normal
distribution is referred to as the “Second Law of Laplace”. The
Laplace distribution is symmetric with respect to
, though
there are asymmetric versions of the Laplace distribution. The PDF of
the Laplace distribution is reminiscent of the normal distribution;
however, whereas the normal distribution is expressed in terms of the
squared difference from the mean
, the Laplace density is
expressed in terms of the absolute difference from the mean,
. Consequently, the Laplace distribution has fatter
tails than the normal distribution. It has been argued that the Laplace
distribution fits most things in nature better than the normal
distribution.
There are many extensions to the Laplace distribution, such as the asymmetric Laplace, asymmetric log-Laplace, Laplace (re-parameterized for precision), log-Laplace, multivariate Laplace, and skew-Laplace, among many more.
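As a minimal sketch of the fatter-tails comparison made above (using the standard result that a Laplace distribution with scale lambda has variance 2*lambda^2, so the variance-matched normal has standard deviation sqrt(2)):
library(LaplacesDemon)
1 - plaplace(4, location=0, scale=1) #Laplace tail probability beyond 4
1 - pnorm(4, mean=0, sd=sqrt(2))     #variance-matched normal tail, much smaller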
These functions are similar to those in the VGAM
package.
dlaplace
gives the density,
plaplace
gives the distribution function,
qlaplace
gives the quantile function, and
rlaplace
generates random deviates.
Laplace, P. (1774). "Memoire sur la Probabilite des Causes par les Evenements." l'Academie Royale des Sciences, 6, 621–656. English translation by S.M. Stigler in 1986 as "Memoir on the Probability of the Causes of Events" in Statistical Science, 1(3), p. 359–378.
dalaplace
,
dallaplace
,
dexp
,
dlaplacep
,
dllaplace
,
dmvl
,
dnorm
,
dnormp
,
dnormv
,
dsdlaplace
, and
dslaplace
.
library(LaplacesDemon)
x <- dlaplace(1,0,1)
x <- plaplace(1,0,1)
x <- qlaplace(0.5,0,1)
x <- rlaplace(100,0,1)
#Plot Probability Functions
x <- seq(from=-5, to=5, by=0.1)
plot(x, dlaplace(x,0,0.5), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dlaplace(x,0,1), type="l", col="green")
lines(x, dlaplace(x,0,2), type="l", col="blue")
legend(2, 0.9, expression(paste(mu==0, ", ", lambda==0.5),
     paste(mu==0, ", ", lambda==1),
     paste(mu==0, ", ", lambda==2)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, cumulative, and random generation
for the mixture of univariate Laplace distributions with probability
, location
and scale
.
dlaplacem(x, p, location, scale, log=FALSE)
plaplacem(q, p, location, scale)
rlaplacem(n, p, location, scale)
x , q
|
This is vector of values at which the density will be evaluated. |
p |
This is a vector of length |
n |
This is the number of observations, which must be a positive integer that has length 1. |
location |
This is a vector of length |
scale |
This is a vector of length |
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Unknown
Notation 1:
Notation 2:
Parameter 1: location parameters
Parameter 2: scale parameters
Mean:
Variance:
Mode:
A mixture distribution is a probability distribution that is a combination of other probability distributions, and each distribution is called a mixture component, or component. A probability (or weight) exists for each component, and these probabilities sum to one. A mixture distribution (though not these functions here in particular) may contain mixture components in which each component is a different probability distribution. Mixture distributions are very flexible, and are often used to represent a complex distribution with an unknown form. When the number of mixture components is unknown, Bayesian inference is the only sensible approach to estimation.
A Laplace mixture distribution is a combination of Laplace probability distributions.
One of many applications of Laplace mixture distributions is the Laplace Mixture Model (LMM).
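As a minimal sketch of the mixture definition above, the mixture density should equal the probability-weighted sum of the component Laplace densities (an assumption about how dlaplacem combines its components, consistent with the description of a mixture):
library(LaplacesDemon)
p <- c(0.3, 0.3, 0.4); mu <- c(-5, 1, 5); sigma <- c(1, 2, 1)
all.equal(dlaplacem(0.5, p, mu, sigma),
          sum(p * dlaplace(0.5, mu, sigma)))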
dlaplacem
gives the density,
plaplacem
returns the CDF, and
rlaplacem
generates random deviates.
Statisticat, LLC. [email protected]
ddirichlet
and
dlaplace
.
library(LaplacesDemon)
p <- c(0.3,0.3,0.4)
mu <- c(-5, 1, 5)
sigma <- c(1,2,1)
x <- seq(from=-10, to=10, by=0.1)
plot(x, dlaplacem(x, p, mu, sigma, log=FALSE), type="l") #Density
plot(x, plaplacem(x, p, mu, sigma), type="l") #CDF
plot(density(rlaplacem(10000, p, mu, sigma))) #Random Deviates
These functions provide the density, distribution function, quantile
function, and random generation for the univariate, symmetric, Laplace
distribution with location parameter and precision
parameter
, which is the inverse of the usual scale
parameter,
.
dlaplacep(x, mu=0, tau=1, log=FALSE)
plaplacep(q, mu=0, tau=1)
qlaplacep(p, mu=0, tau=1)
rlaplacep(n, mu=0, tau=1)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mu |
This is the location parameter |
tau |
This is the precision parameter |
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Pierre-Simon Laplace (1774)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: location parameter
Parameter 2: precision parameter
Mean:
Variance:
Mode:
The Laplace distribution is also called the double exponential
distribution, because it looks like two exponential distributions back to
back with respect to location . It is also called the
“First Law of Laplace”, just as the normal distribution is referred to
as the “Second Law of Laplace”. The Laplace distribution is
symmetric with respect to
, though there are asymmetric
versions of the Laplace distribution. The PDF of the Laplace
distribution is reminiscent of the normal distribution; however,
whereas the normal distribution is expressed in terms of the squared
difference from the mean
, the Laplace density is
expressed in terms of the absolute difference from the mean,
. Consequently, the Laplace distribution has fatter
tails than the normal distribution. It has been argued that the Laplace
distribution fits most things in nature better than the normal
distribution. Elsewhere, there are a large number of extensions to the
Laplace distribution, including asymmetric versions and
multivariate versions, among many more. These functions provide the
precision parameterization for convenience and familiarity in Bayesian
inference.
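As a minimal sketch of the precision parameterization described above, the density should match the scale parameterization of dlaplace when the scale is set to 1/tau:
library(LaplacesDemon)
x <- seq(-3, 3, by=0.5)
all.equal(dlaplacep(x, mu=0, tau=2),
          dlaplace(x, location=0, scale=1/2))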
dlaplacep
gives the density,
plaplacep
gives the distribution function,
qlaplacep
gives the quantile function, and
rlaplacep
generates random deviates.
Statisticat, LLC. [email protected]
dalaplace
,
dexp
,
dlaplace
,
dmvl
,
dnorm
,
dnormp
, and
dnormv
.
library(LaplacesDemon)
x <- dlaplacep(1,0,1)
x <- plaplacep(1,0,1)
x <- qlaplacep(0.5,0,1)
x <- rlaplacep(100,0,1)
#Plot Probability Functions
x <- seq(from=-5, to=5, by=0.1)
plot(x, dlaplacep(x,0,0.5), ylim=c(0,1), type="l",
     main="Probability Function", ylab="density", col="red")
lines(x, dlaplacep(x,0,1), type="l", col="green")
lines(x, dlaplacep(x,0,2), type="l", col="blue")
legend(2, 0.9, expression(paste(mu==0, ", ", tau==0.5),
     paste(mu==0, ", ", tau==1),
     paste(mu==0, ", ", tau==2)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density and random generation for the Bayesian LASSO prior distribution.
dlasso(x, sigma, tau, lambda, a=1, b=1, log=FALSE)
rlasso(n, sigma, tau, lambda, a=1, b=1)
x |
This is a location vector of length |
n |
This is the number of observations, which must be a positive integer that has length 1. |
sigma |
This is a positive-only scalar hyperparameter
|
tau |
This is a positive-only vector of hyperparameters,
|
lambda |
This is a positive-only scalar hyperhyperparameter,
|
a , b
|
These are positive-only scalar hyperhyperhyperparameters
for gamma distributed |
log |
Logical. If |
Application: Multivariate Scale Mixture
Density:
Inventor: Park and Casella (2008)
Notation 1:
Notation 2:
Parameter 1: hyperparameter global scale
Parameter 2: hyperparameter local scale
Parameter 3: hyperhyperparameter global scale
Parameter 4: hyperhyperhyperparameter scale
Parameter 5: hyperhyperhyperparameter scale
Mean:
Variance:
Mode:
The Bayesian LASSO distribution (Park and Casella, 2008) is a heavy-tailed mixture distribution that can be considered a variance mixture, and it is in the family of multivariate scale mixtures of normals.
The LASSO distribution was proposed as a prior distribution, as a
Bayesian version of the frequentist LASSO, introduced by Tibshirani
(1996). It is applied as a shrinkage prior in the presence of sparsity
for regression effects. LASSO priors are most appropriate in
large-dimensional models where dimension reduction is necessary to
avoid overly complex models that predict poorly.
The Bayesian LASSO results in regression effects that are a compromise between regression effects in the frequentist LASSO and ridge regression. The Bayesian LASSO applies more shrinkage to weak regression effects than ridge regression.
The Bayesian LASSO is an alternative to horseshoe regression and ridge regression.
dlasso
gives the density and
rlasso
generates random deviates.
Park, T. and Casella, G. (2008). "The Bayesian Lasso". Journal of the American Statistical Association, 103, p. 672–680.
Tibshirani, R. (1996). "Regression Shrinkage and Selection via the Lasso". Journal of the Royal Statistical Society, Series B, 58, p. 267–288.
library(LaplacesDemon)
x <- rnorm(100)
sigma <- rhalfcauchy(1, 5)
tau <- rhalfcauchy(100, 5)
lambda <- rhalfcauchy(1, 5)
x <- dlasso(x, sigma, tau, lambda, log=TRUE)
x <- rlasso(length(tau), sigma, tau, lambda)
These functions provide the density, distribution function, quantile
function, and random generation for the univariate, symmetric,
log-Laplace distribution with location parameter location
and
scale parameter scale
.
dllaplace(x, location=0, scale=1, log=FALSE)
pllaplace(q, location=0, scale=1)
qllaplace(p, location=0, scale=1)
rllaplace(n, location=0, scale=1)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
location |
This is the location parameter |
scale |
This is the scale parameter |
log |
Logical. If |
Application: Continuous Univariate
Density 1:
Density 2:
Inventor: Pierre-Simon Laplace
Notation 1:
Notation 2:
Parameter 1: location parameter
Parameter 2: scale parameter
Mean:
Variance:
Mode:
The univariate, symmetric log-Laplace distribution is derived from the Laplace distribution. Multivariate and asymmetric versions also exist.
These functions are similar to those in the VGAM
package.
dllaplace
gives the density,
pllaplace
gives the distribution function,
qllaplace
gives the quantile function, and
rllaplace
generates random deviates.
Kozubowski, T. J. and Podgorski, K. (2003). "Log-Laplace Distributions". International Mathematical Journal, 3, p. 467–495.
dalaplace
,
dallaplace
,
dexp
,
dlaplace
,
dlaplacep
,
dmvl
,
dnorm
,
dnormp
, and
dnormv
.
library(LaplacesDemon)
x <- dllaplace(1,0,1)
x <- pllaplace(1,0,1)
x <- qllaplace(0.5,0,1)
x <- rllaplace(100,0,1)

#Plot Probability Functions
x <- seq(from=0.1, to=20, by=0.1)
plot(x, dllaplace(x,0,0.1), ylim=c(0,1), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dllaplace(x,0,0.5), type="l", col="green")
lines(x, dllaplace(x,0,1.5), type="l", col="blue")
legend(2, 0.9, expression(paste(mu==0, ", ", lambda==0.1),
     paste(mu==0, ", ", lambda==0.5), paste(mu==0, ", ", lambda==1.5)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, distribution function, quantile
function, and random generation for the univariate log-normal
distribution with mean and precision
.
dlnormp(x, mu, tau=NULL, var=NULL, log=FALSE)
plnormp(q, mu, tau, lower.tail=TRUE, log.p=FALSE)
qlnormp(p, mu, tau, lower.tail=TRUE, log.p=FALSE)
rlnormp(n, mu, tau=NULL, var=NULL)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mu |
This is the mean parameter |
tau |
This is the precision parameter |
var |
This is the variance parameter, which must be positive. Tau and var cannot be used together |
log , log.p
|
Logical. If |
lower.tail |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Carl Friedrich Gauss or Abraham De Moivre
Notation 1:
Notation 2:
Parameter 1: mean parameter
Parameter 2: precision parameter
Mean:
Variance:
Mode:
The log-normal distribution, also called the Galton distribution, is
applied to a variable whose logarithm is normally-distributed. The
distribution is usually parameterized with mean and variance, or in
Bayesian inference, with mean and precision, where precision is the
inverse of the variance. In contrast, Base R
parameterizes the
log-normal distribution with the mean and standard deviation. These
functions provide the precision parameterization for convenience and
familiarity.
A flat distribution is obtained in the limit as
.
These functions are similar to those in base R
.
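Because the precision is the inverse of the variance on the log scale, dlnormp can be compared with base R's dlnorm by converting the precision to a standard deviation. This is only a sketch and assumes sdlog = 1/sqrt(tau):

library(LaplacesDemon)
# Precision tau is assumed to correspond to sdlog = 1/sqrt(tau) in dlnorm
tau <- 4
dlnormp(2, mu=0, tau=tau)
dlnorm(2, meanlog=0, sdlog=1/sqrt(tau))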
dlnormp
gives the density,
plnormp
gives the distribution function,
qlnormp
gives the quantile function, and
rlnormp
generates random deviates.
Statisticat, LLC. [email protected]
dnorm
,
dnormp
,
dnormv
, and
prec2var
.
library(LaplacesDemon)
x <- dlnormp(1,0,1)
x <- plnormp(1,0,1)
x <- qlnormp(0.5,0,1)
x <- rlnormp(100,0,1)

#Plot Probability Functions
x <- seq(from=0.1, to=3, by=0.01)
plot(x, dlnormp(x,0,0.1), ylim=c(0,1), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dlnormp(x,0,1), type="l", col="green")
lines(x, dlnormp(x,0,5), type="l", col="blue")
legend(2, 0.9, expression(paste(mu==0, ", ", tau==0.1),
     paste(mu==0, ", ", tau==1), paste(mu==0, ", ", tau==5)),
     lty=c(1,1,1), col=c("red","green","blue"))
This function provides the density for the matrix gamma distribution.
dmatrixgamma(X, alpha, beta, Sigma, log=FALSE)
X |
This is a |
alpha |
This is a scalar shape parameter (the degrees of freedom),
|
beta |
This is a scalar, positive-only scale parameter,
|
Sigma |
This is a |
log |
Logical. If |
Application: Continuous Multivariate Matrix
Density:
Inventors: Unknown
Notation 1:
Notation 2:
Parameter 1: shape
Parameter 2: scale
Parameter 3: positive-definite scale matrix
Mean:
Variance:
Mode:
The matrix gamma (MG), also called the matrix-variate gamma,
distribution is a generalization of the gamma distribution to
positive-definite matrices. It is a more general and flexible version of
the Wishart distribution (dwishart
), and is a conjugate
prior of the precision matrix of a multivariate normal distribution
(dmvnp
) and matrix normal distribution
(dmatrixnorm
).
The compound distribution resulting from compounding a matrix normal with a matrix gamma prior over the precision matrix is a generalized matrix t-distribution.
The matrix gamma distribution is identical to the Wishart distribution when alpha = nu/2 and beta = 2, as illustrated in the example below.
dmatrixgamma
gives the density.
Statisticat, LLC. [email protected]
dgamma
,
dmatrixnorm
,
dmvnp
, and
dwishart
.
library(LaplacesDemon)
k <- 10
dmatrixgamma(X=diag(k), alpha=(k+1)/2, beta=2, Sigma=diag(k), log=TRUE)
dwishart(Omega=diag(k), nu=k+1, S=diag(k), log=TRUE)
These functions provide the density and random number generation for the matrix normal distribution.
dmatrixnorm(X, M, U, V, log=FALSE)
rmatrixnorm(M, U, V)
X |
This is data or parameters in the form of a matrix with
|
M |
This is mean matrix with |
U |
This is a |
V |
This is a |
log |
Logical. If |
Application: Continuous Multivariate Matrix
Density:
Inventors: Unknown
Notation 1:
Notation 2:
Parameter 1: location matrix
Parameter 2: positive-definite scale matrix
Parameter 3: positive-definite scale matrix
Mean:
Variance: Unknown
Mode: Unknown
The matrix normal distribution is also called the matrix Gaussian, matrix-variate normal, or matrix-variate Gaussian distribution. It is a generalization of the multivariate normal distribution to matrix-valued random variables.
An example of the use of a matrix normal distribution is multivariate
regression, in which there is a matrix of
regression effects of
predictors for
dependent
variables. For univariate regression, having only one dependent
variable, the
regression effects may be multivariate normally
distributed. For multivariate regression, this multivariate normal
distribution may be extended to a matrix normal distribution to account
for relationships of the regression effects across
dependent
variables. In this example, the matrix normal distribution is the
conjugate prior distribution for these regression effects.
The matrix normal distribution has two covariance matrices, one for the
rows and one for the columns. When is diagonal, the rows are
independent. When
is diagonal, the columns are independent.
dmatrixnorm
gives the density and
rmatrixnorm
generates random deviates.
Statisticat, LLC. [email protected]
dinvmatrixgamma
,
dmatrixgamma
, and
dmvn
.
library(LaplacesDemon)
N <- 10
K <- 4
U <- as.positive.definite(matrix(rnorm(N*N),N,N))
V <- as.positive.definite(matrix(rnorm(K*K),K,K))
x <- dmatrixnorm(matrix(0,N,K), matrix(0,N,K), U, V)
X <- rmatrixnorm(matrix(0,N,K), U, V)
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate Cauchy distribution.
dmvc(x, mu, S, log=FALSE)
rmvc(n=1, mu, S)
x |
This is either a vector of length |
n |
This is the number of random draws. |
mu |
This is a numeric vector representing the location parameter,
|
S |
This is a |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Parameter 1: location vector
Parameter 2: positive-definite scale
matrix
Mean:
Variance:
Mode:
The multivariate Cauchy distribution is a multidimensional extension of the one-dimensional or univariate Cauchy distribution. The multivariate Cauchy distribution is equivalent to a multivariate t distribution with 1 degree of freedom. A random vector is considered to be multivariate Cauchy-distributed if every linear combination of its components has a univariate Cauchy distribution.
The Cauchy distribution is known as a pathological distribution because its mean and variance are undefined, and it does not satisfy the central limit theorem.
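Since the multivariate Cauchy distribution is equivalent to a multivariate t distribution with 1 degree of freedom, the two densities may be compared directly. The following sketch (not part of the package examples) uses the scale matrix from the examples below:

library(LaplacesDemon)
mu <- c(1,12,2)
S <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
# The multivariate Cauchy log-density should equal the multivariate t
# log-density with df=1
dmvc(c(0,10,1), mu, S, log=TRUE)
dmvt(c(0,10,1), mu, S, df=1, log=TRUE)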
dmvc
gives the density and
rmvc
generates random deviates.
Statisticat, LLC. [email protected]
dcauchy
,
dinvwishart
,
dmvcp
,
dmvt
, and
dmvtp
.
library(LaplacesDemon)
x <- seq(-2,4,length=21)
y <- 2*x+10
z <- x+cos(y)
mu <- c(1,12,2)
Sigma <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
f <- dmvc(cbind(x,y,z), mu, Sigma)
X <- rmvc(1000, rep(0,2), diag(2))
X <- X[rowSums((X >= quantile(X, probs=0.025)) &
     (X <= quantile(X, probs=0.975)))==2,]
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate Cauchy distribution, given the Cholesky parameterization.
dmvcc(x, mu, U, log=FALSE)
rmvcc(n=1, mu, U)
x |
This is either a vector of length |
n |
This is the number of random draws. |
mu |
This is a numeric vector representing the location parameter,
|
U |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Parameter 1: location vector
Parameter 2: positive-definite scale
matrix
Mean:
Variance:
Mode:
The multivariate Cauchy distribution is a multidimensional extension of the one-dimensional or univariate Cauchy distribution. The multivariate Cauchy distribution is equivalent to a multivariate t distribution with 1 degree of freedom. A random vector is considered to be multivariate Cauchy-distributed if every linear combination of its components has a univariate Cauchy distribution.
The Cauchy distribution is known as a pathological distribution because its mean and variance are undefined, and it does not satisfy the central limit theorem.
In practice, is fully unconstrained for proposals
when its diagonal is log-transformed. The diagonal is exponentiated
after a proposal and before other calculations. Overall, the Cholesky
parameterization is faster than the traditional parameterization.
Compared with
dmvc
, dmvcc
must additionally
matrix-multiply the Cholesky back to the scale matrix, but it
does not have to check for or correct the scale matrix to
positive-definiteness, which overall is slower. Compared with
rmvc
, rmvcc
is faster because the Cholesky decomposition
has already been performed.
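As a sketch of the relationship described above (and assuming U is the upper-triangular Cholesky factor of the scale matrix), dmvcc evaluated at chol(Sigma) should reproduce the dmvc density evaluated at Sigma:

library(LaplacesDemon)
mu <- c(1,12,2)
Sigma <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
# Cholesky parameterization vs. traditional scale-matrix parameterization
dmvc(c(0,10,1), mu, Sigma, log=TRUE)
dmvcc(c(0,10,1), mu, chol(Sigma), log=TRUE)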
dmvcc
gives the density and
rmvcc
generates random deviates.
Statisticat, LLC. [email protected]
chol
,
dcauchy
,
dinvwishartc
,
dmvcpc
,
dmvtc
, and
dmvtpc
.
library(LaplacesDemon)
x <- seq(-2,4,length=21)
y <- 2*x+10
z <- x+cos(y)
mu <- c(1,12,2)
Sigma <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
U <- chol(Sigma)
f <- dmvcc(cbind(x,y,z), mu, U)
X <- rmvcc(1000, rep(0,2), diag(2))
X <- X[rowSums((X >= quantile(X, probs=0.025)) &
     (X <= quantile(X, probs=0.975)))==2,]
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate Cauchy distribution. These functions use the precision parameterization.
dmvcp(x, mu, Omega, log=FALSE)
rmvcp(n=1, mu, Omega)
x |
This is either a vector of length |
n |
This is the number of random draws. |
mu |
This is a numeric vector representing the location parameter,
|
Omega |
This is a |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Parameter 1: location vector
Parameter 2: positive-definite precision
matrix
Mean:
Variance:
Mode:
The multivariate Cauchy distribution is a multidimensional extension of the one-dimensional or univariate Cauchy distribution. A random vector is considered to be multivariate Cauchy-distributed if every linear combination of its components has a univariate Cauchy distribution. The multivariate Cauchy distribution is equivalent to a multivariate t distribution with 1 degree of freedom.
The Cauchy distribution is known as a pathological distribution because its mean and variance are undefined, and it does not satisfy the central limit theorem.
It is usually parameterized with mean and a covariance matrix, or in Bayesian inference, with mean and a precision matrix, where the precision matrix is the matrix inverse of the covariance matrix. These functions provide the precision parameterization for convenience and familiarity. It is easier to calculate a multivariate Cauchy density with the precision parameterization, because a matrix inversion can be avoided.
This distribution has a mean parameter vector of length
, and a
precision matrix
,
which must be positive-definite.
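Because the precision matrix is the matrix inverse of the scale matrix, a hedged consistency check is to compare dmvcp with dmvc applied to the inverted matrix (a sketch only):

library(LaplacesDemon)
mu <- c(1,12,2)
Omega <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
# Precision parameterization vs. dmvc with the inverse of Omega as the scale
dmvcp(c(0,10,1), mu, Omega, log=TRUE)
dmvc(c(0,10,1), mu, solve(Omega), log=TRUE)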
dmvcp
gives the density and
rmvcp
generates random deviates.
Statisticat, LLC. [email protected]
dcauchy
,
dmvc
,
dmvt
,
dmvtp
, and
dwishart
.
library(LaplacesDemon)
x <- seq(-2,4,length=21)
y <- 2*x+10
z <- x+cos(y)
mu <- c(1,12,2)
Omega <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
f <- dmvcp(cbind(x,y,z), mu, Omega)
X <- rmvcp(1000, rep(0,2), diag(2))
X <- X[rowSums((X >= quantile(X, probs=0.025)) &
     (X <= quantile(X, probs=0.975)))==2,]
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate Cauchy distribution. These functions use the precision and Cholesky parameterization.
dmvcpc(x, mu, U, log=FALSE)
rmvcpc(n=1, mu, U)
x |
This is either a vector of length |
n |
This is the number of random draws. |
mu |
This is a numeric vector representing the location parameter,
|
U |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Parameter 1: location vector
Parameter 2: positive-definite precision
matrix
Mean:
Variance:
Mode:
The multivariate Cauchy distribution is a multidimensional extension of the one-dimensional or univariate Cauchy distribution. A random vector is considered to be multivariate Cauchy-distributed if every linear combination of its components has a univariate Cauchy distribution. The multivariate Cauchy distribution is equivalent to a multivariate t distribution with 1 degree of freedom.
The Cauchy distribution is known as a pathological distribution because its mean and variance are undefined, and it does not satisfy the central limit theorem.
It is usually parameterized with mean and a covariance matrix, or in Bayesian inference, with mean and a precision matrix, where the precision matrix is the matrix inverse of the covariance matrix. These functions provide the precision parameterization for convenience and familiarity. It is easier to calculate a multivariate Cauchy density with the precision parameterization, because a matrix inversion can be avoided.
This distribution has a mean parameter vector of length
, and a
precision matrix
, which must be positive-definite. The precision
matrix is replaced with the upper-triangular Cholesky factor, as in
chol
.
In practice, is fully unconstrained for proposals
when its diagonal is log-transformed. The diagonal is exponentiated
after a proposal and before other calculations. Overall, Cholesky
parameterization is faster than the traditional parameterization.
Compared with
dmvcp
, dmvcpc
must additionally
matrix-multiply the Cholesky back to the covariance matrix, but it
does not have to check for or correct the precision matrix to
positive-definiteness, which overall is slower. Compared with
rmvcp
, rmvcpc
is faster because the Cholesky decomposition
has already been performed.
dmvcpc
gives the density and
rmvcpc
generates random deviates.
Statisticat, LLC. [email protected]
chol
,
dcauchy
,
dmvcc
,
dmvtc
,
dmvtpc
, and
dwishartc
.
library(LaplacesDemon)
x <- seq(-2,4,length=21)
y <- 2*x+10
z <- x+cos(y)
mu <- c(1,12,2)
Omega <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
U <- chol(Omega)
f <- dmvcpc(cbind(x,y,z), mu, U)
X <- rmvcpc(1000, rep(0,2), diag(2))
X <- X[rowSums((X >= quantile(X, probs=0.025)) &
     (X <= quantile(X, probs=0.975)))==2,]
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate Laplace distribution.
dmvl(x, mu, Sigma, log=FALSE)
rmvl(n, mu, Sigma)
x |
This is data or parameters in the form of a vector of length
|
n |
This is the number of random draws. |
mu |
This is mean vector |
Sigma |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Fang et al. (1990)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: location vector
Parameter 2: positive-definite
covariance matrix
Mean:
Variance:
Mode:
The multivariate Laplace distribution is a multidimensional extension of the one-dimensional or univariate symmetric Laplace distribution. There are multiple forms of the multivariate Laplace distribution.
The bivariate case was introduced by Ulrich and Chen (1987), and the first form in larger dimensions may have been Fang et al. (1990), which requires a Bessel function. Alternatively, multivariate Laplace was soon introduced as a special case of a multivariate Linnik distribution (Anderson, 1992), and later as a special case of the multivariate power exponential distribution (Fernandez et al., 1995; Ernst, 1998). Bayesian considerations appear in Haro-Lopez and Smith (1999). Wainwright and Simoncelli (2000) presented multivariate Laplace as a Gaussian scale mixture. Kotz et al. (2001) present the distribution formally. Here, the density is calculated with the asymptotic formula for the Bessel function as presented in Wang et al. (2008).
The multivariate Laplace distribution is an attractive alternative to the multivariate normal distribution due to its wider tails, and remains a two-parameter distribution (though alternative three-parameter forms have been introduced as well), unlike the three-parameter multivariate t distribution, which is often used as a robust alternative to the multivariate normal distribution.
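The heavier tails can be seen by comparing log-densities far from the mean under the multivariate Laplace and multivariate normal distributions. This is a rough sketch; the exact values depend on the implementation:

library(LaplacesDemon)
mu <- rep(0, 3)
Sigma <- diag(3)
# Far from the mean, the Laplace log-density is expected to exceed the
# multivariate normal log-density
dmvl(rep(5, 3), mu, Sigma, log=TRUE)
dmvn(rep(5, 3), mu, Sigma, log=TRUE)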
dmvl
gives the density, and
rmvl
generates random deviates.
Statisticat, LLC. [email protected]
Anderson, D.N. (1992). "A Multivariate Linnik Distribution". Statistical Probability Letters, 14, p. 333–336.
Eltoft, T., Kim, T., and Lee, T. (2006). "On the Multivariate Laplace Distribution". IEEE Signal Processing Letters, 13(5), p. 300–303.
Ernst, M. D. (1998). "A Multivariate Generalized Laplace Distribution". Computational Statistics, 13, p. 227–232.
Fang, K.T., Kotz, S., and Ng, K.W. (1990). "Symmetric Multivariate and Related Distributions". Monographs on Statistics and Probability, 36, Chapman-Hall, London.
Fernandez, C., Osiewalski, J. and Steel, M.F.J. (1995). "Modeling and Inference with v-spherical Distributions". Journal of the American Statistical Association, 90, p. 1331–1340.
Gomez, E., Gomez-Villegas, M.A., and Marin, J.M. (1998). "A Multivariate Generalization of the Power Exponential Family of Distributions". Communications in Statistics-Theory and Methods, 27(3), p. 589–600.
Haro-Lopez, R.A. and Smith, A.F.M. (1999). "On Robust Bayesian Analysis for Location and Scale Parameters". Journal of Multivariate Analysis, 70, p. 30–56.
Kotz., S., Kozubowski, T.J., and Podgorski, K. (2001). "The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance". Birkhauser: Boston, MA.
Ulrich, G. and Chen, C.C. (1987). "A Bivariate Double Exponential Distribution and its Generalization". ASA Proceedings on Statistical Computing, p. 127–129.
Wang, D., Zhang, C., and Zhao, X. (2008). "Multivariate Laplace Filter: A Heavy-Tailed Model for Target Tracking". Proceedings of the 19th International Conference on Pattern Recognition: FL.
Wainwright, M.J. and Simoncelli, E.P. (2000). "Scale Mixtures of Gaussians and the Statistics of Natural Images". Advances in Neural Information Processing Systems, 12, p. 855–861.
daml
,
dlaplace
,
dmvn
,
dmvnp
,
dmvpe
,
dmvt
,
dnorm
,
dnormp
, and
dnormv
.
library(LaplacesDemon)
x <- dmvl(c(1,2,3), c(0,1,2), diag(3))
X <- rmvl(1000, c(0,1,2), diag(3))
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate Laplace distribution, given the Cholesky parameterization.
dmvlc(x, mu, U, log=FALSE)
rmvlc(n, mu, U)
x |
This is data or parameters in the form of a vector of length
|
n |
This is the number of random draws. |
mu |
This is mean vector |
U |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Fang et al. (1990)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: location vector
Parameter 2: positive-definite
covariance matrix
Mean:
Variance:
Mode:
The multivariate Laplace distribution is a multidimensional extension of the one-dimensional or univariate symmetric Laplace distribution. There are multiple forms of the multivariate Laplace distribution.
The bivariate case was introduced by Ulrich and Chen (1987), and the first form in larger dimensions may have been Fang et al. (1990), which requires a Bessel function. Alternatively, multivariate Laplace was soon introduced as a special case of a multivariate Linnik distribution (Anderson, 1992), and later as a special case of the multivariate power exponential distribution (Fernandez et al., 1995; Ernst, 1998). Bayesian considerations appear in Haro-Lopez and Smith (1999). Wainwright and Simoncelli (2000) presented multivariate Laplace as a Gaussian scale mixture. Kotz et al. (2001) present the distribution formally. Here, the density is calculated with the asymptotic formula for the Bessel function as presented in Wang et al. (2008).
The multivariate Laplace distribution is an attractive alternative to the multivariate normal distribution due to its wider tails, and remains a two-parameter distribution (though alternative three-parameter forms have been introduced as well), unlike the three-parameter multivariate t distribution, which is often used as a robust alternative to the multivariate normal distribution.
In practice, is fully unconstrained for proposals
when its diagonal is log-transformed. The diagonal is exponentiated
after a proposal and before other calculations. Overall, the Cholesky
parameterization is faster than the traditional parameterization.
Compared with
dmvl
, dmvlc
must additionally
matrix-multiply the Cholesky back to the covariance matrix, but it
does not have to check for or correct the covariance matrix to
positive-definiteness, which overall is slower. Compared with
rmvl
, rmvlc
is faster because the Cholesky decomposition
has already been performed.
dmvlc
gives the density, and
rmvlc
generates random deviates.
Statisticat, LLC. [email protected]
Anderson, D.N. (1992). "A Multivariate Linnik Distribution". Statistical Probability Letters, 14, p. 333–336.
Eltoft, T., Kim, T., and Lee, T. (2006). "On the Multivariate Laplace Distribution". IEEE Signal Processing Letters, 13(5), p. 300–303.
Ernst, M. D. (1998). "A Multivariate Generalized Laplace Distribution". Computational Statistics, 13, p. 227–232.
Fang, K.T., Kotz, S., and Ng, K.W. (1990). "Symmetric Multivariate and Related Distributions". Monographs on Statistics and Probability, 36, Chapman-Hall, London.
Fernandez, C., Osiewalski, J. and Steel, M.F.J. (1995). "Modeling and Inference with v-spherical Distributions". Journal of the American Statistical Association, 90, p. 1331–1340.
Gomez, E., Gomez-Villegas, M.A., and Marin, J.M. (1998). "A Multivariate Generalization of the Power Exponential Family of Distributions". Communications in Statistics-Theory and Methods, 27(3), p. 589–600.
Haro-Lopez, R.A. and Smith, A.F.M. (1999). "On Robust Bayesian Analysis for Location and Scale Parameters". Journal of Multivariate Analysis, 70, p. 30–56.
Kotz., S., Kozubowski, T.J., and Podgorski, K. (2001). "The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance". Birkhauser: Boston, MA.
Ulrich, G. and Chen, C.C. (1987). "A Bivariate Double Exponential Distribution and its Generalization". ASA Proceedings on Statistical Computing, p. 127–129.
Wang, D., Zhang, C., and Zhao, X. (2008). "Multivariate Laplace Filter: A Heavy-Tailed Model for Target Tracking". Proceedings of the 19th International Conference on Pattern Recognition: FL.
Wainwright, M.J. and Simoncelli, E.P. (2000). "Scale Mixtures of Gaussians and the Statistics of Natural Images". Advances in Neural Information Processing Systems, 12, p. 855–861.
chol
,
daml
,
dlaplace
,
dmvnc
,
dmvnpc
,
dmvpec
,
dmvtc
,
dnorm
,
dnormp
, and
dnormv
.
library(LaplacesDemon)
Sigma <- diag(3)
U <- chol(Sigma)
x <- dmvlc(c(1,2,3), c(0,1,2), U)
X <- rmvlc(1000, c(0,1,2), U)
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate normal distribution.
dmvn(x, mu, Sigma, log=FALSE)
rmvn(n=1, mu, Sigma)
x |
This is data or parameters in the form of a vector of length
|
n |
This is the number of random draws. |
mu |
This is mean vector |
Sigma |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventors: Robert Adrain (1808), Pierre-Simon Laplace (1812), and Francis Galton (1885)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: location vector
Parameter 2: positive-definite covariance matrix
Mean:
Variance:
Mode:
The multivariate normal distribution, or multivariate Gaussian
distribution, is a multidimensional extension of the one-dimensional
or univariate normal (or Gaussian) distribution. A random vector is
considered to be multivariate normally distributed if every linear
combination of its components has a univariate normal distribution.
This distribution has a mean parameter vector of length
and a
covariance matrix
, which must be positive-definite.
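When the covariance matrix is diagonal, the components are independent, so the joint log-density is the sum of univariate normal log-densities. The following sketch (not part of the package examples) illustrates this:

library(LaplacesDemon)
x <- c(1,2,3)
mu <- c(0,1,2)
sigma2 <- c(1,4,9)
# With a diagonal covariance matrix, the joint log-density equals the sum
# of independent univariate normal log-densities
dmvn(x, mu, diag(sigma2), log=TRUE)
sum(dnorm(x, mu, sqrt(sigma2), log=TRUE))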
The conjugate prior of the mean vector is another multivariate normal
distribution. The conjugate prior of the covariance matrix is the
inverse Wishart distribution (see dinvwishart
).
When applicable, the alternative Cholesky parameterization should be
preferred. For more information, see dmvnc
.
For models where the dependent variable, Y, is specified to be
distributed multivariate normal given the model, the Mardia test (see
plot.demonoid.ppc
, plot.laplace.ppc
, or
plot.pmc.ppc
) may be used to test the residuals.
dmvn
gives the density and
rmvn
generates random deviates.
Statisticat, LLC. [email protected]
dinvwishart
,
dmatrixnorm
,
dmvnc
,
dmvnp
,
dnorm
,
dnormp
,
dnormv
,
plot.demonoid.ppc
,
plot.laplace.ppc
, and
plot.pmc.ppc
.
library(LaplacesDemon)
x <- dmvn(c(1,2,3), c(0,1,2), diag(3))
X <- rmvn(1000, c(0,1,2), diag(3))
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate normal distribution, given the Cholesky parameterization.
dmvnc(x, mu, U, log=FALSE)
rmvnc(n=1, mu, U)
x |
This is data or parameters in the form of a vector of length
|
n |
This is the number of random draws. |
mu |
This is mean vector |
U |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: location vector
Parameter 2: positive-definite matrix
Mean:
Variance:
Mode:
The multivariate normal distribution, or multivariate Gaussian
distribution, is a multidimensional extension of the one-dimensional
or univariate normal (or Gaussian) distribution. A random vector is
considered to be multivariate normally distributed if every linear
combination of its components has a univariate normal distribution.
This distribution has a mean parameter vector of length
and an upper-triangular
matrix that is
Cholesky factor
, as per the
chol
function for Cholesky decomposition.
In practice, is fully unconstrained for proposals
when its diagonal is log-transformed. The diagonal is exponentiated
after a proposal and before other calculations. Overall, the Cholesky
parameterization is faster than the traditional parameterization.
Compared with
dmvn
, dmvnc
must additionally
matrix-multiply the Cholesky back to the covariance matrix, but it
does not have to check for or correct the covariance matrix to
positive-definiteness, which overall is slower. Compared with
rmvn
, rmvnc
is faster because the Cholesky decomposition
has already been performed.
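A minimal sketch of the relationship described above: passing chol(Sigma) to dmvnc is assumed to reproduce the density that dmvn gives for Sigma itself.

library(LaplacesDemon)
mu <- c(0, 1)
Sigma <- matrix(c(2,0.5,0.5,1), 2, 2)
# Cholesky parameterization vs. covariance parameterization
dmvn(c(1, 2), mu, Sigma, log=TRUE)
dmvnc(c(1, 2), mu, chol(Sigma), log=TRUE)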
For models where the dependent variable, Y, is specified to be
distributed multivariate normal given the model, the Mardia test (see
plot.demonoid.ppc
, plot.laplace.ppc
, or
plot.pmc.ppc
) may be used to test the residuals.
dmvnc
gives the density and
rmvnc
generates random deviates.
Statisticat, LLC. [email protected]
chol
,
dinvwishartc
,
dmvn
,
dmvnp
,
dmvnpc
,
dnorm
,
dnormp
,
dnormv
,
plot.demonoid.ppc
,
plot.laplace.ppc
, and
plot.pmc.ppc
.
library(LaplacesDemon)
Sigma <- diag(3)
U <- chol(Sigma)
x <- dmvnc(c(1,2,3), c(0,1,2), U)
X <- rmvnc(1000, c(0,1,2), U)
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate normal distribution, given the precision parameterization.
dmvnp(x, mu, Omega, log=FALSE)
rmvnp(n=1, mu, Omega)
x |
This is data or parameters in the form of a vector of length
|
n |
This is the number of random draws. |
mu |
This is mean vector |
Omega |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: location vector
Parameter 2: positive-definite precision matrix
Mean:
Variance:
Mode:
The multivariate normal distribution, or multivariate Gaussian distribution, is a multidimensional extension of the one-dimensional or univariate normal (or Gaussian) distribution. It is usually parameterized with mean and a covariance matrix, or in Bayesian inference, with mean and a precision matrix, where the precision matrix is the matrix inverse of the covariance matrix. These functions provide the precision parameterization for convenience and familiarity. It is easier to calculate a multivariate normal density with the precision parameterization, because a matrix inversion can be avoided.
A random vector is considered to be multivariate normally distributed
if every linear combination of its components has a univariate normal
distribution. This distribution has a mean parameter vector
of length
and a
precision matrix
, which must be positive-definite.
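Because the precision matrix is the matrix inverse of the covariance matrix, a hedged check is that dmvnp with Omega agrees with dmvn with solve(Omega):

library(LaplacesDemon)
mu <- c(0, 1)
Omega <- matrix(c(2,-0.5,-0.5,1), 2, 2)
# Precision parameterization vs. covariance parameterization
dmvnp(c(1, 2), mu, Omega, log=TRUE)
dmvn(c(1, 2), mu, solve(Omega), log=TRUE)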
The conjugate prior of the mean vector is another multivariate normal
distribution. The conjugate prior of the precision matrix is the
Wishart distribution (see dwishart
).
When applicable, the alternative Cholesky parameterization should be
preferred. For more information, see dmvnpc
.
For models where the dependent variable, Y, is specified to be
distributed multivariate normal given the model, the Mardia test (see
plot.demonoid.ppc
, plot.laplace.ppc
, or
plot.pmc.ppc
) may be used to test the residuals.
dmvnp
gives the density and
rmvnp
generates random deviates.
Statisticat, LLC. [email protected]
dmvn
,
dmvnc
,
dmvnpc
,
dnorm
,
dnormp
,
dnormv
,
dwishart
,
plot.demonoid.ppc
,
plot.laplace.ppc
, and
plot.pmc.ppc
.
library(LaplacesDemon)
x <- dmvnp(c(1,2,3), c(0,1,2), diag(3))
X <- rmvnp(1000, c(0,1,2), diag(3))
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate normal distribution, given the precision-Cholesky parameterization.
dmvnpc(x, mu, U, log=FALSE)
rmvnpc(n=1, mu, U)
x |
This is data or parameters in the form of a vector of length
|
n |
This is the number of random draws. |
mu |
This is mean vector |
U |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: location vector
Parameter 2: positive-definite precision matrix
Mean:
Variance:
Mode:
The multivariate normal distribution, or multivariate Gaussian
distribution, is a multidimensional extension of the one-dimensional
or univariate normal (or Gaussian) distribution. It is usually
parameterized with mean and a covariance matrix, or in Bayesian
inference, with mean and a precision matrix, where the precision matrix
is the matrix inverse of the covariance matrix. These functions
provide the precision-Cholesky parameterization for convenience and
familiarity. It is easier to calculate a multivariate normal density
with the precision parameterization, because a matrix inversion can be
avoided. The precision matrix is replaced with an upper-triangular
matrix that is Cholesky factor
, as per the
chol
function for Cholesky
decomposition.
A random vector is considered to be multivariate normally distributed
if every linear combination of its components has a univariate normal
distribution. This distribution has a mean parameter vector
of length
and a
precision matrix
, which must be positive-definite.
In practice, is fully unconstrained for proposals
when its diagonal is log-transformed. The diagonal is exponentiated
after a proposal and before other calculations. Overall, Cholesky
parameterization is faster than the traditional parameterization.
Compared with
dmvnp
, dmvnpc
must additionally
matrix-multiply the Cholesky back to the covariance matrix, but it
does not have to check for or correct the precision matrix to
positive-definiteness, which overall is slower. Compared with
rmvnp
, rmvnpc
is faster because the Cholesky decomposition
has already been performed.
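A short sketch combining the precision and Cholesky ideas above: dmvnpc evaluated at chol(Omega) is assumed to match dmvnp evaluated at Omega.

library(LaplacesDemon)
mu <- c(0, 1)
Omega <- matrix(c(2,-0.5,-0.5,1), 2, 2)
# Precision-Cholesky parameterization vs. precision parameterization
dmvnp(c(1, 2), mu, Omega, log=TRUE)
dmvnpc(c(1, 2), mu, chol(Omega), log=TRUE)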
For models where the dependent variable, Y, is specified to be
distributed multivariate normal given the model, the Mardia test (see
plot.demonoid.ppc
, plot.laplace.ppc
, or
plot.pmc.ppc
) may be used to test the residuals.
dmvnpc
gives the density and
rmvnpc
generates random deviates.
Statisticat, LLC. [email protected]
chol
,
dmvn
,
dmvnc
,
dmvnp
,
dnorm
,
dnormp
,
dnormv
,
dwishartc
,
plot.demonoid.ppc
,
plot.laplace.ppc
, and
plot.pmc.ppc
.
library(LaplacesDemon)
Omega <- diag(3)
U <- chol(Omega)
x <- dmvnpc(c(1,2,3), c(0,1,2), U)
X <- rmvnpc(1000, c(0,1,2), U)
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate Polya distribution.
dmvpolya(x, alpha, log=FALSE)
rmvpolya(n, alpha)
x |
This is data or parameters in the form of a vector of length
|
n |
This is the number of random draws to take from the distribution. |
alpha |
This is shape vector |
log |
Logical. If |
Application: Discrete Multivariate
Density:
Inventor: George Polya (1887-1985)
Notation 1:
Notation 3:
Parameter 1: shape parameter vector
Mean:
Variance:
Mode:
The multivariate Polya distribution is named after George Polya
(1887-1985). It is also called the Dirichlet compound multinomial
distribution or the Dirichlet-multinomial distribution. The multivariate
Polya distribution is a compound probability distribution, where a
probability vector is drawn from a Dirichlet distribution with
parameter vector
, and a set of
discrete
samples is drawn from the categorical distribution with probability
vector
and having
discrete categories. The compounding
corresponds to a Polya urn scheme. In document classification, for
example, the distribution is used to represent probabilities over word
counts for different document types. The multivariate Polya distribution
is a multivariate extension of the univariate Beta-binomial distribution.
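The compound construction described above can be sketched with base R alone: draw a probability vector from a Dirichlet distribution (via normalized gamma variates) and then draw multinomial counts from it. This only illustrates the mechanism and is not the package's own sampler:

# Sketch of the Dirichlet-multinomial (multivariate Polya) compound:
# 1. draw p ~ Dirichlet(alpha) via normalized gamma variates,
# 2. draw counts ~ Multinomial(size, p).
alpha <- c(1, 2, 3)
size <- 10
g <- rgamma(length(alpha), shape=alpha)
p <- g / sum(g)
counts <- rmultinom(1, size=size, prob=p)
counts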
dmvpolya
gives the density and rmvpolya
generates random
deviates.
Statisticat, LLC [email protected]
dcat
,
ddirichlet
, and
dmultinom
.
library(LaplacesDemon)
dmvpolya(x=1:3, alpha=1:3, log=TRUE)
x <- rmvpolya(1000, c(0.1,0.3,0.6))
These functions provide the density and random number generation for the multivariate power exponential distribution.
dmvpe(x=c(0,0), mu=c(0,0), Sigma=diag(2), kappa=1, log=FALSE)
rmvpe(n, mu=c(0,0), Sigma=diag(2), kappa=1)
x |
This is data or parameters in the form of a vector of length
|
n |
This is the number of random draws. |
mu |
This is mean vector |
Sigma |
This is the |
kappa |
This is the kurtosis parameter, |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Gomez, Gomez-Villegas, and Marin (1998)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: location vector
Parameter 2: positive-definite
covariance matrix
Parameter 3: kurtosis parameter
Mean:
Variance:
Mode:
The multivariate power exponential distribution, or multivariate exponential power distribution, is a multidimensional extension of the one-dimensional or univariate power exponential distribution. Gomez-Villegas (1998) and Sanchez-Manzano et al. (2002) proposed multivariate and matrix generalizations of the PE family of distributions and studied their properties in relation to multivariate Elliptically Contoured (EC) distributions.
The multivariate power exponential distribution includes the
multivariate normal distribution () and
multivariate Laplace distribution (
) as
special cases, depending on the kurtosis or
parameter. A multivariate uniform occurs as
.
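If, as is usual for this parameterization, kappa = 1 corresponds to the multivariate normal special case (the exact value is treated here as an assumption, since it did not survive in the text above), then the following comparison should agree:

library(LaplacesDemon)
mu <- c(0, 0)
Sigma <- diag(2)
# Assumption: kappa = 1 recovers the multivariate normal density
dmvpe(c(1, 1), mu, Sigma, kappa=1, log=TRUE)
dmvn(c(1, 1), mu, Sigma, log=TRUE)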
If the goal is to use a multivariate Laplace distribution, the
dmvl
function will perform faster and more accurately.
The rmvpe
function is a modified form of the rmvpowerexp function
in the MNM package.
dmvpe
gives the density and
rmvpe
generates random deviates.
Statisticat, LLC. [email protected]
Gomez, E., Gomez-Villegas, M.A., and Marin, J.M. (1998). "A Multivariate Generalization of the Power Exponential Family of Distributions". Communications in Statistics-Theory and Methods, 27(3), p. 589–600.
Sanchez-Manzano, E.G., Gomez-Villegas, M.A., and Marn-Diazaraque, J.M. (2002). "A Matrix Variate Generalization of the Power Exponential Family of Distributions". Communications in Statistics, Part A - Theory and Methods [Split from: J(CommStat)], 31(12), p. 2167–2182.
dlaplace
,
dmvl
,
dmvn
,
dmvnp
,
dnorm
,
dnormp
,
dnormv
, and
dpe
.
library(LaplacesDemon)
n <- 100
k <- 3
x <- matrix(runif(n*k),n,k)
mu <- matrix(runif(n*k),n,k)
Sigma <- diag(k)
dmvpe(x, mu, Sigma, kappa=1)
X <- rmvpe(n, mu, Sigma, kappa=1)
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate power exponential distribution, given the Cholesky parameterization.
dmvpec(x=c(0,0), mu=c(0,0), U, kappa=1, log=FALSE)
rmvpec(n, mu=c(0,0), U, kappa=1)
x |
This is data or parameters in the form of a vector of length
|
n |
This is the number of random draws. |
mu |
This is mean vector |
U |
This is the |
kappa |
This is the kurtosis parameter, |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Gomez, Gomez-Villegas, and Marin (1998)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: location vector
Parameter 2: positive-definite
covariance matrix
Parameter 3: kurtosis parameter
Mean:
Variance:
Mode:
The multivariate power exponential distribution, or multivariate exponential power distribution, is a multidimensional extension of the one-dimensional or univariate power exponential distribution. Gomez-Villegas (1998) and Sanchez-Manzano et al. (2002) proposed multivariate and matrix generalizations of the PE family of distributions and studied their properties in relation to multivariate Elliptically Contoured (EC) distributions.
The multivariate power exponential distribution includes the
multivariate normal distribution () and
multivariate Laplace distribution (
) as
special cases, depending on the kurtosis or
parameter. A multivariate uniform occurs as
.
If the goal is to use a multivariate Laplace distribution, the
dmvlc
function will perform faster and more accurately.
In practice, is fully unconstrained for proposals
when its diagonal is log-transformed. The diagonal is exponentiated
after a proposal and before other calculations. Overall, the Cholesky
parameterization is faster than the traditional parameterization.
Compared with
dmvpe
, dmvpec
must additionally
matrix-multiply the Cholesky back to the covariance matrix, but it
does not have to check for or correct the covariance matrix to
positive-definiteness, which overall is slower. Compared with
rmvpe
, rmvpec
is faster because the Cholesky decomposition
has already been performed.
The rmvpec
function is a modified form of the rmvpowerexp function
in the MNM package.
dmvpec
gives the density and
rmvpec
generates random deviates.
Statisticat, LLC. [email protected]
Gomez, E., Gomez-Villegas, M.A., and Marin, J.M. (1998). "A Multivariate Generalization of the Power Exponential Family of Distributions". Communications in Statistics-Theory and Methods, 27(3), p. 589–600.
Sanchez-Manzano, E.G., Gomez-Villegas, M.A., and Marn-Diazaraque, J.M. (2002). "A Matrix Variate Generalization of the Power Exponential Family of Distributions". Communications in Statistics, Part A - Theory and Methods [Split from: J(CommStat)], 31(12), p. 2167–2182.
chol
,
dlaplace
,
dmvlc
,
dmvnc
,
dmvnpc
,
dnorm
,
dnormp
,
dnormv
, and
dpe
.
library(LaplacesDemon)
n <- 100
k <- 3
x <- matrix(runif(n*k),n,k)
mu <- matrix(runif(n*k),n,k)
Sigma <- diag(k)
U <- chol(Sigma)
dmvpec(x, mu, U, kappa=1)
X <- rmvpec(n, mu, U, kappa=1)
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate t distribution, otherwise called the multivariate Student distribution.
dmvt(x, mu, S, df=Inf, log=FALSE)
rmvt(n=1, mu, S, df=Inf)
x |
This is either a vector of length |
n |
This is the number of random draws. |
mu |
This is a numeric vector or matrix representing the location
parameter, |
S |
This is a |
df |
This is the degrees of freedom, and is often represented
with |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Parameter 1: location vector
Parameter 2: positive-definite scale
matrix
Parameter 3: degrees of freedom (df in the
functions)
Mean: , for
, otherwise undefined
Variance: , for
Mode:
The multivariate t distribution, also called the multivariate Student or
multivariate Student t distribution, is a multidimensional extension of the
one-dimensional or univariate Student t distribution. A random vector is
considered to be multivariate t-distributed if every linear
combination of its components has a univariate Student t-distribution.
This distribution has a mean parameter vector of length
, and a
scale matrix
,
which must be positive-definite. When degrees of freedom
, this is the multivariate Cauchy distribution.
dmvt
gives the density and
rmvt
generates random deviates.
Statisticat, LLC. [email protected]
dinvwishart
,
dmvc
,
dmvcp
,
dmvtp
,
dst
,
dstp
, and
dt
.
library(LaplacesDemon)
x <- seq(-2,4,length=21)
y <- 2*x+10
z <- x+cos(y)
mu <- c(1,12,2)
S <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
df <- 4
f <- dmvt(cbind(x,y,z), mu, S, df)
X <- rmvt(1000, c(0,1,2), S, 5)
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate t distribution, otherwise called the multivariate Student distribution, given the Cholesky parameterization.
dmvtc(x, mu, U, df=Inf, log=FALSE)
rmvtc(n=1, mu, U, df=Inf)
x |
This is either a vector of length |
n |
This is the number of random draws. |
mu |
This is a numeric vector or matrix representing the location
parameter, |
U |
This is the |
df |
This is the degrees of freedom, and is often represented
with |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Parameter 1: location vector
Parameter 2: positive-definite scale
matrix
Parameter 3: degrees of freedom (df in the
functions)
Mean: , for
, otherwise undefined
Variance: , for
Mode:
The multivariate t distribution, also called the multivariate Student or
multivariate Student t distribution, is a multidimensional extension of the
one-dimensional or univariate Student t distribution. A random vector is
considered to be multivariate t-distributed if every linear
combination of its components has a univariate Student t-distribution.
This distribution has a mean parameter vector of length
, and an upper-triangular
matrix that is
Cholesky factor
, as per the
chol
function for Cholesky decomposition. When degrees of freedom
, this is the multivariate Cauchy distribution.
In practice, is fully unconstrained for proposals
when its diagonal is log-transformed. The diagonal is exponentiated
after a proposal and before other calculations. Overall, the Cholesky
parameterization is faster than the traditional parameterization.
Compared with
dmvt
, dmvtc
must additionally
matrix-multiply the Cholesky back to the scale matrix, but it
does not have to check for or correct the scale matrix to
positive-definiteness, which overall is slower. The same is true when
comparing rmvt
and rmvtc
.
dmvtc
gives the density and
rmvtc
generates random deviates.
Statisticat, LLC. [email protected]
chol
,
dinvwishartc
,
dmvc
,
dmvcp
,
dmvtp
,
dst
,
dstp
, and
dt
.
library(LaplacesDemon)
x <- seq(-2,4,length=21)
y <- 2*x+10
z <- x+cos(y)
mu <- c(1,12,2)
S <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
U <- chol(S)
df <- 4
f <- dmvtc(cbind(x,y,z), mu, U, df)
X <- rmvtc(1000, c(0,1,2), U, 5)
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate t distribution, otherwise called the multivariate Student distribution. These functions use the precision parameterization.
dmvtp(x, mu, Omega, nu=Inf, log=FALSE)
rmvtp(n=1, mu, Omega, nu=Inf)
x |
This is either a vector of length |
n |
This is the number of random draws. |
mu |
This is a numeric vector representing the location parameter,
|
Omega |
This is a |
nu |
This is the degrees of freedom |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Parameter 1: location vector
Parameter 2: positive-definite precision
matrix
Parameter 3: degrees of freedom
Mean: , for
, otherwise undefined
Variance: , for
Mode:
The multivariate t distribution, also called the multivariate Student or multivariate Student t distribution, is a multidimensional extension of the one-dimensional or univariate Student t distribution. A random vector is considered to be multivariate t-distributed if every linear combination of its components has a univariate Student t-distribution.
It is usually parameterized with mean and a covariance matrix, or in Bayesian inference, with mean and a precision matrix, where the precision matrix is the matrix inverse of the covariance matrix. These functions provide the precision parameterization for convenience and familiarity. It is easier to calculate a multivariate t density with the precision parameterization, because a matrix inversion can be avoided.
This distribution has a mean parameter vector of length
, and a
precision matrix
, which must be positive-definite. When degrees of
freedom
, this is the multivariate Cauchy distribution.
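As with the other precision parameterizations above, a hedged check is that dmvtp with precision matrix Omega agrees with dmvt using the inverted matrix as the scale:

library(LaplacesDemon)
mu <- c(1,12,2)
Omega <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
nu <- 4
# Precision parameterization vs. scale parameterization
dmvtp(c(0,10,1), mu, Omega, nu, log=TRUE)
dmvt(c(0,10,1), mu, solve(Omega), df=nu, log=TRUE)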
dmvtp
gives the density and
rmvtp
generates random deviates.
Statisticat, LLC. [email protected]
dwishart
,
dmvc
,
dmvcp
,
dmvt
,
dst
,
dstp
, and
dt
.
library(LaplacesDemon)
x <- seq(-2,4,length=21)
y <- 2*x+10
z <- x+cos(y)
mu <- c(1,12,2)
Omega <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
nu <- 4
f <- dmvtp(cbind(x,y,z), mu, Omega, nu)
X <- rmvtp(1000, c(0,1,2), diag(3), 5)
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the multivariate t distribution, otherwise called the multivariate Student distribution. These functions use the precision and Cholesky parameterization.
dmvtpc(x, mu, U, nu=Inf, log=FALSE)
rmvtpc(n=1, mu, U, nu=Inf)
x |
This is either a vector of length |
n |
This is the number of random draws. |
mu |
This is a numeric vector representing the location parameter,
|
U |
This is a |
nu |
This is the degrees of freedom |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Unknown (to me, anyway)
Notation 1:
Notation 2:
Parameter 1: location vector
Parameter 2: positive-definite precision
matrix
Parameter 3: degrees of freedom
Mean: , for
, otherwise undefined
Variance: , for
Mode:
The multivariate t distribution, also called the multivariate Student or multivariate Student t distribution, is a multidimensional extension of the one-dimensional or univariate Student t distribution. A random vector is considered to be multivariate t-distributed if every linear combination of its components has a univariate Student t-distribution.
It is usually parameterized with mean and a covariance matrix, or in
Bayesian inference, with mean and a precision matrix, where the
precision matrix is the matrix inverse of the covariance matrix. These
functions provide the precision parameterization for convenience and
familiarity. It is easier to calculate a multivariate t density
with the precision parameterization, because a matrix inversion can be
avoided. The precision matrix is replaced with an upper-triangular matrix U that is its Cholesky factor, as per the chol function for Cholesky decomposition.
This distribution has a mean parameter vector of length k, and a k x k precision matrix, which must be positive-definite. When the degrees of freedom nu=1, this is the multivariate Cauchy distribution.
In practice, U is fully unconstrained for proposals when its diagonal is log-transformed. The diagonal is exponentiated after a proposal and before other calculations. Overall, the Cholesky parameterization is faster than the traditional parameterization. Compared with dmvtp, dmvtpc must additionally matrix-multiply the Cholesky factor back to the precision matrix, but it does not have to check for or correct the precision matrix for positive-definiteness, which is the slower operation overall. Compared with rmvtp, rmvtpc is faster because the Cholesky decomposition has already been performed.
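The log-transformed diagonal mentioned above can be sketched as follows; this is an illustrative construction (not package code) that maps an unconstrained proposal vector to an upper-triangular Cholesky factor U with a positive diagonal.
library(LaplacesDemon)
k <- 3
par <- rnorm(k*(k+1)/2)                 #unconstrained proposal for the free elements of U
U <- matrix(0, k, k)
U[upper.tri(U, diag=TRUE)] <- par       #fill the upper triangle, including the diagonal
diag(U) <- exp(diag(U))                 #exponentiate the diagonal so it is positive
f <- dmvtpc(rep(0,3), rep(0,3), U, nu=5, log=TRUE)   #evaluate the density given the proposal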
dmvtpc
gives the density and
rmvtpc
generates random deviates.
Statisticat, LLC. [email protected]
chol
,
dwishartc
,
dmvc
,
dmvcp
,
dmvtc
,
dst
,
dstp
, and
dt
.
library(LaplacesDemon)
x <- seq(-2,4,length=21)
y <- 2*x+10
z <- x+cos(y)
mu <- c(1,12,2)
Omega <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
U <- chol(Omega)
nu <- 4
f <- dmvtpc(cbind(x,y,z), mu, U, nu)
X <- rmvtpc(1000, c(0,1,2), U, 5)
joint.density.plot(X[,1], X[,2], color=TRUE)
These functions provide the density and random number generation for the normal-inverse-Wishart distribution.
dnorminvwishart(mu, mu0, lambda, Sigma, S, nu, log=FALSE)
rnorminvwishart(n=1, mu0, lambda, S, nu)
mu |
This is data or parameters in the form of a vector of length
|
mu0 |
This is mean vector |
lambda |
This is a positive-only scalar. |
n |
This is the number of random draws. |
nu |
This is the scalar degrees of freedom |
Sigma |
This is a |
S |
This is the symmetric, positive-semidefinite, |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventors: Unknown
Notation 1:
Notation 2:
Parameter 1: location vector
Parameter 2:
Parameter 3: symmetric, positive-semidefinite
scale matrix
Parameter 4: degrees of freedom
Mean: Unknown
Variance: Unknown
Mode: Unknown
The normal-inverse-Wishart distribution, or Gaussian-inverse-Wishart distribution, is a multivariate four-parameter continuous probability distribution. It is the conjugate prior of a multivariate normal distribution with unknown mean and covariance matrix.
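For intuition, a draw can be composed from the two stages of the hierarchy, assuming the usual definition in which Sigma ~ inverse-Wishart(nu, S) and mu | Sigma ~ N(mu0, Sigma/lambda). This sketches the construction and is not necessarily the internal implementation of rnorminvwishart.
library(LaplacesDemon)
K <- 3
mu0 <- rep(0, K); lambda <- 2; S <- diag(K); nu <- K + 2
Sigma <- rinvwishart(nu, S)           #stage 1: covariance matrix
mu <- rmvn(1, mu0, Sigma / lambda)    #stage 2: mean, given the covariance matrix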
dnorminvwishart
gives the density and
rnorminvwishart
generates random deviates and returns a list
with two components.
Statisticat, LLC. [email protected]
dmvn
and
dinvwishart
.
library(LaplacesDemon)
K <- 3
mu <- rnorm(K)
mu0 <- rnorm(K)
nu <- K + 1
S <- diag(K)
lambda <- runif(1) #Real scalar
Sigma <- as.positive.definite(matrix(rnorm(K^2),K,K))
x <- dnorminvwishart(mu, mu0, lambda, Sigma, S, nu, log=TRUE)
out <- rnorminvwishart(n=10, mu0, lambda, S, nu)
joint.density.plot(out$mu[,1], out$mu[,2], color=TRUE)
These functions provide the density and random generation for the univariate, asymmetric, normal-Laplace distribution with location parameter mu, scale parameter sigma, and tail-behavior parameters alpha and beta.
dnormlaplace(x, mu=0, sigma=1, alpha=1, beta=1, log=FALSE)
rnormlaplace(n, mu=0, sigma=1, alpha=1, beta=1)
x |
This is a vector of data. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mu |
This is the location parameter |
sigma |
This is the scale parameter |
alpha |
This is shape parameter |
beta |
This is shape parameter |
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Reed (2006)
Notation 1:
Notation 2:
Parameter 1: location parameter
Parameter 2: scale parameter
Parameter 3: shape parameter
Parameter 4: shape parameter
Mean:
Variance:
Mode:
The normal-Laplace (NL) distribution is the convolution of a normal
distribution and a skew-Laplace distribution. When the NL distribution
is symmetric (when alpha = beta), it behaves
somewhat like the normal distribution in the middle of its range,
somewhat like the Laplace distribution in its tails, and functions
generally between the normal and Laplace distributions. Skewness is
parameterized by including a skew-Laplace component. It may be applied,
for example, to the logarithmic price of a financial instrument.
Parameters alpha and beta determine the behavior in each of the two tails. A small value corresponds to heaviness in the corresponding tail. As these parameters approach zero, the NL distribution approaches a skew-Laplace distribution. As they approach infinity, the NL distribution approaches a normal distribution, though it never quite reaches it.
dnormlaplace
gives the density, and
rnormlaplace
generates random deviates.
Reed, W.J. (2006). "The Normal-Laplace Distribution and Its Relatives". In Advances in Distribution Theory, Order Statistics and Inference, p. 61–74, Birkhauser, Boston.
dalaplace
,
dallaplace
,
daml
,
dlaplace
, and
dnorm
library(LaplacesDemon)
x <- dnormlaplace(1,0,1,0.5,2)
x <- rnormlaplace(100,0,1,0.5,2)
#Plot Probability Functions
x <- seq(from=-5, to=5, by=0.1)
plot(x, dlaplace(x,0,0.5), ylim=c(0,1), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dlaplace(x,0,1), type="l", col="green")
lines(x, dlaplace(x,0,2), type="l", col="blue")
legend(2, 0.9, expression(paste(mu==0, ", ", lambda==0.5),
     paste(mu==0, ", ", lambda==1), paste(mu==0, ", ", lambda==2)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, cumulative distribution, and random generation for the mixture of univariate normal distributions with probability p, mean mu, and standard deviation sigma.
dnormm(x, p, mu, sigma, log=FALSE)
pnormm(q, p, mu, sigma, lower.tail=TRUE, log.p=FALSE)
rnormm(n, p, mu, sigma)
x , q
|
This is vector of values at which the density will be evaluated. |
p |
This is a vector of length |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mu |
This is a vector of length |
sigma |
This is a vector of length |
lower.tail |
Logical. This defaults to |
log , log.p
|
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Unknown
Notation 1:
Notation 2:
Parameter 1: mean parameters
Parameter 2: standard deviation parameters
Mean:
Variance:
Mode:
A mixture distribution is a probability distribution that is a combination of other probability distributions, and each distribution is called a mixture component, or component. A probability (or weight) exists for each component, and these probabilities sum to one. A mixture distribution (though not these functions here in particular) may contain mixture components in which each component is a different probability distribution. Mixture distributions are very flexible, and are often used to represent a complex distribution with an unknown form. When the number of mixture components is unknown, Bayesian inference is the only sensible approach to estimation.
A normal mixture, or Gaussian mixture, distribution is a combination of normal probability distributions.
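Because a mixture density is a probability-weighted sum of component densities, dnormm can be checked against a hand-computed weighted sum of dnorm terms (a minimal sketch).
library(LaplacesDemon)
p <- c(0.3, 0.3, 0.4); mu <- c(-5, 1, 5); sigma <- c(1, 2, 1)
x <- seq(from=-10, to=10, by=0.5)
by.hand <- sapply(x, function(xi) sum(p * dnorm(xi, mu, sigma)))  #weighted sum of components
all.equal(by.hand, dnormm(x, p, mu, sigma))                       #should be TRUE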
dnormm
gives the density,
pnormm
returns the CDF, and
rnormm
generates random deviates.
Statisticat, LLC. [email protected]
ddirichlet
and
dnorm
.
library(LaplacesDemon)
p <- c(0.3,0.3,0.4)
mu <- c(-5, 1, 5)
sigma <- c(1,2,1)
x <- seq(from=-10, to=10, by=0.1)
plot(x, dnormm(x, p, mu, sigma, log=FALSE), type="l") #Density
plot(x, pnormm(x, p, mu, sigma), type="l") #CDF
plot(density(rnormm(10000, p, mu, sigma))) #Random Deviates
These functions provide the density, distribution function, quantile function, and random generation for the univariate normal distribution with mean and precision.
dnormp(x, mean=0, prec=1, log=FALSE)
pnormp(q, mean=0, prec=1, lower.tail=TRUE, log.p=FALSE)
qnormp(p, mean=0, prec=1, lower.tail=TRUE, log.p=FALSE)
rnormp(n, mean=0, prec=1)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mean |
This is the mean parameter |
prec |
This is the precision parameter |
log , log.p
|
Logical. If |
lower.tail |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Carl Friedrich Gauss or Abraham De Moivre
Notation 1:
Notation 2:
Parameter 1: mean parameter
Parameter 2: precision parameter
Mean:
Variance:
Mode:
The normal distribution, also called the Gaussian distribution and the
Second Law of Laplace, is usually parameterized with mean and variance,
or in Bayesian inference, with mean and precision, where precision is
the inverse of the variance. In contrast, Base R
parameterizes
the normal distribution with the mean and standard deviation. These
functions provide the precision parameterization for convenience and
familiarity.
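Since precision is the inverse of the variance, dnormp should agree with base R's dnorm after converting the precision to a standard deviation (a minimal sketch of the correspondence).
library(LaplacesDemon)
x <- seq(-3, 3, by=0.5)
tau <- 4                                   #precision
all.equal(dnormp(x, mean=0, prec=tau), dnorm(x, mean=0, sd=1/sqrt(tau)))  #should be TRUE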
Some authors attribute credit for the normal distribution to Abraham de Moivre in 1738. In 1809, Carl Friedrich Gauss published his monograph “Theoria motus corporum coelestium in sectionibus conicis solem ambientium”, in which he introduced the method of least squares, method of maximum likelihood, and normal distribution, among many other innovations.
Gauss, himself, characterized this distribution according to mean and precision, though his definition of precision differed from the modern one. The modern Bayesian use of precision developed because it was more straightforward to estimate the precision with a gamma distribution as a conjugate prior than to estimate the variance with an inverse-gamma distribution as a conjugate prior.
Although the normal distribution is very common, it often does not fit data as well as more robust alternatives with fatter tails, such as the Laplace or Student t distribution.
A flat distribution is obtained in the limit as the precision approaches zero.
For models where the dependent variable, y, is specified to be
normally distributed given the model, the Jarque-Bera test (see
plot.demonoid.ppc
or plot.laplace.ppc
) may
be used to test the residuals.
These functions are similar to those in base R
.
dnormp
gives the density,
pnormp
gives the distribution function,
qnormp
gives the quantile function, and
rnormp
generates random deviates.
Statisticat, LLC. [email protected]
dlaplace
,
dnorm
,
dnormv
,
prec2var
,
dst
,
dt
,
plot.demonoid.ppc
, and
plot.laplace.ppc
.
library(LaplacesDemon)
x <- dnormp(1,0,1)
x <- pnormp(1,0,1)
x <- qnormp(0.5,0,1)
x <- rnormp(100,0,1)
#Plot Probability Functions
x <- seq(from=-5, to=5, by=0.1)
plot(x, dnormp(x,0,0.5), ylim=c(0,1), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dnormp(x,0,1), type="l", col="green")
lines(x, dnormp(x,0,5), type="l", col="blue")
legend(2, 0.9, expression(paste(mu==0, ", ", tau==0.5),
     paste(mu==0, ", ", tau==1), paste(mu==0, ", ", tau==5)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, distribution function, quantile function, and random generation for the univariate normal distribution with mean and variance.
dnormv(x, mean=0, var=1, log=FALSE)
pnormv(q, mean=0, var=1, lower.tail=TRUE, log.p=FALSE)
qnormv(p, mean=0, var=1, lower.tail=TRUE, log.p=FALSE)
rnormv(n, mean=0, var=1)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mean |
This is the mean parameter |
var |
This is the variance parameter |
log , log.p
|
Logical. If |
lower.tail |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Carl Friedrich Gauss or Abraham De Moivre
Notation 1:
Notation 2:
Parameter 1: mean parameter
Parameter 2: variance parameter
Mean:
Variance:
Mode:
The normal distribution, also called the Gaussian distribution and the
Second Law of Laplace, is usually parameterized with mean and variance.
Base R
uses the mean and standard deviation. These functions
provide the variance parameterization for convenience and familiarity.
For example, it is easier to code dnormv(1,1,1000) than dnorm(1,1,sqrt(1000)).
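The correspondence with base R can be verified directly (a minimal check).
library(LaplacesDemon)
all.equal(dnormv(1, 1, 1000), dnorm(1, 1, sqrt(1000)))   #densities agree
all.equal(pnormv(1, 1, 1000), pnorm(1, 1, sqrt(1000)))   #distribution functions agree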
Some authors attribute credit for the normal distribution to Abraham de Moivre in 1738. In 1809, Carl Friedrich Gauss published his monograph “Theoria motus corporum coelestium in sectionibus conicis solem ambientium”, in which he introduced the method of least squares, method of maximum likelihood, and normal distribution, among many other innovations.
Gauss, himself, characterized this distribution according to mean and precision, though his definition of precision differed from the modern one.
Although the normal distribution is very common, it often does not fit data as well as more robust alternatives with fatter tails, such as the Laplace or Student t distribution.
A flat distribution is obtained in the limit as the variance approaches infinity.
For models where the dependent variable, y, is specified to be
normally distributed given the model, the Jarque-Bera test (see
plot.demonoid.ppc
or plot.laplace.ppc
) may
be used to test the residuals.
These functions are similar to those in base R
.
dnormv
gives the density,
pnormv
gives the distribution function,
qnormv
gives the quantile function, and
rnormv
generates random deviates.
Statisticat, LLC. [email protected]
dlaplace
,
dnorm
,
dnormp
,
dst
,
dt
,
plot.demonoid.ppc
, and
plot.laplace.ppc
.
library(LaplacesDemon)
x <- dnormv(1,0,1)
x <- pnormv(1,0,1)
x <- qnormv(0.5,0,1)
x <- rnormv(100,0,1)
#Plot Probability Functions
x <- seq(from=-5, to=5, by=0.1)
plot(x, dnormv(x,0,0.5), ylim=c(0,1), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dnormv(x,0,1), type="l", col="green")
lines(x, dnormv(x,0,5), type="l", col="blue")
legend(2, 0.9, expression(paste(mu==0, ", ", sigma^2==0.5),
     paste(mu==0, ", ", sigma^2==1), paste(mu==0, ", ", sigma^2==5)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density and random number generation for the normal-Wishart distribution.
dnormwishart(mu, mu0, lambda, Omega, S, nu, log=FALSE)
rnormwishart(n=1, mu0, lambda, S, nu)
mu |
This is data or parameters in the form of a vector of length
|
mu0 |
This is mean vector |
lambda |
This is a positive-only scalar. |
n |
This is the number of random draws. |
nu |
This is the scalar degrees of freedom |
Omega |
This is a |
S |
This is the symmetric, positive-semidefinite, |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventors: Unknown
Notation 1:
Notation 2:
Parameter 1: location vector
Parameter 2:
Parameter 3: symmetric, positive-semidefinite
scale matrix
Parameter 4: degrees of freedom
Mean: Unknown
Variance: Unknown
Mode: Unknown
The normal-Wishart distribution, or Gaussian-Wishart distribution, is a multivariate four-parameter continuous probability distribution. It is the conjugate prior of a multivariate normal distribution with unknown mean and precision matrix.
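As with the normal-inverse-Wishart, a draw can be composed from the two stages of the hierarchy, assuming the usual definition in which Omega ~ Wishart(nu, S) and mu | Omega ~ N(mu0, (lambda*Omega)^(-1)); rmvnp is the precision-parameterized multivariate normal generator in this package. This sketches the construction rather than the internal code of rnormwishart.
library(LaplacesDemon)
K <- 3
mu0 <- rep(0, K); lambda <- 2; S <- diag(K); nu <- K + 2
Omega <- rwishart(nu, S)                #stage 1: precision matrix
mu <- rmvnp(1, mu0, lambda * Omega)     #stage 2: mean, given the precision matrix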
dnormwishart
gives the density and
rnormwishart
generates random deviates and returns a list with
two components.
Statisticat, LLC. [email protected]
library(LaplacesDemon)
K <- 3
mu <- rnorm(K)
mu0 <- rnorm(K)
nu <- K + 1
S <- diag(K)
lambda <- runif(1) #Real scalar
Omega <- as.positive.definite(matrix(rnorm(K^2),K,K))
x <- dnormwishart(mu, mu0, lambda, Omega, S, nu, log=TRUE)
out <- rnormwishart(n=10, mu0, lambda, S, nu)
joint.density.plot(out$mu[,1], out$mu[,2], color=TRUE)
These functions provide the density, distribution function, quantile function, and random generation for the pareto distribution.
dpareto(x, alpha, log=FALSE)
ppareto(q, alpha)
qpareto(p, alpha)
rpareto(n, alpha)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
alpha |
This is the shape parameter |
log |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Vilfredo Pareto (1848-1923)
Notation 1:
Notation 2:
Parameter 1: shape parameter
Mean:
Variance:
Mode:
The Pareto distribution, sometimes called the Bradford distribution, is
related to the exponential distribution. The gamma distribution is the
conjugate prior distribution for the shape parameter
in the Pareto distribution. The Pareto distribution is the conjugate
prior distribution for the range parameters of a uniform
distribution. An extension, elsewhere, is the symmetric Pareto
distribution.
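A quick sanity check of these functions, assuming the single-parameter form with the minimum fixed at 1: the quantile and distribution functions should invert one another, and inverse-CDF sampling with qpareto should behave like rpareto.
library(LaplacesDemon)
alpha <- 2
p <- c(0.1, 0.5, 0.9)
all.equal(ppareto(qpareto(p, alpha), alpha), p)   #CDF and quantile function invert each other
set.seed(1)
x1 <- rpareto(10000, alpha)
x2 <- qpareto(runif(10000), alpha)                #inverse-CDF sampling
c(median(x1), median(x2), qpareto(0.5, alpha))    #all three should be close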
dpareto
gives the density,
ppareto
gives the distribution function,
qpareto
gives the quantile function, and
rpareto
generates random deviates.
dexp
,
dlnorm
,
dlnormp
,
dnorm
,
dnormp
,
dnormv
.
library(LaplacesDemon)
x <- dpareto(1,1)
x <- ppareto(0.5,1)
x <- qpareto(0.5,1)
x <- rpareto(10,1)
#Plot Probability Functions
x <- seq(from=1, to=5, by=0.01)
plot(x, dpareto(x,0.1), ylim=c(0,1), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dpareto(x,0.5), type="l", col="green")
lines(x, dpareto(x,1), type="l", col="blue")
legend(2, 0.9, expression(alpha==0.1, alpha==0.5, alpha==1),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, distribution function, quantile function, and random generation for the univariate, symmetric, power exponential distribution with location parameter mu, scale parameter sigma, and kurtosis parameter kappa.
dpe(x, mu=0, sigma=1, kappa=2, log=FALSE)
ppe(q, mu=0, sigma=1, kappa=2, lower.tail=TRUE, log.p=FALSE)
qpe(p, mu=0, sigma=1, kappa=2, lower.tail=TRUE, log.p=FALSE)
rpe(n, mu=0, sigma=1, kappa=2)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mu |
This is the location parameter |
sigma |
This is the scale parameter |
kappa |
This is the kurtosis parameter |
log , log.p
|
Logical. If |
lower.tail |
Logical. If |
Application: Continuous Univariate
Density:
Inventor: Subbotin, M.T. (1923)
Notation 1:
Notation 2:
Parameter 1: location parameter
Parameter 2: scale parameter
Parameter 3: kurtosis parameter
Mean:
Variance:
Mode:
The power exponential distribution is also called the exponential power distribution, generalized error distribution, generalized Gaussian distribution, and generalized normal distribution. The original form was introduced by Subbotin (1923) and re-parameterized by Lunetta (1963); these functions use the Lunetta (1963) parameterization. A kurtosis (shape) parameter, kappa, is added to the normal distribution. When kappa=1, the power exponential distribution is the same as the Laplace distribution. When kappa=2, the power exponential distribution is the same as the normal distribution. As kappa approaches infinity, this becomes a uniform distribution. Tails that are heavier than normal occur when kappa < 2, or lighter than normal when kappa > 2. This distribution is univariate and symmetric, and there exist multivariate and asymmetric versions.
These functions are similar to those in the normalp
package.
dpe
gives the density,
ppe
gives the distribution function,
qpe
gives the quantile function, and
rpe
generates random deviates.
Lunetta, G. (1963). "Di una Generalizzazione dello Schema della Curva Normale". Annali della Facoltà di Economia e Commercio di Palermo, 17, p. 237–244.
Subbotin, M.T. (1923). "On the Law of Frequency of Errors". Matematicheskii Sbornik, 31, p. 296–301.
dlaplace
,
dlaplacep
,
dmvpe
,
dnorm
,
dnormp
,
dnormv
, and
dunif
.
library(LaplacesDemon)
x <- dpe(1,0,1,2)
x <- ppe(1,0,1,2)
x <- qpe(0.5,0,1,2)
x <- rpe(100,0,1,2)
#Plot Probability Functions
x <- seq(from=0.1, to=3, by=0.01)
plot(x, dpe(x,0,1,0.1), ylim=c(0,1), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dpe(x,0,1,2), type="l", col="green")
lines(x, dpe(x,0,1,5), type="l", col="blue")
legend(1.5, 0.9, expression(paste(mu==0, ", ", sigma==1, ", ", kappa==0.1),
     paste(mu==0, ", ", sigma==1, ", ", kappa==2),
     paste(mu==0, ", ", sigma==1, ", ", kappa==5)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density and random number generation for the scaled inverse Wishart distribution.
dsiw(Q, nu, S, zeta, mu, delta, log=FALSE)
rsiw(nu, S, mu, delta)
Q |
This is the symmetric, positive-definite
|
nu |
This is the scalar degrees of freedom, |
S |
This is the symmetric, positive-semidefinite
|
zeta |
This is a positive-only vector of length |
mu |
This is a vector of length |
delta |
This is a positive-only vector of length |
log |
Logical. If |
Application: Continuous Multivariate
Density: (see below)
Inventor: O'Malley and Zaslavsky (2005)
Notation 1:
Notation 2:
Parameter 1: symmetric, positive-definite
matrix
Parameter 2: degrees of freedom
Parameter 3: symmetric, positive-semidefinite
scale matrix
Parameter 4: Auxiliary scale parameter vector
Parameter 5: Hyperparameter location vector
Parameter 6: Hyperparameter scale vector
Mean:
Variance:
Mode:
The scaled inverse Wishart (SIW) distribution is a prior probability distribution for a covariance matrix, and is an alternative to the inverse Wishart distribution.
While the inverse Wishart distribution is applied directly to a covariance matrix, the SIW distribution is applied to a decomposed matrix Q and a diagonal scale matrix built from zeta. For information on how to apply it to the covariance matrix, see the example below.
SIW is more flexible than the inverse Wishart distribution because it has additional, and some say somewhat redundant, scale parameters. This makes up for one limitation of the inverse Wishart, namely that all uncertainty about posterior variances is represented in one parameter. The SIW prior may somewhat alleviate the dependency in the inverse Wishart between variances and correlations, though the SIW prior still retains some of this relationship.
The Huang-Wand (dhuangwand
) prior is a hierarchical
alternative.
dsiw
gives the density and
rsiw
generates random deviates.
O'Malley, A.J. and Zaslavsky, A.M. (2005), "Domain-Level Covariance Analysis for Survey Data with Structured Nonresponse".
dhuangwand
,
dinvwishartc
,
dmvn
, and
dwishart
.
library(LaplacesDemon)
### In the model specification function, input U and zeta, then:
# Q <- t(U) %*% U
# Zeta <- diag(zeta)
# Sigma <- Zeta %*% Q %*% Zeta
# Sigma.prior <- dsiw(Q, nu=Data$K+1, S=diag(Data$K), zeta, mu=0, delta=1)
### Examples
x <- dsiw(diag(3), 4, diag(3), runif(3), rep(0,3), rep(1,3), log=TRUE)
x <- rsiw(4, diag(3), rep(0,3), rep(1,3))
These functions provide the density, distribution function, quantile function, and random generation for the univariate, skew discrete Laplace distribution with parameters p and q.
dsdlaplace(x, p, q, log=FALSE)
psdlaplace(x, p, q)
qsdlaplace(prob, p, q)
rsdlaplace(n, p, q)
x |
This is a vector of data. |
p |
This is a scalar or vector of parameter |
q |
This is a scalar or vector of parameter |
prob |
This is a probability scalar or vector. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
log |
Logical. If |
Application: Discrete Univariate
Density 1:
Density 2:
Inventor: Kozubowski, T.J. and Inusah, S. (2006)
Notation 1:
Notation 2:
Parameter 1:
Parameter 2:
Mean 1:
Mean 2:
Variance:
Mode:
This is a discrete form of the skew-Laplace distribution. The symmetric discrete Laplace distribution occurs when p=q. DL(p,0) is a geometric distribution, and DL(0,q) is a geometric distribution of non-positive integers. The distribution DL(0,0) is degenerate. Since the geometric distribution is a discrete analog of the exponential distribution, the distribution of the difference of two geometric variables is a discrete Laplace distribution.
These functions are similar to those in the DiscreteLaplace
package.
dsdlaplace gives the density, psdlaplace gives the distribution function, qsdlaplace gives the quantile function, and rsdlaplace generates random deviates.
Kozubowski, T.J. and Inusah, S. (2006). "A Skew Laplace Distribution on Integers". AISM, 58, p. 555–571.
dalaplace
,
dexp
,
dlaplace
,
dlaplacep
, and
dslaplace
.
library(LaplacesDemon)
x <- dsdlaplace(1,0.5,0.5)
x <- psdlaplace(1,0.5,0.5)
x <- qsdlaplace(0.5,0.5,0.5)
x <- rsdlaplace(5,0.5,0.5)
#Plot Probability Functions
x <- c(-3:3)
plot(x, dsdlaplace(x,0.5,0.5), ylim=c(0,0.6), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dsdlaplace(x,0.3,0.6), type="l", col="green")
lines(x, dsdlaplace(x,0.9,0.1), type="l", col="blue")
legend(-2.5, 0.5, expression(paste(p==0.5, ", ", q==0.5),
     paste(p==0.3, ", ", q==0.6), paste(p==0.9, ", ", q==0.1)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, distribution function, quantile function, and random generation for the univariate, skew-Laplace distribution with location parameter mu and two mixture parameters, alpha and beta.
dslaplace(x, mu, alpha, beta, log=FALSE)
pslaplace(q, mu, alpha, beta)
qslaplace(p, mu, alpha, beta)
rslaplace(n, mu, alpha, beta)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mu |
This is the location parameter |
alpha |
This is a mixture parameter |
beta |
This is a mixture parameter |
log |
Logical. If |
Application: Continuous Univariate
Density 1:
Density 2:
Inventor: Fieller, et al. (1992)
Notation 1:
Notation 2:
Parameter 1: location parameter
Parameter 2: mixture parameter
Parameter 3: mixture parameter
Mean:
Variance:
Mode:
This is the three-parameter general skew-Laplace distribution, which is an extension of the two-parameter central skew-Laplace distribution. The general form allows the mode to be shifted along the real line with parameter mu. In contrast, the central skew-Laplace has mode zero, and may be reproduced here by setting mu=0.
The general skew-Laplace distribution is a mixture of a negative exponential distribution with mean equal to one of the mixture parameters, and the negative of an exponential distribution with mean equal to the other. The weights of the positive and negative components are proportional to their means. The distribution is symmetric when alpha=beta, in which case the mean is mu.
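The mixture construction described above can be sketched by simulation. Here m1 and m2 are hypothetical means for the positive and negated exponential components (they are not tied here to a particular assignment of alpha and beta), with mixture weights proportional to those means.
set.seed(42)
n <- 1e5
mu <- 0; m1 <- 0.5; m2 <- 2                        #means of the two exponential components
w <- m1/(m1 + m2)                                  #weight of the positive component
pos <- rbinom(n, 1, w)
x <- ifelse(pos == 1, mu + rexp(n, rate=1/m1),     #exponential component with mean m1
                      mu - rexp(n, rate=1/m2))     #negated exponential component with mean m2
c(mean(x), mu + m1 - m2)                           #simulated mean vs. implied mean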
These functions are similar to those in the HyperbolicDist
package.
dslaplace
gives the density,
pslaplace
gives the distribution function,
qslaplace
gives the quantile function, and
rslaplace
generates random deviates.
Fieller, N.J., Flenley, E.C., and Olbricht, W. (1992). "Statistics of Particle Size Data". Applied Statistics, 41, p. 127–146.
dalaplace
,
dexp
,
dlaplace
,
dlaplacep
, and
dsdlaplace
.
library(LaplacesDemon)
x <- dslaplace(1,0,1,1)
x <- pslaplace(1,0,1,1)
x <- qslaplace(0.5,0,1,1)
x <- rslaplace(100,0,1,1)
#Plot Probability Functions
x <- seq(from=0.1, to=3, by=0.01)
plot(x, dslaplace(x,0,1,1), ylim=c(0,1), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dslaplace(x,0,0.5,2), type="l", col="green")
lines(x, dslaplace(x,0,2,0.5), type="l", col="blue")
legend(1.5, 0.9, expression(paste(mu==0, ", ", alpha==1, ", ", beta==1),
     paste(mu==0, ", ", alpha==0.5, ", ", beta==2),
     paste(mu==0, ", ", alpha==2, ", ", beta==0.5)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density and random number generation of the original, truncated stick-breaking (TSB) prior distribution given theta and gamma, as per Ishwaran and James (2001).
dStick(theta, gamma, log=FALSE)
rStick(M, gamma)
M |
This accepts an integer that is equal to one less than the truncated number of possible mixture components. |
theta |
This is |
gamma |
This is |
log |
Logical. If |
Application: Discrete Multivariate
Density:
Inventor: Sethuraman, J. (1994)
Notation 1:
Notation 2:
Notation 3:
Notation 4:
Parameter 1: shape parameter
Parameter 2: shape parameter
Mean:
Variance:
Mode:
The original truncated stick-breaking (TSB) prior distribution assigns each element of theta to be beta-distributed given gamma (Ishwaran and James, 2001). This distribution is commonly used in truncated Dirichlet processes (TDPs).
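The stick-breaking construction itself is easy to sketch: draw the breaking proportions theta (for illustration, from Beta(1, gamma), a common choice in a truncated Dirichlet process) and convert them to mixture probabilities that sum to one. The conversion is written out by hand below; see also the Stick function.
library(LaplacesDemon)
set.seed(1)
gamma <- 0.5
M <- 4                                          #one less than the number of mixture components
theta <- rbeta(M, 1, gamma)                     #breaking proportions (assumed Beta(1, gamma))
probs <- c(theta, 1) * cumprod(c(1, 1 - theta)) #pi_k = theta_k * prod_{j<k} (1 - theta_j)
sum(probs)                                      #equals 1 by construction
dStick(theta, gamma, log=TRUE)                  #density of theta under the TSB prior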
dStick gives the density and rStick generates a random deviate vector of length M.
Ishwaran, H. and James, L. (2001). "Gibbs Sampling Methods for Stick Breaking Priors". Journal of the American Statistical Association, 96(453), p. 161–173.
Sethuraman, J. (1994). "A Constructive Definition of Dirichlet Priors". Statistica Sinica, 4, p. 639–650.
ddirichlet
,
dmvpolya
, and
Stick
.
library(LaplacesDemon)
dStick(runif(4), 0.1)
rStick(4, 0.1)
These functions provide the density, distribution function, quantile function, and random generation for the univariate Student t distribution with location parameter mu, scale parameter sigma, and degrees of freedom parameter nu.
dst(x, mu=0, sigma=1, nu=10, log=FALSE)
pst(q, mu=0, sigma=1, nu=10, lower.tail=TRUE, log.p=FALSE)
qst(p, mu=0, sigma=1, nu=10, lower.tail=TRUE, log.p=FALSE)
rst(n, mu=0, sigma=1, nu=10)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mu |
This is the location parameter |
sigma |
This is the scale parameter |
nu |
This is the degrees of freedom parameter |
lower.tail |
Logical. If |
log , log.p
|
Logical. If |
Application: Continuous Univariate
Density:
Inventor: William Sealy Gosset (1908)
Notation 1:
Notation 2:
Parameter 1: location parameter
Parameter 2: scale parameter
Parameter 3: degrees of freedom
Mean: , for
, otherwise undefined
Variance: , for
Mode:
The Student t-distribution is often used as an alternative to the normal distribution as a model for data. It is frequently the case that real data have heavier tails than the normal distribution allows for. The classical approach was to identify outliers and exclude or downweight them in some way. However, it is not always easy to identify outliers (especially in high dimensions), and the Student t-distribution is a natural choice of model-form for such data. It provides a parametric approach to robust statistics.
The degrees of freedom parameter, nu, controls the kurtosis of the distribution, and is correlated with the scale parameter sigma. The likelihood can have multiple local maxima and, as such, it is often necessary to fix nu at a fairly low value and estimate the other parameters taking this as given. Some authors report that values between 3 and 9 are often good choices, and some authors suggest 5 is often a good choice.
In the limit as nu approaches infinity, the Student t-distribution approaches the normal distribution. The case of nu=1 is the Cauchy distribution.
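Because this is a location-scale extension of the standard t distribution, dst can be checked against base R's dt after standardizing (a minimal sketch).
library(LaplacesDemon)
x <- seq(-4, 4, by=0.5); mu <- 1; sigma <- 2; nu <- 5
all.equal(dst(x, mu, sigma, nu), dt((x - mu)/sigma, nu)/sigma)   #should be TRUE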
The pst
and qst
functions are similar to those in the
gamlss.dist
package.
dst
gives the density,
pst
gives the distribution function,
qst
gives the quantile function, and
rst
generates random deviates.
dcauchy
,
dmvt
,
dmvtp
,
dnorm
,
dnormp
,
dnormv
,
dstp
, and
dt
.
library(LaplacesDemon)
x <- dst(1,0,1,10)
x <- pst(1,0,1,10)
x <- qst(0.5,0,1,10)
x <- rst(100,0,1,10)
#Plot Probability Functions
x <- seq(from=-5, to=5, by=0.1)
plot(x, dst(x,0,1,0.1), ylim=c(0,1), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dst(x,0,1,1), type="l", col="green")
lines(x, dst(x,0,1,10), type="l", col="blue")
legend(1, 0.9, expression(paste(mu==0, ", ", sigma==1, ", ", nu==0.5),
     paste(mu==0, ", ", sigma==1, ", ", nu==1),
     paste(mu==0, ", ", sigma==1, ", ", nu==10)),
     lty=c(1,1,1), col=c("red","green","blue"))
These functions provide the density, distribution function, quantile function, and random generation for the univariate Student t distribution with location parameter mu, precision parameter tau, and degrees of freedom parameter nu.
dstp(x, mu=0, tau=1, nu=10, log=FALSE)
pstp(q, mu=0, tau=1, nu=10, lower.tail=TRUE, log.p=FALSE)
qstp(p, mu=0, tau=1, nu=10, lower.tail=TRUE, log.p=FALSE)
rstp(n, mu=0, tau=1, nu=10)
x , q
|
These are each a vector of quantiles. |
p |
This is a vector of probabilities. |
n |
This is the number of observations, which must be a positive integer that has length 1. |
mu |
This is the location parameter |
tau |
This is the precision parameter |
nu |
This is the degrees of freedom parameter |
lower.tail |
Logical. If |
log , log.p
|
Logical. If |
Application: Continuous Univariate
Density:
Inventor: William Sealy Gosset (1908)
Notation 1:
Notation 2:
Parameter 1: location parameter
Parameter 2: precision parameter
Parameter 3: degrees of freedom
Mean: , for
, otherwise undefined
Variance: , for
Mode:
The Student t-distribution is often used as an alternative to the normal distribution as a model for data. It is frequently the case that real data have heavier tails than the normal distribution allows for. The classical approach was to identify outliers and exclude or downweight them in some way. However, it is not always easy to identify outliers (especially in high dimensions), and the Student t-distribution is a natural choice of model-form for such data. It provides a parametric approach to robust statistics.
The degrees of freedom parameter, nu, controls the kurtosis of the distribution, and is correlated with the precision parameter tau. The likelihood can have multiple local maxima and, as such, it is often necessary to fix nu at a fairly low value and estimate the other parameters taking this as given. Some authors report that values between 3 and 9 are often good choices, and some authors suggest 5 is often a good choice.
In the limit as nu approaches infinity, the Student t-distribution approaches the normal distribution. The case of nu=1 is the Cauchy distribution.
dstp
gives the density,
pstp
gives the distribution function,
qstp
gives the quantile function, and
rstp
generates random deviates.
dcauchy
,
dmvt
,
dmvtp
,
dnorm
,
dnormp
,
dnormv
,
dst
,
dt
.
library(LaplacesDemon)
x <- dstp(1,0,1,10)
x <- pstp(1,0,1,10)
x <- qstp(0.5,0,1,10)
x <- rstp(100,0,1,10)
#Plot Probability Functions
x <- seq(from=-5, to=5, by=0.1)
plot(x, dstp(x,0,1,0.1), ylim=c(0,1), type="l", main="Probability Function",
     ylab="density", col="red")
lines(x, dstp(x,0,1,1), type="l", col="green")
lines(x, dstp(x,0,1,10), type="l", col="blue")
legend(1, 0.9, expression(paste(mu==0, ", ", tau==1, ", ", nu==0.5),
     paste(mu==0, ", ", tau==1, ", ", nu==1),
     paste(mu==0, ", ", tau==1, ", ", nu==10)),
     lty=c(1,1,1), col=c("red","green","blue"))
Density, distribution function, quantile function and random generation for truncated distributions.
dtrunc(x, spec, a=-Inf, b=Inf, log=FALSE, ...)
extrunc(spec, a=-Inf, b=Inf, ...)
ptrunc(x, spec, a=-Inf, b=Inf, ...)
qtrunc(p, spec, a=-Inf, b=Inf, ...)
rtrunc(n, spec, a=-Inf, b=Inf, ...)
vartrunc(spec, a=-Inf, b=Inf, ...)
n |
This is the number of random draws for |
p |
This is a vector of probabilities. |
x |
This is a vector to be evaluated. |
spec |
The base name of a probability distribution is
specified here. For example, to estimate the density of a
truncated normal distribution, enter |
a |
This is the lower bound of truncation, which defaults to negative infinity. |
b |
This is the upper bound of truncation, which defaults to infinity. |
log |
Logical. If |
... |
Additional arguments pertain to the probability
distribution specified in the |
A truncated distribution is a conditional distribution that results
from a priori restricting the domain of some other probability
distribution. More than merely preventing values outside of truncated
bounds, a proper truncated distribution integrates to one within the
truncated bounds. For more information on propriety, see
is.proper
. In contrast to a truncated distribution, a
censored distribution occurs when the probability distribution is
still allowed outside of a pre-specified range. Here, distributions are truncated to the interval [a, b].
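The claim that a proper truncated density integrates to one within its bounds can be checked numerically; a minimal sketch with a truncated normal distribution follows.
library(LaplacesDemon)
f <- function(x) dtrunc(x, "norm", a=-0.5, b=0.5, mean=0, sd=2)
integrate(f, lower=-0.5, upper=0.5)$value   #approximately 1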
The dtrunc
function is often used in conjunction with the
interval
function to truncate prior probability
distributions in the model specification function for use with these
numerical approximation functions: LaplaceApproximation
,
LaplacesDemon
, and PMC
.
The R code of Nadarajah and Kotz (2006) has been modified to work with log-densities.
dtrunc
gives the density,
extrunc
gives the expectation,
ptrunc
gives the distribution function,
qtrunc
gives the quantile function,
rtrunc
generates random deviates, and
vartrunc
gives the variance of the truncated distribution.
Nadarajah, S. and Kotz, S. (2006). "R Programs for Computing Truncated Distributions". Journal of Statistical Software, 16, Code Snippet 2, p. 1–8.
interval
,
is.proper
,
LaplaceApproximation
,
LaplacesDemon
, and
PMC
.
library(LaplacesDemon)
x <- seq(-0.5, 0.5, by = 0.1)
y <- dtrunc(x, "norm", a=-0.5, b=0.5, mean=0, sd=2)
These functions provide the density and random number generation for the Wishart distribution.
dwishart(Omega, nu, S, log=FALSE)
rwishart(nu, S)
Omega |
This is the symmetric, positive-definite |
nu |
This is the scalar degrees of freedom |
S |
This is the symmetric, positive-semidefinite, |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: John Wishart (1928)
Notation 1:
Notation 2:
Parameter 1: degrees of freedom
Parameter 2: symmetric, positive-semidefinite
scale matrix
Mean:
Variance:
Mode: , for
The Wishart distribution is a generalization to multiple dimensions of the chi-square distribution, or, in the case of non-integer degrees of freedom, of the gamma distribution. However, the Wishart distribution is not called the multivariate chi-squared distribution because the marginal distribution of the off-diagonal elements is not chi-squared.
The Wishart is the conjugate prior distribution for the precision matrix, the inverse of which (the covariance matrix) is used in a multivariate normal distribution.
The integral is finite when nu >= k, where nu is the scalar degrees of freedom parameter and k is the dimension of the scale matrix S. The density is finite when nu >= k + 1, which is recommended.
The degrees of freedom, nu, is equivalent to specifying a prior sample size, indicating the confidence in S, where S is a prior guess at the order of the covariance matrix. A flat prior distribution is obtained as nu approaches zero.
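Since the mean of the Wishart distribution under this parameterization is nu times the scale matrix, a Monte Carlo average of rwishart draws should be close to nu*S (a minimal sketch).
library(LaplacesDemon)
set.seed(1)
nu <- 5; S <- matrix(c(1, 0.1, 0.1, 1), 2, 2)
draws <- replicate(5000, rwishart(nu, S))   #2 x 2 x 5000 array of deviates
apply(draws, c(1, 2), mean)                 #Monte Carlo mean, close to nu * S
nu * S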
When applicable, the alternative Cholesky parameterization should be
preferred. For more information, see dwishartc
.
The Wishart prior lacks flexibility, having only one parameter,
, to control the variability for all
elements. Popular choices for the scale matrix
include an identity matrix or sample covariance matrix. When the model
sample size is small, the specification of the scale matrix can be
influential.
Although the related inverse Wishart distribution has a dependency between variance and correlation, the Wishart distribution does not have this dependency.
The matrix gamma (dmatrixgamma
) distribution is a more
general version of the Wishart distribution, and the Yang-Berger
(dyangberger
) distribution is an alternative that is a
least informative prior (LIP).
dwishart
gives the density and
rwishart
generates random deviates.
Wishart, J. (1928). "The Generalised Product Moment Distribution in Samples from a Normal Multivariate Population". Biometrika, 20A(1-2), p. 32–52.
dchisq
,
dgamma
,
dinvwishart
,
dmatrixgamma
,
dmvnp
,
dwishartc
,
Prec2Cov
, and
dyangberger
.
library(LaplacesDemon)
x <- dwishart(matrix(c(2,-.3,-.3,4),2,2), 3, matrix(c(1,.1,.1,1),2,2))
x <- rwishart(3, matrix(c(1,.1,.1,1),2,2))
These functions provide the density and random number generation for the Wishart distribution with the Cholesky parameterization.
dwishartc(U, nu, S, log=FALSE)
rwishartc(nu, S)
U |
This is the upper-triangular |
nu |
This is the scalar degrees of freedom |
S |
This is the symmetric, positive-semidefinite, |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: John Wishart (1928)
Notation 1:
Notation 2:
Parameter 1: degrees of freedom
Parameter 2: symmetric, positive-semidefinite
scale matrix
Mean:
Variance:
Mode: , for
The Wishart distribution is a generalization to multiple dimensions of the chi-square distribution, or, in the case of non-integer degrees of freedom, of the gamma distribution. However, the Wishart distribution is not called the multivariate chi-squared distribution because the marginal distribution of the off-diagonal elements is not chi-squared.
The Wishart is the conjugate prior distribution for the precision matrix, the inverse of which (the covariance matrix) is used in a multivariate normal distribution. In this parameterization, the precision matrix has been decomposed to the upper-triangular Cholesky factor U, as per chol.
The integral is finite when nu >= k, where nu is the scalar degrees of freedom parameter and k is the dimension of the scale matrix S. The density is finite when nu >= k + 1, which is recommended.
The degrees of freedom, nu, is equivalent to specifying a prior sample size, indicating the confidence in S, where S is a prior guess at the order of the covariance matrix. A flat prior distribution is obtained as nu approaches zero.
In practice, U is fully unconstrained for proposals when its diagonal is log-transformed. The diagonal is exponentiated after a proposal and before other calculations. Overall, the Cholesky parameterization is faster than the traditional parameterization. Compared with dwishart, dwishartc must additionally matrix-multiply the Cholesky factor back to the precision matrix, but it does not have to check for or correct the precision matrix for positive-semidefiniteness, which is the slower operation overall. Compared with rwishart, rwishartc must additionally calculate a Cholesky decomposition, and is therefore slower.
The Wishart prior lacks flexibility, having only one parameter,
, to control the variability for all
elements. Popular choices for the scale matrix
include an identity matrix or sample covariance matrix. When the model
sample size is small, the specification of the scale matrix can be
influential.
Although the related inverse Wishart distribution has a dependency between variance and correlation, the Wishart distribution does not have this dependency.
The matrix gamma (dmatrixgamma
) distribution is a more
general version of the Wishart distribution, and the Yang-Berger
(dyangberger
) distribution is an alternative that is a
least informative prior (LIP).
dwishartc
gives the density and
rwishartc
generates random deviates.
Wishart, J. (1928). "The Generalised Product Moment Distribution in Samples from a Normal Multivariate Population". Biometrika, 20A(1-2), p. 32–52.
chol
,
dchisq
,
dgamma
,
dinvwishart
,
dinvwishartc
,
dmatrixgamma
,
dmvnp
,
dmvnpc
,
Prec2Cov
, and
dyangbergerc
.
library(LaplacesDemon)
Omega <- matrix(c(2,-.3,-.3,4),2,2)
U <- chol(Omega)
x <- dwishartc(U, 3, matrix(c(1,.1,.1,1),2,2))
x <- rwishartc(3, matrix(c(1,.1,.1,1),2,2))
This is the density function for the Yang-Berger prior distribution for a covariance matrix or precision matrix.
dyangberger(x, log=FALSE)
dyangbergerc(x, log=FALSE)
x |
This is the |
log |
Logical. If |
Application: Continuous Multivariate
Density: see equation 13 in Yang and Berger (1994), which is expressed in terms of the increasing eigenvalues of the matrix.
Inventor: Yang and Berger (1994)
Notation 1:
Mean:
Variance:
Mode:
Yang and Berger (1994) derived a least informative prior (LIP) for a covariance matrix or precision matrix. The Yang-Berger (YB) distribution does not have any parameters. It is a reference prior for objective Bayesian inference. The Cholesky parameterization is also provided here.
The YB prior distribution results in a proper posterior. It involves an eigendecomposition of the covariance matrix or precision matrix. It is difficult to interpret a model that uses the YB prior, due to a lack of intuition regarding the relationship between eigenvalues and correlations.
Compared to Jeffreys prior for a covariance matrix, this reference prior encourages equal eigenvalues, and therefore results in a covariance matrix or precision matrix with a better shrinkage of its eigenstructure.
dyangberger
and dyangbergerc
give the density.
Yang, R. and Berger, J.O. (1994). "Estimation of a Covariance Matrix using the Reference Prior". Annals of Statistics, 2, p. 1195-1211.
dinvwishart
and
dwishart
library(LaplacesDemon)
X <- matrix(c(1,0.8,0.8,1), 2, 2)
dyangberger(X, log=TRUE)
These functions provide the density of the hyper-g prior (Liang et al., 2008), and both the density and random generation of Zellner's g-prior (Zellner, 1986).
dhyperg(g, alpha=3, log=FALSE)
dzellner(beta, g, sigma, X, log=FALSE)
rzellner(n, g, sigma, X)
alpha |
This is a positive scale hyperhyperparameter that is
proper when |
beta |
This is regression effects |
g |
This is hyperparameter |
n |
This is the number of random deviates to generate. |
sigma |
This is the residual standard deviation
|
X |
This is a full-rank |
log |
Logical. If |
Application: Continuous Multivariate
Density:
Inventor: Zellner, A. (1986)
Notation 1:
Notation 2:
Parameter 1: location parameter
Parameter 2: scale parameter
Parameter 3: scale parameter
Mean:
Variance:
Mode:
Zellner's g-prior is a popular, data-dependent, elliptical, improper,
least-informative prior distribution on regression effects
in a Gaussian regression model. It is a particular
form in the conjugate Normal-Gamma family. Zellner's g-prior is also
used for estimating Bayes factors (for hypothesis testing) with a
simpler form, as well as in model selection and variable selection. The
marginal posterior distribution of regression effects
is multivariate t.
One of many nice properties of Zellner's g-prior is that it adapts
automatically to near-collinearity between different
predictors. Zellner's g-prior puts most of its prior mass in the
direction that causes the regression coefficients of correlated
predictors to be smoothed away from each other. When coupled with model
selection, Zellner's g-prior discourages highly collinear predictors
from entering the models simultaneously by inducing a negative
correlation between the coefficients. However, when it is desirable for
collinear predictors to enter simultaneously, a modification has been
proposed (though not included here) in which
is replaced with
. For more
information, see Krishna et al. (2009).
For variable selection, large values of g, with a prior mean of zero for beta, encourage models with few, large coefficients. Conversely, small values of g encourage saturated models with many, small coefficients.
The design matrix X is converted to Fisher's information matrix, which is used as a covariance matrix for beta. This is computationally efficient, because each element of the covariance matrix does not need to be estimated as a parameter. When X is nearly singular, regression effects may be poorly estimated.
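The implied prior covariance described above can be written down directly. This sketch assumes a zero prior mean for beta and the usual form g*sigma^2*(X'X)^(-1) for the covariance; it illustrates the construction rather than the internal code of dzellner.
library(LaplacesDemon)
set.seed(667)
n.obs <- 100; J <- 10
X <- cbind(1, matrix(rnorm(n.obs*(J-1)), n.obs, J-1))  #full-rank design matrix
g <- 100; sigma <- 2
V <- g * sigma^2 * solve(t(X) %*% X)        #implied prior covariance of beta
beta <- as.vector(rmvn(1, rep(0, J), V))    #one draw under the assumed zero prior mean
dzellner(beta, g, sigma, X, log=TRUE)       #density under the package's g-prior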
Hyperparameter g acts as an inverse relative prior sample size, or
as a dimensionality penalty. Zellner (1986) recommended that a
hyperprior distribution is assigned to g so that it is estimated
from the data, although in practice g has often been fixed, usually
to g=N (the sample size) when no information is available, since it has the
interpretation of adding prior information equivalent to one
observation. A variety of hyperpriors have been suggested for g,
such as in Bove and Held (2011), Liang et al. (2008), and Maruyama and
George (2011).
The g-prior becomes diffuse as g approaches infinity, and
the Bayes factor approaches zero. The hyper-g prior of Liang et al.
(2008) is proper when alpha > 2, and any value in
the interval (2,4] may be reasonable.
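As a brief illustration of how the hyper-g prior behaves over g for different values of alpha (a minimal sketch, not taken from the package examples), its density may be evaluated over a grid:
library(LaplacesDemon)
g <- 1:100
#Evaluate the hyper-g prior density on a grid of g for two values of alpha
d3 <- sapply(g, function(gi) dhyperg(gi, alpha=3))
d4 <- sapply(g, function(gi) dhyperg(gi, alpha=4))
plot(g, d3, type="l", ylab="Density")
lines(g, d4, lty=2)
legend("topright", legend=c("alpha=3","alpha=4"), lty=c(1,2))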
dhyperg
gives the density of the hyper-g prior of Liang et
al. (2008), dzellner
gives the density of Zellner's g-prior,
and rzellner
generates random deviates.
Bove, D.S. and Held, L. (2011). "Hyper-g Priors for Generalized Linear Models". Bayesian Analysis, 6(3), p. 387–410.
Krishna, A., Bondell, H.D., and Ghosh, S.K. (2009). "Bayesian Variable Selection Using an Adaptive Powered Correlation Prior". Journal of Statistical Planning and Inference, 139(8), p. 2665–2674.
Liang, F., Paulo, R., Molina, G., Clyde, M.A., and Berger, J.O. (2008). "Mixtures of g Priors for Bayesian Variable Selection". Journal of the American Statistical Association, 103, p. 410–423.
Maruyama, Y. and George, E.I. (2011). "Fully Bayes Factors with a Generalised g-Prior". Annals of Statistics, 39, p. 2740–2765.
Zellner, A. (1986). "On Assessing Prior Distributions and Bayesian Regression Analysis with g-Prior Distributions". In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, p. 233–243. Elsevier: Amsterdam, North Holland.
BayesFactor
and
dmvt
library(LaplacesDemon)
set.seed(667)
beta <- rnorm(10)
g <- 100
sigma <- 2
X <- cbind(1, matrix(rnorm(100*9),100,9))
dhyperg(g, alpha=3)
dzellner(beta, g, sigma, X)
rzellner(1, g, sigma, X)
Prior elicitation is the act of inducing personal opinion to be
expressed by the probabilities the person associates with an event
(Savage, 1971). The elicit
function elicits personal opinion
and the delicit
function estimates probability density to be
used with model specification in the
IterativeQuadrature
, LaplaceApproximation
,
LaplacesDemon
, LaplacesDemon.hpc
,
PMC
, or VariationalBayes
functions.
delicit(theta, x, a=-Inf, b=Inf, log=FALSE)
elicit(n, cats, cat.names, show.plot=FALSE)
theta |
This is a scalar or vector of parameters for which the
density is estimated with respect to the kernel density estimate of
x. |
x |
This is the elicited vector. |
a |
This is an optional lower bound for support. |
b |
This is an optional upper bound for support. |
log |
Logical. If log=TRUE, then the logarithm of the density is returned. |
n |
This is the number of chips. |
cats |
This is a vector of category values, such as interval midpoints (see the example below). |
cat.names |
This is a vector of category names. For example, if
the continuous interval [0,1] has 5 equal-sized categories, then one
way of naming the categories is by their bounds, as in the example below. |
show.plot |
Logical. If show.plot=TRUE, then a plot of the elicited distribution is displayed. |
The elicit
function elicits a univariate, discrete,
non-conjugate, informative, prior probability distribution by
offering a number of chips (specified as n
by the statistician)
for the user to allocate into categories specified by the
statistician. The results of multiple elicitations (meaning, with
multiple people), each the output of elicit
, may be combined
with the c
function in base R.
This discrete distribution is included with the data for
a model and supplied to a model specification function, where in turn
it is supplied to the delicit
function, which estimates the
density at the current value of the prior distribution,
theta. The prior distribution may be either
continuous or discrete, will be proper, and may have bounded support
(constrained to an interval).
For a minimal example, a statistician elicits the prior probability
distribution for a regression effect, beta. Non-statisticians
would not be asked about expected parameters, but could be asked about
how much y would be expected to change given a
one-unit change in x. After consulting with others
who have prior knowledge, the support does not need to be bounded,
and their guesses at the range result in the statistician creating
5 categories from the interval [-1,4], where each interval has a width
of one. The statistician schedules time with 3 people, and each person
participates when the statistician runs the following R code:
x <- elicit(n=10, cats=c(-0.5, 0.5, 1.5, 2.5, 3.5),
cat.names=c("-1:<0", "0:<1", "1:<2", "2:<3", "3:4"), show.plot=TRUE)
Each of the 3 participants receives 10 chips to allocate among the 5 categories according to personal beliefs in the probability of the regression effect. When the statistician and each participant accept their elicited distribution, all 3 vectors are combined into one vector. In the model form, the prior is expressed with the elicited vector, x,
and the code for the model specification is
elicit.prior <- delicit(beta, x, log=TRUE)
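For instance, if the three participants produce elicited vectors x1, x2, and x3 (hypothetical values shown here, on the category values used above), they are combined into one vector before being included in the data list:
#x1, x2, and x3 are hypothetical outputs of three separate calls to elicit
x1 <- c(-0.5, 0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 2.5, 2.5, 3.5)
x2 <- c(-0.5, -0.5, 0.5, 0.5, 1.5, 1.5, 2.5, 2.5, 3.5, 3.5)
x3 <- c(0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 1.5, 2.5, 2.5, 3.5)
x <- c(x1, x2, x3) #Combined elicited vector for the data list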
This method is easily extended to priors that are multivariate, correlated, or conditional.
As an alternative, Hahn (2006) also used a categorical approach,
eliciting judgements about the relative likelihood of each category,
and then minimized the KLD (for more information on KLD, see the
KLD
function).
Statisticat, LLC. [email protected]
Hahn, E.D. (2006). "Re-examining Informative Prior Elicitation Through the Lens of Markov chain Monte Carlo Methods". Journal of the Royal Statistical Society, A 169 (1), p. 37–48.
Savage, L.J. (1971). "Elicitation of Personal Probabilities and Expectations". Journal of the American Statistical Association, 66(336), p. 783–801.
de.Finetti.Game
,
KLD
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
LaplacesDemon.hpc
,
PMC
, and
VariationalBayes
.
library(LaplacesDemon)
x <- c(1,2,2,3,3,3,4,7,8,8,9,10) #Elicited with elicit function
theta <- seq(from=-5, to=15, by=.1)
plot(theta, delicit(theta,x), type="l", xlab=expression(theta),
     ylab=expression("p(" * theta * ")"))
This function may be used to estimate the effective sample size (ESS) (not to be confused with Elliptical Slice Sampling) of a continuous target distribution, where the sample size is reduced by autocorrelation. ESS is a measure of how well each continuous chain is mixing.
ESS is a univariate function that is often applied to each continuous, marginal posterior distribution. A multivariate form is not included. By chance alone due to multiple independent tests, 5% of the continuous parameters may indicate that ESS is below a user threshold of acceptability, such as 100, even when above the threshold. Assessing convergence is difficult.
ESS(x)
x |
This required argument is a vector or matrix of posterior samples. |
Effective Sample Size (ESS) was recommended by Radford Neal in the
panel discussion of Kass et al. (1998). When a continuous, marginal
posterior distribution is sampled with a Markov chain Monte Carlo
(MCMC) algorithm, there is usually autocorrelation present in the
samples. More autocorrelation is associated with less posterior
sampled information, because the information in the samples is
autocorrelated, or put another way, successive samples are not
independent from earlier samples. This reduces the effective sample
size of, and precision in representing, the continuous, marginal
posterior distribution. ESS
is one of the criteria in the
Consort
function, where stopping the MCMC updates is
not recommended until ESS >= 100. Although the need
for precision of each modeler differs with each model, it is often
a good goal to obtain ESS = 1000.
ESS
is related to the integrated autocorrelation time (see
IAT
for more information).
ESS is usually defined as

ESS(theta) = N / (1 + 2 * sum[k=1 to Inf] rho_k(theta)),

where N is the number of posterior samples, rho_k(theta)
is the autocorrelation at lag k, and
theta is the vector of marginal posterior samples. The
infinite sum is often truncated at lag k when
rho_k(theta) < 0.05. Just as with the
effectiveSize
function in the coda
package, the
AIC
argument in the ar
function is used to estimate the
order.
ESS is a measure of how well each continuous chain is mixing, and a
continuous chain mixes better when in the target distribution. This
does not imply that a poorly mixing chain still searching for its
target distribution will suddenly mix well after finding it, though
mixing should improve. A poorly mixing continuous chain does not
necessarily indicate problems. A smaller ESS is often due to
correlated parameters, and is commonly found with scale
parameters. Posterior correlation may be obtained from the
PosteriorChecks
function, and plotted with the
plotMatrix
function. Common remedies for poor mixing
include re-parameterizing the model or trying a different MCMC
algorithm that better handles correlated parameters. Slow mixing is
indicative of an inefficiency in which a continuous chain takes longer
to find its target distribution, and once found, takes longer to
explore it. Therefore, slow mixing results in a longer required
run-time to find and adequately represent the continuous target
distribution, and increases the chance that the user may make
inferences from a less than adequate representation of the continuous
target distribution.
There are many methods of re-parameterization to improve mixing. It
is helpful when predictors are centered and scaled, such as with the
CenterScale
function. Parameters for predictors are
often assigned prior distributions that are independent per parameter,
in which case an exchangeable prior distribution or a multivariate
prior distribution may help. If a parameter with poor mixing is
bounded with the interval
function, then
transforming it to the real line (such as with a log transformation
for a scale parameter) is often helpful, since constraining a
parameter to an interval often reduces ESS. Another method is to
re-parameterize so that one or more latent variables represent the
process that results in slow mixing. Such re-parameterization uses
data augmentation.
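For example, a sketch of re-parameterizing a positive-only scale parameter to the real line within a model specification function (the position name Data$pos.sigma and the half-Cauchy prior are assumptions for illustration):
#Within a model specification function, sample log(sigma) on the real line:
#  log.sigma <- parm[Data$pos.sigma]
#  sigma <- exp(log.sigma)
#  #Prior on sigma plus the log-Jacobian of the transformation
#  sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE) + log.sigma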
This is numerically the same as the effectiveSize
function in
the coda
package, but programmed to accept a simple vector or
matrix so it does not require an mcmc
or mcmc.list
object, and the result is bound to be less than or equal to the
original number of samples.
A vector is returned, and each element is the effective sample size
(ESS) for a corresponding column of x
, after autocorrelation has
been taken into account.
Kass, R.E., Carlin, B.P., Gelman, A., and Neal, R. (1998). "Markov Chain Monte Carlo in Practice: A Roundtable Discussion". The American Statistician, 52, p. 93–100.
CenterScale
,
Consort
,
IAT
,
interval
,
LaplacesDemon
,
plotMatrix
, and
PosteriorChecks
.
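A minimal example (not from the package documentation) that contrasts an autocorrelated chain with independent samples:
library(LaplacesDemon)
set.seed(666)
#An AR(1) chain with coefficient 0.9 is strongly autocorrelated
x <- as.vector(arima.sim(model=list(ar=0.9), n=10000))
ESS(x)            #Far fewer effective samples than 10000
ESS(rnorm(10000)) #Near 10000 for independent samples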
Gelfand et al. (1990) proposed a convergence diagnostic for Markov
chains. The Gelfand.Diagnostic
function is an interpretation of
Gelfand's “thick felt-tip pen” MCMC convergence diagnostic. This
diagnostic plots a series of kernel density plots at
intervals of cumulative samples. Given a vector of
samples
from a marginal posterior distribution,
, multiple
kernel density lines are plotted together, where each includes samples
from a different interval. It is assumed that
burnin
iterations have been discarded.
Gelfand et al. (1990) assert that convergence is violated when the
plotted lines are farther apart than the width of a thick, felt-tip
pen. This depends on the size of the plot, and, of course, the
pen. The estimated width of a “thick felt-tip pen” is included as a
black, vertical line. The pen in Gelfand.Diagnostic
is included
for historical reasons. This diagnostic requires numerous samples.
Gelfand.Diagnostic(x, k=3, pen=FALSE)
x |
This required argument is a vector of marginal posterior
samples, such as selected from the output of
LaplacesDemon. |
k |
This argument specifies the number k of kernel density plots, given intervals of cumulative samples, and defaults to 3. |
pen |
Logical. This argument defaults to FALSE, and indicates whether the estimated width of a thick felt-tip pen is added to the plot as a vertical line. |
The Gelfand.Diagnostic
returns a plot.
Statisticat, LLC. [email protected]
Gelfand, A.E., Hills, S., Racine-Poon, A., and Smith, A.F.M. (1990). "Illustration of Bayesian Inference in Normal Data Models Using Gibbs Sampling". Journal of the American Statistical Association, 85, p. 972–985.
burnin
and
LaplacesDemon
.
library(LaplacesDemon)
x <- rnorm(1000)
Gelfand.Diagnostic(x)
Gelman and Rubin (1992) proposed a general approach to monitoring
convergence of MCMC output in which parallel chains are
updated with initial values that are overdispersed relative to each
target distribution, which must be normally distributed. Convergence
is diagnosed when the chains have ‘forgotten’ their initial values,
and the output from all chains is indistinguishable. The
Gelman.Diagnostic
function makes a comparison of within-chain
and between-chain variances, and is similar to a classical analysis of
variance. A large deviation between these two variances indicates
non-convergence.
This diagnostic is popular as a stopping rule, though it requires
parallel chains. The LaplacesDemon.hpc
function is an
extension of LaplacesDemon
to enable parallel chains.
As an alternative, the popular single-chain stopping rule is based on
MCSE
.
Gelman.Diagnostic(x, confidence=0.95, transform=FALSE)
x |
This required argument accepts an object of class
demonoid.hpc, or a list of multiple objects of class demonoid, where the number of components in the list is the number of chains. |
confidence |
This is the coverage probability of the confidence interval for the potential scale reduction factor (PSRF). |
transform |
Logical. If TRUE, then marginal posterior distributions may be transformed (with a log or logit transform) to improve the normality of the distribution. |
To use the Gelman.Diagnostic
function, the user must first have
multiple MCMC chains for the same model, and three chains is usually
sufficient. The easiest way to obtain multiple chains is with the
LaplacesDemon.hpc
function.
Although the LaplacesDemon
function does not
simultaneously update multiple MCMC chains, it is easy enough to
obtain multiple chains, and if the computer has multiple processors
(which is common), then multiple chains may be obtained simultaneously
as follows. The model file may be opened in separate, concurrent R
sessions, and it is recommended that a maximum number of sessions is
equal to the number of processors, minus one. Each session constitutes
its own chain, and the code is identical, except the initial values
should be randomized with the GIV
function so the chains
begin in different places. The resulting object of class
demonoid
for each chain is saved, all objects are read into one
session, put into a list, and passed to the Gelman.Diagnostic
function.
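A sketch of this workflow, assuming three objects of class demonoid (here called Fit1, Fit2, and Fit3) were updated in separate sessions with dispersed initial values and then loaded into one session:
#library(LaplacesDemon)
#Fit1, Fit2, and Fit3 are objects of class demonoid from separate runs
#Gelman.Diagnostic(list(Fit1, Fit2, Fit3), confidence=0.95, transform=FALSE)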
Initial values must be overdispersed with respect to each target
distribution, though these distributions are unknown in the
beginning. Since the Gelman.Diagnostic
function relies heavily
on overdispersion with respect to the target distribution, the user
should consider using MCMC twice, first to estimate the target
distributions, and secondly to overdisperse initial values with
respect to them. This may help identify multimodal target
distributions. If multiple modes are found, it remains possible that
more modes exist. When multiple modes are found, and if chains are
combined with the Combine
function, each mode is
probably not represented in a proportion that is correct with respect to the posterior distribution.
The ‘potential scale reduction factor’ (PSRF) is an estimated factor
by which the scale of the current distribution for the target
distribution might be reduced if the simulations were continued for
an infinite number of iterations. Each PSRF declines to 1 as the
number of iterations approaches infinity. PSRF is also often
represented as R-hat. PSRF is calculated for each marginal posterior
distribution in x
, together with upper and lower confidence
limits. Approximate convergence is diagnosed when the upper limit is
close to 1. The recommended proximity of each PSRF to 1 varies with
each problem, but a general goal is to achieve PSRF < 1.1. PSRF is an
estimate of how much narrower the posterior might become with an
infinite number of iterations. When PSRF = 1.1, for example, it may be
interpreted as a potential reduction of 10% in posterior interval
width, given infinite iterations. The multivariate form bounds above
the potential scale reduction factor for any linear combination of the
(possibly transformed) variables.
The confidence limits are based on the assumption that the
target distribution is stationary and normally distributed. The
transform
argument may be used to improve the normal
approximation.
A large PSRF indicates that the between-chain variance is
substantially greater than the within-chain variance, so that longer
simulation is needed. If a PSRF is close to 1, then the associated
chains are likely to have converged to one target distribution. A
large PSRF (perhaps generally when a PSRF > 1.2) indicates convergence
failure, and can indicate the presence of a multimodal marginal
posterior distribution in which different chains may have converged
to different local modes (see is.multimodal
), or the
need to update the associated chains longer, because burn-in (see
burnin
) has yet to be completed.
The Gelman.Diagnostic
is essentially the same as the
gelman.diag
function in the coda
package, but here it is
programmed to work with objects of class demonoid
.
There are two ways to estimate the variance of the stationary
distribution: the mean of the empirical variance within each chain,
W, and the empirical variance from all chains combined, which
can be expressed as

sigma.hat^2 = ((n-1)/n) W + B/n,

where n is the number of iterations and B/n is the
empirical between-chain variance.
If the chains have converged, then both estimates are unbiased. Otherwise the first method will underestimate the variance, since the individual chains have not had time to range all over the stationary distribution, and the second method will overestimate the variance, since the initial values were chosen to be overdispersed (and this assumes the target distribution is known, see above).
This convergence diagnostic is based on the assumption that each
target distribution is normal. A Bayesian probability interval (see
p.interval) can be constructed using a t-distribution
with mean

mu.hat = the sample mean of all chains combined,

variance

V.hat = sigma.hat^2 + B/(mn), where m is the number of chains,

and degrees of freedom estimated by the method of moments,

d = 2 V.hat^2 / var(V.hat).

Use of the t-distribution accounts for the fact that the mean and variance of the posterior distribution are estimated. The convergence diagnostic itself is

R.hat = sqrt(((d+3) V.hat) / ((d+1) W))
Values substantially above 1 indicate lack of convergence. If the chains have not converged, then Bayesian probability intervals based on the t-distribution are too wide, and have the potential to shrink by this factor if the MCMC run is continued.
The multivariate version of Gelman and Rubin's diagnostic was proposed by Brooks and Gelman (1998). Unlike the univariate proportional scale reduction factor, the multivariate version does not include an adjustment for the estimated number of degrees of freedom.
A list is returned with the following components:
PSRF |
This is a list containing the point-estimates of the
potential scale reduction factor (labelled Point Est.) and the upper confidence limits (labelled Upper C.I.). |
MPSRF |
This is the point-estimate of the multivariate potential scale reduction factor. |
Brooks, S.P. and Gelman, A. (1998). "General Methods for Monitoring Convergence of Iterative Simulations". Journal of Computational and Graphical Statistics, 7, p. 434–455.
Gelman, A. and Rubin, D.B. (1992). "Inference from Iterative Simulation using Multiple Sequences". Statistical Science, 7, p. 457–511.
Combine
,
GIV
,
is.multimodal
,
LaplacesDemon
,
LaplacesDemon.hpc
,
MCSE
, and
p.interval
.
#library(LaplacesDemon)
###After updating multiple chains with LaplacesDemon.hpc, do:
#Gelman.Diagnostic(Fit)
Geweke (1992) proposed a convergence diagnostic for Markov chains. This diagnostic is based on a test for equality of the means of the first and last part of a Markov chain (by default the first 10% and the last 50%). If the samples are drawn from a stationary distribution of the chain, then the two means are equal and Geweke's statistic has an asymptotically standard normal distribution.
The test statistic is a standard Z-score: the difference between the two sample means divided by its estimated standard error. The standard error is estimated from the spectral density at zero, and so takes into account any autocorrelation.
The Z-score is calculated under the assumption that the two parts of the chain are asymptotically independent.
The Geweke.Diagnostic
is a univariate diagnostic that is
usually applied to each marginal posterior distribution. A
multivariate form is not included. By chance alone due to multiple
independent tests, 5% of the marginal posterior distributions should
appear non-stationary when stationarity exists. Assessing multivariate
convergence is difficult.
Geweke.Diagnostic(x)
x |
This required argument is a vector or matrix of posterior
samples, such as from the output of the LaplacesDemon function. |
The Geweke.Diagnostic
is essentially the same as the
geweke.diag
function in the coda
package, but
programmed to accept a simple vector or matrix, so it does not require
an mcmc
object.
A vector is returned, in which each element is a Z-score for a test of
equality that compares the means of the first and last parts of each
chain supplied as x
to Geweke.Diagnostic
.
Geweke, J. (1992). "Evaluating the Accuracy of Sampling-Based Approaches to Calculating Posterior Moments". In Bayesian Statistics 4 (ed JM Bernardo, JO Berger, AP Dawid, and AFM Smith). Clarendon Press, Oxford, UK.
burnin
,
is.stationary
, and
LaplacesDemon
library(LaplacesDemon)
Geweke.Diagnostic(rnorm(100))
Geweke.Diagnostic(matrix(rnorm(100),10,10))
The GIV
function generates initial values for use with the
IterativeQuadrature
, LaplaceApproximation
,
LaplacesDemon
, PMC
, and
VariationalBayes
functions.
GIV(Model, Data, n=1000, PGF=FALSE)
Model |
This required argument is a model specification
function. For more information, see LaplacesDemon. |
Data |
This required argument is a list of data. For more
information, see LaplacesDemon. |
n |
This is the number of attempts to generate acceptable initial values. |
PGF |
Logical. When PGF=TRUE, initial values are generated with the Parameter-Generating Function (PGF) specified by the user in Data. When PGF=FALSE, initial values are generated without the PGF, as described below. |
Initial values are required for optimization or sampling algorithms. A
user may specify initial values, or use the GIV
function for
random generation. Initial values determined by the user may fail to
produce a finite posterior in complicated models, and the GIV
function is here to help.
GIV
has several uses. First, the
IterativeQuadrature
, LaplaceApproximation
,
LaplacesDemon
, and VariationalBayes
functions use GIV
internally if unacceptable initial values are
discovered. Second, the user may use GIV
when developing their
model specification function, Model
, to check for potential
problems. Third, the user may prefer to randomly generate acceptable
initial values. Lastly, GIV
is recommended when running
multiple or parallel chains with the LaplacesDemon.hpc
function (such as for later use with the Gelman.Diagnostic
) for
dispersed starting locations. For dispersed starting locations,
GIV
should be run once for each parallel chain, and the results
should be stored per row in a matrix of initial values. For more
information, see the LaplacesDemon.hpc
documentation for
initial values.
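For example (a sketch that assumes Model and MyData are specified as in the example below, and that three parallel chains are planned), dispersed initial values may be stacked row-wise:
#Initial.Values <- rbind(GIV(Model, MyData, PGF=TRUE),
#     GIV(Model, MyData, PGF=TRUE),
#     GIV(Model, MyData, PGF=TRUE))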
It is strongly recommended that the user specifies a Parameter-Generating Function (PGF), and includes this function in the list of data. Although the PGF may be specified according to the prior distributions (possibly considered as a Prior-Generating Function), it is often specified with a more restricted range. For example, if a user has a model with the following prior distributions

beta_j ~ N(0, 1000), j = 1, ..., 5
sigma ~ HC(25)

then the PGF, given the prior distributions, is
PGF <- function(Data) return(c(rnormv(5,0,1000),rhalfcauchy(1,25)))
However, the user may not want to begin with initial values that could be so far from zero (as determined by the variance of 1000), and may instead prefer
PGF <- function(Data) return(c(rnormv(5,0,10),rhalfcauchy(1,5)))
When PGF=FALSE
, initial values are attempted to be constrained
to the interval [-100, 100]. This is done to prevent numeric
overflows with parameters that are exponentiated within the model
specification function. First,
GIV
passes the upper and lower
bounds of this interval to the model, and any changed parameters are
noted.
At this point, it is hoped that a non-finite posterior is not
found. If found, then the remainder of the process is random and
without the previous bounds. This can be particularly problematic in
the case of, say, initial values that are the elements of a matrix
that must be positive-definite, especially with large matrices. If a
random solution is not found, then GIV
will fail.
If the posterior is finite and PGF=FALSE
, then initial values
are randomly generated with a normal proposal and a small variance at
the center of the returned range of each parameter. As GIV
fails to find acceptable initial values, the algorithm iterates toward
its maximum number of iterations, n
. In each iteration, the
variance increases for the proposal.
Initial values are considered acceptable only when the first two
returned components of Model
(which are LP
and
Dev
) are finite, and when initial values do not change through
constraints, as returned in the fifth component of the list:
parm
.
If GIV
fails to return acceptable initial values, then it is
best to study the model specification function. When the model is
complicated, here is a suggestion. Remove the log-likelihood,
LL
, from the equation that calculates the logarithm of the
unnormalized joint posterior density, LP
. For example, convert
LP <- LL + beta.prior
to LP <- beta.prior
. Now, maximize
LP
, which is merely the set of prior densities, with any
optimization algorithm. Replace LL
, and run the model with
initial values that are in regions of high prior density (preferably
with PGF=TRUE
). If this fails, then the model specification
should be studied closely, because a non-finite posterior should
(especially) never be associated with regions of high prior density.
The GIV
function returns a vector equal in length to the
number of parameters, and each element is an initial value for the
associated parameter in Data$parm.names
. When GIV
fails
to find acceptable initial values, each returned element is NA
.
Statisticat, LLC. [email protected]
as.initial.values
,
Gelman.Diagnostic
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
LaplacesDemon.hpc
,
PMC
, and
VariationalBayes
.
library(LaplacesDemon)
##############################  Demon Data  ###############################
data(demonsnacks)
y <- log(demonsnacks$Calories)
X <- cbind(1, as.matrix(log(demonsnacks[,c(1,4,10)]+1)))
J <- ncol(X)
for (j in 2:J) X[,j] <- CenterScale(X[,j])
#########################  Data List Preparation  #########################
mon.names <- c("LP","sigma")
parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0))
pos.beta <- grep("beta", parm.names)
pos.sigma <- grep("sigma", parm.names)
PGF <- function(Data) {
     beta <- rnorm(Data$J)
     sigma <- runif(1)
     return(c(beta, sigma))
     }
MyData <- list(J=J, PGF=PGF, X=X, mon.names=mon.names,
     parm.names=parm.names, pos.beta=pos.beta, pos.sigma=pos.sigma, y=y)
##########################  Model Specification  ##########################
Model <- function(parm, Data)
     {
     ### Parameters
     beta <- parm[Data$pos.beta]
     sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf)
     parm[Data$pos.sigma] <- sigma
     ### Log-Prior
     beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE))
     sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE)
     ### Log-Likelihood
     mu <- tcrossprod(Data$X, t(beta))
     LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE))
     ### Log-Posterior
     LP <- LL + beta.prior + sigma.prior
     Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP,
          yhat=rnorm(length(mu), mu, sigma), parm=parm)
     return(Modelout)
     }
########################  Generate Initial Values  ########################
Initial.Values <- GIV(Model, MyData, PGF=TRUE)
Hangartner et al. (2011) proposed a convergence diagnostic for discrete Markov chains. A simple Pearson's Chi-squared test for two or more non-overlapping periods of a discrete Markov chain is a reliable diagnostic of convergence. It does not rely upon the estimation of spectral density, on suspect normality assumptions, or determining overdispersion within a small number of outcomes, all of which can be problematic with discrete measures. A discrete Markov chain is split into two or more non-overlapping windows. Two windows are recommended, and results may be sensitive to the number of selected windows, as well as sample size. As such, a user may try several window configurations before concluding there is no evidence of non-convergence.
As the number of discrete events in the sample space increases, this diagnostic becomes less appropriate and standard diagnostics become more appropriate.
Hangartner.Diagnostic(x, J=2)
x |
This required argument is a vector of marginal posterior
samples of a discrete Markov chain, such as selected from the output
of LaplacesDemon. |
J |
This argument specifies the number J of non-overlapping windows into which the chain is split, and defaults to 2. |
The Hangartner.Diagnostic
returns an object of class
hangartner
, including the output from a Pearson's Chi-squared
test. A frequentist p-value less than or equal to 0.05 is usually
considered to be indicative of non-convergence.
Statisticat, LLC. [email protected]
Hangartner, D., Gill, J., and Cranmer, S., (2011). "An MCMC Diagnostic for Purely Discrete Parameters". Paper presented at the annual meeting of the Southern Political Science Association, Hotel InterContinental, New Orleans, Louisiana Online.
LaplacesDemon
and
TransitionMatrix
.
library(LaplacesDemon)
N <- 1000
K <- 3
x <- rcat(N, rep(1/K,K))
hd <- Hangartner.Diagnostic(x, J=2)
hd
Heidelberger and Welch (1981; 1983) proposed a two-part MCMC convergence diagnostic that calculates a test statistic (based on the Cramer-von Mises test statistic) to accept or reject the null hypothesis that the Markov chain is from a stationary distribution.
Heidelberger.Diagnostic(x, eps=0.1, pvalue=0.05)
x |
This required argument accepts an object of class
demonoid. |
eps |
This argument specifies the target value for the ratio of halfwidth to sample mean. |
pvalue |
This argument specifies the level of statistical significance. |
The Heidelberg and Welch MCMC convergence diagnostic consists of two parts:
First Part
1. Generate a chain of iterations and define an alpha level.
2. Calculate the test statistic on the whole chain. Accept or reject
the null hypothesis that the chain is from a stationary distribution.
3. If the null hypothesis is rejected, then discard the first 10% of
the chain. Calculate the test statistic and accept or reject the null
hypothesis.
4. If the null hypothesis is rejected, then discard the next 10% and
calculate the test statistic.
5. Repeat until the null hypothesis is accepted or 50% of the chain
is discarded. If the test still rejects the null hypothesis, then
the chain fails the test and needs to be run longer.
Second Part If the chain passes the first part of the diagnostic, then the part of the chain that was not discarded from the first part is used to test the second part.
The halfwidth test calculates half the width of the (1 - alpha)% probability interval (credible interval) around the mean.
If the ratio of the halfwidth and the mean is lower than eps
,
then the chain passes the halfwidth test. Otherwise, the chain fails
the halfwidth test and must be updated for more iterations until
sufficient accuracy is obtained. In order to avoid problems caused by
sequential testing, the test should not be repeated too frequently.
Heidelberger and Welch (1981) suggest increasing the run length by a
factor I > 1.5, each time, so that the estimate has the same, reasonably
large, proportion of new data.
The Heidelberger and Welch MCMC convergence diagnostic conducts multiple hypothesis tests. The number of potentially wrong results increases with the number of non-independent hypothesis tests conducted.
The Heidelberger.Diagnostic
is a univariate diagnostic that is
usually applied to each marginal posterior distribution. A
multivariate form is not included. By chance alone due to multiple
independent tests, 5% of the marginal posterior distributions should
appear non-stationary when stationarity exists. Assessing multivariate
convergence is difficult.
The Heidelberger.Diagnostic
function returns an object of class
heidelberger
. This object is a
matrix, and it is intended to be summarized with the
print.heidelberger
function. Nonetheless, this object of
class heidelberger
has one row for each Markov chain. The column names are
stest
, start
,
pvalue
, htest
, mean
, and halfwidth
. The
stest
column indicates convergence with a one, and
non-convergence with a zero, regarding the stationarity test. When
non-convergence is indicated, the remaining columns have missing
values. The start
column indicates the starting iteration, and
the pvalue
column shows the p-value associated with the first
test. The htest
column indicates convergence for the halfwidth
test. The mean
and halfwidth
columns report the mean and
halfwidth.
The Heidelberger.Diagnostic
function was adapted from the
heidel.diag
function in the coda package.
Heidelberger, P. and Welch, P.D. (1981). "A Spectral Method for Confidence Interval Generation and Run Length Control in Simulations". Comm. ACM., 24, p. 233–245.
Heidelberger, P. and Welch, P.D. (1983). "Simulation Run Length Control in the Presence of an Initial Transient". Opns Res., 31, p. 1109–1144.
Schruben, L.W. (1982). "Detecting Initialization Bias in Simulation Experiments". Opns. Res., 30, p. 569–590.
burnin
,
is.stationary
,
LaplacesDemon
, and
print.heidelberger
.
#library(LaplacesDemon)
###After updating with LaplacesDemon, do:
#hd <- Heidelberger.Diagnostic(Fit)
#print(hd)
This function is not intended to be called directly by the user. It is
an internal-only function to prevent cluster problems while using the
INCA
algorithm in the LaplacesDemon.hpc
function.
server_Listening(n=2, port=19009)
n |
This is the number of CPUs. For more information, see
LaplacesDemon.hpc. |
port |
This is a port for server listening, and defaults to
port 19009. |
For the INCA
algorithm, a server has been built into the
LaplacesDemon.hpc
function. The server exchanges information
between processes, and has been designed to be portable. The
server_Listening
function is run as a separate process via the
system
function, when INCA
is selected in
LaplacesDemon.hpc
.
Socket connections and the serialize
function are used as per
the Snow package to update a single proposal covariance matrix
given all parallel chains. The sockets are opened/closed in each
process with a small random sleep time to avoid collisions during
connections to the internal server of
LaplacesDemon.hpc
. Blocking sockets are used to synchronize
processes.
Silvere Vialet-Chabrand [email protected]
LaplacesDemon
and
LaplacesDemon.hpc
.
The IAT
function estimates integrated autocorrelation time,
which is the computational inefficiency of a continuous chain or MCMC
sampler. IAT is also called the IACT, ACT, autocorrelation time,
autocovariance time, correlation time, or inefficiency factor. A lower
value of IAT
is better. IAT
is an MCMC diagnostic that is
an estimate of the number of iterations, on average, for an
independent sample to be drawn, given a continuous chain or Markov
chain. Put another way, IAT
is the number of correlated samples
with the same variance-reducing power as one independent sample.
IAT is a univariate function. A multivariate form is not included.
IAT(x)
x |
This required argument is a vector of samples from a chain. |
IAT
is an MCMC diagnostic that is often used to compare
continuous chains of MCMC samplers for computational inefficiency,
where the sampler with the lowest IAT
s is the most efficient
sampler. Otherwise, chains may be compared within a model, such as
with the output of LaplacesDemon
to learn about the
inefficiency of the continuous chain. For more information on
comparing MCMC algorithmic inefficiency, see the
Juxtapose
function.
IAT
is also estimated in the PosteriorChecks
function. IAT
is usually applied to a stationary, continuous
chain after discarding burn-in iterations (see burnin
for more information). The IAT
of a continuous chain correlates
with the variability of the mean of the chain, and relates to
Effective Sample Size (ESS
) and Monte Carlo Standard
Error (MCSE
).
IAT
and ESS
are inversely related, though not
perfectly, because each is estimated a little differently. Given
samples and taking autocorrelation into account,
ESS
estimates a reduced number of samples.
Conversely,
IAT
estimates the number of autocorrelated samples,
on average, required to produce one independently drawn sample.
The IAT
function is similar to the IAT
function in the
Rtwalk
package of Christen and Fox (2010), which is currently
unavailable on CRAN.
The IAT
function returns the integrated autocorrelation time of
a chain.
Statisticat, LLC. [email protected]
Christen, J.A. and Fox, C. (2010). "A General Purpose Sampling Algorithm for Continuous Distributions (the t-walk)". Bayesian Analysis, 5(2), p. 263–282.
burnin
,
Compare
,
ESS
,
LaplacesDemon
,
MCSE
, and
PosteriorChecks
.
library(LaplacesDemon)
theta <- rnorm(100)
IAT(theta)
The Importance
function considers variable importance (or
predictor importance) to be the effect that the variable has on
replicates, y.rep (or y.new), when the variable is removed from the
model by setting it equal to zero. Here, variable importance is
considered in terms of the comparison of posterior predictive
checks. This may be considered to be a form of sensitivity analysis,
and can be useful for model revision, variable selection, and model
interpretation.
Currently, this function only tests the variable importance of design
matrix X.
Importance(object, Model, Data, Categorical=FALSE, Discrep, d=0, CPUs=1, Type="PSOCK")
object |
An object of class demonoid, iterquad, laplace, pmc, or vb is required. |
Model |
The model specification function is required. |
Data |
A data set in a list is required. The dependent variable
is required to be named either y or Y. |
Categorical |
Logical. If Categorical=TRUE, then y and yhat are considered to be categorical rather than continuous. |
Discrep |
This optional argument allows a discrepancy statistic to
be included. For more information on discrepancy statistics, see
summary.demonoid.ppc. |
d |
This is an optional integer to be used with the
Discrep argument above, and it defaults to d=0. |
CPUs |
This argument accepts an integer that specifies the number
of central processing units (CPUs) of the multicore computer or
computer cluster. This argument defaults to CPUs=1. |
Type |
This argument specifies the type of parallel processing to
perform, accepting either Type="PSOCK" or Type="MPI". |
Variable importance is defined here as the impact of each
variable (predictor, or column vector) in design matrix
on
(or
), when the variable is removed.
First, the full model is predicted with the
predict.demonoid
, predict.iterquad
,
predict.laplace
, predict.pmc
, or
predict.vb
function, and summarized with the
summary.demonoid.ppc
,
summary.iterquad.ppc
, summary.laplace.ppc
,
summary.pmc.ppc
, or summary.vb.ppc
function, respectively. The results are stored in the first row of the
output. Each successive row in the output corresponds to the
application of predict
and summary
functions, but with
each variable in design matrix X being set to zero
and effectively removed. The results show the impact of sequentially
removing each predictor.
The criterion for variable importance may differ from model to
model. As a default, BPIC is recommended. The Bayesian Predictive
Information Criterion (BPIC) was introduced by Ando (2007). BPIC is a
variation of the Deviance Information Criterion (DIC) that has been
modified for predictive distributions. For more information on DIC
(Spiegelhalter et al., 2002), see the accompanying vignette entitled
"Bayesian Inference". .
With BPIC, variable importance has a positive relationship, such that larger values indicate a more important variable, because removing that variable resulted in a worse fit to the data. The best model has the lowest BPIC.
In a model in which the dependent variable is not categorical, it is
also recommended to consider the L-criterion (Laud and Ibrahim, 1995),
provided that sample size is small enough that it does not result in
Inf
. For more information on the L-criterion, see the
accompanying vignette entitled "Bayesian Inference".
With the L-criterion, variable importance has a positive relationship,
such that larger values indicate a more important variable, because
removing that variable resulted in a worse fit to the data. Laud and
Ibrahim (1995) recommended considering the model with the lowest
L-criterion, say as L1, and the model with the closest
L-criterion, say as L2, and creating a comparison score
as phi = (L2 - L1) / S.L, where
S.L
is from the L1 model. If the comparison score,
phi, is less than 2, then L2 is within 2
standard deviations of L1, and this is the recommended cut-off
for model choice.
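As a small numeric illustration of the comparison score described above, using hypothetical values rather than results from any actual model:
L1 <- 105.2  #Lowest L-criterion among candidate models (hypothetical)
L2 <- 107.8  #Closest competing L-criterion (hypothetical)
S.L <- 2.1   #Calibration number from the L1 model (hypothetical)
(L2 - L1) / S.L #Approximately 1.24, which is less than 2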
The Importance
function may suggest that a model fits the data
better with a variable removed. In which case, the user may
choose to leave the variable in the model (perhaps the model is
misspecified without the variable), investigate and possibly
re-specify the relationship between the independent and dependent
variable(s), or remove the variable and update the model again.
In contrast to variable importance, the PosteriorChecks
function calculates parameter importance, which is the probability
that each parameter's marginal posterior distribution is greater than
zero, where an important parameter does not include zero in its
probability interval (see p.interval
). Parameter
importance and variable importance may disagree, and both should be
studied.
The Importance
function tends to indicate that a model fits the
data better when variables are removed that have parameters with
marginal posterior distributions that include 0 in the 95%
probability interval (variables associated with lower parameter
importance).
Often, in complicated models, it is difficult to assess variable importance by examining the marginal posterior distribution of the associated parameter(s). Consider polynomial regression, in which each variable may have multiple parameters.
The information provided by the Importance
function may be used
for model revision, or reporting the relative importance of variables.
The plot.importance
function is available to plot the
output of the Importance
function according to BPIC, predictive
concordance (Gelfand, 1996), the selected discrepancy statistic
(Gelman et al., 1996), or the L-criterion.
Parallel processing may be performed when the user specifies
CPUs
to be greater than one, implying that the specified number
of CPUs exists and is available. Parallelization may be performed on a
multicore computer or a computer cluster. Either a Simple Network of
Workstations (SNOW) or Message Passing Interface (MPI) is used. With
small data sets and few samples, parallel processing may be slower,
due to computer network communication. With larger data sets and more
samples, the user should experience a faster run-time.
Importance
returns an object of class importance
, which
is a matrix with a number of rows equal to the number of columns in
design matrix X, plus one (including the full model), and
4 columns, which are BPIC, Concordance (or Mean.Lift if categorical),
Discrep, and L-criterion. Each row represents a model with a predictor
in X removed (except for the first row, which is the
full model), and the resulting posterior predictive checks. For
non-categorical dependent variables, an attribute is returned with the
object, and the attribute is a vector of
S.L
, the calibration
number of the L-criterion.
Statisticat, LLC. [email protected]
Ando, T. (2007). "Bayesian Predictive Information Criterion for the Evaluation of Hierarchical Bayesian and Empirical Bayes Models". Biometrika, 94(2), p. 443–458.
Gelfand, A. (1996). "Model Determination Using Sampling Based Methods". In Gilks, W., Richardson, S., Spiegehalter, D., Chapter 9 in Markov Chain Monte Carlo in Practice. Chapman and Hall: Boca Raton, FL.
Laud, P.W. and Ibrahim, J.G. (1995). "Predictive Model Selection". Journal of the Royal Statistical Society, B 57, p. 247–262.
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. (2002). "Bayesian Measures of Model Complexity and Fit (with Discussion)". Journal of the Royal Statistical Society, B 64, p. 583–639.
is.importance
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
PMC
,
plot.importance
,
PosteriorChecks
,
p.interval
,
predict.demonoid
,
predict.iterquad
,
predict.laplace
,
predict.pmc
,
predict.vb
,
summary.demonoid.ppc
,
summary.iterquad.ppc
,
summary.laplace.ppc
,
summary.pmc.ppc
,
summary.vb.ppc
, and
VariationalBayes
.
#First, update the model with the LaplacesDemon function, such as
#the example with linear regression, creating an object called Fit.
#Then
#Importance(Fit, Model, MyData, Discrep="Chi-Square", CPUs=1)
This function constrains the value(s) of a scalar, vector, matrix, or
array to a specified interval, [a, b]. In Bayesian inference, it
is often used to truncate a parameter to an interval, such as
(0, Inf). The
interval
function is often used in conjunction with the dtrunc
function to truncate the prior probability distribution associated
with the constrained parameter. While dtrunc
prevents
assigning density outside of its interval and re-estimates density
within the interval, the interval
function is used to prevent
the parameter from moving outside of the interval in the first place.
After the parameter is constrained to an interval in
IterativeQuadrature
, LaplaceApproximation
,
LaplacesDemon
, PMC
, or
VariationalBayes
, the constrained parameter should be
updated back into the parm
vector, so the algorithm knows it
has been constrained.
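A sketch of the relevant fragment of a model specification function (the position name Data$pos.sigma is assumed for illustration, as in the examples elsewhere in this manual):
#Within a model specification function:
#  sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf)
#  parm[Data$pos.sigma] <- sigma #Update parm so the constraint is recorded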
This is unrelated to the probability interval (see
p.interval
and LPL.interval
).
interval(x, a=-Inf, b=Inf, reflect=TRUE)
x |
This required argument is a scalar, vector, matrix or array,
and its elements will be constrained to the interval
[a, b]. |
a |
This optional argument allows the specification of the lower
bound of the interval, and defaults to -Inf. |
b |
This optional argument allows the specification of the upper
bound of the interval, and defaults to Inf. |
reflect |
Logical. When reflect=TRUE, an out-of-bounds value is reflected off of the boundary until it is within the interval. When reflect=FALSE, the value is set equal to the violated boundary. It defaults to TRUE. |
It is common for a parameter to be constrained to an interval. The
interval
function provides two methods of constraining
proposals. The default is to reflect an out-of-bounds proposal off of
the boundaries until the proposal is within the specified
interval. This is rare in the literature but works very well in
practice. The other method does not reflect off of boundaries, but
sets the value equal to the violated boundary. This is also rare in
the literature and is not generally recommended.
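A brief illustration of the two methods (a minimal sketch; the first call reflects the out-of-bounds value back into the interval, and the second sets it to the violated boundary):
library(LaplacesDemon)
interval(1.4, a=0, b=1, reflect=TRUE)  #Reflected off of the upper boundary
interval(1.4, a=0, b=1, reflect=FALSE) #Set equal to the violated boundary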
If the interval
function is unacceptable, then there are
several alternatives.
It is common to re-parameterize by transforming the constrained
parameter to the real line. For example, a positive-only scale
parameter may be log-transformed. A parameter that is re-parameterized
to the real line often mixes better in MCMC, exhibiting a higher
effective sample size (ESS
), and each evaluation of the
model specification function is faster as well. However, without a
hard constraint, it remains possible for the transformed parameter
to still become problematic, such as a log-transformed scale parameter
that reaches negative infinity. This is much more common in the
literature.
Another method is to allow the parameters to move outside of the desired, constrained interval in MCMC during the model update, and when the model update is finished, to discard any samples outside of the constraint boundaries. This is a method of rejecting unacceptable proposals in regions of zero probability. However, it is possible for parameters to remain outside of acceptable bounds long enough to be problematic.
In LaplacesDemon
, the Gibbs sampler allows more control
in the FC function, where a user can customize how constraints are
handled.
The interval
function returns a scalar, vector, matrix, or
array in accord with its argument, x
. Each element is
constrained to the interval [a
,b
].
Statisticat, LLC. [email protected]
dtrunc
,
ESS
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
LPL.interval
,
PMC
,
p.interval
,
VariationalBayes
.
#See the Examples vignette for numerous examples.
library(LaplacesDemon)
x <- 2
interval(x,0,1)
X <- matrix(runif(25,-2,2),5,5)
interval(X,-1,1)
This function returns TRUE
if Laplace's Demon is appeased by the
object of class demonoid
, and FALSE
otherwise. If
appeased, then the object passes several tests that indicate potential
convergence of the Markov chains.
is.appeased(x)
x |
This is an object of class demonoid. |
After updating a model with the LaplacesDemon
function,
an output object is created. The output object is of class
demonoid
. The object may be passed to the Consort
function, which will apply several criteria regarding the potential
convergence of its Markov chains. If all criteria are met, then
Laplace's Demon is appeased. Otherwise, Laplace's Demon suggests R
code to be copy/pasted and executed. The Consort
function prints a large amount of information to the screen. The
is.appeased
function may be applied as an alternative, though
it only informs the user as to whether or not Laplace's Demon was
appeased, as TRUE
or FALSE
.
The is.appeased
function returns a logical value indicating
whether or not the supplied object passes several potential Markov
chain convergence criteria. If the object passes all criteria, then
Laplace's Demon is appeased, and the logical value returned is
TRUE
.
Statisticat, LLC. [email protected]
Consort
and
LaplacesDemon
.
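A sketch of typical use, assuming an object called Fit of class demonoid from a previous run:
#library(LaplacesDemon)
###After updating with LaplacesDemon, do:
#is.appeased(Fit)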
This function provides a logical test of whether or not a Model
specification function is Bayesian.
is.bayesian(Model, Initial.Values, Data)
Model |
This is a model specification function. For more
information, see the LaplacesDemon function. |
Initial.Values |
This is a vector of initial values, or current
parameter values. For more information, see the
LaplacesDemon function. |
Data |
This is a list of data. For more information, see the
LaplacesDemon function. |
This function tests whether or not a model is Bayesian by comparing
the first two returned arguments: the logarithm of the unnormalized
joint posterior density (LP) and deviance (Dev). The
deviance (D) is

D = -2 * LL,

where LL is the log-likelihood. Consequently,

LL = D / (-2),

and LP is the sum of LL and prior probability densities. If LP = LL, then the model is not Bayesian, because prior densities are absent.
The is.bayesian
function returns a logical value of TRUE
when the model is Bayesian, and FALSE
otherwise.
Statisticat, LLC. [email protected]
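A sketch of typical use, assuming Model, Initial.Values, and MyData have been specified as in the LaplacesDemon documentation:
#library(LaplacesDemon)
#is.bayesian(Model, Initial.Values, MyData) #TRUE when LP includes prior densities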
These functions each provide a logical test of the class of an object.
is.bayesfactor(x)
is.blocks(x)
is.bmk(x)
is.demonoid(x)
is.demonoid.hpc(x)
is.demonoid.ppc(x)
is.demonoid.val(x)
is.hangartner(x)
is.heidelberger(x)
is.importance(x)
is.iterquad(x)
is.iterquad.ppc(x)
is.juxtapose(x)
is.laplace(x)
is.laplace.ppc(x)
is.miss(x)
is.pmc(x)
is.pmc.ppc(x)
is.pmc.val(x)
is.posteriorchecks(x)
is.raftery(x)
is.rejection(x)
is.sensitivity(x)
is.vb(x)
is.vb.ppc(x)
x |
This is an object that will be subjected to a logical test of its class. |
Functions in Laplace's Demon often assign a class to an output
object. For example, after updating a model with the
LaplacesDemon
or LaplacesDemon.hpc
function, an output object is created. The output object is of class
demonoid
or demonoid.hpc
, respectively. Likewise, after
passing a model to the LaplaceApproximation
function, an
output object is created, and it is of class laplace
. The class
of these and other objects may be logically tested.
By assigning a class to an output object, the package is able to
discern which other functions are appropriate for it. For example,
after updating a model with LaplacesDemon
, which creates
an object of class demonoid
, the user may desire to plot its
output. Since it is assigned a class, the user may use the generic
plot
function, which internally selects the
plot.demonoid
function, which differs from
plot.laplace
for objects of class laplace
.
For more information on object classes, see the class
function.
The is.bayesfactor
function returns a logical value indicating
whether or not the supplied object is of class bayesfactor
.
The is.blocks
function returns a logical value indicating
whether or not the supplied object is of class blocks
.
The is.bmk
function returns a logical value indicating
whether or not the supplied object is of class bmk
.
The is.demonoid
function returns a logical value indicating
whether or not the supplied object is of class demonoid
.
The is.demonoid.hpc
function returns a logical value indicating
whether or not the supplied object is of class demonoid.hpc
.
The is.demonoid.ppc
function returns a logical value indicating
whether or not the supplied object is of class demonoid.ppc
.
The is.demonoid.val
function returns a logical value indicating
whether or not the supplied object is of class demonoid.val
.
The is.hangartner
function returns a logical value indicating
whether or not the supplied object is of class hangartner
.
The is.heidelberger
function returns a logical value indicating
whether or not the supplied object is of class heidelberger
.
The is.importance
function returns a logical value indicating
whether or not the supplied object is of class importance
.
The is.iterquad
function returns a logical value indicating
whether or not the supplied object is of class iterquad
.
The is.iterquad.ppc
function returns a logical value indicating
whether or not the supplied object is of class iterquad.ppc
.
The is.juxtapose
function returns a logical value indicating
whether or not the supplied object is of class juxtapose
.
The is.laplace
function returns a logical value indicating
whether or not the supplied object is of class laplace
.
The is.laplace.ppc
function returns a logical value indicating
whether or not the supplied object is of class laplace.ppc
.
The is.miss
function returns a logical value indicating
whether or not the supplied object is of class miss
.
The is.pmc
function returns a logical value indicating
whether or not the supplied object is of class pmc
.
The is.pmc.ppc
function returns a logical value indicating
whether or not the supplied object is of class pmc.ppc
.
The is.pmc.val
function returns a logical value indicating
whether or not the supplied object is of class pmc.val
.
The is.posteriorchecks
function returns a logical value
indicating whether or not the supplied object is of class
posteriorchecks
.
The is.raftery
function returns a logical value indicating
whether or not the supplied object is of class raftery
.
The is.rejection
function returns a logical value indicating
whether or not the supplied object is of class rejection
.
The is.sensitivity
function returns a logical value indicating
whether or not the supplied object is of class sensitivity
.
The is.vb
function returns a logical value indicating
whether or not the supplied object is of class vb
.
The is.vb.ppc
function returns a logical value indicating
whether or not the supplied object is of class vb.ppc
.
Statisticat, LLC. [email protected]
BayesFactor
,
Blocks
,
BMK.Diagnostic
,
class
,
Hangartner.Diagnostic
,
Heidelberger.Diagnostic
,
Importance
,
IterativeQuadrature
,
Juxtapose
,
LaplaceApproximation
,
LaplacesDemon
,
LaplacesDemon.hpc
,
MISS
,
PMC
,
PosteriorChecks
,
predict.demonoid
,
predict.laplace
,
predict.pmc
,
predict.vb
,
Raftery.Diagnostic
,
RejectionSampling
,
SensitivityAnalysis
,
Validate
, and
VariationalBayes
.
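A brief sketch follows; Fit is a hypothetical object assumed to have been returned by LaplacesDemon, so the calls that require it are commented out.

library(LaplacesDemon)
#Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values=Initial.Values)
#is.demonoid(Fit) #TRUE, since LaplacesDemon returns an object of class demonoid
#is.laplace(Fit)  #FALSE, since Fit was not created by LaplaceApproximation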
This function provides a logical test of whether or not a vector is a constant.
is.constant(x)
x |
This is a vector. |
As opposed to a variable, a constant is a vector that contains at most one unique value.
The is.constant
function returns a logical result, reporting
TRUE
when a vector is a constant, or FALSE
otherwise.
Statisticat, LLC. [email protected]
library(LaplacesDemon)
is.constant(rep(1,10)) #TRUE
is.constant(1:10) #FALSE
This function provides a logical test of constraints for each initial value or parameter for a model specification, given data.
is.constrained(Model, Initial.Values, Data)
Model |
This is a model specification function. For more
information, see the |
Initial.Values |
This is a vector of initial values, or current
parameter values. For more information, see the
|
Data |
This is a list of data. For more information, see the
|
This function is useful for testing whether or not initial values
changed due to constraints when being passed through a Model
specification function. If any initial value changes, then the
constrained values that are output in the fifth component of the
Model
specification are suitable as initial values, not the
tested initial values.
A parameter may be constrained and this function may not discover the constraint, since the discovery depends on the initial values and whether or not they change as they are passed through the model.
The is.constrained
function returns a logical vector, equal in
length to the number of initial values. Each element receives
TRUE
if the corresponding initial value changed due to a
constraint, or FALSE
if it did not.
Statisticat, LLC. [email protected]
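The following sketch is commented out because it assumes hypothetical Model, Initial.Values, and MyData objects of the kind constructed in the IterativeQuadrature example elsewhere in this manual.

library(LaplacesDemon)
##Assuming Model, Initial.Values, and MyData exist:
#constr <- is.constrained(Model, Initial.Values, MyData)
#constr #a logical vector; TRUE marks an initial value changed by a constraint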
This function provides a logical test of whether or not a given list
of data meets minimum criteria to be considered data for
IterativeQuadrature
, LaplaceApproximation
,
LaplacesDemon
, PMC
, or
VariationalBayes
.
is.data(Data)
Data |
This is a list of data. For more information, see the
|
This function is useful for testing whether or not a list of data
meets minimum criteria to be considered data in this package. The
minimum requirements are that Data
is a list, and it contains
mon.names
and parm.names
.
This function is not extensive. For example, it does not match the
length of parm.names
with the length of Initial.Values
,
or compare the length of mon.names
to the number of monitored
variables output from the Model
specification
function. Additional checks are conducted in
IterativeQuadrature
, LaplaceApproximation
,
LaplacesDemon
, PMC
, and
VariationalBayes
.
The is.data
function returns a logical value. It returns
TRUE
if Data
meets minimum requirements to be considered
data in this package, and FALSE
otherwise.
Statisticat, LLC. [email protected]
IterativeQuadrature
LaplaceApproximation
,
LaplacesDemon
,
PMC
, and
VariationalBayes
.
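Below is a minimal sketch of the stated minimum criteria; the toy list is illustrative only, and given those criteria the first call should return TRUE and the second FALSE.

library(LaplacesDemon)
MyData <- list(y=rnorm(10), mon.names="mu", parm.names=c("beta","sigma"))
is.data(MyData)            #expected TRUE: a list with mon.names and parm.names
is.data(list(y=rnorm(10))) #expected FALSE: mon.names and parm.names are missing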
This function provides a logical test of whether or not a Model
specification function meets minimum requirements to be considered as
such.
is.model(Model, Initial.Values, Data)
Model |
This is a model specification function. For more
information, see the |
Initial.Values |
This is a vector of initial values, or current
parameter values. For more information, see the
|
Data |
This is a list of data. For more information, see the
|
This function tests for minimum criteria for Model
to be
considered a model specification function. Specifically, it tests:
Model
must be a function
Model
must execute without errors
Model
must return a list
Model
must have five components in the list
The first component must be named LP and have length 1
The second component must be named Dev and have length 1
The third component must be named Monitor
The lengths of Monitor and mon.names must be equal
The fourth component must be named yhat
The fifth component must be named parm
The lengths of parm and parm.names must be equal
This function is not extensive, and checks only for these minimum
criteria. Additional checks are conducted in
IterativeQuadrature
, LaplaceApproximation
,
LaplacesDemon
, PMC
, and
VariationalBayes
.
The is.model
function returns a logical value of TRUE
when Model
meets minimum criteria of a model specification
function, and FALSE
otherwise.
Statisticat, LLC. [email protected]
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
PMC
, and
VariationalBayes
.
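Here is a hedged sketch of a toy model specification intended to satisfy the criteria listed above; the data, priors, and likelihood are illustrative, not taken from this manual.

library(LaplacesDemon)
MyData <- list(y=rnorm(10), mon.names="mu", parm.names=c("mu","sigma"))
Model <- function(parm, Data)
     {
     mu <- parm[1]
     sigma <- interval(parm[2], 1e-100, Inf)
     parm[2] <- sigma
     LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE))
     LP <- LL + dnorm(mu, 0, 10, log=TRUE) + dhalfcauchy(sigma, 25, log=TRUE)
     return(list(LP=LP, Dev=-2*LL, Monitor=mu,
          yhat=rnorm(length(Data$y), mu, sigma), parm=parm))
     }
Initial.Values <- c(0, 1)
is.model(Model, Initial.Values, MyData) #expected TRUE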
This function provides a logical check of the propriety of a univariate prior probability distribution or the joint posterior distribution.
is.proper(f, a, b, tol=1e-5)
f |
This is either a probability density function or an object of
class |
a |
This is the lower limit of integration, and may be negative infinity. |
b |
This is the upper limit of integration, and may be positive infinity. |
tol |
This is the tolerance, and indicates the allowable difference from one. |
A proper probability distribution is a probability distribution that integrates to one, and an improper probability distribution does not integrate to one. If a probability distribution integrates to any positive and finite value other than one, then it is an improper distribution, but is merely unnormalized. An unnormalized distribution may be multiplied by a constant so that it integrates to one.
In Bayesian inference, the posterior probability distribution should be proper. An improper prior distribution can cause an improper posterior distribution. When the posterior distribution is improper, inferences are invalid, it is non-integrable, and Bayes factors cannot be used (though there are exceptions).
To avoid these problems, it is suggested that the prior probability distribution should be proper, though it is possible to use an improper prior distribution and have it result in a proper posterior distribution.
To check the propriety of a univariate prior probability distribution,
create a function f
. For example, to check the propriety of a
vague normal distribution, such as N(0, 1000),
the function is function(x){dnormv(x,0,1000)}. Next, set the
lower and upper limits of integration, a and b.
Internally, this function calls integrate from base
R, which uses adaptive quadrature. By using f(x) as shorthand
for the specified function, is.proper
will check to see if the
area of the integral of f(x) from a to b is one.
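As an illustration of this internal check (a sketch only, not the function's exact code), integrate can be called directly on the same densities:

library(LaplacesDemon)
integrate(function(x) dnormv(x, 0, 1000), -Inf, Inf)$value #approximately 1
integrate(function(x) dhalfcauchy(x, 25), 0, Inf)$value    #approximately 1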
Multivariate prior probability distributions currently cannot be checked for approximate propriety. This is currently unavailable in this package.
To check the propriety of the joint posterior distribution, the only
argument to be supplied is an object of class demonoid
,
iterquad
, laplace
, pmc
, or vb
. The
is.proper
function checks the logarithm of the marginal
likelihood (see LML
) for a finite value, and returns
TRUE
when the LML is finite. This indicates that the marginal
likelihood is finite for all y observed in the model
data set, which implies that the joint posterior distribution is proper.
If the object is of class demonoid
and the algorithm was
adaptive, or if the object is of class iterquad
,
laplace
, or vb
and the algorithm did not converge, then
is.proper
will return FALSE
because LML was not
estimated. In this case, it is possible for the joint posterior to be
proper, but is.proper
will be unable to determine propriety
without the estimate of LML. If desired, the LML
may be
estimated by the user, and if it is finite, then the joint posterior
distribution is proper.
The is.proper
function returns a logical value indicating
whether or not the univariate prior or joint posterior probability
distribution integrates to one within its specified limits.
TRUE
is returned for a proper univariate probability
distribution.
Statisticat, LLC. [email protected]
dnormv
,
integrate
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
LML
,
PMC
, and
VariationalBayes
.
library(LaplacesDemon)
### Prior Probability Distribution
is.proper(function(x) {dnormv(x,0,1000)}, -Inf, Inf)  #x ~ N(0,1000)
is.proper(function(x) {dhalfcauchy(x,25)}, 0, Inf)    #x ~ HC(25)
is.proper(function(x) {dunif(x,0,1)}, 0, 1)           #x ~ U(0,1)
is.proper(function(x) {dunif(x,-Inf,Inf)}, -Inf, Inf) #x ~ U(-Inf,Inf)
### Joint Posterior Distribution
##This assumes that Fit is an object of class demonoid, iterquad,
## laplace, or pmc
#is.proper(Fit)
This function returns TRUE
if the object is stationary
according to the Geweke.Diagnostic
function, and
FALSE
otherwise.
is.stationary(x)
x |
This is a vector, matrix, or object of class
|
Stationarity, here, refers to the limiting distribution in a Markov chain. A series of samples from a Markov chain, in which each sample is the result of an iteration of a Markov chain Monte Carlo (MCMC) algorithm, is analyzed for stationarity, meaning whether or not the samples trend or their moments change across iterations. A stationary posterior distribution is an equilibrium distribution, and assessing stationarity is an important diagnostic toward inferring Markov chain convergence.
In the cases of a matrix or an object of class demonoid
, all
Markov chains (as column vectors) must be stationary for
is.stationary
to return TRUE
.
Alternative ways to assess stationarity of chains are to use the
BMK.Diagnostic
or Heidelberger.Diagnostic
functions.
is.stationary
returns a logical value indicating whether or not
the supplied object is stationary according to the
Geweke.Diagnostic
function.
Statisticat, LLC. [email protected]
BMK.Diagnostic
,
Geweke.Diagnostic
,
Heidelberger.Diagnostic
, and
LaplacesDemon
.
library(LaplacesDemon)
is.stationary(rnorm(100))
is.stationary(matrix(rnorm(100),10,10))
The IterativeQuadrature
function iteratively approximates the
first two moments of marginal posterior distributions of a Bayesian
model with deterministic integration.
IterativeQuadrature(Model, parm, Data, Covar=NULL, Iterations=100,
     Algorithm="CAGH", Specs=NULL, Samples=1000, sir=TRUE,
     Stop.Tolerance=c(1e-5,1e-15), CPUs=1, Type="PSOCK")
Model |
This required argument receives the model from a
user-defined function. The user-defined function is where the model
is specified. |
parm |
This argument requires a vector of initial values equal in
length to the number of parameters. |
Data |
This required argument accepts a list of data. The list of
data must include |
Covar |
This argument accepts a |
Iterations |
This argument accepts an integer that determines the
number of iterations that |
Algorithm |
This optional argument accepts a quoted string that
specifies the iterative quadrature algorithm. The default
method is |
Specs |
This argument accepts a list of specifications for an algorithm. |
Samples |
This argument indicates the number of posterior samples
to be taken with sampling importance resampling via the
|
sir |
This logical argument indicates whether or not Sampling
Importance Resampling (SIR) is conducted via the |
Stop.Tolerance |
This argument accepts a vector of two positive
numbers, and defaults to |
CPUs |
This argument accepts an integer that specifies the number
of central processing units (CPUs) of the multicore computer or
computer cluster. This argument defaults to |
Type |
This argument specifies the type of parallel processing to
perform, accepting either |
Quadrature is a historical term in mathematics that means determining area. Mathematicians of ancient Greece, according to the Pythagorean doctrine, understood determination of area of a figure as the process of geometrically constructing a square having the same area (squaring). Thus the name quadrature for this process.
In medieval Europe, quadrature meant the calculation of area by any method. With the invention of integral calculus, quadrature has been applied to the computation of a univariate definite integral. Numerical integration is a broad family of algorithms for calculating the numerical value of a definite integral. Numerical quadrature is a synonym for quadrature applied to one-dimensional integrals. Multivariate quadrature, also called cubature, is the application of quadrature to multidimensional integrals.
A quadrature rule is an approximation of the definite integral of a function, usually stated as a weighted sum of function values at specified points within the domain of integration. The specified points are referred to as abscissae, abscissas, integration points, or nodes, and have associated weights. The calculation of the nodes and weights of the quadrature rule differs by the type of quadrature. There are numerous types of quadrature algorithms. Bayesian forms of quadrature usually use Gauss-Hermite quadrature (Naylor and Smith, 1982), and placing a Gaussian Process on the function is a common extension (O'Hagan, 1991; Rasmussen and Ghahramani, 2003) that is called ‘Bayesian Quadrature’. Often, these and other forms of quadrature are also referred to as model-based integration.
Gauss-Hermite quadrature uses Hermite polynomials to calculate the
rule. However, there are two versions of Hermite polynomials, which
result in different kernels in different fields. In physics, the
kernel is exp(-x^2)
, while in probability the kernel is
exp(-x^2/2)
. The weights are a normal density. If the
parameters of the normal distribution, mu and sigma^2,
are estimated from data, then it is referred
to as adaptive Gauss-Hermite quadrature, and the parameters are the
conditional mean and conditional variance. Outside of Gauss-Hermite
quadrature, adaptive quadrature implies that a difficult range in the
integrand is subdivided with more points until it is
well-approximated. Gauss-Hermite quadrature performs well when the
integrand is smooth, and assumes normality or multivariate normality.
Adaptive Gauss-Hermite quadrature has been demonstrated to outperform
Gauss-Hermite quadrature in speed and accuracy.
A goal in quadrature is to minimize integration error, which is the error between the evaluations and the weights of the rule. Therefore, a goal in Bayesian Gauss-Hermite quadrature is to minimize integration error while approximating a marginal posterior distribution that is assumed to be smooth and normally-distributed. This minimization often occurs by increasing the number of nodes until a change in mean integration error is below a tolerance, rather than minimizing integration error itself, since the target may be only approximately normally distributed, or minimizing the sum of integration error, which would change with the number of nodes.
To approximate integrals in multiple dimensions, one approach applies
nodes of a univariate quadrature rule to multiple dimensions
(using the
GaussHermiteCubeRule
function for example)
via the product rule, which results in many more multivariate nodes.
This requires the number of function evaluations to grow exponentially
as dimension increases. Multidimensional quadrature is usually limited
to less than ten dimensions, both due to the number of nodes required,
and because the accuracy of multidimensional quadrature algorithms
decreases as the dimension increases. Three methods may overcome this
curse of dimensionality in varying degrees: componentwise quadrature,
sparse grids, and Monte Carlo.
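The exponential growth of the product rule can be seen with simple arithmetic; the sketch below merely computes N^d for N=5 univariate nodes.

N <- 5
sapply(1:10, function(d) N^d) #multivariate nodes under the product rule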
Componentwise quadrature is the iterative application of univariate quadrature to each parameter. It is applicable with high-dimensional models, but sacrifices the ability to calculate the conditional covariance matrix, and calculates only the variance of each parameter.
Sparse grids were originally developed by Smolyak for multidimensional quadrature. A sparse grid is based on a one-dimensional quadrature rule. Only a subset of the nodes from the product rule is included, and the weights are appropriately rescaled. Although a sparse grid is more efficient because it reduces the number of nodes to achieve the same accuracy, the user must contend with increasing the accuracy of the grid, and it remains inapplicable to high-dimensional integrals.
Monte Carlo is a large family of sampling-based algorithms. O'Hagan
(1987) asserts that Monte Carlo is frequentist, inefficient, regards
irrelevant information, and disregards relevant information.
Quadrature, he maintains (O'Hagan, 1992), is the most Bayesian
approach, and also the most efficient. In high dimensions, he
concedes, a popular subset of Monte Carlo algorithms is currently the
best for cheap model function evaluations. These algorithms are called
Markov chain Monte Carlo (MCMC). High-dimensional models with
expensive model evaluation functions, however, are not well-suited to
MCMC. A large number of MCMC algorithms is available in the
LaplacesDemon
function.
Following are some reasons to consider iterative quadrature rather than MCMC. Once an MCMC sampler finds equilibrium, it must then draw enough samples to represent all targets. Iterative quadrature does not need to continue drawing samples. Multivariate quadrature is consistently reported as more efficient than MCMC when its assumptions hold, though multivariate quadrature is limited to small dimensions. High-dimensional models therefore default to MCMC, between the two. Componentwise quadrature algorithms like CAGH, however, may also be more efficient with clock-time than MCMC in high dimensions, especially against componentwise MCMC algorithms. Another reason to consider iterative quadrature is that assessing convergence in MCMC is a difficult topic, but not for iterative quadrature. A user of iterative quadrature does not have to contend with effective sample size and autocorrelation, assessing stationarity, acceptance rates, diminishing adaptation, etc. Stochastic sampling in MCMC is less efficient when samples occur in close proximity (such as when highly autocorrelated), whereas in quadrature the nodes are spread out by design.
In general, the conditional means and conditional variances progress smoothly to the target in multidimensional quadrature. For componentwise quadrature, movement to the target is not smooth, and often resembles a Markov chain or optimization algorithm.
Iterative quadrature is often applied after
LaplaceApproximation
to obtain a more reliable
estimate of parameter variance or covariance than the negative inverse
of the Hessian
matrix of second derivatives, which is
suitable only when the contours of the logarithm of the unnormalized
joint posterior density are approximately ellipsoidal (Naylor and
Smith, 1982, p. 224).
When Algorithm="AGH"
, the Naylor and Smith (1982) algorithm
is used. The AGH algorithm uses multivariate quadrature with the
physicist's (not the probabilist's) kernel.
There are four algorithm specifications: N
is the number of
univariate nodes, Nmax
is the maximum number of univariate
nodes, Packages
accepts any package required for the model
function when parallelized, and Dyn.libs
accepts dynamic
libraries for parallelization, if required. The number of univariate
nodes begins at N and increases by one each iteration. The
number of multivariate nodes grows quickly with N. Naylor and
Smith (1982) recommend beginning with as few nodes as N=3. Any
of the following events will cause N to increase by 1 when N is less
than Nmax:
All LP weights are zero (and non-finite weights are set to zero)
The conditional mean mu does not result in an increase in LP
All elements in the conditional covariance Sigma are not finite
The square root of the sum of the squared changes in mu
is less than or equal to the Stop.Tolerance
Tolerance includes two metrics: change in mean integration error and change in parameters. Including the change in parameters for tolerance was not mentioned in Naylor and Smith (1982).
Naylor and Smith (1982) consider a transformation due to correlation. This is not included here.
The AGH algorithm does not currently handle constrained parameters,
such as with the interval
function. If a parameter is
constrained and changes during a model evaluation, this changes the
node and the multivariate weight. This is currently not corrected.
An advantage of AGH over componentwise adaptive quadrature is that AGH estimates covariance, where a componentwise algorithm ignores it. A disadvantage of AGH over a componentwise algorithm is that the number of nodes increases so quickly with dimension, that AGH is limited to small-dimensional models.
When Algorithm="AGHSG"
, the Naylor and Smith (1982) algorithm
is applied to a sparse grid, rather than a traditional multivariate
quadrature rule. This is identical to the AGH algorithm above, except
that a sparse grid replaces the multivariate quadrature rule.
The sparse grid reduces the number of nodes. The cost of reducing the
number of nodes is that the user must consider the accuracy, K.
There are four algorithm specifications: K
is the accuracy (as a
positive integer), Kmax
is the maximum accuracy,
Packages
accepts any package required for the model function
when parallelized, and Dyn.libs
accepts dynamic libraries for
parallelization, if required. These arguments represent accuracy
rather than the number of univariate nodes, but otherwise are similar
to the AGH algorithm.
When Algorithm="CAGH"
, a componentwise version of the adaptive
Gauss-Hermite quadrature of Naylor and Smith (1982) is used. Each
iteration, each marginal posterior distribution is approximated
sequentially, in a random order, with univariate quadrature. The
conditional mean and conditional variance are also approximated each
iteration, making it an adaptive algorithm.
There are four algorithm specifications: N
is the number of
nodes, Nmax
is the maximum number of nodes, Packages
accepts any package required for the model function when parallelized,
and Dyn.libs
accepts dynamic libraries for parallelization, if
required. The number of nodes begins at N. All parameters have
the same number of nodes. Any of the following events will cause N
to increase by 1 when N is less than Nmax, and
these conditions refer to all parameters (not individually):
Any LP weights are not finite
All LP weights are zero
The conditional mean mu does not result in an increase in LP
The square root of the sum of the squared changes in mu
is less than or equal to the Stop.Tolerance
It is recommended to begin with N=3
and set Nmax
between
10 and 100. As long as CAGH does not experience problematic weights,
and as long as CAGH is improving LP, the number of nodes
does not increase. When CAGH becomes either universally problematic or
universally stable, then N slowly increases until the sum of
both the mean integration error and the sum of the squared changes in mu
is less than the
Stop.Tolerance
for two consecutive
iterations.
If the highest LP occurs at the lowest or highest node, then the value at that node becomes the conditional mean, rather than calculating it from all weighted samples; this facilitates movement when the current integral is poorly centered toward a well-centered integral. If all weights are zero, then a random proposal is generated with a small variance.
Tolerance includes two metrics: change in mean integration error and change in parameters, as the square root of the sum of the squared differences.
When a parameter constraint is encountered, the node and weight of the quadrature rule is recalculated.
An advantage of CAGH over multidimensional adaptive quadrature is that CAGH may be applied in large dimensions. Disadvantages of CAGH are that only variance, not covariance, is estimated, and ignoring covariance may be problematic.
IterativeQuadrature
returns an object of class iterquad
that is a list with the following components:
Algorithm |
This is the name of the iterative quadrature algorithm. |
Call |
This is the matched call of |
Converged |
This is a logical indicator of whether or not
|
Covar |
This is the estimated covariance matrix. The |
Deviance |
This is a vector of the iterative history of the
deviance in the |
History |
This is a matrix of the iterative history of the
parameters in the |
Initial.Values |
This is the vector of initial values that was
originally given to |
LML |
This is an approximation of the logarithm of the marginal
likelihood of the data (see the |
LP.Final |
This reports the final scalar value for the logarithm of the unnormalized joint posterior density. |
LP.Initial |
This reports the initial scalar value for the logarithm of the unnormalized joint posterior density. |
LPw |
This is the latest matrix of the logarithm of the unnormalized joint posterior density. It is weighted and normalized so that each column sums to one. |
M |
This is the final |
Minutes |
This is the number of minutes that
|
Monitor |
When |
N |
This is the final number of nodes. |
Posterior |
When |
Summary1 |
This is a summary matrix that summarizes the point-estimated posterior means. Uncertainty around the posterior means is estimated from the covariance matrix. Rows are parameters. The following columns are included: Mean, SD (Standard Deviation), LB (Lower Bound), and UB (Upper Bound). The bounds constitute a 95% probability interval. |
Summary2 |
This is a summary matrix that summarizes the
posterior samples drawn with sampling importance resampling
( |
Tolerance.Final |
This is the last |
Tolerance.Stop |
This is the |
Z |
This is the final |
Statisticat, LLC [email protected]
Naylor, J.C. and Smith, A.F.M. (1982). "Applications of a Method for the Efficient Computation of Posterior Distributions". Applied Statistics, 31(3), p. 214–225.
O'Hagan, A. (1987). "Monte Carlo is Fundamentally Unsound". The Statistician, 36, p. 247–249.
O'Hagan, A. (1991). "Bayes-Hermite Quadrature". Journal of Statistical Planning and Inference, 29, p. 245–260.
O'Hagan, A. (1992). "Some Bayesian Numerical Analysis". In Bernardo, J.M., Berger, J.O., David, A.P., and Smith, A.F.M., editors, Bayesian Statistics, 4, p. 356–363, Oxford University Press.
Rasmussen, C.E. and Ghahramani, Z. (2003). "Bayesian Monte Carlo". In Becker, S. and Obermayer, K., editors, Advances in Neural Information Processing Systems, 15, MIT Press, Cambridge, MA.
GaussHermiteCubeRule
,
GaussHermiteQuadRule
,
GIV
,
Hermite
,
Hessian
,
LaplaceApproximation
,
LaplacesDemon
,
LML
,
PMC
,
SIR
, and
SparseGrid
.
# The accompanying Examples vignette is a compendium of examples.
#################### Load the LaplacesDemon Library #####################
library(LaplacesDemon)
############################## Demon Data ###############################
data(demonsnacks)
y <- log(demonsnacks$Calories)
X <- cbind(1, as.matrix(log(demonsnacks[,10]+1)))
J <- ncol(X)
for (j in 2:J) X[,j] <- CenterScale(X[,j])
######################### Data List Preparation #########################
mon.names <- "mu[1]"
parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0))
pos.beta <- grep("beta", parm.names)
pos.sigma <- grep("sigma", parm.names)
PGF <- function(Data) {
     beta <- rnorm(Data$J)
     sigma <- runif(1)
     return(c(beta, sigma))
     }
MyData <- list(J=J, PGF=PGF, X=X, mon.names=mon.names,
     parm.names=parm.names, pos.beta=pos.beta, pos.sigma=pos.sigma, y=y)
########################## Model Specification ##########################
Model <- function(parm, Data)
     {
     ### Parameters
     beta <- parm[Data$pos.beta]
     sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf)
     parm[Data$pos.sigma] <- sigma
     ### Log-Prior
     beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE))
     sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE)
     ### Log-Likelihood
     mu <- tcrossprod(Data$X, t(beta))
     LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE))
     ### Log-Posterior
     LP <- LL + beta.prior + sigma.prior
     Modelout <- list(LP=LP, Dev=-2*LL, Monitor=mu[1],
          yhat=rnorm(length(mu), mu, sigma), parm=parm)
     return(Modelout)
     }
############################ Initial Values #############################
#Initial.Values <- GIV(Model, MyData, PGF=TRUE)
Initial.Values <- rep(0,J+1)
######################### Adaptive Gauss-Hermite ########################
#Fit <- IterativeQuadrature(Model, Initial.Values, MyData, Covar=NULL,
#     Iterations=100, Algorithm="AGH",
#     Specs=list(N=5, Nmax=7, Packages=NULL, Dyn.libs=NULL), CPUs=1)
################## Adaptive Gauss-Hermite Sparse Grid ###################
#Fit <- IterativeQuadrature(Model, Initial.Values, MyData, Covar=NULL,
#     Iterations=100, Algorithm="AGHSG",
#     Specs=list(K=5, Kmax=7, Packages=NULL, Dyn.libs=NULL), CPUs=1)
################# Componentwise Adaptive Gauss-Hermite ##################
#Fit <- IterativeQuadrature(Model, Initial.Values, MyData, Covar=NULL,
#     Iterations=100, Algorithm="CAGH",
#     Specs=list(N=3, Nmax=10, Packages=NULL, Dyn.libs=NULL), CPUs=1)
#Fit
#print(Fit)
#PosteriorChecks(Fit)
#caterpillar.plot(Fit, Parms="beta")
#plot(Fit, MyData, PDF=FALSE)
#Pred <- predict(Fit, Model, MyData, CPUs=1)
#summary(Pred, Discrep="Chi-Square")
#plot(Pred, Style="Covariates", Data=MyData)
#plot(Pred, Style="Density", Rows=1:9)
#plot(Pred, Style="Fitted")
#plot(Pred, Style="Jarque-Bera")
#plot(Pred, Style="Predictive Quantiles")
#plot(Pred, Style="Residual Density")
#plot(Pred, Style="Residuals")
#Levene.Test(Pred)
#Importance(Fit, Model, MyData, Discrep="Chi-Square")
#End
This function plots the joint kernel density from samples of two marginal posterior distributions.
joint.density.plot(x, y, Title=NULL, contour=TRUE, color=FALSE, Trace=NULL)
x , y
|
These are vectors consisting of samples from two marginal
posterior distributions, such as those output by
|
Title |
This is the title of the joint posterior density plot. |
contour |
This logical argument indicates whether or not contour
lines will be added to the plot. |
color |
This logical argument indicates whether or not color will
be added to the plot. |
Trace |
This argument defaults to |
This function produces either a bivariate scatterplot that may have kernel density contour lines added, or a bivariate plot with kernel density-influenced colors, which may also have kernel density contour lines added. A joint density plot may be more informative than two univariate density plots.
The Trace
argument allows the user to view the exploration of
the joint density, such as from MCMC chain output. An efficient
algorithm jumps to random points of the joint density, and an
inefficient algorithm explores more slowly. The initial point of the
trace (which is the first element passed to Trace
) is plotted
with a green dot. The user should consider plotting the joint density of
the two marginal posterior distributions with the highest
IAT
, as identified with the
PosteriorChecks
function, since these are the two least
efficient MCMC chains. Different sequences of iterations may be
plotted. This ‘joint trace plot’ may show behavior of the MCMC
algorithm to the user.
Statisticat, LLC. [email protected]
IAT
,
LaplacesDemon
, and
PosteriorChecks
library(LaplacesDemon)
X <- rmvn(1000, runif(2), diag(2))
joint.density.plot(X[,1], X[,2], Title="Joint Density Plot",
     contour=TRUE, color=FALSE)
joint.density.plot(X[,1], X[,2], Title="Joint Density Plot",
     contour=FALSE, color=TRUE)
joint.density.plot(X[,1], X[,2], Title="Joint Density Plot",
     contour=TRUE, color=TRUE)
joint.density.plot(X[,1], X[,2], Title="Joint Trace Plot",
     contour=FALSE, color=TRUE, Trace=c(1,10))
Given two vectors, the joint.pr.plot
function creates a
scatterplot with ellipses of probability regions.
joint.pr.plot(x, y, quantiles=c(0.25,0.50,0.75,0.95))
x |
This required argument is a vector. |
y |
This required argument is a vector. |
quantiles |
These are the quantiles for which probability regions
are estimated with ellipses. The center of the ellipse is plotted by
default. The 0.95 quantile creates a probability region that
contains approximately 95% of the data or samples of |
A probability region is also commonly called a credible region. For
more information on probability regions, see p.interval
.
Joint probability regions are plotted only for two variables, and the
regions are estimated with functions modified from the car
package. The internal ellipse functions assume bivariate normality.
This function is often used to plot posterior distributions of
samples, such as from the LaplacesDemon
function.
Statisticat, LLC. [email protected]
library(LaplacesDemon)
x <- rnorm(100)
y <- rnorm(100)
joint.pr.plot(x, y)
This function gives a side-by-side comparison of (or juxtaposes) the
inefficiency of MCMC algorithms in LaplacesDemon
for
applied use, and is a valuable tool for selecting what is likely to be
the least inefficient algorithm for the user's current model, prior to
updating the final, intended model.
Juxtapose(x)
x |
This is a list of multiple components. Each component must
be an object of class |
Laplace's Demon recommends using the Juxtapose
function on the
user's model (or most likely a simplified version of it) with a
smaller, simulated data set to select the least inefficient MCMC
algorithm before using real data and updating the model for numerous
iterations. The least inefficient MCMC algorithm differs for different
models and data sets. Using Juxtapose
in this way does not
guarantee that the selected algorithm will remain the best choice with
real data, but it should be better than otherwise selecting an
algorithm.
The user must make a decision regarding their model and data. The more
similar the model and data are to the final, intended model and data,
the more appropriate will be the results of the Juxtapose
function. However, if the full model and data are used, then the user
may as well instead skip using Juxtapose
and proceed directly
to LaplacesDemon
. Replacing the actual data set with a
smaller, simulated set is fairly straightforward, but the
decision-making will most likely focus on what is the best way to
reduce the full model specification. A simple approach may be to
merely reduce the number of predictors. However, complicated models
may have several components that slow down estimation time, and extend
the amount of time until global stationarity is estimated. Laplace's
Demon offers no guidance here, and leaves it in the realm of user
discretion.
First, the user should simulate a smaller data set, and if best,
reduce the model specification. Next, the user must select candidate
algorithms. Then, the user must update each algorithm with
LaplacesDemon
for numerous iterations, with the goal of
achieving stationarity for all parameters early in the
iterations. Each update should begin with the same model specification
function, vector of initial values, and data. Each output object of
class demonoid
should be renamed. An example follows.
Suppose a user considers three candidate algorithms for their model:
AMWG, NUTS, and twalk. The user updates each model, saving the model
that used the AMWG algorithm as, say, Fit1
, the NUTS model as
Fit2
, and the twalk model as Fit3
.
Next, the output model objects are put in a list and passed to the
Juxtapose
function. See the example below.
The Juxtapose
function uses an internal version of the
IAT
, which is a slightly modified version of that found
in the SamplerCompare
package. The Juxtapose
function
returns an object of class juxtapose
. It is a matrix in which
each row is a result and each column is an algorithm.
The rows are:
iter.min
: This is the iterations per minute.
t.iter.min
: This is the thinned iterations per minute.
prop.stat
: This is the proportion of iterations that
were stationary.
IAT.025
: This is the 2.5% quantile of the integrated
autocorrelation time of the worst parameter, estimated only on
samples when all parameters are estimated to be globally stationary.
IAT.500
: This is the median integrated autocorrelation
time of the worst parameter, estimated only on samples when all
parameters are estimated to be globally stationary.
IAT.975
: This is the 97.5% quantile of the integrated
autocorrelation time of the worst parameter, estimated only on
samples when all parameters are estimated to be globally stationary.
ISM.025
: This is the 2.5% quantile of the number of
independent samples per minute.
ISM.500
: This is the median of the number of the
independent samples per minute. The least inefficient MCMC algorithm
has the highest ISM.500
.
ISM.975
: This is the 97.5% quantile of the number of
the independent samples per minute.
As for calculating ISM, let TIM be the observed number of
thinned iterations per minute, PS be the proportion of iterations
in which all parameters were estimated to be globally stationary, and
IAT.q be a quantile from a simulated distribution of the
integrated autocorrelation time among the parameters. Then
ISM = (PS x TIM) / IAT.q.
There are various ways to measure the inefficiency of MCMC
samplers. IAT
is used perhaps most often. As with the
SamplerCompare
package, Laplace's Demon uses the worst
parameter, in terms of IAT
. Often, the number of
evaluations or number of parameters is considered. The
Juxtapose
function, instead considers the final criterion of
MCMC efficiency, in an applied context, to be ISM
, or the
number of Independent (thinned) Samples per Minute. The algorithm with
the highest ISM.500
is the best, or least inefficient,
algorithm with respect to its worst IAT
, the proportion
of iterations required to seem to have global stationarity, and the
number of (thinned) iterations per minute.
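For intuition, the following numeric sketch applies the adjusted ISM calculation described above to made-up values of TIM, PS, and an IAT quantile; all numbers are hypothetical.

TIM <- 1000 #thinned iterations per minute (hypothetical)
PS  <- 0.8  #proportion of iterations with all parameters stationary
IAT.q <- 10 #a quantile of the worst-parameter IAT (hypothetical)
(PS * TIM) / IAT.q #independent samples per minute at that IAT quantile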
A disadvantage of using time is that it will differ by computer, and is less likely to be reported in a journal. The advantage, though, is that it is more meaningful to a user. Increases in the number of evaluations, parameters, and time should all correlate well, but time may enlighten a user as to expected run-time given the model just studied, even though the real data set will most likely be larger than the simulated data used initially. NUTS is an example of a sampler in which the number of evaluations varies per iteration. For an alternative approach, see Thompson (2010).
The Juxtapose
function also adjusts ISM
by
prop.stat
, the proportion of the iterations in which all chains
were estimated to be stationary. This adjustment is weighted by
burn-in iterations, penalizing an algorithm that took longer to
achieve global stationarity. The goal, again, is to assist the user in
selecting the least inefficient MCMC algorithm in an applied setting.
The Juxtapose
function has many other potential uses than those
described above. One additional use of the Juxtapose
function is
to compare inefficiencies within a single algorithm in which
algorithmic specifications varied with different model
updates. Another use is to investigate parallel chains in an object of
class demonoid.hpc
, as returned from the
LaplacesDemon.hpc
function. Yet another use is to
compare the effects of small changes to a model specification
function, such as with priors, or due to an increase in the amount of
simulated data.
An object of class juxtapose
may be plotted with the
plot.juxtapose
function, which displays ISM
by
default, or optionally IAT
. For more information, see the
plot.juxtapose
function.
Independent samples per minute, calculated as ESS
divided by minutes of run-time, are also available by parameter in the
PosteriorChecks
function.
This function returns an object of class juxtapose
. It is a
matrix with nine results (rows) for each of the supplied MCMC algorithms (columns).
Thompson, M. (2010). "Graphical Comparison of MCMC Performance". ArXiv e-prints, eprint 1011.4458.
IAT
,
is.juxtapose
,
LaplacesDemon
,
LaplacesDemon.hpc
,
plot.juxtapose
, and
PosteriorChecks
.
### Update three demonoid objects, each from different MCMC algorithms.
### Suppose Fit1 was updated with AFSS, Fit2 with AMWG, and
### Fit3 with NUTS. Then, compare the inefficiencies:
#Juxt <- Juxtapose(list(Fit1=Fit1, Fit2=Fit2, Fit3=Fit3)); Juxt
#plot(Juxt, Style="ISM")
This function calculates the Kullback-Leibler divergence (KLD) between two probability distributions, and has many uses, such as in lowest posterior loss probability intervals, posterior predictive checks, prior elicitation, reference priors, and Variational Bayes.
KLD(px, py, base)
px |
This is a required vector of probability densities,
considered as |
py |
This is a required vector of probability densities,
considered as |
base |
This optional argument specifies the logarithmic base,
which defaults to |
The Kullback-Leibler divergence (KLD) is known by many names, some of
which are Kullback-Leibler distance, K-L, and logarithmic divergence.
KLD is an asymmetric measure of the difference, distance, or directed
divergence between two probability distributions p(x) and p(y)
(Kullback and Leibler, 1951). Mathematically, however, KLD is not a
distance, because of its asymmetry.
Here, p(x) represents the “true” distribution of data, observations, or
theoretical distribution, and p(y) represents a theory, model, or
approximation of p(x).
For probability distributions p(x) and p(y) that are discrete (whether
the underlying distribution is continuous or discrete, the observations
themselves are always discrete, such as x_i for i = 1,...,N),

KLD[p(x) || p(y)] = sum over i of p(x_i) (log p(x_i) - log p(y_i))
In Bayesian inference, KLD can be used as a measure of the information
gain in moving from a prior distribution, p(theta),
to a posterior distribution, p(theta | y).
As such, KLD is the basis of reference priors and lowest
posterior loss intervals (
LPL.interval
), such as in
Berger, Bernardo, and Sun (2009) and Bernardo (2005). The intrinsic
discrepancy was introduced by Bernardo and Rueda (2002). For more
information on the intrinsic discrepancy, see
LPL.interval
.
KLD
returns a list with the following components:
KLD.px.py |
This is |
KLD.py.px |
This is |
mean.KLD |
This is the mean of the two components above. This is
the expected posterior loss in |
sum.KLD.px.py |
This is |
sum.KLD.py.px |
This is |
mean.sum.KLD |
This is the mean of the two components above. |
intrinsic.discrepancy |
This is the minimum of the two directed divergences. |
Statisticat, LLC. [email protected]
Berger, J.O., Bernardo, J.M., and Sun, D. (2009). "The Formal Definition of Reference Priors". The Annals of Statistics, 37(2), p. 905–938.
Bernardo, J.M. and Rueda, R. (2002). "Bayesian Hypothesis Testing: A Reference Approach". International Statistical Review, 70, p. 351–372.
Bernardo, J.M. (2005). "Intrinsic Credible Regions: An Objective Bayesian Approach to Interval Estimation". Sociedad de Estadistica e Investigacion Operativa, 14(2), p. 317–384.
Kullback, S. and Leibler, R.A. (1951). "On Information and Sufficiency". The Annals of Mathematical Statistics, 22(1), p. 79–86.
LPL.interval
and
VariationalBayes
.
library(LaplacesDemon)
px <- dnorm(runif(100),0,1)
py <- dnorm(runif(100),0.1,0.9)
KLD(px,py)
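As a sketch of the discrete sum above, the directed divergence may also be computed by hand after normalizing the densities to probabilities; whether this reproduces the packaged sum.KLD.px.py exactly depends on how KLD normalizes its inputs, so treat the comparison as illustrative.

library(LaplacesDemon)
px <- dnorm(runif(100),0,1)
py <- dnorm(runif(100),0.1,0.9)
p <- px / sum(px); q <- py / sum(py) #normalize densities to probabilities
sum(p * (log(p) - log(q)))           #directed divergence of p from q, by hand
KLD(px, py)$sum.KLD.px.py            #compare with the packaged result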
The Kolmogorov-Smirnov test is a nonparametric test of stationarity
that has been applied as an MCMC diagnostic (Brooks et al, 2003), such
as to the posterior samples from the LaplacesDemon
function. The first and last halves of the chain are compared. This
test assumes IID, which is violated in the presence of
autocorrelation.
The KS.Diagnostic
is a univariate diagnostic that is usually
applied to each marginal posterior distribution. A multivariate form
is not included. By chance alone due to multiple independent tests,
5% of the marginal posterior distributions should appear
non-stationary when stationarity exists. Assessing multivariate
convergence is difficult.
KS.Diagnostic(x)
x |
This is a vector of posterior samples for which a Kolmogorov-Smirnov test will be applied that compares the first and last halves for stationarity. |
There are two main approaches to using the Kolmogorov-Smirnov test as
an MCMC diagnostic. There is a version of the test that has
been adapted to account for autocorrelation (and is not included
here). Otherwise, the chain is thinned enough that autocorrelation is
not present or is minimized, in which case the two-sample
Kolmogorov-Smirnov test is applied. The CDFs of both samples are
compared. The ks.test
function in base R is used.
The advantage of the Kolmogorov-Smirnov test is that it is easier and faster to calculate. The disadvantages are that autocorrelation biases results, and the test is generally biased on the conservative side (indicating stationarity when it should not).
The KS.Diagnostic
function returns a frequentist p-value, and
stationarity is indicated when p > 0.05.
Statisticat, LLC. [email protected]
Brooks, S.P., Giudici, P., and Philippe, A. (2003). "Nonparametric Convergence Assessment for MCMC Model Selection". Journal of Computational and Graphical Statistics. 12(1), p. 1–22.
is.stationary
,
ks.test
, and
LaplacesDemon
.
library(LaplacesDemon)
x <- rnorm(1000)
KS.Diagnostic(x)
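The comparison that KS.Diagnostic performs can be sketched directly with ks.test from base R; this mirrors the description above and is not necessarily the exact internal code.

x <- rnorm(1000)
n <- length(x)
ks.test(x[1:(n/2)], x[(n/2 + 1):n])$p.value #stationarity indicated when p > 0.05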
The LaplaceApproximation
function deterministically maximizes
the logarithm of the unnormalized joint posterior density with one of
several optimization algorithms. The goal of Laplace Approximation is
to estimate the posterior mode and variance of each parameter. This
function is useful for optimizing initial values and estimating a
covariance matrix to be input into the
IterativeQuadrature
, LaplacesDemon
,
PMC
, or VariationalBayes
function, or
sometimes for model estimation in its own right.
LaplaceApproximation(Model, parm, Data, Interval=1.0E-6, Iterations=100,
     Method="SPG", Samples=1000, CovEst="Hessian", sir=TRUE,
     Stop.Tolerance=1.0E-5, CPUs=1, Type="PSOCK")
Model |
This required argument receives the model from a
user-defined function. The user-defined function is where the model
is specified. |
parm |
This argument requires a vector of initial values equal in
length to the number of parameters. |
Data |
This required argument accepts a list of data. The list of
data must include |
Interval |
This argument receives an interval for estimating approximate gradients. The logarithm of the unnormalized joint posterior density of the Bayesian model is evaluated at the current parameter value, and again at the current parameter value plus this interval. |
Iterations |
This argument accepts an integer that determines the
number of iterations that |
Method |
This optional argument accepts a quoted string that
specifies the method used for Laplace Approximation. The default
method is |
Samples |
This argument indicates the number of posterior samples
to be taken with sampling importance resampling via the
|
CovEst |
This argument accepts a quoted string that indicates how
the covariance matrix is estimated after the model finishes. This
covariance matrix is used to obtain the standard deviation of each
parameter, and may also be used for posterior sampling via Sampling
Importance Resampling (SIR) (see the |
sir |
This logical argument indicates whether or not Sampling
Importance Resampling (SIR) is conducted via the |
Stop.Tolerance |
This argument accepts any positive number and
defaults to 1.0E-5. Tolerance is calculated each iteration, and the
criteria varies by algorithm. The algorithm is considered to have
converged to the user-specified |
CPUs |
This argument accepts an integer that specifies the number
of central processing units (CPUs) of the multicore computer or
computer cluster. This argument defaults to |
Type |
This argument specifies the type of parallel processing to
perform, accepting either |
The Laplace Approximation or Laplace Method is a family of asymptotic techniques used to approximate integrals. Laplace's method accurately approximates unimodal posterior moments and marginal posterior distributions in many cases. Since it is not applicable in all cases, it is recommended here that Laplace Approximation is used cautiously in its own right, or preferably, it is used before MCMC.
After introducing the Laplace Approximation (Laplace, 1774, p. 366–367), a proof was published later (Laplace, 1814) as part of a mathematical system of inductive reasoning based on probability. Laplace used this method to approximate posterior moments.
Since its introduction, the Laplace Approximation has been applied successfully in many disciplines. In the 1980s, the Laplace Approximation experienced renewed interest, especially in statistics, and some improvements in its implementation were introduced (Tierney et al., 1986; Tierney et al., 1989). Only since the 1980s has the Laplace Approximation been seriously considered by statisticians in practical applications.
There are many variations of Laplace Approximation, with an effort toward replacing Markov chain Monte Carlo (MCMC) algorithms as the dominant form of numerical approximation in Bayesian inference. The run-time of Laplace Approximation is a little longer than Maximum Likelihood Estimation (MLE), usually shorter than variational Bayes, and much shorter than MCMC (Azevedo and Shachter, 1994).
The speed of Laplace Approximation depends on the optimization algorithm selected, and typically involves many evaluations of the objective function per iteration (where an MCMC algorithm with a multivariate proposal usually evaluates once per iteration), making many MCMC algorithms faster per iteration. The attractiveness of Laplace Approximation is that it typically improves the objective function better than iterative quadrature, MCMC, and PMC when the parameters are in low-probability regions. Laplace Approximation is also typically faster than MCMC and PMC because it is seeking point-estimates, rather than attempting to represent the target distribution with enough simulation draws. Laplace Approximation extends MLE, but shares similar limitations, such as its asymptotic nature with respect to sample size and that marginal posterior distributions are Gaussian. Bernardo and Smith (2000) note that Laplace Approximation is an attractive family of numerical approximation algorithms, and will continue to develop.
LaplaceApproximation
seeks a global maximum of the logarithm of
the unnormalized joint posterior density. The approach differs by
Method
. The LaplacesDemon
function uses the
LaplaceApproximation
algorithm to optimize initial values and
save time for the user.
Most optimization algorithms assume that the logarithm of the unnormalized joint posterior density is defined and differentiable. Some methods calculate an approximate gradient for each initial value as the difference in the logarithm of the unnormalized joint posterior density due to a slight increase in the parameter.
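A sketch of such a forward-difference approximation follows; the helper below is illustrative and hypothetical, not the package's internal code, and assumes a Model function that returns a list with component LP.

approx.gradient <- function(Model, parm, Data, Interval=1e-6) {
     LP0 <- Model(parm, Data)$LP #log of the unnormalized joint posterior
     sapply(seq_along(parm), function(j) {
          parm.j <- parm
          parm.j[j] <- parm.j[j] + Interval
          (Model(parm.j, Data)$LP - LP0) / Interval})
     }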
When Method="AGA"
, the direction and distance for each
parameter is proposed based on an approximate truncated gradient and
an adaptive step size. The step size parameter, which is often plural
and called rate parameters in other literature, is adapted each
iteration with the univariate version of the Robbins-Monro stochastic
approximation in Garthwaite (2010). The step size shrinks when a
proposal is rejected and expands when a proposal is accepted.
Gradient ascent is criticized for sometimes being relatively slow when close to the maximum, and its asymptotic rate of convergence is inferior to other methods. However, compared to other popular optimization algorithms such as Newton-Raphson, an advantage of the gradient ascent is that it works in infinite dimensions, requiring only sufficient computer memory. Although Newton-Raphson converges in fewer iterations, calculating the inverse of the negative Hessian matrix of second-derivatives is more computationally expensive and subject to singularities. Therefore, gradient ascent takes longer to converge, but is more generalizable.
When Method="BFGS"
, the BFGS algorithm is used, which was
proposed by Broyden (1970), Fletcher (1970), Goldfarb (1970), and
Shanno (1970), independently. BFGS may be the most efficient and
popular quasi-Newton optimization algorithm. As a quasi-Newton
algorithm, the Hessian matrix is approximated using rank-one updates
specified by (approximate) gradient evaluations. Since BFGS is very
popular, there are many variations of it. This is a version by Nash
that has been adapted from the Rvmmin package, and is used in the
optim
function of base R. The approximate Hessian is not
guaranteed to converge to the Hessian. When BFGS is used, the
approximate Hessian is not used to calculate the final covariance
matrix.
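A hedged usage sketch, reusing Model, MyData, and Initial.Values from the example at the end of this help entry:

#Sketch: BFGS optimization of the log-posterior.
Fit <- LaplaceApproximation(Model, Initial.Values, Data=MyData,
     Iterations=100, Method="BFGS", CPUs=1)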
When Method="BHHH"
, the algorithm of Berndt et al. (1974) is
used, which is commonly pronounced B-triple H. The BHHH algorithm is a
quasi-Newton method that includes a step-size parameter, partial
derivatives, and an approximation of a covariance matrix that is
calculated as the inverse of the sum of the outer product of the
gradient (OPG), calculated from each record. The OPG method becomes
more costly as the number of records increases. Since partial
derivatives must be calculated per record of data, the list of data
has special requirements with this method, and must include design
matrix X
, and dependent variable y
or Y
. Records
must be row-wise. An advantage of BHHH over NR (see below) is that
the covariance matrix is necessarily positive definite, and guaranteed
to provide an increase in LP each iteration (given a small enough
step-size), even in convex areas. The covariance matrix is better
approximated with larger data sample sizes, and when closer to the
maximum of LP. Disadvantages of BHHH include that it can give small
increases in LP, especially when far from the maximum or when LP is
highly non-quadratic.
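For instance, because the MyData list in the example at the end of this help entry already contains the design matrix X and dependent variable y with records row-wise, BHHH could be requested as in this sketch (an illustration, not a recommendation):

#Sketch: BHHH requires X and y in the data list, records row-wise.
Fit <- LaplaceApproximation(Model, Initial.Values, Data=MyData,
     Iterations=100, Method="BHHH", CPUs=1)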
When Method="CG"
, a nonlinear conjugate gradient algorithm is
used. CG uses partial derivatives, but does not use the Hessian matrix
or any approximation of it. CG usually requires more iterations to
reach convergence than other algorithms that use the Hessian or an
approximation. However, since the Hessian becomes computationally
expensive as the dimension of the model grows, CG is applicable to
large dimensional models when CovEst="Hessian"
is avoided.
CG was originally developed by Hestenes and Stiefel (1952), though
this version is adapted from the Rcgminu
function in package
Rcgmin.
When Method="DFP"
, the Davidon-Fletcher-Powell algorithm is
used. DFP was the first popular, multidimensional, quasi-Newton
optimization algorithm. The DFP update of an approximate Hessian
matrix maintains symmetry and positive-definiteness. The approximate
Hessian is not guaranteed to converge to the Hessian. When DFP is
used, the approximate Hessian is not used to calculate the final
covariance matrix. Although DFP is very effective, it was superseded
by the BFGS algorithm.
When Method="HAR"
, a hit-and-run algorithm with a multivariate
proposal and adaptive length is used. The length parameter is adapted
each iteration with the univariate version of the Robbins-Monro
stochastic approximation in Garthwaite (2010). The length shrinks when
a proposal is rejected and expands when a proposal is accepted. This
is the same algorithm as the HARM or Hit-And-Run Metropolis MCMC
algorithm with adaptive length, except that a Metropolis step is not
used.
When Method="HJ"
, the Hooke-Jeeves (1961) algorithm is used.
This was adapted from the HJK
algorithm in package dfoptim.
Hooke-Jeeves is a derivative-free, direct search method. Each iteration
involves two steps: an exploratory move and a pattern move. The
exploratory move explores local behavior, and the pattern move takes
advantage of pattern direction. It is sometimes described as a
hill-climbing algorithm. If the solution improves, it accepts the
move, and otherwise rejects it. Step size decreases with each
iteration. The decreasing step size can trap it in local maxima, where
it gets stuck and converges erroneously. Users are encouraged to
attempt again after what seems to be convergence, starting from the
latest point. Although getting stuck at local maxima can be
problematic, the Hooke-Jeeves algorithm is also attractive because it
is simple, fast, does not depend on derivatives, and is otherwise
relatively robust.
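A sketch of the suggested restart, assuming the Mode column of Summary1 (column 1) is used as the latest point:

#Sketch: restart Hooke-Jeeves from the latest point after apparent
#convergence.
Fit1 <- LaplaceApproximation(Model, Initial.Values, Data=MyData,
     Iterations=100, Method="HJ")
Fit2 <- LaplaceApproximation(Model, Fit1$Summary1[,1], Data=MyData,
     Iterations=100, Method="HJ")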
When Method="LBFGS"
, the limited-memory BFGS
(Broyden-Fletcher-Goldfarb-Shanno) algorithm is called in
optim
, once per iteration.
When Method="LM"
, the Levenberg-Marquardt algorithm (Levenberg,
1944; Marquardt, 1963) is used. Also known as the Levenberg-Marquardt
Algorithm (LMA) or the Damped Least-Squares (DLS) method, LM is a
trust region (not to be confused with TR below) quasi-Newton
optimization algorithm that minimizes nonlinear least squares, and has
been adapted here to maximize LP. LM uses partial
derivatives and approximates the Hessian with outer-products. It is
suitable for nonlinear optimization up to a few hundred parameters,
but loses its efficiency in larger problems due to matrix inversion.
LM is considered between the Gauss-Newton algorithm and gradient
descent. When far from the solution, LM moves slowly like gradient
descent, but is guaranteed to converge. When LM is close to the
solution, LM becomes a damped Gauss-Newton method. This was adapted
from the lsqnonlin
algorithm in package pracma.
When Method="NM"
, the Nelder-Mead (1965) algorithm is
used. This was adapted from the nelder_mead
function in package
pracma. Nelder-Mead is a derivative-free, direct search method that is
known to become inefficient in large-dimensional problems. As the
dimension increases, the search direction becomes increasingly
orthogonal to the steepest ascent (usually descent)
direction. However, in smaller dimensions, it is a popular algorithm.
At each iteration, three steps are taken to improve a simplex:
reflection, extension, and contraction.
When Method="NR"
, the Newton-Raphson optimization algorithm,
also known as Newton's Method, is used. Newton-Raphson uses
derivatives and a Hessian matrix. The algorithm is included for its
historical significance, but is known to be problematic when starting
values are far from the targets, and calculating and inverting the
Hessian matrix can be computationally expensive. As programmed here,
when the Hessian is problematic, it tries to use only the derivatives,
and when that fails, a jitter is applied. Newton-Raphson should not
be the first choice of the user, and BFGS should always be preferred.
When Method="PSO"
, the Standard Particle Swarm Optimization
2007 algorithm is used. A swarm of particles is moved according
to velocity, neighborhood, and the best previous solution. The
neighborhood for each particle is a set of informing particles. PSO
is derivative-free. PSO has been adapted from the psoptim
function in package pso.
When Method="Rprop"
, the approximate gradient is taken for each
parameter in each iteration, and its sign is compared to the
approximate gradient in the previous iteration. A weight element in a
weight vector is associated with each approximate gradient. A weight
element is multiplied by 1.2 when the sign does not change, or by 0.5
if the sign changes. The weight vector is the step size, and is
constrained to the interval [0.001, 50], and initial weights are
0.0125. This is the resilient backpropagation algorithm, which is
often denoted as the “Rprop-” algorithm of Riedmiller (1994).
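A minimal sketch of the weight update just described (illustrative only; update.rprop is not a package function):

#Sketch of the Rprop- weight update: per-parameter weights grow by 1.2
#when the gradient sign is unchanged, shrink by 0.5 when it changes,
#and are constrained to [0.001, 50].
update.rprop <- function(weight, grad, grad.old) {
     weight <- ifelse(sign(grad) == sign(grad.old), weight*1.2, weight*0.5)
     pmin(pmax(weight, 0.001), 50)
     }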
When Method="SGD"
, a stochastic gradient descent algorithm is
used that is designed only for big data, and gained popularity after
successful use in the Netflix competition. This algorithm has special
requirements for the Model
specification function and the
Data
list. See the “LaplacesDemon Tutorial” vignette for more
information.
When Method="SOMA"
, a population of ten particles or
individuals moves in the direction of the best particle, the leader.
The leader does not move in each iteration, and a line-search is used
for each non-leader, up to three times the difference in parameter
values between each non-leader and leader. This algorithm is
derivative-free and often considered in the family of evolution
algorithms. Numerous model evaluations are performed per non-leader
per iteration. This algorithm was adapted from package soma.
When Method="SPG"
, a Spectral Projected Gradient algorithm
is used. SPG is a non-monotone algorithm that is suitable for
high-dimensional models. The approximate gradient is used, but the
Hessian matrix is not. When used with large models,
CovEst="Hessian"
should be avoided. SPG has been adapted from
the spg
function in package BB.
When Method="SR1"
, the Symmetric Rank-One (SR1) algorithm is
used. SR1 is a quasi-Newton algorithm, and the Hessian matrix is
approximated, often without being positive-definite. At the posterior
modes, the true Hessian is usually positive-definite, but this is
often not the case during optimization when the parameters have not
yet reached the posterior modes. Other restrictions, including
constraints, often result in the true Hessian being indefinite at the
solution. For these reasons, SR1 often outperforms BFGS. The
approximate Hessian is not guaranteed to converge to the Hessian. When
SR1 is used, the approximate Hessian is not used to calculate the
final covariance matrix.
When Method="TR"
, the Trust Region algorithm of Nocedal and
Wright (1999) is used. The TR algorithm attempts to reach its
objective in the fewest iterations, and is therefore very
efficient, as well as safe. The efficiency of TR is attractive when
model evaluations are expensive. The Hessian is approximated each
iteration, making TR best suited to models with small to medium
dimensions, say up to a few hundred parameters. TR has been adapted
from the trust
function in package trust.
LaplaceApproximation
returns an object of class laplace
that is a list with the following components:
Call |
This is the matched call of |
Converged |
This is a logical indicator of whether or not
|
Covar |
This covariance matrix is estimated according to the
|
Deviance |
This is a vector of the iterative history of the
deviance in the |
History |
This is a matrix of the iterative history of the
parameters in the |
Initial.Values |
This is the vector of initial values that was
originally given to |
LML |
This is an approximation of the logarithm of the marginal
likelihood of the data (see the |
LP.Final |
This reports the final scalar value for the logarithm of the unnormalized joint posterior density. |
LP.Initial |
This reports the initial scalar value for the logarithm of the unnormalized joint posterior density. |
Minutes |
This is the number of minutes that
|
Monitor |
When |
Posterior |
When |
Step.Size.Final |
This is the final, scalar |
Step.Size.Initial |
This is the initial, scalar |
Summary1 |
This is a summary matrix that summarizes the point-estimated posterior modes. Uncertainty around the posterior modes is estimated from the covariance matrix. Rows are parameters. The following columns are included: Mode, SD (Standard Deviation), LB (Lower Bound), and UB (Upper Bound). The bounds constitute a 95% probability interval. |
Summary2 |
This is a summary matrix that summarizes the
posterior samples drawn with sampling importance resampling
( |
Tolerance.Final |
This is the last |
Tolerance.Stop |
This is the |
Statisticat, LLC [email protected]
Azevedo-Filho, A. and Shachter, R. (1994). "Laplace's Method Approximations for Probabilistic Inference in Belief Networks with Continuous Variables". In "Uncertainty in Artificial Intelligence", Mantaras, R. and Poole, D., Morgan Kauffman, San Francisco, CA, p. 28–36.
Bernardo, J.M. and Smith, A.F.M. (2000). "Bayesian Theory". John Wiley \& Sons: West Sussex, England.
Berndt, E., Hall, B., Hall, R., and Hausman, J. (1974), "Estimation and Inference in Nonlinear Structural Models". Annals of Economic and Social Measurement, 3, p. 653–665.
Broyden, C.G. (1970). "The Convergence of a Class of Double Rank Minimization Algorithms: 2. The New Algorithm". Journal of the Institute of Mathematics and its Applications, 6, p.76–90.
Fletcher, R. (1970). "A New Approach to Variable Metric Algorithms". Computer Journal, 13(3), p. 317–322.
Garthwaite, P., Fan, Y., and Sisson, S. (2010). "Adaptive Optimal Scaling of Metropolis-Hastings Algorithms Using the Robbins-Monro Process."
Goldfarb, D. (1970). "A Family of Variable Metric Methods Derived by Variational Means". Mathematics of Computation, 24(109), p. 23–26.
Hestenes, M.R. and Stiefel, E. (1952). "Methods of Conjugate Gradients for Solving Linear Systems". Journal of Research of the National Bureau of Standards, 49(6), p. 409–436.
Hooke, R. and Jeeves, T.A. (1961). "'Direct Search' Solution of Numerical and Statistical Problems". Journal of the Association for Computing Machinery, 8(2), p. 212–229.
Kass, R.E. and Raftery, A.E. (1995). "Bayes Factors". Journal of the American Statistical Association, 90(430), p. 773–795.
Laplace, P. (1774). "Memoire sur la Probabilite des Causes par les Evenements." l'Academie Royale des Sciences, 6, 621–656. English translation by S.M. Stigler in 1986 as "Memoir on the Probability of the Causes of Events" in Statistical Science, 1(3), 359–378.
Laplace, P. (1814). "Essai Philosophique sur les Probabilites." English translation in Truscott, F.W. and Emory, F.L. (2007) from (1902) as "A Philosophical Essay on Probabilities". ISBN 1602063281, translated from the French 6th ed. (1840).
Levenberg, K. (1944). "A Method for the Solution of Certain Non-Linear Problems in Least Squares". Quarterly of Applied Mathematics, 2, p. 164–168.
Lewis, S.M. and Raftery, A.E. (1997). "Estimating Bayes Factors via Posterior Simulation with the Laplace-Metropolis Estimator". Journal of the American Statistical Association, 92, p. 648–655.
Marquardt, D. (1963). "An Algorithm for Least-Squares Estimation of Nonlinear Parameters". SIAM Journal on Applied Mathematics, 11(2), p. 431–441.
Nelder, J.A. and Mead, R. (1965). "A Simplex Method for Function Minimization". The Computer Journal, 7(4), p. 308–313.
Nocedal, J. and Wright, S.J. (1999). "Numerical Optimization". Springer-Verlag.
Riedmiller, M. (1994). "Advanced Supervised Learning in Multi-Layer Perceptrons - From Backpropagation to Adaptive Learning Algorithms". Computer Standards and Interfaces, 16, p. 265–278.
Shanno, D.F. (1970). "Conditioning of quasi-Newton Methods for Function Minimization". Mathematics of Computation, 24(111), p. 647–650.
Tierney, L. and Kadane, J.B. (1986). "Accurate Approximations for Posterior Moments and Marginal Densities". Journal of the American Statistical Association, 81(393), p. 82–86.
Tierney, L., Kass. R., and Kadane, J.B. (1989). "Fully Exponential Laplace Approximations to Expectations and Variances of Nonpositive Functions". Journal of the American Statistical Association, 84(407), p. 710–716.
Zelinka, I. (2004). "SOMA - Self Organizing Migrating Algorithm". In: Onwubolu G.C. and Babu, B.V., editors. "New Optimization Techniques in Engineering". Springer: Berlin, Germany.
BayesFactor
,
BayesianBootstrap
,
IterativeQuadrature
,
LaplacesDemon
,
GIV
,
LML
,
optim
,
PMC
,
SIR
, and
VariationalBayes
.
# The accompanying Examples vignette is a compendium of examples.

#################### Load the LaplacesDemon Library #####################
library(LaplacesDemon)

############################## Demon Data ###############################
data(demonsnacks)
y <- log(demonsnacks$Calories)
X <- cbind(1, as.matrix(log(demonsnacks[,10]+1)))
J <- ncol(X)
for (j in 2:J) X[,j] <- CenterScale(X[,j])

######################### Data List Preparation #########################
mon.names <- "mu[1]"
parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0))
pos.beta <- grep("beta", parm.names)
pos.sigma <- grep("sigma", parm.names)
PGF <- function(Data) {
     beta <- rnorm(Data$J)
     sigma <- runif(1)
     return(c(beta, sigma))
     }
MyData <- list(J=J, PGF=PGF, X=X, mon.names=mon.names,
     parm.names=parm.names, pos.beta=pos.beta, pos.sigma=pos.sigma, y=y)

########################## Model Specification ##########################
Model <- function(parm, Data)
     {
     ### Parameters
     beta <- parm[Data$pos.beta]
     sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf)
     parm[Data$pos.sigma] <- sigma
     ### Log-Prior
     beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE))
     sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE)
     ### Log-Likelihood
     mu <- tcrossprod(Data$X, t(beta))
     LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE))
     ### Log-Posterior
     LP <- LL + beta.prior + sigma.prior
     Modelout <- list(LP=LP, Dev=-2*LL, Monitor=mu[1],
          yhat=rnorm(length(mu), mu, sigma), parm=parm)
     return(Modelout)
     }

############################ Initial Values #############################
#Initial.Values <- GIV(Model, MyData, PGF=TRUE)
Initial.Values <- rep(0,J+1)

Fit <- LaplaceApproximation(Model, Initial.Values, Data=MyData,
     Iterations=100, Method="NM", CPUs=1)
Fit
print(Fit)
#PosteriorChecks(Fit)
#caterpillar.plot(Fit, Parms="beta")
#plot(Fit, MyData, PDF=FALSE)
#Pred <- predict(Fit, Model, MyData, CPUs=1)
#summary(Pred, Discrep="Chi-Square")
#plot(Pred, Style="Covariates", Data=MyData)
#plot(Pred, Style="Density", Rows=1:9)
#plot(Pred, Style="Fitted")
#plot(Pred, Style="Jarque-Bera")
#plot(Pred, Style="Predictive Quantiles")
#plot(Pred, Style="Residual Density")
#plot(Pred, Style="Residuals")
#Levene.Test(Pred)
#Importance(Fit, Model, MyData, Discrep="Chi-Square")

#Fit$Covar is scaled (2.38^2/d) and submitted to LaplacesDemon as Covar.
#Fit$Summary[,1] is submitted to LaplacesDemon as Initial.Values.

#End
The LaplacesDemon
function is the main function of Laplace's
Demon. Given data, a model specification, and initial values,
LaplacesDemon
maximizes the logarithm of the unnormalized joint
posterior density with MCMC and provides samples of the marginal
posterior distributions, deviance, and other monitored variables.
The LaplacesDemon.hpc
function extends LaplacesDemon
to
parallel chains for multicore or cluster high performance computing.
LaplacesDemon(Model, Data, Initial.Values, Covar=NULL, Iterations=10000,
     Status=100, Thinning=10, Algorithm="MWG", Specs=list(B=NULL),
     Debug=list(DB.chol=FALSE, DB.eigen=FALSE, DB.MCSE=FALSE,
     DB.Model=TRUE), LogFile="", ...)

LaplacesDemon.hpc(Model, Data, Initial.Values, Covar=NULL,
     Iterations=10000, Status=100, Thinning=10, Algorithm="MWG",
     Specs=list(B=NULL), Debug=list(DB.chol=FALSE, DB.eigen=FALSE,
     DB.MCSE=FALSE, DB.Model=TRUE), LogFile="", Chains=2, CPUs=2,
     Type="PSOCK", Packages=NULL, Dyn.libs=NULL)
Model |
This required argument receives the model from a
user-defined function that must be named Model. The user-defined
function is where the model is specified. |
Data |
This required argument accepts a list of data. The list of
data must contain |
Initial.Values |
For |
Covar |
This argument defaults to |
Iterations |
This required argument accepts integers larger than
10, and determines the number of iterations that Laplace's Demon
will update the parameters while searching for target
distributions. The required amount of computer memory will increase
with |
Status |
This argument accepts an integer between 1 and the
number of iterations, and indicates how often, in iterations, the
user would like the status printed to the screen or log
file. Usually, the following is reported: the number of iterations,
the proposal type (for example, multivariate or componentwise, or
mixture, or subset), and LP. For example, if a model is updated for
1,000 iterations and |
Thinning |
This argument accepts integers between 1 and the
number of iterations, and indicates that every nth iteration will be
retained, while the other iterations are discarded. If
|
Algorithm |
This argument accepts the abbreviated name of the MCMC algorithm, which must appear in quotes. A list of MCMC algorithms appears below in the Details section, and the abbreviated name is in parentheses. |
Specs |
This argument defaults to |
Debug |
This argument accepts a list of logical scalars that
control whether or not errors or warnings are reported due to a
|
LogFile |
This argument is used to specify a log file name in
quotes in the working directory as a destination, rather than the
console, for the output messages of |
Chains |
This argument is required only for
|
CPUs |
This argument is required for parallel independent or
interactive chains in |
Type |
This argument defaults to |
Packages |
This optional argument is for use with parallel
independent or interacting chains, and defaults to |
Dyn.libs |
This optional argument is for use with parallel
independent or interacting chains, and defaults to |
... |
Additional arguments are unused. |
LaplacesDemon
offers numerous MCMC algorithms for numerical
approximation in Bayesian inference. The algorithms are
Adaptive Directional Metropolis-within-Gibbs (ADMG)
Adaptive Griddy-Gibbs (AGG)
Adaptive Hamiltonian Monte Carlo (AHMC)
Adaptive Metropolis (AM)
Adaptive Metropolis-within-Gibbs (AMWG)
Adaptive-Mixture Metropolis (AMM)
Affine-Invariant Ensemble Sampler (AIES)
Componentwise Hit-And-Run Metropolis (CHARM)
Delayed Rejection Adaptive Metropolis (DRAM)
Delayed Rejection Metropolis (DRM)
Differential Evolution Markov Chain (DEMC)
Elliptical Slice Sampler (ESS)
Gibbs Sampler (Gibbs)
Griddy-Gibbs (GG)
Hamiltonian Monte Carlo (HMC)
Hamiltonian Monte Carlo with Dual-Averaging (HMCDA)
Hit-And-Run Metropolis (HARM)
Independence Metropolis (IM)
Interchain Adaptation (INCA)
Metropolis-Adjusted Langevin Algorithm (MALA)
Metropolis-Coupled Markov Chain Monte Carlo (MCMCMC)
Metropolis-within-Gibbs (MWG)
Multiple-Try Metropolis (MTM)
No-U-Turn Sampler (NUTS)
Oblique Hyperrectangle Slice Sampler (OHSS)
Preconditioned Crank-Nicolson (pCN)
Random Dive Metropolis-Hastings (RDMH)
Random-Walk Metropolis (RWM)
Reflective Slice Sampler (RSS)
Refractive Sampler (Refractive)
Reversible-Jump (RJ)
Robust Adaptive Metropolis (RAM)
Sequential Adaptive Metropolis-within-Gibbs (SAMWG)
Sequential Metropolis-within-Gibbs (SMWG)
Slice Sampler (Slice)
Stochastic Gradient Langevin Dynamics (SGLD)
Tempered Hamiltonian Monte Carlo (THMC)
t-walk (twalk)
Univariate Eigenvector Slice Sampler (UESS)
Updating Sequential Adaptive Metropolis-within-Gibbs (USAMWG)
Updating Sequential Metropolis-within-Gibbs (USMWG)
It is a goal for the documentation in the LaplacesDemon package to be
extensive. However, details of MCMC algorithms are best explored
online at https://web.archive.org/web/20150206014000/http://www.bayesian-inference.com/mcmc, as well
as in the "LaplacesDemon Tutorial" vignette, and the "Bayesian
Inference" vignette. Algorithm specifications (Specs
) are
listed below:
A
is used in AFSS, HMCDA, MALA, NUTS, OHSS, and UESS. In
MALA, it is the maximum acceptable value of the Euclidean norm of
the adaptive parameters mu and sigma, and the Frobenius norm of the
covariance matrix. In AFSS, HMCDA, NUTS, OHSS, and UESS, it is the
number of initial, adaptive iterations to be discarded as burn-in.
Adaptive
is the iteration in which adaptation begins,
and is used in AM, AMM, DRAM, INCA, and Refractive. Most of these
algorithms adapt according to an observed covariance matrix, and
should sample before beginning to adapt.
alpha.star
is the target acceptance rate in MALA and
RAM, and is optional in CHARM and HARM. The recommended value for
multivariate proposals is alpha.star=0.234
, for componentwise
proposals is alpha.star=0.44
, and for MALA is
alpha.star=0.574
.
at
affects the traverse move in twalk. at=6
is
recommended. It helps when some parameters are highly correlated,
and the correlation structure may change through the
state-space. The traverse move is associated with an acceptance rate
that decreases as the number of parameters increases, and is the
reason that n1
is used to select a subset of parameters each
iteration. If adjusted, it is recommended to stay in the interval
[2,10].
aw
affects the walk move in twalk, and aw=1.5
is
recommended. If adjusted, it is recommended to stay in the
interval [0.3,2].
beta
is a scale parameter for AIES, and defaults to 2,
or an autoregressive parameter for pCN.
bin.n
is the scalar size parameter for a binomial prior
distribution of model size for the RJ algorithm.
bin.p
is the scalar probability parameter for a
binomial prior distribution of model size for the RJ algorithm.
B
is a list of blocked parameters. Each component of
the list represents a block of parameters, and contains a vector in
which each element is the position of the associated parameter in
parm.names. This function is optional in the AFSS, AMM, AMWG, ESS,
HARM, MWG, RAM, RWM, Slice, and UESS algorithms. For more
information on blockwise sampling, see the Blocks
function.
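For example, a sketch of blockwise sampling with two blocks (regression coefficients and sigma), reusing objects from the example at the end of this help entry and the AMWG specification shown there:

#Sketch: regression coefficients in one block, sigma in another.
MyBlocks <- list(pos.beta, pos.sigma)
Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values,
     Covar=NULL, Iterations=1000, Status=100, Thinning=1,
     Algorithm="AMWG", Specs=list(B=MyBlocks, n=0, Periodicity=50))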
Begin
indicates the time-period in which to begin
updating (filtering or predicting) in the USAMWG and USMWG
algorithms.
Bounds
is used in the Slice algorithm. It is a vector
of length two with the lower and upper boundary of the slice. For
continuous parameters, it is often set to negative and positive
infinity, while for discrete parameters it is set to the minimum
and maximum discrete values to be sampled. When blocks are used,
this must be supplied as a list with the same number of list
components as the number of blocks.
delta
is used in HMCDA, MALA, and NUTS. In HMCDA and
NUTS, it is the target acceptance rate, and the recommended value is
0.65 in HMCDA and 0.6 in NUTS. In MALA, it is a constant in the
bounded drift function, may be in the interval [1e-10,1000], and 1
is the default.
Dist
is the proposal distribution in RAM, and may
either be Dist="t"
for t-distributed or Dist="N"
for
normally-distributed.
dparm
accepts a vector of integers that indicate
discrete parameters. This argument is for use with the AGG or GG
algorithm.
Dyn is a T x K matrix of dynamic parameters, where T is the number of
time-periods and K is the number of dynamic parameters. Dyn is used by
SAMWG, SMWG, USAMWG, and USMWG. Non-dynamic parameters are updated first
in each sampler iteration, then dynamic parameters are updated in a
random order within each time-period, and sequentially by time-period.
epsilon
is used in AHMC, HMC, HMCDA, MALA, NUTS, SGLD,
and THMC. It is the step-size in all algorithms except MALA. It is
a vector equal in length to the number of parameters in AHMC, HMC,
and THMC. It is a scalar in HMCDA and NUTS. It is either a scalar
or a vector equal in length to the number of iterations in SGLD.
When epsilon=NULL
in HMCDA or NUTS (only), a reasonable
initial value is found. In MALA, it is a vector of length two. The
first element is the acceptable minimum of adaptive scale sigma, and
the second element is added to the diagonal of the covariance matrix
for regularization.
FC
is used in Gibbs and accepts a function that
receives two arguments: the vector of all parameters and the list of
data (similar to the Model specification function). FC must return
the updated vector of all parameters. The user specifies FC to
calculate the full conditional distribution of one or more
parameters.
file
is the quoted name of a numeric matrix of data,
without headers, for SGLD. The big data set must be a .csv
file. This matrix has Nr
rows and Nc
columns. Each
iteration, SGLD will randomly select a block of rows, where the
number of rows is specified by the size
argument.
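A sketch of preparing such a file, where BigX is an illustrative numeric matrix (not an object defined elsewhere in this documentation):

#Sketch: write a headerless .csv with Nr rows and Nc columns for SGLD.
write.table(BigX, file="X.csv", sep=",", row.names=FALSE, col.names=FALSE)
#Then, e.g., Specs=list(epsilon=1e-4, file="X.csv", Nr=nrow(BigX),
#     Nc=ncol(BigX), size=10)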
Fit
is an object of class demonoid
in the USAMWG
and USMWG algorithms. Posterior samples before the time-period
specified in the Begin
argument are not updated, and are used
instead from Fit
.
gamma controls the step size in DEMC or the decay of adaptation in MALA
and RAM. In DEMC, it is positive and defaults to 2.38/sqrt(2*J) when
NULL, where J is the length of initial values. For RAM, it is in the
interval (0.5,1], and 0.66 is recommended. For MALA, it is in the
interval (1, Iterations), and defaults to 1.
Grid
accepts either a vector or a list of vectors of
evenly-spaced points on a grid for the AGG or GG algorithm. When the
argument is a vector, the same grid is applied to all
parameters. When the argument is a list, each component in the list
has a grid that is applied to the corresponding parameter. The
algorithm will evaluate each continuous parameter at the latest
value plus each point in the grid, or each discrete parameter (see
dparm
) at each grid point (which should be each discrete
value).
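For instance, a per-parameter grid supplied as a list (one component per parameter; J+1 parameters in the example at the end of this help entry) might look like this sketch:

#Sketch: one grid component per parameter.
MyGrid <- rep(list(seq(from=-0.1, to=0.1, len=5)), J+1)
#Then, e.g., Specs=list(Grid=MyGrid, dparm=NULL, CPUs=1,
#     Packages=NULL, Dyn.libs=NULL)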
K
is a scalar number of proposals in MTM.
L
is a scalar number of leapfrog steps in AHMC, HMC, and
THMC. When L=1
, the algorithm reduces to Langevin Monte Carlo
(LMC).
lambda
is used in HMCDA and MCMCMC. In HMCDA, it is a
scalar trajectory length. In MCMCMC, it is either a scalar that
controls temperature spacing, or a vector of temperature spacings.
Lmax
is a scalar maximum for L
(see above) in
HMCDA and NUTS.
m
is used in the AFSS, AHMC, HMC, Refractive, RSS, Slice,
THMC, and UESS algorithms. In AHMC, HMC, and THMC, it is a J x J mass
matrix for J initial values. In AFSS and UESS, it is a scalar, and is
the maximum number of steps
for creating the slice interval. In Refractive and RSS, it is a
scalar, and is the number of steps to take per iteration. In Slice,
it is either a scalar or a list with as many list components as
blocks. It must be an integer in [1,Inf], and indicates the maximum
number of steps for creating the slice interval.
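As an illustration of the HMC specification (assumptions: an identity mass matrix and a common step size for all J+1 parameters of the example at the end of this help entry):

#Sketch: HMC with an identity mass matrix and per-parameter step sizes.
Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values,
     Covar=NULL, Iterations=1000, Status=100, Thinning=1,
     Algorithm="HMC", Specs=list(epsilon=rep(0.001, J+1), L=2,
     m=diag(J+1)))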
mu
is a vector that is equal in length to the initial
values. This vector will be used as the mean of the proposal
distribution, and is usually the posterior mode of a
previously-updated LaplaceApproximation
.
MWG
is used in Gibbs to specify a vector of parameters
that are to receive Metropolis-within-Gibbs updates. Each element is
an integer that indicates the parameter.
Nc
is either the number of (un-parallelized) parallel
chains in DEMC (and must be at least 3) or the number of columns of
big data in SGLD.
Nr
is the number of rows of big data in SGLD.
n
is the number of previous iterations in ADMG, AFSS,
AMM, AMWG, OHSS, RAM, and UESS.
n1
affects the size of the subset of each set of points
to adjust, and is used in twalk. It relates to the number of
parameters, and n1=4
is recommended. If adjusted, it is
recommended to stay in the interval [2,20].
parm.p
is a vector of probabilities for parameter
selection in the RJ algorithm, and must be equal in length to
the number of initial values.
r
is a scalar used in the Refractive algorithm to
indicate the ratio between r1 and r2.
Periodicity
specifies how often in iterations the
adaptive algorithm should adapt, and is used by AHMC, AM, AMM, AMWG,
DRAM, INCA, SAMWG, and USAMWG. If Periodicity=10
, then the
algorithm adapts every 10th iteration. A higher Periodicity
is associated with an algorithm that runs faster, because it does
not have to calculate adaptation as often, though the algorithm
adapts less often to the target distributions, so it is a
trade-off. It is recommended to use the lowest value that runs
fast enough to suit the user, or provide sufficient adaptation.
selectable
is a vector of indicators of whether or not
a parameter is selectable for variable selection in the RJ
algorithm. Non-selectable parameters are assigned a zero, and are
always in the model. Selectable parameters are assigned a one. This
vector must be equal in length to the number of initial values.
selected
is a vector of indicators of whether or not
each parameter is selected when the RJ algorithm begins, and
must be equal in length to the number of initial values.
SIV
stands for secondary initial values and is used by
twalk. SIV
must be the same length as Initial.Values
,
and each element of these two vectors must be unique from each
other, both before and after being passed to the Model
function. SIV
defaults to NULL
, in which case values
are generated with GIV
.
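A sketch of supplying SIV explicitly (here generated with GIV, which is also the default behavior when SIV=NULL):

#Sketch: explicit secondary initial values for twalk.
SIV <- GIV(Model, MyData, PGF=TRUE)
Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values,
     Covar=NULL, Iterations=1000, Status=100, Thinning=1,
     Algorithm="twalk", Specs=list(SIV=SIV, n1=4, at=6, aw=1.5))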
size
is the number of rows of big data to be read into
SGLD each iteration.
smax
is the maximum allowable tuning parameter sigma,
the standard deviation of the conditional distribution, in the AGG
algorithm.
Temperature
is used in the THMC algorithm to heat up
the momentum in the first half of the leapfrog steps, and then cool
down the momentum in the last half. Temperature
must be
positive. When greater than 1, THMC should explore more diffuse
distributions, and may be helpful with multimodal distributions.
Type
is used in the Slice algorithm. It is either a
scalar or a list with the same number of list components as blocks.
This accepts "Continuous"
for continuous parameters,
"Nominal"
for discrete parameters that are unordered, and
"Ordinal"
for discrete parameters that are ordered.
w
is used in AFSS, AMM, DEMC, Refractive, RSS, and
Slice. It is a mixture weight for both the AMM and DEMC algorithms,
and in these algorithms it is in the interval (0,1]. For AMM, it is
recommended to use w=0.05
, as per Roberts and Rosenthal
(2009). The two mixture components in AMM are adaptive multivariate
and static/symmetric univariate proposals. The mixture is determined
at each iteration with mixture weight w
. In the AMM
algorithm, a higher value of w
is associated with more
static/symmetric univariate proposals, and a lower w
is
associated with more adaptive multivariate proposals. AMM will be
unable to include the multivariate mixture component until it has
accumulated some history, and models with more parameters will take
longer to be able to use adaptive multivariate proposals. In DEMC,
it indicates the probability that each iteration uses a snooker
update, rather than a projection update, and the recommended default
is w=0.1
. In the Refractive algorithm, w
is a scalar
step size parameter. In AFSS, RSS, and the Slice algorithms, this is
a step size interval for creating the slice interval. In AFSS and
RSS, a scalar or vector equal in length to the number of initial values
is accepted. In Slice, a scalar or a list with a number of list
components equal to the number of blocks is accepted.
Z accepts a T x J matrix or T x J x Nc array of thinned samples for T
thinned iterations, J parameters, and Nc chains in DEMC.
Z
defaults to NULL
. The matrix of thinned
posterior samples from a previous run may be used, in which case the
samples are copied across the chains.
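A sketch of re-using thinned samples from a previous DEMC run (the use of Posterior1 here is illustrative):

#Sketch: pass thinned samples from a previous run as Z.
Fit0 <- LaplacesDemon(Model, Data=MyData, Initial.Values,
     Covar=NULL, Iterations=1000, Status=100, Thinning=1,
     Algorithm="DEMC", Specs=list(Nc=3, Z=NULL, gamma=NULL, w=0.1))
Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values,
     Covar=NULL, Iterations=1000, Status=100, Thinning=1,
     Algorithm="DEMC", Specs=list(Nc=3, Z=Fit0$Posterior1, gamma=NULL,
     w=0.1))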
LaplacesDemon
returns an object of class demonoid
, and
LaplacesDemon.hpc
returns an object of class
demonoid.hpc
that is a list of objects of class
demonoid
, where the number of components in the list
is the number of parallel chains. Each object of class demonoid
is a list with the following components:
Acceptance.Rate |
This is the acceptance rate of the MCMC
algorithm, indicating the percentage of iterations in which the
proposals were accepted. For more information on acceptance rates,
see the |
Algorithm |
This reports the specific algorithm used. |
Call |
This is the matched call of |
Covar |
This stores the |
CovarDHis |
This |
Deviance |
This is a vector of the deviance of the model, with a length equal to the number of thinned samples that were retained. Deviance is useful for considering model fit, and is equal to the sum of the log-likelihood for all rows in the data set, which is then multiplied by negative two. |
DIC1 |
This is a vector of three values: Dbar, pD, and DIC. Dbar
is the mean deviance, pD is a measure of model complexity indicating
the effective number of parameters, and DIC is the Deviance
Information Criterion, which is a model fit statistic that is the
sum of Dbar and pD. |
DIC2 |
This is identical to |
Initial.Values |
This is the vector of |
Iterations |
This reports the number of |
LML |
This is an approximation of the logarithm of the marginal
likelihood of the data (see the |
Minutes |
This indicates the number of minutes that
|
Model |
This contains the model specification |
Monitor |
This is a vector or matrix of one or more monitored
variables, which are variables that were specified in the
|
Parameters |
This reports the number of parameters. |
Posterior1 |
This is a matrix of marginal posterior distributions composed of thinned samples, with a number of rows equal to the number of thinned samples and a number of columns equal to the number of parameters. This matrix includes all thinned samples. |
Posterior2 |
This is a matrix equal to |
Rec.BurnIn.Thinned |
This is the recommended burn-in for the
thinned samples, where the value indicates the first row that was
stationary across all parameters, and previous rows are discarded
as burn-in. Samples considered as burn-in are discarded because they
do not represent the target distribution and have not adequately
forgotten the initial value of the chain (or Markov chain, if
|
Rec.BurnIn.UnThinned |
This is the recommended burn-in for all samples, in case thinning will not be necessary. |
Rec.Thinning |
This is the recommended value for the
|
Specs |
This is an optional list of algorithm specifications. |
Status |
This is the value in the |
Summary1 |
This is a matrix that summarizes the marginal
posterior distributions of the parameters, deviance, and monitored
variables over all samples in |
Summary2 |
This matrix is identical to the matrix in
|
Thinned.Samples |
This is the number of thinned samples that were retained. |
Thinning |
This is the value of the |
Statisticat, LLC., Silvere Vialet-Chabrand [email protected]
Atchade, Y.F. (2006). "An Adaptive Version for the Metropolis Adjusted Langevin Algorithm with a Truncated Drift". Methodology and Computing in Applied Probability, 8, p. 235–254.
Bai, Y. (2009). "An Adaptive Directional Metropolis-within-Gibbs Algorithm". Technical Report in Department of Statistics at the University of Toronto.
Beskos, A., Roberts, G.O., Stuart, A.M., and Voss, J. (2008). "MCMC Methods for Diffusion Bridges". Stoch. Dyn., 8, p. 319–350.
Boyles, L.B. and Welling, M. (2012). "Refractive Sampling".
Craiu, R.V., Rosenthal, J., and Yang, C. (2009). "Learn From Thy Neighbor: Parallel-Chain and Regional Adaptive MCMC". Journal of the American Statistical Association, 104(488), p. 1454–1466.
Christen, J.A. and Fox, C. (2010). "A General Purpose Sampling Algorithm for Continuous Distributions (the t-walk)". Bayesian Analysis, 5(2), p. 263–282.
Dutta, S. (2012). "Multiplicative Random Walk Metropolis-Hastings on the Real Line". Sankhya B, 74(2), p. 315–342.
Duane, S., Kennedy, A.D., Pendleton, B.J., and Roweth, D. (1987). "Hybrid Monte Carlo". Physics Letters, B, 195, p. 216–222.
Gelman, A., Carlin, J., Stern, H., and Rubin, D. (2004). "Bayesian Data Analysis, Texts in Statistical Science, 2nd ed.". Chapman and Hall, London.
Geman, S. and Geman, D. (1984). "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images". IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6), p. 721–741.
Geyer, C.J. (1991). "Markov Chain Monte Carlo Maximum Likelihood". In Keramidas, E.M. Computing Science and Statistics: Proceedings of the 23rd Symposium of the Interface. Fairfax Station VA: Interface Foundation. p. 156–163.
Goodman J, and Weare, J. (2010). "Ensemble Samplers with Affine Invariance". Communications in Applied Mathematics and Computational Science, 5(1), p. 65–80.
Green, P.J. (1995). "Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination". Biometrika, 82, p. 711–732.
Haario, H., Laine, M., Mira, A., and Saksman, E. (2006). "DRAM: Efficient Adaptive MCMC". Statistical Computing, 16, p. 339–354.
Haario, H., Saksman, E., and Tamminen, J. (2001). "An Adaptive Metropolis Algorithm". Bernoulli, 7, p. 223–242.
Hoffman, M.D. and Gelman, A. (2012). "The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo". Journal of Machine Learning Research, p. 1–30.
Kass, R.E. and Raftery, A.E. (1995). "Bayes Factors". Journal of the American Statistical Association, 90(430), p. 773–795.
Lewis, S.M. and Raftery, A.E. (1997). "Estimating Bayes Factors via Posterior Simulation with the Laplace-Metropolis Estimator". Journal of the American Statistical Association, 92, p. 648–655.
Liu, J., Liang, F., and Wong, W. (2000). "The Multiple-Try Method and Local Optimization in Metropolis Sampling". Journal of the American Statistical Association, 95, p. 121–134.
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., and Teller, E. (1953). "Equation of State Calculations by Fast Computing Machines". Journal of Chemical Physics, 21, p. 1087–1092.
Mira, A. (2001). "On Metropolis-Hastings Algorithms with Delayed Rejection". Metron, Vol. LIX, n. 3-4, p. 231–241.
Murray, I., Adams, R.P., and MacKay, D.J. (2010). "Elliptical Slice Sampling". Journal of Machine Learning Research, 9, p. 541–548.
Neal, R.M. (2003). "Slice Sampling" (with discussion). Annals of Statistics, 31(3), p. 705–767.
Ritter, C. and Tanner, M. (1992), "Facilitating the Gibbs Sampler: the Gibbs Stopper and the Griddy-Gibbs Sampler", Journal of the American Statistical Association, 87, p. 861–868.
Roberts, G.O. and Rosenthal, J.S. (2009). "Examples of Adaptive MCMC". Computational Statistics and Data Analysis, 18, p. 349–367.
Roberts, G.O. and Tweedie, R.L. (1996). "Exponential Convergence of Langevin Distributions and Their Discrete Approximations". Bernoulli, 2(4), p. 341–363.
Rosenthal, J.S. (2007). "AMCMC: An R interface for adaptive MCMC". Computational Statistics and Data Analysis, 51, p. 5467–5470.
Smith, R.L. (1984). "Efficient Monte Carlo Procedures for Generating Points Uniformly Distributed Over Bounded Region". Operations Research, 32, p. 1296–1308.
Ter Braak, C.J.F. and Vrugt, J.A. (2008). "Differential Evolution Markov Chain with Snooker Updater and Fewer Chains", Statistics and Computing, 18(4), p. 435–446.
Tibbits, M., Groendyke, C., Haran, M., Liechty, J. (2014). "Automated Factor Slice Sampling". Journal of Computational and Graphical Statistics, 23(2), p. 543–563.
Thompson, M.D. (2011). "Slice Sampling with Multivariate Steps". http://hdl.handle.net/1807/31955
Vihola, M. (2011). "Robust Adaptive Metropolis Algorithm with Coerced Acceptance Rate". Statistics and Computing. Springer, Netherlands.
Welling, M. and Teh, Y.W. (2011). "Bayesian Learning via Stochastic Gradient Langevin Dynamics". Proceedings of the 28th International Conference on Machine Learning (ICML), p. 681–688.
AcceptanceRate
,
as.initial.values
,
as.parm.names
,
BayesFactor
,
Blocks
,
BMK.Diagnostic
,
Combine
,
Consort
,
dcrmrf
,
ESS
,
GIV
,
is.data
,
is.model
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon.RAM
,
LML
, and
MCSE
.
# The accompanying Examples vignette is a compendium of examples. #################### Load the LaplacesDemon Library ##################### library(LaplacesDemon) ############################## Demon Data ############################### data(demonsnacks) y <- log(demonsnacks$Calories) X <- cbind(1, as.matrix(log(demonsnacks[,c(1,4,10)]+1))) J <- ncol(X) for (j in 2:J) X[,j] <- CenterScale(X[,j]) ######################### Data List Preparation ######################### mon.names <- "LP" parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0)) pos.beta <- grep("beta", parm.names) pos.sigma <- grep("sigma", parm.names) PGF <- function(Data) { beta <- rnorm(Data$J) sigma <- runif(1) return(c(beta, sigma)) } MyData <- list(J=J, PGF=PGF, X=X, mon.names=mon.names, parm.names=parm.names, pos.beta=pos.beta, pos.sigma=pos.sigma, y=y) ########################## Model Specification ########################## Model <- function(parm, Data) { ### Parameters beta <- parm[Data$pos.beta] sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf) parm[Data$pos.sigma] <- sigma ### Log-Prior beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE)) sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE) ### Log-Likelihood mu <- tcrossprod(Data$X, t(beta)) LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE)) ### Log-Posterior LP <- LL + beta.prior + sigma.prior Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP, yhat=rnorm(length(mu), mu, sigma), parm=parm) return(Modelout) } #library(compiler) #Model <- cmpfun(Model) #Consider byte-compiling for more speed set.seed(666) ############################ Initial Values ############################# Initial.Values <- GIV(Model, MyData, PGF=TRUE) ########################################################################### # Examples of MCMC Algorithms # ########################################################################### #################### Automated Factor Slice Sampler ##################### Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, Covar=NULL, Iterations=1000, Status=100, Thinning=1, Algorithm="AFSS", Specs=list(A=Inf, B=NULL, m=100, n=0, w=1)) Fit print(Fit) #Consort(Fit) #plot(BMK.Diagnostic(Fit)) #PosteriorChecks(Fit) #caterpillar.plot(Fit, Parms="beta") #BurnIn <- Fit$Rec.BurnIn.Thinned #plot(Fit, BurnIn, MyData, PDF=FALSE) #Pred <- predict(Fit, Model, MyData, CPUs=1) #summary(Pred, Discrep="Chi-Square") #plot(Pred, Style="Covariates", Data=MyData) #plot(Pred, Style="Density", Rows=1:9) #plot(Pred, Style="ECDF") #plot(Pred, Style="Fitted") #plot(Pred, Style="Jarque-Bera") #plot(Pred, Style="Predictive Quantiles") #plot(Pred, Style="Residual Density") #plot(Pred, Style="Residuals") #Levene.Test(Pred) #Importance(Fit, Model, MyData, Discrep="Chi-Square") ############# Adaptive Directional Metropolis-within-Gibbs ############## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="ADMG", Specs=list(n=0, Periodicity=50)) ######################## Adaptive Griddy-Gibbs ########################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AGG", Specs=list(Grid=GaussHermiteQuadRule(3)$nodes, # dparm=NULL, smax=Inf, CPUs=1, Packages=NULL, Dyn.libs=NULL)) ################## Adaptive Hamiltonian Monte Carlo ##################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AHMC", Specs=list(epsilon=0.02, L=2, m=NULL, # Periodicity=10)) 
########################## Adaptive Metropolis ########################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AM", Specs=list(Adaptive=500, Periodicity=10)) ################### Adaptive Metropolis-within-Gibbs #################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AMWG", Specs=list(B=NULL, n=0, Periodicity=50)) ###################### Adaptive-Mixture Metropolis ###################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AMM", Specs=list(Adaptive=500, B=NULL, n=0, # Periodicity=10, w=0.05)) ################### Affine-Invariant Ensemble Sampler ################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AIES", Specs=list(Nc=2*length(Initial.Values), Z=NULL, # beta=2, CPUs=1, Packages=NULL, Dyn.libs=NULL)) ################# Componentwise Hit-And-Run Metropolis ################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="CHARM", Specs=NULL) ########### Componentwise Hit-And-Run (Adaptive) Metropolis ############# #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="CHARM", Specs=list(alpha.star=0.44)) ################# Delayed Rejection Adaptive Metropolis ################# #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="DRAM", Specs=list(Adaptive=500, Periodicity=10)) ##################### Delayed Rejection Metropolis ###################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="DRM", Specs=NULL) ################## Differential Evolution Markov Chain ################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="DEMC", Specs=list(Nc=3, Z=NULL, gamma=NULL, w=0.1)) ####################### Elliptical Slice Sampler ######################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="ESS", Specs=list(B=NULL)) ############################# Gibbs Sampler ############################# ### NOTE: Unlike the other samplers, Gibbs requires specifying a ### function (FC) that draws from full conditionals. 
# The accompanying Examples vignette is a compendium of examples. #################### Load the LaplacesDemon Library ##################### library(LaplacesDemon) ############################## Demon Data ############################### data(demonsnacks) y <- log(demonsnacks$Calories) X <- cbind(1, as.matrix(log(demonsnacks[,c(1,4,10)]+1))) J <- ncol(X) for (j in 2:J) X[,j] <- CenterScale(X[,j]) ######################### Data List Preparation ######################### mon.names <- "LP" parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0)) pos.beta <- grep("beta", parm.names) pos.sigma <- grep("sigma", parm.names) PGF <- function(Data) { beta <- rnorm(Data$J) sigma <- runif(1) return(c(beta, sigma)) } MyData <- list(J=J, PGF=PGF, X=X, mon.names=mon.names, parm.names=parm.names, pos.beta=pos.beta, pos.sigma=pos.sigma, y=y) ########################## Model Specification ########################## Model <- function(parm, Data) { ### Parameters beta <- parm[Data$pos.beta] sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf) parm[Data$pos.sigma] <- sigma ### Log-Prior beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE)) sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE) ### Log-Likelihood mu <- tcrossprod(Data$X, t(beta)) LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE)) ### Log-Posterior LP <- LL + beta.prior + sigma.prior Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP, yhat=rnorm(length(mu), mu, sigma), parm=parm) return(Modelout) } #library(compiler) #Model <- cmpfun(Model) #Consider byte-compiling for more speed set.seed(666) ############################ Initial Values ############################# Initial.Values <- GIV(Model, MyData, PGF=TRUE) ########################################################################### # Examples of MCMC Algorithms # ########################################################################### #################### Automated Factor Slice Sampler ##################### Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, Covar=NULL, Iterations=1000, Status=100, Thinning=1, Algorithm="AFSS", Specs=list(A=Inf, B=NULL, m=100, n=0, w=1)) Fit print(Fit) #Consort(Fit) #plot(BMK.Diagnostic(Fit)) #PosteriorChecks(Fit) #caterpillar.plot(Fit, Parms="beta") #BurnIn <- Fit$Rec.BurnIn.Thinned #plot(Fit, BurnIn, MyData, PDF=FALSE) #Pred <- predict(Fit, Model, MyData, CPUs=1) #summary(Pred, Discrep="Chi-Square") #plot(Pred, Style="Covariates", Data=MyData) #plot(Pred, Style="Density", Rows=1:9) #plot(Pred, Style="ECDF") #plot(Pred, Style="Fitted") #plot(Pred, Style="Jarque-Bera") #plot(Pred, Style="Predictive Quantiles") #plot(Pred, Style="Residual Density") #plot(Pred, Style="Residuals") #Levene.Test(Pred) #Importance(Fit, Model, MyData, Discrep="Chi-Square") ############# Adaptive Directional Metropolis-within-Gibbs ############## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="ADMG", Specs=list(n=0, Periodicity=50)) ######################## Adaptive Griddy-Gibbs ########################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AGG", Specs=list(Grid=GaussHermiteQuadRule(3)$nodes, # dparm=NULL, smax=Inf, CPUs=1, Packages=NULL, Dyn.libs=NULL)) ################## Adaptive Hamiltonian Monte Carlo ##################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AHMC", Specs=list(epsilon=0.02, L=2, m=NULL, # Periodicity=10)) 
########################## Adaptive Metropolis ########################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AM", Specs=list(Adaptive=500, Periodicity=10)) ################### Adaptive Metropolis-within-Gibbs #################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AMWG", Specs=list(B=NULL, n=0, Periodicity=50)) ###################### Adaptive-Mixture Metropolis ###################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AMM", Specs=list(Adaptive=500, B=NULL, n=0, # Periodicity=10, w=0.05)) ################### Affine-Invariant Ensemble Sampler ################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="AIES", Specs=list(Nc=2*length(Initial.Values), Z=NULL, # beta=2, CPUs=1, Packages=NULL, Dyn.libs=NULL)) ################# Componentwise Hit-And-Run Metropolis ################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="CHARM", Specs=NULL) ########### Componentwise Hit-And-Run (Adaptive) Metropolis ############# #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="CHARM", Specs=list(alpha.star=0.44)) ################# Delayed Rejection Adaptive Metropolis ################# #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="DRAM", Specs=list(Adaptive=500, Periodicity=10)) ##################### Delayed Rejection Metropolis ###################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="DRM", Specs=NULL) ################## Differential Evolution Markov Chain ################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="DEMC", Specs=list(Nc=3, Z=NULL, gamma=NULL, w=0.1)) ####################### Elliptical Slice Sampler ######################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="ESS", Specs=list(B=NULL)) ############################# Gibbs Sampler ############################# ### NOTE: Unlike the other samplers, Gibbs requires specifying a ### function (FC) that draws from full conditionals. 
#FC <- function(parm, Data) # { # ### Parameters # beta <- parm[Data$pos.beta] # sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf) # sigma2 <- sigma*sigma # ### Hyperparameters # betamu <- rep(0,length(beta)) # betaprec <- diag(length(beta))/1000 # ### Update beta # XX <- crossprod(Data$X) # Xy <- crossprod(Data$X, Data$y) # IR <- backsolve(chol(XX/sigma2 + betaprec), diag(length(beta))) # btilde <- crossprod(t(IR)) %*% (Xy/sigma2 + betaprec %*% betamu) # beta <- btilde + IR %*% rnorm(length(beta)) # return(c(beta,sigma)) # } ##library(compiler) ##FC <- cmpfun(FC) #Consider byte-compiling for more speed #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="Gibbs", Specs=list(FC=FC, MWG=pos.sigma)) ############################# Griddy-Gibbs ############################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="GG", Specs=list(Grid=seq(from=-0.1, to=0.1, len=5), # dparm=NULL, CPUs=1, Packages=NULL, Dyn.libs=NULL)) ####################### Hamiltonian Monte Carlo ######################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="HMC", Specs=list(epsilon=0.001, L=2, m=NULL)) ############# Hamiltonian Monte Carlo with Dual-Averaging ############### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=1, Thinning=1, # Algorithm="HMCDA", Specs=list(A=500, delta=0.65, epsilon=NULL, # Lmax=1000, lambda=0.1)) ####################### Hit-And-Run Metropolis ########################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="HARM", Specs=NULL) ################## Hit-And-Run (Adaptive) Metropolis #################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="HARM", Specs=list(alpha.star=0.234, B=NULL)) ######################## Independence Metropolis ######################## ### Note: the mu and Covar arguments are populated from a previous Laplace ### Approximation. 
#Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=Fit$Covar, Iterations=1000, Status=100, Thinning=1, # Algorithm="IM", # Specs=list(mu=Fit$Summary1[1:length(Initial.Values),1])) ######################### Interchain Adaptation ######################### #Initial.Values <- rbind(Initial.Values, GIV(Model, MyData, PGF=TRUE)) #Fit <- LaplacesDemon.hpc(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="INCA", Specs=list(Adaptive=500, Periodicity=10), # LogFile="MyLog", Chains=2, CPUs=2, Type="PSOCK", Packages=NULL, # Dyn.libs=NULL) ################ Metropolis-Adjusted Langevin Algorithm ################# #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="MALA", Specs=list(A=1e7, alpha.star=0.574, gamma=1, # delta=1, epsilon=c(1e-6,1e-7))) ############# Metropolis-Coupled Markov Chain Monte Carlo ############### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="MCMCMC", Specs=list(lambda=1, CPUs=2, Packages=NULL, # Dyn.libs=NULL)) ####################### Metropolis-within-Gibbs ######################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="MWG", Specs=list(B=NULL)) ######################## Multiple-Try Metropolis ######################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="MTM", Specs=list(K=4, CPUs=1, Packages=NULL, Dyn.libs=NULL)) ########################## No-U-Turn Sampler ############################ #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=1, Thinning=1, # Algorithm="NUTS", Specs=list(A=500, delta=0.6, epsilon=NULL, # Lmax=Inf)) ################# Oblique Hyperrectangle Slice Sampler ################## #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="OHSS", Specs=list(A=Inf, n=0)) ##################### Preconditioned Crank-Nicolson ##################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="pCN", Specs=list(beta=0.1)) ###################### Robust Adaptive Metropolis ####################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="RAM", Specs=list(alpha.star=0.234, B=NULL, Dist="N", # gamma=0.66, n=0)) ################### Random Dive Metropolis-Hastings #################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="RDMH", Specs=NULL) ########################## Refractive Sampler ########################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="Refractive", Specs=list(Adaptive=1, m=2, w=0.1, r=1.3)) ########################### Reversible-Jump ############################# #bin.n <- J-1 #bin.p <- 0.2 #parm.p <- c(1, rep(1/(J-1),(J-1)), 1) #selectable <- c(0, rep(1,J-1), 0) #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="RJ", Specs=list(bin.n=bin.n, bin.p=bin.p, # parm.p=parm.p, selectable=selectable, # selected=c(0,rep(1,J-1),0))) ######################## 
Random-Walk Metropolis ######################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="RWM", Specs=NULL) ######################## Reflective Slice Sampler ####################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="RSS", Specs=list(m=5, w=1e-5)) ############## Sequential Adaptive Metropolis-within-Gibbs ############## #NOTE: The SAMWG algorithm is only for state-space models (SSMs) #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="SAMWG", Specs=list(Dyn=Dyn, Periodicity=50)) ################## Sequential Metropolis-within-Gibbs ################### #NOTE: The SMWG algorithm is only for state-space models (SSMs) #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="SMWG", Specs=list(Dyn=Dyn)) ############################# Slice Sampler ############################# #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=1, Thinning=1, # Algorithm="Slice", Specs=list(B=NULL, Bounds=c(-Inf,Inf), m=100, # Type="Continuous", w=1)) ################# Stochastic Gradient Langevin Dynamics ################# #NOTE: The Data and Model functions must be coded differently for SGLD. #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=10, Thinning=10, # Algorithm="SGLD", Specs=list(epsilon=1e-4, file="X.csv", Nr=1e4, # Nc=6, size=10)) ################### Tempered Hamiltonian Monte Carlo #################### #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="THMC", Specs=list(epsilon=0.001, L=2, m=NULL, # Temperature=2)) ############################### t-walk ################################# #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="twalk", Specs=list(SIV=NULL, n1=4, at=6, aw=1.5)) ################# Univariate Eigenvector Slice Sampler ################# #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=1000, Status=100, Thinning=1, # Algorithm="UESS", Specs=list(A=Inf, B=NULL, m=100, n=0)) ########## Updating Sequential Adaptive Metropolis-within-Gibbs ######### #NOTE: The USAMWG algorithm is only for state-space model updating #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=100000, Status=100, Thinning=100, # Algorithm="USAMWG", Specs=list(Dyn=Dyn, Periodicity=50, Fit=Fit, # Begin=T.m)) ############## Updating Sequential Metropolis-within-Gibbs ############## #NOTE: The USMWG algorithm is only for state-space model updating #Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values, # Covar=NULL, Iterations=100000, Status=100, Thinning=100, # Algorithm="USMWG", Specs=list(Dyn=Dyn, Fit=Fit, Begin=T.m)) #End
This function estimates the random-access memory (RAM) required to
update a given model and data with the LaplacesDemon
function.
Warning: Unwise use of this function may crash a computer, so please read the details below.
LaplacesDemon.RAM(Model, Data, Iterations, Thinning, Algorithm="RWM")
Model |
This is a model specification function. For more
information, see |
Data |
This is a list of Data. For more information, see
|
Iterations |
This is the number of iterations for which
|
Thinning |
This is the amount of thinning applied to the chains
in |
Algorithm |
This argument accepts the name of the algorithm as a
string, as entered in |
The LaplacesDemon.RAM
function uses the
object.size
function to estimate the size in MB of RAM
required to update one chain in LaplacesDemon
for a
given model and data, and for a number of iterations and specified
thinning. When RAM is exceeded, the computer will crash. This function
can be useful when trying to estimate how many iterations to update a
model without crashing the computer. However, when estimating the
required RAM, LaplacesDemon.RAM
actually creates several large
objects, such as post
(see below). If too many iterations are
given as an argument to LaplacesDemon.RAM
, for example, then it
will crash the computer while trying to estimate the required RAM.
The best way to use this function is as follows. First, prepare the model specification and list of data. Second, observe how much RAM the computer is currently using, as well as the maximum available RAM; most of the difference between the two is the amount of RAM that may be dedicated to updating the model. Next, call this function with a small number of iterations (important for some algorithms) and few thinned samples (important for all algorithms), and note the estimated RAM. Increase the number of iterations and thinned samples and note the RAM again. Continue increasing the number of iterations and thinned samples until the estimate approaches, say, 90% of the available difference in RAM noted above.
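For instance, a minimal sketch of that loop (Model and MyData are assumed to already exist, as in the examples for LaplacesDemon, and the iteration counts are purely illustrative):

library(LaplacesDemon)
#est <- LaplacesDemon.RAM(Model, MyData, Iterations=1000, Thinning=1,
#     Algorithm="RWM")
#est$Total   #Estimated total MB of RAM for one chain
#est <- LaplacesDemon.RAM(Model, MyData, Iterations=10000, Thinning=10,
#     Algorithm="RWM")
#est$Total   #Increase until this approaches the available difference in RAM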
The computer operating system uses RAM, as does any other software
running at the moment. R is currently using RAM, and other functions
in the LaplacesDemon
package, and any other package that is
currently activated, are using RAM. There are numerous small objects
that are not included in the returned list, that use RAM. For example,
there may be a scalar called alpha
for the acceptance
probability, etc.
One potentially larger object that is not included, and depends on
the algorithm, is a matrix used for estimating LML
.
Its use occurs with non-adaptive MCMC algorithms, only with enough
globally stationary samples, and only when the ratio of parameters to
samples is not excessive. If used, then the user should create a
matrix of the appropriate dimensions and use the
object.size
function to estimate the RAM.
If the data is too large for RAM, then consider using either the
BigData
function or the SGLD algorithm in
LaplacesDemon
.
LaplacesDemon.RAM
returns a list with several components. Each
component is an estimate in MB for an object. The list has the
following components:
Covar |
This is the estimated size in MB of RAM required for the covariance matrix, variance vector, or both (some algorithms store both internally, creating one from the other). Blocked covariance matrices are not considered at this time. |
Data |
This is the estimated size in MB of RAM required for the list of data. |
Deviance |
This is the estimated size in MB of RAM required for the deviance vector. |
Initial.Values |
This is the estimated size in MB of RAM required for the vector of initial values. |
Model |
This is the estimated size in MB of RAM required for the model specification function. |
Monitor |
This is the estimated size in MB of RAM required for
the |
post |
This is the estimated size in MB of RAM required for a
matrix of posterior samples. This matrix is used in some algorithms,
and is not returned by |
Posterior1 |
This is the estimated size in MB of RAM required for
the |
Posterior2 |
This is the estimated size in MB of RAM required for
the |
Summary1 |
This is the estimated size in MB of RAM required for the summary table of all thinned posterior samples of parameters, deviance, and monitored variables. |
Summary2 |
This is the estimated size in MB of RAM required for the summary table of all globally stationary thinned posterior samples of parameters, deviance, and monitored variables. |
Total |
This is the estimated size in MB of RAM required in total
to update one chain in |
Statisticat, LLC [email protected]
BigData
,
LaplacesDemon
,
LML
, and
object.size
.
The Levene.Test
function is a Bayesian form of Levene's test
(Levene, 1960) of equality of variances.
Levene.Test(x, Method="U", G=NULL, Data=NULL)
x |
This required argument must be an object of class
|
Method |
The method defaults to |
G |
This argument defaults to |
Data |
This argument is required when the DV is multivariate,
hence when |
This function is a Bayesian form of Levene's test. Levene's test is used to assess the probability of the equality of residual variances in different groups. When residual variance does not differ by group, it is often called homoscedastic (or homoskedastic) residual variance. Homoskedastic residual variance is a common assumption. An advantage of Levene's test over other tests of homoskedastic residual variance is that Levene's test does not require normality of the residuals.
The Levene.Test
function estimates the test statistic, W, as per Levene's test. This
Bayesian form, however, estimates W from the observed residuals as
W.obs, and W from residuals that are replicated from a homoskedastic
process as W.rep. Further, W.obs and W.rep are estimated for each
posterior sample. Finally, the probability that the distribution of
W.obs is greater than the distribution of W.rep is reported (see
below).
The Levene.Test
function returns a plot (or for multivariate Y,
a series of plots), and a vector with a length equal to the number of
Levene's tests conducted.
One plot is produced per univariate application of Levene's test. Each
plot shows the test statistic W, both from the observed process
(W.obs as a black density) and the replicated process (W.rep as a red
line). The mean of W.obs is reported, along with its 95% quantile-based
probability interval (see p.interval
), the probability p(W.obs > W.rep), and the indicated
results, either homoskedastic or heteroskedastic.
Each element of the returned vector is the probability
p(W.obs > W.rep). When this probability is very low or very high (in
either extreme tail), heteroskedastic variance is indicated. Otherwise,
the variances of the groups are assumed not to differ effectively.
Statisticat, LLC. [email protected]
Levene, H. (1960). "Robust Tests for Equality of Variances". In I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow, & H. B. Mann (Eds.), Contributions to Probability and Statistics, p. 278–292. Stanford University Press: Stanford, CA.
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
PMC
,
p.interval
, and
VariationalBayes
.
#First, update the model with IterativeQuadrature, LaplaceApproximation,
#  LaplacesDemon, PMC, or VariationalBayes.
#Then, use the predict function, creating, say, object Pred.
#Finally:
#Levene.Test(Pred)
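A slightly fuller, hedged sketch of that workflow, reusing the demonsnacks regression and object names from the LaplacesDemon examples earlier in this manual (all names are assumptions carried over from there):

#Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values,
#     Covar=NULL, Iterations=1000, Status=100, Thinning=1,
#     Algorithm="AFSS", Specs=list(A=Inf, B=NULL, m=100, n=0, w=1))
#Pred <- predict(Fit, Model, MyData, CPUs=1)
#Levene.Test(Pred)   #Univariate DV, so the defaults Method="U" and G=NULL apply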
This function approximates the logarithm of the marginal likelihood
(LML), where the marginal likelihood is also called the integrated
likelihood or the prior predictive distribution of y in Bayesian
inference. The marginal likelihood is

p(y) = ∫ p(y|θ) p(θ) dθ

The prior predictive distribution indicates what y should look like,
given the model, before y has been observed. The presence of the
marginal likelihood of y normalizes the joint posterior distribution,
p(θ|y), ensuring it is a proper distribution and integrates to one (see
is.proper
). The
marginal likelihood is the denominator of Bayes' theorem, and is often
omitted, serving as a constant of proportionality. Several methods of
approximation are available.
LML(Model=NULL, Data=NULL, Modes=NULL, theta=NULL, LL=NULL, Covar=NULL, method="NSIS")
Model |
This is the model specification for the model that was
updated either in |
Data |
This is the list of data passed to the model
specification. This argument is used only with the |
Modes |
This is a vector of the posterior modes (or medians, in
the case of MCMC). This argument is used only with the |
theta |
This is a matrix of posterior samples (parameters only),
and is specified only with the |
LL |
This is a vector of MCMC samples of the log-likelihood, and
is specified only with the |
Covar |
This argument accepts the covariance matrix of the
posterior modes, and is used only with the |
method |
The method may be |
Generally, a user of LaplaceApproximation
,
LaplacesDemon
, LaplacesDemon.hpc
,
PMC
, or VariationalBayes
does not need to
use the LML
function, because these methods already include
it. However, LML
may be called by the user, should the user
desire to estimate the logarithm of the marginal likelihood with a
different method, or with non-stationary chains. The
LaplacesDemon
and LaplacesDemon.hpc
functions only call LML
when all parameters are stationary, and
only with non-adaptive algorithms.
The GD
method, where GD stands for Gelfand-Dey (1994), is a
modification of the harmonic mean estimator (HME) that results in a
more stable estimator of the logarithm of the marginal
likelihood. This method is unbiased, simulation-consistent, and
usually satisfies the Gaussian central limit theorem.
The HME
method, where HME stands for harmonic mean estimator,
of Newton-Raftery (1994) is the easiest, and therefore fastest,
estimation of the logarithm of the marginal likelihood. However, it is
an unreliable estimator and should be avoided, because small
likelihood values can overly influence the estimator, variance is
often infinite, and the Gaussian central limit theorem is usually not
satisfied. It is included here for completeness. There is not a
function in this package that uses this method by default. Given N
samples, the estimator is the harmonic mean of the likelihoods,

p(y) ≈ [ (1/N) Σ_{i=1}^{N} 1 / p(y|θ^(i)) ]^(-1)

and the LML is its logarithm.
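For illustration of this formula only (the HME is discouraged above), a numerically stable log-scale sketch, assuming Fit is an object of class demonoid with globally stationary samples:

#LL <- Fit$Deviance * (-1/2)   #Log-likelihood samples, as in the Examples below
#M <- max(-LL)
#LML.HME <- -(M + log(mean(exp(-LL - M))))   #Log of the harmonic mean of likelihoods
#This should be comparable to LML(LL=LL, method="HME")$LML.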
The LME
method uses the Laplace-Metropolis Estimator (LME), in
which the estimation of the Hessian matrix is approximated
numerically. It is the slowest method here, though it returns an
estimate in more cases than the other methods. The supplied
Model
specification must be executed a number of times that grows with the
square of the number of parameters, because a numerical Hessian matrix
is required. In large dimensions, this is very slow. The
Laplace-Metropolis Estimator is inappropriate with hierarchical
models. The
IterativeQuadrature
,
LaplaceApproximation
, and VariationalBayes
functions use LME
when it has converged and sir=FALSE
,
in which case it uses the posterior means or modes, and is itself
Laplace Approximation.
The Laplace-Metropolis Estimator (LME) is the logarithmic form of
equation 4 in Lewis and Raftery (1997). In a non-hierarchical model,
the marginal likelihood may easily be approximated with the
Laplace-Metropolis Estimator for model m as

p(y|m) = (2π)^(d/2) |Σ|^(1/2) p(y|θ,m) p(θ|m)

where d is the number of parameters, θ is the vector of posterior
modes, and Σ is the inverse of the negative of the approximated
Hessian matrix of second derivatives.
As a rough guideline from Kass and Raftery (1995), LME is worrisome when the sample size of the data is less than five times the number of parameters, and LME should be adequate in most problems when the sample size of the data exceeds twenty times the number of parameters (p. 778).
The NSIS
method is essentially the MarginalLikelihood
function in the MargLikArrogance
package. After HME
,
this is the fastest method available here. The
IterativeQuadrature
,
LaplaceApproximation
, and VariationalBayes
functions use NSIS
when converged and sir=TRUE
. The
LaplacesDemon
, LaplacesDemon.hpc
, and
PMC
functions use NSIS
. At least 301 stationary
samples are required, and the number of parameters cannot exceed half
the number of stationary samples.
LML
returns a list with two components:
LML |
This is an approximation of the logarithm of the marginal
likelihood (LML), which is notoriously difficult to estimate. For this
reason, several methods are provided. The marginal likelihood is
useful when comparing models, such as with Bayes factors in the
|
VarCov |
This is a variance-covariance matrix, and is the negative inverse of
the Hessian matrix, if estimated. The |
Statisticat, LLC. [email protected]
Gelfand, A.E. and Dey, D.K. (1994). "Bayesian Model Choice: Asymptotics and Exact Calculations". Journal of the Royal Statistical Society, Series B 56, p. 501–514.
Kass, R.E. and Raftery, A.E. (1995). "Bayes Factors". Journal of the American Statistical Association, 90(430), p. 773–795.
Lewis, S.M. and Raftery, A.E. (1997). "Estimating Bayes Factors via Posterior Simulation with the Laplace-Metropolis Estimator". Journal of the American Statistical Association, 92, p. 648–655.
Newton, M.A. and Raftery, A.E. (1994). "Approximate Bayesian Inference with the Weighted Likelihood Bootstrap". Journal of the Royal Statistical Society, Series B, 56, p. 3–48.
BayesFactor
,
is.proper
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
LaplacesDemon.hpc
,
PMC
, and
VariationalBayes
.
### If a model object were created and called Fit, then:
#
### Applying HME to an object of class demonoid or pmc:
#LML(LL=Fit$Deviance*(-1/2), method="HME")
#
### Applying LME to an object of class demonoid:
#LML(Model, MyData, Modes=apply(Fit$Posterior1, 2, median), method="LME")
#
### Applying NSIS to an object of class demonoid
#LML(theta=Fit$Posterior1, LL=Fit$Deviance*-(1/2), method="NSIS")
#
### Applying LME to an object of class iterquad:
#LML(Model, MyData, Modes=Fit$Summary1[,1], method="LME")
#
### Applying LME to an object of class laplace:
#LML(Model, MyData, Modes=Fit$Summary1[,1], method="LME")
#
### Applying LME to an object of class vb:
#LML(Model, MyData, Modes=Fit$Summary1[,1], method="LME")
The log-log and complementary log-log functions, as well as the inverse functions, are provided.
cloglog(p)
invcloglog(x)
invloglog(x)
loglog(p)
x |
This is a vector of real values that will be transformed to the interval [0,1]. |
p |
This is a vector of probabilities p in the interval [0,1] that will be transformed to the real line. |
The logit and probit links are symmetric, because the probabilities approach zero or one at the same rate. The log-log and complementary log-log links are asymmetric: the complementary log-log link approaches zero slowly and one quickly, while the log-log link approaches zero quickly and one slowly. Either the log-log or the complementary log-log link will often fit better than the logit or probit links, and they are frequently used when the probability of an event is small or large. A weighted mixture of the two links, log-log and complementary log-log, is also sometimes used. The reason that the logit is so prevalent is that logistic parameters can be interpreted as odds ratios.
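A quick round-trip check of the link/inverse pairing (a sketch that assumes only that each pair of functions are exact inverses, as described in the Value section below):

library(LaplacesDemon)
p <- c(0.1, 0.5, 0.9)
all.equal(invcloglog(cloglog(p)), p)   #TRUE: cloglog and invcloglog are inverses
all.equal(invloglog(loglog(p)), p)     #TRUE: loglog and invloglog are inverses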
cloglog
returns x
,
invcloglog
and invloglog
return probability p
,
and loglog
returns x
.
Statisticat, LLC. [email protected]
library(LaplacesDemon)
x <- -5:5
p <- invloglog(x)
x <- loglog(p)
The logit and inverse-logit (also called the logistic function) are provided.
invlogit(x)
logit(p)
x |
This object contains real values that will be transformed to the interval [0,1]. |
p |
This object contains probabilities p in the interval [0,1] that will be transformed to the real line. |
The logit
function is the inverse of the sigmoid or logistic
function, and transforms a continuous value (usually probability p) in
the interval [0,1] to the real line (where it is usually
the logarithm of the odds). The
logit
function is logit(p) = log(p / (1-p)).

The invlogit
function (called either the inverse logit or the
logistic function) transforms a real number (usually the logarithm of
the odds) to a value (usually probability p) in the interval
[0,1]. The
invlogit
function is invlogit(x) = 1 / (1 + exp(-x)) = exp(x) / (1 + exp(x)).

If p is a probability, then p/(1-p) is the
corresponding odds, while the
logit
of p is the logarithm
of the odds. The difference between the logits of two probabilities is
the logarithm of the odds ratio. The derivative of probability p
in a logistic function (such as
invlogit
) is p(1-p).
In the LaplacesDemon package, it is common to re-parameterize a model
so that a parameter that should be in an interval can be updated from
the real line by using the logit
and invlogit
functions,
though the interval
function provides an
alternative. For example, consider a parameter theta
that must be in the interval [0,1]. The algorithms in
IterativeQuadrature
, LaplaceApproximation
,
LaplacesDemon
, PMC
, and
VariationalBayes
are unaware of the desired interval,
and may attempt values of theta outside of this interval. One
solution is to have the algorithms update
logit(theta)
rather
than theta
. After logit(theta)
is manipulated by the
algorithm, it is transformed via invlogit(theta)
in the model
specification function, so that theta is again in the interval [0,1].
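A minimal sketch of that pattern inside a model specification function (the parameter name and position index are illustrative assumptions, not part of the package API):

#Inside Model(parm, Data), where parm[Data$pos.theta] is updated on the real line:
#theta <- invlogit(parm[Data$pos.theta])   #theta is now in [0,1]
#...use theta in the log-prior and log-likelihood...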
invlogit
returns probability p
, and
logit
returns x
.
interval
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
plogis
,
PMC
,
qlogis
, and
VariationalBayes
.
library(LaplacesDemon)
x <- -5:5
p <- invlogit(x)
x <- logit(p)
A loss matrix is useful in Bayesian decision theory for selecting the Bayes action, the optimal Bayesian decision, when there are a discrete set of possible choices (actions) and a discrete set of possible outcomes (states of the world). The Bayes action is the action that minimizes expected loss, which is equivalent to maximizing expected utility.
LossMatrix(L, p.theta)
L |
This required argument accepts a |
p.theta |
This required argument accepts a
|
Bayesian inference is often tied to decision theory (Bernardo and Smith, 2000), and decision theory has long been considered the foundations of statistics (Savage, 1954).
Before using the LossMatrix
function, the user should have
already considered all possible actions (choices), states of the world
(outcomes unknown at the time of decision-making), chosen a loss
function, estimated loss, and elicited prior probabilities.
Although possible actions (choices) for the decision-maker and possible states (outcomes) may be continuous or discrete, the loss matrix is used for discrete actions and states. An example of a continuous action may be that a decision-maker has already decided to invest, and the remaining, current decision is how much to invest. Likewise, an example of continuous states of the world (outcomes) may be how much profit or loss may occur after a given continuous unit of time.
The coded example provided below is taken from Berger (1985, p. 6-7) and described here. The set of possible actions for a decision-maker is to invest in bond ZZZ or alternatively in bond XXX, as it is called here. A real-world decision should include a mutually exhaustive list of actions, such as investing in neither, but perhaps the decision-maker has already decided to invest and narrowed the options down to these two bonds.
The possible states of the world (outcomes unknown at the time of
decision-making) are considered to be two states: either the chosen
bond will not default or it will default. Here, the loss function is
a negative linear identity of money, and hence a loss in element
L[1,1]
of -500 is a profit of 500, while a loss in
L[2,1]
of 1,000 is a loss of 1,000.
The decision-maker's dilemma is that bond ZZZ may return a higher profit than bond XXX, however there is an estimated 10% chance, the prior probability, that bond ZZZ will default and return a substantial loss. In contrast, bond XXX is considered to be a sure-thing and return a steady but smaller profit. The Bayes action is to choose the first action and invest in bond ZZZ, because it minimizes expected loss, even though there is a chance of default.
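The expected-loss arithmetic behind that conclusion can be checked by hand. The following sketch uses the numbers from the coded example below and assumes that expected loss is the probability-weighted column sum, which should reproduce the E.Loss component returned by LossMatrix in this example:

L <- matrix(c(-500, 1000, -300, -300), 2, 2)   #Loss by state (rows) and action (columns)
p.theta <- matrix(c(0.9, 0.1, 1, 0), 2, 2)     #Prior probability of each state, per action
E.Loss <- colSums(L * p.theta)                 #Expected losses: -350 and -300
which.min(E.Loss)                              #Action 1 (buy bond ZZZ) minimizes expected loss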
A more realistic application of a loss matrix may be to replace the
point-estimates of loss with samples given uncertainty around the
estimated loss, and replace the point-estimates of the prior
probability of each state with samples given the uncertainty of the
probability of each state. The loss function used in the example is
intuitive, but a more popular monetary loss function may be
-log(E(W | R)), the negative log of the expectation of wealth, given
the return. There are many alternative loss functions.
Although isolated decision-theoretic problems exist, such as the
provided example, decision theory may also be applied to the results
of a probability model (such as from
IterativeQuadrature
, LaplaceApproximation
,
LaplacesDemon
, PMC
, or
VariationalBayes
), contingent on how
a decision-maker intends to use the information from the
model. The statistician may pass the results of a model to a client,
who then considers choosing possible actions, given this
information. The statistician should further assist the client with
considering actions, states of the world, then loss functions, and
finally with eliciting the client's prior probabilities (such as with the
elicit
function).
When the outcome is finally observed, the information from this outcome may be used to refine the priors of the next such decision. In this way, Bayesian learning occurs.
The LossMatrix
function returns a list with two components:
BayesAction |
This is a numeric scalar that indicates the action that minimizes expected loss. |
E.Loss |
This is a vector of expected losses, one for each action. |
Statisticat, LLC. [email protected]
Berger, J.O. (1985). "Statistical Decision Theory and Bayesian Analysis", Second Edition. Springer: New York, NY.
Bernardo, J.M. and Smith, A.F.M. (2000). "Bayesian Theory". John Wiley & Sons: West Sussex, England.
Savage, L.J. (1954). "The Foundations of Statistics". John Wiley & Sons: West Sussex, England.
elicit
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
PMC
, and
VariationalBayes
.
library(LaplacesDemon)
### Point-estimated loss and state probabilities
L <- matrix(c(-500,1000,-300,-300), 2, 2)
rownames(L) <- c("s[1]: !Defaults","s[2]: Defaults")
colnames(L) <- c("a[1]: Buy ZZZ", "a[2]: Buy XXX")
L
p.theta <- matrix(c(0.9, 0.1, 1, 0), 2, 2)
Fit <- LossMatrix(L, p.theta)

### Point-estimated loss and samples of state probabilities
L <- matrix(c(-500,1000,-300,-300), 2, 2)
rownames(L) <- c("s[1]: Defaults","s[2]: !Defaults")
colnames(L) <- c("a[1]: Buy ZZZ", "a[2]: Buy XXX")
L
p.theta <- array(runif(4000), dim=c(2,2,1000)) #Random probabilities,
#just for a quick example. And, since they must sum to one:
for (i in 1:1000) {
  p.theta[,,i] <- p.theta[,,i] / matrix(colSums(p.theta[,,i]),
    dim(p.theta)[1], dim(p.theta)[2], byrow=TRUE)}
Fit <- LossMatrix(L, p.theta)
Fit

### Point-estimates of loss may be replaced with samples as well.
This function returns the Lowest Posterior Loss (LPL) interval for one parameter, given samples from the density of its prior distribution and samples of the posterior distribution.
LPL.interval(Prior, Posterior, prob=0.95, plot=FALSE, PDF=FALSE)
Prior |
This is a vector of samples of the prior density. |
Posterior |
This is a vector of posterior samples. |
prob |
This is a numeric scalar in the interval (0,1) giving the Lowest Posterior Loss (LPL) interval, and defaults to 0.95, representing a 95% LPL interval. |
plot |
Logical. When |
PDF |
Logical. When |
The Lowest Posterior Loss (LPL) interval (Bernardo, 2005), or LPLI, is
a probability interval based on intrinsic discrepancy loss between
prior and posterior distributions. The expected posterior loss
is the loss associated with using a particular value theta[i]
of the parameter as the
unknown true value of theta
(Bernardo, 2005). Parameter
values with smaller expected posterior loss should always be
preferred. The LPL interval includes a region in which all parameter
values have smaller expected posterior loss than those outside the
region.
Although any loss function could be used, the loss function should be
invariant under reparameterization. Any intrinsic loss function is
invariant under reparameterization, but not necessarily invariant
under one-to-one transformations of data y. When a
loss function is also invariant under one-to-one transformations, it
is usually also invariant when reduced to a sufficient statistic. Only
an intrinsic loss function that is invariant when reduced to a
sufficient statistic should be considered.
The intrinsic discrepancy loss is easily a superior loss function to
the overused quadratic loss function, and is more appropriate than
other popular measures, such as Hellinger distance, Kullback-Leibler
divergence (KLD
), and Jeffreys logarithmic
divergence. The intrinsic discrepancy loss is also an
information-theory related divergence measure. Intrinsic discrepancy
loss is a symmetric, non-negative loss function, and is a continuous,
convex function. Intrinsic discrepancy loss was introduced
by Bernardo and Rueda (2002) in a different context: hypothesis
testing. Formally, it is

delta(p[1], p[2]) = min[ KLD(p[1] || p[2]), KLD(p[2] || p[1]) ]

where delta is the discrepancy, KLD is
the
Kullback-Leibler divergence, and p[1] and p[2]
are the
probability distributions. The intrinsic discrepancy loss is the loss
function, and the expected posterior loss is the mean of the directed
divergences.
The LPL interval is also called an intrinsic credible interval or intrinsic probability interval, and the area inside the interval is often called an intrinsic credible region or intrinsic probability region.
In practice, whether a reference prior or weakly informative prior
(WIP) is used, the LPL interval is usually very close to the HPD
interval, though the posterior losses may be noticeably different. If
LPL used a zero-one loss function, then the HPD interval would be
produced. An advantage of the LPL interval over HPD interval (see
p.interval
) is that the LPL interval is invariant to
reparameterization. This is due to the invariant reparameterization
property of reference priors. The quantile-based probability interval
is also invariant to reparameterization. The LPL interval enjoys the
same advantage as the HPD interval does over the quantile-based
probability interval: it does not produce equal tails when
inappropriate.
Compared with probability intervals, the LPL interval is slightly less
convenient to calculate. Although the prior distribution is specified
within the Model
specification function, the user must specify
it for the LPL.interval
function as well. A comparison of the
quantile-based probability interval, HPD interval, and LPL interval is
available here: https://web.archive.org/web/20150214090353/http://www.bayesian-inference.com/credible.
A matrix is returned with one row and two columns. The row represents
the parameter and the column names are "Lower"
and
"Upper"
. The elements of the matrix are the lower and upper
bounds of the LPL interval.
Statisticat, LLC.
Bernardo, J.M. (2005). "Intrinsic Credible Regions: An Objective Bayesian Approach to Interval Estimation". Sociedad de Estadistica e Investigacion Operativa, 14(2), p. 317–384.
Bernardo, J.M. and Rueda, R. (2002). "Bayesian Hypothesis Testing: A Reference Approach". International Statistical Review, 70, p. 351–372.
KLD
,
p.interval
,
LaplacesDemon
, and
PMC
.
library(LaplacesDemon)
#Although LPL is intended to be applied to output from LaplacesDemon or
#PMC, here is an example in which p(theta) ~ N(0,100), and
#p(theta | y) ~ N(1,10), given 1000 samples.
theta <- rnorm(1000,1,10)
LPL.interval(Prior=dnorm(theta,0,100^2), Posterior=theta, prob=0.95,
  plot=TRUE)
#A more practical example follows, but it assumes a model has been
#updated with LaplacesDemon or PMC, the output object is called Fit, and
#that the prior for the third parameter is normally distributed with
#mean 0 and variance 100:
#temp <- Fit$Posterior2[,3]
#names(temp) <- colnames(Fit$Posterior2)[3]
#LPL.interval(Prior=dnorm(temp,0,100^2), Posterior=temp, prob=0.95,
#  plot=TRUE, PDF=FALSE)
These are utility functions for math.
GaussHermiteQuadRule(N)
Hermite(x, N, prob=TRUE)
logadd(x, add=TRUE)
partial(Model, parm, Data, Interval=1e-6, Method="simple")
N |
This required argument accepts a positive integer that indicates the number of nodes. |
x |
This is a numeric vector. |
add |
Logical. This defaults to |
Model |
This is a model specification function. For more
information, see |
parm |
This is a vector of parameters. |
prob |
Logical. This defaults to |
Data |
This is a list of data. For more information, see
|
Interval |
This is the interval of numeric differencing. |
Method |
This accepts a quoted string, and defaults to
"simple", which is finite-differencing. Alternatively
|
The GaussHermiteQuadRule
function returns nodes and weights for
univariate Gauss-Hermite quadrature. The nodes and weights are
obtained from a tridiagonal eigenvalue problem. Weights are calculated
from the physicist's (rather than the probabilist's) kernel. This has
been adapted from the GaussHermite function in the pracma package. The
GaussHermiteCubeRule
function is a multivariate version.
This is used in the IterativeQuadrature
function.
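For example (the $nodes component is used this way in the Adaptive Griddy-Gibbs example above; treating $weights as the companion component is an assumption here):

library(LaplacesDemon)
rule <- GaussHermiteQuadRule(3)
rule$nodes     #Univariate Gauss-Hermite nodes
rule$weights   #Corresponding weights (physicist's kernel)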
The Hermite
function evaluates a Hermite polynomial of degree N
at x, using either the probabilist's (
prob=TRUE
)
or physicist's (prob=FALSE
) kernel. This function was adapted
from the hermite
function in package EQL.
The logadd
function performs addition (or subtraction) when the
terms are logarithmic. The equations are

log(x+y) = log(x) + log(1 + exp(log(y) - log(x)))
log(x-y) = log(x) + log(1 - exp(log(y) - log(x)))
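A small check of logadd against direct computation, assuming logadd combines all elements of x on the log scale:

library(LaplacesDemon)
x <- log(c(2, 3))
logadd(x)          #Should equal log(2 + 3) = log(5)
log(sum(exp(x)))   #Direct computation, safe here only because exp() does not underflow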
The partial
function estimates partial derivatives of
parameters in a model specification with data, using either
forward finite-differencing or Richardson extrapolation. In calculus,
a partial derivative of a function of several variables is its
derivative with respect to one of those variables, with the others
held constant. Related functions include Jacobian
which returns
a matrix of first-order partial derivatives, and Hessian
, which
returns a matrix of second-order partial derivatives of the model
specification function with respect to its parameters. The
partial
function is not intended to be called by the user, but
is used by other functions. This is essentially the grad
function in the numDeriv package, but defaulting to forward
finite-differencing with a smaller interval.
logadd
returns the result of log(x+y) or log(x-y).
partial
returns a vector of partial derivatives.
Statisticat, LLC. [email protected]
GaussHermiteCubeRule
,
Hessian
,
IterativeQuadrature
,
Jacobian
,
LaplaceApproximation
,
LaplacesDemon
, and
VariationalBayes
.
These are utility functions for working with matrices.
as.indicator.matrix(x)
as.inverse(x)
as.parm.matrix(x, k, parm, Data, a=-Inf, b=Inf, restrict=FALSE, chol=FALSE)
as.positive.definite(x)
as.positive.semidefinite(x)
as.symmetric.matrix(x, k=NULL)
is.positive.definite(x)
is.positive.semidefinite(x)
is.square.matrix(x)
is.symmetric.matrix(x)
Cov2Cor(Sigma)
CovEstim(Model, parm, Data, Method="Hessian")
GaussHermiteCubeRule(N, dims, rule)
Hessian(Model, parm, Data, Interval=1e-6, Method="Richardson")
Jacobian(Model, parm, Data, Interval=1e-6, Method="simple")
logdet(x)
lower.triangle(x, diag=FALSE)
read.matrix(file, header=FALSE, sep=",", nrow=0, samples=0, size=0,
  na.rm=FALSE)
SparseGrid(J, K)
TransitionMatrix(theta.y=NULL, y.theta=NULL, p.theta=NULL)
tr(x)
upper.triangle(x, diag=FALSE)
N |
This required argument accepts a positive integer that indicates the number of nodes. |
x |
This is a matrix (though |
J |
This required argument indicates the dimension of the integral and accepts a positive integer. |
k |
For |
K |
This required argument indicates the accuracy and accepts a positive integer. Larger values result in many more integration nodes. |
diag |
Logical. If |
dims |
This required argument indicates the dimension of the integral and accepts a positive integer. |
Sigma |
This is a covariance matrix, |
Model |
This is a model specification function. For more
information, see |
parm |
This is a vector of parameters passed to the model specification. |
Data |
This is the list of data passed to the model
specification. For more information, see |
a , b
|
These optional arguments allow the elements of |
restrict |
Logical. If |
rule |
This is an optional argument that accepts a univariate Gauss-Hermite quadrature rule. Usually, this argument is left empty. A rule may be supplied that differs from the traditional rule, such as when constraints have been observed, and one or more nodes and weights were adjusted. |
chol |
Logical. If |
file |
This is the name of the file from which the numeric data matrix will be imported or read. |
header |
Logical. When |
Interval |
This accepts a small scalar number for precision. |
Method |
This accepts a quoted string. For |
nrow |
This is the number of rows of the numeric matrix, and
defaults to |
p.theta |
This accepts a matrix of prior probabilities for a
transition matrix, and defaults to |
samples |
This is the number of samples to take from the numeric
matrix. When |
sep |
This argument indicates a character with which it will
separate fields when creating column vectors. For example, to
read a comma-separated file (.csv), use |
size |
This is the batch size to be used only when reading a
numeric matrix that is larger than the available computer memory
(RAM), and only when |
theta.y |
This accepts a vector of posterior samples of a
discrete Markov chain, and defaults to |
na.rm |
Logical. When |
y.theta |
This accepts a vector of data that are samples of a
discrete distribution, and defaults to |
The as.indicator.matrix
function creates an indicator matrix
from a vector. This function is useful for converting a discrete
vector into a matrix in which each column represents one of the
discrete values, and each occurrence of that value in the related
column is indicated by a one, and is otherwise filled with
zeroes. This function is similar to the class.ind
function in
the nnet package.
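For example (a small sketch; the exact dimnames of the result are not guaranteed here):

library(LaplacesDemon)
x <- c(1, 2, 2, 3, 1)
as.indicator.matrix(x)   #A 5 x 3 matrix: one column per distinct value, a 1 per occurrence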
The as.inverse
function returns the matrix inverse of
x
. The solve
function in base R also returns the matrix
inverse, but solve
can return a matrix that is not symmetric,
and can fail due to singularities. The as.inverse
function
tries to use the solve
function to return a matrix inverse, and
when it fails due to a singularity, as.inverse
uses eigenvalue
decomposition (in which eigenvalues below a tolerance are replaced
with the tolerance), and coerces the result to a symmetric
matrix. This is similar to the solvcov
function in the fpc
package.
The as.parm.matrix
function prepares a correlation, covariance,
or precision matrix in two important ways. First,
as.parm.matrix
obtains the parameters for the matrix specified
in the x
argument by matching the name of the matrix in the
x
argument with any parameters in parm
, given the
parameter names in the Data
listed in parm.names
. These
obtained parameters are organized into a matrix as the elements of the
upper-triangular, including the diagonal. A copy is made, without the
diagonal, and the lower-triangular is filled in, completing the
matrix. Second, as.parm.matrix
checks for
positive-definiteness. If matrix x
is positive-definite, then
the matrix is stored as a variable called LaplacesDemonMatrix
in a new environment called LDEnv
. If matrix x
is not
positive-definite, then LaplacesDemonMatrix
in LDEnv
is
sought as a replacement. If this variable exists, then it is used to
replace the matrix. If not, then the matrix is replaced with an
identity matrix. Back in the model specification, after using
as.parm.matrix
, it is recommended that the user also pass the
resulting matrix back into the parm
vector, so the sampler or
algorithm knows that the elements of the matrix have changed.
The as.positive.definite
function returns the nearest
positive-definite matrix for a matrix that is square and symmetric
(Higham, 2002). This version is intended only for covariance and
precision matrices, and has been optimized for speed. A more
extensible alternative is the nearPD
function in the Matrix package, which
is also able to work with correlation matrices and asymmetric
matrices.
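As a sketch of typical use (the indefinite example matrix below is illustrative; the eigenvalue adjustment itself is internal to the function), a symmetric but indefinite matrix can be repaired and then checked with is.positive.definite, which is described below:
library(LaplacesDemon)
A <- matrix(c(1.00, 0.95, 0.10,
              0.95, 1.00, 0.95,
              0.10, 0.95, 1.00), 3, 3)
is.positive.definite(A)    #FALSE: one eigenvalue is negative
A.pd <- as.positive.definite(A)
is.positive.definite(A.pd) #TRUE after the adjustment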
The as.positive.semidefinite
function iteratively seeks to
return a square, symmetric matrix that is at least
positive-semidefinite, by replacing each negative eigenvalue and
calculating its projection. This is intended only for covariance and
precision matrices. A similar function is makePsd
in the RTAQ
package, though it is not iterative, and returns matrices that fail a
logical check with is.positive.semidefinite
.
The as.symmetric.matrix
function accepts either a vector or
matrix, and returns a symmetric matrix. In the case of a vector, it
can be either all elements of the matrix, or the lower
triangular. When x is entered as a matrix, this function
tolerates non-finite values in one triangle (say, the lower), as long
as the corresponding element is finite in the other (say, the upper)
triangle.
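For example, assuming the vector form is the lower triangular including the diagonal (as described above), a three-element vector yields a 2 x 2 symmetric matrix:
library(LaplacesDemon)
as.symmetric.matrix(c(1, 0.5, 2)) #returns matrix(c(1, 0.5, 0.5, 2), 2, 2)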
The Cov2Cor
function converts a covariance matrix into a
correlation matrix, and accepts the covariance matrix either in matrix
or vector form. This function may be useful inside a model
specification and also with converting posterior draws of the elements
of a covariance matrix to a correlation matrix. Cov2Cor
is an
expanded form of the cov2cor
function in the stats
package, where Cov2Cor
is also able to accept and return a
vectorized matrix.
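A quick numerical check of the matrix form (the vectorized form is analogous):
library(LaplacesDemon)
Sigma <- matrix(c(4, 2,
                  2, 9), 2, 2)
Cov2Cor(Sigma) #off-diagonal becomes 2 / (sqrt(4)*sqrt(9)) = 1/3
cov2cor(Sigma) #base R equivalent for the matrix form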
The CovEstim
function estimates a covariance matrix with one of
several methods. This is mainly used by
LaplaceApproximation
, where the parm
argument
receives the posterior modes. See the CovEst
argument for
more details.
The GaussHermiteCubeRule
function returns a matrix of nodes and
a vector of weights for a dims
-dimensional integral given
univariate nodes. The number of multivariate nodes will differ
from the number of univariate nodes. This function is for use with
multivariate quadrature, often called cubature. This has been adapted
from the
multiquad
function in the NominalLogisticBiplot
package. The GaussHermiteQuadRule
function is a
univariate version. A customized univariate rule may be
supplied when constraints necessitate that one or more nodes and
weights be altered.
The Hessian
function returns a symmetric Hessian matrix, which is a
matrix of second partial derivatives. The estimation of the Hessian
matrix is approximated numerically using Richardson extrapolation by
default. This is a slow function. This function is not intended to be
called by the user, but is made available here. This is essentially
the hessian
function from the numDeriv package, adapted to
Laplace's Demon.
The is.positive.definite
function is a logical test of whether
or not a matrix is positive-definite. A
symmetric matrix
is positive-definite if all of
its eigenvalues are positive (λ > 0). All main-diagonal elements must be positive. The
determinant of a positive-definite matrix is always positive, so a
positive-definite matrix is always nonsingular. Non-symmetric,
positive-definite matrices exist, but are not considered here.
The is.positive.semidefinite
function is a logical test of
whether or not a matrix is positive-semidefinite. A
symmetric matrix
is positive-semidefinite if all
of its eigenvalues are non-negative (λ ≥ 0).
The is.square.matrix
function is a logical test of whether or
not a matrix is square. A square matrix is a matrix with the same
number of rows and columns, and is usually represented as a k x k
matrix X.
The is.symmetric.matrix
function is a logical test of whether
or not a matrix is symmetric. A symmetric matrix is a square matrix
that is equal to its transpose, X = t(X). For example, where
i indexes rows and j indexes columns, X[i,j] = X[j,i]. This differs
from the isSymmetric function in base R, which is inexact because it
uses all.equal.
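These logical checks are often combined; a small sketch:
library(LaplacesDemon)
A <- matrix(c(1.0, 0.5,
              0.5, 1.0), 2, 2)
is.square.matrix(A)          #TRUE
is.symmetric.matrix(A)       #TRUE (exact equality, unlike isSymmetric)
is.positive.definite(A)      #TRUE: eigenvalues are 1.5 and 0.5
is.positive.semidefinite(A)  #TRUE as well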
The Jacobian
function estimates the Jacobian matrix, which is
a matrix of all first-order partial derivatives of the Model
.
The Jacobian matrix is estimated by default with forward
finite-differencing, or optionally with Richardson extrapolation. This
function is not intended to be called by the user, but is made
available here. This is essentially the jacobian
function from
the numDeriv package, adapted to LaplacesDemon.
The logdet
function returns the logarithm of the determinant of
a positive-definite matrix via the Cholesky decomposition. The
determinant is a value associated with a square matrix, and was used
historically to determine if a system of linear equations has a
unique solution. The term determinant was introduced by Gauss,
where Laplace referred to it as the resultant. When the determinant is
zero, the matrix is singular and non-invertible; there are either no
solutions or many solutions. A unique solution exists when the
determinant is non-zero. The det
function in base R works well
for small matrices, but can erroneously return zero for larger
matrices. It is better to work with the log-determinant.
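A small check against the base R determinant (safe here only because the matrix is tiny and well-conditioned):
library(LaplacesDemon)
Sigma <- matrix(c(2.0, 0.5,
                  0.5, 1.0), 2, 2)
logdet(Sigma)   #log-determinant via the Cholesky decomposition
log(det(Sigma)) #agrees for this small positive-definite matrix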
The lower.triangle
function returns a vector of the lower
triangular elements of a matrix, and the diagonal is included when
diag=TRUE
.
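For example, assuming elements are returned in the usual column-major order of R:
library(LaplacesDemon)
A <- matrix(1:9, 3, 3)
lower.triangle(A)            #elements below the diagonal: 2, 3, 6
lower.triangle(A, diag=TRUE) #includes the diagonal: 1, 2, 3, 5, 6, 9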
The read.matrix
function is provided here as one of many
convenient ways to read a numeric matrix into R. The most common
method of storing data in R is the data frame, because it is
versatile. For example, a data frame may contain character, factor,
and numeric variables together. For iterative estimation, common in
Bayesian inference, the data frame is much slower than the numeric
matrix. For this reason, the LaplacesDemon package does not use data
frames, and has not traditionally accepted character or factor
data. The read.matrix
function returns either an entire numeric
matrix, or row-wise samples from a numeric matrix. Samples may be
taken from a matrix that is too large for available computer memory
(RAM), such as with big data.
The SparseGrid
function returns a sparse grid for a
J-dimensional integral with accuracy K, given
Gauss-Hermite quadrature rules. A grid of accuracy K provides an
exact result for a polynomial of total order of 2K - 1 or less.
SparseGrid
returns a matrix of nodes and a vector of weights.
A sparse grid is more efficient than the full grid in the
GaussHermiteCubeRule
function. This has been adapted from the
SparseGrid package.
The TransitionMatrix
function has several uses. A user may
supply a vector of marginal posterior samples of a discrete Markov
chain as theta.y
, and an observed posterior transition matrix
is returned. Otherwise, a user may supply data (y.theta
) and/or
a prior (p.theta
), in which case a posterior transition matrix
is returned. A common row-wise prior is the Dirichlet distribution.
Transition probabilities are from row element to column element.
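As a brief sketch of the first use, supplying theta.y as documented in the argument list above, an observed transition matrix may be tabulated from a chain of discrete states (the states below are illustrative only):
library(LaplacesDemon)
theta <- c(1, 2, 2, 1, 3, 3, 3, 2, 1, 2) #posterior samples of a discrete Markov chain
TransitionMatrix(theta.y=theta)          #observed row-to-column transition proportions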
The tr
function returns the trace of a matrix. The trace of a
matrix is the sum of the elements in the main diagonal of a square
matrix. For example, the trace of a k x k matrix X is
sum(diag(X)) = X[1,1] + X[2,2] + ... + X[k,k].
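A one-line check against base R:
library(LaplacesDemon)
A <- matrix(1:9, 3, 3)
tr(A)        #1 + 5 + 9 = 15
sum(diag(A)) #the same value in base R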
The upper.triangle
function returns a vector of the upper
triangular elements of a matrix, and the diagonal is included when
diag=TRUE.
Statisticat, LLC. [email protected]
Higham, N.J. (2002). "Computing the Nearest Correlation Matrix - a Problem from Finance". IMA Journal of Numerical Analysis, 22, p. 329–343.
BayesianBootstrap
,
Cov2Prec
,
cov2cor
,
ddirichlet
,
GaussHermiteQuadRule
,
isSymmetric
,
LaplaceApproximation
,
LaplacesDemon
,
lower.tri
,
MISS
,
Prec2Cov
,
solve
, and
upper.tri
.
Monte Carlo Standard Error (MCSE) is an estimate of the inaccuracy of
Monte Carlo samples, usually regarding the expectation of posterior
samples, E(θ), from Monte Carlo or
Markov chain Monte Carlo (MCMC) algorithms, such as with the
LaplacesDemon
or LaplacesDemon.hpc
functions. MCSE approaches zero as the number of independent posterior
samples approaches infinity. MCSE is essentially a standard deviation
around the posterior mean of the samples,
E(θ), due to uncertainty associated with
using an MCMC algorithm, or Monte Carlo methods in general.
The acceptable size of the MCSE depends on the acceptable uncertainty
associated with the marginal posterior mean,
E(θ), and the goal of inference. It has
been argued that MCSE is generally unimportant when the goal of
inference is θ rather than E(θ) (Gelman et al., 2004, p. 277), and
that a sufficient
ESS
is more important. Others perceive
MCSE to be a vital part of reporting any Bayesian model, and as a
stopping rule (Flegal et al., 2008).
In LaplacesDemon
, MCSE is part of the posterior
summaries because it is easy to estimate, and Laplace's Demon prefers
to continue updating until each MCSE is less than 6.27% of its
associated marginal posterior standard deviation (for more information
on this stopping rule, see the Consort
function), since
MCSE has been demonstrated to be an excellent stopping rule.
Acceptable error may be specified, if known, in the MCSS
(Monte Carlo Sample Size) function to estimate the required number of
posterior samples.
MCSE
is a univariate function that is often applied to each
marginal posterior distribution. A multivariate form is not
included. By chance alone, due to multiple independent tests, 5% of
the parameters may indicate unacceptable MCSEs even when they are
acceptable. Assessing convergence is difficult.
MCSE(x, method="IMPS", batch.size="sqrt", warn=FALSE)
MCSS(x, a)
x |
This is a vector of posterior samples for which MCSE or MCSS will be estimated. |
a |
This is a scalar argument of acceptable error for the mean of
|
method |
This is an optional argument for the method of MCSE
estimation, and defaults to Geyer's |
batch.size |
This is an optional argument that corresponds only
with |
warn |
Logical. If |
The default method for estimating MCSE is Geyer's Initial Monotone Positive Sequence (IMPS) estimator (Geyer, 1992), which takes the asymptotic variance into account and is time-series based. This method goes by other names, such as Initial Positive Sequence (IPS).
The simplest method for estimating MCSE is to modify the formula for
standard error, sd(x) / sqrt(N), to account for non-independence in the
sequence x of posterior samples. Non-independence is
estimated with the
ESS
function for Effective Sample Size (see
the ESS
function for more details), where N is replaced with ESS(x), so that
MCSE = sd(x) / sqrt(ESS(x)). Although this
is the fastest and easiest method of estimation, it does not
incorporate an estimate of the asymptotic variance of x.
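This relationship is easy to verify by hand; a sketch, assuming method="sample.variance" implements exactly this ESS-based formula (as suggested by the examples below):
library(LaplacesDemon)
set.seed(1)
x <- rnorm(1000)
sd(x) / sqrt(ESS(x))              #ESS-based MCSE, computed manually
MCSE(x, method="sample.variance") #should closely match the line above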
The batch means method (Jones et al., 2006; Flegal et al., 2008)
separates the elements of x into batches and estimates
MCSE as a function of multiple batches. This method is excellent, but
is not recommended when the number of posterior samples is less than
1,000. These journal articles also assert that MCSE is a better
stopping rule than MCMC convergence diagnostics.
The MCSS
function estimates the required number of posterior
samples, given the user-specified acceptable error, posterior samples
x
, and the observed variance (rather than asymptotic
variance). Due to the observed variance, this is a rough estimate.
Statisticat, LLC. [email protected]
Flegal, J.M., Haran, M., and Jones, G.L. (2008). "Markov chain Monte Carlo: Can We Trust the Third Significant Figure?". Statistical Science, 23, p. 250–260.
Gelman, A., Carlin, J., Stern, H., and Rubin, D. (2004). "Bayesian Data Analysis, Texts in Statistical Science, 2nd ed.". Chapman and Hall, London.
Geyer, C.J. (1992). "Practical Markov Chain Monte Carlo". Statistical Science, 7, 4, p. 473–483.
Jones, G.L., Haran, M., Caffo, B.S., and Neath, R. (2006). "Fixed-Width Output Analysis for Markov chain Monte Carlo". Journal of the American Statistical Association, 101(476), p. 1537–1547.
Consort
,
ESS
,
LaplacesDemon
, and
LaplacesDemon.hpc
.
library(LaplacesDemon)
x <- rnorm(1000)
MCSE(x)
MCSE(x, method="batch.means")
MCSE(x, method="sample.variance")
MCSS(x, a=0.01)
The Minnesota prior, also called the Litterman prior, is a shrinkage prior for autoregressive parameters in vector autoregressive (VAR) models. There are many variations of the Minnesota prior. This Minnesota prior is calculated as presented in Lutkepohl (2005, p. 225), and returns one or more prior covariance matrices in an array.
MinnesotaPrior(J, lags=c(1,2), lambda=1, theta=0.5, sigma)
J |
This is the scalar number of time-series in the VAR. |
lags |
This accepts an integer vector of lags of the autoregressive parameters. The lags are not required to be successive. |
lambda |
This accepts a scalar, positive-only hyperparameter that
controls how tightly the parameter of the first lag is concentrated
around zero. A smaller value results in smaller diagonal variance.
When equal to zero, the posterior equals the prior and data is not
influential. When equal to infinity, no shrinkage occurs and
posterior expectations are closest to estimates from ordinary least
squares (OLS). It has been asserted that as the number, |
theta |
This accepts a scalar hyperparameter in the interval [0,1]. When one, off-diagonal elements have variance similar or equal to diagonal elements. When zero, off-diagonal elements have zero variance. A smaller value is associated with less off-diagonal variance. |
sigma |
This accepts a vector of length |
The Minnesota prior was introduced in Doan, Litterman, and Sims (1984) as a shrinkage prior for autoregressive parameters in vector autoregressive (VAR) models. The Minnesota prior was reviewed in Litterman (1986), and numerous variations have been presented since. This is the version of the Minnesota prior as described in Lutkepohl (2005, p. 225) for stationary time-series.
Given one or more matrices of autoregressive
parameters in a VAR model, the user specifies two tuning
hyperparameters for the Minnesota prior:
lambda
and
theta
. Each iteration of the numerical approximation algorithm,
the latest vector of residual standard deviation parameters is
supplied to the MinnesotaPrior
function, which then returns an
array that contains one or more prior covariance matrices for the
autoregressive parameters. Multiple prior covariance matrices are
returned when multiple lags are specified. The tuning hyperparameters,
lambda
and theta
, can be estimated from the data via
hierarchical Bayes.
It is important to note that the Minnesota prior does not technically
return a covariance matrix, because the matrix is not symmetric, and
therefore not positive-definite. For this reason, a Minnesota prior
covariance matrix should not be supplied as a covariance matrix to a
multivariate normal distribution, such as with the dmvn
function, though it would be accepted and then (incorrectly)
converted to a symmetric matrix. Instead, dnormv
should
be used for element-wise evaluation.
While the Minnesota prior is used to specify the prior covariance for VAR autoregressive parameters, prior means are often all set to zero, or sometimes the first lag is set to an identity matrix.
An example is provided in the Examples vignette.
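In addition, a minimal sketch of the call itself (the numbers below are purely illustrative) shows the shape of the result:
library(LaplacesDemon)
sigma <- c(1.2, 0.8, 1.5) #residual standard deviations for J=3 series
V <- MinnesotaPrior(J=3, lags=c(1,2), lambda=1, theta=0.5, sigma=sigma)
dim(V)                    #expected to be 3 x 3 x 2: one matrix per lag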
This function returns a J x J x L array for J time-series and L lags.
Statisticat, LLC [email protected]
Doan, T., Litterman, R.B. and Sims, C.A. (1984). "Forecasting and Conditional Projection using Realistic Prior Distributions". Econometric Reviews, 3, p. 1–144.
Litterman, R.B. (1986). "Forecasting with Bayesian Vector Autoregressions - Five Years of Experience". Journal of Business & Economic Statistics, 4, p. 25–38.
Lutkepohl, H. (2005). "New Introduction to Multiple Time Series Analysis". Springer, Germany.
dmvn
,
dnormv
, and
LaplacesDemon
.
This function performs multiple imputation (MI) on a numeric matrix by sequentially sampling variables with missing values, given all other variables in the data set.
MISS(X, Iterations=100, Algorithm="GS", Fit=NULL, verbose=TRUE)
X |
This required argument accepts a numeric matrix of data that
contains both observed and missing values. Data set
|
Iterations |
This is the number of iterations to perform sequential sampling via MCMC algorithms. |
Algorithm |
The MCMC algorithm defaults to the Gibbs Sampler (GS). |
Fit |
This optional argument accepts an object of class
|
verbose |
Logical. When |
Imputation is a family of statistical methods for replacing missing values with estimates. Introduced by Rubin and Schenker (1986) and Rubin (1987), Multiple Imputation (MI) is a family of imputation methods that includes multiple estimates, and therefore includes variability of the estimates.
The Multiple Imputation Sequential Sampler (MISS) function performs MI by determining the type of variable and therefore the sampler for each variable, and then sequentially progresses through each variable in the data set that has missing values, updating its prediction of those missing values given all other variables in the data set each iteration.
MI is best performed within a model, where it is called
full-likelihood imputation. Examples may be found in the "Examples"
vignette. However, sometimes it is impractical to impute within a
model when there are numerous missing values and a large number of
parameters are therefore added. As an alternative, MI may be
performed on the data set before the data is passed to the model,
such as in the IterativeQuadrature
,
LaplaceApproximation
, LaplacesDemon
, or
VariationalBayes
function. This is less desirable, but
MISS is available for MCMC-based MI in this case.
Missing values are initially set to column means for continuous variables, and are set to one for discrete variables.
MISS uses the following methods and MCMC algorithms:
Missing values of continuous variables are estimated with a ridge-stabilized linear regression Gibbs sampler.
Missing values of binary variables that have only 0 or 1 for values are estimated with the binary robit (t-link logistic regression model) Gibbs sampler of Albert and Chib (1993).
Missing values of discrete variables with 3 or more (ordered or unordered) discrete values are considered continuous.
In the presence of big data, it is suggested that the user sequentially read in batches of data that are small enough to be manageable, and then apply the MISS function to each batch. Each batch should be representative of the whole, of course.
It is common for multiple imputation functions to handle variable transformations. MISS does not transform variables, but imputes what it gets. For example, if a user has a variable that should be positive only, then it is recommended here that the user log-transform the variable, pass the data set to MISS, and when finished, exponentiate both the observed and imputed values of that variable.
The CenterScale
function should also be considered to speed up
convergence.
It is hoped that MISS is helpful, though it is not without limitation
and there are numerous alternatives outside of the
LaplacesDemon
package. If MISS does not fulfill the needs of
the user, then the following packages are recommended: Amelia, mi, or
mice. MISS emphasizes MCMC more than these alternatives, though MISS is
not as extensive. When a data set does not have a simple structure,
such as merely continuous or binary or unordered discrete, the
LaplacesDemon
function should be considered, where a
user can easily specify complicated structures such as multilevel,
spatial or temporal dependence, and more.
Matrix inversions are required in the Gibbs sampler. Matrix inversions
become more cumbersome as the number of variables increases.
If a large number of iterations is used, then the user may consider
studying the imputations for approximate convergence with the
BMK.Diagnostic
function, by supplying the transpose of
Fit$Imp. In the presence of numerous missing values, say more
than 100, the user may consider iterating through the study of the
imputations of 100 missing values at a time.
This function returns an object of class miss
that is a list
with five components:
Algorithm |
This indicates which algorithm was selected. |
Imp |
This is a |
parm |
This is a list of length |
PostMode |
This is a vector of posterior modes. If the user intends to replace missing values in a data set with only one estimate per missing values (single, not multiple imputation), then this vector contains these values. |
Type |
This is a vector of length |
Statisticat, LLC [email protected]
Albert, J.H. and Chib, S. (1993). "Bayesian Analysis of Binary and Polychotomous Response Data". Journal of the American Statistical Association, 88(422), p. 669–679.
Rubin, D.B. (1987). "Multiple Imputation for Nonresponse in Surveys". John Wiley and Sons: New York, NY.
Rubin, D.B. and Schenker, N. (1986). "Multiple Imputation for Interval Estimation from Simple Random Samples with Ignorable Nonresponse". Journal of the American Statistical Association, 81, p. 366–374.
ABB
,
BMK.Diagnostic
,
CenterScale
,
IterativeQuadrature
LaplaceApproximation
,
LaplacesDemon
, and
VariationalBayes
.
#library(LaplacesDemon)
### Create Data
#N <- 20 #Number of Simulated Records
#J <- 5 #Number of Simulated Variables
#pM <- 0.25 #Percent Missing
#Sigma <- as.positive.definite(matrix(runif(J*J),J,J))
#X <- rmvn(N, rep(0,J), Sigma)
#m <- sample.int(N*J, round(pM*N*J))
#X[m] <- NA
#head(X)
### Begin Multiple Imputation
#Fit <- MISS(X, Iterations=100, Algorithm="GS", verbose=TRUE)
#Fit
#summary(Fit)
#plot(Fit)
#plot(BMK.Diagnostic(t(Fit$Imp)))
### Continue Updating if Necessary
#Fit <- MISS(X, Iterations=100, Algorithm="GS", Fit, verbose=TRUE)
#summary(Fit)
#plot(Fit)
#plot(BMK.Diagnostic(t(Fit$Imp)))
### Replace Missing Values in Data Set with Posterior Modes
#Ximp <- X
#Ximp[which(is.na(X))] <- Fit$PostMode
### Original and Imputed Data Sets
#head(X)
#head(Ximp)
#summary(X)
#summary(Ximp)
### or Multiple Data Sets, say 3
#Ximp <- array(X, dim=c(nrow(X), ncol(X), 3))
#for (i in 1:3) {
#     Xi <- X
#     Xi[which(is.na(X))] <- Fit$Imp[,sample.int(ncol(Fit$Imp), 1)]
#     Ximp[,,i] <- Xi}
#head(X)
#head(Ximp[,,1])
#head(Ximp[,,2])
#head(Ximp[,,3])
#End
The mode is a measure of central tendency. It is the value that occurs most frequently, or in a continuous probability distribution, it is the value with the most density. A distribution may have no modes (such as with a constant, or in a uniform distribution when no value occurs more frequently than any other), or one or more modes.
is.amodal(x, min.size=0.1)
is.bimodal(x, min.size=0.1)
is.multimodal(x, min.size=0.1)
is.trimodal(x, min.size=0.1)
is.unimodal(x, min.size=0.1)
Mode(x)
Modes(x, min.size=0.1)
x |
This is a vector in which a mode (or modes) will be sought. |
min.size |
This is the minimum size that can be considered a mode, where size means the proportion of the distribution between areas of increasing kernel density estimates. |
The is.amodal
function is a logical test of whether or not
x
has a mode. If x
has a mode, then TRUE
is
returned, otherwise FALSE
.
The is.bimodal
function is a logical test of whether or not
x
has two modes. If x
has two modes, then TRUE
is returned, otherwise FALSE
.
The is.multimodal
function is a logical test of whether or not
x
has multiple modes. If x
has multiple modes, then
TRUE
is returned, otherwise FALSE
.
The is.trimodal
function is a logical test of whether or not
x
has three modes. If x
has three modes, then TRUE
is returned, otherwise FALSE
.
The is.unimodal
function is a logical test of whether or not
x
has one mode. If x
has one mode, then TRUE
is returned, otherwise FALSE
.
The Mode
function returns the most frequent value when x
is discrete. If x
is a constant, then it is considered amodal,
and NA
is returned. If multiple modes exist, this function
returns only the mode with the highest density, or if two or more
modes have the same density, then it returns the first mode found.
Otherwise, the Mode
function returns the value of x
associated with the highest kernel density estimate, or the first
one found if multiple modes have the same density.
The Modes
function is a simple, deterministic function that
differences the kernel density of x
and reports a number of
modes equal to half the number of changes in direction, although the
min.size
argument can be used to reduce the number of modes
returned, and defaults to 0.1, eliminating modes that do not have at
least 10% of the distributional area. The Modes
function
returns a list with three components: modes
, modes.dens
,
and size
. The elements in each component are ordered according
to the decreasing density of the modes. The modes
component is
a vector of the values of x
associated with the modes. The
modes.dens
component is a vector of the kernel density
estimates at the modes. The size
component is a vector of the
proportion of area underneath each mode.
The IterativeQuadrature
,
LaplaceApproximation
, and VariationalBayes
functions characterize the marginal posterior distributions by
posterior modes (means) and variance. A related topic is MAP or
maximum a posteriori estimation.
Otherwise, the results of Bayesian inference tend to report the
posterior mean or median, along with probability intervals (see
p.interval
and LPL.interval
), rather than
posterior modes. In many types of models, such as mixture models, the
posterior may be multimodal. In such a case, the usual recommendation
is to choose the highest mode, if feasible. However, the
highest mode may be uncharacteristic of the majority of the posterior.
Statisticat, LLC. [email protected]
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
LPL.interval
,
p.interval
, and
VariationalBayes
.
library(LaplacesDemon)
### Below are distributions with different numbers of modes.
x <- c(1,1) #Amodal
x <- c(1,2,2,2,3) #Unimodal
x <- c(1,2) #Bimodal
x <- c(1,3,3,3,3,4,4,4,4,4) #min.size affects the answer
x <- c(1,1,3,3,3,3,4,4,4,4,4) #Trimodal
### And for each of the above, the functions below may be applied.
Mode(x)
Modes(x)
is.amodal(x)
is.bimodal(x)
is.multimodal(x)
is.trimodal(x)
is.unimodal(x)
The Model.Spec.Time
function returns the time in minutes to
evaluate a model specification a number of times, as well as
the evaluations per minute, and componentwise iterations per minute.
Model.Spec.Time(Model, Initial.Values, Data, n=1000)
Model |
This required argument is a model specification
function. For more information, see |
Initial.Values |
This required argument is a vector of initial values for the parameters. |
Data |
This required argument is a list of data. For more
information, see |
n |
This is the number of evaluations of the model specification,
and accuracy increases with |
The largest single factor to affect the run-time of an algorithm –
whether it is with IterativeQuadrature
,
LaplaceApproximation
, LaplacesDemon
,
PMC
, or VariationalBayes
– is the time
that it takes to evaluate the model specification. This has also been
observed in Rosenthal (2007). It is highly recommended that a user of
the LaplacesDemon
package should attempt to reduce the run-time
of the model specification, usually by testing alternate forms of code
for speed. This is especially true with big data, such as with the
BigData
function.
Every function in the LaplacesDemon package is byte-compiled, which is
a recent option in R. This reduces run-time, thanks to Tierney's
compiler package in base R. The model specification, however, is
specified by the user, and should be byte-compiled. The reduction in
run-time may range from mild to dramatic, depending on the model. It
is highly recommended that users concerned with run-time activate the
compiler package and use the cmpfun
function, as per the
example below.
A model specification function that is optimized for speed and
involves many records may result in a model update run-time that is
considerably less than other popular MCMC-based software algorithms
that loop through records, even when those algorithms are coded in
C
or other fast languages. For a comparison, see the
“Laplace's Demon Tutorial” vignette.
However, if a model specification function in the LaplacesDemon
package is not fully vectorized (contains for
loops and
apply
functions), then run-time will typically be slower than
other popular MCMC-based software algorithms.
The speed of calculating the model specification function is
affected by parameter constraints, such as with the
interval
function. Parameter constraints may add
considerable variability to the speed of this calculation, and usually
more variation occurs with initial values that are far from the target
distributions.
Distributions in the LaplacesDemon
package usually have logical
checks to ensure correctness. These checks may slow the
calculation of the density, for example. If the user is confident
these checks are unnecessary for their model, then the user may
copy the code to a new function name and comment-out the checks to
improve speed.
When speed is of paramount importance, a user is encouraged to
experiment with writing the model specification function in another
language such as in C++
with the Rcpp
package, and
calling drop-in replacement functions from within the Model
function, or re-writing the model function entirely in C++
.
For an introduction to including C++
in LaplacesDemon,
see https://web.archive.org/web/20150227225556/http://www.bayesian-inference.com/softwarearticlescppsugar.
When a model specification function is computationally expensive, another approach to reduce run-time may be for the user to write parallelized code within the model, splitting up difficult tasks and assigning each to a separate CPU.
Another use for Model.Spec.Time
is to allow the user to make an
informed decision about which MCMC algorithm to select, given the
speed of their model specification. For example, the Adaptive
Metropolis-within-Gibbs (AMWG) of Roberts and Rosenthal (2009) is
currently the adaptive MCMC algorithm of choice in general in many
cases, but this choice is conditional on run-time. While other
MCMC algorithms in LaplacesDemon
evaluate the model
specification function once per iteration, componentwise algorithms
such as in the MWG family evaluate it once per parameter per
iteration, significantly increasing run-time per iteration in large
models. The Model.Spec.Time
function may forewarn the user of
the associated run-time, and if it should be decided to go with an
alternate MCMC algorithm, such as AMM, in which each element of its
covariance matrix must stabilize for the chains to become
stationary. AMM, for example, will require many more iterations of
burn-in (for more information, see the burnin
function),
but with numerous iterations, allows more thinning. A general
recommendation may be to use AMWG when
Componentwise.Iters.per.Minute
>= 1000, but this is subjective
and best determined by each user for each model. A better decision may
be made by comparing MCMC algorithms with the Juxtapose
function for a particular model.
Following are a few common suggestions for increasing the speed of
R
code in the model specification function. There are often
exceptions to these suggestions, so case-by-case experimentation is
also suggested.
Replace exponents with multiplied terms, such as x^2
with x*x
.
Replace mean(x)
with sum(x)/length(x)
.
Replace parentheses (when possible) with curly brackets, such
as x*(a+b)
with x*{a+b}
.
Replace tcrossprod(Data$X, t(beta))
with
Data$X %*% beta
when there are few predictors, and avoid
tcrossprod(beta, Data$X)
, which is always slowest.
Vectorize functions and eliminate apply
and for
functions. There are often specialized functions. For example,
max.col(X)
is faster than apply(X, 1, which.max)
.
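A quick, self-contained way to check such suggestions on a particular machine is to time the alternatives directly with base R (the timings below are only suggestive and vary by hardware and R version):
x <- rnorm(1e6)
system.time(for (i in 1:200) y1 <- x^2)              #exponent
system.time(for (i in 1:200) y2 <- x*x)              #multiplication
system.time(for (i in 1:200) m1 <- mean(x))          #mean()
system.time(for (i in 1:200) m2 <- sum(x)/length(x)) #sum()/length()
all.equal(y1, y2)                                    #same result either way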
When seeking speed, things to consider beyond the LaplacesDemon package are the Basic Linear Algebra System (BLAS) and hardware. The ATLAS (Automatically Tuned Linear Algebra System) should be worth installing (and there are several alternatives), but this discussion is beyond the scope of this documentation. Lastly, the speed at which the computer can process iterations is limited by its hardware, and more should be considered than merely the CPU (Central Processing Unit). Again, though, this is beyond the scope of this documentation.
The Model.Spec.Time
function returns a list with three
components:
Time |
This is the time in minutes to evaluate the model
specification |
Evals.per.Minute |
This is the number of evaluations of the
model specification per minute: |
Componentwise.Iters.per.Minute |
This is the number of iterations per minute in a componentwise MCMC algorithm, such as AMWG or MWG. It is the evaluations per minute divided by the number of parameters, since an evaluation must occur for each parameter, for each iteration. |
Statisticat, LLC.
Roberts, G.O. and Rosenthal, J.S. (2009). "Examples of Adaptive MCMC". Journal of Computational and Graphical Statistics, 18, p. 349–367.
apply
,
BigData
,
interval
,
IterativeQuadrature
,
Juxtapose
,
LaplaceApproximation
,
LaplacesDemon
,
max.col
,
PMC
,
system.time
, and
VariationalBayes
.
# The accompanying Examples vignette is a compendium of examples.
#################### Load the LaplacesDemon Library #####################
library(LaplacesDemon)

############################## Demon Data ###############################
data(demonsnacks)
y <- log(demonsnacks$Calories)
X <- cbind(1, as.matrix(log(demonsnacks[,c(1,4,10)]+1)))
J <- ncol(X)
for (j in 2:J) {X[,j] <- CenterScale(X[,j])}

######################### Data List Preparation #########################
mon.names <- "LP"
parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0))
pos.beta <- grep("beta", parm.names)
pos.sigma <- grep("sigma", parm.names)
PGF <- function(Data) return(c(rnormv(Data$J,0,10), rhalfcauchy(1,5)))
MyData <- list(J=J, PGF=PGF, X=X, mon.names=mon.names,
     parm.names=parm.names, pos.beta=pos.beta, pos.sigma=pos.sigma, y=y)

########################## Model Specification ##########################
Model <- function(parm, Data)
     {
     ### Parameters
     beta <- parm[Data$pos.beta]
     sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf)
     parm[Data$pos.sigma] <- sigma
     ### Log of Prior Densities
     beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE))
     sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE)
     ### Log-Likelihood
     mu <- tcrossprod(Data$X, t(beta))
     LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE))
     ### Log-Posterior
     LP <- LL + beta.prior + sigma.prior
     Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP,
          yhat=rnorm(length(mu), mu, sigma), parm=parm)
     return(Modelout)
     }

set.seed(666)

############################ Initial Values #############################
Initial.Values <- GIV(Model, MyData, PGF=TRUE)

############################ Model.Spec.Time ############################
### Not byte-compiled
Model.Spec.Time(Model, Initial.Values, MyData)
### Byte-compiled
library(compiler)
Model <- cmpfun(Model)
Model.Spec.Time(Model, Initial.Values, MyData)
This function returns one or more probability intervals of posterior samples.
p.interval(obj, HPD=TRUE, MM=TRUE, prob=0.95, plot=FALSE, PDF=FALSE, ...)
obj |
This can be either a vector or matrix of posterior samples,
or an object of class |
HPD |
Logical. This argument defaults to |
MM |
Logical. This argument defaults to |
prob |
This is a numeric scalar in the interval (0,1) giving the target probability interval, and defaults to 0.95, representing a 95% probability interval. A 95% probability interval, for example, is an interval that contains 95% of a posterior probability distribution. |
plot |
Logical. When |
PDF |
Logical. When |
... |
Additional arguments are unused. |
A probability interval, also called a credible interval or Bayesian
confidence interval, is an interval in the domain of a posterior
probability distribution. When generalized to multivariate forms, it
is called a probability region (or credible region), though some
sources refer to a probability region (or credible region) as the
area within the probability interval. Bivariate probability regions
may be plotted with the joint.pr.plot
function.
The p.interval
function may return different probability
intervals: a quantile-based probability interval, a unimodal
Highest Posterior Density (HPD) interval, and multimodal HPD
intervals. Another type of probability interval is the Lowest
Posterior Loss (LPL) interval, which is calculated with the
LPL.interval
function.
The quantile-based probability interval is used most commonly,
possibly because it is simple, the fastest to calculate, invariant
under transformation, and more closely resembles the frequentist
confidence interval. The lower and upper bounds of the
quantile-based probability interval are calculated with the
quantile
function. A 95% quantile-based probability interval
reports the values of the posterior probability distribution that
indicate the 2.5% and 97.5% quantiles, which contain the central
95% of the distribution. The quantile-based probability interval is
centered around the median and has equal-sized tails.
The HPD (highest posterior density) interval is identical to the quantile-based probability interval when the posterior probability distribution is unimodal and symmetric. Otherwise, the HPD interval is the smallest interval, because it is estimated as the interval that contains the highest posterior density. Unlike the quantile-based probability interval, the HPD interval could be one-tailed or two-tailed, whichever is more appropriate. However, unlike the quantile-based interval, the HPD interval is not invariant to reparameterization (Bernardo, 2005).
The unimodal HPD interval is estimated from the empirical CDF of the sample for each parameter (or deviance or monitored variable) as the shortest interval for which the difference in the ECDF values of the end-points is the user-specified probability width. This assumes the distribution is not severely multimodal.
As an example, imagine an exponential posterior distribution. A quantile-based probability interval would report the highest density region near zero to be outside of its interval. In contrast, the unimodal HPD interval is recommended for such skewed posterior distributions.
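This difference is easy to see with simulated skewed samples; a minimal sketch (the interval end-points will vary with the random seed):
library(LaplacesDemon)
set.seed(1)
x <- rexp(10000)                   #a skewed, exponential-like 'posterior' sample
p.interval(x, HPD=FALSE, MM=FALSE) #quantile-based: excludes the high-density region near zero
p.interval(x, HPD=TRUE, MM=FALSE)  #unimodal HPD: the lower bound moves toward zero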
When MM=TRUE
, the is.multimodal
function is
applied to each column vector after the unimodal interval (either
quantile-based or HPD) is estimated. If multimodality is found, then
multimodal HPD intervals are estimated with kernel density and
printed to the screen as a character string. The original unimodal
intervals are returned in the output matrix, because the matrix is
constrained to have a uniform number of columns per row, and because
multimodal HPD intervals may be disjoint.
Disjoint multimodal HPD intervals have multiple intervals for one posterior probability distribution. An example may be when there is a bimodal, Gaussian distribution with means -10 and 10, variances of 1 and 1, and a 95% probability interval is specified. In this case, there is not enough density between these two distant modes to have only one probability interval.
The user should also consider LPL.interval
, since it is
invariant to reparameterization like the quantile-based probability
interval, but could be one- or two-tailed, whichever is more
appropriate, like the HPD interval. A comparison of the quantile-based
probability interval, HPD interval, and LPL interval is available
here: https://web.archive.org/web/20150214090353/http://www.bayesian-inference.com/credible.
A matrix is returned with rows corresponding to the parameters (or
deviance or monitored variables), and columns "Lower"
and
"Upper"
. The elements of the matrix are the unimodal
probability intervals. The attribute "Probability"
is the
user-selected probability width. If MM=TRUE
and multimodal
posterior distributions are found, then multimodal HPD intervals are
printed to the screen in a character string.
Statisticat, LLC
Bernardo, J.M. (2005). "Intrinsic Credible Regions: An Objective Bayesian Approach to Interval Estimation". Sociedad de Estadistica e Investigacion Operativa, 14(2), p. 317–384.
is.multimodal
,
IterativeQuadrature
,
joint.pr.plot
,
LaplaceApproximation
,
LaplacesDemon
,
LPL.interval
,
PMC
, and
VariationalBayes
.
##First, update the model with the LaplacesDemon function.
##Then
#p.interval(Fit, HPD=TRUE, MM=TRUE, prob=0.95)
This function plots Hellinger distances in an object of class bmk
.
## S3 method for class 'bmk'
plot(x, col=colorRampPalette(c("black","red"))(100),
     title="", PDF=FALSE, Parms=NULL, ...)
x |
This required argument is an object of class |
col |
This argument specifies the colors of the cells. By
default, the |
title |
This argument specifies the title of the plot, and the default does not include a title. |
PDF |
Logical. When |
Parms |
This argument accepts a vector of quoted strings to be matched for
selecting parameters for plotting. This argument defaults to
|
... |
Additional arguments are unused. |
The plot.bmk
function plots the Hellinger distances in an
object of class bmk
. This is useful for quickly finding
portions of chains with large Hellinger distances, which indicates
non-stationarity and non-convergence.
library(LaplacesDemon)
N <- 1000 #Number of posterior samples
J <- 10 #Number of parameters
Theta <- matrix(runif(N*J),N,J)
colnames(Theta) <- paste("beta[", 1:J, "]", sep="")
for (i in 2:N) {Theta[i,1] <- Theta[i-1,1] + rnorm(1)}
HD <- BMK.Diagnostic(Theta, batches=10)
plot(HD, title="Hellinger distance between batches")
This may be used to plot, or save plots of, samples in an object of
class demonoid
or demonoid.hpc
. Plots include a trace
plot, density plot, autocorrelation or ACF plot, and if an adaptive
algorithm was used, the absolute difference in the proposal variance,
or the value of epsilon, across adaptations.
## S3 method for class 'demonoid'
plot(x, BurnIn=0, Data, PDF=FALSE, Parms, FileName, ...)
## S3 method for class 'demonoid.hpc'
plot(x, BurnIn=0, Data, PDF=FALSE, Parms, FileName, ...)
x |
This required argument is an object of class |
BurnIn |
This argument requires zero or a positive integer that indicates the
number of thinned samples to discard as burn-in for the purposes of
plotting. For more information on burn-in, see |
Data |
This required argument must receive the list of data that was
supplied to |
PDF |
This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
Parms |
This argument accepts a vector of quoted strings to be matched for
selecting parameters for plotting. This argument defaults to
|
FileName |
This argument accepts a string and saves the plot under the specified name. If |
... |
Additional arguments are unused. |
The plots are arranged in a matrix. Each row
represents a parameter, the deviance, or a monitored variable. The
left column displays trace plots, the middle column displays kernel
density plots, and the right column displays autocorrelation (ACF)
plots.
Trace plots show the thinned history of the chain or Markov chain, with its value in the y-axis moving by thinned sample across the x-axis. A chain or Markov chain with good properties does not suggest a trend upward or downward as it progresses across the x-axis (it should appear stationary), and it should mix well, meaning it should appear as though random samples are being taken each time from the same target distribution. Visual inspection of a trace plot cannot verify convergence, but apparent non-stationarity or poor mixing can certainly suggest non-convergence. A red, smoothed line also appears to aid visual inspection.
Kernel density plots depict the marginal posterior distribution. Although there is no distributional assumption about this density, kernel density estimation uses Gaussian basis functions.
Autocorrelation plots show the autocorrelation or serial correlation
between thinned samples at nearby lags. Samples with
autocorrelation do not violate any assumption, but are inefficient
because they reduce the effective sample size (ESS
), and
indicate that the chain is not mixing well, since each value is
influenced by values that are previous and nearby. The x-axis
indicates lags with respect to thinned samples, and the y-axis
represents autocorrelation. The ideal autocorrelation plot shows
perfect correlation at zero lag, and quickly falls to zero
autocorrelation for all other lags.
If an adaptive algorithm was used, then the distribution of absolute differences in the proposal variances, or the value of epsilon, is plotted across adaptations. The proposal variance, or epsilon, should change less as the adaptive algorithm approaches the target distributions. The absolute differences in the proposal variance plot should approach zero. This is called the condition of diminishing adaptation. If it is not approaching zero, then consider using a different adaptive MCMC algorithm. The following quantiles are plotted for absolute changes in proposal variance: 0.025, 0.500, and 0.975.
Statisticat, LLC [email protected]
burnin
,
ESS
,
LaplacesDemon
, and
LaplacesDemon.hpc
.
### See the LaplacesDemon function for an example.
This may be used to plot, or save plots of, samples in an object of
class demonoid.ppc
. A variety of plots is provided.
## S3 method for class 'demonoid.ppc'
plot(x, Style=NULL, Data=NULL, Rows=NULL, PDF=FALSE, ...)
x |
This required argument is an object of class |
Style |
This optional argument specifies one of several styles of plots,
and defaults to |
Data |
This optional argument accepts the data set used when updating the
model. Data is required only with certain plot styles, including
|
Rows |
This optional argument is for a vector of row numbers that
specify the records associated by row in the object of class
|
PDF |
This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
... |
Additional arguments are unused. |
This function can be used to produce a variety of posterior predictive
plots, and the style of plot is selected with the Style
argument. Below are some notes on the styles of plots.
Covariates
requires Data
to be specified, and also
requires that the covariates are named X
or x
. A plot
is produced for each covariate column vector against yhat, and is
appropriate when y is not categorical.
Covariates, Categorical DV
requires Data
to be
specified, and also requires that the covariates are named X
or
x
. A plot is produced for each covariate column vector against
yhat, and is appropriate when y is categorical.
Density
plots show the kernel density of the posterior
predictive distribution for each selected row of y (all are selected
by default). A vertical red line indicates the position of the
observed y along the x-axis. When the vertical red line is close to
the middle of a normal posterior predictive distribution, then there
is little discrepancy between y and the posterior predictive
distribution. When the vertical red line is in the tail of the
distribution, or outside of the kernel density altogether, then
there is a large discrepancy between y and the posterior predictive
distribution. Large discrepancies may be considered outliers, and
moreover suggest that an improvement in model fit should be
considered.
DW
plots the distributions of the Durbin-Watson (DW) test
statistic (Durbin and Watson, 1950), both observed
(D.obs, as a transparent black density) and replicated
(D.rep, as a transparent red density). The distribution
of D.obs is estimated from the model, and
D.rep is simulated from normal residuals without
autocorrelation, where the number of simulations is the same as the
observed number. This DW test may be applied to the residuals of
univariate time-series models (or otherwise ordered residuals) to
detect first-order autocorrelation. Autocorrelated residuals are not
independent. The DW test is applicable only when the residuals are
normally distributed, higher-order autocorrelation is not present, and
y is not also used as a lagged predictor. The DW test statistic,
D, occurs in the interval (0,4), where 0 is
perfect positive autocorrelation, 2 is no autocorrelation, and 4 is
perfect negative autocorrelation. The following summary is reported on
the plot: the mean of D.obs (and its 95% probability
interval), the probability that D.obs > D.rep, and whether or not
autocorrelation is found. Positive
autocorrelation is reported when the observed process is greater than
the replicated process in 2.5% of the samples, and negative
autocorrelation is reported when the observed process is greater than
the replicated process in 97.5% of the samples.
DW, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
vector of residuals with a univariate Durbin-Watson test, as in
DW
above. This plot is appropriate when Y is multivariate, not
categorical, and residuals are desired to be tested column-wise for
first-order autocorrelation.
ECDF
(Empirical Cumulative Distribution Function) plots compare
the ECDF of y with three ECDFs of yhat based on the 2.5%, 50%
(median), and 97.5% of its distribution. The ECDF(y) is defined as
the proportion of values less than or equal to y. This plot is
appropriate when y is univariate and at least ordinal.
Fitted
plots compare y with the probability interval of its
replicate, and provide loess smoothing. This plot is appropriate when
y is univariate and not categorical.
Fitted, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exists in the data
set with exactly that name. These plots compare each column-wise
vector of y in Y with its replicates and provide loess smoothing.
This plot is appropriate when Y is multivariate, not categorical, and
desired to be seen column-wise.
Fitted, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exists in the data
set with exactly that name. These plots compare each row-wise
vector of y in Y with its replicates and provide loess smoothing.
This plot is appropriate when Y is multivariate, not categorical, and
desired to be seen row-wise.
Jarque-Bera
plots the distributions of the Jarque-Bera (JB)
test statistic (Jarque and Bera, 1980), both observed
(JB.obs, as a transparent black density) and replicated
(JB.rep, as a transparent red density). The
distribution of JB.obs is estimated from the model,
and JB.rep is simulated from normal residuals, where
the number of simulations is the same as the observed number. This
Jarque-Bera test may be applied to the residuals of
univariate models to test for normality. The Jarque-Bera test does not
test normality per se, but whether or not the distribution has
kurtosis and skewness that match a normal distribution, and is
therefore a test of the moments of a normal distribution. The
following summary is reported on the plot: the mean of
JB.obs (and its 95% probability interval), the
probability that JB.obs > JB.rep, and
whether or not normality is indicated. Non-normality is reported when
the observed process is greater than the replicated process in either
2.5% or 97.5% of the samples.
Jarque-Bera, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
vector of residuals with a univariate Jarque-Bera test, as in
Jarque-Bera
above. This plot is appropriate when Y is
multivariate, not categorical, and residuals are desired to be
tested column-wise for normality.
Mardia
plots the distributions of the skewness (K3) and
kurtosis (K4) test statistics (Mardia, 1970), both observed
(K3.obs and K4.obs, as transparent
black densities) and replicated (K3.rep and
K4.rep, as transparent red densities). The distributions
of K3.obs and K4.obs are estimated
from the model, and both K3.rep and K4.rep
are simulated from multivariate normal residuals, where the number of
simulations is the same as the observed number. This Mardia's test
may be applied to the residuals of multivariate models to test for
multivariate normality. Mardia's test does not test for multivariate
normality per se, but whether or not the distribution has kurtosis and
skewness that match a multivariate normal distribution, and is
therefore a test of the moments of a multivariate normal
distribution. The following summary is reported on the plots: the
means of K3.obs and K4.obs (and
the associated 95% probability intervals), the probabilities that
K3.obs > K3.rep and K4.obs > K4.rep, and whether or not
multivariate normality is indicated. Non-normality is reported when
the observed process is greater than the replicated process in either
2.5% or 97.5% of the samples.
Mardia requires Data to be specified, and also requires that variable Y exist in the data set with exactly that name. Y must be a matrix with records as rows and variables as columns. Source code was modified from the deprecated package QRMlib.
Predictive Quantiles
plots compare y with the predictive
quantile (PQ) of its replicate. This may be useful in looking for
patterns with outliers. Instances outside of the gray lines are
considered outliers.
Residual Density
plots the residual density of the median of
the samples. A vertical red line occurs at zero. This plot may be
useful for inspecting a distributional assumption of residual
variance. This plot is appropriate when y is univariate and
continuous.
Residual Density, Multivariate C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are column-wise plots of residual
density, given the median of the samples. These plots may be useful
for inspecting a distributional assumption of residual variance.
This plot is appropriate when Y is multivariate, continuous, and
densities are desired to be seen column-wise.
Residual Density, Multivariate R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are row-wise plots of residual
density, given the median of the samples. These plots may be useful
for inspecting a distributional assumption of residual variance.
This plot is appropriate when Y is multivariate, continuous, and
densities are desired to be seen row-wise.
Residuals
plots compare y with its residuals. The probability
interval is plotted as a line. This plot is appropriate when y
is univariate.
Residuals, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are plots of each column-wise
vector of residuals. The probability interval is plotted as a
line. This plot is appropriate when Y is multivariate, not
categorical, and the residuals are desired to be seen column-wise.
Residuals, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are plots of each row-wise
vector of residuals. The probability interval is plotted as a
line. This plot is appropriate when Y is multivariate, not
categorical, and the residuals are desired to be seen row-wise.
Space-Time by Space
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
, longitude
, S
, and
T
. These space-time plots compare the S x T matrix Y with the S
x T matrix Yrep, producing one time-series plot per point s in space,
for a total of S plots. Therefore, these are time-series plots for
each point s in space across T time-periods. See Time-Series
plots below.
Space-Time by Time
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
, longitude
, S
, and
T
. These space-time plots compare the S x T matrix Y with the S
x T matrix Yrep, producing one spatial plot per time-period, and T
plots will be produced. See Spatial
plots below.
Spatial
requires Data
to be specified, and also requires
that the following variables exist in the data set with exactly these
names: latitude
and longitude
. This spatial plot shows
yrep plotted according to its coordinates, and is color-coded so that
higher values of yrep become more red, and lower values become more
yellow.
Spatial Uncertainty
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
and longitude
. This
spatial plot shows the probability interval of yrep plotted according to its coordinates, and is color-coded so that wider probability intervals become more red, and narrower intervals become more yellow.
Time-Series
plots compare y with its replicate, including the
median and probability interval quantiles. This plot is appropriate
when y is univariate and ordered by time.
Time-Series, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
time-series in Y with its replicate, including the median and
probability interval quantiles. This plot is appropriate when y is
multivariate and each time-series is indexed by column in Y.
Time-Series, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each row-wise
time-series in Y with its replicate, including the median and
probability interval quantiles. This plot is appropriate when y is
multivariate and each time-series is indexed by row in Y, such as is
typically true in panel models.
Statisticat, LLC [email protected]
Durbin, J., and Watson, G.S. (1950). "Testing for Serial Correlation in Least Squares Regression, I." Biometrika, 37, p. 409–428.
Jarque, C.M. and Bera, A.K. (1980). "Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals". Economics Letters, 6(3), p. 255–259.
Mardia, K.V. (1970). "Measures of Multivariate Skewness and Kurtosis with Applications". Biometrika, 57(3), p. 519–530.
LaplacesDemon and predict.demonoid.
### See the LaplacesDemon function for an example.
This may be used to plot variable importance with BPIC, predictive
concordance, a discrepancy statistic, or the L-criterion regarding an
object of class importance
.
## S3 method for class 'importance' plot(x, Style="BPIC", ...)
x | This required argument is an object of class importance. |
Style | When Style="BPIC" (the default), variable importance is plotted by BPIC; the other styles plot predictive concordance, the discrepancy statistic, or the L-criterion (see Details). |
... | Additional arguments are unused. |
The x-axis is either BPIC (Ando, 2007), predictive concordance
(Gelfand, 1996), a discrepancy statistic (Gelman et al., 1996), or the
L-criterion (Laud and Ibrahim, 1995) of the Importance
function (depending on the Style
argument), and variables are
on the y-axis. A more important variable is associated with a dot that
is plotted farther to the right. For more information on variable
importance, see the Importance
function.
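A minimal usage sketch follows. It is illustrative only: Imp is assumed to be an object of class importance that was previously returned by the Importance function, and the object name is hypothetical.
## Hypothetical sketch: Imp is an object of class importance from Importance().
plot(Imp)                # variable importance by BPIC (the default Style)
plot(Imp, Style="BPIC")  # equivalent to the default
## The remaining styles plot predictive concordance, the discrepancy
## statistic, or the L-criterion; see the Importance function for details.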
Statisticat, LLC [email protected]
Ando, T. (2007). "Bayesian Predictive Information Criterion for the Evaluation of Hierarchical Bayesian and Empirical Bayes Models". Biometrika, 94(2), p. 443–458.
Gelfand, A. (1996). "Model Determination Using Sampling Based Methods". In Gilks, W., Richardson, S., Spiegehalter, D., Chapter 9 in Markov Chain Monte Carlo in Practice. Chapman and Hall: Boca Raton, FL.
Gelman, A., Meng, X.L., and Stern H. (1996). "Posterior Predictive Assessment of Model Fitness via Realized Discrepancies". Statistica Sinica, 6, p. 733–807.
Laud, P.W. and Ibrahim, J.G. (1995). "Predictive Model Selection". Journal of the Royal Statistical Society, B 57, p. 247–262.
IterativeQuadrature
This may be used to plot, or save plots of, the iterated history of
the parameters and, if posterior samples were taken, density plots of
parameters and monitors in an object of class iterquad
.
## S3 method for class 'iterquad' plot(x, Data, PDF=FALSE, Parms, ...)
## S3 method for class 'iterquad' plot(x, Data, PDF=FALSE, Parms, ...)
x | This required argument is an object of class iterquad. |
Data | This required argument must receive the list of data that was supplied to IterativeQuadrature to create the object. |
PDF | This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
Parms | This argument accepts a vector of quoted strings to be matched for selecting parameters for plotting. It defaults to NULL, in which case all parameters are selected. |
... | Additional arguments are unused. |
The plots are arranged in a matrix. The
purpose of the iterated history plots is to show how the value of each
parameter and the deviance changed by iteration as the
IterativeQuadrature
attempted to fit a normal
distribution to the marginal posterior distributions.
The plots on the right show several densities, described below.
The transparent black density is the normalized quadrature weights for non-standard normal distributions. For multivariate quadrature, there are often multiple weights at a given node, and the average weight at each node is shown. Vertical black lines indicate the nodes.
The transparent red density is the normalized LP weights. For multivariate quadrature, there are often multiple weights at a given node, and the average normalized and weighted LP is shown. Vertical red lines indicate the nodes.
The transparent green density is the normal density implied given the conditional mean and conditional variance.
The transparent blue density is the kernel density estimate of posterior samples generated with Sampling Importance Resampling. This is plotted only if the algorithm converged, and if sir=TRUE.
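A minimal usage sketch follows, assuming Fit is an object of class iterquad returned by IterativeQuadrature and MyData is the data list that was supplied to it; both names are illustrative.
## Hypothetical sketch: Fit is of class iterquad; MyData is its data list.
plot(Fit, Data=MyData, PDF=FALSE)                  # plot to the active graphics device
plot(Fit, Data=MyData, PDF=TRUE)                   # save the plots to a .pdf file
plot(Fit, Data=MyData, Parms=c("beta"))            # only parameters matching "beta" (assumed name)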
Statisticat, LLC. [email protected]
### See the IterativeQuadrature function for an example.
This may be used to plot, or save plots of, samples in an object of
class iterquad.ppc
. A variety of plots is provided.
## S3 method for class 'iterquad.ppc' plot(x, Style=NULL, Data=NULL, Rows=NULL, PDF=FALSE, ...)
x | This required argument is an object of class iterquad.ppc. |
Style | This optional argument specifies one of several styles of plots, and defaults to NULL; the available styles are described in Details. |
Data | This optional argument accepts the data set used when updating the model. Data is required only with certain plot styles, including the Covariates, Multivariate, Space-Time, and Spatial styles described in Details. |
Rows | This optional argument is for a vector of row numbers that specify the records associated by row in the object of class iterquad.ppc. |
PDF | This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
... | Additional arguments are unused. |
This function can be used to produce a variety of posterior predictive
plots, and the style of plot is selected with the Style
argument. Below are some notes on the styles of plots.
Covariates
requires Data
to be specified, and also
requires that the covariates are named X
or x
. A plot
is produced for each covariate column vector against yhat, and is
appropriate when y is not categorical.
Covariates, Categorical DV
requires Data
to be
specified, and also requires that the covariates are named X
or
x
. A plot is produced for each covariate column vector against
yhat, and is appropriate when y is categorical.
Density
plots show the kernel density of the posterior
predictive distribution for each selected row of y (all are selected
by default). A vertical red line indicates the position of the
observed y along the x-axis. When the vertical red line is close to
the middle of a normal posterior predictive distribution, then there
is little discrepancy between y and the posterior predictive
distribution. When the vertical red line is in the tail of the
distribution, or outside of the kernel density altogether, then
there is a large discrepancy between y and the posterior predictive
distribution. Large discrepancies may be considered outliers, and
moreover suggest that an improvement in model fit should be
considered.
DW plots the distributions of the Durbin-Watson (DW) test statistic (Durbin and Watson, 1950), both observed (as a transparent black density) and replicated (as a transparent red density). The observed distribution is estimated from the model, and the replicated distribution is simulated from normal residuals without autocorrelation, where the number of simulations is the same as the observed number. This DW test may be applied to the residuals of univariate time-series models (or otherwise ordered residuals) to detect first-order autocorrelation. Autocorrelated residuals are not independent. The DW test is applicable only when the residuals are normally-distributed, higher-order autocorrelation is not present, and y is not also used as a lagged predictor. The DW test statistic occurs in the interval (0,4), where 0 is perfect positive autocorrelation, 2 is no autocorrelation, and 4 is perfect negative autocorrelation. The following summary is reported on the plot: the mean of the observed statistic (and its 95% probability interval), the probability that the observed statistic is greater than its replicate, and whether or not autocorrelation is found. Positive autocorrelation is reported when the observed process is greater than the replicated process in 2.5% of the samples, and negative autocorrelation is reported when the observed process is greater than the replicated process in 97.5% of the samples.
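As a point of reference, the DW statistic can be computed directly from a vector of ordered residuals. The following minimal sketch is illustrative only and is not this package's internal implementation; the residual vector is simulated.
## Minimal sketch of the Durbin-Watson statistic for ordered residuals.
set.seed(1)
e  <- arima.sim(model=list(ar=0.6), n=100)   # residuals with positive autocorrelation
dw <- sum(diff(e)^2) / sum(e^2)              # DW statistic in (0,4); 2 indicates no autocorrelation
dw                                           # values well below 2 suggest positive autocorrelation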
DW, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
vector of residuals with a univariate Durbin-Watson test, as in
DW
above. This plot is appropriate when Y is multivariate, not
categorical, and residuals are desired to be tested column-wise for
first-order autocorrelation.
ECDF
(Empirical Cumulative Distribution Function) plots compare
the ECDF of y with three ECDFs of yhat based on the 2.5%, 50%
(median), and 97.5% of its distribution. The ECDF(y) is defined as
the proportion of values less than or equal to y. This plot is
appropriate when y is univariate and at least ordinal.
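The ECDF comparison can be mimicked outside the plot method. A minimal sketch follows, with simulated data only, assuming yhat is a matrix with one row per record of y and one column per posterior sample.
## Minimal sketch of the ECDF comparison: ECDF of y against ECDFs of the
## 2.5%, 50%, and 97.5% quantiles of yhat. Illustrative data only.
set.seed(1)
y    <- rnorm(50)
yhat <- matrix(rnorm(50 * 1000, mean=rep(y, 1000), sd=0.5), nrow=50)
q    <- apply(yhat, 1, quantile, probs=c(0.025, 0.5, 0.975))
plot(ecdf(y), main="ECDF of y and of yhat quantiles")
lines(ecdf(q[1, ]), col="gray")   # 2.5% quantile of yhat
lines(ecdf(q[2, ]), col="red")    # median of yhat
lines(ecdf(q[3, ]), col="gray")   # 97.5% quantile of yhat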
Fitted
plots compare y with the probability interval of its
replicate, and provide loess smoothing. This plot is appropriate when
y is univariate and not categorical.
Fitted, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exists in the data
set with exactly that name. These plots compare each column-wise
vector of y in Y with its replicates and provide loess smoothing.
This plot is appropriate when Y is multivariate, not categorical, and
desired to be seen column-wise.
Fitted, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exists in the data
set with exactly that name. These plots compare each row-wise
vector of y in Y with its replicates and provide loess smoothing.
This plot is appropriate when Y is multivariate, not categorical, and
desired to be seen row-wise.
Jarque-Bera plots the distributions of the Jarque-Bera (JB) test statistic (Jarque and Bera, 1980), both observed (as a transparent black density) and replicated (as a transparent red density). The observed distribution is estimated from the model, and the replicated distribution is simulated from normal residuals, where the number of simulations is the same as the observed number. This Jarque-Bera test may be applied to the residuals of univariate models to test for normality. The Jarque-Bera test does not test normality per se, but whether or not the distribution has kurtosis and skewness that match a normal distribution, and is therefore a test of the moments of a normal distribution. The following summary is reported on the plot: the mean of the observed JB statistic (and its 95% probability interval), the probability that the observed statistic is greater than its replicate, and whether or not normality is indicated. Non-normality is reported when the observed process is greater than the replicated process in either 2.5% or 97.5% of the samples.
Jarque-Bera, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
vector of residuals with a univariate Jarque-Bera test, as in
Jarque-Bera
above. This plot is appropriate when Y is
multivariate, not categorical, and residuals are desired to be
tested column-wise for normality.
Mardia plots the distributions of the skewness (K3) and kurtosis (K4) test statistics (Mardia, 1970), both observed (as transparent black densities) and replicated (as transparent red densities). The observed distributions of K3 and K4 are estimated from the model, and both replicated distributions are simulated from multivariate normal residuals, where the number of simulations is the same as the observed number. This Mardia's test may be applied to the residuals of multivariate models to test for multivariate normality. Mardia's test does not test for multivariate normality per se, but whether or not the distribution has kurtosis and skewness that match a multivariate normal distribution, and is therefore a test of the moments of a multivariate normal distribution. The following summary is reported on the plots: the means of the observed K3 and K4 statistics (and the associated 95% probability intervals), the probabilities that each observed statistic is greater than its replicate, and whether or not multivariate normality is indicated. Non-normality is reported when the observed process is greater than the replicated process in either 2.5% or 97.5% of the samples.
Mardia requires Data to be specified, and also requires that variable Y exist in the data set with exactly that name. Y must be a matrix with records as rows and variables as columns. Source code was modified from the deprecated package QRMlib.
Predictive Quantiles
plots compare y with the predictive
quantile (PQ) of its replicate. This may be useful in looking for
patterns with outliers. Instances outside of the gray lines are
considered outliers.
Residual Density
plots the residual density of the median of
the samples. A vertical red line occurs at zero. This plot may be
useful for inspecting a distributional assumption of residual
variance. This plot is appropriate when y is univariate and
continuous.
Residual Density, Multivariate C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are column-wise plots of residual
density, given the median of the samples. These plots may be useful
for inspecting a distributional assumption of residual variance.
This plot is appropriate when Y is multivariate, continuous, and
densities are desired to be seen column-wise.
Residual Density, Multivariate R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are row-wise plots of residual
density, given the median of the samples. These plots may be useful
for inspecting a distributional assumption of residual variance.
This plot is appropriate when Y is multivariate, continuous, and
densities are desired to be seen row-wise.
Residuals
plots compare y with its residuals. The probability
interval is plotted as a line. This plot is appropriate when y
is univariate.
Residuals, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are plots of each column-wise
vector of residuals. The probability interval is plotted as a
line. This plot is appropriate when Y is multivariate, not
categorical, and the residuals are desired to be seen column-wise.
Residuals, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are plots of each row-wise
vector of residuals. The probability interval is plotted as a
line. This plot is appropriate when Y is multivariate, not
categorical, and the residuals are desired to be seen row-wise.
Space-Time by Space
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
, longitude
, S
, and
T
. These space-time plots compare the S x T matrix Y with the S
x T matrix Yrep, producing one time-series plot per point s in space,
for a total of S plots. Therefore, these are time-series plots for
each point s in space across T time-periods. See Time-Series
plots below.
Space-Time by Time
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
, longitude
, S
, and
T
. These space-time plots compare the S x T matrix Y with the S
x T matrix Yrep, producing one spatial plot per time-period, and T
plots will be produced. See Spatial
plots below.
Spatial
requires Data
to be specified, and also requires
that the following variables exist in the data set with exactly these
names: latitude
and longitude
. This spatial plot shows
yrep plotted according to its coordinates, and is color-coded so that
higher values of yrep become more red, and lower values become more
yellow.
Spatial Uncertainty
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
and longitude
. This
spatial plot shows the probability interval of yrep plotted according to its coordinates, and is color-coded so that wider probability intervals become more red, and narrower intervals become more yellow.
Time-Series
plots compare y with its replicate, including the
median and probability interval quantiles. This plot is appropriate
when y is univariate and ordered by time.
Time-Series, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
time-series in Y with its replicate, including the median and
probability interval quantiles. This plot is appropriate when y is
multivariate and each time-series is indexed by column in Y.
Time-Series, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each row-wise
time-series in Y with its replicate, including the median and
probability interval quantiles. This plot is appropriate when y is
multivariate and each time-series is indexed by row in Y, such as is
typically true in panel models.
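A minimal usage sketch follows, assuming PPC is an object of class iterquad.ppc created with predict.iterquad and MyData is the data list used when updating the model; both names are illustrative.
## Hypothetical sketch: PPC is of class iterquad.ppc; MyData is the data list.
plot(PPC, Style="Density", Rows=1:9)         # posterior predictive density for rows 1-9
plot(PPC, Style="Covariates", Data=MyData)   # requires Data, with covariates named X or x
plot(PPC, Style="DW")                        # Durbin-Watson check for ordered residuals
plot(PPC, Style="Fitted", PDF=TRUE)          # save the fitted plots to a .pdf file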
Statisticat, LLC. [email protected]
Durbin, J., and Watson, G.S. (1950). "Testing for Serial Correlation in Least Squares Regression, I." Biometrika, 37, p. 409–428.
Jarque, C.M. and Bera, A.K. (1980). "Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals". Economics Letters, 6(3), p. 255–259.
Mardia, K.V. (1970). "Measures of Multivariate Skewness and Kurtosis with Applications". Biometrika, 57(3), p. 519–530.
IterativeQuadrature and predict.iterquad.
### See the IterativeQuadrature function for an example.
### See the IterativeQuadrature function for an example.
This may be used to plot a juxtaposition of MCMC algorithms according
either to IAT
or ISM (Independent Samples per Minute).
## S3 method for class 'juxtapose' plot(x, Style="ISM", ...)
x | This required argument is an object of class juxtapose. |
Style | This argument accepts either "IAT" or "ISM", and defaults to "ISM". |
... | Additional arguments are unused. |
When Style="IAT"
, the medians and 95% probability intervals of
the integrated autocorrelation times (IATs) of MCMC algorithms are
displayed in a caterpillar plot. The best, or least inefficient, MCMC
algorithm is the algorithm with the lowest IAT.
When Style="ISM"
, the medians and 95% probability intervals of
the numbers of independent samples per minute (ISM) of MCMC algorithms
are displayed in a caterpillar plot. The best, or least inefficient,
MCMC algorithm is the algorithm with the highest ISM.
For more information, see the Juxtapose
function.
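A minimal usage sketch follows, assuming Juxt is an object of class juxtapose returned by the Juxtapose function; the object name is illustrative.
## Hypothetical sketch: Juxt is an object of class juxtapose from Juxtapose().
plot(Juxt, Style="ISM")   # independent samples per minute (default; higher is better)
plot(Juxt, Style="IAT")   # integrated autocorrelation time (lower is better)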
Statisticat, LLC. [email protected]
LaplaceApproximation
This may be used to plot, or save plots of, the iterated history of
the parameters and, if posterior samples were taken, density plots of
parameters and monitors in an object of class laplace
.
## S3 method for class 'laplace' plot(x, Data, PDF=FALSE, Parms, ...)
x | This required argument is an object of class laplace. |
Data | This required argument must receive the list of data that was supplied to LaplaceApproximation to create the object. |
PDF | This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
Parms | This argument accepts a vector of quoted strings to be matched for selecting parameters for plotting. It defaults to NULL, in which case all parameters are selected. |
... | Additional arguments are unused. |
The plots are arranged in a matrix. The
purpose of the iterated history plots is to show how the value of each
parameter and the deviance changed by iteration as the
LaplaceApproximation
attempted to maximize the logarithm
of the unnormalized joint posterior density. If the algorithm
converged, and if sir=TRUE
in
LaplaceApproximation
, then plots are produced of
selected parameters and all monitored variables.
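A minimal usage sketch follows, assuming Fit is an object of class laplace returned by LaplaceApproximation and MyData is the data list that was supplied to it; both names are illustrative.
## Hypothetical sketch: Fit is of class laplace; MyData is its data list.
plot(Fit, Data=MyData, PDF=FALSE)            # iterated history (plus densities if sir=TRUE)
plot(Fit, Data=MyData, Parms=c("beta"))      # restrict to parameters matching "beta" (assumed name)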
Statisticat, LLC. [email protected]
### See the LaplaceApproximation function for an example.
This may be used to plot, or save plots of, samples in an object of
class laplace.ppc
. A variety of plots is provided.
## S3 method for class 'laplace.ppc' plot(x, Style=NULL, Data=NULL, Rows=NULL, PDF=FALSE, ...)
x | This required argument is an object of class laplace.ppc. |
Style | This optional argument specifies one of several styles of plots, and defaults to NULL; the available styles are described in Details. |
Data | This optional argument accepts the data set used when updating the model. Data is required only with certain plot styles, including the Covariates, Multivariate, Space-Time, and Spatial styles described in Details. |
Rows | This optional argument is for a vector of row numbers that specify the records associated by row in the object of class laplace.ppc. |
PDF | This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
... | Additional arguments are unused. |
This function can be used to produce a variety of posterior predictive
plots, and the style of plot is selected with the Style
argument. Below are some notes on the styles of plots.
Covariates
requires Data
to be specified, and also
requires that the covariates are named X
or x
. A plot
is produced for each covariate column vector against yhat, and is
appropriate when y is not categorical.
Covariates, Categorical DV
requires Data
to be
specified, and also requires that the covariates are named X
or
x
. A plot is produced for each covariate column vector against
yhat, and is appropriate when y is categorical.
Density
plots show the kernel density of the posterior
predictive distribution for each selected row of y (all are selected
by default). A vertical red line indicates the position of the
observed y along the x-axis. When the vertical red line is close to
the middle of a normal posterior predictive distribution, then there
is little discrepancy between y and the posterior predictive
distribution. When the vertical red line is in the tail of the
distribution, or outside of the kernel density altogether, then
there is a large discrepancy between y and the posterior predictive
distribution. Large discrepancies may be considered outliers, and
moreover suggest that an improvement in model fit should be
considered.
DW plots the distributions of the Durbin-Watson (DW) test statistic (Durbin and Watson, 1950), both observed (as a transparent black density) and replicated (as a transparent red density). The observed distribution is estimated from the model, and the replicated distribution is simulated from normal residuals without autocorrelation, where the number of simulations is the same as the observed number. This DW test may be applied to the residuals of univariate time-series models (or otherwise ordered residuals) to detect first-order autocorrelation. Autocorrelated residuals are not independent. The DW test is applicable only when the residuals are normally-distributed, higher-order autocorrelation is not present, and y is not also used as a lagged predictor. The DW test statistic occurs in the interval (0,4), where 0 is perfect positive autocorrelation, 2 is no autocorrelation, and 4 is perfect negative autocorrelation. The following summary is reported on the plot: the mean of the observed statistic (and its 95% probability interval), the probability that the observed statistic is greater than its replicate, and whether or not autocorrelation is found. Positive autocorrelation is reported when the observed process is greater than the replicated process in 2.5% of the samples, and negative autocorrelation is reported when the observed process is greater than the replicated process in 97.5% of the samples.
DW, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
vector of residuals with a univariate Durbin-Watson test, as in
DW
above. This plot is appropriate when Y is multivariate, not
categorical, and residuals are desired to be tested column-wise for
first-order autocorrelation.
ECDF
(Empirical Cumulative Distribution Function) plots compare
the ECDF of y with three ECDFs of yhat based on the 2.5%, 50%
(median), and 97.5% of its distribution. The ECDF(y) is defined as
the proportion of values less than or equal to y. This plot is
appropriate when y is univariate and at least ordinal.
Fitted
plots compare y with the probability interval of its
replicate, and provide loess smoothing. This plot is appropriate when
y is univariate and not categorical.
Fitted, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exists in the data
set with exactly that name. These plots compare each column-wise
vector of y in Y with its replicates and provide loess smoothing.
This plot is appropriate when Y is multivariate, not categorical, and
desired to be seen column-wise.
Fitted, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exists in the data
set with exactly that name. These plots compare each row-wise
vector of y in Y with its replicates and provide loess smoothing.
This plot is appropriate when Y is multivariate, not categorical, and
desired to be seen row-wise.
Jarque-Bera plots the distributions of the Jarque-Bera (JB) test statistic (Jarque and Bera, 1980), both observed (as a transparent black density) and replicated (as a transparent red density). The observed distribution is estimated from the model, and the replicated distribution is simulated from normal residuals, where the number of simulations is the same as the observed number. This Jarque-Bera test may be applied to the residuals of univariate models to test for normality. The Jarque-Bera test does not test normality per se, but whether or not the distribution has kurtosis and skewness that match a normal distribution, and is therefore a test of the moments of a normal distribution. The following summary is reported on the plot: the mean of the observed JB statistic (and its 95% probability interval), the probability that the observed statistic is greater than its replicate, and whether or not normality is indicated. Non-normality is reported when the observed process is greater than the replicated process in either 2.5% or 97.5% of the samples.
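For reference, the JB statistic combines sample skewness and excess kurtosis. The following minimal sketch is illustrative only and is not this package's internal implementation; the helper function name is hypothetical.
## Minimal sketch of the Jarque-Bera statistic for a residual vector.
jarque.bera <- function(e) {
  n <- length(e)
  z <- e - mean(e)
  S <- (sum(z^3) / n) / (sum(z^2) / n)^(3/2)  # sample skewness
  K <- (sum(z^4) / n) / (sum(z^2) / n)^2      # sample kurtosis
  (n / 6) * (S^2 + ((K - 3)^2) / 4)           # JB is near 0 for normal residuals
}
set.seed(1)
jarque.bera(rnorm(100))   # small value: consistent with normality
jarque.bera(rexp(100))    # large value: skewed residuals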
Jarque-Bera, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
vector of residuals with a univariate Jarque-Bera test, as in
Jarque-Bera
above. This plot is appropriate when Y is
multivariate, not categorical, and residuals are desired to be
tested column-wise for normality.
Mardia plots the distributions of the skewness (K3) and kurtosis (K4) test statistics (Mardia, 1970), both observed (as transparent black densities) and replicated (as transparent red densities). The observed distributions of K3 and K4 are estimated from the model, and both replicated distributions are simulated from multivariate normal residuals, where the number of simulations is the same as the observed number. This Mardia's test may be applied to the residuals of multivariate models to test for multivariate normality. Mardia's test does not test for multivariate normality per se, but whether or not the distribution has kurtosis and skewness that match a multivariate normal distribution, and is therefore a test of the moments of a multivariate normal distribution. The following summary is reported on the plots: the means of the observed K3 and K4 statistics (and the associated 95% probability intervals), the probabilities that each observed statistic is greater than its replicate, and whether or not multivariate normality is indicated. Non-normality is reported when the observed process is greater than the replicated process in either 2.5% or 97.5% of the samples.
Mardia requires Data to be specified, and also requires that variable Y exist in the data set with exactly that name. Y must be a matrix with records as rows and variables as columns. Source code was modified from the deprecated package QRMlib.
Predictive Quantiles
plots compare y with the predictive
quantile (PQ) of its replicate. This may be useful in looking for
patterns with outliers. Instances outside of the gray lines are
considered outliers.
Residual Density
plots the residual density of the median of
the samples. A vertical red line occurs at zero. This plot may be
useful for inspecting a distributional assumption of residual
variance. This plot is appropriate when y is univariate and
continuous.
Residual Density, Multivariate C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are column-wise plots of residual
density, given the median of the samples. These plots may be useful
for inspecting a distributional assumption of residual variance.
This plot is appropriate when Y is multivariate, continuous, and
densities are desired to be seen column-wise.
Residual Density, Multivariate R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are row-wise plots of residual
density, given the median of the samples. These plots may be useful
for inspecting a distributional assumption of residual variance.
This plot is appropriate when Y is multivariate, continuous, and
densities are desired to be seen row-wise.
Residuals
plots compare y with its residuals. The probability
interval is plotted as a line. This plot is appropriate when y
is univariate.
Residuals, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are plots of each column-wise
vector of residuals. The probability interval is plotted as a
line. This plot is appropriate when Y is multivariate, not
categorical, and the residuals are desired to be seen column-wise.
Residuals, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are plots of each row-wise
vector of residuals. The probability interval is plotted as a
line. This plot is appropriate when Y is multivariate, not
categorical, and the residuals are desired to be seen row-wise.
Space-Time by Space
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
, longitude
, S
, and
T
. These space-time plots compare the S x T matrix Y with the S
x T matrix Yrep, producing one time-series plot per point s in space,
for a total of S plots. Therefore, these are time-series plots for
each point s in space across T time-periods. See Time-Series
plots below.
Space-Time by Time
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
, longitude
, S
, and
T
. These space-time plots compare the S x T matrix Y with the S
x T matrix Yrep, producing one spatial plot per time-period, and T
plots will be produced. See Spatial
plots below.
Spatial
requires Data
to be specified, and also requires
that the following variables exist in the data set with exactly these
names: latitude
and longitude
. This spatial plot shows
yrep plotted according to its coordinates, and is color-coded so that
higher values of yrep become more red, and lower values become more
yellow.
Spatial Uncertainty
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
and longitude
. This
spatial plot shows the probability interval of yrep plotted according to its coordinates, and is color-coded so that wider probability intervals become more red, and narrower intervals become more yellow.
Time-Series
plots compare y with its replicate, including the
median and probability interval quantiles. This plot is appropriate
when y is univariate and ordered by time.
Time-Series, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
time-series in Y with its replicate, including the median and
probability interval quantiles. This plot is appropriate when y is
multivariate and each time-series is indexed by column in Y.
Time-Series, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each row-wise
time-series in Y with its replicate, including the median and
probability interval quantiles. This plot is appropriate when y is
multivariate and each time-series is indexed by row in Y, such as is
typically true in panel models.
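A minimal usage sketch follows, assuming PPC is an object of class laplace.ppc created with predict.laplace and MyData is the data list used when updating the model; both names are illustrative.
## Hypothetical sketch: PPC is of class laplace.ppc; MyData is the data list.
plot(PPC, Style="Density", Rows=1:4)                          # density of yhat for selected records
plot(PPC, Style="Residual Density")                           # residual density at the posterior median
plot(PPC, Style="Time-Series, Multivariate, C", Data=MyData)  # requires Y in MyData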
Statisticat, LLC. [email protected]
Durbin, J., and Watson, G.S. (1950). "Testing for Serial Correlation in Least Squares Regression, I." Biometrika, 37, p. 409–428.
Jarque, C.M. and Bera, A.K. (1980). "Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals". Economics Letters, 6(3), p. 255–259.
Mardia, K.V. (1970). "Measures of Multivariate Skewness and Kurtosis with Applications". Biometrika, 57(3), p. 519–530.
LaplaceApproximation and predict.laplace.
### See the LaplaceApproximation function for an example.
This may be used to plot, or save plots of, samples in an object of
class miss
. Plots include a trace plot, density plot, and
autocorrelation or ACF plot.
## S3 method for class 'miss' plot(x, PDF=FALSE, ...)
x | This required argument is an object of class miss. |
PDF | This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
... | Additional arguments are unused. |
The plots are arranged in a matrix. Each row
represents the predictive distribution of a missing value. The
left column displays trace plots, the middle column displays kernel
density plots, and the right column displays autocorrelation (ACF)
plots.
Trace plots show the thinned history of the predictive distribution, with its value in the y-axis moving by iteration across the x-axis. Simulations of a predictive distribution with good properties do not suggest a trend upward or downward as it progresses across the x-axis (it should appear stationary), and it should mix well, meaning it should appear as though random samples are being taken each time from the same target distribution. Visual inspection of a trace plot cannot verify convergence, but apparent non-stationarity or poor mixing can certainly suggest non-convergence. A red, smoothed line also appears to aid visual inspection.
Kernel density plots depict the marginal posterior distribution. There is no distributional assumption about this density.
Autocorrelation plots show the autocorrelation or serial correlation
between sampled values at nearby iterations. Samples with
autocorrelation do not violate any assumption, but are inefficient
because they reduce the effective sample size (ESS
), and
indicate that the chain is not mixing well, since each value is
influenced by values that are previous and nearby. The x-axis
indicates lags with respect to samples by iteration, and the y-axis
represents autocorrelation. The ideal autocorrelation plot shows
perfect correlation at zero lag, and quickly falls to zero
autocorrelation for all other lags.
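A minimal usage sketch follows, assuming fit.miss is an object of class miss returned by the MISS function; the object name is illustrative.
## Hypothetical sketch: fit.miss is an object of class miss from MISS().
plot(fit.miss, PDF=FALSE)   # trace, density, and ACF plots, one row per missing value
plot(fit.miss, PDF=TRUE)    # save the same plots to a .pdf file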
Statisticat, LLC [email protected]
MISS.
### See the MISS function for an example.
This may be used to plot, or save plots of, samples in an object of
class pmc
. Plots include a trace plot and density plot for
parameters, a density plot for deviance and monitored variables, and
convergence plots.
## S3 method for class 'pmc' plot(x, BurnIn=0, Data, PDF=FALSE, Parms, ...)
x | This required argument is an object of class pmc. |
BurnIn | This argument requires zero or a positive integer that indicates the number of iterations to discard as burn-in for the purposes of plotting. |
Data | This required argument must receive the list of data that was supplied to PMC to create the object. |
PDF | This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
Parms | This argument accepts a vector of quoted strings to be matched for selecting parameters for plotting. It defaults to NULL, in which case all parameters are selected. |
... | Additional arguments are unused. |
The plots are arranged in a matrix. Each row
represents a parameter, the deviance, or a monitored variable. For
parameters, the left column displays trace plots and the right column
displays kernel density plots.
Trace plots show the history of the distribution of independent importance samples. When multiple mixture components are used, each mixture component has a different color. These plots are unavailable for the deviance and monitored variables.
Kernel density plots depict the marginal posterior distribution. Although there is no distributional assumption about this density, kernel density estimation uses Gaussian basis functions.
Following these plots are three plots for convergence. First, ESSN (red) and perplexity (black) are plotted by iteration. Convergence occurs when both of these seem to stabilize, and higher is better. The second plot shows the distribution of the normalized importance weights by iteration. The third plot appears only when multiple mixture components are used. The third plot displays the probabilities of each mixture component by iteration. Although the last two plots are not formally convergence plots, they are provided so the user can verify the distribution of importance weights and the mixture probabilities have become stable.
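A minimal usage sketch follows, assuming Fit is an object of class pmc returned by the PMC function and MyData is the data list supplied to it; both names and the BurnIn value are illustrative.
## Hypothetical sketch: Fit is of class pmc; MyData is its data list.
plot(Fit, BurnIn=0, Data=MyData, PDF=FALSE)                   # all iterations
plot(Fit, BurnIn=2, Data=MyData, Parms=c("beta"))             # discard 2 iterations; "beta" is an assumed name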
Statisticat, LLC. [email protected]
### See the PMC function for an example.
This may be used to plot, or save plots of, samples in an object of
class pmc.ppc
. A variety of plots is provided.
## S3 method for class 'pmc.ppc' plot(x, Style=NULL, Data=NULL, Rows=NULL, PDF=FALSE, ...)
x | This required argument is an object of class pmc.ppc. |
Style | This optional argument specifies one of several styles of plots, and defaults to NULL; the available styles are described in Details. |
Data | This optional argument accepts the data set used when updating the model. Data is required only with certain plot styles, including the Covariates, Multivariate, Space-Time, and Spatial styles described in Details. |
Rows | This optional argument is for a vector of row numbers that specify the records associated by row in the object of class pmc.ppc. |
PDF | This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
... | Additional arguments are unused. |
This function can be used to produce a variety of posterior predictive
plots, and the style of plot is selected with the Style
argument. Below are some notes on the styles of plots.
Covariates
requires Data
to be specified, and also
requires that the covariates are named X
or x
. A plot
is produced for each covariate column vector against yhat, and is
appropriate when y is not categorical.
Covariates, Categorical DV
requires Data
to be
specified, and also requires that the covariates are named X
or
x
. A plot is produced for each covariate column vector against
yhat, and is appropriate when y is categorical.
Density
plots show the kernel density of the posterior
predictive distribution for each selected row of y (all are selected
by default). A vertical red line indicates the position of the
observed y along the x-axis. When the vertical red line is close to
the middle of a normal posterior predictive distribution, then there
is little discrepancy between y and the posterior predictive
distribution. When the vertical red line is in the tail of the
distribution, or outside of the kernel density altogether, then
there is a large discrepancy between y and the posterior predictive
distribution. Large discrepancies may be considered outliers, and
moreover suggest that an improvement in model fit should be
considered.
DW plots the distributions of the Durbin-Watson (DW) test statistic (Durbin and Watson, 1950), both observed (as a transparent black density) and replicated (as a transparent red density). The observed distribution is estimated from the model, and the replicated distribution is simulated from normal residuals without autocorrelation, where the number of simulations is the same as the observed number. This DW test may be applied to the residuals of univariate time-series models (or otherwise ordered residuals) to detect first-order autocorrelation. Autocorrelated residuals are not independent. The DW test is applicable only when the residuals are normally-distributed, higher-order autocorrelation is not present, and y is not also used as a lagged predictor. The DW test statistic occurs in the interval (0,4), where 0 is perfect positive autocorrelation, 2 is no autocorrelation, and 4 is perfect negative autocorrelation. The following summary is reported on the plot: the mean of the observed statistic (and its 95% probability interval), the probability that the observed statistic is greater than its replicate, and whether or not autocorrelation is found. Positive autocorrelation is reported when the observed process is greater than the replicated process in 2.5% of the samples, and negative autocorrelation is reported when the observed process is greater than the replicated process in 97.5% of the samples.
DW, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
vector of residuals with a univariate Durbin-Watson test, as in
DW
above. This plot is appropriate when Y is multivariate, not
categorical, and residuals are desired to be tested column-wise for
first-order autocorrelation.
ECDF
(Empirical Cumulative Distribution Function) plots compare
the ECDF of y with three ECDFs of yhat based on the 2.5%, 50%
(median), and 97.5% of its distribution. The ECDF(y) is defined as
the proportion of values less than or equal to y. This plot is
appropriate when y is univariate and at least ordinal.
Fitted
plots compare y with the probability interval of its
replicate, and provide loess smoothing. This plot is appropriate when
y is univariate and not categorical.
Fitted, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exists in the data
set with exactly that name. These plots compare each column-wise
vector of y in Y with its replicates and provide loess smoothing.
This plot is appropriate when Y is multivariate, not categorical, and
desired to be seen column-wise.
Fitted, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exists in the data
set with exactly that name. These plots compare each row-wise
vector of y in Y with its replicates and provide loess smoothing.
This plot is appropriate when Y is multivariate, not categorical, and
desired to be seen row-wise.
Jarque-Bera plots the distributions of the Jarque-Bera (JB) test statistic (Jarque and Bera, 1980), both observed (as a transparent black density) and replicated (as a transparent red density). The observed distribution is estimated from the model, and the replicated distribution is simulated from normal residuals, where the number of simulations is the same as the observed number. This Jarque-Bera test may be applied to the residuals of univariate models to test for normality. The Jarque-Bera test does not test normality per se, but whether or not the distribution has kurtosis and skewness that match a normal distribution, and is therefore a test of the moments of a normal distribution. The following summary is reported on the plot: the mean of the observed JB statistic (and its 95% probability interval), the probability that the observed statistic is greater than its replicate, and whether or not normality is indicated. Non-normality is reported when the observed process is greater than the replicated process in either 2.5% or 97.5% of the samples.
Jarque-Bera, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
vector of residuals with a univariate Jarque-Bera test, as in
Jarque-Bera
above. This plot is appropriate when Y is
multivariate, not categorical, and residuals are desired to be
tested column-wise for normality.
Mardia plots the distributions of the skewness (K3) and kurtosis (K4) test statistics (Mardia, 1970), both observed (as transparent black densities) and replicated (as transparent red densities). The observed distributions of K3 and K4 are estimated from the model, and both replicated distributions are simulated from multivariate normal residuals, where the number of simulations is the same as the observed number. This Mardia's test may be applied to the residuals of multivariate models to test for multivariate normality. Mardia's test does not test for multivariate normality per se, but whether or not the distribution has kurtosis and skewness that match a multivariate normal distribution, and is therefore a test of the moments of a multivariate normal distribution. The following summary is reported on the plots: the means of the observed K3 and K4 statistics (and the associated 95% probability intervals), the probabilities that each observed statistic is greater than its replicate, and whether or not multivariate normality is indicated. Non-normality is reported when the observed process is greater than the replicated process in either 2.5% or 97.5% of the samples.
Mardia requires Data to be specified, and also requires that variable Y exist in the data set with exactly that name. Y must be a matrix with records as rows and variables as columns. Source code was modified from the deprecated package QRMlib.
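For reference, Mardia's multivariate skewness (K3) and kurtosis (K4) can be computed directly from a residual matrix. The following minimal sketch uses the classical definitions and is illustrative only; it is not the package's internal computation (which was adapted from QRMlib), and the helper function name is hypothetical.
## Minimal sketch of Mardia's multivariate skewness (K3) and kurtosis (K4)
## for an n x p residual matrix. Illustrative only.
mardia <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  Z <- scale(X, center=TRUE, scale=FALSE)   # centered residuals
  S.inv <- solve(cov(X) * (n - 1) / n)      # inverse of the ML covariance estimate
  D <- Z %*% S.inv %*% t(Z)                 # Mahalanobis cross-products
  K3 <- sum(D^3) / n^2                      # multivariate skewness
  K4 <- sum(diag(D)^2) / n                  # multivariate kurtosis
  c(K3=K3, K4=K4)
}
set.seed(1)
mardia(matrix(rnorm(200), nrow=100, ncol=2))  # near (0, p*(p+2)) = (0, 8) for normal data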
Predictive Quantiles
plots compare y with the predictive
quantile (PQ) of its replicate. This may be useful in looking for
patterns with outliers. Instances outside of the gray lines are
considered outliers.
Residual Density
plots the residual density of the median of
the samples. A vertical red line occurs at zero. This plot may be
useful for inspecting a distributional assumption of residual
variance. This plot is appropriate when y is univariate and
continuous.
Residual Density, Multivariate C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are column-wise plots of residual
density, given the median of the samples. These plots may be useful
for inspecting a distributional assumption of residual variance.
This plot is appropriate when Y is multivariate, continuous, and
densities are desired to be seen column-wise.
Residual Density, Multivariate R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are row-wise plots of residual
density, given the median of the samples. These plots may be useful
for inspecting a distributional assumption of residual variance.
This plot is appropriate when Y is multivariate, continuous, and
densities are desired to be seen row-wise.
Residuals
plots compare y with its residuals. The probability
interval is plotted as a line. This plot is appropriate when y
is univariate.
Residuals, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are plots of each column-wise
vector of residuals. The probability interval is plotted as a
line. This plot is appropriate when Y is multivariate, not
categorical, and the residuals are desired to be seen column-wise.
Residuals, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are plots of each row-wise
vector of residuals. The probability interval is plotted as a
line. This plot is appropriate when Y is multivariate, not
categorical, and the residuals are desired to be seen row-wise.
Space-Time by Space
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
, longitude
, S
, and
T
. These space-time plots compare the S x T matrix Y with the S
x T matrix Yrep, producing one time-series plot per point s in space,
for a total of S plots. Therefore, these are time-series plots for
each point s in space across T time-periods. See Time-Series
plots below.
Space-Time by Time
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
, longitude
, S
, and
T
. These space-time plots compare the S x T matrix Y with the S
x T matrix Yrep, producing one spatial plot per time-period, and T
plots will be produced. See Spatial
plots below.
Spatial
requires Data
to be specified, and also requires
that the following variables exist in the data set with exactly these
names: latitude
and longitude
. This spatial plot shows
yrep plotted according to its coordinates, and is color-coded so that
higher values of yrep become more red, and lower values become more
yellow.
Spatial Uncertainty
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
and longitude
. This
spatial plot shows the probability interval of yrep plotted according to its coordinates, and is color-coded so that wider probability intervals become more red, and narrower intervals become more yellow.
Time-Series
plots compare y with its replicate, including the
median and probability interval quantiles. This plot is appropriate
when y is univariate and ordered by time.
Time-Series, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
time-series in Y with its replicate, including the median and
probability interval quantiles. This plot is appropriate when y is
multivariate and each time-series is indexed by column in Y.
Time-Series, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each row-wise
time-series in Y with its replicate, including the median and
probability interval quantiles. This plot is appropriate when y is
multivariate and each time-series is indexed by row in Y, such as is
typically true in panel models.
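A minimal usage sketch follows, assuming PPC is an object of class pmc.ppc created with predict.pmc and MyData is the data list used when updating the model; both names are illustrative.
## Hypothetical sketch: PPC is of class pmc.ppc; MyData is the data list.
plot(PPC, Style="Density", Rows=1:9)                 # posterior predictive densities
plot(PPC, Style="Predictive Quantiles")              # look for patterns among outliers
plot(PPC, Style="Space-Time by Time", Data=MyData)   # requires latitude, longitude, S, and T in MyData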
Statisticat, LLC. [email protected]
Durbin, J., and Watson, G.S. (1950). "Testing for Serial Correlation in Least Squares Regression, I." Biometrika, 37, p. 409–428.
Jarque, C.M. and Bera, A.K. (1980). "Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals". Economics Letters, 6(3), p. 255–259.
Mardia, K.V. (1970). "Measures of Multivariate Skewness and Kurtosis with Applications". Biometrika, 57(3), p. 519–530.
PMC and predict.pmc.
### See the PMC function for an example.
VariationalBayes
This may be used to plot, or save plots of, the iterated history of
the parameters and variances, and if posterior samples were taken,
density plots of parameters and monitors in an object of class
vb
.
## S3 method for class 'vb' plot(x, Data, PDF=FALSE, Parms, ...)
x | This required argument is an object of class vb. |
Data | This required argument must receive the list of data that was supplied to VariationalBayes to create the object. |
PDF | This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
Parms | This argument accepts a vector of quoted strings to be matched for selecting parameters for plotting. It defaults to NULL, in which case all parameters are selected. |
... | Additional arguments are unused. |
The plots are arranged in a matrix. The
purpose of the iterated history plots is to show how the value of each
parameter, variance, and the deviance changed by iteration as the
VariationalBayes
attempted to maximize the logarithm
of the unnormalized joint posterior density. If the algorithm
converged, and if sir=TRUE
in
VariationalBayes
, then plots are produced of
selected parameters and all monitored variables.
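A minimal usage sketch follows, assuming Fit is an object of class vb returned by VariationalBayes and MyData is the data list that was supplied to it; both names are illustrative.
## Hypothetical sketch: Fit is of class vb; MyData is its data list.
plot(Fit, Data=MyData, PDF=FALSE)            # iterated history of parameters, variances, and deviance
plot(Fit, Data=MyData, Parms=c("beta"))      # restrict to parameters matching "beta" (assumed name)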
Statisticat, LLC. [email protected]
### See the VariationalBayes function for an example.
This may be used to plot, or save plots of, samples in an object of
class vb.ppc
. A variety of plots is provided.
## S3 method for class 'vb.ppc' plot(x, Style=NULL, Data=NULL, Rows=NULL, PDF=FALSE, ...)
x | This required argument is an object of class vb.ppc. |
Style | This optional argument specifies one of several styles of plots, and defaults to NULL; the available styles are described in Details. |
Data | This optional argument accepts the data set used when updating the model. Data is required only with certain plot styles, including the Covariates, Multivariate, Space-Time, and Spatial styles described in Details. |
Rows | This optional argument is for a vector of row numbers that specify the records associated by row in the object of class vb.ppc. |
PDF | This logical argument indicates whether or not the user wants Laplace's Demon to save the plots as a .pdf file. |
... | Additional arguments are unused. |
This function can be used to produce a variety of posterior predictive
plots, and the style of plot is selected with the Style
argument. Below are some notes on the styles of plots.
Covariates
requires Data
to be specified, and also
requires that the covariates are named X
or x
. A plot
is produced for each covariate column vector against yhat, and is
appropriate when y is not categorical.
Covariates, Categorical DV
requires Data
to be
specified, and also requires that the covariates are named X
or
x
. A plot is produced for each covariate column vector against
yhat, and is appropriate when y is categorical.
Density
plots show the kernel density of the posterior
predictive distribution for each selected row of y (all are selected
by default). A vertical red line indicates the position of the
observed y along the x-axis. When the vertical red line is close to
the middle of a normal posterior predictive distribution, then there
is little discrepancy between y and the posterior predictive
distribution. When the vertical red line is in the tail of the
distribution, or outside of the kernel density altogether, then
there is a large discrepancy between y and the posterior predictive
distribution. Large discrepancies may be considered outliers, and
moreover suggest that an improvement in model fit should be
considered.
DW plots the distributions of the Durbin-Watson (DW) test statistic (Durbin and Watson, 1950), both observed (D^obs, as a transparent black density) and replicated (D^rep, as a transparent red density). The distribution of D^obs is estimated from the model, and D^rep is simulated from normal residuals without autocorrelation, where the number of simulations is the same as the observed number. This DW test may be applied to the residuals of univariate time-series models (or otherwise ordered residuals) to detect first-order autocorrelation. Autocorrelated residuals are not independent. The DW test is applicable only when the residuals are normally distributed, higher-order autocorrelation is not present, and y is not also used as a lagged predictor. The DW test statistic, D, occurs in the interval (0,4), where 0 is perfect positive autocorrelation, 2 is no autocorrelation, and 4 is perfect negative autocorrelation. The following summary is reported on the plot: the mean of D^obs (and its 95% probability interval), the probability that D^obs > D^rep, and whether or not autocorrelation is found. Positive autocorrelation is reported when the observed process is greater than the replicated process in 2.5% of the samples, and negative autocorrelation is reported when the observed process is greater than the replicated process in 97.5% of the samples.
DW, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
vector of residuals with a univariate Durbin-Watson test, as in
DW
above. This plot is appropriate when Y is multivariate, not
categorical, and residuals are desired to be tested column-wise for
first-order autocorrelation.
ECDF
(Empirical Cumulative Distribution Function) plots compare
the ECDF of y with three ECDFs of yhat based on the 2.5%, 50%
(median), and 97.5% of its distribution. The ECDF(y) is defined as
the proportion of values less than or equal to y. This plot is
appropriate when y is univariate and at least ordinal.
Fitted
plots compare y with the probability interval of its
replicate, and provide loess smoothing. This plot is appropriate when
y is univariate and not categorical.
Fitted, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exists in the data
set with exactly that name. These plots compare each column-wise
vector of y in Y with its replicates and provide loess smoothing.
This plot is appropriate when Y is multivariate, not categorical, and
desired to be seen column-wise.
Fitted, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exists in the data
set with exactly that name. These plots compare each row-wise
vector of y in Y with its replicates and provide loess smoothing.
This plot is appropriate when Y is multivariate, not categorical, and
desired to be seen row-wise.
Jarque-Bera plots the distributions of the Jarque-Bera (JB) test statistic (Jarque and Bera, 1980), both observed (JB^obs, as a transparent black density) and replicated (JB^rep, as a transparent red density). The distribution of JB^obs is estimated from the model, and JB^rep is simulated from normal residuals, where the number of simulations is the same as the observed number. This Jarque-Bera test may be applied to the residuals of univariate models to test for normality. The Jarque-Bera test does not test normality per se, but whether or not the distribution has kurtosis and skewness that match a normal distribution, and is therefore a test of the moments of a normal distribution. The following summary is reported on the plot: the mean of JB^obs (and its 95% probability interval), the probability that JB^obs > JB^rep, and whether or not normality is indicated. Non-normality is reported when the observed process is greater than the replicated process in either 2.5% or 97.5% of the samples.
Jarque-Bera, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
vector of residuals with a univariate Jarque-Bera test, as in
Jarque-Bera
above. This plot is appropriate when Y is
multivariate, not categorical, and residuals are desired to be
tested column-wise for normality.
Mardia plots the distributions of the skewness (K3) and kurtosis (K4) test statistics (Mardia, 1970), both observed (K3^obs and K4^obs, as transparent black densities) and replicated (K3^rep and K4^rep, as transparent red densities). The distributions of K3^obs and K4^obs are estimated from the model, and K3^rep and K4^rep are simulated from multivariate normal residuals, where the number of simulations is the same as the observed number. Mardia's test may be applied to the residuals of multivariate models to test for multivariate normality. Mardia's test does not test for multivariate normality per se, but whether or not the distribution has kurtosis and skewness that match a multivariate normal distribution, and is therefore a test of the moments of a multivariate normal distribution. The following summary is reported on the plots: the means of K3^obs and K4^obs (and the associated 95% probability intervals), the probabilities that K3^obs > K3^rep and K4^obs > K4^rep, and whether or not multivariate normality is indicated. Non-normality is reported when the observed process is greater than the replicated process in either 2.5% or 97.5% of the samples.
Mardia requires Data to be specified, and also requires that variable Y exist in the data set with exactly that name. Y must be a matrix of records (rows) and variables (columns). Source code was modified from the deprecated package QRMlib.
Predictive Quantiles
plots compare y with the predictive
quantile (PQ) of its replicate. This may be useful in looking for
patterns with outliers. Instances outside of the gray lines are
considered outliers.
Residual Density
plots the residual density of the median of
the samples. A vertical red line occurs at zero. This plot may be
useful for inspecting a distributional assumption of residual
variance. This plot is appropriate when y is univariate and
continuous.
Residual Density, Multivariate C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are column-wise plots of residual
density, given the median of the samples. These plots may be useful
for inspecting a distributional assumption of residual variance.
This plot is appropriate when Y is multivariate, continuous, and
densities are desired to be seen column-wise.
Residual Density, Multivariate R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are row-wise plots of residual
density, given the median of the samples. These plots may be useful
for inspecting a distributional assumption of residual variance.
This plot is appropriate when Y is multivariate, continuous, and
densities are desired to be seen row-wise.
Residuals
plots compare y with its residuals. The probability
interval is plotted as a line. This plot is appropriate when y
is univariate.
Residuals, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are plots of each column-wise
vector of residuals. The probability interval is plotted as a
line. This plot is appropriate when Y is multivariate, not
categorical, and the residuals are desired to be seen column-wise.
Residuals, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These are plots of each row-wise
vector of residuals. The probability interval is plotted as a
line. This plot is appropriate when Y is multivariate, not
categorical, and the residuals are desired to be seen row-wise.
Space-Time by Space
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
, longitude
, S
, and
T
. These space-time plots compare the S x T matrix Y with the S
x T matrix Yrep, producing one time-series plot per point s in space,
for a total of S plots. Therefore, these are time-series plots for
each point s in space across T time-periods. See Time-Series
plots below.
Space-Time by Time
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
, longitude
, S
, and
T
. These space-time plots compare the S x T matrix Y with the S
x T matrix Yrep, producing one spatial plot per time-period, and T
plots will be produced. See Spatial
plots below.
Spatial
requires Data
to be specified, and also requires
that the following variables exist in the data set with exactly these
names: latitude
and longitude
. This spatial plot shows
yrep plotted according to its coordinates, and is color-coded so that
higher values of yrep become more red, and lower values become more
yellow.
Spatial Uncertainty
requires Data
to be specified, and
also requires that the following variables exist in the data set with
exactly these names: latitude
and longitude
. This
spatial plot shows the probability interval of yrep plotted according to its coordinates, and is color-coded so that wider probability intervals become more red and narrower intervals become more yellow.
Time-Series
plots compare y with its replicate, including the
median and probability interval quantiles. This plot is appropriate
when y is univariate and ordered by time.
Time-Series, Multivariate, C
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each column-wise
time-series in Y with its replicate, including the median and
probability interval quantiles. This plot is appropriate when y is
multivariate and each time-series is indexed by column in Y.
Time-Series, Multivariate, R
requires Data
to be
specified, and also requires that variable Y
exist in the data
set with exactly that name. These plots compare each row-wise
time-series in Y with its replicate, including the median and
probability interval quantiles. This plot is appropriate when y is
multivariate and each time-series is indexed by row in Y, such as is
typically true in panel models.
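A minimal sketch of selecting a few of the styles above (assuming Pred is an object of class vb.ppc returned by predict.vb, and MyData is the data list used to update the model; both names are illustrative):
plot(Pred, Style="Density", Rows=1:9)        # posterior predictive densities for rows 1-9
plot(Pred, Style="Fitted")                   # y versus the probability interval of yrep
plot(Pred, Style="Covariates", Data=MyData)  # requires covariates named X or x in Data
plot(Pred, Style="Residuals", PDF=TRUE)      # save the plots to a .pdf file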
Statisticat, LLC. [email protected]
Durbin, J., and Watson, G.S. (1950). "Testing for Serial Correlation in Least Squares Regression, I." Biometrika, 37, p. 409–428.
Jarque, C.M. and Bera, A.K. (1980). "Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals". Economics Letters, 6(3), p. 255–259.
Mardia, K.V. (1970). "Measures of Multivariate Skewness and Kurtosis with Applications". Biometrika, 57(3), p. 519–530.
predict.vb and VariationalBayes.
### See the VariationalBayes function for an example.
This function plots a numerical matrix, and is often used to plot the following matrices: correlation, covariance, distance, and precision.
plotMatrix(x, col=colorRampPalette(c("red","black","green"))(100),
     cex=1, circle=TRUE, order=FALSE, zlim=NULL, title="", PDF=FALSE, ...)
x |
This required argument is a numerical matrix, or an object of class bayesfactor, demonoid, iterquad, laplace, pmc, posteriorchecks, or vb (see the Details section below). |
col |
This argument specifies the colors of the circles. By default, the colorRampPalette function creates 100 colors that range from red through black to green. |
cex |
When |
circle |
Logical. When |
order |
Logical. This argument defaults to |
zlim |
When |
title |
This argument specifies the title of the plot, and the
default does not include a title. When |
PDF |
Logical. When |
... |
Additional arguments are unused. |
The plotMatrix
function produces one of two styles of plots,
depending on the circle
argument. A
numeric matrix of
parameters or variables is plotted. The plot
is a matrix of the same dimensions, in which each element is colored
(and sized, when
circle=TRUE
) according to its value.
Although plotMatrix
does not provide the same detail as a
numeric matrix, it is easier to discover elements of interest
according to color (and size when circle=TRUE
).
The plotMatrix function is not inherently Bayesian, and does not include uncertainty in matrices. Nonetheless, it is included because it is a useful graphical presentation of numeric matrices, and it is recommended to be used with the posterior correlation matrix in an object of class posteriorchecks.
When x
is an object of class bayesfactor
, matrix
B
is plotted. When x
is an object of class
demonoid
(if it is a matrix), iterquad
, laplace
,
pmc
, or vb
, the covariance matrix Covar
is
plotted. When x
is an object of class posteriorchecks
,
the posterior correlation matrix is plotted.
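A minimal sketch of the recommended use with posterior checks (assuming Fit is an object of class demonoid returned by LaplacesDemon; the name is illustrative):
PC <- PosteriorChecks(Fit)
plotMatrix(PC, title="Posterior Correlation", order=TRUE)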
This is a modified version of the circle.corr
function
of Taiyun Wei.
Taiyun Wei
library(LaplacesDemon)
### Although it is most commonly used with an object of class
### posteriorchecks, it is applied here to a different correlation matrix.
data(mtcars)
plotMatrix(cor(mtcars),
     col=colorRampPalette(c("green","gray10","red"))(100),
     cex=1, circle=FALSE, order=TRUE)
plotMatrix(cor(mtcars),
     col=colorRampPalette(c("green","gray10","red"))(100),
     cex=1, circle=TRUE, order=TRUE)
This function provides basic plots that are extended to include samples.
plotSamples(X, Style="KDE", LB=0.025, UB=0.975, Title=NULL)
X |
This required argument is a N x S numerical matrix of N records and S samples. |
Style |
This argument accepts the following quoted strings: "barplot", "dotchart", "hist", "KDE", or "Time-Series". It defaults to "KDE". |
LB |
This argument accepts the lower bound of a probability interval, which must be in the interval [0,0.5). |
UB |
This argument accepts the upper bound of a probability interval, which must be in the interval (0.5,1]. |
Title |
This argument defaults to NULL, in which case a title is not included. |
The plotSamples
function extends several basic plots from
points to samples. For example, it is common to use the hist
function to plot a histogram from a column vector. However, the user
may desire to plot a histogram of a column vector that was sampled
numerous times, rather than a simple column vector, in which a
(usually 95%) probability interval is also plotted to show the
uncertainty around the sampled median of each bin in the histogram.
The plotSamples
function extends the barplot
,
dotchart
, and hist
functions to include uncertainty due
to samples. The KDE
style of plot is added so that a
probability interval is shown around a sampled kernel density estimate
of a distribution, and the Time-Series
style of plot is added
so that a probability interval is shown around a sampled univariate
time-series.
For each style of plot, three quantiles are plotted: the lower bound (LB), median, and upper bound (UB).
One of many potential Bayesian applications is to examine the uncertainty in a predictive distribution.
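A minimal sketch of that application (assuming Pred is a posterior predictive object, such as class demonoid.ppc, whose yhat component is an N x S matrix of N records by S samples; the names are illustrative):
plotSamples(Pred$yhat, Style="KDE", LB=0.025, UB=0.975,
     Title="Posterior Predictive Distribution")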
Statisticat, LLC. [email protected]
#library(LaplacesDemon)
#N <- 100
#S <- 100
#X <- matrix(rnorm(N*S),N,S)
#rownames(X) <- 1:100
#plotSamples(X, Style="barplot", LB=0.025, UB=0.975)
#plotSamples(X[1:10,], Style="dotchart", LB=0.025, UB=0.975)
#plotSamples(X, Style="hist", LB=0.025, UB=0.975)
#plotSamples(X, Style="KDE", LB=0.025, UB=0.975)
#plotSamples(X, Style="Time-Series", LB=0.025, UB=0.975)
The PMC
function updates a model with Population Monte Carlo.
Given a model specification, data, and initial values, PMC
maximizes the logarithm of the unnormalized joint
posterior density and provides samples of the marginal
posterior distributions, deviance, and other monitored variables.
PMC(Model, Data, Initial.Values, Covar=NULL, Iterations=10, Thinning=1,
     alpha=NULL, M=1, N=1000, nu=9, CPUs=1, Type="PSOCK")
Model |
This is a model specification function. For more
information, see |
Initial.Values |
This is either a vector of initial values, one for each of the K parameters, or, in the case of a mixture of M components, a M x K matrix of initial values. |
Data |
This is a list of data. For more information, see
|
Covar |
This is a |
Iterations |
This is the number of iterations during which PMC will update the model. Updating the model for only one iteration is the same as applying non-adaptive importance sampling. |
Thinning |
This is the number by which the posterior is
thinned. To have 1,000 posterior samples with |
alpha |
This is a vector of length M, the number of mixture components, containing the probability of each mixture component. It defaults to NULL. |
M |
This is the number M of mixture components. |
N |
This is the number N of un-thinned samples per mixture component. |
nu |
This is the degrees of freedom parameter, nu, of the multivariate t distribution used for each mixture component. |
CPUs |
This argument is required for parallel processing, and indicates the number of central processing units (CPUs) of the computer or cluster. For example, when a user has a quad-core computer, CPUs=4. |
Type |
This argument defaults to |
The PMC function uses the adaptive importance sampling algorithm of Wraith et al. (2009), also called Mixture PMC or M-PMC (Cappe et al., 2008). Iterative adaptive importance sampling was introduced in the 1980s. Modern PMC was introduced in Cappe et al. (2004), and extended to multivariate Gaussian or t-distributed mixtures in Cappe et al. (2008). This version uses a multivariate t distribution for each mixture component, and also allows a multivariate normal distribution when the degrees of freedom, nu, is sufficiently large. At each iteration, a mixture distribution is sampled with importance sampling, and the samples (or populations) are adapted to improve the importance sampling. Adaptation is a variant of EM (Expectation-Maximization). The sample is self-normalized, and is an example of self-normalized importance sampling (SNIS), or self-importance sampling. The vector alpha contains the probability of each mixture component. These, as well as the multivariate t distribution mixture parameters (except nu), are adapted at each iteration.
Advantages of PMC over MCMC include:
It is difficult to assess convergence of MCMC chains, and this is not necessary in PMC (Wraith et al., 2009).
MCMC chains have autocorrelation that effectively reduces posterior samples. PMC produces independent samples that are not reduced with autocorrelation.
PMC has been reported to produce samples with less variance than MCMC.
It is difficult to parallelize MCMC. Posterior samples from parallel chains can be pooled when all chains have converged, but until this occurs, parallelization is unhelpful. PMC, on the other hand, can parallelize the independent, Monte Carlo samples during each iteration and reduce run-time as the number of processors increases. Currently, PMC is not parallelized here.
The multivariate mixture in PMC can represent a multimodal posterior, where MCMC with parallel chains may be used to identify a multimodal posterior, but probably will not yield combined samples that proportionally represent it.
Disadvantages of PMC, compared to MCMC, include:
In PMC, the required number of samples at each iteration increases quickly with respect to an increase in parameters. MCMC is more suitable for models with large numbers of parameters, and therefore, MCMC is more generalizable.
PMC is more sensitive to initial values than MCMC, especially as the number of parameters increases.
PMC is more sensitive to the initial covariance matrix (or
matrices for mixture components) than adaptive MCMC. PMC requires more
information about the target distributions before updating. The
covariance matrix from a converged iterative quadrature algorithm,
Laplace Approximation, or Variational Bayes may be required (see
IterativeQuadrature
, LaplaceApproximation
,
or VariationalBayes
for more information).
Since PMC requires better initial information than iterative quadrature, Laplace Approximation, MCMC, and Variational Bayes, it is not recommended to begin updating a model that has little prior information with PMC, especially when the model has more than a few parameters. Instead, iterative quadrature, Laplace Approximation, MCMC, or Variational Bayes should be used. However, once convergence is found or assumed, it is recommended to attempt to update the model with PMC, given the latest parameters and covariance matrix from iterative quadrature, Laplace Approximation, MCMC, or Variational Bayes. Used in this way, PMC may improve the model fit obtained with MCMC and should reduce the variance of the marginal posterior distributions, which is desirable for predictive modeling.
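A minimal sketch of that workflow (assuming Model, MyData, and Initial.Values are defined as in the example below; the argument values are illustrative only):
Fit0 <- LaplaceApproximation(Model, Initial.Values, Data=MyData)
Fit  <- PMC(Model, MyData, as.initial.values(Fit0), Covar=Fit0$Covar,
     Iterations=10, Thinning=1, M=1, N=1000, CPUs=1)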
Convergence is assessed by observing two outputs: normalized effective
sample size (ESSN
) and normalized perplexity
(Perplexity
). These are described below. PMC is considered to
have converged when these diagnostics stabilize (Wraith et al., 2009),
or when the normalized perplexity becomes sufficiently close to 1
(Cappe et al., 2008). If they do not stabilize, then it is suggested
to begin PMC again with a larger number N
of samples, and
possibly with different initial values and covariance matrix or
matrices. IterativeQuadrature
,
LaplaceApproximation
, or VariationalBayes
may be helpful to provide better starting values for PMC
.
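A minimal sketch of monitoring these diagnostics (assuming Fit is an object of class pmc returned by PMC; the name is illustrative):
cbind(ESSN=Fit$ESSN, Perplexity=Fit$Perplexity)   # one row per iteration
# Stabilized values, and normalized perplexity near 1, suggest convergence.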
If a message appears that warns about ‘bad weights’, then PMC is attempting to work with an iteration in which importance weights are problematic. If this occurs in the first iteration, then all importance weights are set equal. If this occurs in other iterations, then the information from the previous iteration is used instead and different draws are made from that importance
distribution. This may allow
PMC
to eventually adapt
successfully to the target. If not, the user is advised to begin again
with a larger number of samples, and possibly different
initial values and covariance matrix or matrices, as above. PMC can
experience difficulty when it begins with poor initial conditions.
The user may combine samples from previous iterations with samples from the latest iteration for inference, if the algorithm converged before the last iteration. Currently, a function is not provided for combining previous samples.
The returned object is an object of class pmc
with the
following components:
alpha |
This is a |
Call |
This is the matched call of |
Covar |
This stores the |
Deviance |
This is a vector of the deviance of the model, with a length equal to the number of thinned samples that were retained. Deviance is useful for considering model fit, and is equal to the sum of the log-likelihood for all rows in the data set, which is then multiplied by negative two. |
DIC |
This is a vector of three values: Dbar, pD, and DIC. Dbar
is the mean deviance, pD is a measure of model complexity indicating
the effective number of parameters, and DIC is the Deviance
Information Criterion, which is a model fit statistic that is the
sum of Dbar and pD. |
ESSN |
This is a vector of length |
Initial.Values |
This is the vector or matrix of
|
Iterations |
This reports the number of |
LML |
This is an approximation of the logarithm of the marginal
likelihood of the data (see the |
M |
This reports the number of mixture components. |
Minutes |
This indicates the number of minutes that |
Model |
This contains the model specification |
N |
This is the number of un-thinned samples per mixture component. |
nu |
This is the degrees of freedom parameter for each multivariate t distribution in each mixture component. |
Mu |
This is a |
Monitor |
This is a |
Parameters |
This reports the number |
Perplexity |
This is a vector of length |
Posterior1 |
This is an |
Posterior2 |
This is a |
Summary |
This is a matrix that summarizes the marginal
posterior distributions of the parameters, deviance, and monitored
variables from thinned samples. The following summary statistics are
included: mean, standard deviation, MCSE (Monte Carlo Standard
Error), ESS is the effective sample size due to autocorrelation, and
finally the 2.5%, 50%, and 97.5% quantiles are reported. MCSE is
essentially a standard deviation around the marginal posterior mean
that is due to uncertainty associated with using Monte Carlo
sampling. The acceptable size of the MCSE depends on the acceptable
uncertainty associated around the marginal posterior mean. The
default |
Thinned.Samples |
This is the number of thinned samples in
|
Thinning |
This is the amount of thinning requested by the user. |
W |
This is a |
Statisticat, LLC. [email protected]
Cappe, O., Douc, R., Guillin, A., Marin, J.M., and Robert, C. (2008). "Adaptive Importance Sampling in General Mixture Classes". Statistics and Computing, 18, p. 587–600.
Cappe, O., Guillin, A., Marin, J.M., and Robert, C. (2004). "Population Monte Carlo". Journal of Computational and Graphical Statistics, 13, p. 907–929.
Gelman, A., Carlin, J., Stern, H., and Rubin, D. (2004). "Bayesian Data Analysis, Texts in Statistical Science, 2nd ed.". Chapman and Hall, London.
Wraith, D., Kilbinger, M., Benabed, K., Cappe, O., Cardoso, J.F., Fort, G., Prunet, S., and Robert, C.P. (2009). "Estimation of Cosmological Parameters Using Adaptive Importance Sampling". Physical Review D, 80(2), p. 023507.
BayesFactor, IterativeQuadrature, LaplaceApproximation, LML, PMC.RAM, Thin, and VariationalBayes.
# The accompanying Examples vignette is a compendium of examples.
#################### Load the LaplacesDemon Library #####################
library(LaplacesDemon)

############################## Demon Data ###############################
data(demonsnacks)
y <- log(demonsnacks$Calories)
X <- cbind(1, as.matrix(log(demonsnacks[,c(1,4,10)]+1)))
J <- ncol(X)
for (j in 2:J) X[,j] <- CenterScale(X[,j])

######################### Data List Preparation #########################
mon.names <- "LP"
parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0))
pos.beta <- grep("beta", parm.names)
pos.sigma <- grep("sigma", parm.names)
PGF <- function(Data) {
     beta <- rnorm(Data$J)
     sigma <- runif(1)
     return(c(beta, sigma))
     }
MyData <- list(J=J, PGF=PGF, X=X, mon.names=mon.names,
     parm.names=parm.names, pos.beta=pos.beta, pos.sigma=pos.sigma, y=y)

########################## Model Specification ##########################
Model <- function(parm, Data)
     {
     ### Parameters
     beta <- parm[Data$pos.beta]
     sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf)
     parm[Data$pos.sigma] <- sigma
     ### Log-Prior
     beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE))
     sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE)
     ### Log-Likelihood
     mu <- tcrossprod(Data$X, t(beta))
     LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE))
     ### Log-Posterior
     LP <- LL + beta.prior + sigma.prior
     Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP,
          yhat=rnorm(length(mu), mu, sigma), parm=parm)
     return(Modelout)
     }

set.seed(666)

############################ Initial Values #############################
Initial.Values <- GIV(Model, MyData, PGF=TRUE)

######################## Population Monte Carlo #########################
Fit <- PMC(Model, MyData, Initial.Values, Covar=NULL, Iterations=5,
     Thinning=1, alpha=NULL, M=1, N=100, CPUs=1)
Fit
print(Fit)
PosteriorChecks(Fit)
caterpillar.plot(Fit, Parms="beta")
plot(Fit, BurnIn=0, MyData, PDF=FALSE)
Pred <- predict(Fit, Model, MyData, CPUs=1)
summary(Pred, Discrep="Chi-Square")
plot(Pred, Style="Covariates", Data=MyData)
plot(Pred, Style="Density", Rows=1:9)
plot(Pred, Style="ECDF")
plot(Pred, Style="Fitted")
plot(Pred, Style="Jarque-Bera")
plot(Pred, Style="Predictive Quantiles")
plot(Pred, Style="Residual Density")
plot(Pred, Style="Residuals")
Levene.Test(Pred)
Importance(Fit, Model, MyData, Discrep="Chi-Square")
#End
This function estimates the random-access memory (RAM) required to
update a given model and data with PMC
.
Warning: Unwise use of this function may crash a computer, so please read the details below.
PMC.RAM(Model, Data, Iterations, Thinning, M, N)
Model |
This is a model specification function. For more
information, see |
Data |
This is a list of Data. For more information, see
|
Iterations |
This is the number of iterations for which PMC would update the model. |
Thinning |
This is the amount of thinning applied to the samples in PMC. |
M |
This is the number of mixture components in PMC. |
N |
This is the number of samples in PMC. |
The PMC.RAM
function uses the
object.size
function to estimate the size in MB of RAM
required to update in PMC
for a given model and data,
and for a number of iterations and specified thinning. When RAM is
exceeded, the computer will crash. This function can be useful when
trying to estimate how many samples and iterations to update a
model without crashing the computer. However, when estimating the
required RAM, PMC.RAM
actually creates several large
objects, such as post
(see below). If too many iterations are
given as an argument to PMC.RAM
, for example, then it
will crash the computer while trying to estimate the required RAM.
The best way to use this function is as follows. First, prepare the model specification and list of data. Second, observe how much RAM the computer is using at the moment, as well as the maximum available RAM. The majority of the difference between these two is the amount of RAM the computer may dedicate to updating the model. Next, use this function with a small number of iterations and note the estimated RAM. Increase the number of iterations, and again note the RAM. Continue to increase the number of iterations until the estimate approaches, say, 90% of the available RAM noted above.
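A minimal sketch of that procedure (assuming Model and MyData are already defined; the iteration counts and mixture settings are illustrative only):
PMC.RAM(Model, MyData, Iterations=10, Thinning=1, M=2, N=1000)$Total
PMC.RAM(Model, MyData, Iterations=20, Thinning=1, M=2, N=1000)$Total
# Continue increasing Iterations (or N) while the Total estimate remains
# comfortably below the RAM available to R.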
The computer operating system uses RAM, as does any other software
running at the moment. R is currently using RAM, and other functions
in the LaplacesDemon
package, and any other package that is
currently activated, are using RAM. There are numerous small objects that are not included in the returned list but that use RAM; for example, perplexity is a small vector.
A potentially large object that is not included is a matrix used for estimating LML.
PMC.RAM
returns a list with several components. Each component
is an estimate in MB for an object. The list has the following
components:
alpha |
This is the estimated size in MB of RAM required for the matrix of mixture probabilities by iteration. |
Covar |
This is the estimated size in MB of RAM required for the covariance matrix or matrices. |
Data |
This is the estimated size in MB of RAM required for the list of data. |
Deviance |
This is the estimated size in MB of RAM required for the deviance vector before thinning. |
Initial.Values |
This is the estimated size in MB of RAM required for the matrix or vector of initial values. |
LH |
This is the estimated size in MB of RAM required for the
|
LP |
This is the estimated size in MB of RAM required for the
|
Model |
This is the estimated size in MB of RAM required for the model specification function. |
Monitor |
This is the estimated size in MB of RAM required for
the |
Posterior1 |
This is the estimated size in MB of RAM required for
the |
Posterior2 |
This is the estimated size in MB of RAM required for
the |
Summary |
This is the estimated size in MB of RAM required for the summary table. |
W |
This is the estimated size in MB of RAM required for the matrix of importance weights. |
Total |
This is the estimated size in MB of RAM required in total
to update with |
Statisticat, LLC. [email protected]
BigData, LML, object.size, and PMC.
Not to be confused with posterior predictive checks, this function provides additional information about the marginal posterior distributions of continuous parameters, such as the probability that each posterior coefficient of the parameters (referred to generically as theta) is greater than zero [p(theta > 0)], the estimated number of modes, the kurtosis and skewness of the posterior distributions, the burn-in of each chain (for MCMC only), integrated autocorrelation time, independent samples per minute, and acceptance rate. A posterior correlation matrix is provided only for objects of class demonoid or pmc.
For discrete parameters, see the Hangartner.Diagnostic function.
PosteriorChecks(x, Parms)
x |
This required argument accepts an object of class
|
Parms |
This argument accepts a vector of quoted strings to be matched for selecting parameters. This argument defaults to NULL, in which case all parameters are selected. |
PosteriorChecks
is a supplemental function that returns
a list with two components. Following is a summary of popular uses of
the PosteriorChecks
function.
First (and only for MCMC users), the user may be considering the
current MCMC algorithm versus others. In this case, the
PosteriorChecks
function is often used to find the two MCMC
chains with the highest IAT
, and these chains are
studied for non-randomness with a joint trace plot, via the
joint.density.plot
function. The best algorithm has the
chains with the highest independent samples per minute (ISM).
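A minimal sketch of that use (assuming Fit is an object of class demonoid returned by LaplacesDemon, with a Posterior1 matrix of thinned samples; the names IATs and worst are illustrative):
PC    <- PosteriorChecks(Fit)
IATs  <- PC$Posterior.Summary[,"IAT"]
worst <- names(sort(IATs, decreasing=TRUE))[1:2]   # the two chains with the highest IAT
joint.density.plot(Fit$Posterior1[,worst[1]], Fit$Posterior1[,worst[2]])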
Posterior correlation may be studied between model updates as well as
after a model seems to have converged. While frequentists consider
multicollinear predictor variables, Bayesians tend to consider
posterior correlation of the parameters. Models with multicollinear
parameters take more iterations to converge. Hierarchical models often
have high posterior correlations. Posterior correlation often
contributes to a lower effective sample size (ESS
).
Common remedies include transforming the predictors,
re-parameterization to reduce posterior correlation, using WIPs
(Weakly-Informative Priors), or selecting a different numerical
approximation algorithm. An example of re-parameterization is to
constrain related parameters to sum to zero. Another approach is to
specify the parameters according to a multivariate distribution that
is assisted by estimating a covariance matrix. Some algorithms are
more robust to posterior correlation than others. For example,
posterior correlation should generally be less problematic for twalk
than AMWG in LaplacesDemon
. Posterior correlation may be
plotted with the plotMatrix
function, and may be useful
for blocking parameters. For more information on blockwise sampling,
see the Blocks
function.
After a user is convinced of the applicability of the current MCMC
algorithm, and that the chains have converged, PosteriorChecks
is often used to identify multimodal marginal posterior distributions
for further study or model re-specification.
Although many marginal posterior distributions appear normally distributed, there is no such assumption. Nonetheless, a marginal posterior distribution tends to be distributed the same as its prior distribution. If a parameter has a prior specified with a Laplace distribution, then the marginal posterior distribution tends also to be Laplace-distributed. In the common case of normality, kurtosis and skewness may be used to identify discrepancies between the prior and posterior, and perhaps this should be called a 'prior-posterior check'.
Lastly, parameter importance may be considered, in which case it is
recommended to be considered simultaneously with variable importance
from the Importance
function.
PosteriorChecks
returns an object of class
posteriorchecks
that is a list with the following components:
Posterior.Correlation |
This is a correlation matrix of the parameters selected with the Parms argument. |
Posterior.Summary |
This is a matrix in which each row is a parameter and there are eight columns: p(theta > 0), N.Modes, Kurtosis, Skewness, Burn-In, IAT, ISM, and AR. The first column, p(theta > 0), indicates parameter importance by reporting how much of the distribution is greater than zero. An important parameter distribution will have a result at least as extreme as 0.025 or 0.975, and an unimportant parameter distribution is centered at 0.5. This is not the importance of the associated variable relative to how well the model fits the data. For variable importance, see the Importance function. |
Statisticat, LLC. [email protected]
AcceptanceRate, Blocks, burnin, ESS, Hangartner.Diagnostic, joint.density.plot, IAT, Importance, IterativeQuadrature, LaplaceApproximation, LaplacesDemon, Modes, plotMatrix, PMC, and VariationalBayes.
### See the LaplacesDemon function for an example.
Bayesians often use precision rather than variance. These are elementary
utility functions to facilitate conversions between precision,
standard deviation, and variance regarding scalars, vectors, and
matrices, and these functions are designed for those who are new to
Bayesian inference. The names of these functions consist of two
different scale parameters, separated by a ‘2’, and capital letters
refer to matrices while lower case letters refer to scalars and
vectors. For example, the Prec2Cov
function converts a
precision matrix to a covariance matrix, while the prec2sd
function converts a scalar or vector of precision parameters to
standard deviation parameters.
The modern Bayesian use of precision developed because it was more straightforward in a normal distribution to estimate the precision, tau, with a gamma distribution as a conjugate prior, than to estimate the variance, sigma^2, with an inverse-gamma distribution as a conjugate prior. Today, conjugacy is usually considered to be merely a convenience, and in this example, a non-conjugate half-Cauchy prior distribution is recommended as a weakly informative prior distribution for scale parameters.
Cov2Prec(Cov)
Prec2Cov(Prec)
prec2sd(prec=1)
prec2var(prec=1)
sd2prec(sd=1)
sd2var(sd=1)
var2prec(var=1)
var2sd(var=1)
Cov |
This is a covariance matrix, usually represented as Sigma. |
Prec |
This is a precision matrix, usually represented as Omega. |
prec |
This is a precision scalar or vector, usually represented as tau. |
sd |
This is a standard deviation scalar or vector, usually represented as sigma. |
var |
This is a variance scalar or vector, usually represented as sigma^2. |
Bayesians often use precision rather than variance, where precision is the inverse of the variance. For example, a linear regression may be represented equivalently as y ~ N(mu, sigma^2) or as y ~ N(mu, tau^(-1)), where sigma^2 is the variance and tau is the precision, which is the inverse of the variance.
Cov2Prec |
This returns a precision matrix, Omega, from a covariance matrix, Sigma. |
Prec2Cov |
This returns a covariance matrix, Sigma, from a precision matrix, Omega. |
prec2sd |
This returns a standard deviation, sigma, from a precision, tau. |
prec2var |
This returns a variance, sigma^2, from a precision, tau. |
sd2prec |
This returns a precision, tau, from a standard deviation, sigma. |
sd2var |
This returns a variance, sigma^2, from a standard deviation, sigma. |
var2prec |
This returns a precision, tau, from a variance, sigma^2. |
var2sd |
This returns a standard deviation, sigma, from a variance, sigma^2. |
Statisticat, LLC. [email protected]
library(LaplacesDemon)
Cov2Prec(matrix(c(1,0.1,0.1,1),2,2))
Prec2Cov(matrix(c(1,0.1,0.1,1),2,2))
prec2sd(0.5)
prec2var(0.5)
sd2prec(1.4142)
sd2var(1.4142)
var2prec(2)
var2sd(2)
This may be used to predict either new, unobserved instances of y (called y^new) or replicates of y (called y^rep), and then perform posterior predictive checks. Either y^new or y^rep is predicted given an object of class demonoid, the model specification, and data.
## S3 method for class 'demonoid'
predict(object, Model, Data, CPUs=1, Type="PSOCK", ...)
object |
An object of class demonoid is required. |
Model |
The model specification function is required. |
Data |
A data set in a list is required. The dependent variable is required to be named either y or Y. |
CPUs |
This argument accepts an integer that specifies the number of central processing units (CPUs) of the multicore computer or computer cluster. This argument defaults to CPUs=1, in which case parallel processing does not occur. |
Type |
This argument specifies the type of parallel processing to perform, accepting either Type="PSOCK" or Type="MPI". |
... |
Additional arguments are unused. |
This function passes each iteration of marginal posterior samples along with data to Model, where the fourth component in the returned list is labeled yhat, and is a vector of expectations of y, given the samples, model specification, and data. Stationary samples are used if detected, otherwise non-stationary samples will be used. To predict y^rep, simply supply the data set used to estimate the model. To predict y^new, supply a new data set instead (though for some model specifications this cannot be done, and y^new must be specified in the Model function). If the new data set does not have y, then create y in the list and set it equal to something sensible, such as mean(y) from the original data set.
The variable y
must be a vector. If instead it is matrix
Y
, then it will be converted to vector y
. The vectorized
length of y
or Y
must be equal to the vectorized length
of yhat
, the fourth component of the return list of the
Model
function.
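A minimal sketch of both uses (assuming Fit is an object of class demonoid, Model and MyData are the model specification and data list used to update it, and NewX is a hypothetical matrix of new covariates; all names are illustrative):
Pred <- predict(Fit, Model, MyData)            # replicates, yrep
NewData   <- MyData                            # new instances, ynew
NewData$X <- NewX                              # NewX: hypothetical new covariates
NewData$y <- rep(mean(MyData$y), nrow(NewX))   # placeholder y of the correct length
PredNew <- predict(Fit, Model, NewData)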
Parallel processing may be performed when the user specifies
CPUs
to be greater than one, implying that the specified number
of CPUs exists and is available. Parallelization may be performed on a
multicore computer or a computer cluster. Either a Simple Network of
Workstations (SNOW) or Message Passing Interface is used (MPI). With
small data sets and few samples, parallel processing may be slower,
due to computer network communication. With larger data sets and more
samples, the user should experience a faster run-time.
For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.
This function returns an object of class demonoid.ppc
(where
ppc stands for posterior predictive checks). The returned object is
a list with the following components:
y |
This stores the vectorized form of |
yhat |
This is a |
Deviance |
This is a vector of predictive deviance. |
Statisticat, LLC.
This may be used to predict either new, unobserved instances of y (called y^new) or replicates of y (called y^rep), and then perform posterior predictive checks. Either y^new or y^rep is predicted given an object of class iterquad, the model specification, and data. This function requires that posterior samples were produced with IterativeQuadrature.
## S3 method for class 'iterquad'
predict(object, Model, Data, CPUs=1, Type="PSOCK", ...)
object |
An object of class |
Model |
The model specification function is required. |
Data |
A data set in a list is required. The dependent
variable is required to be named either |
CPUs |
This argument accepts an integer that specifies the number
of central processing units (CPUs) of the multicore computer or
computer cluster. This argument defaults to |
Type |
This argument specifies the type of parallel processing to
perform, accepting either |
... |
Additional arguments are unused. |
Since iterative quadrature characterizes marginal posterior
distributions with means and variances, and posterior predictive
checks involve samples, the predict.iterquad
function requires
the use of independent samples of the marginal posterior
distributions, provided by IterativeQuadrature
when
sir=TRUE
.
The samples of the marginal posterior distributions of the target distributions (the parameters) are passed along with the data to the Model specification and used to draw samples from the deviance and monitored variables. At the same time, the fourth component in the returned list, which is labeled yhat, is a vector of expectations of y, given the samples, model specification, and data. To predict y^rep, simply supply the data set used to estimate the model. To predict y^new, supply a new data set instead (though for some model specifications this cannot be done, and y^new must be specified in the Model function). If the new data set does not have y, then create y in the list and set it equal to something sensible, such as mean(y) from the original data set.
The variable y
must be a vector. If instead it is matrix
Y
, then it will be converted to vector y
. The vectorized
length of y
or Y
must be equal to the vectorized length
of yhat
, the fourth component of the returned list of the
Model
function.
Parallel processing may be performed when the user specifies
CPUs
to be greater than one, implying that the specified number
of CPUs exists and is available. Parallelization may be performed on a
multicore computer or a computer cluster. Either a Simple Network of
Workstations (SNOW) or Message Passing Interface is used (MPI). With
small data sets and few samples, parallel processing may be slower,
due to computer network communication. With larger data sets and more
samples, the user should experience a faster run-time.
For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.
This function returns an object of class iterquad.ppc
(where
“ppc” stands for posterior predictive checks). The returned object
is a list with the following components:
y |
This stores |
yhat |
This is a |
Deviance |
This is a vector of length |
monitor |
This is a |
Statisticat, LLC.
IterativeQuadrature and SIR.
This may be used to predict either new, unobserved instances of y (called y^new) or replicates of y (called y^rep), and then perform posterior predictive checks. Either y^new or y^rep is predicted given an object of class laplace, the model specification, and data. This function requires that posterior samples were produced with LaplaceApproximation.
## S3 method for class 'laplace'
predict(object, Model, Data, CPUs=1, Type="PSOCK", ...)
object |
An object of class |
Model |
The model specification function is required. |
Data |
A data set in a list is required. The dependent
variable is required to be named either |
CPUs |
This argument accepts an integer that specifies the number
of central processing units (CPUs) of the multicore computer or
computer cluster. This argument defaults to |
Type |
This argument specifies the type of parallel processing to
perform, accepting either |
... |
Additional arguments are unused. |
Since Laplace Approximation characterizes marginal posterior
distributions with modes and variances, and posterior predictive
checks involve samples, the predict.laplace
function requires
the use of independent samples of the marginal posterior
distributions, provided by LaplaceApproximation
when
sir=TRUE
.
The samples of the marginal posterior distributions of the target distributions (the parameters) are passed along with the data to the Model specification and used to draw samples from the deviance and monitored variables. At the same time, the fourth component in the returned list, which is labeled yhat, is a vector of expectations of y, given the samples, model specification, and data. To predict y^rep, simply supply the data set used to estimate the model. To predict y^new, supply a new data set instead (though for some model specifications this cannot be done, and y^new must be specified in the Model function). If the new data set does not have y, then create y in the list and set it equal to something sensible, such as mean(y) from the original data set.
The variable y
must be a vector. If instead it is matrix
Y
, then it will be converted to vector y
. The vectorized
length of y
or Y
must be equal to the vectorized length
of yhat
, the fourth component of the returned list of the
Model
function.
Parallel processing may be performed when the user specifies
CPUs
to be greater than one, implying that the specified number
of CPUs exists and is available. Parallelization may be performed on a
multicore computer or a computer cluster. Either a Simple Network of
Workstations (SNOW) or Message Passing Interface is used (MPI). With
small data sets and few samples, parallel processing may be slower,
due to computer network communication. With larger data sets and more
samples, the user should experience a faster run-time.
For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.
This function returns an object of class laplace.ppc
(where
“ppc” stands for posterior predictive checks). The returned object
is a list with the following components:
y |
This stores |
yhat |
This is a |
Deviance |
This is a vector of length |
monitor |
This is a |
Statisticat, LLC.
LaplaceApproximation and SIR.
This may be used to predict either new, unobserved instances of y (called y^new) or replicates of y (called y^rep), and then perform posterior predictive checks. Either y^new or y^rep is predicted given an object of class pmc, the model specification, and data.
## S3 method for class 'pmc'
predict(object, Model, Data, CPUs=1, Type="PSOCK", ...)
object |
An object of class |
Model |
The model specification function is required. |
Data |
A data set in a list is required. The dependent variable
is required to be named either |
CPUs |
This argument accepts an integer that specifies the number
of central processing units (CPUs) of the multicore computer or
computer cluster. This argument defaults to |
Type |
This argument specifies the type of parallel processing to
perform, accepting either |
... |
Additional arguments are unused. |
This function passes each iteration of marginal posterior samples along with data to Model, where the fourth component in the returned list is labeled yhat, and is a vector of expectations of y, given the samples, model specification, and data. Stationary samples are used if detected, otherwise non-stationary samples will be used. To predict y^rep, simply supply the data set used to estimate the model. To predict y^new, supply a new data set instead (though for some model specifications this cannot be done, and y^new must be specified in the Model function). If the new data set does not have y, then create y in the list and set it equal to something sensible, such as mean(y) from the original data set.
The variable y
must be a vector. If instead it is matrix
Y
, then it will be converted to vector y
. The vectorized
length of y
or Y
must be equal to the vectorized length
of yhat
, the fourth component of the return list of the
Model
function.
Parallel processing may be performed when the user specifies
CPUs
to be greater than one, implying that the specified number
of CPUs exists and is available. Parallelization may be performed on a
multicore computer or a computer cluster. Either a Simple Network of
Workstations (SNOW) or Message Passing Interface is used (MPI). With
small data sets and few samples, parallel processing may be slower,
due to computer network communication. With larger data sets and more
samples, the user should experience a faster run-time.
For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.
This function returns an object of class pmc.ppc
(where
ppc stands for posterior predictive checks). The returned object is
a list with the following components:
y |
This stores the vectorized form of |
yhat |
This is a |
Deviance |
This is a vector of predictive deviance. |
Statisticat, LLC.
This may be used to predict either new, unobserved instances of y (called y^new) or replicates of y (called y^rep), and then perform posterior predictive checks. Either y^new or y^rep is predicted given an object of class vb, the model specification, and data. This function requires that posterior samples were produced with VariationalBayes.
## S3 method for class 'vb'
predict(object, Model, Data, CPUs=1, Type="PSOCK", ...)
object |
An object of class vb is required. |
Model |
The model specification function is required. |
Data |
A data set in a list is required. The dependent variable is required to be named either y or Y. |
CPUs |
This argument accepts an integer that specifies the number of central processing units (CPUs) of the multicore computer or computer cluster. This argument defaults to CPUs=1, in which case parallel processing does not occur. |
Type |
This argument specifies the type of parallel processing to perform, accepting either Type="PSOCK" or Type="MPI". |
... |
Additional arguments are unused. |
Since Variational Bayes characterizes marginal posterior
distributions with modes and variances, and posterior predictive
checks involve samples, the predict.vb
function requires
the use of independent samples of the marginal posterior
distributions, provided by VariationalBayes
when
sir=TRUE
.
The samples of the marginal posterior distributions of the target
distributions (the parameters) are passed along with the data to the
Model specification and used to draw samples from the deviance
and monitored variables. At the same time, the fourth component in the
returned list, which is labeled yhat, is a vector of
expectations of y, given the samples, model
specification, and data. To predict y^rep,
simply supply the data set used to estimate the model. To predict
y^new, supply a new data set instead (though
for some model specifications, this cannot be done, and
y^new must be specified in the
Model function). If the new data set does not have y, then
create y in the list and set it equal to something sensible,
such as mean(y) from the original data set.
The variable y
must be a vector. If instead it is matrix
Y
, then it will be converted to vector y
. The vectorized
length of y
or Y
must be equal to the vectorized length
of yhat
, the fourth component of the returned list of the
Model
function.
Parallel processing may be performed when the user specifies
CPUs
to be greater than one, implying that the specified number
of CPUs exists and is available. Parallelization may be performed on a
multicore computer or a computer cluster. Either a Simple Network of
Workstations (SNOW) or Message Passing Interface (MPI) is used. With
small data sets and few samples, parallel processing may be slower,
due to computer network communication. With larger data sets and more
samples, the user should experience a faster run-time.
For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.
This function returns an object of class vb.ppc
(where
“ppc” stands for posterior predictive checks). The returned object
is a list with the following components:
y |
This stores the vectorized form of y, the dependent variable. |
yhat |
This is a N x S matrix of replicates or predictions, where N is the number of records of y and S is the number of posterior samples. |
Deviance |
This is a vector of predictive deviance, with one element per posterior sample. |
monitor |
This is a matrix of samples of the monitored variables. |
Statisticat, LLC.
SIR
and
VariationalBayes
.
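For example, a typical call looks like the following sketch (Fit, MyModel, MyData, and Initial.Values are hypothetical names; Fit is assumed to be a vb object produced by VariationalBayes with sir=TRUE, as required above):
#library(LaplacesDemon)
#Fit <- VariationalBayes(MyModel, parm=Initial.Values, Data=MyData, sir=TRUE)
#Pred <- predict(Fit, Model=MyModel, Data=MyData)   ### y^rep from the original data
#summary(Pred)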
This may be used to print the contents of an object of class demonoid to the screen.
## S3 method for class 'demonoid' print(x, ...)
x |
An object of class demonoid is required. |
... |
Additional arguments are unused. |
If the user has an object of class demonoid.hpc
, then the
print
function may still be used by specifying the chain as a
component in a list, such as printing the second chain with
print(Fit[[2]])
when the demonoid.hpc
object is named
Fit
, for example.
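For example (Fit is a hypothetical demonoid.hpc object):
#print(Fit[[2]])   ### print the second chain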
Statisticat, LLC. [email protected]
Consort
,
LaplacesDemon
, and
LaplacesDemon.hpc
.
### See the LaplacesDemon function for an example.
This may be used to print the contents of an object of class heidelberger to the screen.
## S3 method for class 'heidelberger' print(x, digits=3, ...)
x |
An object of class heidelberger is required. |
digits |
This is the number of digits to print. |
... |
Additional arguments are unused. |
Statisticat, LLC. [email protected]
### See the Heidelberger.Diagnostic function for an example.
This may be used to print the contents of an object of class iterquad to the screen.
## S3 method for class 'iterquad' print(x, ...)
x |
An object of class iterquad is required. |
... |
Additional arguments are unused. |
Statisticat, LLC. [email protected]
### See the IterativeQuadrature function for an example.
This may be used to print the contents of an object of class laplace to the screen.
## S3 method for class 'laplace' print(x, ...)
x |
An object of class laplace is required. |
... |
Additional arguments are unused. |
Statisticat, LLC. [email protected]
### See the LaplaceApproximation function for an example.
This may be used to print the contents of an object of class miss to the screen.
## S3 method for class 'miss' print(x, ...)
x |
An object of class miss is required. |
... |
Additional arguments are unused. |
Statisticat, LLC. [email protected]
MISS
.
### See the MISS function for an example.
This may be used to print the contents of an object of class pmc to the screen.
## S3 method for class 'pmc' print(x, ...)
x |
An object of class pmc is required. |
... |
Additional arguments are unused. |
Statisticat, LLC. [email protected]
PMC
.
### See the PMC function for an example.
This may be used to print the contents of an object of class raftery to the screen.
## S3 method for class 'raftery' print(x, digits=3, ...)
x |
An object of class raftery is required. |
digits |
This is the number of digits to print. |
... |
Additional arguments are unused. |
Statisticat, LLC. [email protected]
### See the Raftery.Diagnostic function for an example.
This may be used to print the contents of an object of class vb to the screen.
## S3 method for class 'vb' print(x, ...)
x |
An object of class vb is required. |
... |
Additional arguments are unused. |
Statisticat, LLC. [email protected]
### See the VariationalBayes function for an example.
Raftery and Lewis (1992) introduced an MCMC diagnostic that estimates the number of iterations needed for a given level of precision in posterior samples, as well as estimating burn-in, when quantiles are the posterior summaries of interest.
Raftery.Diagnostic(x, q=0.025, r=0.005, s=0.95, eps=0.001)
x |
This required argument accepts an object of class demonoid. |
q |
This is the quantile to be estimated. |
r |
This is the desired margin of error of the estimate, also called the accuracy. |
s |
This is the probability of obtaining an estimate in the interval (q-r, q+r). |
eps |
This is the precision required for the estimate of time to convergence. |
In this MCMC diagnostic, a posterior quantile q of interest is
specified. Next, an acceptable tolerance r is specified for
q, which means that it is desired to measure q with an
accuracy of +/- r. Finally, the user selects a probability
s, which is the probability of being within the interval
(q-r, q+r). The Raftery.Diagnostic then estimates the
number N of iterations and the number M of burn-in
iterations that are necessary to satisfy the specified conditions
regarding quantile q.
The diagnostic was designed to test a short, initial update, in which the chains were called pilot chains, and the application was later suggested for iterative use after any update as a general method for pursuing convergence (Raftery and Lewis, 1996).
Results of the Raftery.Diagnostic differ depending on the
chosen quantile q. Estimates are conservative, so more
iterations are suggested than necessary.
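As a quick illustration of how q, r, and s interact, the standard Raftery-Lewis lower bound on the required number of iterations for independent samples can be computed directly (this is only the textbook minimum, shown here with the default arguments; it is not the full diagnostic):
q <- 0.025; r <- 0.005; s <- 0.95
Nmin <- ceiling(qnorm((s + 1) / 2)^2 * q * (1 - q) / r^2)
Nmin   ### 3746 with the default arguments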
The Raftery.Diagnostic function returns an object of class
raftery that is a list. A print method is available for objects
of this class. The list has the following components:
tspar |
These are the time-series parameters of the posterior
samples in x. |
params |
This is a vector containing the parameters q, r, and s. |
Niters |
This is the number of iterations in the posterior
samples in x. |
resmatrix |
This is a 3-dimensional array containing the
results: M is the estimated number of burn-in iterations, N is the
estimated number of iterations, Nmin is the minimum number of
iterations that would be required if the samples were independent,
and I = (M+N)/Nmin is the dependence factor. |
The Raftery.Diagnostic
function was adapted from the
raftery.diag
function in the coda package, which was adapted
from the FORTRAN program ‘gibbsit’, written by Steven Lewis.
Raftery, A.E. and Lewis, S.M. (1992). "How Many Iterations in the Gibbs Sampler?" In Bayesian Statistics, 4 (J.M. Bernardo, J.O. Berger, A.P. Dawid and A.F.M. Smith, eds.). Oxford, U.K.: Oxford University Press, p. 763–773.
Raftery, A.E. and Lewis, S.M. (1992). "One Long Run with Diagnostics: Implementation Strategies for Markov chain Monte Carlo". Statistical Science, 7, p. 493–497.
Raftery, A.E. and Lewis, S.M. (1996). "Implementing MCMC". In Markov Chain Monte Carlo in Practice (W.R. Gilks, D.J. Spiegelhalter and S. Richardson, eds.). Chapman and Hall: Boca Raton, FL.
burnin
,
LaplacesDemon
,
print.raftery
, and
Thin
.
#library(LaplacesDemon)
### After updating with LaplacesDemon, do:
#rd <- Raftery.Diagnostic(Fit)
#print(rd)
The RejectionSampling
function implements rejection sampling
of a target density given a proposal density.
RejectionSampling(Model, Data, mu, S, df=Inf, logc, n=1000, CPUs=1, Type="PSOCK")
Model |
This is a model specification function. For more
information, see LaplacesDemon. |
Data |
This is a list of data. For more information, see
LaplacesDemon. |
mu |
This is a mean vector, mu, for the multivariate normal or multivariate t proposal density. |
S |
This is a covariance matrix, S, for the multivariate normal or multivariate t proposal density. |
df |
This is a scalar degrees of freedom parameter. It defaults to
df=Inf, in which case the multivariate normal proposal density is used rather than the multivariate t. |
logc |
This is the logarithm of the rejection sampling constant. |
n |
This is the number of independent draws to be simulated from the proposal density. |
CPUs |
This argument accepts an integer that specifies the number
of central processing units (CPUs) of the multicore computer or
computer cluster. This argument defaults to CPUs=1. |
Type |
This argument specifies the type of parallel processing to
perform, accepting either Type="PSOCK" or Type="MPI". |
Rejection sampling (von Neumann, 1951) is a Monte Carlo method for
drawing independent samples from a distribution that is proportional
to the target distribution, f(theta|y), given a sampling distribution,
g(theta), from which samples can readily be drawn, and for which
there is a finite constant c.
Here, the target distribution, f(theta|y), is the result of the
Model function. The sampling distribution, g(theta), is
either a multivariate normal or multivariate t-distribution. The
parameters of g(theta) (mu, S, and df) are used
to create random draws, theta, of the sampling
distribution, g(theta). These draws, theta, are used
to evaluate the target distribution, f(theta|y), via the Model
specification function. The evaluations of the target distribution,
sampling distribution, and the constant c are used to create a
probability of acceptance for each draw, by comparing to a vector of
uniform draws, u. Each draw, theta, is
accepted if u <= f(theta|y) / [c g(theta)].
Before beginning rejection sampling, a goal of the user is to find the
bounding constant, c, such that f(theta|y) <= c g(theta)
for all theta. These are all expressed in logarithms, so the
goal is to find log[f(theta|y)] <= log(c) + log[g(theta)]
for all theta. This is done by maximizing
log[f(theta|y)] - log[g(theta)] over all theta. By using, say,
LaplaceApproximation to find the modes of the
parameters of interest, and using the resultant LP, the mode
of the logarithm of the joint posterior distribution, as
log(c).
The RejectionSampling function performs one iteration of
rejection sampling. Rejection sampling is often iterated, then called
the rejection sampling algorithm, until a sufficient number or
proportion of theta draws is accepted. An efficient
rejection sampling algorithm has a high acceptance rate. However,
rejection sampling becomes less efficient as the model dimension (the
number of parameters) increases.
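As a toy illustration of the acceptance step described above, the following minimal sketch uses a univariate standard normal target and a Student t proposal; it is not the package's RejectionSampling implementation, and all names here are local to the example:
set.seed(1)
log.f <- function(theta) dnorm(theta, 0, 1, log=TRUE)   ### log of the target density
log.g <- function(theta) dt(theta, df=3, log=TRUE)       ### log of the proposal density
### Bounding constant: maximize log f - log g over theta
logc <- optimize(function(x) log.f(x) - log.g(x), c(-10, 10), maximum=TRUE)$objective
theta <- rt(1000, df=3)                                   ### draws from the proposal
u <- runif(1000)
accepted <- theta[log(u) <= log.f(theta) - logc - log.g(theta)]
length(accepted) / 1000                                   ### acceptance rate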
Extensions of rejection sampling include Adaptive Rejection Sampling (ARS) (either derivative-based or derivative-free) and Adaptive Rejection Metropolis Sampling (ARMS), as in Gilks et al. (1995). The random-walk Metropolis algorithm (Metropolis et al., 1953) combined the rejection sampling (a method of Monte Carlo simulation) of von Neumann (1951) with Markov chains.
Parallel processing may be performed when the user specifies
CPUs
to be greater than one, implying that the specified number
of CPUs exists and is available. Parallelization may be performed on a
multicore computer or a computer cluster. Either a Simple Network of
Workstations (SNOW) or Message Passing Interface (MPI) is used. With
small data sets and few samples, parallel processing may be slower,
due to computer network communication. With larger data sets and more
samples, the user should experience a faster run-time.
This function is similar to the rejectsampling
function in the
LearnBayes
package.
The RejectionSampling
function returns an object of class
rejection
, which is a matrix of accepted, independent,
simulated draws from the target distribution.
Statisticat, LLC. [email protected]
Gilks, W.R., Best, N.G., Tan, K.K.C. (1995). "Adaptive Rejection Metropolis Sampling within Gibbs Sampling". Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 44, No. 4, p. 455–472.
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., and Teller, E. (1953). "Equation of State Calculations by Fast Computing Machines". Journal of Chemical Physics, 21, p. 1087–1092.
von Neumann, J. (1951). "Various Techniques Used in Connection with Random Digits. Monte Carlo Methods". National Bureau Standards, 12, p. 36–38.
dmvn
,
dmvt
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
, and
VariationalBayes
.
library(LaplacesDemon)
### Suppose an output object of class laplace is called Fit:
#rs <- RejectionSampling(Model, MyData, mu=Fit$Summary1[,1],
#     S=Fit$Covar, df=Inf, logc=Fit$LP.Final, n=1000)
#rs
This function performs an elementary sensitivity analysis for two models regarding marginal posterior distributions and posterior inferences.
SensitivityAnalysis(Fit1, Fit2, Pred1, Pred2)
Fit1 |
This argument accepts an object of class demonoid, iterquad, laplace, pmc, or vb. |
Fit2 |
This argument accepts an object of class demonoid, iterquad, laplace, pmc, or vb. |
Pred1 |
This argument accepts an object of class
demonoid.ppc, iterquad.ppc, laplace.ppc, pmc.ppc, or vb.ppc. |
Pred2 |
This argument accepts an object of class
demonoid.ppc, iterquad.ppc, laplace.ppc, pmc.ppc, or vb.ppc. |
Sensitivity analysis is concerned with the influence from changes to the inputs of a model on the output. Comparing differences resulting from different prior distributions is the most common application of sensitivity analysis, though results from different likelihoods may be compared as well. The outputs of interest are the marginal posterior distributions and posterior inferences.
There are many more methods of conducting a sensitivity analysis than
exist in the SensitivityAnalysis
function. For more
information, see Oakley and O'Hagan (2004). The SIR
function is useful for approximating changes in the posterior due to
small changes in prior distributions.
The SensitivityAnalysis
function compares marginal posterior
distributions and posterior predictive distributions. Specifically,
it calculates the probability that each distribution in Fit1
and Pred1
is greater than the associated distribution in
Fit2
and Pred2
, and returns a variance ratio of each
pair of distributions. If the probability is 0.5 that a
distribution is greater than another, or if the variance ratio is
1, then no difference is found due to the inputs.
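The comparison just described can be illustrated with a minimal sketch (simulated placeholder samples, not the SensitivityAnalysis code itself): given posterior samples of the same parameter from two fits, estimate the probability that one exceeds the other and the variance ratio.
set.seed(1)
theta1 <- rnorm(1000, 0.00, 1.0)   ### hypothetical samples from Fit1
theta2 <- rnorm(1000, 0.05, 1.1)   ### hypothetical samples from Fit2
p.greater <- mean(theta1 > theta2)       ### near 0.5 suggests no difference
var.ratio <- var(theta1) / var(theta2)   ### near 1 suggests no difference
c(p.greater=p.greater, var.ratio=var.ratio)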
Additional comparisons and methods are currently outside the scope of
the SensitivityAnalysis
function. The BayesFactor
function may also be considered, as well as comparing posterior
predictive checks resulting from summary.demonoid.ppc
,
summary.iterquad.ppc
,
summary.laplace.ppc
, summary.pmc.ppc
, or
summary.vb.ppc
.
Regarding marginal posterior distributions, the
SensitivityAnalysis
function compares only distributions with
identical parameter names. For example, suppose a statistician
conducts a sensitivity analysis to study differences resulting from
two prior distributions: a normal distribution and a Student t
distribution. These distributions have two and three parameters,
respectively. The statistician has named the parameters beta
and sigma
for the normal distribution, while for the Student
t distribution, the parameters are named beta
, sigma
,
and nu
. In this case, the SensitivityAnalysis
function
compares the marginal posterior distributions for beta
and
sigma
, though nu
is ignored because it is not in both
models. If the statistician does not want certain parameters compared,
then differing parameter names should be assigned.
Robust Bayesian analysis is a very similar topic, and often called simply Bayesian sensitivity analysis. In robust Bayesian analysis, the robustness of answers from a Bayesian analysis to uncertainty about the precise details of the analysis is studied. An answer is considered robust if it does not depend sensitively on the assumptions and inputs on which it is based. Robust Bayes methods acknowledge that it is sometimes very difficult to come up with precise distributions to be used as priors. Likewise the appropriate likelihood function that should be used for a particular problem may also be in doubt. In a robust Bayesian analysis, a standard Bayesian analysis is applied to all possible combinations of prior distributions and likelihood functions selected from classes of priors and likelihoods considered empirically plausible by the statistician.
This function returns a list with the following components:
Posterior |
This is a matrix in which each row is a marginal posterior
distribution with the same parameter name in both Fit1 and Fit2,
and the columns report the probability that the distribution in
Fit1 is greater than the one in Fit2 and the ratio of their
variances. |
Post.Pred.Dist |
This is a matrix in which each row is a record-level posterior
predictive distribution common to Pred1 and Pred2, with the same
probability and variance-ratio columns. |
Statisticat, LLC [email protected]
Berger, J.O. (1984). "The Robust Bayesian Viewpoint (with discussion)". In J. B. Kadane, editor, Robustness of Bayesian Analyses, p. 63–144. North-Holland, Amsterdam.
Berger, J.O. (1985). "Statistical Decision Theory and Bayesian Analysis". Springer-Verlag, New York.
Berger, J.O. (1994). "An Overview of Robust Bayesian Analysis (with discussion)". Test, 3, p. 5–124.
Oakley, J. and O'Hagan, A. (2004). "Probabilistic Sensitivity Analysis of Complex Models: a Bayesian Approach". Journal of the Royal Statistical Society, Series B, 66, p. 751–769.
Weiss, R. (1995). "An Approach to Bayesian Sensitivity Analysis". Journal of the Royal Statistical Society, Series B, 58, p. 739–750.
BayesFactor
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
PMC
,
predict.demonoid
,
predict.iterquad
,
predict.laplace
,
predict.pmc
,
SIR
,
summary.demonoid.ppc
,
summary.iterquad.ppc
,
summary.laplace.ppc
,
summary.pmc.ppc
, and
VariationalBayes
.
#sa <- SensitivityAnalysis(Fit1, Fit2, Pred1, Pred2)
#sa
The SIR
function performs Sampling Importance Resampling, also
called Sequential Importance Resampling, and uses a multivariate normal
proposal density.
SIR(Model, Data, mu, Sigma, n=1000, CPUs=1, Type="PSOCK")
Model |
This is a model specification function. For more
information, see LaplacesDemon. |
Data |
This is a list of data. For more information, see
LaplacesDemon. |
mu |
This is a mean vector, mu, for the multivariate normal proposal density. |
Sigma |
This is a covariance matrix, Sigma, for the multivariate normal proposal density. |
n |
This is the number of samples to be drawn from the posterior distribution. |
CPUs |
This argument accepts an integer that specifies the number
of central processing units (CPUs) of the multicore computer or
computer cluster. This argument defaults to CPUs=1. |
Type |
This argument specifies the type of parallel processing to
perform, accepting either Type="PSOCK" or Type="MPI". |
Sampling Importance Resampling (SIR) was introduced in Gordon, et al. (1993), and is the original particle filtering algorithm (and this family of algorithms is also known as Sequential Monte Carlo). A distribution is approximated with importance weights, which are approximations to the relative posterior densities of the particles, and the sum of the weights is one. In this terminology, each sample in the distribution is a “particle”. SIR is a sequential or recursive form of importance sampling. As in importance sampling, the expectation of a function can be approximated as a weighted average. The optimal proposal distribution is the target distribution.
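The resampling idea just described can be sketched as follows (an illustration only, not the package's SIR implementation); the sketch assumes a model specification function in the LaplacesDemon style that returns its log-posterior in a list component named LP, and sir.sketch is a hypothetical name:
sir.sketch <- function(Model, Data, mu, Sigma, n=1000) {
  k <- length(mu)
  R <- chol(Sigma)                      ### upper-triangular, so t(R) %*% R = Sigma
  theta <- matrix(rnorm(n * k), n, k) %*% R + matrix(mu, n, k, byrow=TRUE)
  lp <- apply(theta, 1, function(th) Model(th, Data)$LP)   ### target log-posterior
  lq <- apply(theta, 1, function(th) {                     ### proposal log-density (up to a constant)
    z <- backsolve(R, th - mu, transpose=TRUE)
    -0.5 * sum(z^2)
  })
  lw <- lp - lq
  w <- exp(lw - max(lw))
  w <- w / sum(w)                       ### normalized importance weights that sum to one
  theta[sample(n, n, replace=TRUE, prob=w), , drop=FALSE]  ### resample with replacement
}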
In the LaplacesDemon
package, the main use of the SIR
function is to produce posterior samples for iterative quadrature,
Laplace Approximation, or Variational Bayes, and SIR
is called
behind-the-scenes by the IterativeQuadrature
,
LaplaceApproximation
, or VariationalBayes
function.
Iterative quadrature estimates the posterior mean and the associated
covariance matrix. Assuming normality, this output characterizes the
marginal posterior distributions. However, it is often useful to have
posterior samples, in which case the SIR
function is used to
draw samples. The number of samples, n
, should increase with
the number and intercorrelations of the parameters. Otherwise,
multimodal posterior distributions may occur.
Laplace Approximation estimates the posterior mode and the associated
covariance matrix. Assuming normality, this output characterizes the
marginal posterior distributions. However, it is often useful to have
posterior samples, in which case the SIR
function is used to
draw samples. The number of samples, n
, should increase with
the number and intercorrelations of the parameters. Otherwise,
multimodal posterior distributions may occur.
Variational Bayes estimates both the posterior mean and
variance. Assuming normality, this output characterizes the marginal
posterior distributions. However, it is often useful to have posterior
samples, in which case the SIR
function is used to draw
samples. The number of samples, n, should increase with the
number and intercorrelations of the parameters. Otherwise, multimodal
posterior distributions may occur.
SIR is also commonly used when considering a mild change in a prior
distribution. For example, suppose a model was updated in
LaplacesDemon
, and it had a least-informative prior
distribution, but the statistician would like to estimate the impact
of changing to a weakly-informative prior distribution. The change is
made in the model specification function, and the posterior means and
covariance are supplied to the SIR
function. The returned
samples are estimates of the posterior, given the different prior
distribution. This is akin to sensitivity analysis (see the
SensitivityAnalysis
function).
In other contexts (for which this function is not designed), SIR is used with dynamic linear models (DLMs) and state-space models (SSMs) for state filtering.
Parallel processing may be performed when the user specifies
CPUs
to be greater than one, implying that the specified number
of CPUs exists and is available. Parallelization may be performed on a
multicore computer or a computer cluster. Either a Simple Network of
Workstations (SNOW) or Message Passing Interface (MPI) is used. With
small data sets and few samples, parallel processing may be slower,
due to computer network communication. With larger data sets and more
samples, the user should experience a faster run-time.
This function was adapted from the sir
function in the
LearnBayes
package.
The SIR
function returns a matrix of samples drawn from the
posterior distribution.
Statisticat, LLC. [email protected]
Gordon, N.J., Salmond, D.J., and Smith, A.F.M. (1993). "Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation". IEEE Proceedings F on Radar and Signal Processing, 140(2), p. 107–113.
dmvn
,
IterativeQuadrature
,
LaplaceApproximation
,
LaplacesDemon
,
SensitivityAnalysis
, and
VariationalBayes
.
The Stick
function provides the utility of truncated
stick-breaking regarding the vector theta. Stick-breaking is commonly referred to as a
stick-breaking process, and is used often in a Dirichlet
process (Sethuraman, 1994). It is commonly associated with
infinite-dimensional mixtures, but in practice, the ‘infinite’ number
is truncated to a finite number, since it is impossible to estimate an
infinite number of parameters (Ishwaran and James, 2001).
Stick(theta)
theta |
This required argument, theta, is a vector of stick-breaking
proportions from which the probability vector is calculated. |
The Dirichlet process (DP) is a stochastic process used in Bayesian nonparametric modeling, most commonly in DP mixture models, otherwise known as infinite mixture models. A DP is a distribution over distributions. Each draw from a DP is itself a discrete distribution. A DP is an infinite-dimensional generalization of Dirichlet distributions. It is called a DP because it has Dirichlet-distributed, finite-dimensional, marginal distributions, just as the Gaussian process has Gaussian-distributed, finite-dimensional, marginal distributions. Distributions drawn from a DP cannot be described using a finite number of parameters, thus the classification as a nonparametric model. The truncated stick-breaking (TSB) process is associated with a truncated Dirichlet process (TDP).
An example of a TSB process is cluster analysis, where the number of
clusters is unknown and treated as mixture components. In such a
model, the TSB process calculates the probability vector pi
from theta, given a user-specified maximum number of
clusters to explore as C, where C is the length of
theta plus 1. Vector pi is assigned a TSB
prior distribution (for more information, see
dStick).
Elsewhere, each element of theta is constrained to the
interval (0,1), and the original TSB form is beta-distributed with the
first shape parameter of the beta distribution constrained
to 1 (Ishwaran and James, 2001). The other (concentration)
hyperparameter of the beta distribution is usually gamma-distributed.
A larger value for a given element of theta is associated
with a higher probability of the associated mixture component,
however, the proportion changes according to the position of the
element in the theta vector.
A variety of stick-breaking processes exist. For example, rather than
each element of theta being beta-distributed, there have been other
forms introduced such as logistic and probit, among others.
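The computation can be illustrated with a minimal sketch of truncated stick-breaking (not the package's Stick implementation; stick.sketch and the theta values below are hypothetical):
stick.sketch <- function(theta) {
  remaining <- cumprod(1 - theta)      ### stick length left after each break
  c(theta, 1) * c(1, remaining)        ### p[k] = theta[k] * prod(1 - theta[1:(k-1)])
}
theta <- c(0.5, 0.4, 0.3)    ### proportions for 4 mixture components
stick.sketch(theta)          ### 0.50 0.20 0.09 0.21, which sums to 1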
The Stick
function returns a probability vector wherein each
element relates to a mixture component.
Statisticat, LLC. [email protected]
Ishwaran, H. and James, L. (2001). "Gibbs Sampling Methods for Stick Breaking Priors". Journal of the American Statistical Association, 96(453), p. 161–173.
Sethuraman, J. (1994). "A Constructive Definition of Dirichlet Priors". Statistica Sinica, 4, p. 639–650.
ddirichlet
,
dmvpolya
, and
dStick
.
This may be used to summarize either new, unobserved instances of
y (called y^new) or
replicates of y (called y^rep). Either
y^new or y^rep
is summarized, depending on predict.demonoid.
## S3 method for class 'demonoid.ppc' summary(object, Categorical, Rows, Discrep, d, Quiet, ...)
object |
An object of class demonoid.ppc is required. |
Categorical |
Logical. If TRUE, then y and yhat are
considered to be categorical (such as y=0 or y=1), rather than
continuous. This defaults to FALSE. |
Rows |
An optional vector of row numbers, for example
c(1:10). All rows will be estimated, but only these rows will
appear in the summary. |
Discrep |
A character string indicating a discrepancy
test. Valid discrepancy measures are listed in the Details section below. |
d |
This is an optional integer to be used with the
Discrep argument above. |
Quiet |
This logical argument defaults to FALSE and will print
results to the console. When TRUE, results are not printed. |
... |
Additional arguments are unused. |
This function summarizes an object of class demonoid.ppc, which
consists of posterior predictive checks on either
y^new or y^rep,
depending respectively on whether unobserved instances of
y or the model sample of y was
used in the predict.demonoid function.
The purpose of a posterior predictive check is to assess how well (or poorly) the model fits the data, or to assess discrepancies between the model and the data. For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.
When y is continuous and known, this function
estimates the predictive concordance between y and
y^rep as per Gelfand (1996), and the
predictive quantile (PQ), which is for record-level outlier detection
used to calculate Gelfand's predictive concordance.
When y is categorical and known, this function
estimates the record-level lift, which is
p(yhat[i,] = y[i]) / [p(y = j) / n], or
the number of correctly predicted samples over the rate of that
category of y in vector y.
A discrepancy measure is an approach to studying discrepancies between the model and data (Gelman et al., 1996). Below is a list of discrepancy measures, followed by a brief introduction to discrepancy analysis:
The "Chi-Square"
discrepancy measure is the chi-square
goodness-of-fit test that is recommended by Gelman. For each record
i=1:N, this returns (y[i] - E(y[i]))^2 / var(yhat[i,]).
The "Chi-Square2"
discrepancy measure returns the
following for each record: Pr(chisq.rep[i,] > chisq.obs[i,]), where
chisq.obs[i,] <- (y[i] - E(y[i]))^2 / E(y[i]), and chisq.rep[i,] <-
(yhat[i,] - E(yhat[i,]))^2 / E(yhat[i,]), and the overall
discrepancy is the percent of records that were outside of the 95%
quantile-based probability interval (see p.interval
).
The "Kurtosis"
discrepancy measure returns the kurtosis
of for each record, and the discrepancy
statistic is the mean for all records. This does not measure
discrepancies between the model and data, and is useful for finding
kurtotic replicate distributions.
The "L-criterion"
discrepancy measure of Laud and Ibrahim
(1995) provides the record-level combination of two components (see
below), and the discrepancy statistic is the sum, L
, as well as
a calibration number, S.L
. For more information on the
L-criterion, see the accompanying vignette entitled "Bayesian
Inference".
The "MASE"
(Mean Absolute Scaled Error) is a
discrepancy measure for the accuracy of time-series forecasts,
estimated as (|y - yhat|) / mean(abs(diff(y)))
. The discrepancy
statistic is the mean of the record-level values.
The "MSE"
(Mean Squared Error) discrepancy measure
provides the MSE for each record across all replicates, and the
discrepancy statistic is the mean of the record-level MSEs. MSE and
quadratic loss are identical.
The "PPL"
(Posterior Predictive Loss) discrepancy
measure of Gelfand and Ghosh (1998) provides the record-level
combination of two components: one involves the predictive variance
and the other includes the accuracy of the means of the predictive
distribution. The d=0
argument applies the following weight to
the accuracy component, which is then added to the variance component:
. For
, use
. For
and model comparison,
is
commonly set to 1, 10, or 100000. Larger values of
put more
stress on fit and downgrade the precision of the estimates.
The "Quadratic Loss"
discrepancy measure provides the
mean quadratic loss for each record across all replicates, and the
discrepancy statistic is the mean of the record-level mean quadratic
losses. Quadratic loss and MSE are identical, and quadratic loss is
the negative of quadratic utility.
The "Quadratic Utility"
discrepancy measure provides
the mean quadratic utility for each record across all replicates, and
the discrepancy statistic is the mean of the record-level mean
quadratic utilities. Quadratic utility is the negative of quadratic
loss.
The "RMSE"
(Root Mean Squared Error) discrepancy
measure provides the RMSE for each record across all replicates, and
the discrepancy statistic is the mean of the record-level RMSEs.
The "Skewness"
discrepancy measure returns the skewness
of for each record, and the discrepancy
statistic is the mean for all records. This does not measure
discrepancies between the model and data, and is useful for finding
skewed replicate distributions.
The "max(yhat[i,]) > max(y)"
discrepancy measure
returns a record-level indicator when a record's maximum
exceeds the maximum of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
replications that exceed the maximum of
.
The "mean(yhat[i,]) > mean(y)"
discrepancy measure
returns a record-level indicator when the mean of a record's
is greater than the mean of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
mean replications that exceed the mean of
.
The "mean(yhat[i,] > d)"
discrepancy measure returns a
record-level proportion of that
exceeds a specified value,
d
. The discrepancy statistic is the
mean of the record-level proportions.
The "mean(yhat[i,] > mean(y))"
discrepancy measure
returns a record-level proportion of
that exceeds the mean of
. The discrepancy statistic is the mean of the
record-level proportions.
The "min(yhat[i,]) < min(y)"
discrepancy measure
returns a record-level indicator when a record's minimum
is less than the minimum of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
replications less than the minimum of
.
The "round(yhat[i,]) = d"
discrepancy measure returns a
record-level proportion of that,
when rounded, is equal to a specified discrete value,
d
. The
discrepancy statistic is the mean of the record-level proportions.
The "sd(yhat[i,]) > sd(y)"
discrepancy measure returns a
record-level indicator when the standard deviation of replicates is
larger than the standard deviation of all of . The
discrepancy statistic is the mean of the record-level indicators,
reporting the proportion of records with larger standard deviations
than
.
The "p(yhat[i,] != y[i])"
discrepancy measure returns
the record-level probability that
is not equal to
. This is valid when
is categorical and
yhat
is the predicted
category. The probability is the proportion of replicates.
After observing a discrepancy statistic, the user attempts to improve the model by revising the model to account for discrepancies between data and the current model. This approach to model revision relies on an analysis of the discrepancy statistic. Given a discrepancy measure that is based on model fit, such as the L-criterion, the user may correlate the record-level discrepancy statistics with the dependent variable, independent variables, and interactions of independent variables. The discrepancy statistic should not correlate with the dependent and independent variables. Interaction variables may be useful for exploring new relationships that are not in the current model. Alternatively, a decision tree may be applied to the record-level discrepancy statistics, given the independent variables, in an effort to find relationships in the data that may be helpful in the model. Model revision may involve the addition of a finite mixture component to account for outliers in discrepancy, or specifying the model with a distribution that is more robust to outliers. There are too many suggestions to include here, and discrepancy analysis varies by model.
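To make the record-level calculations above concrete, the following minimal sketch computes the "Chi-Square" discrepancy (taking E(y[i]) as the mean of the replicates for record i) and the "mean(yhat[i,] > mean(y))" measure from an N x S matrix of replicates; y and yhat here are simulated placeholders rather than output of predict.demonoid:
set.seed(1)
N <- 20; S <- 100
y <- rnorm(N)                                 ### observed data (placeholder)
yhat <- matrix(rnorm(N * S, mean=y), N, S)    ### N x S matrix of replicates (placeholder)
chi.square <- (y - rowMeans(yhat))^2 / apply(yhat, 1, var)   ### record-level discrepancy
prop.gt <- rowMeans(yhat > mean(y))           ### record-level proportions
mean(prop.gt)                                 ### discrepancy statistic: mean of the proportions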
This function returns a list with the following components:
BPIC |
The Bayesian Predictive Information Criterion (BPIC) was
introduced by Ando (2007). BPIC is a variation of the Deviance
Information Criterion (DIC) that has been modified for predictive
distributions. For more information on DIC (Spiegelhalter
et al., 2002), see the accompanying vignette entitled "Bayesian
Inference". |
Concordance |
This is the percentage of the records of y that are
within the 95% quantile-based probability interval (see
p.interval) of y^rep. |
Mean Lift |
This is the mean of the record-level lifts, and
occurs only when Categorical=TRUE. |
Discrepancy.Statistic |
This is only reported if the
Discrep argument receives a valid discrepancy measure as
listed in the Details section above. |
L-criterion |
The L-criterion (Laud and Ibrahim, 1995) was
developed for model and variable selection. It is a sum of two
components: one involves the predictive variance and the other
includes the accuracy of the means of the predictive
distribution. The L-criterion measures model performance with a
combination of how close its predictions are to the observed data
and variability of the predictions. Better models have smaller
values of L, and S.L is the associated calibration number. |
Summary |
When |
Statisticat, LLC.
Ando, T. (2007). "Bayesian Predictive Information Criterion for the Evaluation of Hierarchical Bayesian and Empirical Bayes Models". Biometrika, 94(2), p. 443–458.
Gelfand, A. (1996). "Model Determination Using Sampling Based Methods". In Gilks, W., Richardson, S., and Spiegelhalter, D. (eds.), Chapter 9 in Markov Chain Monte Carlo in Practice. Chapman and Hall: Boca Raton, FL.
Gelfand, A. and Ghosh, S. (1998). "Model Choice: A Minimum Posterior Predictive Loss Approach". Biometrika, 85, p. 1–11.
Gelman, A., Meng, X.L., and Stern H. (1996). "Posterior Predictive Assessment of Model Fitness via Realized Discrepancies". Statistica Sinica, 6, p. 733–807.
Laud, P.W. and Ibrahim, J.G. (1995). "Predictive Model Selection". Journal of the Royal Statistical Society, B 57, p. 247–262.
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. (2002). "Bayesian Measures of Model Complexity and Fit (with Discussion)". Journal of the Royal Statistical Society, B 64, p. 583–639.
LaplacesDemon
,
predict.demonoid
, and
p.interval
.
### See the LaplacesDemon function for an example.
This may be used to summarize either new, unobserved instances of
y (called y^new) or
replicates of y (called y^rep). Either
y^new or y^rep
is summarized, depending on predict.iterquad.
## S3 method for class 'iterquad.ppc' summary(object, Categorical, Rows, Discrep, d, Quiet, ...)
object |
An object of class iterquad.ppc is required. |
Categorical |
Logical. If TRUE, then y and yhat are
considered to be categorical (such as y=0 or y=1), rather than
continuous. This defaults to FALSE. |
Rows |
An optional vector of row numbers, for example
c(1:10). All rows will be estimated, but only these rows will
appear in the summary. |
Discrep |
A character string indicating a discrepancy
test. Valid discrepancy measures are listed in the Details section below. |
d |
This is an optional integer to be used with the
Discrep argument above. |
Quiet |
This logical argument defaults to FALSE and will print
results to the console. When TRUE, results are not printed. |
... |
Additional arguments are unused. |
This function summarizes an object of class iterquad.ppc, which
consists of posterior predictive checks on either
y^new or y^rep,
depending respectively on whether unobserved instances of
y or the model sample of y was
used in the predict.iterquad function. The deviance and
monitored variables are also summarized.
The purpose of a posterior predictive check is to assess how well (or poorly) the model fits the data, or to assess discrepancies between the model and the data. For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.
When y is continuous and known, this function
estimates the predictive concordance between y and
y^rep as per Gelfand (1996), and the
predictive quantile (PQ), which is for record-level outlier detection
used to calculate Gelfand's predictive concordance.
When y is categorical and known, this function
estimates the record-level lift, which is
p(yhat[i,] = y[i]) / [p(y = j) / n], or
the number of correctly predicted samples over the rate of that
category of y in vector y.
A discrepancy measure is an approach to studying discrepancies between the model and data (Gelman et al., 1996). Below is a list of discrepancy measures, followed by a brief introduction to discrepancy analysis:
The "Chi-Square"
discrepancy measure is the chi-square
goodness-of-fit test that is recommended by Gelman. For each record
i=1:N, this returns (y[i] - E(y[i]))^2 / var(yhat[i,]).
The "Chi-Square2"
discrepancy measure returns the
following for each record: Pr(chisq.rep[i,] > chisq.obs[i,]), where
chisq.obs[i,] <- (y[i] - E(y[i]))^2 / E(y[i]), and chisq.rep[i,] <-
(yhat[i,] - E(yhat[i,]))^2 / E(yhat[i,]), and the overall
discrepancy is the percent of records that were outside of the 95%
quantile-based probability interval (see p.interval
).
The "Kurtosis"
discrepancy measure returns the kurtosis
of for each record, and the discrepancy
statistic is the mean for all records. This does not measure
discrepancies between the model and data, and is useful for finding
kurtotic replicate distributions.
The "L-criterion"
discrepancy measure of Laud and Ibrahim
(1995) provides the record-level combination of two components (see
below), and the discrepancy statistic is the sum, L
, as well as
a calibration number, S.L
. For more information on the
L-criterion, see the accompanying vignette entitled "Bayesian
Inference".
The "MASE"
(Mean Absolute Scaled Error) is a
discrepancy measure for the accuracy of time-series forecasts,
estimated as (|y - yhat|) / mean(abs(diff(y)))
. The discrepancy
statistic is the mean of the record-level values.
The "MSE"
(Mean Squared Error) discrepancy measure
provides the MSE for each record across all replicates, and the
discrepancy statistic is the mean of the record-level MSEs. MSE and
quadratic loss are identical.
The "PPL"
(Posterior Predictive Loss) discrepancy
measure of Gelfand and Ghosh (1998) provides the record-level
combination of two components: one involves the predictive variance
and the other includes the accuracy of the means of the predictive
distribution. The d=0
argument applies the following weight to
the accuracy component, which is then added to the variance component:
. For
, use
. For
and model comparison,
is
commonly set to 1, 10, or 100000. Larger values of
put more
stress on fit and downgrade the precision of the estimates.
The "Quadratic Loss"
discrepancy measure provides the
mean quadratic loss for each record across all replicates, and the
discrepancy statistic is the mean of the record-level mean quadratic
losses. Quadratic loss and MSE are identical, and quadratic loss is
the negative of quadratic utility.
The "Quadratic Utility"
discrepancy measure provides
the mean quadratic utility for each record across all replicates, and
the discrepancy statistic is the mean of the record-level mean
quadratic utilities. Quadratic utility is the negative of quadratic
loss.
The "RMSE"
(Root Mean Squared Error) discrepancy
measure provides the RMSE for each record across all replicates, and
the discrepancy statistic is the mean of the record-level RMSEs.
The "Skewness"
discrepancy measure returns the skewness
of for each record, and the discrepancy
statistic is the mean for all records. This does not measure
discrepancies between the model and data, and is useful for finding
skewed replicate distributions.
The "max(yhat[i,]) > max(y)"
discrepancy measure
returns a record-level indicator when a record's maximum
exceeds the maximum of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
replications that exceed the maximum of
.
The "mean(yhat[i,]) > mean(y)"
discrepancy measure
returns a record-level indicator when the mean of a record's
is greater than the mean of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
mean replications that exceed the mean of
.
The "mean(yhat[i,] > d)"
discrepancy measure returns a
record-level proportion of that
exceeds a specified value,
d
. The discrepancy statistic is the
mean of the record-level proportions.
The "mean(yhat[i,] > mean(y))"
discrepancy measure
returns a record-level proportion of
that exceeds the mean of
. The discrepancy statistic is the mean of the
record-level proportions.
The "min(yhat[i,]) < min(y)"
discrepancy measure
returns a record-level indicator when a record's minimum
is less than the minimum of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
replications less than the minimum of
.
The "round(yhat[i,]) = d"
discrepancy measure returns a
record-level proportion of that,
when rounded, is equal to a specified discrete value,
d
. The
discrepancy statistic is the mean of the record-level proportions.
The "sd(yhat[i,]) > sd(y)"
discrepancy measure returns a
record-level indicator when the standard deviation of replicates is
larger than the standard deviation of all of . The
discrepancy statistic is the mean of the record-level indicators,
reporting the proportion of records with larger standard deviations
than
.
The "p(yhat[i,] != y[i])"
discrepancy measure returns
the record-level probability that
is not equal to
. This is valid when
is categorical and
yhat
is the predicted
category. The probability is the proportion of replicates.
After observing a discrepancy statistic, the user attempts to improve the model by revising the model to account for discrepancies between data and the current model. This approach to model revision relies on an analysis of the discrepancy statistic. Given a discrepancy measure that is based on model fit, such as the L-criterion, the user may correlate the record-level discrepancy statistics with the dependent variable, independent variables, and interactions of independent variables. The discrepancy statistic should not correlate with the dependent and independent variables. Interaction variables may be useful for exploring new relationships that are not in the current model. Alternatively, a decision tree may be applied to the record-level discrepancy statistics, given the independent variables, in an effort to find relationships in the data that may be helpful in the model. Model revision may involve the addition of a finite mixture component to account for outliers in discrepancy, or specifying the model with a distribution that is more robust to outliers. There are too many suggestions to include here, and discrepancy analysis varies by model.
This function returns a list with the following components:
BPIC |
The Bayesian Predictive Information Criterion (BPIC) was
introduced by Ando (2007). BPIC is a variation of the Deviance
Information Criterion (DIC) that has been modified for predictive
distributions. For more information on DIC (Spiegelhalter
et al., 2002), see the accompanying vignette entitled "Bayesian
Inference". |
Concordance |
This is the percentage of the records of y that are
within the 95% quantile-based probability interval (see
p.interval) of y^rep. |
Mean Lift |
This is the mean of the record-level lifts, and
occurs only when Categorical=TRUE. |
Discrepancy.Statistic |
This is only reported if the
Discrep argument receives a valid discrepancy measure as
listed in the Details section above. |
L-criterion |
The L-criterion (Laud and Ibrahim, 1995) was
developed for model and variable selection. It is a sum of two
components: one involves the predictive variance and the other
includes the accuracy of the means of the predictive
distribution. The L-criterion measures model performance with a
combination of how close its predictions are to the observed data
and variability of the predictions. Better models have smaller
values of L, and S.L is the associated calibration number. |
Monitor |
This is a |
Summary |
When |
Statisticat, LLC.
Ando, T. (2007). "Bayesian Predictive Information Criterion for the Evaluation of Hierarchical Bayesian and Empirical Bayes Models". Biometrika, 94(2), p. 443–458.
Gelfand, A. (1996). "Model Determination Using Sampling Based Methods". In Gilks, W., Richardson, S., and Spiegelhalter, D. (eds.), Chapter 9 in Markov Chain Monte Carlo in Practice. Chapman and Hall: Boca Raton, FL.
Gelfand, A. and Ghosh, S. (1998). "Model Choice: A Minimum Posterior Predictive Loss Approach". Biometrika, 85, p. 1–11.
Gelman, A., Meng, X.L., and Stern H. (1996). "Posterior Predictive Assessment of Model Fitness via Realized Discrepancies". Statistica Sinica, 6, p. 733–807.
Laud, P.W. and Ibrahim, J.G. (1995). "Predictive Model Selection". Journal of the Royal Statistical Society, B 57, p. 247–262.
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. (2002). "Bayesian Measures of Model Complexity and Fit (with Discussion)". Journal of the Royal Statistical Society, B 64, p. 583–639.
IterativeQuadrature
,
predict.iterquad
, and
p.interval
.
### See the IterativeQuadrature function for an example.
This may be used to summarize either new, unobserved instances of
y (called y^new) or
replicates of y (called y^rep). Either
y^new or y^rep
is summarized, depending on predict.laplace.
## S3 method for class 'laplace.ppc' summary(object, Categorical, Rows, Discrep, d, Quiet, ...)
object |
An object of class laplace.ppc is required. |
Categorical |
Logical. If TRUE, then y and yhat are
considered to be categorical (such as y=0 or y=1), rather than
continuous. This defaults to FALSE. |
Rows |
An optional vector of row numbers, for example
c(1:10). All rows will be estimated, but only these rows will
appear in the summary. |
Discrep |
A character string indicating a discrepancy
test. Valid discrepancy measures are listed in the Details section below. |
d |
This is an optional integer to be used with the
Discrep argument above. |
Quiet |
This logical argument defaults to FALSE and will print
results to the console. When TRUE, results are not printed. |
... |
Additional arguments are unused. |
This function summarizes an object of class laplace.ppc, which
consists of posterior predictive checks on either
y^new or y^rep,
depending respectively on whether unobserved instances of
y or the model sample of y was
used in the predict.laplace function. The deviance and
monitored variables are also summarized.
The purpose of a posterior predictive check is to assess how well (or poorly) the model fits the data, or to assess discrepancies between the model and the data. For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.
When y is continuous and known, this function
estimates the predictive concordance between y and
y^rep as per Gelfand (1996), and the
predictive quantile (PQ), which is for record-level outlier detection
used to calculate Gelfand's predictive concordance.
When y is categorical and known, this function
estimates the record-level lift, which is
p(yhat[i,] = y[i]) / [p(y = j) / n], or
the number of correctly predicted samples over the rate of that
category of y in vector y.
A discrepancy measure is an approach to studying discrepancies between the model and data (Gelman et al., 1996). Below is a list of discrepancy measures, followed by a brief introduction to discrepancy analysis:
The "Chi-Square"
discrepancy measure is the chi-square
goodness-of-fit test that is recommended by Gelman. For each record
i=1:N, this returns (y[i] - E(y[i]))^2 / var(yhat[i,]).
The "Chi-Square2"
discrepancy measure returns the
following for each record: Pr(chisq.rep[i,] > chisq.obs[i,]), where
chisq.obs[i,] <- (y[i] - E(y[i]))^2 / E(y[i]), and chisq.rep[i,] <-
(yhat[i,] - E(yhat[i,]))^2 / E(yhat[i,]), and the overall
discrepancy is the percent of records that were outside of the 95%
quantile-based probability interval (see p.interval
).
The "Kurtosis"
discrepancy measure returns the kurtosis
of for each record, and the discrepancy
statistic is the mean for all records. This does not measure
discrepancies between the model and data, and is useful for finding
kurtotic replicate distributions.
The "L-criterion"
discrepancy measure of Laud and Ibrahim
(1995) provides the record-level combination of two components (see
below), and the discrepancy statistic is the sum, L
, as well as
a calibration number, S.L
. For more information on the
L-criterion, see the accompanying vignette entitled "Bayesian
Inference".
The "MASE"
(Mean Absolute Scaled Error) is a
discrepancy measure for the accuracy of time-series forecasts,
estimated as (|y - yhat|) / mean(abs(diff(y)))
. The discrepancy
statistic is the mean of the record-level values.
The "MSE"
(Mean Squared Error) discrepancy measure
provides the MSE for each record across all replicates, and the
discrepancy statistic is the mean of the record-level MSEs. MSE and
quadratic loss are identical.
The "PPL"
(Posterior Predictive Loss) discrepancy
measure of Gelfand and Ghosh (1998) provides the record-level
combination of two components: one involves the predictive variance
and the other includes the accuracy of the means of the predictive
distribution. The d=0
argument applies the following weight to
the accuracy component, which is then added to the variance component:
. For
, use
. For
and model comparison,
is
commonly set to 1, 10, or 100000. Larger values of
put more
stress on fit and downgrade the precision of the estimates.
The "Quadratic Loss"
discrepancy measure provides the
mean quadratic loss for each record across all replicates, and the
discrepancy statistic is the mean of the record-level mean quadratic
losses. Quadratic loss and MSE are identical, and quadratic loss is
the negative of quadratic utility.
The "Quadratic Utility"
discrepancy measure provides
the mean quadratic utility for each record across all replicates, and
the discrepancy statistic is the mean of the record-level mean
quadratic utilities. Quadratic utility is the negative of quadratic
loss.
The "RMSE"
(Root Mean Squared Error) discrepancy
measure provides the RMSE for each record across all replicates, and
the discrepancy statistic is the mean of the record-level RMSEs.
The "Skewness"
discrepancy measure returns the skewness
of for each record, and the discrepancy
statistic is the mean for all records. This does not measure
discrepancies between the model and data, and is useful for finding
skewed replicate distributions.
The "max(yhat[i,]) > max(y)"
discrepancy measure
returns a record-level indicator when a record's maximum
exceeds the maximum of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
replications that exceed the maximum of
.
The "mean(yhat[i,]) > mean(y)"
discrepancy measure
returns a record-level indicator when the mean of a record's
is greater than the mean of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
mean replications that exceed the mean of
.
The "mean(yhat[i,] > d)"
discrepancy measure returns a
record-level proportion of that
exceeds a specified value,
d
. The discrepancy statistic is the
mean of the record-level proportions.
The "mean(yhat[i,] > mean(y))"
discrepancy measure
returns a record-level proportion of
that exceeds the mean of
. The discrepancy statistic is the mean of the
record-level proportions.
The "min(yhat[i,]) < min(y)"
discrepancy measure
returns a record-level indicator when a record's minimum
is less than the minimum of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
replications less than the minimum of
.
The "round(yhat[i,]) = d"
discrepancy measure returns a
record-level proportion of that,
when rounded, is equal to a specified discrete value,
d
. The
discrepancy statistic is the mean of the record-level proportions.
The "sd(yhat[i,]) > sd(y)"
discrepancy measure returns a
record-level indicator when the standard deviation of replicates is
larger than the standard deviation of all of . The
discrepancy statistic is the mean of the record-level indicators,
reporting the proportion of records with larger standard deviations
than
.
The "p(yhat[i,] != y[i])"
discrepancy measure returns
the record-level probability that
is not equal to
. This is valid when
is categorical and
yhat
is the predicted
category. The probability is the proportion of replicates.
After observing a discrepancy statistic, the user attempts to improve the model by revising the model to account for discrepancies between data and the current model. This approach to model revision relies on an analysis of the discrepancy statistic. Given a discrepancy measure that is based on model fit, such as the L-criterion, the user may correlate the record-level discrepancy statistics with the dependent variable, independent variables, and interactions of independent variables. The discrepancy statistic should not correlate with the dependent and independent variables. Interaction variables may be useful for exploring new relationships that are not in the current model. Alternatively, a decision tree may be applied to the record-level discrepancy statistics, given the independent variables, in an effort to find relationships in the data that may be helpful in the model. Model revision may involve the addition of a finite mixture component to account for outliers in discrepancy, or specifying the model with a distribution that is more robust to outliers. There are too many suggestions to include here, and discrepancy analysis varies by model.
This function returns a list with the following components:
BPIC |
The Bayesian Predictive Information Criterion (BPIC) was
introduced by Ando (2007). BPIC is a variation of the Deviance
Information Criterion (DIC) that has been modified for predictive
distributions. For more information on DIC (Spiegelhalter
et al., 2002), see the accompanying vignette entitled "Bayesian
Inference". |
Concordance |
This is the percentage of the records of y that are
within the 95% quantile-based probability interval (see
p.interval) of y^rep. |
Mean Lift |
This is the mean of the record-level lifts, and
occurs only when Categorical=TRUE. |
Discrepancy.Statistic |
This is only reported if the
Discrep argument receives a valid discrepancy measure as
listed in the Details section above. |
L-criterion |
The L-criterion (Laud and Ibrahim, 1995) was
developed for model and variable selection. It is a sum of two
components: one involves the predictive variance and the other
includes the accuracy of the means of the predictive
distribution. The L-criterion measures model performance with a
combination of how close its predictions are to the observed data
and variability of the predictions. Better models have smaller
values of L, and S.L is the associated calibration number. |
Monitor |
This is a |
Summary |
When |
Statisticat, LLC.
Ando, T. (2007). "Bayesian Predictive Information Criterion for the Evaluation of Hierarchical Bayesian and Empirical Bayes Models". Biometrika, 94(2), p. 443–458.
Gelfand, A. (1996). "Model Determination Using Sampling Based Methods". In Gilks, W., Richardson, S., and Spiegelhalter, D. (eds.), Chapter 9 in Markov Chain Monte Carlo in Practice. Chapman and Hall: Boca Raton, FL.
Gelfand, A. and Ghosh, S. (1998). "Model Choice: A Minimum Posterior Predictive Loss Approach". Biometrika, 85, p. 1–11.
Gelman, A., Meng, X.L., and Stern H. (1996). "Posterior Predictive Assessment of Model Fitness via Realized Discrepancies". Statistica Sinica, 6, p. 733–807.
Laud, P.W. and Ibrahim, J.G. (1995). "Predictive Model Selection". Journal of the Royal Statistical Society, B 57, p. 247–262.
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. (2002). "Bayesian Measures of Model Complexity and Fit (with Discussion)". Journal of the Royal Statistical Society, B 64, p. 583–639.
LaplaceApproximation, predict.laplace, and p.interval.
### See the LaplaceApproximation function for an example.
This function summarizes posterior predictive distributions from
an object of class miss
.
## S3 method for class 'miss'
summary(object, ...)
object |
An object of class miss is required. |
... |
Additional arguments are unused. |
This function summarizes the posterior predictive distributions from
an object of class miss
.
This function returns a matrix in which each row is the posterior
predictive distribution of one of the missing values. Columns are
Mean, SD, MCSE, ESS, LB, Median, and UB.
Statisticat, LLC. [email protected]
MISS
.
### See the MISS function for an example.
This may be used to summarize either new, unobserved instances of y
(called y[new]) or replicates of y (called y[rep]). Either y[new] or
y[rep] is summarized, depending on predict.pmc.
## S3 method for class 'pmc.ppc'
summary(object, Categorical, Rows, Discrep, d, Quiet, ...)
object |
An object of class pmc.ppc is required. |
Categorical |
Logical. If TRUE, then y and yhat are treated as
categorical rather than continuous. |
Rows |
An optional vector of row numbers, for example
c(1:10). All rows are estimated, but only these rows appear in the
summary. |
Discrep |
A character string indicating a discrepancy
test. Discrep defaults to NULL. Valid character strings are listed
below. |
d |
This is an optional integer to be used with the
Discrep argument above, and it defaults to d=0. |
Quiet |
This logical argument defaults to FALSE. When TRUE, results are
returned but not printed to the console. |
... |
Additional arguments are unused. |
This function summarizes an object of class pmc.ppc, which consists
of posterior predictive checks on either y[new] or y[rep], depending
respectively on whether unobserved instances of y or the model sample
of y was used in the predict.pmc function.
The purpose of a posterior predictive check is to assess how well (or poorly) the model fits the data, or to assess discrepancies between the model and the data. For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.
When y is continuous and known, this function estimates the
predictive concordance between y and yhat as per Gelfand (1996), and
the predictive quantile (PQ), which is used both for record-level
outlier detection and to calculate Gelfand's predictive concordance.
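For example, both quantities can be sketched from a matrix of replicates. This is an illustration, not the package's internal code; yhat is assumed to be an N x S matrix whose rows match the N records of y, and the predictive quantile is taken here as Pr(yhat[i,] >= y[i]).

## Hypothetical inputs: y (length N) and yhat (N x S matrix of replicates).
set.seed(1)
N <- 20; S <- 1000
y <- rnorm(N)
yhat <- matrix(rnorm(N*S, mean=y), N, S)

## Predictive quantile for each record.
PQ <- rowMeans(yhat >= y)

## Predictive concordance: proportion of records of y that fall inside the
## 95% quantile-based interval of their replicates.
LB <- apply(yhat, 1, quantile, probs=0.025)
UB <- apply(yhat, 1, quantile, probs=0.975)
mean(y >= LB & y <= UB)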
When y is categorical and known, this function estimates the
record-level lift, which is p(yhat[i,] = y[i]) / [p(y = j) / n], or
the number of correctly predicted samples over the rate of that
category of y in vector y.
A discrepancy measure is an approach to studying discrepancies between the model and data (Gelman et al., 1996). Below is a list of discrepancy measures, followed by a brief introduction to discrepancy analysis:
The "Chi-Square"
discrepancy measure is the chi-square
goodness-of-fit test that is recommended by Gelman. For each record
i=1:N, this returns (y[i] - E(y[i]))^2 / var(yhat[i,]).
The "Chi-Square2"
discrepancy measure returns the
following for each record: Pr(chisq.rep[i,] > chisq.obs[i,]), where
chisq.obs[i,] <- (y[i] - E(y[i]))^2 / E(y[i]), and chisq.rep[i,] <-
(yhat[i,] - E(yhat[i,]))^2 / E(yhat[i,]), and the overall
discrepancy is the percent of records that were outside of the 95%
quantile-based probability interval (see p.interval
).
The "Kurtosis"
discrepancy measure returns the kurtosis
of for each record, and the discrepancy
statistic is the mean for all records. This does not measure
discrepancies between the model and data, and is useful for finding
kurtotic replicate distributions.
The "L-criterion"
discrepancy measure of Laud and Ibrahim
(1995) provides the record-level combination of two components (see
below), and the discrepancy statistic is the sum, L
, as well as
a calibration number, S.L
. For more information on the
L-criterion, see the accompanying vignette entitled "Bayesian
Inference".
The "MASE"
(Mean Absolute Scaled Error) is a
discrepancy measure for the accuracy of time-series forecasts,
estimated as (|y - yhat|) / mean(abs(diff(y)))
. The discrepancy
statistic is the mean of the record-level values.
The "MSE"
(Mean Squared Error) discrepancy measure
provides the MSE for each record across all replicates, and the
discrepancy statistic is the mean of the record-level MSEs. MSE and
quadratic loss are identical.
The "PPL"
(Posterior Predictive Loss) discrepancy
measure of Gelfand and Ghosh (1998) provides the record-level
combination of two components: one involves the predictive variance
and the other includes the accuracy of the means of the predictive
distribution. The d=0
argument applies the following weight to
the accuracy component, which is then added to the variance component:
. For
, use
. For
and model comparison,
is
commonly set to 1, 10, or 100000. Larger values of
put more
stress on fit and downgrade the precision of the estimates.
The "Quadratic Loss"
discrepancy measure provides the
mean quadratic loss for each record across all replicates, and the
discrepancy statistic is the mean of the record-level mean quadratic
losses. Quadratic loss and MSE are identical, and quadratic loss is
the negative of quadratic utility.
The "Quadratic Utility"
discrepancy measure provides
the mean quadratic utility for each record across all replicates, and
the discrepancy statistic is the mean of the record-level mean
quadratic utilities. Quadratic utility is the negative of quadratic
loss.
The "RMSE"
(Root Mean Squared Error) discrepancy
measure provides the RMSE for each record across all replicates, and
the discrepancy statistic is the mean of the record-level RMSEs.
The "Skewness"
discrepancy measure returns the skewness
of for each record, and the discrepancy
statistic is the mean for all records. This does not measure
discrepancies between the model and data, and is useful for finding
skewed replicate distributions.
The "max(yhat[i,]) > max(y)"
discrepancy measure
returns a record-level indicator when a record's maximum
exceeds the maximum of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
replications that exceed the maximum of
.
The "mean(yhat[i,]) > mean(y)"
discrepancy measure
returns a record-level indicator when the mean of a record's
is greater than the mean of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
mean replications that exceed the mean of
.
The "mean(yhat[i,] > d)"
discrepancy measure returns a
record-level proportion of that
exceeds a specified value,
d
. The discrepancy statistic is the
mean of the record-level proportions.
The "mean(yhat[i,] > mean(y))"
discrepancy measure
returns a record-level proportion of
that exceeds the mean of
. The discrepancy statistic is the mean of the
record-level proportions.
The "min(yhat[i,]) < min(y)"
discrepancy measure
returns a record-level indicator when a record's minimum
is less than the minimum of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
replications less than the minimum of
.
The "round(yhat[i,]) = d"
discrepancy measure returns a
record-level proportion of that,
when rounded, is equal to a specified discrete value,
d
. The
discrepancy statistic is the mean of the record-level proportions.
The "sd(yhat[i,]) > sd(y)"
discrepancy measure returns a
record-level indicator when the standard deviation of replicates is
larger than the standard deviation of all of . The
discrepancy statistic is the mean of the record-level indicators,
reporting the proportion of records with larger standard deviations
than
.
The "p(yhat[i,] != y[i])"
discrepancy measure returns
the record-level probability that
is not equal to
. This is valid when
is categorical and
yhat
is the predicted
category. The probability is the proportion of replicates.
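For instance, the "Chi-Square" measure listed above can be computed directly from a matrix of replicates. The sketch below is illustrative rather than the package's internal code, and it approximates E(y[i]) by the mean of the replicates for record i.

## Hypothetical inputs: y (length N) and yhat (N x S matrix of replicates).
set.seed(1)
N <- 20; S <- 1000
y <- rpois(N, 10)
yhat <- matrix(rpois(N*S, 10), N, S)

## Record-level chi-square discrepancy: (y[i] - E(y[i]))^2 / var(yhat[i,]).
E.y <- rowMeans(yhat)
Chi.Square <- (y - E.y)^2 / apply(yhat, 1, var)
head(Chi.Square)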
After observing a discrepancy statistic, the user attempts to improve the model by revising the model to account for discrepancies between data and the current model. This approach to model revision relies on an analysis of the discrepancy statistic. Given a discrepancy measure that is based on model fit, such as the L-criterion, the user may correlate the record-level discrepancy statistics with the dependent variable, independent variables, and interactions of independent variables. The discrepancy statistic should not correlate with the dependent and independent variables. Interaction variables may be useful for exploring new relationships that are not in the current model. Alternatively, a decision tree may be applied to the record-level discrepancy statistics, given the independent variables, in an effort to find relationships in the data that may be helpful in the model. Model revision may involve the addition of a finite mixture component to account for outliers in discrepancy, or specifying the model with a distribution that is more robust to outliers. There are too many suggestions to include here, and discrepancy analysis varies by model.
This function returns a list with the following components:
BPIC |
The Bayesian Predictive Information Criterion (BPIC) was
introduced by Ando (2007). BPIC is a variation of the Deviance
Information Criterion (DIC) that has been modified for predictive
distributions. For more information on DIC (Spiegelhalter
et al., 2002), see the accompanying vignette entitled "Bayesian
Inference". |
Concordance |
This is the percentage of the records of y that are
within the 95% quantile-based probability interval (see
p.interval) of yhat. |
Mean Lift |
This is the mean of the record-level lifts, and
occurs only when y is specified as categorical. |
Discrepancy.Statistic |
This is only reported if the
Discrep argument is specified. |
L-criterion |
The L-criterion (Laud and Ibrahim, 1995) was
developed for model and variable selection. It is a sum of two
components: one involves the predictive variance and the other
includes the accuracy of the means of the predictive
distribution. The L-criterion measures model performance with a
combination of how close its predictions are to the observed data
and variability of the predictions. Better models have smaller
values of L. |
Summary |
This is a summary matrix of the posterior predictive distribution
for each record of y. |
Statisticat, LLC.
Ando, T. (2007). "Bayesian Predictive Information Criterion for the Evaluation of Hierarchical Bayesian and Empirical Bayes Models". Biometrika, 94(2), p. 443–458.
Gelfand, A. (1996). "Model Determination Using Sampling Based Methods". In Gilks, W., Richardson, S., and Spiegelhalter, D. (eds.), Markov Chain Monte Carlo in Practice, Chapter 9. Chapman and Hall: Boca Raton, FL.
Gelfand, A. and Ghosh, S. (1998). "Model Choice: A Minimum Posterior Predictive Loss Approach". Biometrika, 85, p. 1–11.
Gelman, A., Meng, X.L., and Stern H. (1996). "Posterior Predictive Assessment of Model Fitness via Realized Discrepancies". Statistica Sinica, 6, p. 733–807.
Laud, P.W. and Ibrahim, J.G. (1995). "Predictive Model Selection". Journal of the Royal Statistical Society, B 57, p. 247–262.
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. (2002). "Bayesian Measures of Model Complexity and Fit (with Discussion)". Journal of the Royal Statistical Society, B 64, p. 583–639.
PMC, predict.pmc, and p.interval.
### See the PMC function for an example.
This may be used to summarize either new, unobserved instances of y
(called y[new]) or replicates of y (called y[rep]). Either y[new] or
y[rep] is summarized, depending on predict.vb.
## S3 method for class 'vb.ppc'
summary(object, Categorical, Rows, Discrep, d, Quiet, ...)
object |
An object of class vb.ppc is required. |
Categorical |
Logical. If TRUE, then y and yhat are treated as
categorical rather than continuous. |
Rows |
An optional vector of row numbers, for example
c(1:10). All rows are estimated, but only these rows appear in the
summary. |
Discrep |
A character string indicating a discrepancy
test. Discrep defaults to NULL. Valid character strings are listed
below. |
d |
This is an optional integer to be used with the
Discrep argument above, and it defaults to d=0. |
Quiet |
This logical argument defaults to FALSE. When TRUE, results are
returned but not printed to the console. |
... |
Additional arguments are unused. |
This function summarizes an object of class vb.ppc, which consists of
posterior predictive checks on either y[new] or y[rep], depending
respectively on whether unobserved instances of y or the model sample
of y was used in the predict.vb function. The deviance and monitored
variables are also summarized.
The purpose of a posterior predictive check is to assess how well (or poorly) the model fits the data, or to assess discrepancies between the model and the data. For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.
When y is continuous and known, this function estimates the
predictive concordance between y and yhat as per Gelfand (1996), and
the predictive quantile (PQ), which is used both for record-level
outlier detection and to calculate Gelfand's predictive concordance.
When y is categorical and known, this function estimates the
record-level lift, which is p(yhat[i,] = y[i]) / [p(y = j) / n], or
the number of correctly predicted samples over the rate of that
category of y in vector y.
A discrepancy measure is an approach to studying discrepancies between the model and data (Gelman et al., 1996). Below is a list of discrepancy measures, followed by a brief introduction to discrepancy analysis:
The "Chi-Square"
discrepancy measure is the chi-square
goodness-of-fit test that is recommended by Gelman. For each record
i=1:N, this returns (y[i] - E(y[i]))^2 / var(yhat[i,]).
The "Chi-Square2"
discrepancy measure returns the
following for each record: Pr(chisq.rep[i,] > chisq.obs[i,]), where
chisq.obs[i,] <- (y[i] - E(y[i]))^2 / E(y[i]), and chisq.rep[i,] <-
(yhat[i,] - E(yhat[i,]))^2 / E(yhat[i,]), and the overall
discrepancy is the percent of records that were outside of the 95%
quantile-based probability interval (see p.interval
).
The "Kurtosis"
discrepancy measure returns the kurtosis
of for each record, and the discrepancy
statistic is the mean for all records. This does not measure
discrepancies between the model and data, and is useful for finding
kurtotic replicate distributions.
The "L-criterion"
discrepancy measure of Laud and Ibrahim
(1995) provides the record-level combination of two components (see
below), and the discrepancy statistic is the sum, L
, as well as
a calibration number, S.L
. For more information on the
L-criterion, see the accompanying vignette entitled "Bayesian
Inference".
The "MASE"
(Mean Absolute Scaled Error) is a
discrepancy measure for the accuracy of time-series forecasts,
estimated as (|y - yhat|) / mean(abs(diff(y)))
. The discrepancy
statistic is the mean of the record-level values.
The "MSE"
(Mean Squared Error) discrepancy measure
provides the MSE for each record across all replicates, and the
discrepancy statistic is the mean of the record-level MSEs. MSE and
quadratic loss are identical.
The "PPL"
(Posterior Predictive Loss) discrepancy
measure of Gelfand and Ghosh (1998) provides the record-level
combination of two components: one involves the predictive variance
and the other includes the accuracy of the means of the predictive
distribution. The d=0
argument applies the following weight to
the accuracy component, which is then added to the variance component:
. For
, use
. For
and model comparison,
is
commonly set to 1, 10, or 100000. Larger values of
put more
stress on fit and downgrade the precision of the estimates.
The "Quadratic Loss"
discrepancy measure provides the
mean quadratic loss for each record across all replicates, and the
discrepancy statistic is the mean of the record-level mean quadratic
losses. Quadratic loss and MSE are identical, and quadratic loss is
the negative of quadratic utility.
The "Quadratic Utility"
discrepancy measure provides
the mean quadratic utility for each record across all replicates, and
the discrepancy statistic is the mean of the record-level mean
quadratic utilities. Quadratic utility is the negative of quadratic
loss.
The "RMSE"
(Root Mean Squared Error) discrepancy
measure provides the RMSE for each record across all replicates, and
the discrepancy statistic is the mean of the record-level RMSEs.
The "Skewness"
discrepancy measure returns the skewness
of for each record, and the discrepancy
statistic is the mean for all records. This does not measure
discrepancies between the model and data, and is useful for finding
skewed replicate distributions.
The "max(yhat[i,]) > max(y)"
discrepancy measure
returns a record-level indicator when a record's maximum
exceeds the maximum of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
replications that exceed the maximum of
.
The "mean(yhat[i,]) > mean(y)"
discrepancy measure
returns a record-level indicator when the mean of a record's
is greater than the mean of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
mean replications that exceed the mean of
.
The "mean(yhat[i,] > d)"
discrepancy measure returns a
record-level proportion of that
exceeds a specified value,
d
. The discrepancy statistic is the
mean of the record-level proportions.
The "mean(yhat[i,] > mean(y))"
discrepancy measure
returns a record-level proportion of
that exceeds the mean of
. The discrepancy statistic is the mean of the
record-level proportions.
The "min(yhat[i,]) < min(y)"
discrepancy measure
returns a record-level indicator when a record's minimum
is less than the minimum of
. The discrepancy statistic is the mean of the
record-level indicators, reporting the proportion of records with
replications less than the minimum of
.
The "round(yhat[i,]) = d"
discrepancy measure returns a
record-level proportion of that,
when rounded, is equal to a specified discrete value,
d
. The
discrepancy statistic is the mean of the record-level proportions.
The "sd(yhat[i,]) > sd(y)"
discrepancy measure returns a
record-level indicator when the standard deviation of replicates is
larger than the standard deviation of all of . The
discrepancy statistic is the mean of the record-level indicators,
reporting the proportion of records with larger standard deviations
than
.
The "p(yhat[i,] != y[i])"
discrepancy measure returns
the record-level probability that
is not equal to
. This is valid when
is categorical and
yhat
is the predicted
category. The probability is the proportion of replicates.
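For instance, the "PPL" weighting and the simpler indicator and proportion measures above follow directly from a matrix of replicates. The sketch below is illustrative rather than the package's internal code.

## Hypothetical inputs: y (length N), yhat (N x S matrix of replicates),
## and the weight d (d=0 corresponds to y[rep]).
set.seed(1)
N <- 20; S <- 1000
y <- rnorm(N)
yhat <- matrix(rnorm(N*S, mean=y), N, S)
d <- 1

## "PPL": variance component plus the d/(d+1)-weighted accuracy component.
P <- apply(yhat, 1, var)
G <- (y - rowMeans(yhat))^2
mean(P + (d/(d+1))*G)

## "max(yhat[i,]) > max(y)": record-level indicators, then their mean.
mean(apply(yhat, 1, max) > max(y))

## "mean(yhat[i,] > d)": record-level proportions, then their mean.
mean(rowMeans(yhat > d))

## "sd(yhat[i,]) > sd(y)": record-level indicators, then their mean.
mean(apply(yhat, 1, sd) > sd(y))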
After observing a discrepancy statistic, the user attempts to improve the model by revising the model to account for discrepancies between data and the current model. This approach to model revision relies on an analysis of the discrepancy statistic. Given a discrepancy measure that is based on model fit, such as the L-criterion, the user may correlate the record-level discrepancy statistics with the dependent variable, independent variables, and interactions of independent variables. The discrepancy statistic should not correlate with the dependent and independent variables. Interaction variables may be useful for exploring new relationships that are not in the current model. Alternatively, a decision tree may be applied to the record-level discrepancy statistics, given the independent variables, in an effort to find relationships in the data that may be helpful in the model. Model revision may involve the addition of a finite mixture component to account for outliers in discrepancy, or specifying the model with a distribution that is more robust to outliers. There are too many suggestions to include here, and discrepancy analysis varies by model.
This function returns a list with the following components:
BPIC |
The Bayesian Predictive Information Criterion (BPIC) was
introduced by Ando (2007). BPIC is a variation of the Deviance
Information Criterion (DIC) that has been modified for predictive
distributions. For more information on DIC (Spiegelhalter
et al., 2002), see the accompanying vignette entitled "Bayesian
Inference". |
Concordance |
This is the percentage of the records of y that are
within the 95% quantile-based probability interval (see
p.interval) of yhat. |
Mean Lift |
This is the mean of the record-level lifts, and
occurs only when y is specified as categorical. |
Discrepancy.Statistic |
This is only reported if the
Discrep argument is specified. |
L-criterion |
The L-criterion (Laud and Ibrahim, 1995) was
developed for model and variable selection. It is a sum of two
components: one involves the predictive variance and the other
includes the accuracy of the means of the predictive
distribution. The L-criterion measures model performance with a
combination of how close its predictions are to the observed data
and variability of the predictions. Better models have smaller
values of L. |
Monitor |
This is a summary matrix of the posterior samples of the monitored
variables. |
Summary |
This is a summary matrix of the posterior predictive distribution
for each record of y. |
Statisticat, LLC.
Ando, T. (2007). "Bayesian Predictive Information Criterion for the Evaluation of Hierarchical Bayesian and Empirical Bayes Models". Biometrika, 94(2), p. 443–458.
Gelfand, A. (1996). "Model Determination Using Sampling Based Methods". In Gilks, W., Richardson, S., and Spiegelhalter, D. (eds.), Markov Chain Monte Carlo in Practice, Chapter 9. Chapman and Hall: Boca Raton, FL.
Gelfand, A. and Ghosh, S. (1998). "Model Choice: A Minimum Posterior Predictive Loss Approach". Biometrika, 85, p. 1–11.
Gelman, A., Meng, X.L., and Stern H. (1996). "Posterior Predictive Assessment of Model Fitness via Realized Discrepancies". Statistica Sinica, 6, p. 733–807.
Laud, P.W. and Ibrahim, J.G. (1995). "Predictive Model Selection". Journal of the Royal Statistical Society, B 57, p. 247–262.
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. (2002). "Bayesian Measures of Model Complexity and Fit (with Discussion)". Journal of the Royal Statistical Society, B 64, p. 583–639.
predict.vb, p.interval, and VariationalBayes.
### See the VariationalBayes function for an example.
This function reduces the number of posterior samples by retaining
every By-th sample.
Thin(x, By=1)
x |
This is a vector or matrix of posterior samples to be thinned. |
By |
This argument specifies that every By-th posterior sample is
retained. |
A thinned matrix of posterior samples is a matrix in which only every
By-th posterior sample (or row) in the original matrix is retained.
The act of thinning posterior samples has been criticized as throwing
away information, which is correct. However, it is common practice to
thin posterior samples, usually associated with MCMC such as
LaplacesDemon, for two reasons. First, each chain (column vector) in
a matrix of posterior samples probably has higher autocorrelation
than desired, which reduces the effective sample size (see ESS for
more information). Therefore, a thinned matrix usually contains
posterior samples that are closer to independent than an un-thinned
matrix. The other reason for the popularity of thinning is that a
user may not have the random-access memory (RAM) to store large,
un-thinned matrices of posterior samples.
LaplacesDemon and PMC automatically thin posterior samples, deviance
samples, and samples of monitored variables, according to their own
user-specified arguments. The Thin function is made available here,
should it be necessary to thin posterior samples outside of objects
of class demonoid or pmc.
The Thin function returns a thinned matrix. When x is a vector, the
returned object is a matrix with 1 column.
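Conceptually, thinning keeps only every By-th row. The sketch below shows the Thin function next to roughly equivalent base R indexing; the exact starting row kept by Thin may differ from this sketch.

library(LaplacesDemon)
x <- matrix(runif(200), 100, 2)
By <- 10

## Thin to every 10th posterior sample (row).
x.thinned <- Thin(x, By=By)

## Roughly equivalent base R indexing (a sketch of the idea only).
x.manual <- x[seq(from=By, to=nrow(x), by=By), , drop=FALSE]

## Thinning is often used to reduce autocorrelation per retained sample;
## effective sample size may be compared before and after with ESS.
#ESS(x); ESS(x.thinned)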
Statisticat, LLC. [email protected]
ESS, LaplacesDemon, and PMC.
library(LaplacesDemon)
x <- matrix(runif(100), 10, 10)
Thin(x, By=2)
This function performs holdout validation on an object of class
demonoid
or pmc
, given both a modeled and validation
data set.
Validate(object, Model, Data, plot=FALSE, PDF=FALSE)
object |
This is an object of class demonoid or pmc. |
Model |
This is a model specification function for
LaplacesDemon or PMC. |
Data |
This is a list that contains two lists of data, as
specified for LaplacesDemon: the list of modeled data and the list of
validation data. |
plot |
Logical. When plot=TRUE, the distributions of the modeled and
validation deviances are plotted, along with their difference. |
PDF |
Logical. When PDF=TRUE (and plot=TRUE), the plot is saved as a .pdf
file. |
There are numerous ways to validate a model. In this context,
validation means to assess the predictive performance of a model on
out-of-sample data. If reasonable, leave-one-out cross-validation
(LOOCV) via the conditional predictive ordinate (CPO) should be
considered when using LaplacesDemon
or
PMC
. For more information on CPO, see the accompanying
vignettes entitled "Bayesian Inference" and "Examples". CPO is
unavailable when using LaplaceApproximation
or
VariationalBayes
.
For LaplaceApproximation
or
VariationalBayes
, it is recommended that the user
perform holdout validation by comparing posterior predictive checks,
comparing the differences in the specified discrepancy measure.
When LOOCV is unreasonable, popular alternatives include k-fold
cross-validation and holdout validation. Although k-fold
cross-validation is not performed explicitly here, the user may
accomplish it with some effort. Of these methods, holdout validation
includes the most bias, but is the most common in applied use, since
only one model is fitted, rather than the k models of k-fold
cross-validation. The Validate function performs holdout validation.
For holdout validation, the observed data is sampled randomly into
two data sets of approximately equal size, or three data sets that
consist of two data sets of approximately equal size and a remainder
data set. Of the two data sets approximately equal in size, one is
called the modeled (or training) data set, and the other is called
the validation (or test) data set. The modeled data set is used when
updating the model. After the model is updated, both data sets are
predicted in the Validate function, given the model. Predictive loss
is estimated for the validation data set, relative to the modeled
data set.
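A minimal sketch of such a random split is shown below. The data and the data-list fields mirror the VariationalBayes example later in this document, and the element names MyData.M and MyData.V simply follow the example at the end of this help topic; a real Model function may require additional fields.

library(LaplacesDemon)
data(demonsnacks)
y <- log(demonsnacks$Calories)
X <- cbind(1, CenterScale(log(demonsnacks[,10]+1)))
J <- ncol(X)
parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0))

## Randomly assign roughly half of the records to the modeled (training)
## data set; the remaining records form the validation (test) data set.
set.seed(1)
idx <- sample(length(y), size=floor(length(y)/2))
MyData.M <- list(J=J, X=X[idx,], y=y[idx], mon.names="LP",
     parm.names=parm.names)
MyData.V <- list(J=J, X=X[-idx,], y=y[-idx], mon.names="LP",
     parm.names=parm.names)

## These two lists would then be supplied to Validate as
## Data=list(MyData.M=MyData.M, MyData.V=MyData.V).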
Predictive loss is associated with overfitting, differences between the model and validation data set, or model misspecification. Bayesian inference is reputed to be much more robust to overfitting than frequentist inference.
There are many ways to measure predictive loss, and within each approach, there are usually numerous possible loss functions. The log-likelihood of the model is a popular approximate utility function, and consequently, the deviance of the model is a popular loss function.
A vector of model-level (rather than record-level) deviance
samples is returned with each object of class demonoid
or
pmc
. The Validate
function obtains this vector for each
data set, and then calculates the Bayesian Predictive Information
Criterion (BPIC), as per Ando (2007). BPIC is a variation of the
Deviance Information Criterion (DIC) that has been modified for
predictive distributions. For more information on DIC (Spiegelhalter
et al., 2002), see the accompanying vignette entitled "Bayesian
Inference". The goal is to minimize BPIC.
When DIC is applied after the model, such as with a predictive
distribution, it is positively biased, or too small. The bias is due
to the same data being used both to construct the
posterior distributions and to evaluate pD, the penalty term for model
complexity. For example, for the validation data set y[new], BPIC is:
BPIC = Dbar + 2pD.
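Given a vector of model-level deviance samples for a data set, BPIC may be sketched as below, using half the variance of the deviance as the pD estimate (the pV-style estimate also used in the WAIC example later in this document); this is an illustration rather than necessarily the exact estimator used by Validate.

## Hypothetical vector of model-level deviance samples for one data set.
set.seed(1)
Dev <- rnorm(1000, mean=120, sd=5)

Dbar <- mean(Dev)     #Expected deviance
pD <- var(Dev)/2      #Effective number of parameters (pV-style estimate)
BPIC <- Dbar + 2*pD
BPIC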
When plot=TRUE
, the distributions of the modeled and validation
deviances are plotted above, and the lower plot is the modeled
deviance subtracted from the validation deviance. When positive, this
distribution of the change in deviance is the loss in predictive
deviance associated with moving from the modeled data set to the
validation data set.
After using the Validate
function, the user is encouraged to
perform posterior predictive checks on each data set via the
summary.demonoid.ppc
or summary.pmc.ppc
function.
This function returns a list with three components. The first two
components are also lists. Each list consists of y
,
yhat
, and Deviance
. The third component is a matrix that
reports the expected deviance, pD, and BPIC. The object is of class
demonoid.val
for LaplacesDemon
, or pmc.val
when associated with PMC
.
Statisticat, LLC. [email protected]
Ando, T. (2007). "Bayesian Predictive Information Criterion for the Evaluation of Hierarchical Bayesian and Empirical Bayes Models". Biometrika, 94(2), p. 443–458.
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. (2002). "Bayesian Measures of Model Complexity and Fit (with Discussion)". Journal of the Royal Statistical Society, B 64, p. 583–639.
LaplaceApproximation, LaplacesDemon, PMC, and VariationalBayes.
library(LaplacesDemon)
#Given an object called Fit of class demonoid, a Model specification,
#and a modeled data set (MyData.M) and validation data set (MyData.V):
#Validate(Fit, Model, Data=list(MyData.M=MyData.M, MyData.V=MyData.V))
The VariationalBayes
function is a numerical approximation
method for deterministically estimating the marginal posterior
distributions, target distributions, in a Bayesian model with
approximated distributions by minimizing the Kullback-Leibler
Divergence (KLD
) between the target and its
approximation.
VariationalBayes(Model, parm, Data, Covar=NULL, Interval=1.0E-6,
     Iterations=1000, Method="Salimans2", Samples=1000, sir=TRUE,
     Stop.Tolerance=1.0E-5, CPUs=1, Type="PSOCK")
Model |
This required argument receives the model from a
user-defined function. The user-defined function is where the model
is specified. |
parm |
This argument requires a vector of initial values equal in
length to the number of parameters. |
Data |
This required argument accepts a list of data. The list of
data must include mon.names and parm.names. |
Covar |
This argument defaults to NULL, but may otherwise accept a
covariance matrix of the parameters. |
Interval |
This argument receives an interval for estimating approximate gradients. The logarithm of the unnormalized joint posterior density of the Bayesian model is evaluated at the current parameter value, and again at the current parameter value plus this interval. |
Iterations |
This argument accepts an integer that determines the
number of iterations that VariationalBayes will attempt to maximize
the logarithm of the unnormalized joint posterior density, and it
defaults to 1000. |
Method |
This optional argument currently accepts only
"Salimans2", which uses the second algorithm of Salimans and Knowles
(2013). |
Samples |
This argument indicates the number of posterior samples
to be taken with sampling importance resampling via the
SIR function, which occurs only when sir=TRUE. |
sir |
This logical argument indicates whether or not Sampling
Importance Resampling (SIR) is conducted via the SIR function to
draw independent posterior samples. |
Stop.Tolerance |
This argument accepts any positive number and
defaults to 1.0E-5. Tolerance is calculated each iteration, and the
criterion varies by algorithm. The algorithm is considered to have
converged to the user-specified Stop.Tolerance when the tolerance is
less than or equal to it, and the algorithm terminates at the end of
the current iteration. |
CPUs |
This argument accepts an integer that specifies the number
of central processing units (CPUs) of the multicore computer or
computer cluster. This argument defaults to CPUs=1. |
Type |
This argument specifies the type of parallel processing to
perform, accepting either Type="PSOCK" or Type="MPI". |
Variational Bayes (VB) is a family of numerical approximation algorithms that is a subset of variational inference algorithms, or variational methods. Some examples of variational methods include the mean-field approximation, loopy belief propagation, tree-reweighted belief propagation, and expectation propagation (EP).
Variational inference for probabilistic models was introduced in the field of machine learning, influenced by statistical physics literature (Saul et al., 1996; Saul and Jordan, 1996; Jaakkola, 1997). The mean-field methods in Neal and Hinton (1999) led to variational algorithms.
Variational inference algorithms were later generalized for conjugate exponential-family models (Attias, 1999, 2000; Wiegerinck, 2000; Ghahramani and Beal, 2001; Xing et al., 2003). These algorithms still require different designs for different model forms. Salimans and Knowles (2013) introduced general-purpose VB algorithms for Gaussian posteriors.
A VB algorithm deterministically estimates the marginal posterior
distributions (target distributions) in a Bayesian model with
approximated distributions by minimizing the Kullback-Leibler
Divergence (KLD
) between the target and its
approximation. The complicated posterior distribution is approximated
with a simpler distribution. The simpler, approximated distribution is
called the variational approximation, or approximation distribution,
of the posterior. The term variational is derived from the calculus of
variations, and regards optimization algorithms that select the best
function (which is a distribution in VB), rather than merely selecting
the best parameters.
VB algorithms often use Gaussian distributions as approximating distributions. In this case, both the mean and variance of the parameters are estimated.
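The following toy sketch illustrates the general idea of fitting a Gaussian approximation by minimizing the KLD (equivalently, the negative ELBO, which differs from the KLD only by a constant) for a simple conjugate model. It is not the algorithm used by VariationalBayes, and all object names are illustrative.

## Toy model: y ~ N(theta, 1) with prior theta ~ N(0, 10^2), so the exact
## posterior is Gaussian and a Gaussian q(theta | mu, sigma) can match it.
set.seed(1)
y <- rnorm(30, mean=2, sd=1)
log.post <- function(theta)
     sum(dnorm(y, theta, 1, log=TRUE)) + dnorm(theta, 0, 10, log=TRUE)

## Minimize a Monte Carlo estimate of the negative ELBO over (mu, log sigma),
## holding the standard normal draws fixed so the objective is deterministic.
z <- rnorm(1000)
neg.elbo <- function(par) {
     mu <- par[1]; sigma <- exp(par[2])
     theta <- mu + sigma*z
     mean(dnorm(theta, mu, sigma, log=TRUE) - sapply(theta, log.post))
     }
fit <- optim(c(0, 0), neg.elbo)
c(mu=fit$par[1], sigma=exp(fit$par[2]))

## Compare with the exact conjugate posterior mean and standard deviation.
post.var <- 1/(length(y) + 1/100)
c(mu=post.var*sum(y), sigma=sqrt(post.var))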
Usually, a VB algorithm is slower to convergence than a Laplace Approximation algorithm, and faster to convergence than a Monte Carlo algorithm such as Markov chain Monte Carlo (MCMC). VB often provides solutions with comparable accuracy to MCMC in less time. Though Monte Carlo algorithms provide a numerical approximation to the exact posterior using a set of samples, VB provides a locally-optimal, exact analytical solution to an approximation of the posterior. VB is often more applicable than MCMC to big data or large-dimensional models.
Since VB is deterministic, it is asymptotic and subject to the same limitations with respect to sample size as Laplace Approximation. However, VB estimates more parameters than Laplace Approximation, such as when Laplace Approximation optimizes the posterior mode of a Gaussian distribution, while VB optimizes both the Gaussian mean and variance.
Traditionally, VB algorithms required customized equations. The
VariationalBayes
function uses general-purpose algorithms. A
general-purpose VB algorithm is less efficient than an algorithm
custom designed for the model form. However, a general-purpose
algorithm is applied consistently and easily to numerous model forms.
When Method="Salimans2"
, the second algorithm of Salimans and
Knowles (2013) is used. This requires the gradient and Hessian, which
is more efficient with a small number of parameters as long as the
posterior is twice differentiable. The step size is constant. This
algorithm is suitable for marginal posterior distributions that are
Gaussian and unimodal. A stochastic approximation algorithm is used
in the context of fixed-form VB, inspired by considering fixed-form VB
to be equivalent to performing a linear regression with the sufficient
statistics of the approximation as independent variables and the
unnormalized logarithm of the joint posterior density as the dependent
variable. The number of requested iterations should be large, since the
step-size decreases for larger requested iterations, and a small
step-size will eventually converge. A large number of requested
iterations results in a smaller step-size and better convergence
properties, so hope for early convergence. However convergence is
checked only in the last half of the iterations after the algorithm
begins to average the mean and variance from the samples of the
stochastic approximation. The history of stochastic samples is
returned.
VariationalBayes
returns an object of class vb
that is a list with the following components:
Call |
This is the matched call of VariationalBayes. |
Converged |
This is a logical indicator of whether or not
VariationalBayes converged within the specified Iterations according
to the supplied Stop.Tolerance criterion. |
Covar |
This is the estimated covariance matrix. The
Covar matrix may be scaled and input into the Covar argument of the
LaplacesDemon or PMC function for further updating. |
Deviance |
This is a vector of the iterative history of the
deviance in the VariationalBayes function, as it sought
convergence. |
History |
This is an array of the iterative history of the
parameters in the VariationalBayes function, as it sought
convergence. |
Initial.Values |
This is the vector of initial values that was
originally given to VariationalBayes in the parm argument. |
LML |
This is an approximation of the logarithm of the marginal
likelihood of the data (see the LML function for more
information). |
LP.Final |
This reports the final scalar value for the logarithm of the unnormalized joint posterior density. |
LP.Initial |
This reports the initial scalar value for the logarithm of the unnormalized joint posterior density. |
Minutes |
This is the number of minutes that
VariationalBayes was running. |
Monitor |
When sir=TRUE, this is a matrix of posterior samples of the
monitored variables, obtained with sampling importance resampling. |
Posterior |
When sir=TRUE, this is a matrix of posterior samples drawn with
sampling importance resampling. |
Step.Size.Final |
This is the final, scalar step size at the end of the
VariationalBayes algorithm. |
Step.Size.Initial |
This is the initial, scalar step size. |
Summary1 |
This is a summary matrix that summarizes the point-estimated posterior means and variances. Uncertainty around the posterior means is estimated from the estimated covariance matrix. Rows are parameters. The following columns are included: Mean, SD (Standard Deviation), LB (Lower Bound), and UB (Upper Bound). The bounds constitute a 95% probability interval. |
Summary2 |
This is a summary matrix that summarizes the
posterior samples drawn with sampling importance resampling
(SIR), and is reported only when sir=TRUE. Rows are parameters. |
Tolerance.Final |
This is the last Tolerance of the VariationalBayes algorithm. |
Tolerance.Stop |
This is the Stop.Tolerance criterion. |
Statisticat, LLC [email protected]
Attias, H. (1999). "Inferring Parameters and Structure of Latent Variable Models by Variational Bayes". In Uncertainty in Artificial Intelligence.
Attias, H. (2000). "A Variational Bayesian Framework for Graphical Models". In Neural Information Processing Systems.
Ghahramani, Z. and Beal, M. (2001). "Propagation Algorithms for Variational Bayesian Learning". In Neural Information Processing Systems, p. 507–513.
Jaakkola, T. (1997). "Variational Methods for Inference and Estimation in Graphical Models". PhD thesis, Massachusetts Institute of Technology.
Salimans, T. and Knowles, D.A. (2013). "Fixed-Form Variational Posterior Approximation through Stochastic Linear Regression". Bayesian Analysis, 8(4), p. 837–882.
Neal, R. and Hinton, G. (1999). "A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants". In Learning in Graphical Models, p. 355–368. MIT Press, 1999.
Saul, L. and Jordan, M. (1996). "Exploiting Tractable Substructures in Intractable Networks". Neural Information Processing Systems.
Saul, L., Jaakkola, T., and Jordan, M. (1996). "Mean Field Theory for Sigmoid Belief Networks". Journal of Artificial Intelligence Research, 4, p. 61–76.
Wiegerinck, W. (2000). "Variational Approximations Between Mean Field Theory and the Junction Tree Algorithm". In Uncertainty in Artificial Intelligence.
Xing, E., Jordan, M., and Russell, S. (2003). "A Generalized Mean Field Algorithm for Variational Inference in Exponential Families". In Uncertainty in Artificial Intelligence.
BayesFactor, IterativeQuadrature, LaplaceApproximation, LaplacesDemon, GIV, LML, PMC, and SIR.
# The accompanying Examples vignette is a compendium of examples.
#################### Load the LaplacesDemon Library #####################
library(LaplacesDemon)

############################## Demon Data ###############################
data(demonsnacks)
y <- log(demonsnacks$Calories)
X <- cbind(1, as.matrix(log(demonsnacks[,10]+1)))
J <- ncol(X)
for (j in 2:J) X[,j] <- CenterScale(X[,j])

######################### Data List Preparation #########################
mon.names <- "mu[1]"
parm.names <- as.parm.names(list(beta=rep(0,J), sigma=0))
pos.beta <- grep("beta", parm.names)
pos.sigma <- grep("sigma", parm.names)
PGF <- function(Data) {
     beta <- rnorm(Data$J)
     sigma <- runif(1)
     return(c(beta, sigma))
     }
MyData <- list(J=J, PGF=PGF, X=X, mon.names=mon.names,
     parm.names=parm.names, pos.beta=pos.beta, pos.sigma=pos.sigma, y=y)

########################## Model Specification ##########################
Model <- function(parm, Data)
     {
     ### Parameters
     beta <- parm[Data$pos.beta]
     sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf)
     parm[Data$pos.sigma] <- sigma
     ### Log-Prior
     beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE))
     sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE)
     ### Log-Likelihood
     mu <- tcrossprod(Data$X, t(beta))
     LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE))
     ### Log-Posterior
     LP <- LL + beta.prior + sigma.prior
     Modelout <- list(LP=LP, Dev=-2*LL, Monitor=mu[1],
          yhat=rnorm(length(mu), mu, sigma), parm=parm)
     return(Modelout)
     }

############################ Initial Values #############################
#Initial.Values <- GIV(Model, MyData, PGF=TRUE)
Initial.Values <- rep(0,J+1)

#Fit <- VariationalBayes(Model, Initial.Values, Data=MyData, Covar=NULL,
#     Iterations=1000, Method="Salimans2", Stop.Tolerance=1e-3, CPUs=1)
#Fit
#print(Fit)
#PosteriorChecks(Fit)
#caterpillar.plot(Fit, Parms="beta")
#plot(Fit, MyData, PDF=FALSE)
#Pred <- predict(Fit, Model, MyData, CPUs=1)
#summary(Pred, Discrep="Chi-Square")
#plot(Pred, Style="Covariates", Data=MyData)
#plot(Pred, Style="Density", Rows=1:9)
#plot(Pred, Style="Fitted")
#plot(Pred, Style="Jarque-Bera")
#plot(Pred, Style="Predictive Quantiles")
#plot(Pred, Style="Residual Density")
#plot(Pred, Style="Residuals")
#Levene.Test(Pred)
#Importance(Fit, Model, MyData, Discrep="Chi-Square")

#Fit$Covar is scaled (2.38^2/d) and submitted to LaplacesDemon as Covar.
#Fit$Summary[,1] is submitted to LaplacesDemon as Initial.Values.
#End
This function calculates the Widely Applicable Information Criterion (WAIC), also known as the Widely Available Information Criterion or the Watanabe-Akaike, of Watanabe (2010).
WAIC(x)
x |
This required argument accepts a N x S matrix of record-level
log-likelihoods, where N is the number of records and S is the number
of posterior samples. |
WAIC is an extension of the Akaike Information Criterion (AIC) that is more fully Bayesian than the Deviance Information Criterion (DIC).
Like DIC, WAIC estimates the effective number of parameters to adjust for overfitting. Two adjustments have been proposed. pWAIC1 is similar to pD in the original DIC. In contrast, pWAIC2 is calculated with variance more similarly to pV, which Gelman proposed for DIC. Gelman et al. (2014, p. 174) recommend pWAIC2 because its results are closer in practice to the results of leave-one-out cross-validation (LOO-CV). pWAIC is considered an approximation to the number of unconstrained and uninformed parameters, where a parameter counts as 1 when estimated without constraint or any prior information, 0 if fully constrained or if all information comes from the prior distribution, or an intermediate number if both the data and prior are informative, which is usually the case.
Gelman et al. (2014, p. 174) scale the WAIC of Watanabe (2010) by a
factor of 2 so that it is comparable to AIC and DIC. WAIC is then
reported as -2(lppd - pWAIC). Gelman et al. (2014) prefer WAIC to AIC
or DIC when feasible, which is less often than AIC or DIC.
The LaplacesDemon function requires the model specification function
to return the model-level deviance, which is -2*LL, where LL is the
sum of the record-level log-likelihoods. Therefore, if the user
desires to calculate WAIC, then the record-level log-likelihood must
be monitored.
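As a sketch of the calculation from an N x S matrix of record-level log-likelihoods (using the variance-based pWAIC2 described above; the WAIC function itself should be preferred in practice):

## Hypothetical N x S matrix of record-level log-likelihoods.
set.seed(1)
N <- 10; S <- 1000
LL <- matrix(rnorm(N*S, mean=-7, sd=0.5), N, S)

lppd <- sum(log(rowMeans(exp(LL))))   #Log pointwise predictive density
pWAIC2 <- sum(apply(LL, 1, var))      #Effective number of parameters
-2*(lppd - pWAIC2)                    #WAIC

## Compare with the package function:
#library(LaplacesDemon)
#WAIC(LL)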
The WAIC function returns a list with four components:
WAIC |
This is the Widely Applicable Information Criterion
(WAIC), which is -2(lppd - pWAIC). |
lppd |
This is the log pointwise predictive density. For more information, see Gelman et al. (2014, p. 168). |
pWAIC |
This is the effective number of parameters preferred by Gelman et al. (2014). |
pWAIC1 |
This is the effective number of parameters, is
calculated with an alternate method, and is included here for
completeness. It is not used to calculate WAIC in the output. |
Statisticat, LLC. [email protected]
Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., and Rubin, D.B. (2014). "Bayesian Data Analysis, 3rd ed.". CRC Press: Boca Raton, FL.
Watanabe, S. (2010). "Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information Criterion in Singular Learning Theory". Journal of Machine Learning Research, 11, p. 3571–3594.
#library(LaplacesDemon)
#N <- 10 #Number of records
#S <- 1000 #Number of samples
#LL <- t(rmvn(S, -70+rnorm(N),
#     as.positive.definite(matrix(rnorm(N*N),N,N))))
#WAIC(LL)
### Compare with DIC:
#Dev <- -2*colSums(LL)
#DIC <- list(DIC=mean(Dev) + var(Dev)/2, Dbar=mean(Dev), pV=var(Dev)/2)
#DIC