For known variance V:

x  = vector (length n) of observations
m  = (unknown) mean
V  = variance
m0 = prior mean on m
V0 = prior variance on m

p(m) ~ MVN(m0,V0)
p(x|m,V) ~ MVN(m,V)

Let Vinv = V^(-1), V0inv = V0^(-1)

mp = posterior mean on m
Vp = posterior variance on m

p(m|x,V) ~ MVN(mp,Vp)
Vp = (V0inv + Vinv)^(-1)
mp = Vp (V0inv m0 + Vinv x)

Marginally
p(x) ~ MVN(m0,V0+V)

# Bayesian regression
y = vector of n phenotypes
x = matrix of n by L genotypes
b = vector of L coefficients
ssqE = error variance
b0 = prior mean coefficient
ssqB = prior variance on effect sizes

# A priori
b_i ~ N(b0,ssqB)
b ~ MVN(b0 1_L, ssqB I_L)
ssqE ~ Improper Log-Uniform

# Likelihood
yhat = x b
y_i | b,ssqE ~ N(yhat_i, ssqE)
             ~ N(x_i b, ssqE)
y | b,ssqE ~ MVN(x b, ssqE I_n)

# Marginal likelihood
y_i | ssqE ~ N(x_i b0,

#####################################

ssq ~ InvGamma(a0,b0) [a0 = b0 = 0 gives the improper prior p(ssq) = 1/ssq]
b ~ MVN(m0,ssq L0inv)
y | b,ssq ~ MVN(x b, ssq I_n)

Here b0 is the InvGamma scale parameter, not the prior mean coefficient used above; the prior mean is now m0, and L0 = inv(L0inv) is the prior precision matrix.

--> Marginal likelihood
Pr(y) = (2*pi)^(-n/2) sqrt(det(L0)*det(Ln)) b0^a0/bn^an GAMMA(an)/GAMMA(a0)
where:
  Ln = inv(t(x) x + L0)
  mn = Ln (L0 m0 + t(x) x bhat)
     = Ln (L0 m0 + t(x) y)
  an = a0 + n/2
  bn = b0 + 0.5*(t(y) y + t(m0) L0 m0 - t(mn) inv(Ln) mn)
  bhat = inv(t(x) x) t(x) y is the least-squares estimate, so t(x) x bhat = t(x) y

Because Ln is defined as an inverse (a posterior covariance up to ssq), the quadratic form in bn uses inv(Ln) and the determinant ratio is det(L0)*det(Ln).

Simplifying [ridge regression]:
  a0 = b0 = 0
  m0 = 0_L
  L0 = l0 I_L
then:
  Ln = inv(t(x) x + l0 I_L)
  mn = Ln t(x) y
  an = n/2
  bn = 0.5*(t(y) y - t(mn) inv(Ln) mn)
     = 0.5*(t(y) y - t(Ln t(x) y) inv(Ln) Ln t(x) y)
     = 0.5*(t(y) y - t(y) x t(Ln) t(x) y)
     = 0.5*(t(y) [I_n - x Ln t(x)] y)        [Ln is symmetric]

--> Marginal likelihood
Pr(y) = (2*pi)^(-n/2) l0^(L/2) sqrt(det(Ln)) b0^a0/bn^an GAMMA(an)/GAMMA(a0)

With a0 = b0 = 0 the factor b0^a0/GAMMA(a0) is undefined, so Pr(y) is known only up to a normalizing constant.

# O'Hagan and Forster p.318
f(y) = sqrt(det(Vstar)*a^d)*GAMMA(dstar/2)/
       sqrt(pi^n*det(V)*astar^dstar)/GAMMA(d/2)
where
  X = the n by L genotype matrix (x above)
  Vstar = inv(inv(V) + t(X)*X)
  mstar = Vstar * (inv(V)*m + t(X)*y)
  astar = a + t(m)*inv(V)*m + t(y)*y - t(mstar)*inv(Vstar)*mstar
  dstar = d + n

Suppose
  m = 0, V = 1/c*I_L (ridge regression, p. 326)
then
  Vstar = inv(c*I + t(X)*X)
  mstar = Vstar * t(X) * y
Further,
  a = 0, d = 0
gives the improper prior on sigma^2
  f(sigma^2) = (sigma^2)^(-1)
so
  astar = t(y)*y - t(mstar)*inv(Vstar)*mstar
  dstar = n

Note that mstar does not feature directly in f(y); it enters only through astar.
With the improper prior, f(y) can only be computed up to a normalizing constant:
  g(y) = sqrt(det(Vstar)) / sqrt(pi^n*det(V)*astar^n)
  astar = t(y)*y - t(mstar) * inv(Vstar) * Vstar * t(X) * y
        = t(y)*y - t(mstar) * t(X) * y
        = t(y)*y - t(y) * X * t(Vstar) * t(X) * y
        = t(y)*y - t(y) * X * Vstar * t(X) * y
        = t(y) * [I_n - X * Vstar * t(X)] * y
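
A minimal numerical sketch of the ridge case above (m = 0, V = 1/c*I_L, a = d = 0), written in plain Python so it needs no extra libraries. It computes Vstar, mstar, astar and log g(y) exactly as defined in these notes. The function names (transpose, matmul, inverse_and_logdet, ridge_log_g) and the toy data are assumptions made for illustration, not part of O'Hagan and Forster. The small Gauss-Jordan routine assumes a symmetric positive-definite input, which holds for c*I + t(X)*X when c > 0.

```python
import math

def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def inverse_and_logdet(A):
    """Gauss-Jordan inverse and log-determinant.

    Assumes A is symmetric positive definite, so every pivot is
    positive and no row swaps are needed.
    """
    k = len(A)
    # Augmented matrix [A | I]
    M = [[float(v) for v in A[i]] + [1.0 if i == j else 0.0 for j in range(k)]
         for i in range(k)]
    logdet = 0.0
    for i in range(k):
        pivot = M[i][i]
        logdet += math.log(pivot)          # det(A) is the product of pivots
        M[i] = [v / pivot for v in M[i]]
        for r in range(k):
            if r != i:
                f = M[r][i]
                M[r] = [vr - f * vi for vr, vi in zip(M[r], M[i])]
    return [row[k:] for row in M], logdet

def ridge_log_g(X, y, c):
    """Return (log g(y), mstar, astar) for m = 0, V = 1/c*I_L, a = d = 0."""
    n, L = len(X), len(X[0])
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    # inv(Vstar) = c*I + t(X)*X
    A = [[XtX[i][j] + (c if i == j else 0.0) for j in range(L)] for i in range(L)]
    Vstar, logdet_A = inverse_and_logdet(A)
    Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    mstar = [sum(v * b for v, b in zip(row, Xty)) for row in Vstar]
    # astar = t(y)*y - t(mstar)*t(X)*y
    astar = sum(v * v for v in y) - sum(m * b for m, b in zip(mstar, Xty))
    log_det_Vstar = -logdet_A              # det(Vstar) = 1/det(A)
    log_det_V = -L * math.log(c)           # det(1/c*I_L) = c^(-L)
    log_g = (0.5 * (log_det_Vstar - log_det_V)
             - 0.5 * n * (math.log(math.pi) + math.log(astar)))
    return log_g, mstar, astar

# Toy data (hypothetical): n = 3 observations, L = 2 predictors
X_demo = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_demo = [1.0, 2.0, 3.0]
log_g, mstar, astar = ridge_log_g(X_demo, y_demo, c=1.0)
print("mstar =", mstar, "astar =", astar, "log g(y) =", log_g)
```

Working on the log scale avoids underflow in astar^n for large n. Since g(y) is only defined up to a constant, the values are meaningful for comparing choices of c (or models) on the same y, not as absolute probabilities.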