# The random component of a Generalized Linear Model (GLM) consists of a response variable Y with…

4. (25pts) The random component of a Generalized Linear Model (GLM) consists of a response variable $Y$ with independent observations $(y_1, \dots, y_n)$ from a distribution in the exponential dispersion family. This family has a probability density function or mass function of the form

$$p(y_i; \theta_i, \phi) = \exp\left\{ \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi) \right\} \tag{1}$$

where the parameter $\theta_i$ is called the natural parameter and $\phi$ is called the dispersion parameter. Let $x_{ij}$ denote the value of predictor $j$ ($j = 1, \dots, p$) for subject $i$, and $x_{i0} = 1$. Let $\mu_i$ denote the mean of $Y_i$ given the $x_{ij}$'s. The systematic component of a GLM relates $\mu_i$ to the explanatory variables through a linear model and a link function $g(\cdot)$:

$$g(\mu_i) = \sum_{j=0}^{p} \beta_j x_{ij}, \quad i = 1, \dots, n.$$
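To make form (1) concrete, here is a minimal sketch that evaluates the exponential dispersion density for a Poisson response and compares it with the usual Poisson mass function. The Poisson case is chosen purely as an illustration and is not part of the question; the function names (`edf_density`, `poisson_as_edf`, `poisson_pmf`) are hypothetical, and the mapping $\theta = \log\mu$, $b(\theta) = e^{\theta}$, $a(\phi) = 1$, $c(y, \phi) = -\log y!$ is the standard one.

```python
import math


def edf_density(y, theta, phi, a, b, c):
    """Evaluate exp{(y*theta - b(theta)) / a(phi) + c(y, phi)}, i.e. form (1)."""
    return math.exp((y * theta - b(theta)) / a(phi) + c(y, phi))


def poisson_pmf(y, mu):
    """Poisson mass function written in its usual form."""
    return mu ** y * math.exp(-mu) / math.factorial(y)


def poisson_as_edf(y, mu):
    """Poisson mass function written as a member of family (1)."""
    return edf_density(
        y,
        theta=math.log(mu),                     # natural parameter
        phi=1.0,                                # dispersion is fixed at 1
        a=lambda phi: 1.0,                      # a(phi) = 1
        b=lambda theta: math.exp(theta),        # b(theta) = exp(theta) = mu
        c=lambda y, phi: -math.lgamma(y + 1),   # c(y, phi) = -log(y!)
    )


if __name__ == "__main__":
    for y in range(5):
        print(y, poisson_pmf(y, 2.5), poisson_as_edf(y, 2.5))
```

The two columns printed for each `y` should agree up to floating-point rounding, which is what membership in the family means in practice.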

(13) Show that the Bernoulli distribution for a binary response variable $Y$, $p(y_i; \pi_i) = \pi_i^{y_i} (1 - \pi_i)^{1 - y_i}$, $y_i = 0, 1$, is a member of the exponential family (1). Clearly specify what the natural parameter $\theta_i$ is, along with the specification for $a(\phi)$, $b(\theta_i)$ and $c(y_i, \phi)$.

(14) The link function $g$ for which $g(\mu_i) = \theta_i$ is called the canonical link. For the binary response $Y$, the GLM with the canonical link is known as the logistic regression model. Find the canonical link function $g$ and write down the log likelihood function $L(\beta)$ for the logistic model, where $\beta = (\beta_0, \beta_1, \dots, \beta_p)^T$.

(15) Let $x = (1, x_1, \dots, x_p)$, and let $\hat{\beta}$ be the MLE of $\beta$. Let $\hat{P}[Y = 1 \mid x]$ denote the fitted probability as a function of $x$ (computed by plugging in $\hat{\beta}$). Show that

$$\sum_{i=1}^{n} \hat{P}[Y_i = 1 \mid x_i] = \sum_{i=1}^{n} y_i.$$
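The identity in (15) can be checked numerically before proving it. The sketch below fits a logistic regression with an intercept and one predictor by Newton-Raphson in pure Python, then compares the sum of fitted probabilities with the number of observed successes. The data `XS`, `YS` and the helper names are made up for illustration; the check relies on the model containing an intercept, since the identity is the score equation for $\beta_0$.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iterations=25):
    """Newton-Raphson MLE for logit P(Y=1|x) = b0 + b1*x."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        ps = [sigmoid(b0 + b1 * x) for x in xs]
        # Score vector: derivatives of the log likelihood w.r.t. b0 and b1.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Fisher information matrix [[h00, h01], [h01, h11]].
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} * score (2x2 inverse by hand).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def fitted_probabilities(xs, b0, b1):
    return [sigmoid(b0 + b1 * x) for x in xs]


# Hypothetical, non-separable toy data so that the MLE exists.
XS = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
YS = [0, 0, 1, 0, 1, 0, 1, 1]

if __name__ == "__main__":
    b0, b1 = fit_logistic(XS, YS)
    print(sum(fitted_probabilities(XS, b0, b1)), sum(YS))
```

The two printed numbers should match up to numerical tolerance. Dropping the intercept from the model would generally break the equality.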

(16) Consider the linear combination of the predictors $\eta_i = \sum_j \alpha_j x_{ij}$. Given $Y_i = y$, suppose that $\eta_i$ has a $N(m_y, \sigma^2)$ distribution, $y = 0, 1$. Let $\pi = P(Y_i = 1)$. Show that the relationship between the $X_j$'s and $Y$ follows a logistic regression model. Clearly specify the formulas for the regression coefficients $\beta_j$'s in terms of $m_0$, $m_1$, $\sigma^2$, $\pi$ and the $\alpha_j$'s.

(17) For simplicity, let's restrict our attention to one single predictor $X$ (i.e., $p = 1$). Given $Y_i = y$, suppose that $X_i$ has a $N(m_y, \sigma_y^2)$ distribution, $y = 0, 1$. Note that $X \mid Y = 1$ and $X \mid Y = 0$ have different variances now. Can you still use a logistic regression model to relate $X$ to $Y$? If yes, clearly specify the model and formulas for the regression coefficients in terms of $m_0$, $m_1$, $\sigma_0^2$, $\sigma_1^2$ and $\pi$.
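For (16) and (17), a candidate answer can be sanity-checked by comparing the posterior probability from Bayes' rule with a logistic function of the predictor. The sketch below assumes a single scalar predictor and that the log odds take the form $\beta_0 + \beta_1 x + \beta_2 x^2$, with coefficients obtained by expanding the log ratio of the two normal densities. The function names are hypothetical, and the coefficient formulas are one way to write the result, not necessarily the form the course expects.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_bayes(x, m0, m1, var0, var1, pi):
    """P(Y = 1 | x) from Bayes' rule with X | Y=y ~ N(m_y, var_y)."""
    num = pi * normal_pdf(x, m1, var1)
    return num / (num + (1.0 - pi) * normal_pdf(x, m0, var0))


def logit_coefficients(m0, m1, var0, var1, pi):
    """Coefficients of logit P(Y=1|x) = b0 + b1*x + b2*x^2 (assumed form)."""
    b0 = (math.log(pi / (1.0 - pi)) + 0.5 * math.log(var0 / var1)
          + m0 ** 2 / (2.0 * var0) - m1 ** 2 / (2.0 * var1))
    b1 = m1 / var1 - m0 / var0
    b2 = 1.0 / (2.0 * var0) - 1.0 / (2.0 * var1)  # zero when the variances are equal
    return b0, b1, b2


def posterior_logistic(x, m0, m1, var0, var1, pi):
    b0, b1, b2 = logit_coefficients(m0, m1, var0, var1, pi)
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x + b2 * x * x)))


if __name__ == "__main__":
    # Equal variances, as in (16): the quadratic term vanishes.
    print(logit_coefficients(0.0, 1.5, 2.0, 2.0, 0.3))
    # Unequal variances, as in (17): a quadratic term in x appears.
    print(logit_coefficients(0.0, 1.5, 1.0, 3.0, 0.3))
```

With equal variances the model is linear in $x$ on the logit scale; with unequal variances the same logistic form still holds under this sketch, provided $x^2$ is included as an additional term.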