Flexible Regression and Smoothing

Size: px
Start display at page:

Download "Flexible Regression and Smoothing"

Transcription

1 Flexible Regression and Smoothing Mixed Distributions Bob Rigby Mikis Stasinopoulos Dipartimento di Scienze Cliniche a de Comunita, Universita degli studi di Milano, Italy, January 2018 Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

2 Outline 1 Mixed distributions Zero adjusted distributions: on zero and the positive real line [0, ) Inflated distributions at 0 and 1 Data analysis: Lung function data 2 Finite mixtures Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

3 Mixed distributions Mixed distributions A mixed distribution is a mixture of two components: a continuous distribution and a discrete distribution i.e. it is a continuous distribution where the range of Y also includes discrete values with non-zero probabilities. Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

4 Mixed distributions Mixed distributions (a) ZAGA (b) BEINF density function(y) dbeinf(y, mu = mu, sigma = sigma, nu = nu, tau = tau) y x Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

5 Mixed distributions Zero adjusted distributions: on zero and the positive real line [0, ) Zero adjusted distributions: on zero and the positive real line [0, ) They are a mixture of a discrete value 0 with probability p, and a continuous distribution on the positive real line (0, ) with probability (1 p). The probability (density) function of Y is f Y (y) given by f Y (y) = { p if y = 0 (1 p)f W (y) if y > 0 (1) for 0 y <, where 0 < p < 1 and f W (y) is a probability density function defined on (0, ), i.e. for 0 < y <. Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

6 Mixed distributions Zero adjusted distributions: on zero and the positive real line [0, ) Zero adjusted distributions on zero and the positive real line [0, ) Appropriate when Y = 0 has non-zero probability and otherwise Y > 0. For example when Y measures the amount of rainfall in a day (where some days have zero rainfall), the river flow at a specific time each day (where some days the river flow is zero), the total amount of insurance claims in a year for individuals (where some people do not claim at all and therefore their total claim is zero). Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

7 Mixed distributions Zero adjusted distributions: on zero and the positive real line [0, ) Zero adjusted gamma distribution, ZAGA(µ,σ,ν) The zero adjusted gamma distribution is a mixture of a discrete value 0 with probability ν, and a gamma GA(µ, σ) distribution on the positive real line (0, ) with probability (1 ν). Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

8 Mixed distributions Zero adjusted distributions: on zero and the positive real line [0, ) Zero adjusted gamma distribution, ZAGA(µ,σ,ν) The probability (density) function of the zero adjusted gamma distribution, denoted by ZAGA(µ,σ,ν), is given by f Y (y µ, σ, ν) = { ν if y = 0 (1 ν)f W (y µ, σ) if y > 0 (2) for 0 y <, where µ > 0 and σ > 0 and 0 < ν < 1, and W GA(µ, σ) has a gamma distribution. Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

9 Mixed distributions Zero adjusted gamma distribution Zero adjusted distributions: on zero and the positive real line [0, ) Zero adjusted GA pdf y Zero adjusted Gamma c.d.f. F(y) y Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

10 Mixed distributions Zero adjusted distributions: on zero and the positive real line [0, ) Zero adjusted gamma distribution, ZAGA(µ,σ,ν) The default link functions relating the parameters (µ, σ, ν) to the predictors (η 1, η 2, η 3 ), which may depend on explanatory variables, are log µ = η 1 log σ = η 2 ( ) ν log = η 3. 1 ν Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

11 Mixed distributions Zero adjusted distributions: on zero and the positive real line [0, ) Zero adjusted gamma distribution, ZAGA(µ,σ,ν) The ZAGA model is equivalent to a gamma distribution GA(µ, σ) model for Y > 0 together with a binary model for recoded variable Y 1 iven by Y 1 = { 0 if Y > 0 1 if Y = 0 (3) i.e. p(y 1 = y 1 ) = { (1 ν) if y1 = 0 ν if y 1 = 1 (4) Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

12 Mixed distributions Zero adjusted distributions Zero adjusted distributions: on zero and the positive real line [0, ) package gamlss.dist: ZAGA() and ZAIG() distributions package gamlss.inf any 0 to distribution any (, ) distribution which been exponentiated i.e gen.family(..., "log ) truncated i.e gen.trun(0,"sst",type="left") See vignette: Zero adjusted distributions on the positive real line Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

13 Mixed distributions Zero adjusted distributions Zero adjusted distributions: on zero and the positive real line [0, ) The GAMLSS model changes to Y ind D(µ, σ, ν, τ, ξ 0 ) η 1 = g 1 (µ) = X 1 β 1 + s 11 (x 11 ) s 1J1 (x 1J1 ) η 2 = g 2 (σ) = X 2 β 2 + s 21 (x 21 ) s 2J2 (x 2J2 ) η 3 = g 3 (ν) = X 3 β 3 + s 31 (x 31 ) s 3J3 (x 3J3 ) η 4 = g 4 (τ ) = X 4 β 4 + s 41 (x 41 ) s 4J4 (x 4J4 ) η 5 = g 5 (ξ 0 ) = X 5 β 5 + s 51 (x 51 ) s 5J5 (x 5J5 ) Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

14 Mixed distributions Inflated distributions at 0 and 1 Inflated distributions at 0 and 1 These distributions are appropriate when the response variable Y takes values from 0 to 1 including 0 and 1, i.e. range [0,1]. They are a mixture of three components: a discrete value 0 with probability p 0, a discrete value 1 with probability p 1, and a continuous distribution on the unit interval (0, 1) with probability (1 p 0 p 1 ). Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

15 Mixed distributions Inflated distributions at 0 and 1 Inflated distributions at 0 and 1 The probability (density) function of Y is f Y (y) given by p 0 if y = 0 f Y (y) = (1 p 0 p 1 )f W (y) if 0 < y < 1 p 1 if y = 1 (5) or 0 y 1, where 0 < p 0 < 1, 0 < p 1 < 1 and 0 < p 0 + p 1 < 1 and f W (y) is a probability density function defined on (0, 1), i.e. for 0 < y < 1. Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

16 Beta Inflated distribution Mixed distributions Inflated distributions at 0 and 1 probability density function f(y) x cumulative distribution function F(y) y Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

17 Mixed distributions Inflated distributions at 0 and 1 Beta inflated distribution, BEINF(µ,σ,ν,τ) The probability (density) function of the beta inflated distribution, denoted by BEINF(µ,σ,ν,τ), is defined by p 0 if y = 0 f Y (y µ, σ, ν, τ) = (1 p 0 p 1 )f W (y µ, σ) if 0 < y < 1 p 1 if y = 1 (6) for 0 y 1, where W BE(µ, σ) has a beta distribution, ν = p 0 /p 2 and τ = p 1 /p 2 where p 2 = 1 p 0 p 1. Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

18 BEINF: demo Mixed distributions Inflated distributions at 0 and 1 1 demo.beinf() Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

19 Mixed distributions Inflated distributions at 0 and 1 The default link functions relating the parameters (µ, σ, ν, τ) to the predictors (η 1, η 2, η 3, η 4 ), which may depend on explanatory variables, are ( ) µ log = η 1 1 µ ( ) σ log = η 2 1 σ ( ) p0 log ν = log = η 3 log τ = log. p 2 ( p1 p 2 ) = η 4 Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

20 Mixed distributions Inflated distributions at 0 and 1 The model is equivalent to a beta distribution BE(µ, σ) model for 0 < Y < 1 a multinomial model MN3(ν, τ) with three levels for recoded variable Y 1 given by i.e. 0 if Y = 0 Y 1 = 1 if Y = 1 2 if 0 < Y < 1 p 0 if y 1 = 0 p(y 1 = y 1 ) = p 1 if y 1 = 1 1 p 0 p 1 if y 1 = 2 (7) (8) where ν = p 0 /p 2 and τ = p 1 /p 2 where p 2 = 1 p 0 p 1 Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

21 Mixed distributions Inflated distributions at 0 and 1 Beta inflated at 0 distribution, BEINF0(µ,σ,ν) The probability (density) function of the beta inflated at 0 distribution, denoted by BEINF0(µ,σ,ν), is given by f Y (y µ, σ, ν) = { p0 if y = 0 (1 p 0 )f W (y µ, σ) if 0 < y < 1 (9) for 0 y < 1, where W BE(µ, σ) has a beta distribution where ν = p 0 /(1 p 0 ). Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

22 Mixed distributions Inflated distributions at 0 and 1 Beta inflated at 1 distribution, BEINF1(µ,σ,ν) The probability (density) function of the beta inflated at 1 distribution, denoted by BEINF1(µ,σ,ν), is given by f Y (y µ, σ, ν) = { p1 if y = 1 (1 p 1 )f W (y µ, σ) if 0 < y < 1 (10) for 0 < y 1, where W BE(µ, σ) has a beta distribution where ν = p 1 /(1 p 1 ). Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

23 Inflated distributions Mixed distributions Inflated distributions at 0 and 1 package gamlss.dist: BEINF() and BEINF0(), BEZI(), BEINF1() BEOI() distributions package gamlss.inf any (, ) distribution which been transformed, i.e gen.family(..., "logit ) truncated i.e gen.trun(c(0,1),"sst",type="both") See vignette: Inflated distributions on the interval [0,1] Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

24 Inflated distributions Mixed distributions Inflated distributions at 0 and 1 Y ind D(µ, σ, ν, τ, ξ 0, ξ 1 ) η 1 = g 1 (µ) = X 1 β 1 + s 11 (x 11 ) s 1J1 (x 1J1 ) η 2 = g 2 (σ) = X 2 β 2 + s 21 (x 21 ) s 2J2 (x 2J2 ) η 3 = g 3 (ν) = X 3 β 3 + s 31 (x 31 ) s 3J3 (x 3J3 ) η 4 = g 4 (τ ) = X 4 β 4 + s 41 (x 41 ) s 4J4 (x 4J4 ) η 5 = g 5 (ξ 0 ) = X 5 β 5 + s 51 (x 51 ) s 5J5 (x 5J5 ) η 6 = g 6 (ξ 1 ) = X 6 β 6 + s 61 (x 61 ) s 6J6 (x 6J6 ) Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

25 Mixed distributions Data analysis: Lung function data 3164 male observations of lung function data FEV1/FVC height Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

26 Mixed distributions The lung function data Data analysis: Lung function data Y = FEV 1 /FVC : the Spirometric lung function an established index for diagnosing airway obstruction (3164 male) age : the height in cm Source: Stanojevic et al Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

27 Mixed distributions Lung function data: distribution Data analysis: Lung function data A logitsst distribution inflated at 1 was used. The probability (density) function of Y is f Y (y) given by f Y (y µ, σ, ν, τ, p) = { p if y = 1 (1 p)f W (y µ, σ, ν, τ) if 0 < y < 1 (11) for 0 < y 1, where W logitsst (µ, σ, ν, τ) has a logitsst distribution with < µ < and σ > 0, ν > 0, τ > 0 and where 0 < p < 1. Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

28 Mixed distributions Lung function data: link functions Data analysis: Lung function data The default link functions relate the parameters (µ, σ, ν, τ, p) to the predictors (η 1, η 2, η 3, η 4, η 5 ), which are modelled as smooth functions of lht = log(height), i.e. µ = η 1 = s(lht) log σ = η 2 = s(lht) log ν = η 3 = s(lht). log τ = η 4 = s(lht) ( ) p log = η 5 = s(lht) 1 p Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

29 Mixed distributions Data analysis: Lung function data The lung function data: fitted centile curves (a) LMS (b) BEINF1 FEV1/FVC FEV1/FVC height height (c) Inf. logitsst (d) Gen. Tobit FEV1/FVC FEV1/FVC height height Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

30 Finite mixtures: Why Finite mixtures Dealing with multimodal distributions A way to introduce simple random effect models in GAMLSS distinction between Finite mixtures with no parameters in common Finite mixtures with parameters in common Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

31 Finite mixtures Finite mixtures example: Enzyme data enzyme$act Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

32 Finite mixtures Finite mixtures: Distribution function Suppose that the random variable Y comes from component k, having probability (density) function f k (y), with probability π k for k = 1, 2,..., K, then the (marginal) density of Y is given by f Y (y) = K π k f k (y) k=1 where 0 π k 1 is the prior (or mixing) probability of component k, for k = 1, 2,..., K and K k=1 π k = 1. Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

33 Finite mixtures Finite mixtures: Distribution function The probability (density) function f k (y) for component k may depend on parameters θ k and explanatory variables x k, i.e. f k (y) = f k (y θ k, x k ). Similarly f Y (y) depends on parameters ψ = (θ, π) where θ = (θ 1, θ 2,..., θ K ) and π T = (π 1, π 2,..., π K ) and explanatory variables x = (x 1, x 2,..., x K ), i.e. f Y (y) = f Y (y ψ, x), and f Y (y ψ, x) = K π k f k (y θ k, x k ) k=1 Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

34 Finite mixtures Finite mixtures: the log Likelihood l = l(ψ, y) = [ n K ] log π k f k (y i ) i=1 k=1 We wish to maximize l with respect to ψ, i.e. with respect to θ and π. It turns out that it is easier to maximise using EM algorithm: define the full likelihood take expections maximise Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

35 Finite mixtures Finite mixtures: the complete log Likelihood δ ik = { 1, if observation i comes from component k 0, otherwise Let δ T i = (δ i1, δ i2,..., δ ik ) be the indicator vector for observation i. Let δ T = (δ T 1, δ T 2,..., δ T n ) combine all the indicator variable vectors. l c = l c (ψ, y, δ) = n K n K δ ik log f k (y i ) + δ ik log π k i=1 k=1 i=1 k=1 Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

36 Finite mixtures Finite mixtures: EM-steps E-step Q = E δ [ l c y, ˆψ (r)] = K n k=1 i=1 ŵ (r+1) ik log f k (y i ) + K n k=1 i=1 ŵ (r+1) ik log π k M-step weighted log likelihood for GAMLSS model Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

37 Finite mixtures Finite mixtures: the weights ŵ (r+1) ik = E [δ ik y, ˆψ (r)] = ˆπ (r) k f k(y i ˆθ (r) k ) K k=1 ˆπ(r) k f k(y i ˆθ (r) k ) Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

38 Finite mixtures Finite mixtures: the gamlssmx() function m1 <- gamlssmx(act ~ 1, family = NO, K = 2) m2 <- gamlssmx(act ~ 1, family = GA, K = 2) m3 <- gamlssmx(act ~ 1, family = RG, K = 2) m4 <- gamlssmx(act ~ 1, family = c(no, GA), K = 2) m5 <- gamlssmx(act ~ 1, family = c(ga, RG), K = 2) AIC(m1, m2, m3, m4, m5) df AIC m m m m m Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

39 Finite mixtures Finite mixtures: the gamlssmx() function > m3 Mu Coefficients for model: 1 (Intercept) Sigma Coefficients for model: 1 (Intercept) Mu Coefficients for model: 2 (Intercept) Sigma Coefficients for model: 2 (Intercept) Estimated probabilities: Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

40 Finite mixtures Finite mixtures: the gamlssmx() function truehist(enzyme$act, h = 0.1) fyrg <- dmx(y = seq(0, 3, 0.01), mu = list( 1.127, ), sigma = list(0.336, ), pi = list(0.376, 0.624), family = list("rg","rg")) lines(seq(0, 3, 0.01), fyrg, col = "red", lty = 1) lines(density(enzyme$act, width = "SJ-dpi"), lty = 2) Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

41 Finite mixtures Finite mixtures example: Enzyme data enzyme$act Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

42 Finite mixtures Finite mixtures: conclusions Finite mixtures of K components, each having a GAMLSS model, can be fitted using gamlssmx() if the K components have no parameters in common Modelling the mixing probabilities can be done (a multinomial logistic model is used) Finite mixtures with parameters in common can be fitted using the function gamlssnp() Mixed distributions are special case of finite mixtures Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43

43 Finite mixtures END for more information see Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing / 43