A multilevel multistate competing risks model for event history data

A General Multilevel Multistate Competing Risks Model for Event History Data, with
an Application to a Study of Contraceptive Use Dynamics
Fiona Steele, Harvey Goldstein*
Centre for Multilevel Modelling, Institute of Education, University of London

William Browne*
Mathematical Sciences, University of Nottingham, Nottingham NG7 2RD

Contact author: Fiona Steele
Tel: 020 7612 6657
Fax: 020 7612 6658
Email: [email protected]

*Fiona Steele is Research Lecturer in Statistics, Harvey Goldstein is Professor of Statistical Methods, and William Browne is Lecturer in Statistics.

Abstract
We propose a general discrete-time model for multilevel event history data. The model is developed for the analysis of longitudinal repeated episodes within individuals where there are multiple origin states and multiple transitions from a state (competing risks). Transitions from each origin state are modelled jointly to allow for correlation across states in the unobserved individual characteristics that influence transitions. Implementation of the method in MLwiN is described. The model is applied in an analysis of contraceptive use dynamics in Indonesia where transitions from two origin states, contraceptive use and non-use, are of interest. A distinction is made between two ways in which an episode of contraceptive use may end: a transition to non-use or a switch to another method. After adjusting for a range of background characteristics, we find evidence of a positive residual correlation between the risk of discontinuation and the risk of moving from non-use to use; this suggests that women who have short (long) episodes of contraceptive use tend also to have short (long) episodes of non-use.

Keywords: Event history analysis, competing risks, multilevel model, multistate model, contraceptive use

1. Introduction
Event history data are collected in many surveys, providing a longitudinal record of events such as births, deaths, and changes in employment and marital status. These data are often highly complex, with common features including repeated events, multiple origin states and multiple types of transition from each state (competing risks). While there are methods for handling repeated events combined with either multiple origin states or competing risks, existing methodology does not allow all three features to be handled simultaneously. In this paper, we propose a general event history model for the analysis of repeated durations where there may be multiple origin states and multiple transitions from those states.

The methodological development is motivated by a study of contraceptive use dynamics. Event history data on episodes of contraceptive use and non-use are now collected in a number of developing countries, as part of the Demographic and Health Survey (DHS) programme. These surveys collect monthly data on contraceptive use, non-use and pregnancy for a period of 5-6 years before the survey date. The information recorded includes the methods of contraception used and the reason for discontinuation when an episode of use ends. Previous studies of contraceptive use dynamics using these data have focused on contraceptive discontinuation, allowing for repeated episodes of use and different reasons for discontinuation in a competing risks framework (e.g. Steele et al. 1996b). Episodes of non-use are ignored. However, the transition from non-use to use is also important for family planning programme evaluation since women who do not quickly resume contraceptive use after a birth, or after discontinuing use of a method, may be at risk of having an unintended pregnancy. In this paper, we consider episodes of both contraceptive use and non-use and model transitions between use and non-use simultaneously.
Use and non-use of contraception are examples of multiple origin states. By jointly modelling transitions from different origin states, it is possible to test explicitly for state-dependent covariate effects. For example, the effects of background characteristics such as age might differ for transitions from use and non-use. Joint modelling of transitions also allows for residual correlation in individual transition rates across states, which might arise because of unobserved factors that affect transitions from each state. In the model for contraceptive discontinuation, an episode of contraceptive use is defined as a continuous period of using the same method. We distinguish between two types of transition from use of a given method: a transition to non-use or a switch to a different method (a transition within the ‘use' state). These two types of event are examples of competing risks. An episode of non-use is defined as a continuous period of non-use (excluding months of pregnancy). The only possible type of transition from non-use is to use. We therefore have a situation where the number and type of transitions that can occur depend on the origin state. The model developed in this paper can handle state-dependent competing risks. The model we propose is a generalised multilevel discrete-time event history model. A multilevel model is used to allow for the hierarchical structure that arises from having repeated episodes (of use or non-use of contraception) nested within individuals. The model includes individual random effects for each origin state and for each type of transition from a given state; these random effects may be correlated across origin states and transitions to allow for shared unobserved individual factors. One advantage of choosing a discrete-time formulation is that it allows the model to be cast as a multilevel model for multinomial response data, which may be fitted using existing software. 
The remaining sections of the paper are organised as follows. In Section 2, we give a brief outline of previous work on event history analysis for repeated events, competing risks and multiple states. We then describe a general multilevel multistate competing risks model that allows all three of these common features of event history data to be incorporated simultaneously. The application of this model to a study of contraceptive use dynamics in Indonesia is presented in Section 3. Finally, in Section 4, further extensions to the proposed model are discussed.

2. Methodology
2.1 Previous work on repeated events, multiple origin states and competing risks
When an event may occur more than once over an individual's lifetime, the durations between events may be correlated due to the presence of unobserved individual-level factors. Repeated events are usually handled by including individual-specific random effects in an event history model, leading to a multilevel model. The random effect represents individual ‘frailty', and the random effect variance measures unobserved heterogeneity among individuals. Vaupel et al. (1979) describe how, in the presence of unobserved heterogeneity, the population hazard rate may be observed to decline over time, even if the hazard rates of individuals in the population are constant throughout the observation period, due to high risk individuals experiencing the event early and leaving the least susceptible individuals in the ‘at risk' sample. Repeated observations on individuals allow unobserved heterogeneity to be better identified.

Multilevel event history models have been developed for the analysis of hierarchical duration data, where the hierarchical structure results from repeated events within individuals or clustering of individuals within some higher-level grouping such as geographical area. Multilevel extensions of continuous-time proportional hazards models include Clayton and Cuzick (1985), Goldstein (2003, Chapter 10), Guo and Rodríguez (1992) and Sastry (1997), while discrete-time approaches include Davies et al. (1992) and Steele et al.

Another extension of the basic event history model allows for the possibility of multiple states. There may be several transient origin states between which individuals move, perhaps more than once. Several previous studies have considered models for repeated transitions between multiple states. Enberg et al. (1990) consider transitions between welfare and work using a random effects model but, since they assume individual random effects are uncorrelated across states, their approach amounts to fitting a separate model for each origin state.
In many applications, this assumption of independence may be invalid since there may be unobserved factors which influence transitions from more than one origin state. Goldstein et al. (2002) also use a discrete-time random effects model, but model jointly transitions from two origin states, allowing for correlation between the state-specific random effects. An alternative approach is to use a fixed effects model such as the proportional hazards model proposed by Lindeboom and Kerkhofs (2001) to analyse movements between sickness and work spells, clustered by workplace. While existing methods allow for multiple origin states and repeated events, it is assumed that only one type of transition can occur from each state. Competing risks are another common feature of event history data. In many situations, there are several competing destinations from a given state, or an event may be experienced for one of several reasons. To allow for unobserved individual heterogeneity in the risks of competing events, various competing risks event history models have been proposed. These models typically include individual-level random effects for each alternative destination. Enberg et al. (1990) consider a discrete-time competing risks model with individual- and destination-specific random effects, which is essentially a multilevel multinomial logit model. However, their model assumes that the random effects are uncorrelated across competing risks, an assumption which is likely to be unrealistic since there may be common unobservables affecting more than one type of transition. Hill et al. (1993) propose a nested logit model which relaxes this independence assumption. For alternative destinations which may be regarded as similar with respect to unmeasured risk factors, the error terms are decomposed into a component which is common to similar alternatives and a component which is destination-specific. 
A different approach to relaxing the independence assumption is adopted by Steele et al. (1996b). They propose a discrete-time competing risks model, formulated as a multilevel multinomial model, which includes individual- and destination-specific random effects that may be correlated across destinations. Theirs is a more general model than that of Hill et al. (1993) and can be extended to several hierarchical levels where the effects of duration and covariates may vary across higher-level units. However, neither approach allows for the possibility of multiple origin states.

2.2 Multilevel discrete-time competing risks model
In this section we describe the discrete-time competing risks model proposed by Steele et al. (1996b). The more general model proposed in the present paper is an extension of this model that allows for multiple origin states.

We focus on discrete-time models for several reasons. First, in our application durations of use and non-use of contraception are measured in discrete-time units as the data were collected monthly. It is very common for durations to be measured in discrete time, particularly in studies of human populations in which event times are often collected retrospectively. When durations are recorded in reasonably broad intervals, such as months, there will be multiple ties. While ties present no problem in the estimation of discrete-time models, some adjustment is required if a continuous-time model is used. For example, the widely used Cox proportional hazards model, estimated via partial likelihood, requires some modification (see, e.g., Kalbfleisch and Prentice, 1980). A second reason for favouring discrete-time event history models is that they are essentially discrete response models. This allows the use of existing methodology for multilevel discrete response data when there are repeated events. Other benefits of the discrete-time approach include straightforward inclusion of time-varying covariates and the possibility to allow for non-proportional hazards. Non-proportional hazards are handled by including interactions between the duration variable(s) (treated as explanatory variable(s) in a discrete-time model) and covariates.

One disadvantage of a discrete-time approach, however, is the need to expand the dataset so that there is an observation for each time unit. If the width of the discrete time intervals is short relative to the observation period this may lead to a very large dataset, but with increasing computational power and storage this is becoming a less severe problem.
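The expansion to one observation per time unit can be illustrated with a minimal Python sketch. The function name `expand_episode`, its arguments and the record layout are hypothetical and chosen only for illustration; the response is coded 0 for an interval with no event and r for the interval in which an event of type r occurs. The `width` argument, which groups base time units into wider intervals by rounding the duration up, is an assumption about how grouping might be done and is not a description of the procedure used for the analysis in this paper.

```python
from math import ceil


def expand_episode(duration, event_type, width=1):
    """Return one record per discrete time interval for a single episode.

    duration   -- number of base time units (e.g. months) observed, >= 1
    event_type -- 0 if the episode is right-censored, r (1..R) if it ended
                  in an event of type r
    width      -- base units per discrete interval (assumed grouping rule:
                  the number of intervals is the duration rounded up)
    """
    n_intervals = ceil(duration / width)
    records = []
    for t in range(n_intervals):
        is_last = (t == n_intervals - 1)
        # Only the final interval can carry an event; earlier ones are 0.
        records.append({"t": t, "y": event_type if is_last else 0})
    return records


# An episode ending in its third interval for reason 2, and a censored one.
ended = expand_episode(3, 2)
censored = expand_episode(3, 0)
# A 14-month episode re-expressed in six-month intervals.
grouped = expand_episode(14, 1, width=6)
```

Each episode contributes as many rows as intervals it spans, which is why monthly expansion of several years of calendar data produces a large dataset and why wider intervals reduce it.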
One strategy to reduce the size of the expanded dataset is to group discrete-time intervals; for example, quarterly rather than monthly observations might be created. While grouping intervals leads to a loss of information, in our experience there is often little impact on parameter estimates and standard errors. In the application to contraceptive use dynamics, for example, results were robust to using six-month rather than monthly intervals.

An episode is defined as a continuous period of time spent in the same state until an event occurs. Suppose that there are R types of end event and that, for each time interval t in episode j for individual k, we observe a multinomial response $y_{tjk}$ which denotes whether an event has occurred and the type of event: $y_{tjk} = r$ if an event of type r has occurred in time interval t, r = 1, ..., R, and $y_{tjk} = 0$ if no event has occurred. The hazard of an event of type r in interval t, denoted by $h_{tjk}^{(r)}$, is the probability that an event of type r occurs in interval t, given that no event of any type has occurred before interval t. The log-odds of an event of type r versus no event (with hazard $h_{tjk}^{(0)}$) may be modelled as a function of episode duration and covariates, using methods for unordered multinomial response data. Using a logit link, the multilevel discrete-time competing risks model may be written

$$\log\left(\frac{h_{tjk}^{(r)}}{h_{tjk}^{(0)}}\right) = \alpha^{(r)T} z_{tjk}^{(r)} + \beta^{(r)T} x_{tjk}^{(r)} + u_k^{(r)}, \qquad r = 1, \ldots, R. \tag{1}$$

The effect of duration is represented by $\alpha^{(r)T} z_{tjk}^{(r)}$, which can take a number of forms, including a polynomial function or a step (piecewise constant) function of time. The covariates, represented by $x_{tjk}^{(r)}$, may be defined at the level of the discrete time unit (time-dependent), or at the episode or individual level. Equation (1) defines a proportional hazards model where the effects of covariates are assumed to be constant across time. Non-proportional effects may be accommodated simply by adding interactions between $z_{tjk}^{(r)}$ and $x_{tjk}^{(r)}$.

In a competing risks model, the effects of duration and covariates may differ for each event type, as indicated by the r superscript for α and β. It is also possible that the form of z and the set of covariates x may vary across event types. Unobserved individual-specific factors may differ for each type of event; these are represented by R random effects $u_k^{(r)}$. The random effects are assumed to follow a multivariate normal distribution with an R × R covariance matrix; non-zero correlation between random effects allows for shared or correlated unobserved risk factors across competing risks. The model may be extended further to allow coefficients of $z_{tjk}^{(r)}$ and $x_{tjk}^{(r)}$ to vary randomly across individuals.
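As a numerical illustration of the multinomial logit form of the competing risks model, the following Python sketch converts the linear predictors for the R event types (duration effect plus covariate effect plus individual random effect) into hazards for a single time interval. The function name `hazards` and all numerical values are made up for illustration and are not estimates from this paper.

```python
from math import exp, log


def hazards(linear_predictors):
    """Map linear predictors eta_r (r = 1..R) to hazards [h0, h1, ..., hR].

    Under a multinomial logit link, log(h_r / h_0) = eta_r, so
    h_r = exp(eta_r) / (1 + sum_s exp(eta_s)) and h_0 is the remainder.
    """
    expo = [exp(eta) for eta in linear_predictors]
    denom = 1.0 + sum(expo)
    return [1.0 / denom] + [e / denom for e in expo]


# Hypothetical linear predictors for R = 2 event types in one interval:
# (duration effect) + (covariate effect) + (individual random effect).
eta = [-2.0 + 0.3 + 0.5, -2.5 - 0.1 - 0.2]
h = hazards(eta)

# The hazards of no event and of each event type sum to one, and the
# log-odds of each event type versus no event recover the predictors.
total = sum(h)
recovered = [log(h[r] / h[0]) for r in (1, 2)]
```

Because all R hazards share the same denominator, a change in the linear predictor for one event type alters the hazards of the other types as well, which is one reason the competing risks are modelled jointly rather than one at a time.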
Model (1) may be estimated as a multilevel multinomial model (Goldstein, 2003, Chapter 4). Several software packages may be used, including MLwiN (Rasbash et al., 2000), PROC NLMIXED in SAS (SAS Institute, 1999) and WinBUGS (Spiegelhalter et al., 2000). Further details of the multinomial model for competing risks are given in Steele et al. (1996b).

2.3 A multilevel discrete-time model for competing risks and multiple states
The model we propose is an extension of (1) to handle situations where there are both competing risks and multiple origin states. The approaches of Steele et al. (1996b) and Goldstein et al. (2002) are combined in a general framework. In this general model, the number and type of transitions may differ for each state. It is also possible that the end of an episode does not necessarily lead to a change in state. For example, in our application an episode of contraceptive use may end in a transition to the non-use state or a transition within the use state (a method switch).

Suppose that there are $R_i$ ways in which an episode in state i (i = 1, ..., s) can end. Denote by $h_{tijk}^{(r_i)}$ the hazard of making a transition of type $r_i$ ($r_i = 1, \ldots, R_i$) from origin state i in time interval t of episode j for individual k. The hazard of no transition is denoted by $h_{tijk}^{(0)}$. The multilevel model for competing risks and multiple states may be written

$$\log\left(\frac{h_{tijk}^{(r_i)}}{h_{tijk}^{(0)}}\right) = \alpha_i^{(r_i)T} z_{tijk}^{(r_i)} + \beta_i^{(r_i)T} x_{tijk}^{(r_i)} + u_{ik}^{(r_i)}, \qquad r_i = 1, \ldots, R_i; \; i = 1, \ldots, s. \tag{2}$$

In (2) duration and covariate effects may depend both on the origin state i and on the type of transition $r_i$. Unobserved individual-level factors, represented by $u_{ik}^{(r_i)}$, may also vary according to state and transition. The $\sum_{i=1}^{s} R_i$ random effects are assumed to follow a multivariate normal distribution.

2.4 Data preparation and estimation
In order to estimate a discrete-time event history model, the data must first be restructured to what is often called a person-period format. This involves expanding the data so that there is a record for each time interval in each episode. For example, an episode which ended during the third time interval would be expanded to obtain three records, for t = 0, t = 1 and t = 2. Suppose there are competing risks and the episode ended for reason r = 2; then the multinomial response variable for the three intervals would be $(y_{0jk}, y_{1jk}, y_{2jk}) = (0, 0, 2)$. If the individual had been right-censored during the third time interval, their sequence of responses would be (0, 0, 0). After this data expansion, models (1) or (2) may be estimated using any software that can handle multilevel multinomial response data. Further details of the data structure required are given in Appendix A.

In the analysis that follows we have used a hybrid Gibbs-Metropolis sampling algorithm. Gibbs sampling is used to update the random effects variance matrix, while single-site random walk Metropolis sampling is used for all the other parameters. As we have no prior information on likely parameter values we have incorporated suitable ‘diffuse' prior distributions in the model. Details of the MCMC estimation algorithm and the chosen prior distributions are given in Appendix B. This method has been implemented in MLwiN. Details of MLwiN's MCMC estimation engine are given in Browne (2002).

3. Contraceptive use dynamics in Indonesia
We consider an application of the multilevel multistate competing risks model in an analysis of changes in contraceptive use over time. Two origin states are considered: contraceptive use and non-use. An episode of non-use always ends in a transition to use, while for an episode of contraceptive use there are two competing risks: a woman may discontinue use of all contraception and become a non-user, or she may switch to a different method.

3.1 Data and sample definition
The data are from the 1997 Indonesia Demographic and Health Survey (IDHS), a nationally representative survey of ever-married women age 15-49 (Central Bureau of Statistics, 1998). Contraceptive histories were collected retrospectively using a calendar for a six-year period before the survey. The calendar has a tabular format with a row for each month of the observation period, a column containing information on pregnancies, births and contraceptive use, and another column recording the main reason for discontinuation for each episode of contraceptive use.

The analysis is based on episodes of contraceptive use and non-use for 14 677 women who were married throughout the observation period and who had previously used contraception. An episode is defined as a continuous period of non-use or use of the same contraceptive method. Periods of non-use that are interrupted by pregnancy are treated as two separate episodes, one ending when the woman becomes pregnant, and the other starting after the birth. Periods of non-use while a woman is pregnant are excluded. The period of non-use after pregnancy is considered as a new episode since interest is focused on non-use while a woman is at risk of conception. Episodes of male or female sterilisation are excluded from the analysis since no transition is possible from these permanent methods of contraception. This results in the loss of a very small number of episodes since sterilisation is relatively unpopular in Indonesia, and few women in the sample were sterilised after the start of the six-year observation period. The sample is further restricted to women who had previously used contraceptives, and to episodes of use or non-use which began after the start of the observation period. Episodes that were in progress at the start of the calendar period, i.e. left-truncated episodes, were necessarily excluded since the start date was not asked for these episodes.
The final analysis sample contains 17 843 episodes of use and 21 285 episodes of non-use.

The IDHS also collected complete birth histories and a large amount of demographic and socio-economic information from each woman and her household. A number of covariates were used in the analysis: current age (treated as time-dependent), education level, type of region of residence, an indicator of socio-economic status based on household possessions, contraceptive method (for episodes of contraceptive use) and an indicator of whether the episode followed a live birth (for episodes of non-use). The socio-economic status indicator has been used in previous studies (Curtis and Blanc, 1997; Steele and Curtis, 2003) and is based on a simple household possessions score. Households receive one point for having each of the following: piped or bottled drinking water, flush toilet, vehicle, radio, and a floor that is not dirt. The total score ranges from 0 to 5 and is categorised as low (0-1), medium (2-3), or high (4-5). Contraceptive method is classified as 1) pills or injectables (short-term hormonal methods), 2) Norplant® or intra-uterine device (IUD) (longer-term clinical methods), 3) other modern reversible methods (mainly condoms), and 4) traditional methods. Descriptive statistics for all covariates are given in Table 1.

3.2 Modeling
The multilevel multistate competing risks model in (2) is applied in the analysis of transitions from s=2 states, contraceptive use and non-use. From the ‘use' state (i=1) there are R1=2 possible transitions, while from the ‘non-use' state (i=2) there is only R2=1 transition. In order to fit a discrete-time event history model, the data first must be expanded so that there is a response for each time interval in an episode. The expanded dataset using one-month intervals has 543 737 observations. To reduce computational time, the length of discrete-time intervals is increased to six months, which reduces the size of the dataset to 109 666 observations. Comparison of single-level models using one- and six-month intervals reveals that increasing the length of discrete-time intervals to six months has little effect on the parameter estimates or standard errors (results not shown). In aggregating time intervals, the number of episodes does not change. If there is more than one episode within a six-month interval, all such episodes are retained in the reduced dataset, with a duration of one six-month interval recorded for each.

Duration effects are modelled in different ways for use and non-use states. For transitions from contraceptive use, a piecewise constant formulation is found to be a good fit to the observed logit-hazard. A step function is fitted for duration intervals of 0-5 months, 6-11 months, 12-23 months, 24-35 months, and 36 or more months. For transitions from non-use to use, a polynomial function of the cumulative duration of non-use is used.

3.3 Results
Cumulative transition probabilities were calculated using separate life tables for each origin state. Based on a multiple-decrement life table, within the first 12 months of use 13% of women have become non-users and 13% have switched to a different method of contraception. After 24 months, 23% have discontinued while 18% have switched methods. The probability of moving from non-use to use increases rapidly with duration of non-use. Within 12 months of the start of an episode of non-use, 57% of women have started to use contraception, while 70% start within 24 months. These high rates are due largely to women resuming contraceptive use after a brief period of non-use following a birth.

3.3.1 Random
We begin by fitting a model including duration effects only, before adding the covariates listed in Table 1. The estimated random effects covariance matrix from both models is shown in Table 2. There is evidence of unobserved heterogeneity between women in the hazards of all types of transition, but particularly for transitions from contraceptive use. From the upper panel of Table 2, it can be seen that before including covariates there is a strong negative residual correlation (estimated as -0.71) between the logit-hazards for the transition from use to non-use and from non-use to use. The negative correlation implies that women with a high (low) hazard of moving from non-use to use tend to have a low (high) hazard of discontinuation. In other words, women with short (long) periods of use before a discontinuation generally have long (short) periods of non-use. On further examination of the data, we find that the shortest periods of non-use follow a live birth. These short postnatal episodes of non-use are usually followed by a long period of using the same method of contraception, in order to space or limit subsequent births.

After controlling for covariates, in particular the indicator of whether a period of non-use immediately followed a live birth, we find that the residual correlation becomes moderate and positive (see the estimate of 0.28 in the lower panel of Table 2). A positive correlation implies that women with short periods of contraceptive use tend also to have short periods of non-use, and those who use contraceptives for longer periods tend to have longer breaks in use. The correlations between the random effects for the other pairs of transitions are both small and neither is statistically significant.

3.3.2 Fixed
The estimated coefficients and standard errors corresponding to the fixed part of the full model are shown in Table 3. For all types of transition, the effects of current age, education level, type of region of residence, and household socio-economic status are considered. For transitions from use, the type of contraceptive method used is treated as a time-dependent covariate. The indicator of whether an episode of non-use follows a live birth is included only in the model for the transition from non-use. We begin by examining the effect of duration of use and covariates on transitions from contraceptive use to non-use (‘discontinuation') or to use of another method (‘switching'). The risk of discontinuation is fairly constant over the first three years of use, but greater for longer durations, while the risk of switching is highest in the first six months of use, then decreases. Age has a negative effect on both discontinuation and switching; older women are more likely than young women to continue use of the same method. Education has a positive effect on both discontinuation and switching, but the effect on the risk of switching is stronger. Urban women are more likely than rural women to discontinue, but type of region has no effect on the rate of switching. Socio-economic status has different effects on discontinuation and switching; a high level of socio-economic status is associated with low discontinuation rates, but higher switching rates, possibly reflecting access to a wider choice of methods for better-off women. Norplant®/IUD users are less likely than users of any other method to become non-users or to change to a different method. Users of traditional methods are also relatively unlikely to switch methods. In contrast, condom users (the main constituent of the ‘other modern' group) are the most likely to abandon contraceptive use or to change to another method. We now turn to the factors associated with transitions from non-use to use. 
The probability that a non-user becomes a user decreases sharply with the duration of non-use. Older women are less likely than young women to become a contraceptive user. Educated women, those living in urban areas, and women of higher socio-economic status are more likely than uneducated, rural, or poorer women to make the transition from non-use. Finally, if the episode of non-use follows a birth rather than an episode of contraceptive use, a woman is considerably more likely to adopt contraception and the negative effect of duration of non-use is stronger. This effect distinguishes between short breaks in contraceptive use after a birth and longer-term non-use, possibly following a problem with contraception such as side-effects.

4. Discussion
We have shown how to specify and fit general discrete-time event history models with multiple origin states and multiple transitions from those states. We have illustrated this for repeated episodes within individuals but our models can be extended readily to further levels of nesting. For example, in the application presented here, community-specific random effects may be added to allow for clustering of contraceptive behaviour within neighbourhoods or villages. We have assumed that random effects follow a multivariate normal distribution. This leads to an extremely flexible model in which there may be several correlated random effects. As with any statistical analysis, however, it is important to carry out diagnostic checks for departures from normality and other model assumptions. Langford and Lewis (1998) propose a range of procedures for multilevel data exploration, including methods for detecting and adjusting for outliers. It may also be possible to protect against non-normality using ‘sandwich' or robust standard errors (Goldstein, 2003, p.80-81). Another approach is to assume a non-normal random effects distribution, for example a multivariate t-distribution, which could be implemented in WinBUGS (Spiegelhalter et al. 2000). We have ignored the possibility of within-individual between-episode random variation in durations. In principle we can fit this using episode-specific random effects, but in this case the within-individual variation in episode durations is not significant, possibly due to a relatively low proportion of women who experience more than one transition of each type. Furthermore, in general it would seem preferable to model episode heterogeneity using random coefficients associated with individual level covariates. Thus, for example, the age relationship within individuals may vary across individuals and this can be modelled by including random coefficients for the age group category coefficients. 
Where the transition states form an ordered categorisation we can use corresponding ordered category models, for example by modelling cumulative log-odds (Goldstein, 2003, Chapter 4). This could arise, for example, in the modelling of illness duration where patients make transitions between clinical states which are ordered by severity. For such models we can also assume an underlying propensity with a probit link and this can be fitted via MCMC.

Our models can also be extended readily to the multivariate case where, for each individual, we wish to study more than one type of episode at a time; for example, durations of contraceptive use episodes and intervals between births. For each episode type we form the same set of discrete time intervals and, for each time interval, the response is multivariate with dimension p, where p is the number of episode types. A dummy variable is created to indicate each episode type, and these are interacted with covariates to allow covariate effects to vary across the different types of episode. For ordered models and for binary response models, using a probit link, we can directly incorporate correlations between the underlying normal distributions at the episode level and at higher levels. This then provides covariance matrix estimates for the episode types at all levels of the data hierarchy.

References
Browne, W.J. and Draper, D. (2000) Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models. Computational Statistics, 15, 391-420.
Browne, W.J. (2002) MCMC Estimation in MLwiN. London: Institute of Education.
Central Bureau of Statistics (CBS) [Indonesia], State Ministry of Population/National Family Planning Coordination Board (NFPCB), Ministry of Health (MOH), and Macro International Inc. (MI) (1998) Indonesia Demographic and Health Survey 1997. Calverton, Maryland: CBS and MI.
Clayton, D.G. and Cuzick, J. (1985) Multivariate generalizations of the proportional hazards model (with discussion). Journal of the Royal Statistical Society, Series A, 148, 82-117.
Curtis, S.L. and Blanc, A. (1997) Determinants of contraceptive failure, switching, and discontinuation: An analysis of DHS contraceptive histories. DHS Analytical Reports No. 6. Calverton, Maryland: Macro International Inc.
Davies, R.B., Elias, P. and Penn, R. (1992) The relationship between a husband's unemployment and his wife's participation in the labour-force. Oxford Bulletin of Economics and Statistics, 54, 145-171.
Enberg, J., Gottschalk, P. and Wolf, D. (1990) A random-effects logit model of work-welfare transitions. Journal of Econometrics, 43, 63-75.
Goldstein, H. (2003) Multilevel Statistical Models. 3rd edition. London: Arnold.
Goldstein, H., Pan, H. and Bynner, J. (2002) A note on methodology for analysing longitudinal event histories using repeated partnership data from the National Child Development Study (NCDS). Working paper, Institute of Education, London. Downloadable from http://k1.ioe.ac.uk/hgpersonal/.
Guo, G. and Rodríguez, G. (1992) Estimating a multivariate proportional hazards model for clustered data using the EM algorithm, with an application to child survival in Guatemala. Journal of the American Statistical Association, 87, 969-976.
Hill, D.H., Axinn, W.G. and Thornton, A. (1993) Competing hazards with shared unmeasured risk factors. Sociological Methodology, 23, 245-277.
Kalbfleisch, J.D. and Prentice, R.L. (1980) The Statistical Analysis of Failure Time Data. New York: Wiley.
Langford, I.H. and Lewis, T. (1998) Outliers in multilevel data (with discussion). Journal of the Royal Statistical Society, Series A, 161, 121-160.
Lindeboom, M. and Kerkhofs, M. (2000) Multistate models for clustered duration data – an application to workplace effects on individual sickness absenteeism. The Review of Economics and Statistics, 82, 668-684.
Rasbash, J., Browne, W.J., Goldstein, H., Yang, M., et al. (2000) A User's Guide to MLwiN (Second Edition). London: Institute of Education.
SAS Institute Inc. (1999) SAS/STAT® User's Guide, Version 8. Cary, NC: SAS Institute Inc.
Sastry, N. (1997) A nested frailty model for survival data, with an application to the study of child survival in Northeast Brazil. Journal of the American Statistical Association, 92,
Spiegelhalter, D.J., Thomas, A. and Best, N.G. (2000) WinBUGS Version 1.3 User Manual. Cambridge: Medical Research Council Biostatistics Unit.
Steele, F. and Curtis, S.L. (2003) Appropriate methods for analysing the effect of method choice on contraceptive discontinuation. Demography, 40, 1-22.
Steele, F., Diamond, I. and Amin, S. (1996a) Immunization uptake in rural Bangladesh: a multilevel analysis. Journal of the Royal Statistical Society, Series A, 159, 289-299.
Steele, F., Diamond, I. and Wang, D. (1996b) The determinants of the duration of contraceptive use in China: A multilevel multinomial discrete hazards modelling approach. Demography, 33, 12-33.
Vaupel, J.W., Manton, K.G. and Stallard, E. (1979) The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography, 16, 439-454.
Table 1. Distribution of women/episodes by covariates, Indonesia 1997

[Table values not recovered. Surviving row headings: Woman-level variables (type of region of residence; socio-economic status; number of episodes); Episode-level variables, non-use (episode follows a live birth); Episode-level variables, use (contraceptive method: pill/injectable).]

Table 2. Random effects covariance matrix from models of transitions from contraceptive use and non-use, Indonesia 1992-97

[Table layout only partly recovered. Cells are Est.† (95% interval); surviving entries are listed by row in source order.]

Duration effects only
  Use → non-use:
  Use → other method:   (-0.032, 0.154)   0.748
  Non-use → use:        (-0.204, -0.112)  0.002   (-0.045, 0.042)  0.089
Duration + covariates
  Use → non-use:
  Use → other method:   (-0.065, 0.114)   0.702
  Non-use → use:        (0.015, 0.109)    0.016   (-0.090, 0.093)  0.231

†Coefficients are the modal estimates from 50 000 chains.
a Correlation between random effects.

Table 3. Estimated coefficients and standard errors from model of transitions from contraceptive use and non-use, Indonesia 1992-97

[Row labels for the three rows of estimates below were not recovered.]

Use → non-use        Use → other method   Non-use → use
(Discontinuation)    (Method switch)
Est.† (SE)           Est.† (SE)           Est.† (SE)
-2.306 (0.085)       -3.424 (0.123)       -1.906 (0.106)
-0.408 (0.038)       -0.374 (0.047)       -0.319 (0.031)
-0.844 (0.057)       -0.593 (0.068)       -0.710 (0.045)

[Surviving row labels without values: Pill/injectable; Episode follows live birth; Episode after live birth*Duration; Episode after live birth*Duration².]

†Estimates are the modal estimates from 50 000 chains.

Appendix A: Data Preparation
Suppose that a woman uses contraception for 3 time intervals, then discontinues and does not use contraception for 2 intervals, then uses contraception again for 4 intervals, before switching to another method:

Individual (k)   Episode (j)   State (i)   Duration (in 6-month intervals)   Censor
1                1             1           3                                 no
1                2             2           2                                 no
1                3             1           4                                 no

Censor indicates whether the episode is right-censored; here, the duration of each episode is completely observed.

The first step in restructuring the data for a multinomial discrete-time model is to create a multinomial response for each time interval (six-month intervals, here). The multinomial response $y_{tijk}$ has $R_i + 1$ categories for state $i$, where $R_i = 2$ for $i = 1$ and $R_i = 1$ for $i = 2$. The multinomial response is coded as follows:

$y_{tijk} = 0$ if no event has occurred
$y_{tijk} = 1$ if individual discontinues contraceptive use ($i = 1$), or if individual starts to use contraception ($i = 2$)
$y_{tijk} = 2$ if individual switches to another method ($i = 1$)
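The expansion of episodes into one multinomial response per time interval, as described above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names (`duration`, `event`) and the function `expand_episodes` are hypothetical and are not part of MLwiN or of the data files used in the paper. The `event` field uses the coding given above, with 0 reserved for a right-censored episode.

```python
# Hypothetical episode records for the example woman (k = 1).
# i: origin state, 1 = contraceptive use, 2 = non-use.
# event: how the episode ended, using the coding of the multinomial response
#        (0 = right-censored, 1 = discontinuation or start of use, 2 = method switch).
episodes = [
    {"k": 1, "j": 1, "i": 1, "duration": 3, "event": 1},  # use, then discontinuation
    {"k": 1, "j": 2, "i": 2, "duration": 2, "event": 1},  # non-use, then start of use
    {"k": 1, "j": 3, "i": 1, "duration": 4, "event": 2},  # use, then method switch
]


def expand_episodes(episode_records):
    """Return one row per time interval with the multinomial response y."""
    rows = []
    for ep in episode_records:
        for t in range(1, ep["duration"] + 1):
            is_last_interval = t == ep["duration"]
            rows.append({
                "k": ep["k"],
                "j": ep["j"],
                "i": ep["i"],
                "t": t,
                # No event (0) in every interval except the last one,
                # which carries the type of transition (or 0 if censored).
                "y": ep["event"] if is_last_interval else 0,
            })
    return rows


person_period = expand_episodes(episodes)
for row in person_period:
    print(row)
```

Each episode of duration d contributes d rows, so the example woman contributes 3 + 2 + 4 = 9 records to the analysis file.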

In addition, two indicator variables, $I_1$ and $I_2$, denoting the origin state are created. These are interacted with $t$ and covariates. The restructured dataset is as follows:

k   j   i   t   y_tijk
1   1   1   1   0
1   1   1   2   0
1   1   1   3   1
1   2   2   1   0
1   2   2   2   1
1   3   1   1   0
1   3   1   2   0
1   3   1   3   0
1   3   1   4   2

To fit a multilevel multinomial model in MLwiN, the data must be further restructured to obtain a set of binary responses for each multinomial response. This restructuring is required only for episodes that originate in state 1, from which two types of exit are considered; since there is only one type of exit from non-use, the indicator of event occurrence for episodes originating in state 2 is already binary. For $i = 1$, the multinomial response for each time interval is converted to two binary responses $y^{(r)}_{t1jk}$, where $y^{(r)}_{t1jk} = 1$ if $y_{t1jk} = r$ and 0 otherwise ($r = 1, 2$). For each time interval, the two binary responses are stacked. Thus, for the first episode in the example above the final data structure is as follows:

t   r   y^(r)_t1jk
1   1   0
1   2   0
2   1   0
2   2   0
3   1   1
3   2   0

The indicator for state 1, $I_1$, is replaced by indicators for $r$, $I_1^{(1)}$ and $I_1^{(2)}$. These are multiplied with duration and the covariates to allow duration and covariate effects to vary according to the type of transition from contraceptive use. The destination-specific individual random effects for state 1, $u^{(1)}_{1k}$ and $u^{(2)}_{1k}$, are fitted by allowing the coefficients of $I_1^{(1)}$ and $I_1^{(2)}$ to vary randomly across individuals. In addition, the random effect for state 2, $u_{2k}$, is obtained by allowing the coefficient of $I_2$ to vary across individuals.

Appendix B: Estimation of a Multilevel Multistate Competing Risks Model in MLwiN
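All the MCMC results in this paper were obtained using a modified version of MLwiN (2.0) that will be made publicly available in the future. Here, we describe an MCMC algorithm for estimation of the multilevel multistate competing risks model of equation (2) in Section 2.3. The algorithm is described in the context of the application to contraceptive use and non-use in Indonesia. There are $s = 2$ states, with $R_1 = 2$ possible transitions from state $i = 1$, and $R_2 = 1$ transition from state $i = 2$.

There are six sets of fixed effects, which have been split into duration effects ($\alpha_1^{(1)}$, $\alpha_1^{(2)}$ and $\alpha_2$) and covariate effects ($\beta_1^{(1)}$, $\beta_1^{(2)}$ and $\beta_2$), and three sets of random effects ($u_{1k}^{(1)}$, $u_{1k}^{(2)}$ and $u_{2k}$). All of these parameters are updated using single-site random walk Metropolis updating steps. We also have a $3 \times 3$ variance matrix, $\Omega_u$, for the correlated sets of random effects and for this we use a Gibbs sampling step. For prior distributions we use 'improper' uniform priors for all of the fixed effects and a diffuse inverse-Wishart prior with parameters 3 and $S_3 = 3I$ (where $I$ is the identity matrix) for $\Omega_u$.

We make the following substitutions in (2) to simplify writing down the conditional posterior distributions, where $z$ and $x$ denote the vectors of duration terms and covariates respectively:

$$\eta^{(r)}_{t1jk} = \exp\left(\alpha_1^{(r)T} z^{(r)}_{t1jk} + \beta_1^{(r)T} x^{(r)}_{t1jk} + u^{(r)}_{1k}\right), \quad r = 1, 2,$$

$$\eta_{t2jk} = \exp\left(\alpha_2^{T} z_{t2jk} + \beta_2^{T} x_{t2jk} + u_{2k}\right).$$

The joint posterior distribution is proportional to

$$p(\Theta \mid y) \propto \prod_{t,j,k} \left(\frac{1}{1 + \eta^{(1)}_{t1jk} + \eta^{(2)}_{t1jk}}\right)^{1 - y^{(1)}_{t1jk} - y^{(2)}_{t1jk}} \left(\frac{\eta^{(1)}_{t1jk}}{1 + \eta^{(1)}_{t1jk} + \eta^{(2)}_{t1jk}}\right)^{y^{(1)}_{t1jk}} \left(\frac{\eta^{(2)}_{t1jk}}{1 + \eta^{(1)}_{t1jk} + \eta^{(2)}_{t1jk}}\right)^{y^{(2)}_{t1jk}}$$

$$\times \prod_{t,j,k} \left(\frac{1}{1 + \eta_{t2jk}}\right)^{1 - y_{t2jk}} \left(\frac{\eta_{t2jk}}{1 + \eta_{t2jk}}\right)^{y_{t2jk}} \times \prod_{k} |\Omega_u|^{-1/2} \exp\left(-\tfrac{1}{2} u_k^{T} \Omega_u^{-1} u_k\right) \times p(\Omega_u)$$

where $u_k = (u^{(1)}_{1k}, u^{(2)}_{1k}, u_{2k})^{T}$ and $\Theta$ is the set of all unknown parameters.

When we come to calculate the conditional posterior distributions for the unknown parameters they generally do not have standard forms and consist of all the terms in the above joint posterior that contain the parameter of interest. For example, the posterior distribution for $\alpha_1^{(1)}$ has the form:

$$p(\alpha_1^{(1)} \mid y, \Phi) \propto \prod_{t,j,k} \left(\frac{1}{1 + \eta^{(1)}_{t1jk} + \eta^{(2)}_{t1jk}}\right)^{1 - y^{(1)}_{t1jk} - y^{(2)}_{t1jk}} \left(\frac{\eta^{(1)}_{t1jk}}{1 + \eta^{(1)}_{t1jk} + \eta^{(2)}_{t1jk}}\right)^{y^{(1)}_{t1jk}} \left(\frac{\eta^{(2)}_{t1jk}}{1 + \eta^{(1)}_{t1jk} + \eta^{(2)}_{t1jk}}\right)^{y^{(2)}_{t1jk}}$$

which is the first term in the joint posterior. Here $\Phi = \Theta \setminus \{\alpha_1^{(1)}\}$.

The MCMC algorithm works by updating each of the unknown parameters in turn by making a random draw from their conditional posterior distributions. The variance matrix, $\Omega_u$, is updated by Gibbs sampling and has an inverse Wishart conditional distribution:

$$p(\Omega_u^{-1} \mid y, \Theta \setminus \Omega_u) \sim \text{Wishart}\left(n_w + 3,\ \left(\sum_{k=1}^{n_w} u_k u_k^{T} + S_3\right)^{-1}\right)$$

where $n_w$ is the number of women in the dataset.

All other parameters are updated by random-walk Metropolis sampling, which we will illustrate via the step for $\alpha_1^{(1)}$. At iteration $m$ generate a proposed new value $\alpha_1^{(1)*}$ from the random walk proposal distribution $N(\alpha_{1(m-1)}^{(1)}, \sigma_p^2)$, where $\sigma_p^2$ is the proposal distribution variance which will be tuned via the adaptive method originally used in Browne and Draper (2000). The updating step is then:

$$\alpha_{1(m)}^{(1)} = \alpha_1^{(1)*} \text{ with probability } \min\left[1,\ \frac{p(\alpha_1^{(1)*} \mid y, \Phi)}{p(\alpha_{1(m-1)}^{(1)} \mid y, \Phi)}\right], \qquad \alpha_{1(m)}^{(1)} = \alpha_{1(m-1)}^{(1)} \text{ otherwise.}$$

Similar steps are performed for each of the other unknown parameters. The procedure of updating all the unknown parameters is then repeated many times to generate a large sample of estimates for each parameter. We used a burn-in of 5 000 iterations to allow the chains of parameter estimates to converge and then sampled 50 000 iterations.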
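The random-walk Metropolis step described above can be illustrated with a minimal Python sketch for a single scalar parameter. Several simplifying assumptions apply: the target is a toy posterior (one logit-scale intercept with a flat prior, given 30 events in 100 intervals) rather than the multistate model of equation (2); the proposal standard deviation is fixed rather than tuned by the adaptive method of Browne and Draper (2000); and the function names `metropolis_step`, `run_chain` and `log_post` are hypothetical and do not correspond to MLwiN routines.

```python
import math
import random


def metropolis_step(current, log_post, proposal_sd, rng):
    """One random-walk Metropolis update for a scalar parameter."""
    # Propose a new value from a normal distribution centred on the current value.
    proposal = rng.gauss(current, proposal_sd)
    # Work on the log scale: log of the ratio of conditional posteriors.
    log_ratio = log_post(proposal) - log_post(current)
    # Accept with probability min(1, ratio); otherwise keep the current value.
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return proposal, True
    return current, False


def run_chain(log_post, start, proposal_sd, burn_in, n_samples, seed=1):
    """Run a chain, discard the burn-in, and return samples and acceptance rate."""
    rng = random.Random(seed)
    value = start
    samples = []
    n_accepted = 0
    for m in range(burn_in + n_samples):
        value, accepted = metropolis_step(value, log_post, proposal_sd, rng)
        if m >= burn_in:
            samples.append(value)
            n_accepted += accepted
    return samples, n_accepted / n_samples


def log_post(alpha):
    """Toy log posterior (assumption): 30 events in 100 intervals, flat prior."""
    return 30 * alpha - 100 * math.log1p(math.exp(alpha))


samples, acceptance_rate = run_chain(
    log_post, start=0.0, proposal_sd=0.3, burn_in=500, n_samples=5000
)
posterior_mean = sum(samples) / len(samples)
print(round(posterior_mean, 3), round(acceptance_rate, 3))
```

In a full sampler this step would be applied in turn to every fixed effect and every random effect within each iteration, with the Gibbs draw for the variance matrix completing the cycle.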

Source: http://seis.bris.ac.uk/~frwjb/materials/multistate.pdf
