Schedule
Lectures: Wednesday 12:20–13:50, K2
Exercise Class: Wednesday 14:00–15:30, K10A
Course Contents

February 21: Role of statistics in medical research. Subject of epidemiology. Prevalence, incidence of disease. Estimating incidence by piecewise exponential model. Supplementary reading: [EBR] Chap. 1, pp. 1–34; [BD1] Chap. II, pp. 42–47.

February 28: Age-specific incidence. Age-standardized incidence. Cumulative incidence. Confidence intervals for age-standardized and cumulative incidence. Supplementary reading: [EBR] Chap. 2, pp. 49–62; [BD1] Chap. II, pp. 47–54; [BD2] Chap. 2, pp. 48–61.

March 7: Exposures. Relative risk, excess risk. Estimating relative risk from aggregated data. Analysis of binary exposures (2 x 2 tables). Cohort and case-control design. Invariance of odds ratio to study design. Relationship between odds ratio and relative risk. Small-sample methods for estimating odds ratio. Supplementary reading: [BD1] Chap. II, pp. 53–73; Chap. IV, pp. 122–129.

March 14: Large-sample methods for estimating odds ratio. Control of confounding in case-control studies: sampling and analytic strategies. Stratification, matching, adjustment. Analysis of stratified case-control studies via logistic regression. Supplementary reading: [BD1] Chap. III, pp. 84–115; Chap. IV, pp. 129–136; Chap. VI, pp. 192–209.

March 21: Classical methods for analyzing stratified case-control studies with binary exposure. Cochran-Mantel-Haenszel test, Woolf estimator, Mantel-Haenszel estimator. Supplementary reading: [BD1] Chap. IV, pp. 136–146.

March 28: Matched case-control studies: rationale, implementation of matched design. Choice of matched controls. Analysis of matched case-control studies: classical methods for estimation and testing of odds ratios in pair-matched studies with binary exposures. Supplementary reading: [BD1] Chap. V, pp. 162–166.

April 4: Conditional logistic regression for matched case-control studies. Cohort study. Supplementary reading: [BD1] Chap. VII, pp. 248–253.

April 11: Survival models for ungrouped cohort data: Cox model, excess relative risk model, additive hazards model. Grouped time-varying exposures, Lexis diagram. Grouped analysis of cohort studies via Poisson regression. Supplementary reading: [BD2] Chap. 3, pp. 82–91; Chap. 5, pp. 178–197; Chap. 4, pp. 120–150, 159–171.

April 18: Analysis of Cox model for discrete responses via regression model for binary data with complementary log-log link. Diagnostic methods. Sensitivity, specificity of a diagnostic test. Positive predictive value. ROC curves.

April 25: Diagnostic tests based on multiple markers. Introduction to clinical trials. Stages of drug development. Clinical trials of Phase I, II, III, and IV. Supplementary reading: [FFDM] Chap. 1.

May 2: Clinical trial protocol. Primary and secondary objectives of clinical trial. Outcomes in clinical trials: hard, soft outcomes, surrogate outcomes. Selection of outcome measures. Randomization: simple, blocked, stratified. Supplementary reading: [FFDM] Chap. 3, pp. 37–51; Chap. 6, pp. 97–105. Example contents of a clinical trial protocol.

May 9: Choice of study population. Inclusion and exclusion criteria. Enrollment of study subjects. Blinding. Interim monitoring, data and safety monitoring board. Analysis set, intent-to-treat principle. Exclusions from analysis set. Principles for choosing appropriate analysis method. Analysis of continuous, binary, time-to-event outcomes in two-arm and multi-arm trials. Analysis of change since baseline. Supplementary reading: [FFDM] Chap. 4, pp. 55–65; Chap. 5, pp. 79–90; Chap. 7, pp. 119–131; Chap. 10, pp. 183–197.

May 23: Study design: factorial design, multi-arm design, crossover trial, non-inferiority trial, group-randomized trial, meta-analysis. Analysis of non-inferiority trials. Interim monitoring of clinical trials. Methods for group-sequential tests. Pocock and O'Brien-Fleming boundaries. Supplementary reading: [FFDM] Chap. 17, pp. 345–382; Chap. 16, pp. 293–334.
Exercise Class Assignments
 Assignment 1:
Estimating incidence rates
(due date: March 7)
The dataset dbtr contains the numbers of cases of Type 1 Diabetes Mellitus observed in the Czech Republic, together with the population size, aggregated by sex, age (0 to 14) and calendar year (1989–2009).
 Estimate age-specific incidence rates of Type 1 DM by calendar year for boys, girls and all children. [Decide whether and how age and calendar year should be grouped.] Plot estimated incidence rates against age for (i) different time periods; (ii) different birth cohorts. Do you feel that incidence of Type 1 DM changed between 1989 and 2009?
 Estimate the cumulative risks (i.e., probabilities) of developing Type 1 DM before the 15th year of age at different time periods (again for boys, girls, and all together).
 Calculate age-standardized incidence rates for Type 1 DM at different time periods (again for boys, girls, and all together). [Age-standardized rates combine age-specific rates over the same standard age distribution at all periods.]
 Calculate confidence intervals for cumulative risks and age-standardized incidence rates.
R code to solve this problem is available, together with incidence figures by calendar period and by year of birth.
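Separately from the R code referenced above, the quantities requested in Assignment 1 can be illustrated with a short hand calculation. The following is a minimal Python sketch (the course itself uses R). It assumes Poisson-distributed case counts and normal-approximation confidence intervals. The function names, the age bands, the equal standard weights, and all counts are hypothetical and are not taken from the dbtr dataset.

```python
import math

def age_specific_rates(cases, person_years):
    """Incidence rate per person-year in each age band: d_k / n_k."""
    return [d / n for d, n in zip(cases, person_years)]

def standardized_rate(cases, person_years, weights, z=1.96):
    """Directly age-standardized rate with a normal-approximation CI.

    Assumes Poisson counts, so Var(d_k / n_k) is estimated by d_k / n_k**2.
    """
    total = sum(weights)
    w = [x / total for x in weights]  # normalize weights to sum to 1
    rate = sum(wk * d / n for wk, d, n in zip(w, cases, person_years))
    var = sum(wk ** 2 * d / n ** 2 for wk, d, n in zip(w, cases, person_years))
    half = z * math.sqrt(var)
    return rate, rate - half, rate + half

def cumulative_risk(cases, person_years, widths, z=1.96):
    """Cumulative risk 1 - exp(-cumulative rate), CI transformed from the rate scale."""
    cum = sum(h * d / n for h, d, n in zip(widths, cases, person_years))
    var = sum(h ** 2 * d / n ** 2 for h, d, n in zip(widths, cases, person_years))
    half = z * math.sqrt(var)

    def to_risk(x):
        return 1.0 - math.exp(-x)

    return to_risk(cum), to_risk(max(cum - half, 0.0)), to_risk(cum + half)

# Hypothetical counts for age bands 0-4, 5-9, 10-14 (NOT the dbtr data).
cases = [50, 100, 150]
person_years = [1_000_000, 1_000_000, 1_000_000]
widths = [5, 5, 5]            # width of each age band in years
std_weights = [1, 1, 1]       # hypothetical standard population (equal weights)

rates = age_specific_rates(cases, person_years)
asr, asr_lo, asr_hi = standardized_rate(cases, person_years, std_weights)
risk, risk_lo, risk_hi = cumulative_risk(cases, person_years, widths)

print("age-specific rates per 100,000:", [round(r * 1e5, 2) for r in rates])
print("age-standardized rate per 100,000:", asr * 1e5, (asr_lo * 1e5, asr_hi * 1e5))
print("cumulative risk before age 15:", risk, (risk_lo, risk_hi))
```

Building the cumulative-risk interval on the rate scale and then transforming it keeps the limits inside [0, 1]; other interval constructions (for example on the log scale) are equally defensible.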
 Assignment 2:
Case-control analysis via logistic regression
(due date: March 21)
The Ille-et-Vilaine data contain the results of a case-control study investigating the effect of alcohol and tobacco consumption on the risk of oesophageal cancer. There are 200 male cases and 775 male controls, all of them inhabitants of the French département Ille-et-Vilaine. (See [BD1], Sec. 4.1, pp. 122–124.)
 Reproduce the descriptive results in Table 4.1 of [BD1], p. 123.
 Conduct a grouped analysis of alcohol risk adjusted for age (see [BD1], pp. 210–213).
 Conduct a joint grouped analysis of alcohol and tobacco risk adjusted for age (see [BD1], pp. 213–221, esp. Tables 6.5 and 6.6).
 Conduct a joint ungrouped analysis of alcohol and tobacco risk adjusted for age (see [BD1], pp. 227–231, esp. Table 6.12).
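Before fitting logistic regression models, a crude and a stratum-adjusted odds ratio computed by hand can serve as a sanity check on the model output. The following minimal Python sketch (the course itself uses R) computes the odds ratio from a 2 x 2 table with a Woolf-type (log-scale) confidence interval, and the Mantel-Haenszel estimator across strata. The function names and all counts are hypothetical and are not the Ille-et-Vilaine data.

```python
import math

def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Odds ratio from a 2 x 2 table with a log-scale (Woolf) confidence interval.

    a: exposed cases,    b: unexposed cases,
    c: exposed controls, d: unexposed controls (all counts must be > 0).
    """
    or_hat = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_hat, or_hat * math.exp(-z * se_log), or_hat * math.exp(z * se_log)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio over a list of (a, b, c, d) tables, one per stratum."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical counts in two age strata (NOT the Ille-et-Vilaine data).
strata = [(10, 30, 5, 55), (30, 30, 15, 25)]

# Collapsing over strata gives the crude table (40, 60, 20, 80).
crude_table = tuple(sum(col) for col in zip(*strata))
crude_or, crude_lo, crude_hi = odds_ratio_woolf(*crude_table)
mh_or = mantel_haenszel_or(strata)

print("crude OR:", crude_or, (crude_lo, crude_hi))
print("Mantel-Haenszel OR adjusted for stratum:", mh_or)
```

A noticeable gap between the crude and the stratum-adjusted estimate, as in these made-up counts, is the kind of pattern that motivates adjusting for age in the regression analysis.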
 Assignment 3:
Matched case-control analysis via conditional logistic regression
(due date: April 11)
The Los Angeles Study of Endometrial Cancer was a matched case-control study conducted in California in the 1970s (description in [BD1], Chap. 5.1, pp. 162–163, data in [BD1], App. III, pp. 290–296). There are 63 cases of endometrial cancer, all women age 55 or over, each matched to four controls living in the same retirement community. The primary exposure of interest was estrogen use. The secondary exposure was gallbladder disease.
The Epi library in R includes two versions of the data: the full dataset bdendo and the subset bdendo11, which contains a single control matched to each case.
 Conduct descriptive analysis similar to Table 5.1 of [BD1], p. 163. Use bdendo11 data to estimate odds ratios using the method for 1:1 matching and binary exposure.
 Conduct conditional logistic analysis of the bdendo11 dataset (1:1 matching) using the function glm. See [BD1], Chap. 7.3, pp. 253–259, for inspiration and comparison of results.
 Conduct conditional logistic analysis of the bdendo dataset (1:4 matching) using one of the conditional logistic regression functions available in R (function clogistic from the Epi library or function clogit from the survival library). See [BD1], Chap. 7.4, pp. 253–268, for inspiration and comparison of results.
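For 1:1 matching with a binary exposure, the classical analysis uses only the discordant pairs. The following minimal Python sketch (the course itself uses R) computes the conditional odds ratio estimate, a log-scale confidence interval, and the McNemar test without continuity correction. The function names and the pair counts are hypothetical and are not the bdendo11 data.

```python
import math

def matched_pair_or(n10, n01, z=1.96):
    """Conditional odds ratio for 1:1 matched pairs with a binary exposure.

    n10: pairs with the case exposed and the control unexposed,
    n01: pairs with the case unexposed and the control exposed.
    Concordant pairs carry no information about the odds ratio.
    """
    or_hat = n10 / n01
    se_log = math.sqrt(1 / n10 + 1 / n01)
    return or_hat, or_hat * math.exp(-z * se_log), or_hat * math.exp(z * se_log)

def mcnemar_test(n10, n01):
    """McNemar chi-square statistic (1 df, no continuity correction) and its p-value."""
    chi2 = (n10 - n01) ** 2 / (n10 + n01)
    # Upper tail of a chi-square with 1 df: P(Z**2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical discordant-pair counts (NOT the bdendo11 data).
n10, n01 = 30, 10

pair_or, pair_lo, pair_hi = matched_pair_or(n10, n01)
chi2, p_value = mcnemar_test(n10, n01)

print("matched-pair OR:", pair_or, (pair_lo, pair_hi))
print("McNemar chi-square:", chi2, "p-value:", p_value)
```

With a single binary covariate, conditional logistic regression on the same pairs should reproduce the ratio n10 / n01, which makes this a convenient cross-check.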
 Assignment 4:
Analysis of cohort follow-up studies
(due date: May 2)
The Cardiovascular Health Study was a prospective cohort study of risk factors for cardiovascular disease among adults aged 65 years and older. The subjects were enrolled in 1989–1990 and followed until 2000.
We will investigate the following questions: (1) Is the risk of myocardial infarction (MI) among the elderly different for men than for women? If it is different, does the difference vary with age? (2) Is the carotid artery intima-media wall thickness associated with future risk of myocardial infarction?
The dataset mi.RData includes information on 3917 subjects, of whom 408 had myocardial infarction during the follow-up. The description of variables is provided in a separate codesheet.
Conduct a descriptive analysis of MI risk and its association with age, gender, and intima wall thickness. Build regression models addressing the questions of interest using three different approaches and compare the results:
 MI risk analysis by the Cox model.
 MI risk analysis by the grouped Poisson model.
 Analysis of MI as a binary outcome (ignoring the timing of the MI event).
You may find helpful the following examples of R code for aggregating follow-up time and number of cases across exposure categories and fitting interactions of exposure with time by the Cox model.
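Independently of the R examples mentioned above, the aggregation step behind the grouped Poisson analysis can be illustrated in a few lines. The following minimal Python sketch (the course itself uses R) sums cases and person-years within exposure categories and computes a crude rate ratio with a log-scale confidence interval. The record layout, the function names, and all values are hypothetical and are not taken from mi.RData.

```python
import math
from collections import defaultdict

def aggregate(records):
    """Sum events and follow-up time within each exposure category.

    records: iterable of (category, followup_years, event) with event in {0, 1}.
    Returns {category: (number_of_cases, person_years)}.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for category, years, event in records:
        totals[category][0] += event
        totals[category][1] += years
    return {cat: (v[0], v[1]) for cat, v in totals.items()}

def rate_ratio(d1, t1, d0, t0, z=1.96):
    """Incidence rate ratio (group 1 vs. group 0) with a log-scale CI.

    Assumes Poisson counts, so Var(log RR) is estimated by 1/d1 + 1/d0.
    """
    rr = (d1 / t1) / (d0 / t0)
    se_log = math.sqrt(1 / d1 + 1 / d0)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)

# Hypothetical subject-level records (NOT the Cardiovascular Health Study data).
records = [
    ("men", 10.0, 1), ("men", 5.0, 0), ("men", 5.0, 1),
    ("women", 10.0, 0), ("women", 10.0, 1), ("women", 20.0, 0),
]

agg = aggregate(records)
(d_m, t_m), (d_w, t_w) = agg["men"], agg["women"]
rr, rr_lo, rr_hi = rate_ratio(d_m, t_m, d_w, t_w)

print("cases and person-years by category:", agg)
print("rate ratio men vs. women:", rr, (rr_lo, rr_hi))
```

A Poisson regression of the aggregated counts with log person-years as an offset and a single binary covariate should return the same rate ratio, so the table produced by the aggregation step is exactly the input such a model needs.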
Course Materials
 [EBR] Esteve J, Benhamou E, Raymond L. Statistical Methods in Cancer Research, Vol. IV: Descriptive Epidemiology. International Agency for Research on Cancer: Lyon, 1994.
 [BD1] Breslow NE, Day NE. Statistical Methods in Cancer Research, Vol. I: The analysis of case-control studies. International Agency for Research on Cancer: Lyon, 1980.
 [BD2] Breslow NE, Day NE. Statistical Methods in Cancer Research, Vol. II: The design and analysis of cohort studies. International Agency for Research on Cancer: Lyon, 1987.
 [FFDM] Friedman LM, Furberg CD, DeMets DL. Fundamentals of Clinical Trials. 4th Ed., Springer: New York, 2010.
Course Plan
We will learn statistical methods used in medicine, especially in epidemiology and clinical trials. Terminology specific to medical applications will be explained and some specialized methods will be covered. We will review study designs used in medical studies (cohort study, case-control study, randomized controlled trial) and explain how to analyze each of them. Ethical and administrative aspects of human experiments and their impact on handling statistical issues will be discussed.
Prerequisites
This course assumes advanced knowledge of statistical theory and practice, especially linear regression, logistic regression, log-linear models, and survival analysis. Master's students of "Probability, statistics and econometrics" must have completed the courses Linear Regression (NMSA407), Advanced Regression Models (NMST432), and Censored Data Analysis (NMST531) before enrolling in this course.