Schedule
Lectures: Tuesday, 12:20–13:50, Praktikum KPMS
Exercise class: Tuesday, 14:00–15:30, K10A
Course Contents

February 25: Role of statistics in medical research. Subject of epidemiology. Prevalence, incidence of disease.
Supplementary reading: [EBR], Chap. 1, pp. 1–21; [BD1], Chap. II, pp. 42–46.

February 28: Estimating incidence by piecewise exponential model. Age-specific incidence. Age-standardized incidence. Cumulative incidence. Confidence intervals for age-standardized and cumulative incidence.
Supplementary reading: [EBR], Chap. 2, pp. 49–62; [BD1], Chap. II, pp. 47–54; [BD2], Chap. 2, pp. 48–61.

March 7: Exposures. Relative risk, excess risk. Estimating relative risk from aggregated data. Analysis of binary exposures (2 x 2 tables). Cohort and case-control design. Invariance of odds ratio to study design. Relationship between odds ratio and relative risk. Small sample methods for estimating odds ratio.
Supplementary reading: [BD1], Chap. II, pp. 53–73; Chap. IV, pp. 122–129.

March 14: Large sample methods for estimating odds ratio. Control of confounding in case-control studies: sampling and analytic strategies. Stratification, matching, adjustment. Analysis of stratified case-control studies via logistic regression.
Supplementary reading: [BD1], Chap. III, pp. 84–115; Chap. IV, pp. 129–136; Chap. VI, pp. 192–209.

March 21: Classical methods for analyzing stratified case-control studies with binary exposure. Cochran-Mantel-Haenszel test, Woolf estimator, Mantel-Haenszel estimator. Matched case-control studies: rationale, implementation of matched design. Choice of matched controls.
Supplementary reading: [BD1], Chap. IV, pp. 136–156.
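
As a rough illustration of the two pooled estimators named in the March 21 entry, the sketch below computes the Mantel-Haenszel and Woolf odds ratios from a list of stratum-specific 2 x 2 tables. The function names and the counts in `strata` are hypothetical and serve only as an example. The Woolf interval uses the usual large-sample normal approximation and assumes no zero cells.

```python
from math import exp, log, sqrt

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio.

    Each stratum is a tuple (a, b, c, d):
      a = exposed cases,    b = unexposed cases,
      c = exposed controls, d = unexposed controls.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def woolf_or(strata, z=1.96):
    """Woolf estimator: inverse-variance weighted mean of stratum log odds ratios."""
    weights, log_ors = [], []
    for a, b, c, d in strata:
        var = 1 / a + 1 / b + 1 / c + 1 / d   # variance of the stratum log OR
        weights.append(1 / var)
        log_ors.append(log(a * d / (b * c)))
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    se = sqrt(1 / sum(weights))
    return exp(pooled), (exp(pooled - z * se), exp(pooled + z * se))

# Hypothetical counts for three age strata (illustration only).
strata = [(10, 20, 15, 60), (25, 30, 20, 50), (30, 20, 15, 25)]

print("Mantel-Haenszel OR:", mantel_haenszel_or(strata))
print("Woolf OR and 95% CI:", woolf_or(strata))
```

Both estimators are weighted averages of the stratum-specific odds ratios (on the original and on the log scale, respectively), so each pooled value lies between the smallest and largest stratum estimate.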

March 28: Analysis of matched case-control studies: Classical methods for estimation and testing of odds ratios in pair-matched studies with binary exposures. Conditional logistic regression for matched case-control studies.
Supplementary reading: [BD1], Chap. V, pp. 162–166; Chap. VI, pp. 204–205; Chap. VII, pp. 248–253.

April 4: Cohort study. Survival models for ungrouped cohort data: Cox model, excess relative risk model, additive hazards model. Grouped time-varying exposures, Lexis diagram. Grouped analysis of cohort studies via Poisson regression.
Supplementary reading: [BD2], Chap. 3, pp. 82–91; Chap. 5, pp. 178–197; Chap. 4, pp. 120–150, 159–171.

April 11: Analysis of Cox model for discrete responses via regression model for binary data with complementary log-log link. Diagnostic methods. Sensitivity, specificity of a diagnostic test. Positive predictive value. ROC curves. Diagnostic tests based on multiple markers.
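
To make the diagnostic-test terms in the April 11 entry concrete, here is a minimal sketch that computes sensitivity and specificity from a 2 x 2 table of test results against true disease status, and the positive predictive value from Bayes' rule for a given prevalence. The function names and the example numbers are hypothetical.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity = P(test+ | diseased), specificity = P(test- | healthy)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = P(diseased | test+), obtained from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical validation sample: 100 diseased and 100 healthy subjects.
sens, spec = sensitivity_specificity(tp=90, fp=10, fn=10, tn=90)

# The same test applied in populations with different disease prevalence.
for prev in (0.01, 0.10, 0.50):
    print(prev, positive_predictive_value(sens, spec, prev))
```

The loop shows why the positive predictive value, unlike sensitivity and specificity, depends on the population in which the test is used.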

April 18: Introduction to clinical trials. Stages of drug development. Clinical trials of Phase I, II, III, and IV. Clinical trial protocol.
Supplementary reading: [FFDM], Chap. 1. Example contents of a clinical trial protocol.

April 25: Primary and secondary objectives of clinical trial. Outcomes in clinical trials: hard, soft outcomes, surrogate outcomes. Selection of outcome measures. Randomization: simple, blocked, stratified.
Supplementary reading: [FFDM], Chap. 3, pp. 37–51; Chap. 6, pp. 97–105.
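
Blocked randomization, mentioned in the April 25 entry, can be sketched in a few lines. The version below uses permuted blocks of fixed size with equal allocation, which is only one of several variants. The function name, arm labels, and default block size are assumptions made for illustration.

```python
import random

def blocked_randomization(n_subjects, block_size=4, arms=("A", "B"), seed=None):
    """Permuted-block randomization with equal allocation to each arm.

    Within every complete block each arm appears block_size / len(arms) times,
    so the arm sizes never differ by more than that amount during enrollment.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)            # seeded generator for a reproducible list
    per_arm = block_size // len(arms)
    schedule = []
    while len(schedule) < n_subjects:
        block = list(arms) * per_arm     # balanced block, e.g. A B A B
        rng.shuffle(block)               # random order within the block
        schedule.extend(block)
    return schedule[:n_subjects]         # the last block may be truncated

print(blocked_randomization(10, block_size=4, seed=2024))
```

Stratified randomization could reuse the same function by generating one independent schedule per stratum.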

May 2: Choice of study population. Inclusion and exclusion criteria. Enrollment of study subjects. Blinding. Interim monitoring, data and safety monitoring board. Study design: factorial design, multi-arm design, crossover trial, noninferiority trial, group-randomized trial, meta-analysis.
Supplementary reading: [FFDM], Chap. 4, pp. 55–65; Chap. 5, pp. 79–90; Chap. 7, pp. 119–131; Chap. 10, pp. 183–197.

May 9: Analysis set, intent-to-treat principle. Exclusions from analysis set. Principles for choosing appropriate analysis method. Analysis of continuous, binary, time-to-event outcomes in two-arm and multi-arm trials. Analysis of change since baseline. Analysis of noninferiority trials.
Supplementary reading: [FFDM], Chap. 17, pp. 345–382.

May 16: General approach to power and sample size calculation for two-sample tests with asymptotic normality and unequal variances. Application to binary outcomes.
Supplementary reading: [FFDM], Chap. 8, pp. 133–157.
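
For the binary-outcome application in the May 16 entry, a common large-sample formula for the number of subjects per arm is n = (z_{1-α/2} + z_{1-β})² [p0(1 - p0) + p1(1 - p1)] / (p1 - p0)². The sketch below implements it under the assumptions of a two-sided test, equal allocation, and an unpooled variance estimate. The function name and the example proportions are hypothetical, and other textbook variants (for instance with a pooled variance under the null hypothesis) give slightly different numbers.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm_binary(p0, p1, alpha=0.05, power=0.80):
    """Subjects per arm for a two-sided test of p0 = p1 (normal approximation).

    The test statistic is the difference of sample proportions divided by its
    standard error, with separate (unequal) variances in the two arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    var_sum = p0 * (1 - p0) + p1 * (1 - p1)   # per-subject variance, both arms
    return ceil((z_alpha + z_beta) ** 2 * var_sum / (p1 - p0) ** 2)

# Hypothetical design: event probability 0.20 on control, 0.10 on treatment.
print(n_per_arm_binary(0.20, 0.10))
print(n_per_arm_binary(0.20, 0.10, power=0.90))
```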

May 23: Interim monitoring of clinical trials. Methods for group-sequential tests. Pocock and O'Brien-Fleming boundaries.
Supplementary reading: [FFDM], Chap. 16, pp. 293–334.
Exercise Class Assignments
 Assignment 1:
Estimating incidence rates
(due date: March 13)
The dataset dbtr contains the numbers of cases of Type 1 Diabetes Mellitus observed in the Czech Republic, together with the population size, aggregated by sex, age (0 to 14) and calendar year (1989–2009).
 Estimate age-specific incidence rates of Type 1 DM by calendar year for boys, girls and all children. [Decide whether and how age and calendar year should be grouped.] Plot estimated incidence rates against age for (i) different time periods; (ii) different birth cohorts. Do you feel that incidence of Type 1 DM changed between 1989 and 2009?
 Estimate the cumulative risks (i.e., probabilities) of developing Type 1 DM before the 15th year of age at different time periods (again for boys, girls, and all together).
 Calculate age-standardized incidence rates for Type 1 DM at different time periods (again for boys, girls, and all together). [Age-standardized rates combine age-specific rates over the same standard age distribution at all periods.]
 Calculate confidence intervals for cumulative risks and age-standardized incidence rates.
 Assignment 2:
Case-control analysis via logistic regression
(due date: April 3)
The Ille-et-Vilaine data contain the results of a case-control study investigating the effect of alcohol and tobacco consumption on the risk of oesophageal cancer. There are 200 male cases and 775 male controls, all of them inhabitants of the French département Ille-et-Vilaine. (See [BD1], Sec. 4.1, pp. 122–124.)
 Reproduce the descriptive results in Table 4.1 of [BD1], p. 123.
 Conduct a grouped analysis of alcohol risk adjusted for age (see [BD1], pp. 210–213).
 Conduct a joint grouped analysis of alcohol and tobacco risk adjusted for age (see [BD1], pp. 213–221, esp. Tables 6.5 and 6.6).
 Conduct a joint ungrouped analysis of alcohol and tobacco risk adjusted for age (see [BD1], pp. 227–231, esp. Table 6.12).
 Assignment 3:
Matched case-control analysis via conditional logistic regression
(due date: April 18)
The Los Angeles Study of Endometrial Cancer was a matched case-control study conducted in California in the 1970s (description in [BD1], Chap. 5.1, pp. 162–163, data in [BD1], App. III, pp. 290–296). There are 63 cases of endometrial cancer, all women aged 55 or over, each matched to four controls living in the same retirement community. The primary exposure of interest was estrogen use. The secondary exposure was gallbladder disease.
The Epi library in R includes two versions of the data: the full dataset bdendo, and bdendo11, a subset containing a single control matched to each case.
 Conduct a descriptive analysis similar to Table 5.1 of [BD1], p. 163. Use the bdendo11 data to estimate odds ratios using the method for 1:1 matching and binary exposure.
 Conduct a conditional logistic analysis of the bdendo11 dataset (1:1 matching) using the function glm. See [BD1], Chap. 7.3, pp. 253–259, for inspiration and comparison of results.
 Conduct a conditional logistic analysis of the bdendo dataset (1:4 matching) using one of the conditional logistic regression functions available in R (function clogistic from the Epi library or function clogit from the survival library). See [BD1], Chap. 7.4, p. 253–268, for inspiration and comparison of results.
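
For the 1:1 matched analysis with a binary exposure, the classical estimate uses only the discordant pairs. The sketch below is a hedged illustration: it computes the conditional odds ratio n10/n01, a Wald interval on the log scale, and McNemar's test with continuity correction. The function name and the pair counts are hypothetical and are not the bdendo11 results.

```python
from math import erfc, exp, log, sqrt

def pair_matched_or(n10, n01, z=1.96):
    """Odds ratio and McNemar test for 1:1 matched pairs with binary exposure.

    n10 = pairs with the case exposed and the control unexposed,
    n01 = pairs with the case unexposed and the control exposed.
    Concordant pairs carry no information about the odds ratio.
    """
    or_hat = n10 / n01
    se_log = sqrt(1 / n10 + 1 / n01)                   # SE of log(OR)
    ci = (exp(log(or_hat) - z * se_log), exp(log(or_hat) + z * se_log))
    chi2 = (abs(n10 - n01) - 1) ** 2 / (n10 + n01)     # continuity-corrected
    p_value = erfc(sqrt(chi2 / 2))                     # chi-square tail, 1 df
    return or_hat, ci, chi2, p_value

# Hypothetical discordant-pair counts (illustration only).
or_hat, ci, chi2, p_value = pair_matched_or(n10=30, n01=10)
print(or_hat, ci, chi2, p_value)
```

The same odds ratio is the maximum conditional likelihood estimate, so it should agree with the coefficient from a conditional logistic fit with exposure as the only covariate.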
 Assignment 4:
Analysis of cohort follow-up studies
(due date: May 9)
The Cardiovascular Health Study was a prospective cohort study of risk factors for cardiovascular disease among adults aged 65 years and older. The subjects were enrolled in 1989–1990 and followed until 2000.
We will investigate the following questions: (1) Is the risk of myocardial infarction (MI) among the elderly different for men than for women? If it is different, does the difference vary with age? (2) Is the carotid artery intima-media wall thickness associated with future risk of myocardial infarction?
The dataset mi.RData includes information on 3917 subjects, of whom 408 had myocardial infarction during the follow-up. The description of variables is provided in a separate codesheet.
Conduct a descriptive analysis of MI risk and its association with age, gender, and intima wall thickness. Build regression models addressing the questions of interest using three different approaches and compare the results:
 MI risk analysis by the Cox model.
 MI risk analysis by the grouped Poisson model.
 Analysis of MI as a binary outcome (ignoring the timing of the MI event).
 Assignment 5:
Issues in Clinical Trials
(due date: May 23)

Review the assigned protocol with respect to the following questions:
 Is the overall study design appropriate for the scientific question of interest, i.e. establishing the relative clinical benefit of two alternative therapies?
 Are the primary and secondary endpoints appropriately chosen for the scientific question of interest?
 Is the sample size adequate?
 Are the proposed analysis methods sufficiently described, suitable for evaluating treatment effects and consistent with sample size calculation?

The following study is being planned: randomization to placebo vs. active therapy, time-to-event outcome, follow-up duration 3 years, expected loss to follow-up during the 3 years 15%, expected event rate at 3 years on the placebo arm 20%. The clinically significant effect of the active therapy is a reduction of the event rate by 20%. Decide how many patients are to be enrolled in each arm in order to guarantee that the power to detect the clinically significant effect exceeds 0.8.
Hints: Approximate the hazard rate by a constant hazard over 3 years, take advantage of Poisson distribution of the number of events, approximate it by the normal distribution to develop a test for the difference between log hazard rates. Then perform standard power calculation on this test.
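
One possible reading of the hints can be sketched as below; it is not the official solution. The variance of an estimated log hazard is approximated by the reciprocal of the expected number of events, and the usual normal-approximation power formula is applied to the difference of log hazards. Several inputs are assumptions not stated in the assignment: a two-sided significance level of 0.05, a 20% relative reduction of the 3-year event probability (0.20 to 0.16), exponentially distributed loss to follow-up that would remove 15% of subjects over 3 years in the absence of events, and full 3-year follow-up for every enrolled patient. The function and parameter names are hypothetical.

```python
from math import ceil, exp, log
from statistics import NormalDist

def n_per_arm_survival(p_event_placebo, relative_reduction, p_loss, years,
                       alpha=0.05, power=0.80):
    """Patients per arm for a test on the difference of log hazard rates."""
    # Constant hazards implied by the cumulative probabilities over `years`.
    p_event_active = p_event_placebo * (1 - relative_reduction)
    lam0 = -log(1 - p_event_placebo) / years
    lam1 = -log(1 - p_event_active) / years
    gamma = -log(1 - p_loss) / years                   # loss-to-follow-up hazard

    def p_observed(lam):
        """Probability that a patient's event is observed before loss or end."""
        total = lam + gamma
        return lam / total * (1 - exp(-total * years))

    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Var(log hazard estimate) is roughly 1 / (n * p_observed) in each arm.
    var_times_n = 1 / p_observed(lam0) + 1 / p_observed(lam1)
    return ceil(z ** 2 * var_times_n / log(lam1 / lam0) ** 2)

print(n_per_arm_survival(0.20, 0.20, 0.15, years=3))
```

Reading the 20% reduction as absolute rather than relative, or treating the loss to follow-up differently, changes the answer substantially, so the chosen interpretation should be stated explicitly.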

Course Materials
 [EBR] Esteve J, Benhamou E, Raymond L. Statistical Methods in Cancer Research, Vol. IV: Descriptive Epidemiology. International Agency for Research on Cancer: Lyon, 1994.
 [BD1] Breslow NE, Day NE. Statistical Methods in Cancer Research, Vol. I: The analysis of case-control studies. International Agency for Research on Cancer: Lyon, 1980.
 [BD2] Breslow NE, Day NE. Statistical Methods in Cancer Research, Vol. II: The design and analysis of cohort studies. International Agency for Research on Cancer: Lyon, 1987.
 [FFDM] Friedman LM, Furberg CD, DeMets DL. Fundamentals of Clinical Trials. 4th Ed., Springer: New York, 2010.
Course Plan
We will learn statistical methods used in medicine, especially in epidemiology and clinical trials. Terminology specific to medical applications will be explained and some specialized methods will be covered. We will review study designs used in medical studies (cohort study, case-control study, randomized controlled trial) and explain how to analyze each of them. Ethical and administrative aspects of human experiments and their impact on handling statistical issues will be discussed.
Prerequisites
This course assumes advanced knowledge of statistical theory and practice, especially linear regression, logistic regression, log-linear models, and survival analysis. Master's students of "Probability, statistics and econometrics" must have completed the courses Linear Regression (NMSA407), Advanced Regression Models (NMST432), and Censored Data Analysis (NMST531) before enrolling in this course.