Clinical baseline

A clinical baseline is the information gathered at the beginning of a study, against which changes observed during the study are measured. For an individual participant, it is the person's health status before he or she begins a clinical trial. Baseline measurements are used as a reference point to determine a participant's response to the experimental treatment.

Baseline data should adequately describe the population in the trial.
This means including demographic variables, known factors that influence the outcome (including medications being taken by participants), factors that are likely to modify any benefit of treatment, and those that may predict adverse reactions. These factors are called potential "confounders", because, if they are imbalanced between the treatment groups at baseline, they may result in an apparent treatment effect when none exists, or mask an effect that does exist. Baseline data should also include any factors (especially known potential confounders) that have been used as strata for randomisation. Stratified randomisation, described in detail earlier,2 is used when a baseline characteristic, such as tumour stage, is known to affect outcome risk; the characteristic is therefore included in the randomisation algorithm to minimise imbalances between treatment groups. This is particularly useful in small studies.
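
As an illustration of how a baseline characteristic can be built into the allocation procedure, the following is a minimal sketch of stratified randomisation using permuted blocks within each stratum. The two arms, the block size of four, the seed, and the stratum labels ("early" and "late" tumour stage) are assumptions chosen for the example, not details taken from any particular trial.

```python
import random
from collections import defaultdict


def make_stratified_allocator(arms=("A", "B"), block_size=4, seed=0):
    """Return allocate(stratum), which uses permuted blocks within each stratum."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    # stratum -> assignments still unused in that stratum's current block
    pending = defaultdict(list)

    def allocate(stratum):
        if not pending[stratum]:
            # Start a new block with equal numbers of each arm, in random order.
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
            pending[stratum] = block
        return pending[stratum].pop()

    return allocate


allocate = make_stratified_allocator()

# Hypothetical participants arriving in this order, identified by tumour stage.
stages = ["early", "late", "late", "early", "late", "late", "early", "early"]
assignments = [(stage, allocate(stage)) for stage in stages]

for stage, arm in assignments:
    print(stage, arm)
```

Because each stratum fills its own blocks, the arms are exactly balanced within a stratum every time a block is completed, which is why the approach helps most when the number of participants is small.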

If the study population contains subgroups of particular interest, the characteristics defining these subgroups, and the numbers or proportions in each group, should be stated. For example, in a long-term trial of a new medication for preventing heart attack, diabetes mellitus would be a potential confounder (as people with diabetes have a much higher risk of heart attack than similar people without diabetes). Those with diabetes in this study would also be an interesting subgroup in whom the effects of the intervention might be different. Similarly, concurrent therapy with aspirin (which would substantially reduce the risk of heart attack) could confound the trial results if there were an imbalance between trial groups in the proportions of patients taking aspirin; aspirin therapy might also influence the likelihood of adverse reactions to study therapy. Baseline factors can be determined from interviews, physical examination, laboratory measures or imaging studies.

Measurement
Baseline data are measured as close as possible to the time that participants are randomly allocated to study groups, and in all cases, should be measured before the allocated treatment commences (information collected after the commencement of trial treatment may have been altered by the treatment itself, and is generally not regarded as baseline data). Ideally, baseline data should be collected on all patients screened for eligibility, as this would provide further information about the generalisability of the trial population. However, this is not always practicable or affordable, so some variables (eg, tissue biopsy, measurement of genetic markers, expensive imaging tests) are measured only in actual participants randomly allocated to a trial group.

For factors that are not constant, the conditions under which the baseline data are collected should be stated in the methods section of the study report. For example, it should be clear whether blood pressure recordings were measured sitting or supine, or after a specified rest period; also whether a single reading, the average of several readings, the highest of two, or the first of two or more, was used.
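
To show why the measurement rule needs to be stated, the short sketch below applies three of the rules mentioned above to the same set of readings. The readings themselves are invented for illustration.

```python
import statistics

# Hypothetical systolic readings (mmHg) for one participant at one visit,
# listed in the order they were taken.
readings = [152, 146, 144]

rules = {
    "first reading": lambda r: r[0],
    "average of all readings": lambda r: statistics.mean(r),
    "highest of first two": lambda r: max(r[:2]),
}

baseline = {name: rule(readings) for name, rule in rules.items()}

for name, value in baseline.items():
    print(f"{name}: {value:.1f} mmHg")
```

The same participant receives a different baseline value depending on the rule, so a report that omits the rule leaves the baseline figures open to interpretation.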

Baseline data as entry criteria
In some circumstances, threshold levels of one or more baseline variables will form part of the entry criteria for the study. If an extreme value of a baseline factor, such as high blood pressure, is required to qualify a person for entry into a study, potential participants whose value on the day of screening is more extreme (higher) than their usual level will be more likely to qualify for entry. On a second reading, the average blood pressure for this group will tend to be lower and will more accurately reflect their usual blood pressure; this is known as regression towards the mean.3 For this reason, remeasuring factors required for entry is desirable, to establish a more realistic group average value of the characteristic at baseline.
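
Regression towards the mean can be demonstrated with a small simulation. The sketch below assumes that each person has a stable usual blood pressure and that each reading adds independent day-to-day variation; the entry threshold, the population mean and the two standard deviations are hypothetical values chosen only to make the effect visible.

```python
import random
import statistics

rng = random.Random(42)
THRESHOLD = 160.0  # hypothetical entry criterion, systolic mmHg

# Each person has a stable "usual" level; every reading adds independent noise.
usual = [rng.gauss(150, 10) for _ in range(20000)]
screening = [u + rng.gauss(0, 8) for u in usual]
repeat = [u + rng.gauss(0, 8) for u in usual]

# Only people whose screening reading meets the threshold enter the study.
eligible = [i for i, s in enumerate(screening) if s >= THRESHOLD]

mean_screening = statistics.mean(screening[i] for i in eligible)
mean_repeat = statistics.mean(repeat[i] for i in eligible)
mean_usual = statistics.mean(usual[i] for i in eligible)

print(f"eligible participants: {len(eligible)}")
print(f"mean screening reading: {mean_screening:.1f}")
print(f"mean repeat reading:    {mean_repeat:.1f}")
print(f"mean usual level:       {mean_usual:.1f}")
```

Under these assumptions, the repeat reading in the selected group averages lower than the screening reading and sits close to the group's usual level, even though no treatment has been given.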

Presentation
The baseline characteristics are usually presented in the first table in a report. Care should be taken to include the necessary descriptive information without overwhelming readers with unnecessary details. For example, in the recent AFFIRM trial comparing rate control with rhythm control of atrial fibrillation, the published first table has 16 baseline characteristics, each with a mean or percentage value for the overall group, and for both treatment groups separately, together with P values.4 The resulting table of 107 values and four footnotes may make it difficult for some readers to extract the key information.5 A simpler presentation appears in the FRISC II study of invasive compared with non-invasive treatments for unstable coronary artery disease.6 That table presents more baseline characteristics (20) but, by minimising detail (omitting overall group values and P values), allows a more rapid comparison of the characteristics between groups.

Comparability between groups
If randomisation has been performed correctly, the groups should be similar in baseline characteristics, except for the play of chance. Stratification in the randomisation process further restricts the extent of chance imbalances.2 For continuous variables (such as blood pressure, age, cholesterol level), the similarity of the treatment groups should be assessed by comparing relevant summary measures (mean and standard deviation, or median and range). For categorical factors (such as sex, disease stage), the numbers and proportions in each category level should be shown for each treatment group. The more similar the treatment groups, the more credibly the trial results reflect a true effect of treatment, especially if unadjusted analyses are presented.
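
The summary measures described above can be computed per treatment group as in the following minimal sketch. The participant records, the group names and the choice of age and sex as the example variables are all hypothetical.

```python
import statistics
from collections import Counter

# Hypothetical participants: (treatment group, age in years, sex)
participants = [
    ("treatment", 61, "F"), ("treatment", 58, "M"), ("treatment", 70, "M"),
    ("control", 64, "F"), ("control", 55, "F"), ("control", 67, "M"),
]


def summarise(rows):
    """Mean and SD for a continuous variable; count and proportion per category."""
    ages = [age for _, age, _ in rows]
    sexes = Counter(sex for _, _, sex in rows)
    n = len(rows)
    return {
        "n": n,
        "age_mean": statistics.mean(ages),
        "age_sd": statistics.stdev(ages),  # sample standard deviation
        "sex": {level: (count, count / n) for level, count in sorted(sexes.items())},
    }


# Group the rows by treatment arm, then summarise each arm separately.
by_group = {}
for row in participants:
    by_group.setdefault(row[0], []).append(row)

table = {group: summarise(rows) for group, rows in by_group.items()}

for group, s in table.items():
    print(f"{group} (n={s['n']}): age {s['age_mean']:.1f} (SD {s['age_sd']:.1f})")
    for level, (count, proportion) in s["sex"].items():
        print(f"  sex {level}: {count} ({proportion:.0%})")
```

Presenting the groups side by side in this form lets readers judge similarity directly from the summary values.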

Use of P values to assess randomisation
Use of statistical tests to compare the balance and/or values of baseline characteristics between the study groups and the presentation of P values are not uncommon. However, many authors assert that this is inappropriate.3,5,8-10 If randomisation has been performed correctly, chance is the only explanation for any observed difference between groups at the outset of the study, in which case statistical tests become superfluous. Consequently, significance tests on the baseline data can be readily justified only if it is suspected that the randomisation process has failed or was flawed.8 It is worth remembering that, if 20 baseline characteristics are presented from a trial using simple randomisation, it is more likely than not that at least one characteristic will show a significant imbalance between groups at two-sided P < 0.05 by chance alone (actual likelihood, 64%, assuming the characteristics are independent of one another).
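
The 64% figure follows from the complement rule: if each of n tests has a 5% chance of a false-positive result and the tests are independent, the chance that none is significant is 0.95 raised to the power n. The sketch below reproduces the calculation; independence between characteristics is an assumption, and correlated characteristics would give a different value.

```python
def prob_any_significant(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests gives P < alpha
    when there is no true difference between groups."""
    return 1 - (1 - alpha) ** n_tests


for n in (1, 5, 10, 20):
    print(f"{n:2d} characteristics: {prob_any_significant(n):.1%}")
```

For 20 characteristics the function returns approximately 0.64, matching the likelihood quoted above.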

In any case, providing P values is not a substitute for carefully describing, in the results section, any imbalances between study groups that may be clinically important. For example, in a trial of a thrombolytic drug, a 1% baseline difference in history of previous intracranial haemorrhage may not be statistically significant, but could still affect haemorrhagic stroke rates after treatment (an outcome of the study), and hence could be regarded as potentially clinically significant. If there are imbalances that are considered important to the final study results, they should be accounted for by an adjusted analysis of the data, not simply noted with a P value in the first table.7

Other uses of baseline data
A longer-term benefit of collecting comprehensive baseline data is that, after outcome data become available, it allows the estimation of risk of the outcome in the control group, related to various baseline characteristics. This effectively uses the control group as an epidemiological cohort study, providing contemporary information about predictors of disease outcomes.

In summary, careful planning and collection of baseline data enable the performance of a high-quality trial and allow readers to clearly see the internal and external validity of the study.