Mark and recapture

Mark and recapture is a method commonly used in ecology to estimate population size and population vital rates (i.e., survival, movement, and growth). This method is most valuable when a researcher fails to detect all individuals present within a population of interest every time that researcher visits the study area. Other names for this method, or closely related methods, include capture-recapture, capture-mark-recapture, mark-recapture, sight-resight, mark-release-recapture and band recovery.

Another major application of these methods is in epidemiology, where they are used to estimate the completeness of ascertainment of disease registers. Typical applications include estimating the number of people who need particular services (for example, services for children with learning disabilities or for frail elderly people living in the community) or who have particular conditions (for example, drug addiction or HIV infection).

Field work related to mark-recapture
Typically a researcher visits a study area and uses traps to capture a group of individuals alive. Each of these individuals is marked with a unique identifier (e.g., a numbered tag or band), and then is released unharmed back into the environment.

Sufficient time is allowed to pass for the marked individuals to redistribute themselves among the unmarked population.

Next, the researcher returns and captures another sample of individuals. Some of the individuals in this second sample will have been marked during the initial visit and are now known as recaptures. Other animals captured during the second visit will not have been captured during the first visit to the study area. These unmarked animals usually are given a tag or band during the second visit and then are released.

Population size can be estimated from as few as two visits to the study area. Commonly, more than two visits are made, particularly if estimates of survival or movement are desired. Regardless of the total number of visits, the researcher simply records the date of each capture of each individual. The "capture histories" generated are analyzed mathematically to estimate population size, survival, or movement.
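As an illustration of how capture histories can be assembled from the recorded dates, the following Python sketch turns a list of (individual, date) records into one string per individual, with "1" for a visit on which the animal was caught and "0" otherwise. The function name, the tag numbers, and the dates are hypothetical and chosen only for the example.

```python
from collections import defaultdict

def build_capture_histories(records, visit_dates):
    """Return {individual_id: history string}, one character per visit.

    '1' means the individual was captured on that visit, '0' that it was not.
    """
    seen = defaultdict(set)
    for individual_id, date in records:
        seen[individual_id].add(date)
    return {
        individual_id: "".join("1" if d in dates else "0" for d in visit_dates)
        for individual_id, dates in seen.items()
    }

# Hypothetical field records: (tag number, date of capture).
records = [
    ("T01", "2024-05-01"), ("T02", "2024-05-01"),
    ("T01", "2024-05-08"), ("T03", "2024-05-08"),
    ("T02", "2024-05-15"), ("T03", "2024-05-15"),
]
visits = ["2024-05-01", "2024-05-08", "2024-05-15"]
histories = build_capture_histories(records, visits)
```

Individuals that were never captured do not appear in the result at all, which is exactly the gap the estimation methods try to fill.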

In the epidemiological setting, different sources of patients take the place of the repeated field visits in ecology. To take a concrete example, in establishing a register of children with Type 1 diabetes, children were identified from hospital admission records, from general practitioners (family doctors), and from the records of the local Diabetes Association. None of these sources had a complete list, but combining them made two things possible: first, to see how many children were identified in total, and second, to estimate how many more children with Type 1 diabetes were living in the community.
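The first of these two steps, pooling the sources and counting who appears where, can be sketched in Python as follows. The source names and patient identifiers are invented for illustration, and the sketch assumes that records have already been matched so that each child has a single identifier across sources.

```python
from collections import Counter

def summarize_sources(sources):
    """Count distinct individuals and how many share each presence pattern.

    `sources` maps a source name to the set of identifiers it lists.
    A pattern is a tuple of 0/1 flags, one per source in sorted name order.
    """
    names = sorted(sources)
    everyone = set().union(*sources.values())
    patterns = Counter(
        tuple(int(person in sources[name]) for name in names)
        for person in everyone
    )
    return names, len(everyone), patterns

# Hypothetical lists of matched patient identifiers.
sources = {
    "hospital": {"c1", "c2", "c3", "c4"},
    "gp": {"c2", "c3", "c5"},
    "association": {"c3", "c4", "c6"},
}
names, total_identified, patterns = summarize_sources(sources)
```

The pattern of all zeros (children on none of the lists) can never be observed; estimating its count is the second step.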

Lincoln-Petersen method of analysis
The Lincoln-Petersen method can be used to estimate population size if only two visits are made to the study area. This method assumes that the study population is "closed." In other words, the two visits to the study area are close enough in time so that no individuals die, are born, move into the study area (immigrate) or move out of the study area (emigrate) between visits. The model also assumes that no marks fall off animals between visits to the field site by the researcher, and that the researcher correctly records all marks.

Given those conditions, estimated population size is:


 * $$N = \frac{n1n2}{m},$$

where


 * N = Estimate of total population size
 * n1 = Total number of animals captured on the first visit
 * n2 = Total number of animals captured on the second visit
 * m = Number of animals captured on the first visit that were then recaptured on the second visit
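A minimal Python sketch of this formula is given below. The function name and the example counts are hypothetical; the check for zero recaptures is added because the ratio is undefined when m is 0.

```python
def lincoln_petersen(n1, n2, m):
    """Estimate population size from two visits.

    n1: animals captured on the first visit
    n2: animals captured on the second visit
    m:  animals from the first visit recaptured on the second
    """
    if m <= 0:
        raise ValueError("at least one recapture is needed (m > 0)")
    if m > min(n1, n2):
        raise ValueError("m cannot exceed n1 or n2")
    return n1 * n2 / m

# Hypothetical counts, for illustration only.
estimate = lincoln_petersen(n1=40, n2=60, m=12)
```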

Derivation of the Lincoln-Petersen method
The above equation is derived as follows. The researcher defines the sample on the first visit, n1, to be a population. The researcher can then estimate the proportion of this newly-defined population that is captured on the second visit: m / n1. This ratio estimates the probability of capturing a previously-marked individual during the second visit.

For example, suppose 50 individuals are marked on the first visit and 25 of those individuals are recaptured on the second visit. The researcher concludes that the probability of capturing a previously-marked individual on the second visit is: m / n1 = 25 / 50 = 0.50.

The researcher then assumes that, on the second visit, all individuals in the actual population, N, have the same capture probability as the previously-marked individuals. (This assumption is critical, and cannot be tested in a two-visit study.) Imagine the researcher thinking on the second visit, "I know that today I recaptured 50% of the animals I marked during my first visit, so today I probably also captured 50% of the individuals that I did not mark on my first visit. Indeed, today I probably captured 50% of all the individuals present in the study site regardless of whether or not those individuals were marked on my first visit." This is expressed as:


 * $$ \frac{n2}{N} = \frac{m}{n1}.$$

Cross-multiplying and solving for N gives the formula used for the Lincoln-Petersen method:
 * $$N = \frac{n1n2}{m}.$$
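The reasoning in this derivation can be checked with a small simulation. The sketch below assumes a closed population in which every individual has the same, independent chance of capture on each visit; the function name, population size, capture probability, and seed are all hypothetical choices made for the illustration.

```python
import random

def simulate_two_visits(population_size, capture_prob, seed=0):
    """Simulate two visits to a closed population with equal catchability.

    Returns (n1, n2, m): captures on visit one, captures on visit two,
    and the number of individuals captured on both visits.
    """
    rng = random.Random(seed)
    n1 = n2 = m = 0
    for _ in range(population_size):
        first = rng.random() < capture_prob
        second = rng.random() < capture_prob
        n1 += first
        n2 += second
        m += first and second
    return n1, n2, m

n1, n2, m = simulate_two_visits(population_size=1000, capture_prob=0.5)
estimate = n1 * n2 / m  # Lincoln-Petersen estimate of the true size, 1000
```

Under these assumptions the estimate should fall reasonably close to the true population size. Changing the simulation so that marked animals are harder or easier to catch on the second visit would show how the estimate drifts when equal catchability fails.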

Sample calculation
A biologist wants to estimate the size of a population of turtles in a lake. She captures 10 turtles on her first visit to the lake, and marks their backs with paint. A week later she returns to the lake and captures 15 turtles. Five of these 15 turtles have paint on their backs, indicating that they are recaptured animals.
 * $$N = \frac{n1n2}{m} = \frac{10 \times 15}{5} = 30$$

In this example, the Lincoln-Petersen method estimates that there are 30 turtles in the lake.

A refined form
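A less biased estimate of population size can be obtained with a modified version of the first formula above, often called the Chapman estimator. The modification reduces bias in small samples and, unlike the original formula, still gives a finite value when there are no recaptures (m = 0):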


 * $$N = \frac{(n1+1)(n2+1)}{(m+1)} - 1,$$

where, as before,


 * N = Estimate of total population size
 * n1 = Total number of animals captured on the first visit
 * n2 = Total number of animals captured on the second visit
 * m = Number of animals captured on the first visit that were then recaptured on the second visit

An approximately unbiased estimate of the variance of N, written var(N), is:


 * $$\operatorname{var}(N) = \frac{(n1+1)(n2+1)(n1-m)(n2-m)}{(m+1)^2(m+2)}.$$
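The refined estimate and its variance can be computed together, as in the following Python sketch. The function name is hypothetical; the example reuses the turtle counts from the sample calculation (10 captured, 15 captured, 5 recaptured).

```python
import math

def chapman_estimate(n1, n2, m):
    """Return (estimate, variance) using the refined two-visit formulas."""
    if not 0 <= m <= min(n1, n2):
        raise ValueError("m must lie between 0 and min(n1, n2)")
    estimate = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    variance = ((n1 + 1) * (n2 + 1) * (n1 - m) * (n2 - m)) / (
        (m + 1) ** 2 * (m + 2)
    )
    return estimate, variance

# Turtle example: 10 marked, 15 caught on the second visit, 5 recaptures.
estimate, variance = chapman_estimate(10, 15, 5)
standard_error = math.sqrt(variance)
```

For the turtle data this gives an estimate of about 28.3 turtles with a standard error of about 5.9, slightly below the 30 obtained from the unmodified formula.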

More than two visits
The literature on the analysis of capture-recapture studies has blossomed since the early 1990s, and very elaborate statistical models are now available for the analysis of these experiments. A simple model that easily accommodates a three-source or three-visit study is a Poisson regression model. The open-source R programming language (a freely available implementation of the S programming language) can fit such a model, and also has a number of specialized packages for more complex analyses. More sophisticated mark-recapture models can be fit using specialized software packages such as the programs MARK or M-SURGE.
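To make the Poisson regression idea concrete, the sketch below fits a log-linear model to the seven observable presence patterns of a three-source study and uses the fitted intercept to predict the unobserved pattern (individuals missed by all three sources). It is a plain-Python illustration rather than the R workflow mentioned above, and it rests on assumptions not stated in the text: the sources are treated as independent (main effects only, no interaction terms), the model is fitted by iteratively reweighted least squares, and the cell counts and all function names are invented for the example.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_poisson(design, counts, iterations=50, tol=1e-10):
    """Fit log(mu) = design . beta by iteratively reweighted least squares."""
    p = len(design[0])
    eta = [math.log(y + 0.1) for y in counts]  # start near the observed counts
    beta = [0.0] * p
    for _ in range(iterations):
        mu = [math.exp(e) for e in eta]
        z = [e + (y - w) / w for e, y, w in zip(eta, counts, mu)]
        xtwx = [[sum(w * row[j] * row[k] for row, w in zip(design, mu))
                 for k in range(p)] for j in range(p)]
        xtwz = [sum(w * row[j] * zi for row, w, zi in zip(design, mu, z))
                for j in range(p)]
        new_beta = solve_linear(xtwx, xtwz)
        eta = [sum(b * x for b, x in zip(new_beta, row)) for row in design]
        change = max(abs(nb - ob) for nb, ob in zip(new_beta, beta))
        beta = new_beta
        if change < tol:
            break
    return beta

def three_source_estimate(cell_counts):
    """Estimate total population from counts keyed by (a, b, c) patterns.

    The pattern (0, 0, 0) is unobservable and must not be included.
    """
    patterns = sorted(cell_counts)
    design = [[1.0, a, b, c] for a, b, c in patterns]
    counts = [cell_counts[p] for p in patterns]
    beta = fit_poisson(design, counts)
    missed = math.exp(beta[0])  # predicted count for pattern (0, 0, 0)
    return sum(counts) + missed

# Hypothetical counts of individuals by presence in sources A, B, C.
cells = {
    (1, 0, 0): 240, (0, 1, 0): 160, (0, 0, 1): 60,
    (1, 1, 0): 160, (1, 0, 1): 60, (0, 1, 1): 40, (1, 1, 1): 40,
}
total_estimate = three_source_estimate(cells)
```

Because the intercept corresponds to the pattern with all indicators equal to zero, its exponential is the model's prediction for the missed individuals. If the sources are not independent (for example, if hospital patients are more likely to be known to a patient association), interaction columns would need to be added to the design matrix, and the estimate can change substantially.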

Integrated approaches
Modeling of mark-recapture data is trending towards a more integrative approach, which combines mark-recapture data with population dynamics models and other types of data. The integrated approach is more computationally demanding, but it extracts more information from the data, improving both parameter estimates and estimates of their uncertainty.