H-index

The $$h$$-index quantifies both the scientific productivity and the scientific impact of a scientist. The index is based on the set of the scientist's most cited papers and the number of citations those papers have received in other publications. The index can also be applied to the productivity and impact of a group of scientists, such as a department, a university, or a country. It was suggested in 2005 by Jorge E. Hirsch as a tool for assessing the relative quality of theoretical physicists and is sometimes called the Hirsch index or Hirsch number. The $$h$$-index has yet to supplant older metrics.

Definition and purpose
The index is based on the distribution of citations received by a given researcher's publications. Hirsch writes:
 * A scientist has index h if h of his Np papers have at least h citations each, and the other (Np - h) papers have at most h citations each.

Thus, a scholar with an index of h has published h papers that have each been cited at least h times. The h-index therefore balances the number of publications against the number of citations per publication. It is designed to improve upon simpler measures, such as the total number of citations or publications, by distinguishing truly influential scientists from those who simply publish many papers. It is also not inflated by a single paper with many citations. The index works properly only for comparing scientists working in the same field, because citation conventions differ widely among fields.
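The definition translates directly into a short calculation: rank the papers by citation count in descending order and find the largest rank h at which the h-th paper still has at least h citations. A minimal sketch in Python, using made-up citation counts:

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts."""
    # Sort descending: the i-th entry (1-based) is the i-th most cited paper.
    ranked = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(ranked, start=1):
        if c >= i:
            h = i  # at least i papers have >= i citations each
        else:
            break  # descending order: the condition cannot hold again
    return h

# Five papers with these citation counts give h = 4.
print(h_index([10, 8, 5, 4, 3]))  # -> 4
```

Note how a single very highly cited paper barely moves the result: replacing the top count with 1,000 still yields h = 4, which illustrates the insensitivity to singular hits discussed above.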

Web tools, such as the h-index calculator on the QuadSearch metasearch engine, can compute a scientist's h-index directly. Alternatively, the h-index can be determined manually from free databases such as Google Scholar, while subscription-based databases such as Scopus and the Web of Knowledge provide automatic calculation and more complete coverage. The h-index serves as an alternative to more traditional metrics, such as the journal impact factor, for evaluating the impact of an individual researcher's work. Because only the most highly cited articles contribute to the h-index, it is relatively simple to determine. Hirsch has demonstrated that $$h$$ has high predictive value for whether a scientist has won honors such as National Academy membership or the Nobel Prize. In physics, a moderately productive scientist should have an $$h$$ roughly equal to the number of years of service, while biomedical scientists tend to have higher values.

Advantages
The main disadvantages of the older bibliometric indicators, such as total number of papers or total number of citations, are that the former does not account for the quality of scientific publications, while the latter is disproportionately affected by participation in a single publication of major influence. The h-index is intended to measure simultaneously the quality and sustainability of scientific output, as well as, to some extent, the diversity of scientific research. For instance, the h-index is much less affected by methodological papers proposing successful new techniques, methods, or approximations. For example, one of the most cited condensed-matter theorists, John P. Perdew, has been very successful in devising new approximations within the widely used density functional theory. He has published 3 papers cited more than 5,000 times and 2 cited more than 4,000 times. Several thousand papers using density functional theory are published every year, most of them citing at least one paper by J. P. Perdew. His total citation count is close to 39,000, while his h-index, 51, is large but not unique. In contrast, the condensed-matter theorist with the highest h-index (94), Marvin L. Cohen, has a lower total citation count of 35,000. One can argue that in this case the h-index reflects the broader impact of Cohen's papers in solid-state physics, due to his larger number of highly cited papers.

The h-index can also be calculated as a function of time, in two different ways. Hirsch originally proposed that h depends linearly on the age of a researcher; in this case the time derivative makes it possible to compare scientists of different ages. Another possibility is to calculate h using only papers published within a particular period, for instance the last 10 years, thus measuring current productivity rather than lifetime achievement.
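The second variant can be sketched by filtering the publication list to the chosen window before applying the usual rule. A minimal Python illustration, assuming (purely for illustration) that papers are stored as (year, citation_count) pairs:

```python
def windowed_h(papers, first_year, last_year):
    """h-index restricted to papers published in [first_year, last_year].

    `papers` is a list of (year, citation_count) pairs -- an assumed,
    illustrative data shape, not a standard API.
    """
    counts = sorted(
        (c for year, c in papers if first_year <= year <= last_year),
        reverse=True,
    )
    # Count ranks i (1-based, in descending order) with at least i citations.
    return sum(1 for i, c in enumerate(counts, start=1) if c >= i)

papers = [(2004, 30), (2005, 12), (2006, 8), (2010, 7), (2012, 2)]
print(windowed_h(papers, 2005, 2012))  # recent-window h -> 3
print(windowed_h(papers, 1900, 2100))  # lifetime h -> 4
```

A wide-open window reproduces the ordinary lifetime h-index, so the windowed form is a strict generalization.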

Which source to search on?
There are three citation databases on which $$h$$ is commonly calculated: Scopus, Web of Knowledge, and Google Scholar. Each, however, is likely to produce a different $$h$$ for the same academic. Studies comparing the three have found that Web of Knowledge has strong coverage of journal publications but poor coverage of high-impact conferences (a particular problem for computer science scholars); Scopus has better coverage of conferences but poor coverage of publications prior to 1992; Google Scholar has the best coverage of conferences and of most journals (though not all) but, like Scopus, has limited coverage of pre-1990 publications. Google Scholar has also been criticized for including gray literature in its citation counts, although one study showed that the majority of the additional citation sources it uses are legitimate refereed forums. It should be remembered that the content of all of these databases, particularly Google Scholar, changes continually, so any research on their content risks going out of date. To deal with the sometimes wide variation in $$h$$ for a single academic across the possible citation databases, it has been suggested that one assume false negatives in the databases are more problematic than false positives and take the maximum $$h$$ measured.

Criticism
There are a number of situations in which $$h$$ may provide misleading information about a scientist's output:


 * The h-index does not consider the context of citations. For example, citations in a paper are often made simply to flesh out an introduction and otherwise have no significance to the work. Nor does h distinguish other contextual cases: citations made in a negative context, or citations of fraudulent or retracted work.


 * The h-index does not account for confounding factors. These include the practice of "gratuitous authorship", which is still common in some research cultures, the so-called Matthew effect, and the favorable citation bias associated with review articles.


 * The h-index has slightly less predictive accuracy and precision than the simpler measure of mean citations per paper.


 * The h-index is bounded by the total number of publications. This means that scientists with a short career are at an inherent disadvantage, regardless of the importance of their discoveries. For example, Évariste Galois' h-index is 2, and will remain so forever. Had Albert Einstein died in early 1906, his h-index would be stuck at 4 or 5, despite his being widely acknowledged as one of the most important physicists, even considering only his publications to that date.


 * While the h-index de-emphasizes singular successful publications in favor of sustained productivity, it may do so too strongly. Two scientists may have the same h-index, say $$h=30$$, yet one has 20 papers cited more than 1,000 times each and the other has none. Clearly the scientific output of the former is more valuable. Several corrections for this have been proposed, but none has gained universal support.


 * The h-index is affected by limitations in citation databases. Some automated searching processes find citations to papers going back many years, while others find only recent papers or citations. This issue is less important for those whose publication record started after automated indexing began around 1990. Citation databases also contain some citations that are not quite correct and therefore will not properly match the correct paper or author.


 * The h-index does not account for the number of authors of a paper. If the impact of a paper is the number of citations it receives, it is logical to divide that impact by the number of authors involved. (Some authors will have contributed more than others, but in the absence of information on contributions, the simplest assumption is to divide credit equally.) Ignoring the number of authors could also allow gaming of the h-index and similar indices: for example, two equally capable researchers could agree to share authorship on all their papers, thus increasing each of their h-indices. Even in the absence of such explicit gaming, the h-index and similar indices tend to favor fields with larger groups, e.g. experimental over theoretical.
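One simple correction along these lines is to divide each paper's citations equally among its authors before applying the h rule. The sketch below is illustrative only (the `(citations, n_authors)` data shape is an assumption, and this is one of several proposed fractional variants, not a standard definition):

```python
def fractional_h(papers):
    """h-index computed on per-author citation shares.

    `papers` is a list of (citations, n_authors) pairs, an assumed data
    shape; each paper's citations are divided equally among its authors.
    """
    shares = sorted((c / n for c, n in papers), reverse=True)
    # Usual h rule, applied to the fractional shares instead of raw counts.
    return sum(1 for i, s in enumerate(shares, start=1) if s >= i)

# Four papers: (citations, number of authors).
papers = [(100, 2), (60, 3), (40, 1), (9, 3)]
print(fractional_h(papers))  # shares 50, 20, 40, 3 -> h = 3
```

With raw counts the same four papers would give h = 4; discounting shared authorship lowers the index, which is exactly the effect such corrections aim for.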

General problems associated with any bibliometric index, namely the necessity of measuring scientific impact by a single number, apply here as well. For instance, comparing the two condensed-matter theorists with the highest h-indices, Marvin Cohen and Philip Anderson, we observe that their h-indices agree within 3%, although the latter is a Nobel Prize winner and a founder of entire new fields in condensed matter theory.

While the h-index is one 'measure' of scientific productivity, some object that a human activity as complex as the formal acquisition of knowledge should be condensed down to a single numeric metric. Two potential dangers of this have been expressed:
 * Career progression and other aspects of a person's life may be damaged by the use of a simple metric in a decision-making process by someone who has neither the time nor the intelligence to consider more appropriate decision metrics.
 * Scientists may respond by maximising their h-index to the detriment of doing more justifiable work. This effect of using simple metrics for management decisions has often been found to be an unintended consequence of metric-based decision making; for instance, governments routinely operate policies designed to minimise not crime, but crime figures.

Modifications of h-index and m value
Various proposals have been made to modify the h-index in order to emphasize different features.

Physical chemists from Berkeley and Stanford with high h-indices
The following is a list of some US physical chemists with high h-indices, compiled from "The Everyday Scientist" and the ISI Web of Science, using data only for professors at Stanford and U.C. Berkeley:


 * Richard N. Zare: h = 95
 * Gabor A. Somorjai: h = 90
 * Harden M. McConnell: h = 89
 * Graham R. Fleming: h = 75
 * Richard A. Mathies: h = 68

Biologists with high h-indices
An initial attempt has been made to apply the physics-oriented h-index to the life sciences (a blanket term for biology, botany, medicine, and so forth). Only ten names were listed in the reference, but a more thorough attempt would certainly lengthen this list. The following list is based on publications from 1983–2002.


 * Solomon H. Snyder: h = 191
 * Robert J. Lefkowitz: h = 164
 * David Baltimore: h = 160
 * Robert Gallo: h = 154
 * Pierre Chambon: h = 153
 * Bert Vogelstein: h = 151

Computer scientists with high h-indices
The following are six computer scientists with high h-indices (from http://www.cs.ucla.edu/~palsberg/h-number.html).


 * Hector Garcia-Molina: h = 70
 * Deborah Estrin: h = 68
 * Scott Shenker: h = 65
 * Don Towsley: h = 65
 * Jeffrey D. Ullman: h = 65
 * Robert Tarjan: h = 64

Economics researchers with high h-indices
The following are five economics researchers with high h-indices as measured by the University of Connecticut's RePEc Author Service (as of December 2006):


 * Andrei Shleifer: h = 36
 * Robert J. Barro: h = 32
 * Mark L. Gertler: h = 31
 * James J. Heckman: h = 29
 * N. Gregory Mankiw: h = 28

Scientists in other fields with high h-indices

 * Marcus Raichle: h = 89 (Neurologist and neuroscientist)
 * Endel Tulving: h = 65 (Cognitive psychologist)
 * Daniel Schacter: h = 64 (Cognitive psychologist)
 * George M. Whitesides: h = 135 (Chemistry)

Publications related to h-index

 * Hirsch, Jorge E. (2005), "An index to quantify an individual's scientific research output," PNAS 102(46):16569–16572, 15 November 2005 (free copy available from arXiv).
 * A Rational Indicator of Scientific Creativity
 * Sidiropoulos, A., Katsaros, D., and Manolopoulos, Y. (2006), "Generalized h-index for disclosing latent facts in citation networks."
 * Kelly, C. D., and Jennions, M. D. (2006), "The h index and career assessment by numbers," a paper expounding certain problems of the h-index.
 * "Impact factor," Science 309:1181, 19 August 2005.

Computing the h-index

 * A simple web script to compute a (raw) h-index based on Google Scholar
 * A MATLAB script to compute the h-index
 * Publish or Perish calculates various statistics, including the h-index and the g-index using Google Scholar data
 * The HView visualizer showing a sorted histogram of citations showing the h-number as the biggest square included in the histogram
 * Yet another web script highlighting the article(s) to cite to raise the h-number

Lists of h-indices

 * A long list of chemists with high h-index values
 * A ranking of American computer science departments based on the h-index
 * The H-index for computer science
 * H values for Stanford p-chem professors from "The Everyday Scientist"
