Language family

A language family is a group of languages related by descent from a common ancestor, called the proto-language. As with biological families, the evidence of relationship is observable shared characteristics. An accurately identified family is a phylogenetic unit; that is, all its members derive from a common ancestor, and all attested descendants of that ancestor are included in the family. Most of the world's languages are known to belong to families; for many others, however, family relationships are unknown or have been only tentatively proposed.

The concept of language families rests on the assumption that, over time, languages gradually diverge into dialects and then into new languages. Linguistic ancestry is less clear-cut than biological ancestry, however, because in extreme cases languages mix through contact in conquest or trade, whereas biological species do not normally interbreed. In the formation of creole languages and other types of mixed languages, there may be no single ancestor of a given language. In addition, many sign languages develop in isolation and may have no relatives at all. Such cases are relatively rare, and most languages can be unambiguously classified.

The common ancestor of a language family is seldom known directly, since most languages have a relatively short recorded history. However, it is possible to recover many features of a proto-language by applying the comparative method, a reconstructive procedure worked out by the 19th-century linguist August Schleicher. This method can demonstrate the validity of many of the proposed language families. For example, the reconstructible common ancestor of the Indo-European language family is called Proto-Indo-European. Proto-Indo-European is not attested by written records, since it was spoken before the invention of writing.

Sometimes, though, a proto-language can be identified with a historically known language. Provincial dialects of Latin ("Vulgar Latin") gave rise to the modern Romance languages, so the Proto-Romance language is more or less identical with Latin (if not exactly with the literary Latin of the Classical writers). Similarly, Old Norse, in its various dialects, is the proto-language of Norwegian, Swedish, Danish, Faroese and Icelandic.

Language families can be divided into smaller phylogenetic units, conventionally referred to as branches of the family, because the history of a language family is often represented as a tree diagram. However, the term family is not restricted to any one level of this "tree". The Germanic family, for example, is a branch of the Indo-European family. Some taxonomists restrict the term family to a certain level, but there is little consensus on how to do so. Those who affix such labels also subdivide branches into groups, and groups into complexes. The terms superfamily, phylum, and stock are applied to proposed groupings of language families whose status as phylogenetic units is generally considered to be unsubstantiated by accepted historical linguistic methods.
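
The tree model described above can be sketched in code. The following Python fragment is only a minimal illustration, not a standard tool: the class name LanguageNode and its methods are hypothetical, and the tree is deliberately incomplete. It contains only languages named in this article, plus a North Germanic grouping added here for illustration. It shows that the same kind of node serves as a family at one level and as a branch of a larger family at another.

```python
class LanguageNode:
    """A node in a language family tree (hypothetical, for illustration only).

    A node with children stands for a family or branch; a node without
    children stands for an individual language.
    """

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

    def find(self, name):
        """Return the node with the given name, or None if it is absent."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

    def languages(self):
        """Return the names of all languages (leaf nodes) under this node."""
        if not self.children:
            return [self.name]
        result = []
        for child in self.children:
            result.extend(child.languages())
        return result


# Deliberately incomplete tree; "North Germanic" is an illustrative grouping.
indo_european = LanguageNode("Indo-European", [
    LanguageNode("Germanic", [
        LanguageNode("North Germanic", [
            LanguageNode("Norwegian"),
            LanguageNode("Swedish"),
            LanguageNode("Danish"),
            LanguageNode("Faroese"),
            LanguageNode("Icelandic"),
        ]),
    ]),
    # Greek is alone in its own branch of the family.
    LanguageNode("Greek"),
])

# Germanic is a family in its own right and also a branch of Indo-European.
germanic = indo_european.find("Germanic")
print(germanic.languages())
print(indo_european.languages())
```

Because every node has the same shape, the code does not need to decide which level counts as the "family"; that choice is left to whoever labels the levels, which mirrors the lack of consensus noted above.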

Languages that cannot be reliably classified into any family are known as isolates. A language isolated in its own branch within a family, such as Greek within Indo-European, is often also called an isolate, but in such cases the meaning of isolate is usually clarified; Greek, for instance, might be referred to as an Indo-European isolate. The isolation of modern Greek, however, is not typical of its relationship to other languages at other times in its history: several Greek dialects evolved out of the larger Indo-European language group, and Greek words later influenced many other languages. By contrast, Basque is a living modern language and a near-perfect isolate. The history of its lexical, phonetic, and syntactic structures is not known, and these structures are not easily associated with those of other languages, though Basque has been influenced by Romance languages in the region, such as Castilian Spanish, Occitan, and French.

Connections among language families are often used by anthropologists, in combination with DNA and fossil evidence, to help reconstruct prehistoric migrations of peoples and other prehistoric events, such as the spread of agriculture.

The Linguist List is now working on Multitree, a project funded by the National Science Foundation, to build a database of all hypothesized language relationships, with a full searchable bibliography for each.