This article is currently unfinished.

Turn up the racism dials, we're gonna do some genetic archeology of the stupid variety. I'm kidding about the racism part, but unfortunately not the others. What I intend to do here is speculate on possible links between language families based on haplogroup lineages. By the way, I'm in no way qualified to handle this subject. This means you're in for a treat.

For some background: I am going to be focusing mainly on Y-DNA haplogroups, because they're, frankly, the more interesting ones. My interest in them was spurred by the fact that they correlate much more with language families than mitochondrial DNA haplogroups. This is the basis of the Father Tongue hypothesis: that the most common form of language transmission is the mother teaching the child the father's language. There is quite a bit of evidence for this theory. For instance, Haplogroup R is highly correlated with the Indo-European languages. Likewise, Haplogroup N is correlated with the Uralic languages, certain forms of Haplogroup O with the Sino-Tibetan languages, and so on. Of course, sometimes this pattern breaks: Hungarians, from a Y-DNA perspective, resemble Southern Slavs, though their mtDNA retains Uralic markers. Likewise, certain forms of Haplogroup E are correlated with the Afro-Asiatic languages, except in the Arabian Peninsula, where Haplogroup J is more common.

Everything that I have stated on this matter up until this point has landed within the range from "likely true" to "reasonable enough to probably be true." However, that's no fun. What I'm here for is speculation. I must put a disclaimer here: this is completely unscientific. Ask a linguist for real answers.

One important fact about haplogroups is that each is defined by mutations specific to it, meaning one can trace their lineages. Take for example Haplogroup R, the one most associated with Indo-Europeans. According to the Kurgan hypothesis, the Indo-Europeans originated in the Pontic steppe. One hypothesis also places the origin of Haplogroup R here. However, Haplogroup R has deeper origins that tell a very interesting story. Its parent, Haplogroup P, originated in Southeast Asia. The other major descendant of Haplogroup P, Haplogroup Q, is strongly correlated with Native Americans. Which tribes? Well, most of them, it turns out. What we can conclude from this is that the closest paternal relatives of the Indo-Europeans, in an odd twist, include the majority of the Native American peoples. Don't expect a reconstruction of Proto-Indo-European-Amerind to be made anytime soon, though. There are too many technical problems hindering that project.

This fact is striking, because the usual suspects for large macrofamilies that subsume Indo-European tend to include some of its direct neighbors. For instance, the Uralic, Dravidian, Semitic, and even Kartvelian languages have all at some point been suspected as possible siblings of Indo-European. So, what about these ones? Let's start with Uralic. As previously stated, these languages strongly correlate with Haplogroup N. Using similar logic to before, we can conclude that these languages might be related to the languages correlated with its immediate neighbor on the tree, Haplogroup O. Except, which languages are these? It's at this point that we stumble upon a further mystery: the potential relationships between the "East Asian languages." Various forms of Haplogroup O are associated with, to name a few, the Sino-Tibetan, Hmong-Mien, Austroasiatic, Austronesian, and Kra-Dai languages. Perhaps, it has been argued, these languages are in fact related to each other. Some even add Japonic and Koreanic, though this is more ambiguous. However, if we add the Uralic languages into the mix, this might actually help the chances of including Japonic and Koreanic, oddly enough. Japanese and Korean have perplexed linguists for years, being (somewhat) isolates that have multiple potential leads for familial membership. In the case of Japanese, some have theorized that it is a mix of an "East Asian" (Austronesian? Kra-Dai?) substratum and an "Altaic" (ambiguous) superstratum. However, if we link Uralic and the broad East Asian families, perhaps the half-Eastern-half-Altaic nature of Japanese is a sign of membership in this group. We can make a similar argument for Korean, and possibly other languages. Perhaps the Tungusic languages? Haplogroups N and O both occur in their populations. Frankly, though, the complete mess given the "Altaic" label is hard to unravel, even from (perhaps "especially from"?) the haplogroup angle.

The case of the Afro-Asiatic languages is also interesting. It is suspected that their original spread correlated with the spread of Haplogroup E1b1b. This is because the most common candidate for an Afro-Asiatic Urheimat is in Africa, and African speakers of these languages carry this haplogroup. This leads to the question of which languages go with its sibling, Haplogroup E1b1a. These turn out, perhaps unsurprisingly, to be the Niger-Congo languages. Given this, you can guess that I am more than willing to attempt a link between the Niger-Congo and Afro-Asiatic languages, and I wouldn't be the first.
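For the programmers in the audience, the "logic" I keep reusing can be sketched in a few lines of Python: look up a haplogroup's siblings on the tree, then see which language families hang off them. This is a toy under loud assumptions. The tree is cut down to the branches mentioned so far, the parent labels NO and E1b1 are the usual names for those nodes rather than anything I argued for above, the function names are made up for illustration, and the haplogroup-to-language table is just my own claims restated. It is exactly as scientific as the rest of this article.

```python
# Simplified Y-DNA tree as child -> parent, limited to branches discussed here.
# Assumption: "NO" and "E1b1" are the conventional labels for these parents.
PARENT = {
    "R": "P",
    "Q": "P",
    "N": "NO",
    "O": "NO",
    "E1b1a": "E1b1",
    "E1b1b": "E1b1",
}

# Rough haplogroup -> language family associations, as claimed in this article.
LANGUAGES = {
    "R": ["Indo-European"],
    "Q": ["Native American families"],
    "N": ["Uralic"],
    "O": ["Sino-Tibetan", "Hmong-Mien", "Austroasiatic", "Austronesian", "Kra-Dai"],
    "E1b1a": ["Niger-Congo"],
    "E1b1b": ["Afro-Asiatic"],
}


def siblings(haplogroup):
    """Return the other haplogroups that share a parent with this one."""
    parent = PARENT.get(haplogroup)
    if parent is None:
        return []  # not in our toy tree, or a root node
    return sorted(h for h, p in PARENT.items() if p == parent and h != haplogroup)


def speculative_relatives(haplogroup):
    """List the language families attached to a haplogroup's siblings."""
    families = []
    for sib in siblings(haplogroup):
        families.extend(LANGUAGES.get(sib, []))
    return families


if __name__ == "__main__":
    for hg in ("R", "N", "E1b1b"):
        print(hg, "->", speculative_relatives(hg))
```

Note what the sketch leaves out: it says nothing about how strong any of these correlations are, and a shared parent on the Y-DNA tree can be tens of thousands of years deep, far beyond what comparative linguistics can reach.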

The African continent offers some further language-haplogroup links that paint very interesting pictures. For instance, Africa is famously home to Haplogroup A, the original human Y-DNA haplogroup, as well as Haplogroup B, one of its earliest offshoots. These two haplogroups are correlated with the people who, depending on whether or not they speak click languages, are either "Khoisan" or "Nilo-Saharan". These two families, while still used for convenience, are often suspected of being "wastebasket taxa" invented by Joseph Greenberg. What matters for our purposes is their relative age. Haplogroup A in particular is often labeled as the haplogroup of the "original" inhabitants of Africa. If that's the case, why are there relatively few people with Haplogroups A or B? The answer is simple: they were almost completely annihilated. They were the victims of the Bantu Expansion, one of the greatest genocides ever committed by the human race. The only two countries today with a significant presence of Haplogroup A are Namibia and South Sudan, the latter being essentially created for them by fiat.

Another interesting case of dispersed populations is the Fula people. In the modern day, they speak a Niger-Congo language and practice Islam. While they appear rather typical for a West African population, they have a bizarre secret: an unusual preponderance of Haplogroup R. In all other aspects, such as mtDNA or autosomal DNA, they resemble their neighbors. Where did this Haplogroup R come from? Perhaps it is a similar case to certain Native American groups, where it is debated how much of their Haplogroup R presence is indigenous and how much is imported. However, as always, there is a much more outlandish explanation that I am much more willing to entertain: the Fula are the White Aethiopians. These were a group of people claimed to exist by authors such as Pliny the Elder, Pomponius Mela, Ptolemy, and Orosius. Supposedly, they were people, variously described as "white" or "olive", who had established territory in sub-Saharan Africa during ancient times. Of course, given a group of white guys living in Africa whose identity remained unknown, people scrambled to figure them out. Of note is that a name, supposedly of a king of these White Aethiopians, has been interpreted as having a Fulani affix. You can connect the dots.

An interesting reversal of the case of the Fula people is the case of the Arabs. Afro-Asiatic languages correlate with certain forms of Haplogroup E. However, remember that, while this still holds true within Africa, it ceases to be true in the Arabian Peninsula, where Haplogroup J is more common. In fact, given the data that we have available, it appears that Haplogroup J originated either in the Arabian Peninsula or somewhere nearby.

Now, given that I am speculating on human lineages, there are a couple of questions that are probably burning in the minds of some of you. For one, if Indo-Europeans and (most) Native Americans are so closely related, why do Indo-Europeans look white, while Native Americans look Asian? Not only that, if Indo-Europeans and Arabs aren't closely related, then why are there a lot of Arabs who look like they could be Europeans? Perhaps these are reasonable questions to have, but what should be stated is that physical appearance can diverge from paternal lineage. A Y-DNA haplogroup tracks only a single line of male ancestors, while appearance is shaped by the whole of a person's ancestry. Here's a good example: it is widely held that the Uralic languages have an Asian origin, but if you visit Finland or Hungary, the people look European. The reason for this is closely related to that comment I made about how, in many respects, modern Hungarians have genetic affinity with Slavs. Put simply, any physical features that would betray a non-European origin of the Magyars have been blended away via admixture with local populations.
