I'm generally a fan of "blurry" definitions where something can qualify as X if it fulfills a few of many criteria. I think trying to create hard rules around blurry areas like race and culture is a fool's errand, and Scott does a great job laying out how overly strict definitions can go wrong.
Depending on the publication, the first two PCs can capture up to 30% of the total variation, although it can be as low as the high single digits.
Regardless of the number, they're neither sucky nor negligible when you can do a PCA using 5,000 random SNPs from 300 random individuals (100 each from three populations: Europeans, East Asians, and West Africans) and get a clean triangle. Pick another 5,000 random SNPs from another 300 random individuals and you'll get another clean triangle. Rinse and repeat. Quite robust.
If anything, it's a testament to how genetically different Europeans, East Asians, and West Africans are that one can get consistent, clean separation in such low dimensionality (e.g., 2), robust to using different individuals and different loci. One can always add more PCs to capture more of the total variation, but that's just unnecessary for separating those three populations.
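For anyone who wants to poke at the "rinse and repeat" procedure, here is a rough sketch on simulated data, not on real genotypes. Everything in it is an assumption made for illustration: the three populations are synthetic, allele frequencies are drawn from a Balding-Nichols-style beta distribution with an assumed divergence parameter `fst=0.1`, and the helper names (`simulate_genotypes`, `pca_top2`, `nearest_centroid_accuracy`) are made up for this example. It only shows the mechanics of running PCA on 5,000 SNPs from 300 individuals and checking whether the first two PCs separate the groups; it says nothing about what real datasets give.

```python
import numpy as np


def simulate_genotypes(seed, n_per_pop=100, n_snps=5000, fst=0.1):
    """Simulate 0/1/2 genotypes for three synthetic populations.

    Assumption: a Balding-Nichols-style model with an illustrative fst,
    standing in for real data, which this sketch does not use.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical ancestral allele frequency for each SNP.
    p_anc = rng.uniform(0.1, 0.9, size=n_snps)
    # Beta parameters so each population's frequency drifts around p_anc.
    a = p_anc * (1 - fst) / fst
    b = (1 - p_anc) * (1 - fst) / fst
    pop_freqs = rng.beta(a, b, size=(3, n_snps))
    # Each genotype is the count of one allele across two chromosome copies.
    genotypes = np.vstack([
        rng.binomial(2, pop_freqs[k], size=(n_per_pop, n_snps))
        for k in range(3)
    ])
    labels = np.repeat(np.arange(3), n_per_pop)
    return genotypes, labels


def pca_top2(genotypes):
    """Return scores on the first two PCs and the fraction of variance they capture."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :2] * s[:2]
    explained = float((s[:2] ** 2).sum() / (s ** 2).sum())
    return scores, explained


def nearest_centroid_accuracy(scores, labels):
    """Fraction of individuals whose nearest group centroid in PC space is their own."""
    centroids = np.array([scores[labels == k].mean(axis=0) for k in range(3)])
    dists = np.linalg.norm(scores[:, None, :] - centroids[None, :, :], axis=2)
    return float((dists.argmin(axis=1) == labels).mean())


if __name__ == "__main__":
    # Each seed draws a fresh set of SNPs and a fresh set of individuals.
    for seed in range(3):
        genotypes, labels = simulate_genotypes(seed)
        scores, explained = pca_top2(genotypes)
        accuracy = nearest_centroid_accuracy(scores, labels)
        print(f"seed={seed} PC1+PC2 variance={explained:.3f} separation={accuracy:.3f}")
```

Nearest-centroid accuracy is just a crude numeric stand-in for eyeballing the triangle on a scatter plot. Swapping in real genotype matrices, or changing `fst`, is the obvious way to see how far the pattern holds.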