Sunday, July 14, 2019

The Case of the Missing DNA


A recent inquiry from a DNA match sent me into the mysteries of AncestryDNA matching. I was appalled at what I learned.

In the field of data science, we use the term "single source of truth." You would expect that DNA matching would be consistent across all the different testing companies, so any and all of them would be a valid source of truth. Not so. If you skip the story, please take a moment to look below at the explanatory graphic.

It started with a routine email from a gentleman I'll call Lou. He found our match on Family Tree DNA and asked if it was possible we were related through a particular surname. He closed by telling me he was of African-American descent.

Knowing the challenges faced by African-Americans who are researching their roots, I gave the email far more attention than I would have if it had been from a Caucasian match.

Lou and I had no shared matches on FTDNA, so I turned to Ancestry, where I have identified many cousins from that branch of my family. But there was a problem. Our match on Ancestry was only 8 cM (1 segment), whereas on FTDNA it was 27 cM (3 segments). How is that even possible?

Our FTDNA match includes a couple of short segments. Eliminating them leaves our FTDNA match at 19.9 cM (1 segment), still more than double what AncestryDNA showed. We exchanged some emails to discuss the source of the data.

  • Lou had uploaded his AncestryDNA results to FTDNA, MyHeritage and GedMatch.
  • Lou had tested with 23andMe and uploaded that result to GedMatch.
  • I had directly tested with FTDNA and MyHeritage, in addition to AncestryDNA.
  • I had uploaded my AncestryDNA result to GedMatch.

All match combinations except AncestryDNA are in the range 17.8-19.9 cM, each with one segment in approximately the same position on chromosome 12. Of course, we can't see which segment AncestryDNA is reporting or where it falls.
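The arithmetic behind this comparison can be checked with a minimal Python sketch. The variable names are illustrative, and the numbers are only the ones quoted in this post. The combined length of the two short FTDNA segments is inferred by subtraction, not read from FTDNA.

```python
# Figures quoted in this post, in centimorgans (cM).
ftdna_total_cm = 27.0          # FTDNA match, 3 segments
ftdna_longest_cm = 19.9        # FTDNA match after dropping the short segments
ancestry_cm = 8.0              # AncestryDNA match, 1 segment
other_range_cm = (17.8, 19.9)  # range of every non-Ancestry comparison

# Combined length of the two short FTDNA segments (inferred by subtraction).
short_segments_cm = round(ftdna_total_cm - ftdna_longest_cm, 1)

# How many times larger the trimmed FTDNA match is than the Ancestry match.
ratio = ftdna_longest_cm / ancestry_cm

# Does the Ancestry figure fall inside the range the other sources agree on?
ancestry_in_range = other_range_cm[0] <= ancestry_cm <= other_range_cm[1]

print(f"Short segments removed: {short_segments_cm} cM")
print(f"FTDNA / Ancestry ratio: {ratio:.2f}")
print(f"Ancestry within other sources' range: {ancestry_in_range}")
```

The point of the check is simply that the AncestryDNA figure is the outlier: it sits well below the range every other source agrees on.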


Choose Your Source of Truth


My brother's match to Lou also falls in this range. At AncestryDNA, Lou's brothers and daughter have a stronger match to me than Lou does. However, none of them have uploaded to GedMatch, so we can't see the science behind the numbers. This particular chromosome range does appear to fall in or near an ISOGG-documented slight pile-up area.

So this leaves the possibility that AncestryDNA has chosen to ignore some of the match due to pile-up. Wouldn't it be nice to be told that?

Does Ancestry have a computational error on my match with Lou, since Lou's daughter matches me more strongly than Lou matches me? Or does she match me in an entirely different way via her mother's lines?

What is the best source of truth? If you are using only AncestryDNA for your DNA matching, you are not seeing the whole truth. GedMatch is free. FTDNA is inexpensive. MyHeritage has some great tools. You can choose your source of truth.
