DNA-R1B1C7-L Archives



From: "David Ewing" <>
Subject: [DNA-R1B1C7] : Re: FTDNA Panel 4 Stability
Date: Sun, 27 Jan 2008 13:18:18 -0700


Sadly, all but one of the 7 men in our "large closely related group" of
Ewings exactly match the R1b1c7 modal for FTDNA markers 38-67, so it hasn't
given us the sort of information we were hoping for either in
differentiating the Ewing line from others or different branches of the
Ewings from one another. Congratulations to the Grier(son)/Greers!

To my understanding, the fact that you have identified a single mutation in
a slowly mutating marker that is shared by a large number of men with the
same surname gives you virtually no information about the TMRCA of the group
that you didn't already have based on your conventional genealogy (which is
that the MRCA must have lived before the oldest known ancestor who is not
the ancestor of all the men in the group). We have a similar situation in
the Ewing study--we have found identical 37-marker modals in several
kindreds not known to be related to one another on conventional grounds.
None of the men in our "large closely related group" of Ewings is further
than genetic distance 5/37 from the modal (and the large majority are within
genetic distance 2), though a few of the men in the group are as far as
genetic distance 7 or 8 from one another. Have a look at the network diagram
at:
http://www.clanewing.org/DNA_Project/DNA_ProjectResults/network/Y-DNA_Network_Detail.html

Here is an interesting fact, which I recall was shared by Matt Kaplan
(though it could have been Bruce Walsh) at the last FTDNA conference.
Suppose you have two pairs of 37-marker haplotypes, the members of each pair
separated by genetic distance one. One pair has a difference of one at
CDYa/b (a relatively rapidly mutating marker, estimated by Chandler at
0.03531--over 4x faster than the next fastest in the 37-marker panel) and
the other pair has a difference of one at any of the other (more slowly
mutating) markers. Which pair has the longer TMRCA? The surprising and
counter-intuitive answer is that the pair that differs at CDYa/b has a greater
TMRCA than the pair that differs at one of the other markers. How can this be?

As I understand it, the reason is as follows. We will expect to see a
mutation at CDY on average 0.03531 times the number of transmission events
in question. We will expect to see a mutation at ANY of the other 35 markers
on average the sum of their 35 mutation rates times the number of
transmission events in question, or (saying the same thing in another way)
35 times their average mutation rate times the number of transmission
events. So if we have ten transmission events, we will expect to see on
average 0.35 mutations at CDY, and we will expect to see on average 1.4
mutations among the other markers (using an estimated average mutation rate
of 0.004). What??!! Yes, in the same number of transmission events, we
expect to find four times as many mutations at other markers as we will
find at CDY. Of course, this all results from allowing yourself to look at
35 different places for a mutation instead of at one.
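
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The rates are the ones quoted above (Chandler's 0.03531 for CDY and a rough average of 0.004 for the other 35 markers); the function and variable names are made up for illustration.

```python
# Expected number of mutations = rate per marker x number of markers
# x number of transmission events.
CDY_RATE = 0.03531        # Chandler's estimate for CDY, as quoted in the post
OTHER_AVG_RATE = 0.004    # rough average for the other markers (an estimate)
OTHER_MARKERS = 35
TRANSMISSIONS = 10


def expected_mutations(rate_per_marker, markers, transmissions):
    """Average number of mutations expected across the given markers."""
    return rate_per_marker * markers * transmissions


cdy_expected = expected_mutations(CDY_RATE, 1, TRANSMISSIONS)
other_expected = expected_mutations(OTHER_AVG_RATE, OTHER_MARKERS, TRANSMISSIONS)
ratio = other_expected / cdy_expected

print(f"CDY:           {cdy_expected:.2f} expected mutations")
print(f"Other markers: {other_expected:.2f} expected mutations")
print(f"Ratio:         {ratio:.1f}")
```

The ratio comes out a little under four because 35 slow markers together mutate more often than the single fast one.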

Actually, the situation I described above is a little more complicated than
this, because we are not just talking about seeing the mutation we see, but
we are also talking about not seeing the mutations we don't see. I am afraid
that the mathematics of the Poisson distribution leave me in the dust, but
fortunately, I have a little calculator (you can have it, too, from:
http://www.genetealogy.com/resources/ldout.php?id=46&to=http://members.aol.com/dnafiler/MutationCalculator.exe)
that has allowed me to figure the probabilities after ten transmission
events. The probability that we will find exactly one mutation in one marker
with a mutation rate of 0.03531 is 24.8%, and the probability that we will
find exactly zero mutations in the other 35 markers with an average mutation
rate of 0.004 is 24.7%, so the probability that both these things will occur
simultaneously is 0.248 x 0.247 = 6.1%. By contrast, the probability that we
will find exactly one mutation among thirty-five markers with an average
mutation rate of 0.004 is 34.5%, and the probability that we will find
exactly zero mutations at another marker with a mutation rate of 0.03531 is
70.3%, so the probability that both these things will occur simultaneously
is 0.345 x 0.703 = 24.3%. As you can see, even using the more complicated
(and accurate) computation, it is about four times as likely to see a
mutation in any one of the slowly mutating markers as it is to see a
mutation in exactly and only CDY.
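
The figures above can be reproduced without the calculator. The sketch below assumes a simple Poisson model, with CDY and the other markers mutating independently; I do not know what the calculator does internally, so this is an assumption that happens to match its output to three decimal places. The function and variable names are invented for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when lam events are expected on average."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


TRANSMISSIONS = 10
lam_cdy = 0.03531 * TRANSMISSIONS        # expected mutations at CDY
lam_other = 0.004 * 35 * TRANSMISSIONS   # expected mutations at the other 35 markers

# Case 1: exactly one mutation at CDY and none anywhere else.
p_one_cdy = poisson_pmf(1, lam_cdy)
p_zero_other = poisson_pmf(0, lam_other)
only_cdy = p_one_cdy * p_zero_other

# Case 2: exactly one mutation among the other 35 markers and none at CDY.
p_one_other = poisson_pmf(1, lam_other)
p_zero_cdy = poisson_pmf(0, lam_cdy)
only_other = p_one_other * p_zero_cdy

print(f"One at CDY, none elsewhere: {p_one_cdy:.3f} x {p_zero_other:.3f} = {only_cdy:.3f}")
print(f"One elsewhere, none at CDY: {p_one_other:.3f} x {p_zero_cdy:.3f} = {only_other:.3f}")
print(f"Ratio: {only_other / only_cdy:.2f}")
```

Under this model the ratio of the two joint probabilities is simply the ratio of the two expected counts, since the "no mutations elsewhere" factors cancel.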

Probably my turbid expository style has resulted in glazed eyes around the
planet. For those of you who are still with me, here is the fallacy I am
trying to warn against in plain English: you cannot focus narrowly on a
single marker when you are trying to calculate TMRCA. Mutations happen at
random. Finding a single mutation gives you no information about whether it
occurred yesterday or a few thousand years ago. Finding mutations to be
*absent* in a large number of markers contributes significantly to making
these calculations and is far more important than the random differences in
single markers here or there.

David Ewing

