DNA-R1B1C7-L Archives



From: John Mclaughlin <>
Subject: Re: [R-M222] Genetic Distance/Diversity
Date: Mon, 13 Apr 2009 05:20:17 -0500
References: <c2e.5003fcf0.3713caac@aol.com> <000001c9bc05$e8ecc820$bac65860$@com> <49E2F69F.8020707@aol.com><000201c9bc19$1ec3c300$5c4b4900$@com>
In-Reply-To: <000201c9bc19$1ec3c300$5c4b4900$@com>


<Right, we're on the same track now. However, take a look at the output
of the McGee utility, the column on the left immediately after Kit
no/Name. This should give the GD between participants and the modal. It
seems to be limited to a value of 10.

I re-ran an old M222 dataset in the McGee utility using the Hybrid
mutation model option for genetic distance. Now there are plenty of GDs
over 10 against the modal. In fact in this dataset I saw the McCord
sample and he comes in at a GD of 21 against the modal. One Doherty was
12. Brennan was 13. Since I am not sure exactly what dataset you are
using I won't comment further but there is nothing wrong with the McGee
utility. John McEwan, one of the foremost DNA authorities out there,
uses it often in his analysis.

The only real question is which option to use: the Hybrid mutation
model or the Infinite allele mutation model. Either one is valid. FTDNA
and Ysearch both use the Hybrid mutation model, referred to below as the
stepwise mutation model.

http://nitro.biosci.arizona.edu/ftdna/models.html

Since we are somewhat ignorant of the actual mutational details, two
different assumptions are used to model mutations. These are based on
standard models from molecular population genetics, and are

* The Infinite alleles model. This assumes that each mutation that
arises is unique (i.e., an infinite number of alleles -- marker forms --
can be generated). Hence, a match is scored as no mutation, while any
sort of a mismatch (be it one or more than one step off) is simply
scored as a new mutation.

* The Stepwise Mutation Model. This follows the changes in length,
allowing a step up (or down) by one at each mutation. Note that this
means that an observed match could result from no mutations, from one up
and one down mutation, from two up and two down mutations, etc.
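
As a rough illustration of how the two scoring rules above differ, here is
a minimal Python sketch. The function names and the six-marker haplotypes
are made up for the example (they are not real M222 data), and this is not
the code of the McGee utility or of any particular calculator; it only
counts differences the way the two bullet points describe.

```python
def gd_infinite_alleles(a, b):
    """Count markers that differ at all; the size of the difference is ignored."""
    return sum(1 for x, y in zip(a, b) if x != y)


def gd_stepwise(a, b):
    """Sum the absolute repeat-count differences, one step per repeat."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical repeat counts for six markers (illustrative values only).
modal = [13, 25, 14, 11, 11, 13]
sample = [13, 24, 14, 13, 11, 13]

print(gd_infinite_alleles(modal, sample))  # 2 (two markers differ)
print(gd_stepwise(modal, sample))          # 3 (one step plus two steps)
```

The two-step difference on the fourth marker is what separates the counts:
it is one mismatch under infinite alleles but two steps under the stepwise
rule.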

Note from the calculator that these methods give essentially the same
result when the fraction of total matches is very high and only differ
when the fraction of matches decreases. Further, the stepwise model
always gives longer time estimates than the infinite alleles model (as
it assumes more mutations could have occurred, and hence gives longer times).
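
The reasoning in that parenthetical (hidden mutations imply longer times)
can be checked numerically. The sketch below is only an illustration, under
assumptions that are mine and not the page's: a symmetric single-step
model, a made-up per-marker rate mu, and the textbook result that the
difference between two lineages is then Skellam distributed, so a marker
still matches with probability exp(-2*mu*t) * I0(2*mu*t). It may not be the
exact formula the linked calculator uses.

```python
import math


def p_match_infinite(mu, t):
    """P(marker still matches) if any mutation on either lineage breaks the match."""
    return math.exp(-2 * mu * t)


def bessel_i0(x, terms=40):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    return sum((x / 2) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))


def p_match_stepwise(mu, t):
    """P(marker matches) when up and down steps can cancel each other out."""
    return math.exp(-2 * mu * t) * bessel_i0(2 * mu * t)


mu = 0.004  # assumed per-marker, per-generation mutation rate (illustrative)
for t in (10, 50, 200):  # generations back to the common ancestor
    print(t, round(p_match_infinite(mu, t), 4), round(p_match_stepwise(mu, t), 4))
```

For any given number of generations the stepwise match probability is at
least as high as the infinite alleles one, so the same observed fraction of
matches points to a longer time under the stepwise model, and the gap is
small when nearly all markers match.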


The McGee utility defaults to the Infinite allele mutation model. I
assume there is a reason for that. Anybody know?


John



