DNA-R1B1C7-L Archives



From: "Sandy Paterson"
Subject: Re: [R-M222] Geographical distribution of M222+ and a peak at DF23+
Date: Fri, 30 Mar 2012 08:01:01 +0100
References: <mailman.33.1332831605.6995.dna-r1b1c7@rootsweb.com> <4F71B0A6.6070005@earthlink.net> <89EF7258-4726-4158-A2DF-C21E442D1E70@me.com> <000101cd0cb2$95151030$bf3f3090$@com> <0DF9A903-8A37-4969-B2AD-B57F74BFA20B@me.com> <000001cd0d87$29d874d0$7d895e70$@com><966322E5-24CC-4A1B-B264-E7EF9574C9AB@me.com>
In-Reply-To: <966322E5-24CC-4A1B-B264-E7EF9574C9AB@me.com>


Malcolm

Yes, I'm familiar with KN's work.

We need to clarify something here. At some point in time, say T1, an SNP
occurs. At a later point in time, say T2, this male line splits into two
surviving branches. T2 can be estimated, but it is impossible to determine
T1, the time of the SNP mutation.

KN gave a mathematical proof in another forum in which he shows that E(v) =
mG, where G is the number of generations back to T2. The proof had nothing
to do with simulation, but I think he verified it using simulation. It's
quite possible that he started off with simulation and then developed the
proof, I don't know. It's also possible that someone else developed the
proof and that KN merely posted it in that other forum, but if that were the
case I think he would have acknowledged his source when he posted. I do know
that he's also done a lot of work on confidence intervals, working from
first principles, and again, using simulation to verify confidence interval
estimations.
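
The kind of check described above can be sketched in a few lines. This is not
KN's code or his proof, only a minimal illustration under stated assumptions:
v is taken to be the squared difference between a descendant's repeat count
at one marker and the founder's count at T2, each generation mutates with
probability m, and every mutation is a single step up or down with equal
probability. Under that model each generation contributes m to the expected
squared offset, so the expectation after G generations is mG. The function
names and the values m = 0.004 and G = 50 are illustrative, not taken from
the thread.

```python
import random


def simulate_line(m, G, rng):
    """Net change in repeat count along one father-to-son line of G generations."""
    offset = 0
    for _ in range(G):
        if rng.random() < m:              # a mutation happens this generation
            offset += rng.choice((-1, 1))  # single step, either direction
    return offset


def mean_squared_offset(m, G, n_lines, seed=0):
    """Average squared offset from the founder over n_lines simulated lines."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_lines):
        d = simulate_line(m, G, rng)
        total += d * d
    return total / n_lines


if __name__ == "__main__":
    m, G = 0.004, 50                      # assumed, illustrative values
    estimate = mean_squared_offset(m, G, n_lines=20000)
    print("simulated E(v):", estimate)
    print("m * G         :", m * G)
```

With enough simulated lines the two printed numbers should be close. The
lines here are simulated independently; real lines share ancestry, which
changes the spread of the estimate but not its expected value.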

I'm amused that you call it black box Monte Carlo. I've not seen that
expression before but if you're suggesting that he simply bought some
software that does Monte Carlo simulation and blindly used it in his work, I
can dispel that notion. He didn't. He wrote his own software, from first
principles. I know this because he sent his source code to me some time in
2008 or maybe early in 2009.

I've since written my own code, also from first principles, and have
incorporated multi-step mutations (KN's code allowed only for bi-directional
single-step mutations). As such, the simulations allow for both reverse
mutations and independent mutations in different family branches
(some people call them parallel mutations).
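
For readers who want to see what such a simulation might look like, here is
a minimal sketch. It is not my code or KN's, and every name and number in it
is an assumption made for illustration: p_multi is a hypothetical probability
that a mutation is a two-step change rather than a single step, and two
branches are simulated independently from a common ancestor. With steps of
size 1 or 2 the expected squared offset per line becomes m * G * (1 + 3 *
p_multi), which the simulation can be compared against. It also counts lines
where mutations occurred but cancelled out (reverse mutations) and pairs of
branches that independently ended on the same non-zero offset (a rough proxy
for parallel mutations).

```python
import random


def mutation_step(m, p_multi, rng):
    """Change in repeat count in one generation (0 if no mutation)."""
    if rng.random() >= m:
        return 0
    size = 2 if rng.random() < p_multi else 1   # hypothetical two-step share
    return size * rng.choice((-1, 1))


def simulate_branch(m, p_multi, G, rng):
    """Return (final offset, number of mutation events) for one line."""
    offset = 0
    events = 0
    for _ in range(G):
        step = mutation_step(m, p_multi, rng)
        if step:
            offset += step
            events += 1
    return offset, events


def summarize(m, p_multi, G, n_pairs, seed=1):
    """Simulate n_pairs of independent branches from one ancestor."""
    rng = random.Random(seed)
    sq_total = 0
    cancelled = 0   # lines that mutated but ended back on the ancestral value
    parallel = 0    # pairs whose branches share the same non-zero offset
    for _ in range(n_pairs):
        a, ev_a = simulate_branch(m, p_multi, G, rng)
        b, ev_b = simulate_branch(m, p_multi, G, rng)
        sq_total += a * a + b * b
        cancelled += (ev_a > 0 and a == 0) + (ev_b > 0 and b == 0)
        if a == b and a != 0:
            parallel += 1
    return {
        "mean_sq_offset": sq_total / (2 * n_pairs),
        "cancelled_lines": cancelled,
        "parallel_pairs": parallel,
    }


if __name__ == "__main__":
    m, p_multi, G = 0.004, 0.1, 50              # assumed, illustrative values
    print(summarize(m, p_multi, G, n_pairs=10000))
    print("expected mean_sq_offset:", m * G * (1 + 3 * p_multi))
```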

Sandy




-----Original Message-----
From: Malcolm McClure
Sent: 29 March 2012 13:23
Subject: Re: [R-M222] Geographical distribution of M222+ and a peak at DF23+

Sandy

Ken Nordtvedt himself says: "There is no way at all to estimate age back to
(mutation?) happening of an snp."
See:
http://archiver.rootsweb.ancestry.com/th/read/GENEALOGY-DNA/2009-07/1248312709

His explanation of how he uses variance is at
http://knordtvedt.home.bresnan.net

So far as I can understand his rationale, Nordtvedt has established his
variance analysis methodology using data derived from black box Monte Carlo
simulation of mutation rates and clade formation, not from actual SNPs in
real families. Nested ANOVA analysis of real data requires randomisation of
sampling. For observational data, the derivation of confidence intervals
must use subjective models. In practice, estimates of effects from
observational studies are often inconsistent. "Statistical models" and
observational data can be useful for suggesting hypotheses but should
otherwise be treated very cautiously.

Malcolm




