From: Sharon <>
Subject: [APG] TGN/Ancestry patent 7,249,129 issues
Date: Mon, 03 Dec 2007 00:41:29 -0500
Dick Eastman's blog announced this patent 7,249,129 grant in August, and
most of the ensuing discussion did revolve around the "Internet
Biographical Collection" controversy.
However, while the patent title "Correlating Genealogy Records Systems
and Methods" is quite general, the patent claims are notably specific,
and limited, to the methods and systems for consolidating "same
person" (augmented by "same mother" and "same father" inferences for
that person) records from many source databases to respond to a user
request with a consolidated record view (and/or links to the original
record) and/or a subgroup, such as a family tree "based on consolidated
information from a plurality of records".
This patent's methods and systems claims are also specifically limited
to indexing, partitioning, sorting, comparison, calculating correlation
and threshold criteria, correlating and linking records iteratively by
pairs of records for only these data elements -
surname (augmented by double metaphone and soundex algorithms)
birth date [and/or birth date range]
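To make the pair-by-pair comparison concrete, here is a rough sketch of how two of the elements above (a Soundex-coded surname and a birth year range) could be scored against a threshold. It is only an illustration, not the patent's actual procedure: the field names, the equal weights, the two-year window and the threshold are all assumptions of mine, and double metaphone is left out.

```python
from itertools import combinations

# Standard American Soundex letter groups.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character Soundex code for a surname."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            out += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (out + "000")[:4]


def pair_score(a, b, year_window=2):
    """Hypothetical score: 0.5 for a phonetic surname match, 0.5 for birth years in range."""
    score = 0.0
    if soundex(a["surname"]) == soundex(b["surname"]):
        score += 0.5
    ya, yb = a.get("birth_year"), b.get("birth_year")
    if ya is not None and yb is not None and abs(ya - yb) <= year_window:
        score += 0.5
    return score


def link_pairs(records, threshold=1.0):
    """Compare every pair of records and keep the pairs that meet the threshold."""
    return [(a["id"], b["id"])
            for a, b in combinations(records, 2)
            if pair_score(a, b) >= threshold]


# Made-up sample records standing in for three source databases.
records = [
    {"id": "census-1", "surname": "Sergeant", "birth_year": 1901},
    {"id": "ssdi-1", "surname": "Sargent", "birth_year": 1902},
    {"id": "tree-1", "surname": "Sergeant", "birth_year": 1850},
]
print(link_pairs(records))
```

With only these two keys, the census and SSDI records link, and the 1850 record is rejected on birth year alone, which shows how much weight each single element carries when so few are compared.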
The rationale for this invention seems to be based on a premise that
previous pedigree/family tree systems were open to "newer" data
replacing the "older" data without necessarily being correct. This
invention's stated goal is to correct that flaw by leaving any source
records as created/submitted, while providing a way to correlate
multiple records sources in an ongoing, iterative fashion, such that the
family tree produced is dynamic, and iteratively updated by additional
information and analysis criteria.
This patent's diagrams particularly highlight the process of creating
individual person records from the 1930 census, the Social Security
Death Index, and Ancestry's World Tree, and then looking for "same
person" matches, augmented by "same mother" and "same father"
relationships. Genealogists "skilled in the art" can make a chart of all
the data elements that could be extracted from these database examples,
without regard to whether the data exists in the source and/or is
correct - and see pretty quickly what the basic advantages and
limitations for matches might be. Other databases are mentioned, but not
TGN has access to many large databases and could create statistical
analysis of many patterns in those databases that might yield some
helpful correlation possibilities. However, what I have seen in the
implementation of these techniques so far in the TGN "virtual tree"
representations is often confusing and awkward to understand and use. I
see parents, children and spouses mixed up, duplicated or linked oddly.
I used to receive email notices periodically about "new" data for the
surname "sergeant". I also have seen "did you know" messages for how
many folks of a certain surname were in a particular database while
searching in that database for something else.
I wondered if the patent might shed some light on why such TGN
implementations did not produce more helpful results.
The detailed description of the invention mentions other family
relationships, but the claims only specify identifying "same person"
with "same mother" and/or "same father" consolidations. So while the
detailed description implies more utility, there is an element of an old
joke in the academia of mathematics - "the proof is left to the reader"
- AND is not in the claims, and thus is not covered by the patent.
For example, if a records source provides an implicit or explicit
sibling relationship between one person and another, but either one or
both of those persons do not have sources for mother or father
relationships, the sole use of the methods and systems in this invention
would leave the records that only define their sibling relationship out
in the cold.
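A small sketch of that sibling gap, assuming (my assumption, not the patent's wording) that family grouping is keyed only on the mother and father fields. The record layout and the names are made up for illustration.

```python
from collections import defaultdict


def group_by_parents(records):
    """Group people into families using only mother/father fields.

    Records that carry no parent information are returned separately,
    even if they assert a sibling relationship.
    """
    families = defaultdict(list)
    unplaced = []
    for rec in records:
        mother, father = rec.get("mother"), rec.get("father")
        if mother or father:
            families[(mother, father)].append(rec["name"])
        else:
            unplaced.append(rec["name"])
    return dict(families), unplaced


# Hypothetical records: Ruth is known only as Anna's sibling.
records = [
    {"name": "Anna", "mother": "Mary", "father": "Tom"},
    {"name": "John", "mother": "Mary", "father": "Tom"},
    {"name": "Ruth", "sibling_of": "Anna"},
]
families, unplaced = group_by_parents(records)
print(families)
print(unplaced)
```

Ruth ends up unplaced even though her record ties her to Anna, because nothing in a parent-only grouping ever looks at the sibling field.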
Likewise, in the course of the detailed description, a "draft
registration" is mentioned as a source of assertion of an event for a
person. Draft registrations do contain birth date, birth place and often
next of kin. The next of kin could be a parent - or not. Spouses are
often given in this next of kin field, but spousal relationships are not
covered in the patent claims. Spousal correlations are implied in the
patent detailed description and diagrams, but not included in the patent
claims. Other source record attributes mentioned in the detailed
description include occupation, hair color, fingerprints and DNA - but
the patent claims for methods and systems of consolidating records for
"same person" are also silent on the role of such attributes. Other
spurious references to photographs and text contributed by users do
not specify how they could support the patent claims.
The detail in a patent description can get away with giving examples of
various embodiments for the invention to illustrate that the mechanics
are not limited to implementation by specific computer and network
configurations, programming languages, database structures or user
interfaces etc. But if the premise of the patent is that it is
data-centric (specifically to address a problem of tree-centric
methods), and yet the methods and systems do not specify how attributes
other than name, birth, death or parents are to be employed uniquely,
then the methods employed for attributes other than name, birth, death
or parents are not covered by the patent.
While computer and business processes have been allowed in recent patent
law, the original intent of such utility patents was to address
mechanical methods and systems. Mechanical innovations in the process
that "those skilled in the art" would appreciate/understand/benefit from
are supposed to be the underlying basis of the invention, even if the
invention is a computer or business process. In fact, this patent may
simply portray the potential advantages of having large databases of
source records "to one of ordinary skill in the art."
Patent detail descriptions generally do attempt to illustrate enough
particular embodiments to show that the databases or user interaction
tools are simply implementation details (i.e., not limited to every
existing or potential implementation medium). But patent methods and
systems are supposed to be novel, useful, and non-obvious. A number of
professional genealogists have given talks over the years using simple
excel spreadsheets of various data sources to illustrate the process of
normalizing and consolidating source records into proof methodologies
for "same person" and "same mother" or "same father" case studies -
using many more indexing, sort and correlation keys than name, birth,
death and/or parent info.
The patent requirements for novelty, usefulness and nonobviousness are
supposed to have been verified by examiners who have consulted
sources for other patents, implementations, publications, demonstrations
etc that pre-date the patent application. However, examiners are not
likely to be familiar with genealogical publication and practice sources.
This oversight may have been particularly aggravated by this patent's
premise that "open" family trees were subject to the data supplied by
the last person to have updated it, rather than by any analysis of what
data actually makes one "same person" consolidation more likely than
another. I do not recall that any family tree implementation simply
replaced "older submissions" with "newer submissions" - except in the
earlier days of the Ancestry group of family tree products, or in the
case of selecting a global merge by replacement option (instead of a
person by person merge record selection) in a personal genealogy
If, on the other hand, this patent is really about the process of
creating "same person" indexes to consolidate various records sources by
the primary keys of name, birth, death and/or parents for the purposes
of making it possible to construct potential family trees from many
large databases by those primary keys, then the claims are really
limited to a very rudimentary technical approach of indexing various
database partitions by keys, sequential, iterative sorting, comparisons,
and rankings by pairs of records. For those "skilled in the art" of such
technology as relational databases, neural networks, expert systems
etc., there are many other methods and systems that take a more
sophisticated approach to cross referencing and linking data in records
from databases with varying formats and correlation criteria.
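For readers less familiar with the database side, the rudimentary approach described above (partition by a key, sort within each partition, compare pairs, rank the results) can be sketched in a few lines. This is a generic illustration under my own assumptions, not TGN's implementation: the three-letter partition key, the field names and the two-year window are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def block_key(rec):
    """Hypothetical partition key: first three letters of the surname."""
    return rec["surname"].strip().lower()[:3]


def ranked_candidate_pairs(records, year_window=2):
    """Partition by key, sort each partition, compare pairs, rank by birth-year gap."""
    partitions = defaultdict(list)
    for rec in records:
        partitions[block_key(rec)].append(rec)

    ranked = []
    for key in sorted(partitions):
        block = sorted(partitions[key], key=lambda r: r["birth_year"])
        # Only records inside the same partition are ever compared.
        for a, b in combinations(block, 2):
            gap = abs(a["birth_year"] - b["birth_year"])
            if gap <= year_window:
                ranked.append((gap, a["id"], b["id"]))
    ranked.sort()  # smallest gap first
    return ranked


# Made-up sample records.
records = [
    {"id": "c1", "surname": "Sergeant", "birth_year": 1901},
    {"id": "s1", "surname": "Sergeant", "birth_year": 1902},
    {"id": "t1", "surname": "Sergeant", "birth_year": 1901},
    {"id": "x1", "surname": "Velke", "birth_year": 1901},
]
print(ranked_candidate_pairs(records))
```

Nothing here goes beyond ordinary sorting and keyed lookups, which is the point: anyone who has built a keyed index would recognize each step.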
This patent does not seem to have teeth in either genealogical methods
or database indexing and linking methods. But I do think that Bob
Velke's early experience with a misguided patent holder should
illustrate that patent infringement cases can be pursued by folks even
less skilled in either art!