Scotch-Irish-L Archives



From: "William H. Magill" <>
Subject: Re: [S-I] Failure notice
Date: Sat, 16 Jan 2010 15:18:46 -0500
References: <1463368312.14660761263588596014.JavaMail.root@sz0165a.westchester.pa.mail.comcast.net>
In-Reply-To: <1463368312.14660761263588596014.JavaMail.root@sz0165a.westchester.pa.mail.comcast.net>


On Jan 15, 2010, at 3:49 PM, wrote:

> Hi Donna, those Irwins (etc) causing trouble again! Speaking of the US Census, or rather the people paid to index it at Ancestry.com: I wonder sometimes who they hire. No one could find this one family I was looking for. I knew where they should be living and searched the township with no name (turns up everyone). That was very enlightening. Whoever indexed it should be fired. All the John Smith, Jr. (or whoever) were indexed with Junior as the last name. Or Senior. This was an early census -- 1820 or 30... one of those very hard to read chicken scratch censuses. The census taker had not only written down their names but also occupations. The
> bad indexer had dutifully indexed them with the occupation as the surname. So I found the man under Miller.
> Living next to Carpenter and Parson and Farmer, a couple Juniors and Seniors too. You basically couldn't find
> anyone indexed correctly in the whole township.
>
> Linda Merle

Digital records are both a boon and a bane to genealogical research.

One wonders if this kind of error will ever be corrected, or "officially" documented.

One of the major problems with "digital" searching is exactly this: one assumes an exact match.
Without being able either to see the entire record base, as Linda did here, or to see the original records, one has ABSOLUTELY
NO CLUE whether the data being "searched" is in fact the data one THINKS it is.

Thank you for the story/information, Linda.
It is the kind of "gotcha" that confronts researchers all the time, and far surpasses the "simple" issue of different spellings for names.

I find similar kinds of errors in the various OCR projects of the Digital Library, Google and Microsoft. The scanning software simply does not or
cannot interpret certain characters in many typefaces, especially ligatures or mixed-font printing (very common in 1800s documents) --
you get either some weird symbol in the text or a completely erroneous letter substitution. Fortunately, the Digital Library contains multiple
formats of each document, typically an OCR-generated text version and a PDF of the original document. This makes it possible, but tedious,
to take the text document and "fix" the errors contained within it.



T.T.F.N.
William H. Magill







