[ih] Diff double entry. OCR. Re: RAND Unix Port code
Bill Ricker
bill.n1vux at gmail.com
Sun Feb 19 08:54:08 PST 2017
Give ten kids 20 pages each and diff the results...:o)
/Bernie\
We did exactly that 39 years ago with n=2 for survey data not received on
op scan media.
My first paid coding job. I was supposedly one of the data entry undergrads
but I took over support of the Fortran IV and PL/I diff and column-swap utilities.
(Also took op scan sheets to the batch window and used offline accounting
machines with the opscanned card deck, because my time using the offline
sorter was cheaper than a mainframe sort job. I hope I'm the youngest person
who can say that!)
With natural-language (NL) text, you might want a format-forgiving diff
unless you set rigid transcription guidelines.
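A minimal sketch of what a format-forgiving diff could look like, assuming
Python's standard difflib and comparing word streams rather than lines, so
that differences in line wrapping and spacing between two typists drop out.
The function name word_disagreements and the sample strings are illustrative
only, not anything we ran back then:

```python
import difflib


def word_disagreements(text_a, text_b):
    """Return (tag, words_a, words_b) for each span where two transcriptions differ.

    Splitting on any whitespace discards line breaks, indentation, and
    double spaces, so only differences in the words themselves are reported.
    """
    a = text_a.split()
    b = text_b.split()
    # autojunk=False: do not let difflib treat frequent words as junk.
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Two hypothetical typists keying the same page with different line breaks.
typist_a = "The quick  brown fox\njumps over the lazy dog."
typist_b = "The quick brown\nfox jumps ovcr the lazy dog."
print(word_disagreements(typist_a, typist_b))
# prints [('replace', 'over', 'ovcr')]
```

Case and punctuation are still compared strictly here; whether to fold those
too depends on how rigid the keying guidelines are.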
Having been researching in Google Books and Archive.org/PG PDFs of early and
mid 20thC publications for a project unrelated to IH (but the WWW is saving me
N-1 trips to NARA), I'll concur that scan quality is very important for
OCR, and not just resolution. Contrast, focus, and alignment all matter.
Especially with elegant light "Modern" typefaces. An adaptive BW scan is
particularly helpful.