[ih] Archiving Internet history

John Day jeanjour at comcast.net
Thu Feb 16 17:13:18 PST 2023


“Out of print” often means they are more expensive. ;-) Or at least they are in a library for free.

It doesn’t mean they don’t exist, right?

> On Feb 16, 2023, at 18:13, vinton cerf via Internet-history <internet-history at elists.isoc.org> wrote:
> 
> re: books - you have surely heard "out of print" for economic/demand
> reasons...
> v
> 
> 
> On Thu, Feb 16, 2023 at 1:37 PM Jack Haverty via Internet-history <
> internet-history at elists.isoc.org> wrote:
> 
>> My best thought for proactive archiving is based on a business model,
>> involving giving an "archive" some value.   My "print everything and
>> make a book" suggestion was only partly whimsical. If a book exists,
>> someone will sell it (priced low enough so anyone can afford it).   The
>> bookseller(s) du jour (e.g., Amazon) will treat it as part of their
>> SKUs, and presumably preserve it as long as there is interest in it
>> (i.e., buyers).   Even if a company shifts priorities or goes out of
>> business, their "assets" remain, including the books they've been
>> selling, and will likely be sold to someone else.   This has been
>> happening with various kinds of media, e.g., movies, TV shows,
>> recordings, etc.  When no one cares about the book any more it may
>> disappear.  But if no one cares...
>> 
>> My second choice is to simply transmit all of the content into deep
>> space.   It will travel forever at the speed of light, and can preserve
>> an enormous amount of content.  Future researchers will be able to
>> access the archive once they've solved the technical problem of how to
>> catch up to it, and capture and decode the contents. Just like today's
>> researchers are now able to look at what happened just after the Big
>> Bang by looking at the signals that are just getting here using the JWST.
>> 
>> Jack Haverty
>> 
>> 
>> On 2/16/23 10:08, Joe Touch via Internet-history wrote:
>>> FYI, even cemeteries don’t preserve things “forever”. Plots are leased,
>>> not sold. Libraries disappear, universities dissolve, and churches are sold
>>> and rebuilt.
>>> 
>>> I don’t think this group will find a new solution.
>>> 
>>>> On Feb 16, 2023, at 10:03 AM, vinton cerf via Internet-history <internet-history at elists.isoc.org> wrote:
>>>> 
>>>> John,
>>>> thanks for your thoughtful intervention. Your conclusion leads me to wonder
>>>> about business models that might produce the desired resilience.
>>>> Preservation by accident is not a plan and so often that's all that we
>>>> achieve.
>>>> 
>>>> v
>>>> 
>>>> 
>>>>> On Thu, Feb 16, 2023 at 1:08 AM John Gilmore <gnu at toad.com> wrote:
>>>>> 
>>>>> vinton cerf via Internet-history <internet-history at elists.isoc.org> wrote:
>>>>>> wow thanks for this lengthy history. So many familiar names. I sure hope
>>>>>> this mailing list does get archived properly as it contains a wealth of
>>>>>> information it would be hard to re-create in the future.
>>>>> Besides the internet-history mailing list's archives here:
>>>>> 
>>>>>  https://elists.isoc.org/pipermail/internet-history/
>>>>> 
>>>>> I have also been using an Archive-It account to make periodic copies of
>>>>> that web site in the Internet Archive here:
>>>>> 
>>>>>  https://wayback.archive-it.org/15071/20230114211520/https://elists.isoc.org/pipermail/internet-history/
>>>>> 
>>>>> These are accessible via the Wayback Machine as well as via
>>>>> the page for this collection, here:
>>>>> 
>>>>>  Internet and Unix History
>>>>>  https://archive-it.org/collections/15071
>>>>> 
>>>>> As you can see there, it's set up to periodically scan various other URLs;
>>>>> please suggest others that are of historic interest, and I can add them.
>>>>> 
>>>>> FYI, the Wayback Machine does not necessarily get deep copies of every
>>>>> web site.  Their focus is on breadth, so if a website has a thousand web
>>>>> pages, perhaps they will get 50 or 100 of them in each crawl.  Also,
>>>>> there are enough websites which are designed to "trap" a web crawler and
>>>>> cause it to waste a lot of its time, storage and bandwidth uselessly, so
>>>>> the main crawler doesn't keep going.  So, if there's a deep collection
>>>>> (for example, ALL the source code to reproduce a popular Linux
>>>>> distribution) that you think is worth saving for the future,
>>>>> Archive-It.org is one way to get it saved for posterity.
>>>>> 
>>>>> Also FYI, the Internet Archive is an example of the philosophy of
>>>>> putting all your eggs in one basket and watching that basket intently.
>>>>> The (untested) theory is that the collection will be too valuable to let
>>>>> it fall apart later.  A distributed system (like LOCKSS for example)
>>>>> would provide higher likelihood of stuff surviving the next hundred,
>>>>> thousand or 10,000 years.  The Archive is keeping two or three
>>>>> replicated copies of each item they have, and copying them forward onto
>>>>> newer and fatter drives, but all of them are under the same
>>>>> administration and owned by the same nonprofit.  Brewster Kahle is the
>>>>> sparkplug and the main funding source; control of that nonprofit will be
>>>>> in the hands of a small number of probably-less-competent-and-virtuous
>>>>> people after Brewster is no more.  Hell, during the pandemic, ONE GUY
>>>>> was responsible for swapping out failed disk drives before the only
>>>>> second copy of a failed drive happened to also fail.  Bit-rot sets in
>>>>> quickly, and five or ten years of merely incompetent system
>>>>> administration would make a shambles of this finely tuned machine.  Not
>>>>> to mention the possibility of malicious intrusion, particularly by
>>>>> people or governments who want to destroy the historical evidence of
>>>>> whatever bad stuff they've been up to.
>>>>> 
>>>>> It would be better if there were ten Internet Archive nonprofits (or
>>>>> government agencies) scattered around the planet.  Each of them would
>>>>> ideally be taking copies of each other's full holdings, as well as doing
>>>>> their own crawls of the live web, and scanning in whatever physical
>>>>> cultural works they are particularly interested in.  Anybody know any
>>>>> Internet billionaires or spy-agency VP's who want to catalyze and endow
>>>>> a second Internet Archive?  The big advantage for spy agencies is
>>>>> stealth; you can look anywhere you want in your own archive, and nobody
>>>>> knows where you are looking.
>>>>> 
>>>>>        John
>>>>> 
>>>>> 
>>>> --
>>>> Internet-history mailing list
>>>> Internet-history at elists.isoc.org
>>>> https://elists.isoc.org/mailman/listinfo/internet-history



