[ih] Archiving Internet history

Thu Feb 16 18:29:51 PST 2023

yes, but the business model that puts them out of print causes the LOCKSS
mechanism to stop propagating

v

On Thu, Feb 16, 2023 at 8:13 PM John Day via Internet-history <
internet-history at elists.isoc.org> wrote:

> “out of print” often means they are more expensive. ;-)  or at least they
> are in a library for free.
>
> It doesn’t mean they don’t exist, right?
>
> > On Feb 16, 2023, at 18:13, vinton cerf via Internet-history <
> > internet-history at elists.isoc.org> wrote:
> >
> > re: books - you have surely heard "out of print" for economic/demand
> > reasons...
> > v
> >
> >
> > On Thu, Feb 16, 2023 at 1:37 PM Jack Haverty via Internet-history <
> > internet-history at elists.isoc.org> wrote:
> >
> >> My best thought for proactive archiving is based on a business model,
> >> involving giving an "archive" some value.   My "print everything and
> >> make a book" suggestion was only partly whimsical. If a book exists,
> >> someone will sell it (priced low enough so anyone can afford it).   The
> >> bookseller(s) du jour (e.g., Amazon) will treat it as part of their
> >> SKUs, and presumably preserve it as long as there is interest in it
> >> (i.e., buyers).   Even if a company shifts priorities or goes out of
> >> business, their "assets" remain, including the books they've been
> >> selling, and will likely be sold to someone else.   This has been
> >> happening with various kinds of media, e.g., movies, TV shows,
> >> recordings, etc.  When no one cares about the book any more it may
> >> disappear.  But if no one cares...
> >>
> >> My second choice is to simply transmit all of the content into deep
> >> space.   It will travel forever at the speed of light, and can preserve
> >> an enormous amount of content.  Future researchers will be able to
> >> access the archive once they've solved the technical problem of how to
> >> catch up to it, and capture and decode the contents. Just like today's
> >> researchers are now able to look at what happened just after the Big
> >> Bang by looking at the signals that are just getting here using the
> >> JWST.
> >>
> >> Jack Haverty
> >>
> >>
> >> On 2/16/23 10:08, Joe Touch via Internet-history wrote:
> >>> FYI, even cemeteries don’t preserve things “forever”. Plots are leased,
> >>> not sold. Libraries disappear, universities dissolve, and churches are
> >>> sold and rebuilt.
> >>>
> >>> I don’t think this group will find a new solution.
> >>>
> >>>> On Feb 16, 2023, at 10:03 AM, vinton cerf via Internet-history <
> >>>> internet-history at elists.isoc.org> wrote:
> >>>>
> >>>> John,
> >>>> thanks for your thoughtful intervention. Your conclusion leads me to
> >>>> wonder about business models that might produce the desired resilience.
> >>>> Preservation by accident is not a plan and so often that's all that we
> >>>> achieve.
> >>>>
> >>>> v
> >>>>
> >>>>
> >>>>> On Thu, Feb 16, 2023 at 1:08 AM John Gilmore <gnu at toad.com> wrote:
> >>>>>
> >>>>> vinton cerf via Internet-history <internet-history at elists.isoc.org>
> >>>>> wrote:
> >>>>>> wow thanks for this lengthy history. So many familiar names. I sure
> >>>>>> hope this mailing list does get archived properly as it contains a
> >>>>>> wealth of information it would be hard to re-create in the future.
> >>>>> Besides the internet-history mailing list's archives here:
> >>>>>
> >>>>>  https://elists.isoc.org/pipermail/internet-history/
> >>>>>
> >>>>> I have also been using an Archive-It account to make periodic copies
> >>>>> of that web site in the Internet Archive here:
> >>>>>
> >>>>>  https://wayback.archive-it.org/15071/20230114211520/https://elists.isoc.org/pipermail/internet-history/
> >>>>>
> >>>>> These are accessible via the Wayback Machine as well as via
> >>>>> the page for this collection, here:
> >>>>>
> >>>>>  Internet and Unix History
> >>>>>  https://archive-it.org/collections/15071
> >>>>>
> >>>>> As you can see there, it's set up to periodically scan various other
> >>>>> URLs; please suggest others that are of historic interest, and I can
> >>>>> add them.
> >>>>>
> >>>>> FYI, the Wayback Machine does not necessarily get deep copies of
> >>>>> every web site.  Their focus is on breadth, so if a website has a
> >>>>> thousand web pages, perhaps they will get 50 or 100 of them in each
> >>>>> crawl.  Also, there are enough websites which are designed to "trap"
> >>>>> a web crawler and cause it to waste a lot of its time, storage and
> >>>>> bandwidth uselessly, so the main crawler doesn't keep going.  So, if
> >>>>> there's a deep collection (for example, ALL the source code to
> >>>>> reproduce a popular Linux distribution) that you think is worth
> >>>>> saving for the future, Archive-It.org is one way to get it saved for
> >>>>> posterity.
> >>>>>
> >>>>> Also FYI, the Internet Archive is an example of the philosophy of
> >>>>> putting all your eggs in one basket and watching that basket
> >>>>> intently.  The (untested) theory is that the collection will be too
> >>>>> valuable to let it fall apart later.  A distributed system (like
> >>>>> LOCKSS for example) would provide higher likelihood of stuff
> >>>>> surviving the next hundred, thousand or 10,000 years.  The Archive is
> >>>>> keeping two or three replicated copies of each item they have, and
> >>>>> copying them forward onto newer and fatter drives, but all of them
> >>>>> are under the same administration and owned by the same nonprofit.
> >>>>> Brewster Kahle is the sparkplug and the main funding source; control
> >>>>> of that nonprofit will be in the hands of a small number of
> >>>>> probably-less-competent-and-virtuous people after Brewster is no
> >>>>> more.  Hell, during the pandemic, ONE GUY was responsible for
> >>>>> swapping out failed disk drives before the only second copy of a
> >>>>> failed drive happened to also fail.  Bit-rot sets in quickly, and
> >>>>> five or ten years of merely incompetent system administration would
> >>>>> make a shambles of this finely tuned machine.  Not to mention the
> >>>>> possibility of malicious intrusion, particularly by people or
> >>>>> governments who want to destroy the historical evidence of whatever
> >>>>> bad stuff they've been up to.
> >>>>>
> >>>>> It would be better if there were ten Internet Archive nonprofits (or
> >>>>> government agencies) scattered around the planet.  Each of them would
> >>>>> ideally be taking copies of each others' full holdings, as well as
> >>>>> doing their own crawls of the live web, and scanning in whatever
> >>>>> physical cultural works they are particularly interested in.  Anybody
> >>>>> know any Internet billionaires or spy-agency VP's who want to
> >>>>> catalyze and endow a second Internet Archive?  The big advantage for
> >>>>> spy agencies is stealth; you can look anywhere you want in your own
> >>>>> archive, and nobody knows where you are looking.
> >>>>>
> >>>>>        John
> >>>>>
> >>>>>
> >>>> --
> >>>> Internet-history mailing list
> >>>> Internet-history at elists.isoc.org
> >>>> https://elists.isoc.org/mailman/listinfo/internet-history
