[ih] Where's Multics now, was Internet-history Digest
Dan Cross
crossd at gmail.com
Mon Aug 18 09:13:22 PDT 2025
On Sun, Aug 17, 2025 at 2:51 PM John Levine via Internet-history
<internet-history at elists.isoc.org> wrote:
> It appears that Dave Crocker via Internet-history <dcrocker at bbiw.net> said:
> >On 8/17/2025 10:56 AM, John Levine via Internet-history wrote:
> >> Huh? Windows NT was widely reported to be the second coming of VMS
> >> and that's what's still inside Windows 11.
> >
> >Exactly.
> >
> >I forgot to connect the dots: Multics -> VMS -> Windows.
> >
> >The major point of continuity that I know about is the cost of
> >sub-processes. In unix, it's cheap. In the other approach, it is very
> >expensive. (No, I don't remember or know enough detail to fill this in
> >adequately for a technical audience.)
>
> One of the underappreciated innovations in Unix was to split process creation
> into fork() and exec() rather than a single spawn() call. Opening and closing
> files and moving file descriptors are done with normal I/O calls rather than
> needing a zillion spawn() options.
If I may be opinionated for a moment... `fork`, as implemented in
Unix, was a matter of expediency and less of design: it was easy and
relatively cheap to implement in PDP-7 assembler (Dennis Ritchie said
precisely "27 lines of assembly code"), and at least Ken Thompson
would have been familiar with `fork` as implemented in the Berkeley
timesharing system.
(https://www.nokia.com/bell-labs/about/dennis-m-ritchie/hist.html)
However, Unix `fork` is the pauper cousin of its GENIE antecedent, was
orthogonal to some other Unix-isms, and has not aged gracefully into
the multithreaded age. These days, it's a poor example of a process
creation primitive. This paper from HotOS'19 goes into some
of the details and is worth a read:
https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
Still, being able to do things in the child, like modifying file
descriptors and so on, before executing a different program is
useful, and it would be nice to retain that. A solution
lies in refuting the assertion that the only viable alternative to
fork/exec is spawn.
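For reference, the child-side setup worth retaining looks roughly like this with today's fork/exec. This is only a minimal sketch: the function name `run_echo_to_file` and the choice of `echo` as the program are made up for illustration, and it assumes an `echo` binary on the PATH.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `echo hello` with stdout redirected to `path`.
 * Returns the child's exit status, or -1 on error. */
int run_echo_to_file(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: plain I/O calls arrange the descriptors the new
         * program will inherit; no special spawn options needed. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) < 0)
                _exit(126);
            close(fd);
        }
        execlp("echo", "echo", "hello", (char *)NULL);
        _exit(127); /* reached only if exec failed */
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Everything between `fork` and `execlp` is ordinary code running in the child, which is the convenience in question.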
An approach that would work better these days would be retaining two
operations, but changing the dividing line between what they do. A
process creation primitive could be combined with a few system calls
that let a parent manipulate the state of a (new) child (e.g., to
pass file descriptors, credentials, and so on, to it), and then a
primitive to mark the new process as runnable. Such a scheme actually
maps closer to the original `fork` from GENIE (and, IIRC, from
TENEX/TOPS-20) than Unix fork does, and avoids pretty much all of the
issues from the HotOS paper. Note that this could be used to make
`exec` effectively a library call (and in particular, binary parsing
of executable images could happen entirely in userspace).
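To make the shape of that interface concrete, here is a rough userspace sketch. The names `struct newproc`, `proc_create`, `proc_set_fd`, and `proc_start` are entirely hypothetical; no such system calls exist in Unix. The emulation below merely records the requested descriptor changes and applies them with `fork`/`dup2`/`execvp` at start time, so it inherits all of `fork`'s problems; a real kernel primitive would create the empty child first and let the parent operate on it directly.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_FD_MOVES 8

/* Hypothetical handle for a created-but-not-yet-runnable child. */
struct newproc {
    int nmoves;
    int from[MAX_FD_MOVES]; /* descriptor in the parent */
    int to[MAX_FD_MOVES];   /* descriptor number in the child */
};

/* Step 1 (hypothetical): create a new, empty, non-running child. */
void proc_create(struct newproc *p) { p->nmoves = 0; }

/* Step 2 (hypothetical): the parent hands a descriptor to the child. */
int proc_set_fd(struct newproc *p, int parent_fd, int child_fd)
{
    if (p->nmoves == MAX_FD_MOVES)
        return -1;
    p->from[p->nmoves] = parent_fd;
    p->to[p->nmoves] = child_fd;
    p->nmoves++;
    return 0;
}

/* Step 3 (hypothetical): load a program and mark the child runnable.
 * Emulated here with fork + dup2 + execvp. Returns the child's pid. */
pid_t proc_start(const struct newproc *p, const char *file, char *const argv[])
{
    pid_t pid = fork();
    if (pid != 0)
        return pid; /* parent, or fork failure (-1) */
    for (int i = 0; i < p->nmoves; i++)
        if (p->from[i] != p->to[i] && dup2(p->from[i], p->to[i]) < 0)
            _exit(126);
    execvp(file, argv);
    _exit(127);
}

/* Usage: run `echo hi` with stdout on a pipe; return the first byte read. */
int demo_first_byte(void)
{
    int pfd[2];
    if (pipe(pfd) < 0)
        return -1;

    struct newproc p;
    proc_create(&p);
    proc_set_fd(&p, pfd[1], STDOUT_FILENO);

    char *argv[] = { "echo", "hi", NULL };
    pid_t pid = proc_start(&p, "echo", argv);
    close(pfd[1]);
    if (pid < 0) {
        close(pfd[0]);
        return -1;
    }

    char c = 0;
    ssize_t n = read(pfd[0], &c, 1);
    close(pfd[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? c : -1;
}
```

The point is only the division of labor: all of the setup is expressed by the parent against a handle for the child, and no code has to run in a half-initialized copy of the parent.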
> When a process did a fork() it bascially swappped out the process' address
> space, assigned the swapped copy to the old process, left the swapped in copy in
> the new process, and just incremented the counts on all the open files.
These are specific implementation details; certainly true on the PDP-7
and PDP-11, but I think it would be more generally accurate to say
that `fork` creates a _copy_ of the parent process that puts an
additional reference on resources used by the parent, as they are now
usable by the child as well.
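One visible consequence of that extra reference is that parent and child share the same open file, including its offset. A small sketch (the function name `shared_offset_after_fork` is made up for illustration):

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes 5 bytes through a descriptor inherited across fork;
 * the parent then reads back its own current offset on that descriptor.
 * Returns that offset, or -1 on error. */
long shared_offset_after_fork(void)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;
    int fd = fileno(tmp);

    pid_t pid = fork();
    if (pid < 0) {
        fclose(tmp);
        return -1;
    }
    if (pid == 0) {
        /* Child: write via the inherited descriptor, then exit. */
        ssize_t n = write(fd, "child", 5);
        _exit(n == 5 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        fclose(tmp);
        return -1;
    }

    /* Parent never wrote anything, yet its offset has moved, because
     * both descriptors refer to the same open file. */
    off_t off = lseek(fd, 0, SEEK_CUR);
    fclose(tmp);
    return (long)off;
}
```

The parent sees an offset of 5 even though only the child wrote: `fork` copied the descriptor table but took another reference on the underlying open file rather than duplicating it.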
> The bit
> that made two copies of the kernel context had the comment "you are not supposed
> to understand this".
This is incorrect. That comment was specific to the 6th Edition of
Unix, and was in `swtch`, the process context switching code, not
`fork`.
In a nutshell, in that era, Unix user _processes_ were represented in
the kernel by separate _threads_ (though they weren't called that;
they were just considered "the kernel portion of a process"), and each
process thus had two stacks: one in userspace, and one in the kernel
(the kernel itself is shared between all processes). A context switch
from one process to another involved trapping into the kernel, which
would save userspace state and so on, and then do a coroutine jump from the
kernel thread of the source proc to a dedicated scheduler thread
represented as a special process: PID 0, "the swapper", which has
little to do with swapping in the memory sense, and more to do with
swapping contexts. From there, another runnable process was selected
to run, and the kernel did another coroutine jump to that process's
kernel thread, from which it eventually returned to user space, thus
resuming execution in the destination process.
The "you are not expected to understand this" comment comes from a quirk
in the way that this was implemented in 6th Edition on the PDP-11: the
code that does the actual register saves and restores to switch from
one thread to another is split into two "functions" written in
assembler: code to save register and stack state, and code to restore
from a previously saved state. However, the code that saved that state
arranged things so that a subsequent restore would resume execution in
its caller's caller; that is, it borrowed its caller's stack frame for
restoration and so, when restored, execution would resume one frame up
in the call graph. Critically, execution resumed in some function
different from the one that saved state. Since the function epilogues
emitted by the Unix C compiler on the PDP-11 generated code that was
uniform in the way it restored the stack and registers, this worked.
However, when they began porting to the Interdata machines, the
function epilogue differed in ways determined by how a function was
called, and the scheme broke down; `swtch` could be called from
multiple places.
The real solution is to arrange for the state restoration code to
resume execution from the save code function, not that function's
caller, and differentiate based on the return value, akin to how
`setjmp` and `longjmp` work, which is more or less what they did for
the 7th Edition. The hack they put into 6th Edition was meant as a
temporary fix, and indeed was removed in 7th, along with that comment.
Dennis Ritchie went into some detail here:
https://www.nokia.com/bell-labs/about/dennis-m-ritchie/odd.html
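As a userspace analogy only (this is not the kernel code, and the function names are made up for illustration), the "resume in the save function and differentiate on the return value" idea looks like this with `setjmp` and `longjmp`:

```c
#include <setjmp.h>

static jmp_buf saved_state;

/* Plays the role of the restore code: resume from the saved state. */
static void resume_saved(void)
{
    longjmp(saved_state, 1); /* setjmp below appears to return 1 */
}

/* Returns 1 if execution came back through the saved state. */
int save_then_resume(void)
{
    volatile int resumed = 0;

    if (setjmp(saved_state) == 0) {
        /* Direct return: state was just saved. Jump away and come
         * back through the restore path. */
        resume_saved(); /* does not return here */
    } else {
        /* Second return from the same setjmp call, this time via
         * longjmp. Execution resumes in the function that saved the
         * state, not in its caller. */
        resumed = 1;
    }
    return resumed;
}
```

Because the resumed code is always the function that did the save, nothing depends on every caller having an identical epilogue.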
> These days fork() isn't as simple as it used to be since a process typically
> has several shared libraries each with R/O and R/W pages but it is still a
> lot simpler than posix_spawn().
It's even worse than that. Consider a program with multiple
(userspace) threads of execution and how these interact with `fork`:
according to POSIX 2024, "a process shall be created with a single
thread". So if a multithreaded process has two threads, A and B, and A
acquires a mutex, and then B forks while A still has that mutex, then
the new process will resume executing in whatever code B had been
running, the mutex will be locked, but there will be no A to release
it. Consequently, POSIX says that the child of a multithreaded process
can only invoke async-signal-safe operations until it does an `exec`.
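A small demonstration of the orphaned mutex, as a sketch only. The names are made up, it should be built with `-pthread`, and note that the child calls `pthread_mutex_trylock`, which is not async-signal-safe and so is itself outside what POSIX permits here; it is used purely so the child can report what it finds instead of hanging. The function relies on static state and is meant to be called once.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;     /* held by A */
static pthread_mutex_t state_mu = PTHREAD_MUTEX_INITIALIZER; /* handshake */
static pthread_cond_t state_cv = PTHREAD_COND_INITIALIZER;
static int a_has_lock = 0, a_may_release = 0;

/* Thread A: take the mutex, announce it, hold it until told to release. */
static void *thread_a(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    pthread_mutex_lock(&state_mu);
    a_has_lock = 1;
    pthread_cond_broadcast(&state_cv);
    while (!a_may_release)
        pthread_cond_wait(&state_cv, &state_mu);
    pthread_mutex_unlock(&state_mu);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Thread B (the caller): fork while A holds `lock`.
 * Returns 1 if the child found the mutex locked, 0 if not, -1 on error. */
int child_sees_locked_mutex(void)
{
    pthread_t a;
    if (pthread_create(&a, NULL, thread_a, NULL) != 0)
        return -1;

    /* Wait until A definitely holds the mutex. */
    pthread_mutex_lock(&state_mu);
    while (!a_has_lock)
        pthread_cond_wait(&state_cv, &state_mu);
    pthread_mutex_unlock(&state_mu);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: only this thread exists. A's copy of the mutex is
         * locked, and there is no A here to ever unlock it. */
        _exit(pthread_mutex_trylock(&lock) == EBUSY ? 1 : 0);
    }

    /* Parent: let A release the mutex and finish. */
    pthread_mutex_lock(&state_mu);
    a_may_release = 1;
    pthread_cond_broadcast(&state_cv);
    pthread_mutex_unlock(&state_mu);
    pthread_join(a, NULL);

    if (pid < 0)
        return -1;
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Had the child called `pthread_mutex_lock` instead of `pthread_mutex_trylock`, it would simply block forever, which is what tends to happen in practice when the mutex in question is buried inside a library such as the allocator.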
`posix_spawn` is often used as an escape hatch for this kind of thing.
Sadly, `posix_spawn` is often implemented in terms of `vfork`, and has
its own issues, some of which are documented by my colleague Rain
Paharia here: https://nexte.st/docs/design/architecture/signal-handling/
> 3BSD had the ugly vfork() kludge which, observing that most fork() calls are
> soon followed by exec(), pauses the parent process and gives the address space
> to the child until it calls exec() or exit().
>
> I believe the motivation was that the VAX-11/750 had microcode bugs that made
> read-only stack pages and hence copy-on-write stacks not work. DEC declined to
> fix it since it didn't affect VMS. The sensible approach would have been to make
> the shared stack pages copy-on-touch rather than copy-on-write, which would have
> preserved fork() semantics at little extra cost, but nooooo ....
I don't believe that's true, and I've never found a source that
suggests that this is why CoW wasn't implemented in 3BSD.
The earliest descriptions of `vfork` that I have found suggest it was
added for efficiency reasons, and that, while CoW (or CoT for the
stack) was considered superior, it wasn't implemented because it would
have been a lot more work to build given the overall structure of the
Unix kernel in that era: it just wasn't designed for that kind of
thing, and they would have had to make much more invasive changes to
the system to set up the kind of sharing needed for CoW fork.
(http://roguelife.org/~fujita/COOKIES/HISTORY/3BSD/design.pdf)
London and Reiser's VM system would be a counterexample of a virtual
memory system for Unix that _did_ implement CoW on the VAX.
And of course, the stack was pretty small back then: they could have
just copied that part, shared R/O text, and kept RW data CoW.
> PS: TENEX had a fork call but I believe that the usual way to create
> a process was for the parent to create the fork, then do calls to
> manage the child including mapping in the other program and then
> start it. One option was a Unix style fork with shared copy-on-write
> pages but I don't know how much people used it that way. I also didn't
> see any reasonable equivalent of exec(), for a process to whomp another
> program on top of itself.
According to Dan Murphy, "Three systems most directly affected the
design of TENEX -- the MULTICS system at MIT, the DEC TOPS-10 system,
and the Berkeley timesharing system for the SDS 940 computer."
(https://opost.com/tenex/hbook.html). The term "fork" in TENEX (and
thus inherited in TOPS-20) is synonymous with "process", and the
operation that created them is closer in spirit to that of GENIE than
the simple case in Unix; TEN-SYS 7 goes into this
(https://walden-family.com/bbn/10-SYS/TEN-SYS-7.pdf).
Processes in VMS, TOPS-20, and Multics all work a bit differently than
in Unix, in that multiple programs can reuse the same process over
time, and so processes tend to be long-lived things; in TENEX, the
command to "halt" a process (the `HALTF%` JSYS) is not analogous to
`exit` in Unix, in that the calling process is not destroyed, but rather,
it simply stops the process so that it's not scheduled, but it can be
resumed at some other time, perhaps after being loaded with a new
command.
- Dan C.
PS: as to the original question about "what's the status of Multics
now?": It's alive! One can download and run it, albeit in a simulator:
https://multicians.org/simulator.html