When creating a child process, we use the same trick as we do for file descriptors: we allocate a Box {} whose address, along with the child process info, is stored in a look-aside table. This box is stored in the type of the returned ChildProcess, i.e. as something like
ChildProcess := { tag: Box {} }
When we lose all references to the ChildProcess and the box is dropped, we need to decide what to do with the child process.
If you lose a reference to a child process, do you really want it to run forever?
I am hard-pressed to think of realistic use cases where you want to spawn some children but not await their results.
The one other case I can think of here is that you only need some auxiliary information from the child, e.g. some stdout/stderr, and you are happy to drop the process after that information is acquired.
However, even in this case, you would likely want to wait/kill the child explicitly - when exactly a reference is dropped is not guaranteed by the compiler, and a process getting SIGKILL’d without user intervention doesn’t seem like the right thing to do.
Even if children are kept alive, the platform likely wants to enforce that all spawned subprocesses will be dead after the Roc code exits, and certainly before the host program exits. Otherwise, we can easily end up in a position where zombie processes are lying around and their resources (e.g. PIDs) haven’t been reaped by the host OS.
Wait or kill it yourself - otherwise, how are you to know how it exited? It feels important to put in some guardrails such that it is harder for a user to lose references to a child process without explicitly ensuring its exit.

By default, losing all references to a child process results in the host checking the exit status of the child process. If the child has already exited, the operation is a no-op aside from deallocating the box marker and deregistering it from the child-process look-aside table. If the child is still alive, the host should kill the child.
The user can opt in to letting the child process outlive its references, with a forced kill once the Roc process terminates, by specifying an appropriate flag to launch; see the API below.
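The drop behaviour described above could be sketched on the host side roughly as follows. This is a minimal sketch in Rust, not the platform's actual implementation: the names ChildTable, DropPolicy, DropOutcome and Reap are hypothetical, the marker box is represented only by its address as a usize, and the Reap trait exists purely so the policy can be exercised without spawning real processes.

```rust
use std::collections::HashMap;
use std::io;
use std::process::Child;

/// What the host does when the last reference to a child is dropped.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DropPolicy {
    /// Proposed default: kill the child if it is still running.
    KillOnDrop,
    /// Opt-in: leave it running until the Roc program exits.
    KeepUntilExit,
}

/// What happened to a child whose marker box was dropped.
#[derive(Debug, PartialEq, Eq)]
pub enum DropOutcome {
    AlreadyExited,
    Killed,
    LeftRunning,
    UnknownMarker,
}

/// The two operations the policy needs (hypothetical trait, so a fake
/// child can stand in for a real process).
pub trait Reap {
    /// Returns Ok(true) if the child has exited (and is now reaped).
    fn try_reap(&mut self) -> io::Result<bool>;
    /// Kills the child and waits for it, so no zombie is left behind.
    fn kill_and_reap(&mut self) -> io::Result<()>;
}

impl Reap for Child {
    fn try_reap(&mut self) -> io::Result<bool> {
        Ok(self.try_wait()?.is_some())
    }
    fn kill_and_reap(&mut self) -> io::Result<()> {
        self.kill()?;
        self.wait().map(|_| ())
    }
}

/// Look-aside table keyed by the address of the marker box.
pub struct ChildTable<C: Reap> {
    live: HashMap<usize, (C, DropPolicy)>,
    detached: Vec<C>,
}

impl<C: Reap> ChildTable<C> {
    pub fn new() -> Self {
        ChildTable { live: HashMap::new(), detached: Vec::new() }
    }

    pub fn register(&mut self, marker: usize, child: C, policy: DropPolicy) {
        self.live.insert(marker, (child, policy));
    }

    /// Called when the marker box is deallocated.
    pub fn on_marker_dropped(&mut self, marker: usize) -> io::Result<DropOutcome> {
        let (mut child, policy) = match self.live.remove(&marker) {
            Some(entry) => entry,
            None => return Ok(DropOutcome::UnknownMarker),
        };
        if child.try_reap()? {
            return Ok(DropOutcome::AlreadyExited);
        }
        match policy {
            DropPolicy::KillOnDrop => {
                child.kill_and_reap()?;
                Ok(DropOutcome::Killed)
            }
            DropPolicy::KeepUntilExit => {
                self.detached.push(child);
                Ok(DropOutcome::LeftRunning)
            }
        }
    }

    /// Called once when the Roc program exits: nothing may outlive the host.
    /// Returns how many still-running children had to be killed.
    pub fn shutdown(&mut self) -> usize {
        let mut killed = 0;
        let referenced = self.live.drain().map(|(_, (child, _))| child);
        for mut child in self.detached.drain(..).chain(referenced) {
            if !matches!(child.try_reap(), Ok(true)) && child.kill_and_reap().is_ok() {
                killed += 1;
            }
        }
        killed
    }
}
```

Returning a DropOutcome rather than acting silently gives the host a natural place to hang a diagnostic: a Killed outcome is exactly the case where the user probably forgot to wait/kill explicitly.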
Ideally, the host will provide some notification to the user that the dealloc-based conclusion of the child’s lifetime was likely in error, and that the user should instead explicitly wait/kill the child. I’m not sure of the best way to go about this. We could either
roc_panic - this seems like the most visible option, so maybe the best. But it makes iterative development worse, since you might get a panic in a program you’re developing that otherwise works completely fine.

To better support use cases where you’d like to capture the standard output and standard error of a child process, I propose we introduce an API like the following for directing IO:
# TODO: better name?
ProcessOutput := [
    Captured,
    PipeTo (FileDescriptor [Write]),
    DevNull,
]
# Pipes the process output to a file descriptor that can be written to.
# This is either a proper unix file descriptor, or a file handle on Windows.
# See also <https://roc.zulipchat.com/#narrow/stream/231635-compiler-development/topic/zero-sized.20allocations/near/287235288>
#
# This gives users a way to create `ProcessOutput`s from arbitrary file descriptors, for example if
# they want to pipe stdout/stderr to files, or to standard pipes like the parent stdout/stderr.
pipeTo : FileDescriptor [Write] -> ProcessOutput
# The output of a child process should be captured.
# Implemented as-needed by the operating system; for example, creating an anonymous
# pipe on unix or anonymous handle on Windows.
capture : ProcessOutput
# The process output should be ignored.
# This is equivalent to piping to the null device, which is `/dev/null` on POSIX or `nul` on Windows,
# i.e. `pipeTo Fd.DevNull` (if/when that exists)
devNull : ProcessOutput
Unlike ProcessOutput, ProcessInput has no “captured” variant because you cannot capture stdin; it can only be explicitly provided to a child process! As such, ProcessInput can only be constructed from a file descriptor/handle that is readable.
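To make the three ProcessOutput variants concrete, here is one way a Rust host might translate them into what std::process::Command expects. This is a sketch under assumptions rather than the proposed implementation: the Rust enum and the to_stdio and parent_reads helpers are hypothetical, and std::fs::File stands in for FileDescriptor [Write].

```rust
use std::fs::File;
use std::process::Stdio;

/// Host-side mirror of the three `ProcessOutput` variants.
/// `File` stands in for `FileDescriptor [Write]` (an assumption of this sketch).
pub enum ProcessOutput {
    Captured,
    PipeTo(File),
    DevNull,
}

/// Translate the Roc-level choice into what `std::process::Command` expects.
pub fn to_stdio(output: ProcessOutput) -> Stdio {
    match output {
        // A fresh anonymous pipe; the parent keeps the read end.
        ProcessOutput::Captured => Stdio::piped(),
        // The child writes straight to the given descriptor/handle.
        ProcessOutput::PipeTo(file) => Stdio::from(file),
        // The null device: `/dev/null` on POSIX, `nul` on Windows.
        ProcessOutput::DevNull => Stdio::null(),
    }
}

/// Only a captured stream leaves the parent with something to read back.
pub fn parent_reads(output: &ProcessOutput) -> bool {
    matches!(output, ProcessOutput::Captured)
}
```

A host would then pass the result to Command::stdout or Command::stderr before spawning. The input side would need only the descriptor case plus, presumably, the null device, which matches the observation that stdin cannot be captured.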