钛节点

Translating "A brief history of select(2)", Part 1

Published @ 2023-07-07 17:35:46

Updated @ 2023-07-07 18:02:01

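Hi everyone, Lao Li here. I recently followed a trail to a foreign blogger who really knows his stuff and still keeps in good contact with the generation that built *NIX in the 1970s. I went through his list of posts and decided to start by translating "A brief history of select(2)". The original is fairly long, so I am splitting the translation into two parts.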

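A small reminder: the content of this article, and especially the material behind its hyperlinks, is seriously dense and hardcore. Do read it!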

Original article: https://idea.popcount.org/2016-11-01-a-brief-history-of-select2/

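Recently I've been thinking about the multiplexing in Linux, namely the epoll(7) syscall. I was curious if epoll is better or worse than iocp or kqueue. I was wondering if there was a benefit in batching epoll_ctl calls. But let's step back for a while: before we start a serious discussion we need to get some context. Most importantly - is file descriptor multiplexing an aberration or a gentle extension to the Unix design philosophy?
Lately I have been thinking about multiplexing on Linux, that is, the epoll system call. I am curious how epoll, iocp and kqueue compare, and I also want to know whether batching epoll_ctl calls (merging many frequent epoll_ctl operations into a single one, to avoid the cost of repeatedly trapping into the kernel) would pay off. Before the serious discussion starts, let's calm down and look back at history, because we need more context. The key question: is file descriptor multiplexing a graceful extension in line with the Unix design philosophy, or an aberration? (Translator's note: "aberration" is a noun meaning an anomaly or a departure from the norm; the phrase "in a moment of aberration" means acting out of character.)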

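To answer these questions we must first discuss the epoll predecessor: the select(2) syscall. It's a good excuse to do some Unix archaeology!
To answer these questions we first have to discuss epoll's predecessor: the select system call. At last, a reason to dig up the graves of early Unix design. (Literal rendering: "it's a good excuse to do some Unix archaeology".)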

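In the mid-1960s time sharing was still a recent invention. Compared to the previous paradigm - batch processing - time sharing was truly revolutionary. It greatly reduced the time wasted between writing a program and getting its result. Batch processing meant hours and hours of waiting, often only to see a program error. See this film to better understand the problems of 1960s programmers: "The trials and tribulations of batch processing".
In the mid-1960s, time sharing was still a novelty (a recent invention). Compared with the earlier batch-processing systems, time sharing was a huge change: it drastically cut the time wasted between writing a program and getting its result. Batch processing meant waiting hour after hour only to end up looking at a program error. To better appreciate the misery of 1960s programmers, enjoy this film, whose title I would loosely render as "Batch processing: from getting started to giving up".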

Early Unix
Then in 1970 the first versions of Unix were developed. It's important to emphasize that Unix wasn't created in a void - it tried to fix the batch-processing problems. The intention was to make a better, multi-user, time-sharing environment to speed up most common tasks. The "common tasks" were mostly: executing programs requiring heavy CPU computations and heavy disk access.
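The first version of Unix appeared in 1970. It is worth stressing that Unix was not conjured out of thin air (as some KPI-driven deliverable; that aside is my own localized embellishment, not a literal translation of the English): it was born to fix the problems of batch processing. The design goal was a better multi-user, time-sharing environment that handled people's everyday tasks faster and more cheaply. "Everyday tasks" here mostly means CPU-hungry programs or programs that need heavy disk I/O.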

In those days, when a program was executed, it could "stall" (block) only on a couple of things:
1. wait for CPU
2. wait for disk I/O
3. wait for user input (waiting for a shell command) or console (printing data too fast)
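Back then, a running program would stall only in the following situations ("stall" means to break down or pause; "block" means to be blocked):
1. waiting for the CPU
2. waiting for disk I/O
3. waiting for user input (a command typed into the shell) or for the console (when data is printed too fast)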

Take a look at the Linux process states. The above "stalls" are represented as: R, D, S process states.
Take a glance at the Linux process states: the "stalls" described above correspond to the R, D and S states. (The author links here to https://idea.popcount.org/2012-12-11-linux-process-states/)
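To make the R, D and S letters concrete, here is a minimal sketch, assuming a Linux system with /proc mounted. The function name `current_process_state` is made up for this illustration and does not come from the original article. It reads the state letter that the kernel reports for the calling process.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the one-letter state (R, S, D, ...) of the
 * calling process, or '?' if /proc is unavailable or cannot be parsed.
 * Linux-only: relies on the "pid (comm) state ..." layout of /proc/self/stat. */
char current_process_state(void) {
    char buf[512];
    FILE *f = fopen("/proc/self/stat", "r");
    if (f == NULL)
        return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* comm may itself contain ')' characters, so look for the last one;
     * every field after comm is numeric except the state letter. */
    char *p = strrchr(buf, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '?';
    return p[2];
}
```

A process that inspects itself this way is running on a CPU at that moment, so it should normally see `R`. To observe `S` or `D` you would read `/proc/<pid>/stat` of another process that is sleeping or waiting on disk.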

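Processes in early Unix couldn't do much more really. There was pipe(2) and, later, the named pipe abstraction, but that's about it.
Processes in early Unix were simple. There was a pipe facility (anonymous pipes, as far as I can tell from the author's wording), later joined by the named pipe abstraction, and that was about it.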

Let's take a closer look at pipe(2). A colleague of mine @dwragg found this gem: "UNIX Time-Sharing System: A Retrospective" by Ritchie from 1978. Here's a relevant snippet on page 1966 (page 20 in the PDF):
Let's keep digging into the pipe system call. A colleague of mine, @dwragg, found this gem ("gem" can mean a jewel or something precious): "UNIX Time-Sharing System: A Retrospective", published by Dennis Ritchie in 1978. Here is a short relevant excerpt:

There is no general inter-process message facility, nor even a limited communication scheme such as semaphores. It turns out that the pipe mechanism mentioned above is sufficient to implement whatever communication is needed between closely related, cooperating processes. [...] Pipes are not, however, of any use in communicating with daemon processes intended to serve several users.
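(Honestly, professional papers are hard to translate, because my understanding falls short on two fronts: first, I do not know the implementation mechanisms and the history well enough; second, on top of that, complex English sentences leave me confused.) There is no general inter-process communication facility, not even a limited communication scheme such as semaphores. It turns out that the pipe mechanism mentioned above is enough for any communication between closely cooperating processes. However, pipes are of no use for communicating with daemon processes that are meant to serve several users. (Translator's note: "of any use" means useful or helpful. Because this sentence contains "not", the core becomes "sth is not of any use in doing sth", that is, useless. For example: "Sadly, nothing you have learned to date is of any use with your new client." Strip away the modifiers and the core is "nothing is of any use with your new client"; put them back and the full sentence says that, sadly, nothing you have learned so far is of any use with your new client.)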

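Here, Ritchie seems to have confirmed that the synchronous pipe sufficed as the basic inter-process communication facility.
Dennis Ritchie seems to have concluded that the synchronous pipe was entirely sufficient as the basic mechanism for inter-process communication.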
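The "closely related, cooperating processes" in the quote are typically a parent and a child that share a pipe created before `fork()`. The following is a minimal sketch of that pattern, assuming a POSIX system. The function name `pipe_roundtrip` is invented for this illustration.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that writes `msg` into a pipe and
 * read it back in the parent. Returns the number of bytes stored in
 * `out` (NUL-terminated), or -1 on error. `outlen` must be at least 1. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outlen) {
    int fds[2];                       /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                   /* child: writer only */
        close(fds[0]);
        if (write(fds[1], msg, strlen(msg)) < 0)
            _exit(1);
        _exit(0);
    }

    close(fds[1]);                    /* parent: reader only */
    ssize_t total = 0, n;
    while ((size_t)total < outlen - 1 &&
           (n = read(fds[0], out + total, outlen - 1 - (size_t)total)) > 0)
        total += n;                   /* read() blocks until data or EOF */
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

Note that the pipe only works because both processes inherit it from a common ancestor. That is exactly why, as Ritchie says, it does not help when talking to an unrelated daemon.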

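It might well have been sufficient! In 3BSD the processes were limited to a maximum of 20 file descriptors. Each user was limited to 20 concurrent processes. These systems were really rudimentary. There just wasn't a need for IPC or complex I/O.
This simple synchronous pipe mechanism may well have been enough at the time. In 3BSD (part of the Berkeley branch of Unix), each process was limited to 20 open file descriptors and each user to 20 concurrent processes. Systems back then really were rudimentary, and there was simply no need for complex IPC or complex I/O.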

For example, in early Unixes there was no notion of file descriptor multiplexing. A good example is the cu(1) Call Unix command. The man page says:
For example, early Unix had no notion of file descriptor multiplexing. Take the cu command, whose man page says:

When a connection is made to the remote system, cu forks into two processes. One reads from the port and writes to the terminal, while the other reads from the terminal and writes to the port.
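Once a connection to the remote system is established, cu forks into two processes. One reads from the port and writes to the terminal, while the other reads from the terminal and writes to the port, sending the data on to the remote system. (cu is a fascinating command; this is the first time I have come across this relic of ancient times. Look it up if you are curious.)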

This makes sense. All of the I/O was blocking. The only way to read and write at the same time was to use two processes.
This is reasonable ("make sense" is a set phrase): all I/O was blocking, so at the time the only way to read and write simultaneously was to use two processes.

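As a side note, if you are a Golang programmer, this may sound familiar. In Golang read and write calls usually block so you are forced to use two goroutines when you want to read and write at the same time.
As a side note (a neat phrase that I only learned today, although for a localized translation "let me add a few words" might work better?), if you are a Golang programmer this should feel familiar. In Golang, reads and writes usually block, so you are forced to use two goroutines to read and write at the same time.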

TCP/IP is born
This all changed in 1983 with the release of 4.2BSD. This revision introduced an early implementation of a TCP/IP stack and most importantly - the BSD Sockets API.
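However, in 1983 everything changed with the release of 4.2BSD. This revision of BSD introduced an early implementation of the TCP/IP network stack and, most importantly, the BSD Sockets API. (A few extra words from me, Lao Li. First, network programming on the Internet is still built on these things today. Second, much of this work is credited to Bill Joy, who was also a principal designer of BSD Unix, a contributor to Berkeley Pascal, the author of the vi editor, a co-founder of Sun Microsystems, and a co-author of the Java Language Specification.)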

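Although today we take the BSD sockets API for granted, it wasn't obvious it was the right API. STREAMS was a competing API design on System V Release 3.
Although today we take the BSD Sockets API for granted, it was not obvious at the time that it was the right API. In SVR3, STREAMS was a competing design. ("take something for granted" is a set phrase meaning to assume something is a given. Frankly, I did not fully understand this sentence and I do not think I translated it well, so treat it as a review of "take something for granted".)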

With the BSD Sockets API came the select() syscall. But why was it necessary?
The select system call arrived together with the BSD Sockets API. But why was it necessary?

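I always thought that the "proper" Unix way to write network servers was to create one worker process for each connection. In case of TCP/IP servers, this meant the accept-and-fork model:
I always thought that the "proper" way to write servers on Unix was to create one process per network connection. For TCP/IP servers this is called the accept-and-fork model: (I won't go into detail here, since the High-Performance API community has covered the various server programming models before. Also note the usage of "in case of": "in case", "in case of" and "in the case of" are three distinct phrases, so look them up and take notes.)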

```c
sd = bind();
while (1) {
    cd = accept(sd);
    if (fork() == 0) {
        close(sd);
        // Worker code goes here. Do the work for `cd` socket.
        exit(0);
    }
    // Get back to `accept` loop. Don't leak the `cd`.
    close(cd);
}
```

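While this model may be sufficient for writing basic network services, it's not enough for non-trivial programs.
While this process model may be enough for basic network services, it is definitely not enough for complex programs. ("trivial" is an adjective meaning unimportant, petty or insignificant; "non-trivial" is an adjective meaning thorny, hard to solve, or significant.)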
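The accept-and-fork pseudocode above leaves out socket setup and error handling. Below is one way it could be fleshed out, offered as a sketch under assumptions rather than the author's code. The names `listen_loopback`, `echo_worker` and `accept_and_fork` are hypothetical, the worker simply echoes bytes back, and `SIGCHLD` is ignored so that finished workers do not linger as zombies (the behavior on Linux and other POSIX.1-2001 systems).

```c
#include <netinet/in.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a TCP listener on 127.0.0.1:port (port 0 lets the kernel pick).
 * Returns the listening socket, or -1 on error. */
int listen_loopback(uint16_t port) {
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;
    if (bind(sd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(sd, 16) < 0) {
        close(sd);
        return -1;
    }
    signal(SIGCHLD, SIG_IGN);         /* let the kernel reap exited workers */
    return sd;
}

/* Hypothetical worker: echo everything back until the peer closes. */
void echo_worker(int cd) {
    char buf[1024];
    ssize_t n;
    while ((n = read(cd, buf, sizeof buf)) > 0)
        if (write(cd, buf, (size_t)n) != n)
            break;
}

/* One iteration of the accept-and-fork loop. Returns the worker's pid,
 * or -1 if accept() or fork() failed. */
pid_t accept_and_fork(int sd, void (*worker)(int)) {
    int cd = accept(sd, NULL, NULL);
    if (cd < 0)
        return -1;
    pid_t pid = fork();
    if (pid == 0) {
        close(sd);                    /* the worker does not accept */
        worker(cd);
        close(cd);
        _exit(0);
    }
    close(cd);                        /* parent: don't leak the connection */
    return pid;
}
```

A server would then be `int sd = listen_loopback(8080); for (;;) accept_and_fork(sd, echo_worker);`, where 8080 is an arbitrary example port. Every connection still costs a whole process, and a worker can only block on its one socket, which hints at why this model runs out of steam for non-trivial programs.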

To be continued...