fork safety and caveats in multithreaded environments

Process safety under multithreading and caveats for using fork

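Code that is not fork-safe can easily deadlock
Open file descriptors are often overlooked when forking; mark them close-on-exec with fcntl(fd, F_SETFD, FD_CLOEXEC)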
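
The fcntl call above can be wrapped as follows. This is a minimal sketch: the helper names set_cloexec and is_cloexec are hypothetical, and the read-modify-write of the descriptor flags is one common way to avoid clobbering other flags, not the only one.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark an already-open descriptor close-on-exec while
 * preserving any other descriptor flags.
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

/* Hypothetical helper: returns 1 if close-on-exec is set, 0 if it is not,
 * -1 on error. */
int is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

In a multithreaded program another thread can fork between open() and fcntl(), so where the platform supports it, passing O_CLOEXEC to open() sets the flag atomically and is usually the better choice.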

Concepts

Reentrant function
Thread-safe function
signal-safety: async-signal-safe functions

fork & fork safety & signal-safety

$ man fork
    Note the following further points:

    *  The child process is created with a single thread—the one that called fork().  The entire virtual address space of the  parent  is  repli‐
        cated  in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be
        helpful for dealing with problems that this can cause.

    *  After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions (see  signal-safety(7))  until  such
        time as it calls execve(2).

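Only the thread that called fork() is copied into the child; if the parent is multithreaded, all of its other threads simply do not exist in the child
The child gets a copy of the parent's address space, including mutexes, condition variables and so on, in whatever state they were in at the time; this can lead to deadlock and is the root of the problem
Mutexes are at the heart of most problems with fork in multithreaded programs
pthread_atfork can handle some of these problems, but not all: locks taken inside library functions are outside your control, and lock ordering also has to be considered, so it has little practical value in large programs
After fork and before execve, only async-signal-safe functions may be called; any other function is risky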
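
As an illustration of the pthread_atfork approach for a lock the application owns, here is a minimal sketch. The names app_lock, install_fork_handlers and child_can_take_lock are hypothetical; the prepare handler takes the lock before fork so that both parent and child come out of fork with the lock in a known state and can release it.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical application-owned lock. */
static pthread_mutex_t app_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork: wait until the lock is free. */
static void before_fork(void)       { pthread_mutex_lock(&app_lock); }
/* Runs in the parent after fork: release the lock taken above. */
static void parent_after_fork(void) { pthread_mutex_unlock(&app_lock); }
/* Runs in the child after fork: release the inherited copy of the lock. */
static void child_after_fork(void)  { pthread_mutex_unlock(&app_lock); }

/* Register the three handlers; returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, parent_after_fork, child_after_fork);
}

/* Fork and let the child report, via its exit status, whether it could
 * take app_lock. Returns 0 if the lock was free in the child, 1 if it was
 * busy, -1 on error. */
int child_can_take_lock(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int rc = pthread_mutex_trylock(&app_lock);
        _exit(rc == 0 ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

This only covers a single lock that the program itself controls. With several locks the handlers must acquire them in a consistent order, and locks hidden inside libraries cannot be reached this way at all, which is why the approach scales poorly.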

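The deadlock model is as follows:

T1 holds a lock
T2 calls fork; only T2 exists in the child, the copied lock is still marked as held, and nobody is left to release it, so any attempt to acquire it deadlocks
pthread_atfork can only solve simple cases of this model
Between fork and execve, only async-signal-safe functions may be called
Many libc functions take locks internally and are not async-signal-safe, for example malloc/free
Calling memory allocation functions between fork and execve is therefore risky
execve replaces the entire process image, so any locks that were held by other threads before the fork are simply discarded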
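
The model can be reproduced with a small sketch. All names here are hypothetical, and the child deliberately uses pthread_mutex_trylock instead of pthread_mutex_lock so that it reports the stuck lock rather than hanging. Note that the trylock call in the child is itself not async-signal-safe; it is used only to make the state observable.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;    /* lock held by T1 */
static pthread_mutex_t sync_mu = PTHREAD_MUTEX_INITIALIZER; /* guards the flags */
static pthread_cond_t sync_cv = PTHREAD_COND_INITIALIZER;
static int t1_has_lock;
static int release_t1;

/* T1: take the lock, announce it, and keep holding it until told to stop. */
static void *t1_main(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    pthread_mutex_lock(&sync_mu);
    t1_has_lock = 1;
    pthread_cond_broadcast(&sync_cv);
    while (!release_t1)
        pthread_cond_wait(&sync_cv, &sync_mu);
    pthread_mutex_unlock(&sync_mu);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* T2 (the caller): fork while T1 holds the lock.
 * Returns 1 if the child saw the lock as busy, 0 if it was free, -1 on error. */
int lock_state_in_child(void)
{
    pthread_t t1;
    t1_has_lock = 0;
    release_t1 = 0;
    if (pthread_create(&t1, NULL, t1_main, NULL) != 0)
        return -1;

    /* Wait until T1 really holds the lock. */
    pthread_mutex_lock(&sync_mu);
    while (!t1_has_lock)
        pthread_cond_wait(&sync_cv, &sync_mu);
    pthread_mutex_unlock(&sync_mu);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: T1 does not exist here, yet its lock is still held. */
        int rc = pthread_mutex_trylock(&lock);
        _exit(rc == EBUSY ? 1 : 0);
    }

    /* Parent: let T1 finish, then collect the child's verdict. */
    pthread_mutex_lock(&sync_mu);
    release_t1 = 1;
    pthread_cond_broadcast(&sync_cv);
    pthread_mutex_unlock(&sync_mu);
    pthread_join(t1, NULL);

    if (pid < 0)
        return -1;
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Had the child called pthread_mutex_lock instead, it would block forever, because the only thread able to unlock the mutex was never copied into the child.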

To see the list of async-signal-safe functions:

$ man signal-safety

uclibc & glibc

unsafe use of malloc after fork
malloc in glibc does not have this problem

malloc/malloc.c (malloc_atfork, free_atfork): Remove
Remove malloc hooks from fork handler. They are no longer needed because malloc runs right before fork, and no malloc calls from other fork handlers are not possible anymore.
malloc in uclibc does have this problem

Solutions

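Call execve immediately after fork; in between, call only async-signal-safe functions
Fork child processes before the program starts any other threads; keep in mind that C++ static constructors run before main and may already have started threads
Avoid fork in multithreaded programs; use pthread_create instead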
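
The first option can be sketched as below. The function name spawn_and_wait is hypothetical; the point of the sketch is that the child branch calls nothing except execve, write and _exit, all of which are listed in signal-safety(7).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run the program at `path` with `argv` and wait for it.
 * Returns the child's exit status, or -1 if fork/waitpid failed or the child
 * did not exit normally. A failed execve is reported as status 127. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: only async-signal-safe calls from here on.
         * No malloc, no printf, no locks. */
        execve(path, argv, environ);

        /* Reached only if execve failed. */
        static const char msg[] = "execve failed\n";
        ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
        (void)n;
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Anything the child needs, such as the argument vector, should be prepared in the parent before fork, so that no allocation or formatting has to happen between fork and execve.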
