1. The process is one of the two most fundamental abstractions in a Unix system (the other is the file)
A process is a running program
A thread is the unit of activity inside of a process
Virtualized memory is associated with the process, not the thread: all threads
in a process share the same memory address space
2. pid
The idle process has the pid 0
The first process that the kernel executes after booting the system, called
the init process, has the pid 1
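By default, the kernel imposes a maximum process ID value of 32768, for compatibility with older Unix systems that used 16-bit types for process IDs. On Linux the limit can be changed through /proc/sys/kernel/pid_max
The kernel allocates pids in a strictly linear fashion: if 17 is the highest pid currently allocated, the next pid allocated is 18. Values are not reused until allocation wraps around from the top
Process hierarchy: every process except the init process is spawned from another process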
Children normally belong to the same process groups as their parents
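A minimal sketch of how a process can look up its own pid and its parent's pid with getpid() and getppid(). The function name print_ids is hypothetical and only used for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: print this process's pid and its parent's pid.
 * Returns 0 if the values look sane, -1 otherwise. */
int print_ids(void)
{
    pid_t self = getpid();      /* pid of the calling process */
    pid_t parent = getppid();   /* pid of the process that spawned it */

    /* pid_t has no printf conversion of its own, so cast to long. */
    printf("pid=%ld ppid=%ld\n", (long)self, (long)parent);

    return (self > 0 && self != parent) ? 0 : -1;
}
```

Calling print_ids() from a program started in a shell would normally show the shell's pid as the ppid.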
3. Running a new process
In Unix, the act of loading into memory and executing a program image is
separate from the act of creating a new process
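exec system call: loads a binary program into memory, replacing the previous
contents of the address space, and begins execution of the new program
fork system call: creates a new process
To execute a new program in a new process: first a fork to create a new
process, and then an exec to load a new binary into that process
Of the exec family of functions, only execve is a system call; the others are C library wrappers around execve
After a fork, the child and the parent are nearly identical, except for the following:
a. The pid differs (the child gets a newly allocated pid)
b. Pending signals are cleared and are not inherited by the child
c. File locks held by the parent are not inherited by the child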
This “fork plus exec” combination is frequent and simple:
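#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

pid_t pid = fork();
if (pid == -1) {
    perror("fork");
}
/* the child */
if (pid == 0) {
    /* execv expects char *const[]; argv[0] is the program name */
    char* const args[] = {"grep", NULL};
    /* the path must be absolute; this assumes grep lives at /bin/grep */
    execv("/bin/grep", args);
    /* execv returns only on failure */
    perror("execv");
    exit(EXIT_FAILURE);
}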
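The fragment above can be turned into a complete helper. This is a sketch under a few assumptions: the function name run_program is hypothetical, it uses execvp (one of the library wrappers around execve) so that PATH is searched, and it waits for the child with waitpid, which is covered in section 6.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv in a new process.
 * Returns the child's exit code, or -1 if fork/waitpid failed or the
 * child did not exit normally. */
int run_program(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: replace the process image. execvp returns only on failure. */
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);   /* conventional "command could not be executed" status */
    }

    /* Parent: wait for that specific child. */
    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with char *args[] = {"sh", "-c", "exit 3", NULL}, run_program(args) should return 3.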
4. copy-on-write
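In early Unix systems, fork was simple to implement: the kernel created copies
of all internal data structures, duplicated the process's page table entries,
and then performed a page-by-page copy of the parent's address space into the
child's new address space. This page-by-page copy was time-consuming
Copy-on-write is a lazy strategy that reduces the overhead of duplicating resources.
If multiple consumers request read access to their own copies of a resource, no copies need to be made; each consumer only needs a pointer to the same shared resource. As long as no one tries to modify the resource, the cost of copying is avoided. If one consumer does need to modify it, only then is the resource duplicated: the copy is handed to the consumer that wants to modify it, and all the others continue to share the original
In the specific case of virtual memory, copy-on-write is implemented per page.
The pages of the process are first marked read-only and copy-on-write.
If a process tries to modify such a page, a page fault occurs; the kernel handles the fault by copying the page, and the copy-on-write attribute of the page is cleared.
Modern machine architectures support copy-on-write at the hardware level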
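A program cannot observe the page copying itself, only its effect: after a fork, a write in one process is invisible to the other. The sketch below illustrates that effect; the function name cow_demo and the variable counter are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;   /* lives in a page shared after fork until written */

/* Hypothetical demo: the child writes to counter, the parent reports the
 * value it still sees. Returns that value, or -1 on error. */
int cow_demo(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* The write triggers a page fault; the kernel gives the child
         * its own copy of the page before the store completes. */
        counter = 100;
        _exit(counter == 100 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* The parent's page was never modified. */
    return counter;
}
```

Under these assumptions cow_demo() returns 1: the child changed only its private copy.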
5. exit
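When a process terminates, the kernel cleans up all the resources it holds: allocated memory, open files, semaphores, and so on
The classic way to end a program is not an explicit call to exit, but "falling off the end" of the program, that is, returning from main()
When a process terminates, the kernel sends the SIGCHLD signal to its parent.
By default this signal is ignored; the parent can instead catch it by installing a handler with sigaction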
6. wait
#include <sys/types.h>
#include <sys/wait.h>
/* wait() returns the pid of a terminated child or -1 on error. */
pid_t wait (int *status);
pid_t waitpid (pid_t pid, int *status, int options);
When pid == -1, waitpid waits for any child process, just like wait: wait (&status)
is equivalent to waitpid (-1, &status, 0)
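The status filled in by wait and waitpid is decoded with the macros from <sys/wait.h>. The sketch below uses two hypothetical helpers, wait_for_child and print_status, to show a normal exit and a death by signal.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, or that kills
 * itself with signal `sig` when sig != 0. Stores the raw wait status in
 * *status. Returns 0 on success, -1 on error. */
int wait_for_child(int code, int sig, int *status)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        if (sig != 0)
            raise(sig);     /* for SIGKILL this never returns */
        _exit(code);
    }

    return waitpid(pid, status, 0) == pid ? 0 : -1;
}

/* Hypothetical helper: describe how the child ended. */
void print_status(int status)
{
    if (WIFEXITED(status))
        printf("exited normally, exit code %d\n", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("killed by signal %d\n", WTERMSIG(status));
}
```

WEXITSTATUS is only meaningful when WIFEXITED is true, and WTERMSIG only when WIFSIGNALED is true.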
7. system
#define _XOPEN_SOURCE /* if we want WEXITSTATUS, etc. */
#include <stdlib.h>
int system (const char *command);
It is common to use system() to run a simple utility or shell script, often
with the explicit goal of simply obtaining its return value
If command is NULL, system() returns a nonzero value if the shell /bin/sh
is available, and 0 otherwise
Here is a sample implementation of system():
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* my_system - synchronously spawns and waits for the command
 * /bin/sh -c <cmd> .
 *
 * Returns -1 on error of any sort, or the exit code from
 * the launched process. Does not block or ignore any signals.
 */
int my_system(const char* cmd)
{
    int status;
    pid_t pid = fork();
    if (pid == -1) {
        fprintf(stderr, "fork error\n");
        return -1;
    } else if (pid == 0) {
        /* execv takes char *const[]; the strings are never modified */
        char* const argv[] = {"sh", "-c", (char*)cmd, NULL};
        execv("/bin/sh", argv);
        /* reached only if execv failed; 127 is the status system()
         * reports when the shell cannot be executed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) == -1) {
        return -1;
    } else if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    return -1;
}
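For comparison, a short sketch of calling the library's system() and extracting the command's exit code. The wrapper name shell_exit_code is hypothetical.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical wrapper: run cmd through system() and return its exit code,
 * or -1 if the command could not be run or did not exit normally. */
int shell_exit_code(const char *cmd)
{
    int status = system(cmd);
    if (status == -1)               /* fork or wait failed */
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status); /* exit code of /bin/sh -c cmd */
    return -1;                      /* e.g. the shell was killed by a signal */
}
```

The value returned by system() is a wait status, so it has to be decoded with the same macros used for wait and waitpid.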
8. Zombie processes
A process that has terminated but has not yet been waited upon by its
parent is called a “zombie.”
Zombie processes continue to consume system resources, although only a
small percentage
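If a parent forks a child, the parent is responsible for waiting on that child, which is what reclaims the resources still held by the zombie
What if a parent terminates before waiting on its children? The kernel walks the list of its children and reparents them to the init process, which ensures that every process always has a parent. The init process periodically waits on all of its children, so these processes are eventually reaped and do not remain zombies for long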
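One way for a parent to meet this responsibility is to reap every remaining child before it exits. A minimal sketch, with the hypothetical function name reap_all_children:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for every remaining child of the caller.
 * Returns the number of children reaped, or -1 on an unexpected error. */
int reap_all_children(void)
{
    int reaped = 0;

    for (;;) {
        pid_t pid = waitpid(-1, NULL, 0);   /* any child, blocking */
        if (pid > 0) {
            reaped++;                       /* one zombie cleaned up */
            continue;
        }
        if (errno == EINTR)
            continue;                       /* interrupted by a signal: retry */
        if (errno == ECHILD)
            return reaped;                  /* no children left */
        return -1;
    }
}
```

A long-running parent that cannot block would more likely call waitpid with WNOHANG from a SIGCHLD handler or its main loop; the blocking loop here is only the simplest variant.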
9. Users and Groups
#include <sys/types.h>
#include <unistd.h>
int setuid (uid_t uid);
int setgid (gid_t gid);
uid_t getuid (void);
gid_t getgid (void);
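A sketch of how these calls are commonly combined when a process changes identity. The function name switch_ids is hypothetical, and geteuid()/getegid() (not listed above) are used to verify the result. The group is changed first because a process that has already given up its user privileges may no longer be allowed to change its group.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch the process to the given group and user.
 * Returns 0 on success, -1 on failure. */
int switch_ids(uid_t uid, gid_t gid)
{
    /* Group first: after setuid() the process may lack the privilege. */
    if (setgid(gid) == -1) {
        perror("setgid");
        return -1;
    }
    if (setuid(uid) == -1) {
        perror("setuid");
        return -1;
    }

    /* Verify that the effective IDs really changed. */
    if (geteuid() != uid || getegid() != gid)
        return -1;
    return 0;
}
```

An unprivileged process can only switch to IDs it already holds, so switch_ids(geteuid(), getegid()) succeeds while switching to another user generally requires root.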
Each process is a member of a process group. Process groups that contain
more than one process generally exist to implement job control
A shell command such as this one creates one process group containing three processes:
$ cat ship-inventory.txt | grep booty | sort
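A sketch of process group membership seen from C, using getpgrp() and setpgid(). The function name child_in_parent_pgrp is hypothetical. A forked child starts in its parent's group; calling setpgid(0, 0) moves it into a new group of its own, which is roughly what a shell does for each job.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child and report whether it is in the
 * parent's process group. If new_group is nonzero, the child first
 * creates its own group. Returns 1 (same group), 0 (different), -1 (error). */
int child_in_parent_pgrp(int new_group)
{
    pid_t parent_pgrp = getpgrp();

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* setpgid(0, 0): make the caller the leader of a new group. */
        if (new_group && setpgid(0, 0) == -1)
            _exit(2);
        _exit(getpgrp() == parent_pgrp ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) > 1)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```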
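10. Daemons
A daemon is a process that runs in the background, not connected to any
controlling terminal
Daemons are usually started at boot time, typically run with root privileges, and handle system-level tasks; by convention their names often end in d
A daemon must satisfy two requirements: a. it runs as a child of the init process  b. it is not connected to a terminal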
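A rough sketch of how a program could meet both requirements. The function name become_daemon is hypothetical, and this is a simplified version of the usual sequence: it does not close other inherited file descriptors or reset the umask. The parent exits after the fork so that the child is reparented, and setsid() puts the child in a new session without a controlling terminal.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: detach the calling program from its parent and its
 * terminal. Returns 0 in the daemon process, -1 on error. The original
 * caller never returns on success: it exits inside this function. */
int become_daemon(void)
{
    fflush(NULL);                   /* do not carry buffered output across fork */

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid != 0)
        _exit(0);                   /* parent exits; the child gets reparented */

    if (setsid() == -1)             /* new session, no controlling terminal */
        return -1;
    if (chdir("/") == -1)           /* do not keep any directory busy */
        return -1;

    /* Point stdin, stdout and stderr at /dev/null. */
    int fd = open("/dev/null", O_RDWR);
    if (fd == -1)
        return -1;
    for (int target = 0; target <= 2; target++) {
        if (dup2(fd, target) == -1)
            return -1;
    }
    if (fd > 2)
        close(fd);
    return 0;
}
```

glibc also provides daemon(), which performs a similar sequence in one call.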
Linux System Programming study notes (5): Process Management, 古老的榕树, 5-wow.com