Thursday, June 10, 2010

Process and Thread In Linux

Process:
A process is a unit of execution that the kernel can schedule. Examples include the Linux shell itself and each command invoked from it. Process manipulation functions are declared in the header file unistd.h. A process is identified by a value of type pid_t, defined in sys/types.h, where pid stands for process ID. Linux has a special process called init, with an ID of 1, from which all other user processes descend. The ID of the process your program is running in can be obtained with the function getpid(). Every process except init has a parent process, whose ID can be obtained with the function getppid().
To view all processes running on a system, use the shell command ps with the options -ef: -e selects every process, and -f produces the full listing that includes each process's ID, its parent's ID, and the command that started it. To end a process, run the kill command with the process's ID as an argument. By default, kill sends the process a SIGTERM signal. A program can install a handler to intercept that signal so that, for example, files or connections are released before the process stops gracefully.
A running program can terminate itself with the exit function. There are several ways to create a process. The first is the system function, which runs the command given as its argument in a shell and returns when that command has finished. The second is fork, which creates a child process running the same program as its parent. Both processes continue from the point where fork returns; fork returns 0 in the child and the child's process ID in the parent, which is how the two tell themselves apart. The child gets its own copy of the parent's memory (Linux copies pages lazily, on first write), so a later change in one process is not visible in the other. Open files are inherited, however, so conflicts can arise if both processes use the same file. Finally, the exec family of functions replaces the program being run altogether. Unlike fork, exec does not create concurrency; it only changes the program, as though the code had been swapped out for new code. Calling fork and then exec in the child is therefore the usual way to start a different program. Note that while it used to be fashionable to split a program into many processes, creating processes and communicating between them carries a noticeable performance cost, so this is less common today. An exception is agent-style programming, where multiple processes are still used.
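It is sometimes desirable to give a particular process more execution time than the others. This can be done by adjusting its niceness, which ranges from -20 to 19 on Linux; a lower number means higher priority. The nice command starts a command with an adjusted niceness, and renice changes the niceness of a process that is already running.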

Thread:
Threads are often a preferable alternative to processes, especially for simple tasks. A thread lives inside a process and shares that process's address space and open files with the other threads in it; on Linux the kernel schedules threads much as it schedules processes. Because a new thread does not need its own copy of the address space, threads are cheaper to create and can share memory directly. Thread creation is often quoted as 10 to 100 times faster than process creation, although the actual ratio depends on the system and the workload.
Despite these advantages, threads carry dangers as well. Because memory is shared between threads, data must be thread-safe; that is, mechanisms must be in place to preserve the integrity of data even if multiple threads access it at the same time. Furthermore, race conditions can occur, where the order and timing of events in multiple threads unexpectedly determine the outcome of a procedure. For example, in a networked setting, suppose that a user starts a chat on one server, thereby getting channel-operator privileges. At the same time, a second user with the same name tries to start another chat on the same network, but on a different server. Neither server has received notice of the other server's channel allocation, so the users will be able to access each other's chats with higher privileges because they have the same name.
To create a thread in Linux, the pthread_create function is used:
int pthread_create(
    pthread_t *thread,
    const pthread_attr_t *attr,
    void *(*start_routine)(void *),
    void *arg);
The new thread executes concurrently with the calling thread. When the new thread starts, it calls start_routine with the argument arg; attr specifies the attributes to apply to the new thread, or NULL for the defaults. On success, pthread_create returns 0 and stores the new thread's ID in thread. When start_routine returns, the thread terminates implicitly. Alternatively, the thread can terminate itself explicitly with pthread_exit. If another thread has called pthread_join on this thread, that waiting thread resumes execution once this thread has terminated by either method. A thread may be put in detached mode with the function pthread_detach, which guarantees that the resources consumed by the thread passed as its argument are freed as soon as that thread terminates. A detached thread is not joinable, so other threads cannot synchronise on its termination.
To help make programs thread-safe, a mutex of type pthread_mutex_t may be used. Such a mutual exclusion mechanism ensures that only one thread at a time accesses the data it protects. A thread calls pthread_mutex_lock to request exclusive access. If another thread already holds the lock, the calling thread is suspended until the lock is released. To release the lock, pthread_mutex_unlock is used.
Note that the pthread facilities are a C API. It may be desirable to use them from C++. One way to do this is a C++ class that holds a pthread_t member identifying its thread. When the thread is created in the class's initialization code, a static member function or a free function is passed as start_routine, because a non-static member function does not match the required signature. The object's this pointer is passed as arg, so this entry point can cast it back and call a member function of the class.