(Recorded on 2022-04-28)
In this lecture we turn our attention to parallel computing. Although we have looked at some types of hardware- and compiler-supplied parallelism (pipelining and SIMD), we haven’t had to explicitly manage multiple threads of program control ourselves. That will be the focus of the remainder of this course.
There is a fundamental mental shift we have to make in thinking about computing at this point, however. Up until now we have been thinking about a computation as a sequence of instructions carried out one after the other by a CPU core – a single sequence of instructions. The seismic shift – from which all of the different models of parallel computing will follow – is that our “program” is going to consist of multiple sequences of instructions. Each sequence of instructions – each instruction stream, or thread – will itself be executed by a core in the same fashion that we have been considering in the course so far. The core fetches an instruction, puts it in the execution pipeline, fetches the next instruction, and so on. Now we just have multiple instruction streams, each of which is following this process.
Some of the subtleties in parallel programming have to do with writing programs such that they can be expressed as multiple threads of execution, rather than just one. To make sure a computation expressed in this way is correct, we have to make sure dependencies between data and computation are respected – which can limit how many threads we can have or how many threads we can run concurrently. And we have to distinguish between concurrency, which means that computations can be run independently of each other (order doesn’t matter), and parallelism, which means computations are running truly at the same time on separate execution units. Concurrent programs aren’t necessarily parallel, but parallel programs must be concurrent.
We introduce parallelism and concurrency with an anthropomorphized example of grading a set of exams. We consider different ways of spreading the work of grading exams (and grading questions on the exams) among different graders. In class we analyzed how we might gain efficiency with more graders (a shorter time before all the exams were graded) and noted that parallelism, not just concurrency, was what mattered for speedup. Spreading the work among 22 graders means each grader does 1/22 of the work and thus spends 1/22 of the time grading – but the time until all of the exams are graded is 1/22 of the original time only if the graders actually work in parallel, all starting at the same time and completing at the same time.
In computing, a process is an abstraction for a collection of resources to represent a (running) program. It has five stages in its lifetime: new, ready, running, waiting, and terminated. On Unix-like systems, a new process is created when an existing process invokes fork(). Upon successful completion, fork() returns a value of 0 to the child process and returns the process ID of the child process to the parent process.
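A minimal sketch of calling fork() directly, assuming a POSIX system; branching on the return value is how parent and child tell themselves apart:

```cpp
#include <cstdio>      // std::printf, std::perror
#include <sys/types.h> // pid_t
#include <sys/wait.h>  // waitpid()
#include <unistd.h>    // fork(), getpid()

int main() {
    pid_t pid = fork();  // create a new process
    if (pid < 0) {
        std::perror("fork");  // fork failed; no child was created
        return 1;
    } else if (pid == 0) {
        // fork() returned 0: this is the child process.
        std::printf("child:  pid=%d\n", static_cast<int>(getpid()));
    } else {
        // fork() returned the child's PID: this is the parent process.
        std::printf("parent: pid=%d, child pid=%d\n",
                    static_cast<int>(getpid()), static_cast<int>(pid));
        waitpid(pid, nullptr, 0);  // wait for the child to terminate
    }
    return 0;
}
```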
In computing, a context switch is the process of storing the state of a process or thread so that it can be restored and execution can resume at a later point. Below is an illustration of how two running processes P0 and P1 switch contexts.
In computing, a thread is an abstraction of execution (using the resources within a process). A thread is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. Threads can share a process address space.
We can use std::thread to create (“fork”) threads within a process. Each thread works on a task within the process, and once the task assigned to a thread finishes, the thread can be “joined” back into the thread that created it.
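A minimal sketch of this fork/join pattern with std::thread; the task function and the two-thread count are just for illustration:

```cpp
#include <iostream>
#include <thread>

// The task each thread carries out; output from different
// threads may interleave, since they run concurrently.
void task(int id) {
    std::cout << "thread " << id << " finished its task\n";
}

int main() {
    std::thread t1(task, 1);  // "fork": start a new thread running task(1)
    std::thread t2(task, 2);
    t1.join();  // "join": wait for t1's task to finish
    t2.join();
    return 0;
}
```

Compiling this typically requires linking the threading library (for example, g++ -std=c++17 -pthread).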
Concurrency means that two or more tasks can start, run, and complete in overlapping time periods. Parallelism means that tasks literally run at the same time.
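One way to see the distinction, as a sketch: the two threads below are concurrent by construction, but whether they actually run in parallel depends on how many hardware execution units the machine offers. std::thread::hardware_concurrency() reports that count (it may return 0 if the value is unknown):

```cpp
#include <iostream>
#include <thread>

int main() {
    // Concurrency: these two tasks are independent, so they
    // may run in either order or overlap in time.
    std::thread a([] { std::cout << "task A\n"; });
    std::thread b([] { std::cout << "task B\n"; });
    a.join();
    b.join();

    // Parallelism additionally requires separate execution units:
    // on a single-core machine the tasks above would interleave
    // on one core (concurrent but not parallel).
    unsigned n = std::thread::hardware_concurrency();
    std::cout << "hardware threads available: " << n << "\n";
    return 0;
}
```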
A process is an abstraction for a collection of resources to represent a (running) program. It manages resources such as the CPU, memory, and an address space. A process is not lightweight, meaning it is expensive to create and terminate.
A thread is an abstraction of execution (using the resources within a process). It is the smallest sequence of programmed instructions that can be scheduled independently, and threads can share the address space of their process. A thread is lightweight, meaning it is less expensive to create and terminate than a process.