Lecture 1: Course Overview and Introduction to HPC

Lecture 1 materials: title slide image (L1-title.png), slides, Panopto recording, podcast

(Originally recorded 2019-04-02)

Overview

This is the first lecture of the course and I spend the first part of it going over course mechanics and the plan for the course (essentially a review of the syllabus).

To help set context for the course I also describe how HPC is used and why it is important. A quick overview of the history of HPC hardware begins with Seymour Cray and takes us through Thomas Sterling.

HPC is about enabling scientists and decision-makers to solve bigger problems and to solve them faster and better. As such, hardware, software, algorithms, and problem domain are inextricably linked. The plan of the course will be guided by the hierarchical structure of today’s largest computers: single core, hierarchical memory, SIMD, multicore, GPU, and clusters. As the course follows this progression we will apply the hardware at hand to solve some problems of interest. At the same time, we will be building up a set of (high-performance) abstractions for developing solid software, and introducing the small set of C++ language features needed to realize these abstractions and the programs built with them. We will also be learning about some of the tools typically used in an HPC environment.

Course Essentials

Almost all information about, and materials for, this course will be on the course web site, which is located at https://amath583.github.io/sp21/ . Please familiarize yourself with the course syllabus and course policies. Information on the web site is definitive and supersedes what might have been previously recorded in lecture. We will still be using the course Canvas site to broadcast announcements and for homework submission.

The course site includes links to all of the course lecture videos, PDFs of all the course slides, and all of the course assignments. It also includes additional videos and notes on selected topics. Over the course of this semester I will be augmenting the course lectures with companion notes (available on each lecture page). Contact information for the course instructional staff can be found on the course web site, along with a calendar of due dates and office hour times.

There is no required course text, but I suggest Parallel Programming: Concepts and Practice by Schmidt et al. and Patterns for Parallel Programming by Mattson et al. as supplementary references for the course material.

The primary computing resource that you will be using for most of the course will be your own laptop. For the last few assignments we will be using institutional resources for GPU programming and for distributed-memory (cluster) programming. The primary computing environment will be Linux or Mac OS X. If you have a Windows laptop, Linux is available with the Windows Subsystem for Linux. Instructions on how to set up and use your computing environment for this course will be part of the first assignment.

I also strongly recommend using Visual Studio Code for all of your software development in this course.

Course Overview

The course follows a canonical outline for HPC, structured according to the progression of hardware from single-core CPUs to clusters of multicore CPUs (with a detour for GPUs). Each step along this progression emphasizes a particular technology, which has a corresponding mental model (paradigm) and a corresponding traditional means of programming. We will be using a slightly more modern approach than the typical “canon”: C++ and its standard library rather than C and low-level primitives (cf. slides 18-29).

An important part of this course will focus on sound software development practices, and we will emphasize clarity and elegance rather than cleverness.

Programming in This Course

We will be using the C++ language for programming in this course as it is the language of choice for modern real-world high-performance computing. As a modern multi-paradigm programming language, C++ provides a number of high-level abstractions that allow you to solidly structure large-scale software systems. At the same time, it is a compiled language and allows direct control of low-level, performance-oriented hardware features. C++ is a huge language; we will focus on the essential slice needed for this course.

Before taking this course, you should have had some experience in programming, most likely in Python or MATLAB. There are some important differences between C++ and these languages. Of course, the syntax differs between C++ and Python (MATLAB), but that is not really an important difference. All of the things you could do in Python you can do in C++ and vice versa; we just express how those things are done in slightly different ways and with slightly different idioms.

The real difference is that Python (MATLAB) is a dynamically typed, interpreted language, whereas C++ is a statically typed, compiled language. When you run a Python program, another program, the Python interpreter, is running on your computer. The interpreter reads in your Python program and then carries out the instructions in it. At bottom, a running program means that a CPU is fetching and executing binary-coded instructions at a very high rate. In the case of a Python program, those instructions all belong to the interpreter, and the CPU may execute hundreds or even thousands of machine instructions to carry out one Python instruction.

By contrast, when we say C++ is a compiled language, we mean that before we run a C++ program we have to translate what we wrote in C++ into actual machine instructions. When we run a C++ program, we don’t run the C++; we run the program containing the generated machine instructions, and the CPU executes those directly. As a result, a C++ program can be tens or even hundreds of times faster than an equivalent program written in pure Python, particularly for loop-heavy numerical code. The trade-off is that a C++ program has to be complete, and syntactically and type-correct, before it can be compiled and executed. You can’t iteratively interact with small program fragments in C++ in the way that you can with Python or MATLAB.

This will all become clearer during the course. The first two assignments in particular are intended to familiarize you with programming in C++.