The Pre-Processor

Although not part of the C++ language per se, the pre-processor allows you to programmatically manipulate the text of the program itself. That is, you can make changes to the program text – in an automated fashion – before it is actually compiled. One example we touched on was in the use of assert() for defensive programming.

When assert() is enabled, it executes a check of the expression that is passed to it – that is, it executes a small amount of code. In performance-critical applications, this can steal cycles as well as prevent certain compiler optimizations. For production (release) versions of a program, we need to be able to remove the assertions.

One way to remove the assertions, of course, would be to go to all of your files and either delete or comment out all of the calls to assert(). This is an anti-solution: during development we need to be able to switch between debug and release versions of the code, and we need to be able to switch all of the calls to assert() on and off in one fell swoop each time. To support this, calls to assert() can be “turned off” by the pre-processor if the macro NDEBUG is defined.

The pre-processor works in the following basic way. It takes your program text as input and provides program text as output. How this output is produced is controlled by the pre-processor's own macro language – which is also embedded in the program text. Some of the statements you have already seen in your programs, such as #include, are pre-processor statements. In fact, all statements beginning with # are pre-processor language commands.

Beyond just filtering input and output, the pre-processor has some sophisticated features for text manipulation. However, we are just going to look at some basic capabilities for including or excluding specific parts of your program before being passed to the compiler.

The command you have already seen from the pre-processor is #include, which is used to pull the text of one file into another file. For example, if your file Vector.cpp includes the file Vector.hpp, as in

#include "Vector.hpp"

int main() {

  return 0;
}

the text that is passed to the compiler is the combination of the two files. That is, what is compiled when you invoke the compiler on Vector.cpp is the text of Vector.cpp with the complete text of Vector.hpp inserted at the location of the #include "Vector.hpp" directive. If you would like to see all of the text that is passed to the compiler after pre-processing, you can add the option “-E” to your compilation command. Be careful with this though – if you have included anything from the standard library you will get an enormous amount of text back. A statement such as #include <iostream> is not special – there is a file named iostream that is part of the standard library and its text is pulled in by that pre-processor directive.

Now, as we mentioned above, the pre-processor can be programmed by the user. To do this, we need to be able to do expected programming tasks, such as defining and testing variables and branching based on the tests. Since the pre-processing text and the program text are combined – and since the pre-processor manipulates the program text – it is important to distinguish between pre-processor commands and variables, and the program text and variables. The convention is to use ALL_CAPS for any pre-processor macros or variables.

Variables in the pre-processor are defined with the following syntax:

#define MY_VARIABLE sometext

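This creates the macro (pre-processor variable) with the name MY_VARIABLE in the pre-processor namespace. Two things happen when a pre-processor macro is defined. First, the name MY_VARIABLE becomes defined, a fact that the pre-processor's branching directives (described later in this section) can test. Second, whenever the pre-processor subsequently encounters MY_VARIABLE as a separate word in the program text (not inside a string literal or as part of a longer name), it substitutes the defined text for that variable. In other words, the following transformation occurs.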

#include <iostream>
using namespace std;

#define MY_GREETING "Hello World"
#define OP *

int main() {

  cout << MY_GREETING << endl;
  cout << "7 x 6 = " << 7 OP 6 << endl;

  return 0;
}

is transformed into the following (the expanded text of <iostream> is omitted here):

using namespace std;

int main() {

  cout << "Hello World" << endl;
  cout << "7 x 6 = " << 7 * 6 << endl;

  return 0;
}

Note that the macros MY_GREETING and OP are replaced by exactly the text they are defined to be. In this case, that even includes the quotation marks in MY_GREETING.
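
Because the substitution is purely textual, the pre-processor knows nothing about C++ operator precedence. The following is a minimal sketch of the consequence; the macro and function names are made up for illustration only.

```cpp
// Hypothetical macros, used only to illustrate textual substitution.
#define SIX_BARE 1 + 5      // replaced by the raw text: 1 + 5
#define SIX_SAFE (1 + 5)    // replaced by the raw text: (1 + 5)

// After pre-processing the compiler sees: return 1 + 5 * 2;
int twice_bare() { return SIX_BARE * 2; }

// After pre-processing the compiler sees: return (1 + 5) * 2;
int twice_safe() { return SIX_SAFE * 2; }
```

Here twice_bare() evaluates to 11 rather than 12, because the multiplication binds to the 5 alone once the macro text has been pasted in. Wrapping the macro text in parentheses, as in SIX_SAFE, is the usual way to avoid this surprise.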

Branching in the pre-processor is controlled by a family of #if directives: #if, #ifdef, and #ifndef, which test the value of a compile-time expression, whether a macro is defined, or whether a macro is not defined, respectively. Each must be closed by a matching #endif. When it evaluates an #if, the pre-processor does not execute one branch of pre-processor code or another; rather, it sends one stream of your program text or another to the compiler.
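
As a rough sketch of the three tests side by side, consider the following. The macros LOG_LEVEL, HAVE_FAST_MATH, and HAVE_GPU and the function build_summary() are hypothetical names chosen for this illustration, and HAVE_GPU is assumed not to be defined anywhere.

```cpp
#include <string>

// Hypothetical macros for illustration; none of these come from a library.
#define LOG_LEVEL 2        // a macro with a numeric value
#define HAVE_FAST_MATH     // a macro with no value at all

std::string build_summary() {
  std::string summary;
#if LOG_LEVEL >= 2          // #if tests the value of an expression
  summary += "verbose ";
#endif
#ifdef HAVE_FAST_MATH       // #ifdef tests only that the macro is defined
  summary += "fast-math ";
#endif
#ifndef HAVE_GPU            // #ifndef tests that the macro is NOT defined
  summary += "cpu-only";
#endif
  return summary;
}
```

With the definitions shown, all three blocks of text reach the compiler, so build_summary() returns "verbose fast-math cpu-only". Changing LOG_LEVEL to 1, for example, would remove the first line from the compiled program entirely.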

Include Guards

One standard use of the branching capabilities of the pre-processor is to make sure that the text from any given header file is placed only once into the text stream sent to the compiler. Repeated inclusion can easily happen when multiple headers include each other and/or include the same headers from the standard library. With a guard in place, even though a header might be #included several times, its text is inserted only once. The following is the standard technique. For any header file, the following pre-processor directives are used (assume the header file is Matrix.hpp):

#ifndef MATRIX_HPP   // if the macro MATRIX_HPP is not defined, include the following text
#define MATRIX_HPP  // First, define the macro


#endif  // The program text up to the matching #endif is what is included

You should make a habit of always protecting your header files in this way.
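
To see why the guard works, it can help to paste a guarded header twice into a single file, which is roughly what the compiler receives when two different headers both include it. The sketch below assumes a hypothetical header Point.hpp containing a small struct; the names are illustrative only.

```cpp
// ---- first copy of the hypothetical Point.hpp ----
#ifndef POINT_HPP
#define POINT_HPP
struct Point { double x; double y; };
#endif

// ---- second copy of the hypothetical Point.hpp ----
#ifndef POINT_HPP   // POINT_HPP is already defined, so this test fails ...
#define POINT_HPP
struct Point { double x; double y; };   // ... and this text never reaches the compiler
#endif

// Point is defined exactly once, so this compiles without a redefinition error.
double coordinate_sum(Point p) { return p.x + p.y; }
```

Without the #ifndef/#define/#endif lines, the second struct definition would reach the compiler and be rejected as a redefinition of Point.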

As you might expect, there is an #else to go along with #if.

#include <iostream>

int main() {

#ifdef BAD_DAY
  std::cout << "Today is a bad day" << std::endl;
#else
  std::cout << "Today is a good day" << std::endl;
#endif

  return 0;
}

In this example, the compiler will get the first branch of text if the macro BAD_DAY is defined, otherwise it will get the second branch. NB: With #ifdef, it does not matter what the value of the macro is. The test is only whether the macro exists or not. It is perfectly acceptable to #define a macro with no value.

In fact, when we want to disable assert(), we just need to #define the macro NDEBUG; we don’t need to give it any particular value. But, since the pre-processor works through your program text from top to bottom before sending it to the compiler, the NDEBUG macro must be defined before the #include <cassert> statement.

But this raises almost the same scalability issue we mentioned before. If we want to globally remove assert(), we need to have the NDEBUG macro defined when processing each of our program files. One way to do this would be to edit each of the files and insert (or remove) #define NDEBUG in every one. This is impractical for all but the smallest of programs (and maybe not even then).

There is an essential feature of the C++ compiler that solves this problem, namely the -D option. This option passes a macro (with or without a defined value) to the pre-processor. In particular, you can define NDEBUG for your programs this way (without ever having to change the program text):

$ c++ -DNDEBUG main.cpp -o main.exe

You can give the macro a value by using =:

$ c++ -DNDEBUG=1 main.cpp -o main.exe

but for NDEBUG the definition, not any value, is what turns on or turns off assert().
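
The effect of the option can be sketched with a small file that behaves differently depending on whether -DNDEBUG was given. The function names below are hypothetical, and the sketch assumes NDEBUG is only ever set on the command line (not with a #define partway through the file).

```cpp
#include <cassert>

// Reports which kind of build this file was compiled as.
bool assertions_enabled() {
#ifdef NDEBUG
  return false;   // compiled with -DNDEBUG: assert() becomes a no-op
#else
  return true;    // NDEBUG not defined: assert() performs its check
#endif
}

// Hypothetical helper: validates an index in debug builds only.
int checked_index(int i, int n) {
  assert(i >= 0 && i < n);   // the check is removed when NDEBUG is defined
  return i;
}
```

Compiled without -DNDEBUG, a call such as checked_index(7, 5) aborts the program with a diagnostic message; compiled with -DNDEBUG, the same call simply returns 7, because the text of the check never reaches the compiler.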

An equivalent to #ifdef and #ifndef is to use the defined operator with a plain #if. That is, an equivalent way to write an include guard is

#if !defined(MATRIX_HPP)   // if the macro MATRIX_HPP is not defined, include the following text
#define MATRIX_HPP         // First, define the macro


#endif  // The program text up to the matching #endif is what is included

The defined operator evaluates to true (1) if its argument is a defined macro and to false (0) otherwise. An advantage of defined is that it can be used in a plain #if, where one can construct compound predicates, for instance:

#if !defined(MATRIX_HPP) && defined(TBB)
#define MATRIX_HPP


#endif
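
A compound predicate is perhaps easier to appreciate outside of an include guard. The following sketch selects one of two implementations; the macros USE_FAST_KERNEL and FORCE_REFERENCE_KERNEL and the function kernel_name() are hypothetical, and in practice such macros would more likely be supplied with -D than written in the file.

```cpp
#include <string>

// Hypothetical configuration macros for illustration.
#define USE_FAST_KERNEL
// #define FORCE_REFERENCE_KERNEL   // uncomment to override the fast path

#if defined(USE_FAST_KERNEL) && !defined(FORCE_REFERENCE_KERNEL)
// Only this definition reaches the compiler with the macros as set above.
std::string kernel_name() { return "fast"; }
#else
std::string kernel_name() { return "reference"; }
#endif
```

The same test cannot be written with a single #ifdef or #ifndef, since each of those directives examines exactly one macro.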

In fact, the pre-processor can do fairly sophisticated computations at compile time. However, many of the tasks that were formerly the domain of the pre-processor are now more readily accomplished within the C++ language itself through the use of templates and compile-time evaluation (constexpr).
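
As a small illustration of that last point, the sketch below contrasts a macro constant with its constexpr counterpart. All names are hypothetical, and the example assumes a compiler supporting C++11 or later.

```cpp
// Pre-processor style: untyped text that is pasted wherever the name appears.
#define MAX_ITERATIONS_MACRO 100

// Language style: a typed constant that obeys the usual C++ scoping rules.
constexpr int max_iterations = 100;

// A constexpr function can be evaluated by the compiler itself.
constexpr int square(int x) { return x * x; }

// static_assert is checked during compilation; a failure stops the build.
static_assert(square(max_iterations) == 10000, "evaluated at compile time");
static_assert(MAX_ITERATIONS_MACRO == max_iterations, "both spell the same value");
```

Unlike the macro, max_iterations and square() are seen and type-checked by the compiler, so mistakes in their use are reported in terms of the C++ language rather than in terms of substituted text.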