Part B - Foundations

Pointers and Arrays

Review pointer syntax
Examine the relationship between an array and a pointer
Improve type safety through the const qualifier

"C++ inherited pointers from C, so I couldn't remove them without causing serious compatibility problems" (Stroustrup, 1997)



We begin our exposition of the C++ language with syntax that is foundational to all of the data types in both the C and the C++ core languages. 

In this chapter, we review pointers and arrays, and identify the relationship between them.  Pointers refer to locations in memory where information is stored.  Arrays are the simplest data structures.  They hold data in a form that enables efficient access to their elements.  We show that array and pointer syntax is interchangeable for primitive types.  This interchangeability extends to the more complex compound data types introduced in the next chapter.  We also show how to guard against modifications to array elements in functions where arrays are received through parameters.


Pointers

Memory Map

We model the RAM of primary memory in terms of a linear map and use addresses on this map to identify bytes of information within memory.  For instance, in 512 MB of RAM, we use address 0 to identify the first byte and address 512 MB - 1 (that is, 536,870,911) to identify the last byte.

[Figure: memory map]

Pointer Syntax

A pointer is a variable that holds an address.  A pointer definition takes the form

 type* identifier;

where type is the type of the variable pointed to and identifier is the name of the variable that holds the address. 

To define a pointer, named pointer, that can hold the address of a variable of type double, we write

 double* pointer;

We may attach the * operator to the type, to the identifier, to both or to neither.  C programmers prefer identifier attachment

 double *pointer;

C++ programmers prefer type attachment 

 double* pointer;

To access the data at the address stored in pointer, we prefix the identifier with the * operator:

 *pointer

To access data using a pointer, we must know that data's address.  The compiler does not initialize pointers implicitly.  In the following example, we first assign the address of an existing variable, x, to pointer.  Once pointer points to a valid address, we can access the data at that address.

 // Initializing a Pointer
 // initialize.cpp

 #include <iostream>
 using namespace std;

 int main() {
     double x = 10.5;
     double* pointer;

     pointer = &x;

     cout << "The value stored in address " << pointer
          << " is " << *pointer << endl;
 }

If we omit the pointer assignment, pointer will hold an indeterminate address and dereferencing pointer will produce either meaningless output or a run-time error.


Multiple Declarations

To define several pointers in a single statement, we associate the * operator with each identifier:

 double* a,* b;

If we omit the * operator on the second identifier

 double* a, b;

the compiler will allocate memory for a pointer to a double (a) and memory for a double (b).  Since such a definition is open to misinterpretation, many programmers prefer to define each pointer in a separate statement on a separate line:

 double* a;
 double* b;

NULL Address

We call an uninitialized pointer a wild pointer.  A wild pointer does not point to anything in particular.  It is good practice to initialize any pointer that would otherwise be wild to NULL:

 // Initializing a Pointer - NULL
 // initialize.cpp

 #include <cstddef>   // defines NULL
 #include <iostream>
 using namespace std;

 int main() {
     double x = 10.5;
     double* pointer = NULL;

     pointer = &x;

     cout << "The value stored in address " << pointer
          << " is " << *pointer << endl;
 }

NULL is defined in a pre-processor directive as the null address, which by convention is address 0: the start of RAM in our memory map.  Several library header files, <cstddef> among them, include this definition.

By convention the start of memory does not hold any valid data or any valid program instruction.  On most platforms, any attempt to dereference that address generates a run-time error, which halts execution.  To demonstrate this, comment out the statement that assigns the address of x to pointer.  Run this modified code on platforms like Borland and matrix.


Arrays

An array stores its elements contiguously in memory; that is, without intervening space between adjacent elements.  All of the elements share a common data type and the name of the array holds the address of the start of the array. 


Subscript Notation

In array subscript notation the index refers to the offset into the array: 

 a[i]

The index counts elements, not bytes: a[i] refers to the element that is i elements beyond the start of the array.

To refer to this element in pointer notation, we add the index to the array's address and dereference the result:

 *(a + i)

Any expression that adds an integer, say i, to a pointer, say a, yields an address.  The result is the location in memory that is i elements beyond location a; that is, i times the size of the pointed-to type in bytes.  In other words, a + i is the address of element i.  So, *(a + i) accesses the data stored i elements beyond address a.

[Figure: pointer arithmetic]

Using Pointer Notation Instead of Array Notation

We may replace array subscript notation with equivalent pointer notation:

 int x;
 int a[] = {1, 2, 3};

 x = *a;      // stores 1 in x
 x = *(a + 1); // stores 2 in x
 x = *(a + 2); // stores 3 in x

We can use pointer addition to identify an offset from an array element:

 char str[] = "This is ABC123";
 char* s;

 s = &str[8];              // points to the 'A' in ABC123
 cout << (s + 3) << endl; // displays 123

Function Calls

If we pass the name of an array as an argument in a function call, the function receives the address of the array in the corresponding parameter.  That is, the parameter that receives the address is a pointer to the start of the array.

In function parameter declarations, array and pointer notations are equivalent: 

 type foo(type identifier[])

is equivalent to

 type foo(type* identifier)

For example, the function header

 void foo(int a[])

is equivalent to

 void foo(int* a)

[Caution: note that this equivalence does not extend to array definitions.  We cannot replace the definition of an array with a pointer definition.  The array definition allocates the required number of memory locations for the elements of the array.  A pointer definition only allocates a single memory location to hold one address.] 


Unmodifiable Arrays

In certain functions, we only need to use the values of the elements of an array without changing them.  If we inform the compiler that the contents of the array will not change within the function, the compiler can reject any code that attempts to change those contents. 

To bar a function from changing the contents of an array, we insert the keyword const before the parameter type of the pointer that receives the address of the array.  The compiler then treats the elements of the array as unmodifiable.

For example,

 // Unmodifiable Array
 // const.cpp

 #include <iostream>
 using namespace std;
 #define MAX 100
 void display(const int a[], int n);

 int main( ) {
     int i, a[MAX];

     for (i = 0; i < MAX; i++)
         a[i] = i * i;

     display (a, MAX);
 }

 // display the contents of a[n]
 void display(const int a[], int n) {
     int i;

     for (i = 0; i < n; i++)
         cout << "Element " << i + 1 << " of a is " << a[i] << endl;
 }

As an exercise, add to the display() function

 a[0] = 10;

re-compile and note the compile-time error message.

const Return Values

A function that receives the address of an unmodifiable array can return the address of the unmodifiable array.  We qualify such return values as read only with the const keyword. 

For example, to prohibit changes to the elements of the array whose address the subset() function below returns, we qualify its return data type as const

 // Returning An Address
 // return_const.cpp

 #include <iostream>
 using namespace std;
 #define MAX 11
 void display(const int a[], int n);
 const int* subset(const int* a, int i, int n);

 int main( ) {
     int i, a[MAX];

     for (i = 0; i < MAX; i++)
         a[i] = i * i;

     display(subset(a, 3, MAX), MAX - 3);
 }

 // display the contents of a[n]
 //
 void display(const int a[], int n) {
     int i;

     for (i = 0; i < n; i++)
         cout << "Element " << i + 1
              << " of a is " << a[i] << endl;
 }

 // subset of a[n]
 //
 const int* subset(const int* a, int i, int n) { 

     return i >= 0 && i < n ? &a[i] :
            i < 0 ? &a[0] : &a[n - 1];
 }

This program produces the following output:

 Element 1 of a is 9
 Element 2 of a is 16
 Element 3 of a is 25
 Element 4 of a is 36
 Element 5 of a is 49
 Element 6 of a is 64
 Element 7 of a is 81
 Element 8 of a is 100 



As an exercise, remove the const qualifier from the return data type, re-compile, and note the compile-time error.  On the Borland platform, the compiler reports:

Error E2034 return_const.cpp 34: Cannot convert 'const int * const' to
'int *' in function subset(const int *,int,int)
*** 1 errors in Compile ***

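The const keyword is necessary on the return data type whenever a function returns an address within an unmodifiable array.  Once one pointer to the data is qualified as const, every pointer that receives its address must be qualified as const too.  We say that const is viral.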

[Caution: We can return an address if it has been passed into a function through the parameter list, but we should not return the address of a variable or array defined within the function itself, since that local variable or array ceases to exist upon returning from the function.]


Unmodifiable Variables and Objects

We can qualify any data type as unmodifiable, not just an array.  We may apply the const keyword to any type.  Wherever we do not want to change the value of an initialized variable, its definition takes the form

 const type identifier = initialValue;

For example, to qualify a double that holds 3.14159 as unmodifiable, we write

 const double pi = 3.14159;

Any attempt to change the value of pi will trigger a compile-time error. 

Alternative to #define

const provides an alternative to the #define pre-processor directive.  For example, we can rewrite the const.cpp program above as follows:

 // const Instead of #define
 // const_alternative.cpp

 #include <iostream>
 using namespace std;
 const int MAX = 100;
 void display(const int a[], int n);

 int main( ) {
     int i, a[MAX];

     for (i = 0; i < MAX; i++)
         a[i] = i * i;

     display (a, MAX);
 }

 // display contents of a[n]
 void display(const int a[], int n) {
     int i;

     for (i = 0; i < n; i++)
         cout << "Element " << i + 1 << " of a is " << a[i] << endl;
 }

The advantages of declaring MAX as an unmodifiable variable are that we have control over its scope and that the definition is integrated into C++'s type system.


In-Class Exercise

As an exercise, determine the output of the walkthrough in the Handout on Pointers and Arrays.


Consistency (Optional)

Consistency issues arise with unmodifiable variables.  Defining some variables as unmodifiable while defining other related variables as modifiable may create security flaws. 

Consider the following program and identify its flaw:

 // Security Flaw - Not Recommended
 // security.cpp

 #include <cstddef>   // defines NULL
 #include <iostream>
 using namespace std;

 int main() {
     int i, choice;
     char  instruments[][15] = {"Stocks", "Bonds", "Treasury Bills"};
     int   highRisk[]        = {      70,      15,               15};
     int   lowRisk[]         = {      15,      50,               35};
     const int* profile      = NULL;

     cout << "Select Risk Profile (0 for High Risk, 1 for Low Risk) : ";
     cin >> choice;

     switch (choice) {
         case 0: profile = highRisk; break;
         case 1: profile = lowRisk;  break;
     }

     cout << "Recommended Portfolio" << endl;
     for (i = 0; i < 3; i++)
         cout << profile[i] << "% : " << instruments[i] << endl;
 }

Since profile is a pointer to unmodifiable ints, we cannot change profile[i] through profile itself.  However, highRisk and lowRisk are not qualified as const, so we can copy fresh values into highRisk[i] or lowRisk[i].  A backdoor therefore exists to changing the elements that profile points to.  The compiler lets us change highRisk as shown below, which is not something that we should allow:

 // Security Flaw - Modifying an Unmodifiable
 // security.cpp

 #include <cstddef>   // defines NULL
 #include <iostream>
 using namespace std;

 int main() {
     int i, choice;
     char  instruments[][15] = {"Stocks", "Bonds", "Treasury Bills"};
     int   highRisk[]        = {      70,      15,               15};
     int   lowRisk[]         = {      15,      50,               35};
     const int* profile      = NULL;

     cout << "Select Risk Profile (0 for High Risk, 1 for Low Risk) : ";
     cin >> choice;

     switch (choice) {
         case 0: profile = highRisk; break;
         case 1: profile = lowRisk;  break;
     }

     highRisk[0] = 0;
     highRisk[1] = 85;

     cout << "Recommended Portfolio" << endl;
     for (i = 0; i < 3; i++)
         cout << profile[i] << "% : " << instruments[i] << endl;
 }

To correct this flaw and to ensure that the compiler traps attempts to change values pointed to by profile, we qualify highRisk and lowRisk as unmodifiable also.  For completeness, we also qualify instruments as unmodifiable:

 // Security Flaw - Corrected
 // security.cpp

 #include <cstddef>   // defines NULL
 #include <iostream>
 using namespace std;

 int main() {
     int i, choice;
     const char instruments[][15] = {"Stocks", "Bonds", "Treasury Bills"};
     const int  highRisk[]        = {      70,      15,               15};
     const int  lowRisk[]         = {      15,      50,               35};
     const int* profile      = NULL;

     cout << "Select Risk Profile (0 for High Risk, 1 for Low Risk) : ";
     cin >> choice;

     switch (choice) {
         case 0: profile = highRisk; break;
         case 1: profile = lowRisk;  break;
         default:  // any other entry would leave profile NULL
             cout << "Invalid choice" << endl;
             return 1;
     }

     cout << "Recommended Portfolio" << endl;
     for (i = 0; i < 3; i++)
         cout << profile[i] << "% : " << instruments[i] << endl;
 }

Summary

  • we must initialize a pointer before we dereference it
  • it is good programming practice to initialize a wild pointer to NULL
  • the name of an array points to the start of the array
  • one-dimensional array notation and pointer notation are equivalent in parameter declarations
  • const identifies a data type as unmodifiable
  • const is a type-safe alternative to #define






  Designed by Chris Szalwinski
Creative Commons License