Workshop 2

BLAS

In this workshop, you write your own code for vector and matrix operations and compare your solution times to those of the GSL BLAS library.

Learning Outcomes

Upon successful completion of this workshop, you will have demonstrated the abilities to

• write code that implements linear-algebraic operations on vectors and matrices
• classify linear algebra operations according to the BLAS standard
• explain the function naming convention of the BLAS standard
• use the GNU Scientific Library to perform vector and matrix operations
• summarize what you have learned in this workshop

Specifications

Scientists and engineers express many of their computational problems in terms of vectors and matrices.  The vectors may contain a large number of elements, and the matrices that transform them may contain an even larger number of coefficients, organized into many rows of many elements each.  Common operations on these vectors and matrices include:

• dot product of two vectors of equal size
• product of a matrix and a matching vector
• product of one matrix with another matrix

We can code these operations directly or use libraries that have been optimized for efficient processing.
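For reference, the three operations can be written out as below. This is a sketch that assumes vectors x and y with n elements and square n by n matrices A and B, as in the programs used in this workshop; the symbols are illustrative and do not appear in the source code.

```latex
\begin{aligned}
x \cdot y  &= \sum_{i=1}^{n} x_i \, y_i \\
(A x)_i    &= \sum_{j=1}^{n} A_{ij} \, x_j, \qquad i = 1, \dots, n \\
(A B)_{ij} &= \sum_{k=1}^{n} A_{ik} \, B_{kj}, \qquad i, j = 1, \dots, n
\end{aligned}
```

Counted in multiply-add pairs, the work grows as n, n^2, and n^3 respectively, so the three timings should diverge quickly as n increases.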

Custom Version

Consider the following incomplete program.  Add your own code to perform the linear-algebraic calculations:

```
// Linear Algebra - Workshop 2
// w2.custom.cpp

#include <iostream>
#include <cstdlib>
#include <ctime>
#include <chrono>
using namespace std::chrono;

// fill a[0..n-1] with pseudo-random values in [0, 1]
void init(float* a, int n) {
    const float randf = 1.0f / (float) RAND_MAX;
    for (int i = 0; i < n; i++)
        a[i] = std::rand() * randf;
}

// report the elapsed time in milliseconds
void reportTime(const char* msg, steady_clock::duration span) {
    auto ms = duration_cast<milliseconds>(span);
    std::cout << msg << " - took - " <<
     ms.count() << " millisecs" << std::endl;
}

float sdot(int n, const float* a, const float* b) {
    // insert your custom code here
}

void sgemv(const float* a, int n, const float* v, float* w) {
    // insert your custom code here
}

void sgemm(const float* a, const float* b, int n, float* c) {
    // insert your custom code here
}

int main(int argc, char** argv) {

    // interpret command-line argument
    if (argc != 2) {
        std::cerr << argv[0] << ": invalid number of arguments\n";
        std::cerr << "Usage: " << argv[0] << " size_of_matrices\n";
        return 1;
    }
    int n = std::atoi(argv[1]);
    steady_clock::time_point ts, te;
    float* v = new float[n];
    float* w = new float[n];
    float* a = new float[n * n];
    float* b = new float[n * n];
    float* c = new float[n * n];

    // initialization
    std::srand(std::time(nullptr));
    ts = steady_clock::now();
    init(a, n * n);
    init(b, n * n);
    init(v, n);
    init(w, n);
    te = steady_clock::now();
    reportTime("initialization         ", te - ts);

    // vector-vector - dot product of v and w
    ts = steady_clock::now();
    sdot(n, v, w);
    te = steady_clock::now();
    reportTime("vector-vector operation", te - ts);

    // matrix-vector - product of a and v
    ts = steady_clock::now();
    sgemv(a, n, v, w);
    te = steady_clock::now();
    reportTime("matrix-vector operation", te - ts);

    // matrix-matrix - product of a and b
    ts = steady_clock::now();
    sgemm(a, b, n, c);
    te = steady_clock::now();
    reportTime("matrix-matrix operation", te - ts);

    delete [] v;
    delete [] w;
    delete [] a;
    delete [] b;
    delete [] c;
}
```

Compile and link your completed program using version 7.2 or higher of the GCC compiler with the -O2 optimization switch.  Version 7.2.0 is installed in matrix's local system directory and is accessible through the following Makefile:

```
# Makefile for w2
#
VER=custom
GCC_VERSION = 7.2.0
PREFIX = /usr/local/gcc/${GCC_VERSION}/bin/
CC = ${PREFIX}gcc
CPP = ${PREFIX}g++

w2.${VER}: w2.${VER}.o
	$(CPP) -ow2.${VER} w2.${VER}.o

w2.${VER}.o: w2.${VER}.cpp
	$(CPP) -c -O2 -std=c++17 w2.${VER}.cpp

clean:
	rm *.o
```

To execute this Makefile, enter the command

 ` > make`

To run the executable, enter the command

 ` > w2.custom 10`

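The command-line argument is the size n of the vectors (n elements) and of the matrices (n by n).  In this example, the vectors have 10 elements and the matrices are 10 by 10.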

Test Runs

Tabulate your timing statistics for the following sizes (n):

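| n | initialization | Custom Level 1 | Custom Level 2 | Custom Level 3 |
|---|---|---|---|---|
| 500 | | | | |
| 1000 | | | | |
| 1500 | | | | |
| 2000 | | | | |
| 2500 | | | | |
| 3000 | | | | |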

BLAS Version

The following incomplete program is a copy of the source code listed above with the linear-algebra functions removed.  Insert into this code calls to the GSL cblas implementation of the BLAS standard.

```
// Linear Algebra - Workshop 2
// w2.cblas.cpp

#include <iostream>
#include <cstdlib>
#include <ctime>
#include <chrono>
using namespace std::chrono;

// fill a[0..n-1] with pseudo-random values in [0, 1]
void init(float* a, int n) {
    const float randf = 1.0f / (float) RAND_MAX;
    for (int i = 0; i < n; i++)
        a[i] = std::rand() * randf;
}

// report the elapsed time in milliseconds
void reportTime(const char* msg, steady_clock::duration span) {
    auto ms = duration_cast<milliseconds>(span);
    std::cout << msg << " - took - " <<
     ms.count() << " millisecs" << std::endl;
}

int main(int argc, char* argv[]) {

    // interpret command-line argument
    if (argc != 2) {
        std::cerr << argv[0] << ": invalid number of arguments\n";
        std::cerr << "Usage: " << argv[0] << " size_of_matrices\n";
        return 1;
    }
    int n = std::atoi(argv[1]);
    steady_clock::time_point ts, te;
    float* v = new float[n];
    float* w = new float[n];
    float* a = new float[n * n];
    float* b = new float[n * n];
    float* c = new float[n * n];

    // initialization
    std::srand(std::time(nullptr));
    ts = steady_clock::now();
    init(a, n * n);
    init(b, n * n);
    init(v, n);
    init(w, n);
    te = steady_clock::now();
    reportTime("initialization", te - ts);

    // vector-vector - dot product of v and w
    ts = steady_clock::now();
    // add call to cblas here
    te = steady_clock::now();
    reportTime("vector-vector operation", te - ts);

    // matrix-vector - product of a and v
    ts = steady_clock::now();
    // add call to cblas here
    te = steady_clock::now();
    reportTime("matrix-vector operation", te - ts);

    // matrix-matrix - product of a and b
    ts = steady_clock::now();
    // add call to cblas here
    te = steady_clock::now();
    reportTime("matrix-matrix operation", te - ts);

    delete [] v;
    delete [] w;
    delete [] a;
    delete [] b;
    delete [] c;
}
```

You can find information about the arguments to the cblas functions on the course Wiki.
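As a rough guide to those arguments, it may help to know that the BLAS standard defines the Level 2 and Level 3 routines in a more general form than the plain products used in this workshop. In the summary below, the scalars α and β and the optional transpose op(·) are the usual BLAS parameters; the symbols x, y, A, B, C and r are illustrative. Choosing α = 1, β = 0 and no transpose recovers the plain products.

```latex
\begin{aligned}
\texttt{sdot}  &: \quad r \leftarrow x^{T} y && \text{(Level 1, vector-vector)} \\
\texttt{sgemv} &: \quad y \leftarrow \alpha \, \mathrm{op}(A) \, x + \beta \, y && \text{(Level 2, matrix-vector)} \\
\texttt{sgemm} &: \quad C \leftarrow \alpha \, \mathrm{op}(A) \, \mathrm{op}(B) + \beta \, C && \text{(Level 3, matrix-matrix)}
\end{aligned}
```

The names follow the BLAS convention: the leading `s` stands for single precision, `ge` for a general matrix, and `mv` or `mm` for a matrix-vector or matrix-matrix product. The cblas interface adds the prefix `cblas_` to each name, as in `cblas_sdot`.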

Compile your code for this GSL-BLAS version and run it for the sizes listed below.  Record your results for initialization, level 1, level 2, and level 3.  Copy the Level 3 values from the runs of your custom code into the last column of this table.
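Building the cblas version needs the GSL cblas header (`gsl/gsl_cblas.h`) in the source file and the GSL cblas library at link time. The Makefile fragment below is a sketch only: it assumes that GSL is installed where the compiler and linker search by default and that a plain `g++` is on your path. If GSL is installed elsewhere, you would also need matching `-I` and `-L` options.

```
# Sketch only: assumes GSL is on the default include and library paths
VER = cblas
CPP = g++

w2.${VER}: w2.${VER}.o
	$(CPP) -ow2.${VER} w2.${VER}.o -lgslcblas

w2.${VER}.o: w2.${VER}.cpp
	$(CPP) -c -O2 -std=c++17 w2.${VER}.cpp
```

The `-lgslcblas` option comes after the object file so that the linker can resolve the cblas symbols that the object file references.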

Results

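| n | initialization | Level 1 | Level 2 | Level 3 | Custom Level 3 |
|---|---|---|---|---|---|
| 500 | | | | | |
| 1000 | | | | | |
| 1500 | | | | | |
| 2000 | | | | | |
| 2500 | | | | | |
| 3000 | | | | | |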

Store the complete set of results for both custom and BLAS solutions in a spreadsheet file named w2.ods or w2.xls.  Prepare a 3D column chart with a realistic look, showing the times in each column against n along the horizontal axis, as shown below.  You can create the chart in OpenOffice using the following steps:

• Highlight data and labels
• Select Chart in the Toolbar
• Chart Type - check 3D Look Realistic Column
• Data Range - 1st row as label, 1st column as label
• Chart Elements - add title, subtitle, axes labels

You can create the chart in Excel using the following steps:

• Select Insert Tab -> Column -> 3D Clustered Column
• Select Data -> remove n -> select edit on horizontal axis labels -> add n column (500-3000)
• Select Chart tools -> Layout -> Chart Title - enter title and subtitle
• Select Chart tools -> Layout -> Axis Titles -> Select axis - enter axis label

Submission

Create a matrix typescript named w2.txt that includes

• a listing of your source code - both custom and cblas versions
• a compilation and linking of your source code - both versions
• a run of the executable for each problem size - both versions  