AMATH 483 / 583 (Roche) - Homework Set 4

Due Friday May 2, 5pm PT

April 25, 2025

Homework 4 (90 points)

1. (+10) Given matrix A, evaluate the action of A on the unit balls of R^2 defined by the 1-norm, 2-norm, and ∞-norm (induced matrix norms). Submit your work and drawings.

2. (+20) Compiler Optimization of Matrix Multiplication Loop Permutations. Implement C++ templated gemm, C ← αAB + βC (A ∈ T^{m×p}, B ∈ T^{p×n}, C ∈ T^{m×n}, α, β ∈ T) for {kij} and {jki} loop permutations using the specifications provided here:

• template <typename T>
  void mm_jki(T a, const std::vector<T> &A, const std::vector<T> &B, T b,
              std::vector<T> &C, int m, int p, int n);

• template <typename T>
  void mm_kij(T a, const std::vector<T> &A, const std::vector<T> &B, T b,
              std::vector<T> &C, int m, int p, int n);

You will explore matrix multiply performance by applying compiler optimization levels -O0 and -O3 to these functions for square matrices of dimension n = 2 to n = 512, stride one. For reference, recall that loop order {ijk} means i is the outer loop, j the middle loop, and k the inner loop. Measure each n ntrial times (ntrial ≥ 3) and plot the average performance in FLOPs (flop count / time in seconds) for each case versus n. Submit plots for both permutation variants that include both FP32 (float) and FP64 (double) compiler optimization results.
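
One possible shape for the {kij} variant and its timing loop is sketched below. It assumes row-major storage (the specification above does not state the layout), counts only the 2n^3 multiply-add flops of the product, and averages as total flops over total time. The helper `average_flops_kij` is a hypothetical name, not part of the assignment.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Sketch of the {kij} permutation: k outer, i middle, j inner.
// Assumption: row-major storage, so A(i,k) = A[i*p + k].
template <typename T>
void mm_kij(T a, const std::vector<T> &A, const std::vector<T> &B, T b,
            std::vector<T> &C, int m, int p, int n) {
    assert(A.size() == static_cast<std::size_t>(m) * p);
    assert(B.size() == static_cast<std::size_t>(p) * n);
    assert(C.size() == static_cast<std::size_t>(m) * n);

    // Apply beta once, before accumulating alpha * A * B.
    for (auto &c : C) c *= b;

    for (int k = 0; k < p; ++k) {
        for (int i = 0; i < m; ++i) {
            const T aik = a * A[i * p + k];  // fixed for the whole inner loop
            for (int j = 0; j < n; ++j) {
                C[i * n + j] += aik * B[k * n + j];  // unit stride in C and B
            }
        }
    }
}

// Hypothetical helper: FLOP/s for n x n matrices over ntrial runs.
// For very small n the elapsed time may round to zero on a coarse clock.
template <typename T>
double average_flops_kij(int n, int ntrial) {
    const std::size_t count = static_cast<std::size_t>(n) * n;
    std::vector<T> A(count, T(1)), B(count, T(2)), C(count, T(0));
    double total_seconds = 0.0;
    for (int t = 0; t < ntrial; ++t) {
        const auto start = std::chrono::steady_clock::now();
        mm_kij(T(1), A, B, T(0), C, n, n, n);
        const auto stop = std::chrono::steady_clock::now();
        total_seconds += std::chrono::duration<double>(stop - start).count();
    }
    const double flops_per_call = 2.0 * n * n * n;  // assumption: product only
    return flops_per_call * ntrial / total_seconds;
}
```

Compiling the same source once with -O0 and once with -O3, and calling the helper for `float` and `double`, would give the four curves per permutation. The {jki} variant differs only in loop nesting, which changes which arrays are accessed with unit stride.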

3. (+20) Row major Matrix class. Reference the file matrix_class.hpp for the starter code for a Matrix class template for row major index referencing and std::vector for the matrix storage scheme. Please put your function implementations in file matrix_class.hpp, and submit matrix_class.hpp.

• (+5) Matrix transpose for A ∈ R^{m×n} is defined (A^T)_{i,j} = A_{j,i}, and so A^T ∈ R^{n×m}. This method returns a matrix as defined by the class.

— Matrix transpose() const{}

• (+5) Matrix infinity norm for A ∈ R^{m×n} is defined ||A||_∞ = max_{1≤i≤m} Σ_{j=1}^{n} |A_{i,j}|. This method returns a number.

— T infinityNorm() const{}

• (+5) Write the method that overloads the multiplication operator * for the matrix class I provided. This method returns a matrix as defined by the class.

— Matrix operator*(const Matrix &other) const{}

• (+5) Write the method that overloads the addition operator + for the matrix class I provided. This method returns a matrix as defined by the class.

— template <typename T>
  Matrix<T> Matrix<T>::operator+(const Matrix<T> &other) const{}
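
The four methods above can be sketched on a small stand-in class. The starter file is not reproduced here, so the class name `RowMajorMatrix`, its constructor, and its accessors are all hypothetical; only the method names `transpose` and `infinityNorm` and the two operators come from the list above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the starter class; element (i,j) lives at i*cols + j.
template <typename T>
class RowMajorMatrix {
public:
    RowMajorMatrix(int rows, int cols)
        : rows_(rows), cols_(cols),
          data_(static_cast<std::size_t>(rows) * cols, T(0)) {}

    int rows() const { return rows_; }
    int cols() const { return cols_; }
    T &operator()(int i, int j) { return data_[index(i, j)]; }
    const T &operator()(int i, int j) const { return data_[index(i, j)]; }

    // (A^T)(j,i) = A(i,j); the result is cols x rows.
    RowMajorMatrix transpose() const {
        RowMajorMatrix result(cols_, rows_);
        for (int i = 0; i < rows_; ++i)
            for (int j = 0; j < cols_; ++j)
                result(j, i) = (*this)(i, j);
        return result;
    }

    // Maximum absolute row sum.
    T infinityNorm() const {
        T best = T(0);
        for (int i = 0; i < rows_; ++i) {
            T row_sum = T(0);
            for (int j = 0; j < cols_; ++j) row_sum += std::abs((*this)(i, j));
            best = std::max(best, row_sum);
        }
        return best;
    }

    // Matrix product; inner dimensions must agree.
    RowMajorMatrix operator*(const RowMajorMatrix &other) const {
        assert(cols_ == other.rows_);
        RowMajorMatrix result(rows_, other.cols_);
        for (int i = 0; i < rows_; ++i)
            for (int k = 0; k < cols_; ++k)
                for (int j = 0; j < other.cols_; ++j)
                    result(i, j) += (*this)(i, k) * other(k, j);
        return result;
    }

    // Elementwise sum; shapes must match.
    RowMajorMatrix operator+(const RowMajorMatrix &other) const {
        assert(rows_ == other.rows_ && cols_ == other.cols_);
        RowMajorMatrix result(rows_, cols_);
        for (std::size_t idx = 0; idx < data_.size(); ++idx)
            result.data_[idx] = data_[idx] + other.data_[idx];
        return result;
    }

private:
    std::size_t index(int i, int j) const {
        return static_cast<std::size_t>(i) * cols_ + j;
    }
    int rows_, cols_;
    std::vector<T> data_;
};
```

The sketch signals dimension mismatches with `assert`; the starter code may expect a different error-handling convention, such as throwing an exception.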

4. (+10) Extremum. Consider the surface defined by xy + 2xz = 5√5 with (x, y, z) ∈ R^3. Find (a) the coordinate instance(s) associated with the minimum distance from a point on the surface to the origin, and (b) the value of the minimum distance. You may safely assume that the domain [−3, 3] × [−3, 3] × [−3, 3] ⊂ R^3 contains the correct coordinate instance(s).
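
One way to set this up, offered as a sketch and not as the required method, is with a Lagrange multiplier: minimize the squared distance subject to the surface constraint. The multiplier symbol λ and the function name 𝓛 are introduced here for illustration.

```latex
\begin{align*}
\mathcal{L}(x,y,z,\lambda) &= x^2 + y^2 + z^2 - \lambda\,\bigl(xy + 2xz - 5\sqrt{5}\bigr) \\
\frac{\partial \mathcal{L}}{\partial x} = 0 &\;\Rightarrow\; 2x = \lambda\,(y + 2z) \\
\frac{\partial \mathcal{L}}{\partial y} = 0 &\;\Rightarrow\; 2y = \lambda\,x \\
\frac{\partial \mathcal{L}}{\partial z} = 0 &\;\Rightarrow\; 2z = 2\lambda\,x \\
\frac{\partial \mathcal{L}}{\partial \lambda} = 0 &\;\Rightarrow\; xy + 2xz = 5\sqrt{5}
\end{align*}
```

The second and third conditions give z = 2y, which reduces the constraint to 5xy = 5√5. The remaining unknowns then follow from the first condition, and any candidate point can be checked against the stated domain.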

5. (+10) IO bandwidth. Write a C++ function that writes square matrices of type double, in column major order, to a file in binary. Measure the time required to complete the write for matrices of dimension 32, 64, 128, ... 16384 (2GB). Write a C++ function that reads binary matrices from a file into type double matrices in memory. Measure the time required to complete each read for the same dimensions. (a) Make a single plot of the read and write measurements with the bandwidth (bytes per second) on the y-axis, and the problem dimension on the x-axis. Submit your plot.
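
A minimal sketch of the write and read measurements follows, assuming the matrix is already held column-major in a contiguous std::vector so that a single raw write preserves the order. The function names and signatures are hypothetical, and the file open and close are included in the timed region by choice.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Writes an n x n column-major matrix as raw doubles; returns elapsed seconds.
double write_matrix_binary(const std::string &path,
                           const std::vector<double> &data, int n) {
    assert(data.size() == static_cast<std::size_t>(n) * n);
    const auto start = std::chrono::steady_clock::now();
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char *>(data.data()),
              static_cast<std::streamsize>(data.size() * sizeof(double)));
    out.close();  // flush the stream buffer before stopping the clock
    const auto stop = std::chrono::steady_clock::now();
    if (!out) throw std::runtime_error("write failed: " + path);
    return std::chrono::duration<double>(stop - start).count();
}

// Reads an n x n matrix of raw doubles into data; returns elapsed seconds.
double read_matrix_binary(const std::string &path,
                          std::vector<double> &data, int n) {
    data.resize(static_cast<std::size_t>(n) * n);  // allocate outside the timer
    const auto start = std::chrono::steady_clock::now();
    std::ifstream in(path, std::ios::binary);
    in.read(reinterpret_cast<char *>(data.data()),
            static_cast<std::streamsize>(data.size() * sizeof(double)));
    const auto stop = std::chrono::steady_clock::now();
    if (!in) throw std::runtime_error("read failed: " + path);
    return std::chrono::duration<double>(stop - start).count();
}

// Bandwidth in bytes per second for one n x n transfer of doubles.
double bandwidth_bytes_per_second(int n, double seconds) {
    return static_cast<double>(n) * n * sizeof(double) / seconds;
}
```

Closing the stream flushes the C++ buffer but does not force the data to disk, and a read that follows a write of the same file may be served from the operating system's page cache. Both effects can inflate the measured bandwidth, which is worth keeping in mind when interpreting the plot.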

6. (+20) File access time. (a) Write C++ functions for the given function declarations that perform row and column swap operations on a type double matrix stored in a file in column major index order. Test the swapping capabilities for correctness. Put the functions you write in file file_swaps.hpp. (b) Conduct a performance test for square matrix dimensions 16, 32, 64, 128, ... 8192, measuring the time required to conduct file-based row and column swaps separately. Let each operation be measured ntrial times, ntrial ≥ 3. Make a single plot of the row and column swap average times on the y-axis (log10 (time)) and the problem dimension on the x-axis. Submit your header file file_swaps.hpp and plot. You may find the code snippet helpful.
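
The access patterns behind the two swaps can be sketched as follows. The function declarations referred to above are not reproduced in this text, so the names and signatures here are hypothetical. In column-major order, element (i, j) sits at byte offset (j * nRows + i) * sizeof(double), so a column is one contiguous block while a row is spread across nCols separate locations.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical signature: swap columns j1 and j2 of a column-major matrix on disk.
// Each column is contiguous, so two block reads and two block writes suffice.
void swap_cols_in_file(std::fstream &file, int nRows, int nCols, int j1, int j2) {
    assert(j1 >= 0 && j1 < nCols && j2 >= 0 && j2 < nCols);
    if (j1 == j2) return;
    const std::streamoff col_bytes =
        static_cast<std::streamoff>(nRows) * static_cast<std::streamoff>(sizeof(double));
    std::vector<double> c1(nRows), c2(nRows);
    file.seekg(j1 * col_bytes);
    file.read(reinterpret_cast<char *>(c1.data()), col_bytes);
    file.seekg(j2 * col_bytes);
    file.read(reinterpret_cast<char *>(c2.data()), col_bytes);
    file.seekp(j1 * col_bytes);
    file.write(reinterpret_cast<const char *>(c2.data()), col_bytes);
    file.seekp(j2 * col_bytes);
    file.write(reinterpret_cast<const char *>(c1.data()), col_bytes);
    file.flush();
}

// Hypothetical signature: swap rows i1 and i2. Row elements are nRows doubles
// apart in the file, so this sketch seeks once per element.
void swap_rows_in_file(std::fstream &file, int nRows, int nCols, int i1, int i2) {
    assert(i1 >= 0 && i1 < nRows && i2 >= 0 && i2 < nRows);
    if (i1 == i2) return;
    const std::streamoff elem = static_cast<std::streamoff>(sizeof(double));
    for (int j = 0; j < nCols; ++j) {
        const std::streamoff base = static_cast<std::streamoff>(j) * nRows;
        const std::streamoff off1 = (base + i1) * elem;
        const std::streamoff off2 = (base + i2) * elem;
        double v1 = 0.0, v2 = 0.0;
        file.seekg(off1);
        file.read(reinterpret_cast<char *>(&v1), elem);
        file.seekg(off2);
        file.read(reinterpret_cast<char *>(&v2), elem);
        file.seekp(off1);
        file.write(reinterpret_cast<const char *>(&v2), elem);
        file.seekp(off2);
        file.write(reinterpret_cast<const char *>(&v1), elem);
    }
    file.flush();
}
```

Both functions expect a stream opened with `std::ios::in | std::ios::out | std::ios::binary`. The difference in seek counts (a constant number for columns, proportional to nCols for rows) is one plausible reason the two timing curves would separate as the dimension grows.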