Commit 4661ceac authored by Mactavish
update slides

parent 18977ef2
@@ -13,32 +13,28 @@ from the OpenMP runtime library
History: Written by Tim Mattson, 11/99.
*/
#include <omp.h>
#include <stdio.h>

static long num_steps = 100000000;
double step;

int main() {
  int i;
  double x, pi, sum = 0.0;
  double start_time, run_time;

  step = 1.0 / (double)num_steps;
  start_time = omp_get_wtime();
  for (i = 1; i <= num_steps; i++) {
    x = (i - 0.5) * step;            // x of the middle of the rectangle
    sum = sum + 4.0 / (1.0 + x * x); // sum up the heights of the rectangles
  }
  pi = step * sum; // area of the rectangles
  run_time = omp_get_wtime() - start_time;
  printf("\n pi with %ld steps is %lf in %lf seconds\n", num_steps, pi,
         run_time);
}
slides/images/Petri-Net-Figure.jpg (53.6 KiB)
slides/images/Petri-Nets-basics.png (88.5 KiB)
slides/images/Petri-Nets-structures.png (126 KiB)
slides/images/Petri-Nets-types.png (89.2 KiB)
slides/images/false-sharing.png (118 KiB)
@@ -34,7 +34,7 @@ Using Foster's Design Methodology.
</v-clicks>
---
title: OpenMP I
---
# OpenMP
@@ -45,7 +45,7 @@ An API for Writing Multithreaded Applications.
- Greatly simplifies writing multi-threaded programs in C/C++, Fortran
- Standardized
<br/>
OpenMP is a multi-threaded, shared-address-space programming model.
### Assumptions
@@ -57,3 +57,119 @@ Details about OpenMP support in compilers can be found with the following links:
- [Clang](https://clang.llvm.org/docs/OpenMPSupport.html)
You can also take a look at [OpenMP reference cards](https://www.openmp.org/resources/refguides/).
---
title: OpenMP II
---
## Exercise-1
Your first OpenMP program.
Finish the `hello.c` in `exercises/OpenMP`.
### Hint
Use the `#pragma omp parallel` directive to create a parallel construct.
Find and use the suitable function declared in `<omp.h>`.
---
title: OpenMP III
---
## Exercise-2
Try to parallelize the program that calculates the integral:
$$
\int_{0}^{1} \frac {4.0} {(1 + x^2)} \,dx = \pi
$$
Use the classical midpoint approximation: sum the areas of the rectangles below the curve.
Create a parallel version of the sequential pi program with a parallel construct, following the **SPMD (Single Program Multiple Data)** pattern.
See `pi.c` in `exercises/OpenMP`.
### Hint
In addition to a parallel construct, you will need the runtime library routines:
- `int omp_get_num_threads();` - number of threads in the team
- `int omp_get_thread_num();` - thread ID or rank
- `double omp_get_wtime();` - time in seconds elapsed since a fixed point in the past
---
title: OpenMP IV
---
### Solution with SPMD
See live demo.
This pattern is very general and has been used to support most (if not all) of the algorithm strategy patterns.
### Problem with SPMD
If independent data elements happen to sit on the same cache line, each update will cause the cache line to “slosh back and forth” between threads -- this is called _false sharing_.
<div class="container flex justify-center">
<img src="/images/false-sharing.png" class="block w-lg"/>
</div>
The result is correct, but performance is horrible because the cache line bounces back and forth between threads.
---
title: OpenMP V
---
## Exercise-3
Try to refactor the program from **exercise-2** using **synchronization** in OpenMP.
Synchronization in OpenMP:
- **barrier**: `#pragma omp barrier`
- **critical**: `#pragma omp critical`
- **atomic**: `#pragma omp atomic`, but only for simple updates to a scalar, such as `x++`, `x--`, `x += expr`, `x -= expr`, etc.
- _ordered_
- _flush_
- _locks_
---
title: OpenMP VI
---
## Worksharing
A parallel construct by itself creates an SPMD program, i.e., each thread redundantly executes the same code.
How do you split up pathways through the code between threads within a team? -- via _worksharing_
- **Loop construct**: `#pragma omp for` with schedule clauses (affects how loop iterations are mapped onto threads)
- `schedule(static [,chunk])`
  - `schedule(dynamic [,chunk])`
  - `schedule(guided [,chunk])`
- `schedule(runtime)`
- `schedule(auto)`
- Sections/section construct: `#pragma omp sections` and `#pragma omp section`
- Single construct: `#pragma omp single`
- _Task construct_
---
title: OpenMP VII
---
## Exercise-4
Try to parallelize the original serial pi program with a loop construct in OpenMP.
### Hints
- loop index `i` is private by default
- Use reduction: `reduction(op : list)`
- A local copy of each list variable is made and initialized depending on the “op” (e.g. 0 for “+”).
- The variables in “list” must be shared in the enclosing parallel region.
- Updates occur on the local copy
- Local copies are reduced into a single value and combined with the original global value