diff --git a/README.md b/README.md index 6b67e9d6fb9f13b8bcdccac669990a33b2b73624..cc76a9d06e91552617340610e090288860566471 100644 --- a/README.md +++ b/README.md @@ -1,9 +1,9 @@ -# ALP4 Tutorial-8 +# ALP4 Tutorial-9 -This branch contains all materials for the 8th tutorial session. +This branch contains all materials for the 9th tutorial session. ## Agenda -- Assignment's solution presentation (if any) -- Recap & Discussion: Parallelism, MPI, Evaluation, Amdahl's Law +- Assignment's solution presentation +- Recap & Discussion: Java - Q&A diff --git a/slides/images/MPI-all-to-all.png b/slides/images/MPI-all-to-all.png deleted file mode 100644 index e19075022d254bc552cf85541680f4f9f649e39a..0000000000000000000000000000000000000000 Binary files a/slides/images/MPI-all-to-all.png and /dev/null differ diff --git a/slides/images/MPI-broadcast.png b/slides/images/MPI-broadcast.png deleted file mode 100644 index f4b9271876a9912937dfa6b8bb7a5e25ac20f67f..0000000000000000000000000000000000000000 Binary files a/slides/images/MPI-broadcast.png and /dev/null differ diff --git a/slides/images/MPI-gather-to-all.png b/slides/images/MPI-gather-to-all.png deleted file mode 100644 index 8993c31f2fae0756a1e634f09326f7d87451734d..0000000000000000000000000000000000000000 Binary files a/slides/images/MPI-gather-to-all.png and /dev/null differ diff --git a/slides/images/MPI-gather.png b/slides/images/MPI-gather.png deleted file mode 100644 index f070b09f22ed809a4b2b171800848fe8ed3c5970..0000000000000000000000000000000000000000 Binary files a/slides/images/MPI-gather.png and /dev/null differ diff --git a/slides/images/MPI-global-reduction.png b/slides/images/MPI-global-reduction.png deleted file mode 100644 index 6edd08bfd387c5fbec0d31b759366d236f12e6bc..0000000000000000000000000000000000000000 Binary files a/slides/images/MPI-global-reduction.png and /dev/null differ diff --git a/slides/images/MPI-predefined-ops.png b/slides/images/MPI-predefined-ops.png deleted file mode 100644 index 56bcfe8d8592812a075dc93981549eac51d8a5f1..0000000000000000000000000000000000000000 Binary files a/slides/images/MPI-predefined-ops.png and /dev/null differ diff --git a/slides/images/MPI-scatter.png b/slides/images/MPI-scatter.png deleted file mode 100644 index ef7d1c18ea05e5cfc2e7780a7b7e89bc1b56db1b..0000000000000000000000000000000000000000 Binary files a/slides/images/MPI-scatter.png and /dev/null differ diff --git a/slides/images/agglomeration-mapping.png b/slides/images/agglomeration-mapping.png deleted file mode 100644 index cfaa4430fb91384b80e092e64f025d9858101475..0000000000000000000000000000000000000000 Binary files a/slides/images/agglomeration-mapping.png and /dev/null differ diff --git a/slides/images/communication.png b/slides/images/communication.png deleted file mode 100644 index c24ef475c7e887ff3df0c5dae4543c8eb0850c3e..0000000000000000000000000000000000000000 Binary files a/slides/images/communication.png and /dev/null differ diff --git a/slides/images/do-8-10.png b/slides/images/do-8-10.png deleted file mode 100644 index 41521202b86a2174cef7f2b329a8bc1e64fdcdb7..0000000000000000000000000000000000000000 Binary files a/slides/images/do-8-10.png and /dev/null differ diff --git a/slides/images/execution-model.png b/slides/images/execution-model.png deleted file mode 100644 index e523e97c0938d61b471a0b003a7d4f3ae8c3cf5a..0000000000000000000000000000000000000000 Binary files a/slides/images/execution-model.png and /dev/null differ diff --git a/slides/images/fosters-model.png b/slides/images/fosters-model.png deleted file mode 100644 index f823bd10e847d25a01d82718031958d7a9ff0db1..0000000000000000000000000000000000000000 Binary files a/slides/images/fosters-model.png and /dev/null differ diff --git a/slides/images/fri-12-14.png b/slides/images/fri-12-14.png deleted file mode 100644 index 33f6f80b382485031fbb04247540993320bd56ff..0000000000000000000000000000000000000000 Binary files a/slides/images/fri-12-14.png and /dev/null differ diff --git a/slides/images/how-java-program-runs.webp b/slides/images/how-java-program-runs.webp new file mode 100644 index 0000000000000000000000000000000000000000..db2682f15439c2d060148b26c4894be4c799f5b6 Binary files /dev/null and b/slides/images/how-java-program-runs.webp differ diff --git a/slides/images/inf-runtime.png b/slides/images/inf-runtime.png deleted file mode 100644 index ac37b46b1f5c85a3fa18f40cedfe1760bbcef50e..0000000000000000000000000000000000000000 Binary files a/slides/images/inf-runtime.png and /dev/null differ diff --git a/slides/images/jdk-jre-jvm.webp b/slides/images/jdk-jre-jvm.webp new file mode 100644 index 0000000000000000000000000000000000000000..bd1b88cddc1b432401c8cedd39be1e2e6e9e7fc0 Binary files /dev/null and b/slides/images/jdk-jre-jvm.webp differ diff --git a/slides/images/jdk.webp b/slides/images/jdk.webp new file mode 100644 index 0000000000000000000000000000000000000000..2412662eb47266a32aa153ec324fe1968094b7b2 Binary files /dev/null and b/slides/images/jdk.webp differ diff --git a/slides/images/jre.webp b/slides/images/jre.webp new file mode 100644 index 0000000000000000000000000000000000000000..3dcc762ba3c183c93cf05a15ec346229e9b6e856 Binary files /dev/null and b/slides/images/jre.webp differ diff --git a/slides/images/p-runtime.png b/slides/images/p-runtime.png deleted file mode 100644 index 663bebe5c4d7bd039197f5c8e20b79af945ed50d..0000000000000000000000000000000000000000 Binary files a/slides/images/p-runtime.png and /dev/null differ diff --git a/slides/images/partition.png b/slides/images/partition.png deleted file mode 100644 index 77f464ce43632ef6021106eb56b7d2cb5143e246..0000000000000000000000000000000000000000 Binary files a/slides/images/partition.png and /dev/null differ diff --git a/slides/images/runtime.png b/slides/images/runtime.png deleted file mode 100644 index de29bb7564308c98d41fbbc0f8f500f692e0b2f5..0000000000000000000000000000000000000000 Binary files a/slides/images/runtime.png and /dev/null differ diff --git a/slides/pages/qa.md b/slides/pages/qa.md index f7fbfe553499e52fa9d1aa3859969876f13e5557..c79648f0704529925983bc64525604f7d78a8d2b 100644 --- a/slides/pages/qa.md +++ b/slides/pages/qa.md @@ -6,15 +6,9 @@ title: Q&A Any questions about: -- Sixth Assignment Sheet - Seventh Assignment Sheet +- Eighth Assignment Sheet - Topics from the lectures - Organisation <br/> - -## References - -- [Rice - COMP425 on Amdahl's Law](https://www.clear.rice.edu/comp425/slides/L06.pdf) -- [Oslo - Parallel Algorithm Design](https://www.uio.no/studier/emner/matnat/ifi/INF3380/v11/undervisningsmateriale/inf3380-week09-2011.pdf) -- [Monitors: An Operating System Structuring Concept](https://dl.acm.org/doi/pdf/10.1145/355620.361161) diff --git a/slides/pages/recap.md b/slides/pages/recap.md index 0262cb7f93f1c836db61a9781c86b12b8edd0136..34df5da6bb393d2c871417d452d04fba015b684f 100644 --- a/slides/pages/recap.md +++ b/slides/pages/recap.md @@ -1,530 +1,74 @@ --- -title: Recap I ---- - -# Recap & Discussion - -### MPI vs OpenMP - -<br/> - -- What are the differences between MPI and OpenMP? -- What are the advantages and disadvantages of each? -- Can we combine them? - ---- -title: Recap II ---- - -## Monitor - -Monitor is construct that **contains** shared data structures, operations, and synchronization between concurrent procedure calls. - -Monitors provide a **high level** of synchronization between processes. - -<v-click> - -### Components of Monitor - -</v-click> - -<br/> - -<v-clicks> - -- Initialization/Destruction -- Private Data, Locks, Condition Variables -- Monitor Procedures/Interfaces -- Monitor Entry Queue (managed by **condition variables**) - -</v-clicks> - -<v-click> - -**Advantages and Disadvantages of Monitors?** - -</v-click> - -<v-click> - -More details in the [original paper](https://dl.acm.org/doi/pdf/10.1145/355620.361161) from C.A.R. Hoare. - -</v-click> - - ---- -title: Recap III - Monitor Java Example I ---- - -### Monitor Example in Java - -The following Java example code explains the synchronization via monitor: - -```java -class Count { - // synchronized block - synchronized void displayCounting(int n) { - for (int i = 1; i <= n; i++) { - System.out.println(i); - try { - // sleep for 500 milliseconds - Thread.sleep(500); - } catch (Exception e) { - System.out.println(e); - } - } - } -} -``` - ---- -title: Recap III - Monitor Java Example II ---- - -### Monitor Example in Java (cont.) - -```java -// Thread 1 -class Thread_A extends Thread { - Count c; - Thread_A(Count c) { - this.c = c; - } - public void run() { - c.displayCounting(5); - } -} - -// Thread 2 -class Thread_B extends Thread { - Count c; - Thread_B(Count c) { - this.c = c; - } - public void run() { - c.displayCounting(5); - } -} -``` - ---- -title: Recap III - Monitor Java Example III ---- - -### Monitor Example in Java (cont.) - -```java -public class main { - public static void main(String args[]) { - Count obj = new Count(); - Thread_A t1 = new Thread_A(obj); - Thread_B t2 = new Thread_B(obj); - t1.start(); - t2.start(); - } -} -``` - -See live demo. - ---- -title: Recap IV ---- - -### Amdahl's Law - -**The runtime of a program**: - -- sequential part: $T_s$ -- parallelisable part: $T_p$ -- total execution time: $T = T_s + T_p$ -- serial fraction: $f = \frac{T_s}{T} \, (0 \le f \le 1)$ - -<div class="container flex justify-left"> - <img src="/images/runtime.png" class="block w-lg"/> -</div> - -**The speedup with n-fold (n processors) parallelisation:**: - -- total execution time: $T_n = T_s + \frac{T_p}{n}$ -- parallel speedup: $S_n = \frac{T}{T_n} = \frac{T}{T_s + \frac{T_p}{n}} = \frac{1}{f + \frac{1-f}{n}} = \frac{n}{(n-1)f + 1}$ -- parallel effinciency: $E_n = \frac{T}{nT_n} = \frac{S_n}{n} = \frac{1}{(n-1)f + 1}$ - -<div class="container flex justify-left"> - <img src="/images/p-runtime.png" class="block w-lg"/> -</div> - ---- -title: Recap IV ---- - -### Amdahl's Law (cont.) - -**What happens when $n \rightarrow \infty$?** - -<v-click> - -- $T_{\infty} = T_s + \frac{T_p}{\infty} \rightarrow T_s$ -- $S_{\infty} = \frac{T}{T_{\infty}} \rightarrow \frac{T}{T_s} = \frac{1}{f}$ -- $E_{\infty} = 1$ if $f = 0$; otherwise $E_{\infty} = 0$ - -</v-click> - -<v-click> - -<div class="container flex justify-left mt-5"> - <img src="/images/inf-runtime.png" class="block w-lg"/> -</div> - -</v-click> - -<v-click> - -**What does this tell us?** - - -</v-click> - -<v-click> - -**No parallel program can outrun the sum of its sequential parts!** - -</v-click> - - ---- -title: Recap V +title: Agenda +layout: center --- -### Amdahl's Law - Exercise 1 - -How is **system performance** altered when **some component** is changed? - - -Program execution time is made up of **75% CPU time** and **25% I/O time**. Which is the better enhancement: - -- a) Increasing the CPU speed by 50% or -- b) Reducing the I/O time by 50%? - -<div class="container flex justify-center mt-5"> - <img src="/images/execution-model.png" class="block w-lg"/> -</div> - -**Hint:** Use Amdahl's Law and derive the speedup for each case. +# Agenda +- Presentation +- Java --- -title: Recap V +title: Java I --- -### Amdahl's Law - Exercise 2 - -A program made up of **10% serial initialization and finalization code** and it has a **fully parallelizable loop of N iterations**. - -Assumption: **fork/join** overhead is negligible, execution time for parallelizable loop is scales linearly with N, that is: - -- For p processors, each processor executes $\frac{N}{p}$ iterations -- Parallel time for executing the loop is: $T_{ploop} = \frac{T_{loop}}{p}$ - -Given: $T_{serial} = 0.1T$ and $T_{loop} = 0.9T$, answer the following questions: - -- a) What is the speedup of the program with 4 processors? -- b) What is the maximum speedup of the program? -- c) What can we conclude from this? +# Java JDK, JRE and JVM ---- -title: Recap VI ---- +Know the difference between JDK, JRE and JVM. -# Recap: Foster's Design Methodology +### What is JVM? -**What are the steps to parallelize a program using Foster's Design Methodology?** - -<v-click> +JVM (Java Virtual Machine) is an abstract machine that enables your computer to run a Java program. <div class="container flex justify-center mt-20"> - <img src="/images/fosters-model.png" class="block w-lg"/> + <img src="/images/how-java-program-runs.webp" class="block w-lg"/> </div> -</v-click> - --- -title: Recap VI +title: Java II --- -## Exercise +### What is JDK? -Parallelize a program using Foster's Design Methodology - -- Given a set of $n$ values: $a_0, a_1, a_2, \dots, a_{n-1}$ -- Given an associative binary operator $\oplus$ (e.g. addition, multiplication, etc.) -- **Reduction**: Compute the result of applying $\oplus$ to all values in the set: $a_0 \oplus a_1 \oplus a_2 \oplus \dots \oplus a_{n-1}$ -- On a single processor, $n - 1$ operations are required to compute the result -- How to parallelize this reduction operation? - -<br/> - -### Rules - -- Reduction operation is the **sum** of all values in the set -- Form two groups and each group should come up with a solution using Foster's Design Methodology. -- Each group should present their solution to the other group. - ---- -title: Recap VI -layout: center ---- +JDK (Java Development Kit) is a software development kit required to develop applications in Java, including JRE, the compiler, interpreter, etc. -### A Possible Solution - Partitioning -<div class="container flex justify-center mt-5"> - <img src="/images/partition.png" class="block w-md"/> +<div class="container flex justify-center mt-20"> + <img src="/images/jdk.webp" class="block w-sm"/> </div> --- -title: Recap VI -layout: center +title: Java III --- -### A Possible Solution - Communication - -<div class="container flex justify-center mt-5"> - <img src="/images/communication.png" class="block w-md"/> -</div> +### What is JRE? ---- -title: Recap VI -layout: center ---- +JRE (Java Runtime Environment) is a software package that provides Java class libraries, Java Virtual Machine (JVM), and other components that are required to run Java applications. -### A Possible Solution - Agglomeration and Mapping - -<div class="container flex justify-center mt-5"> - <img src="/images/agglomeration-mapping.png" class="block w-md"/> +<div class="container flex justify-center mt-20"> + <img src="/images/jre.webp" class="block w-sm"/> </div> --- -title: Evaluation -layout: two-cols +title: Java IV --- -## Evaluation - -**Tutorial on Thursday:** - -<div class="container flex justify-left mt-5"> - <img src="/images/do-8-10.png" class="block w-1/2"/> -</div> - -https://lehrevaluation.fu-berlin.de/productive/de/sl/45t12G5YcM0a +### Relationship between JVM, JRE, and JDK -:: right :: - -## _ - -**Tutorial on Friday:** - -<div class="container flex justify-left mt-5"> - <img src="/images/fri-12-14.png" class="block w-1/2"/> +<div class="container flex justify-center mt-20"> + <img src="/images/jdk-jre-jvm.webp" class="block w-sm"/> </div> -https://lehrevaluation.fu-berlin.de/productive/de/sl/X2vKMukzWSsk - - --- -title: MPI Collective Communication +title: Java Dev Environment Setup layout: center --- -## MPI Collective Communication - -- MPI collective operations involve all ranks in a given communicator at the same time. -- **All ranks must make the same MPI call for the operation to succeed.** -- Some collective operations are globally synchronous. - ---- -title: Data Replication (Broadcast) ---- - -## Data Replication (Broadcast) - -Replicate data from one rank to all other ranks: - -```c -MPI_Bcast (void *data, int count, MPI_Datatype dtype, int root, MPI_Comm comm) -``` - -<div class="container flex justify-center mt-5"> - <img src="/images/MPI-broadcast.png" class="block w-lg"/> -</div> - -### Notes - -- in all ranks but root, data is an output argument -- in rank root, data is an input argument -- **MPI_Bcast completes only after all ranks in comm have made the call** - ---- -title: Data Scatter ---- - -## Data Scatter - -Distribute chunks of data from one rank to all ranks: - -```c -MPI_Scatter (void *sendbuf, int sendcount, MPI_Datatype sendtype, - void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm) -``` - -<div class="container flex justify-center mt-5"> - <img src="/images/MPI-scatter.png" class="block w-lg"/> -</div> - -### Notes - -- **sendbuf** must be large enough in order to supply **sendcount** elements -- data chunks are taken in increasing order following the receiver’s rank -- root also sends one data chunk to itself -- **for each chunk the amount of data sent must match the receive size** - ---- -title: Data Gather ---- - -## Data Gather - -Collect chunks of data from all ranks in one place: - -```c -MPI_Gather (void *sendbuf, int sendcount, MPI_Datatype sendtype, - void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm) -``` - -<div class="container flex justify-center mt-5"> - <img src="/images/MPI-gather.png" class="block w-lg"/> -</div> - -### Notes - -- The opposite operation of **MPI_Scatter** -- root also receives one data chunk from itself -- data chunks are stored in increasing order of the sender’s rank - ---- -title: Gather-to-All ---- - -## Gather-to-All - -Collect chunks of data from all ranks in all ranks: - -```c -MPI_Allgather (void *sendbuf, int sendcount, MPI_Datatype sendtype, - void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm) -``` - -<div class="container flex justify-center mt-5"> - <img src="/images/MPI-gather-to-all.png" class="block w-lg"/> -</div> - -### Notes - -- each rank distributes its **sendbuf** to every rank in the communicator -- almost equivalent to **MPI_Scatter** + **MPI_Gather** - ---- -title: All-to-All ---- - -## All-to-All - -Combined scatter and gather operation: - -```c -MPI_Alltoall (void *sendbuf, int sendcount, MPI_Datatype sendtype, - void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm) -``` - -<div class="container flex justify-center mt-5"> - <img src="/images/MPI-all-to-all.png" class="block w-lg"/> -</div> - -### Notes - -- a kind of global chunked transpose - ---- -title: Global Reduction ---- - -## Global Reduction - -Perform an arithmetic reduction operation while gathering data: - -```c -MPI_Reduce (void *sendbuf, void *recvbuf, int count, - MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm) -``` - -<div class="container flex justify-center mt-5"> - <img src="/images/MPI-global-reduction.png" class="block w-sm"/> -</div> - -### Notes - -- Result is computed **in- or out-of-order** depending on the operation - - **All predefined operations are associative and commutative** - - **Beware of non-commutative effects on floats** - ---- -title: Global Reduction ---- - -## Global Reduction - -Some predefined operations for reduction: - -<div class="container flex justify-center mt-5"> - <img src="/images/MPI-predefined-ops.png" class="block w-lg"/> -</div> - -### Notes - -- You can create your own reduction operations (not covered here) - ---- -title: Global Reduction ---- - -## Global Reduction - -Perform an arithmetic reduction and broadcast the result: - -```c -MPI_Allreduce (void *sendbuf, void *recvbuf, int count, - MPI_Datatype datatype, MPI_Op op, MPI_Comm comm) -``` +### Java Dev Environment Setup -### Notes - -- every rank receives the result of the reduction operation -- equivalent to **MPI_Reduce + MPI_Bcast** with the same root -- also beware of non-commutative effects - ---- -title: MPI Extensions ---- - -## MPI Extensions - -E.g. **MPI 2**, **MPI 3**, **MPI 4** ... (**MPI 5 is in the making**) - -Starting with MPI 2, the MPI standard has been extended with additional functionality: - -- Dynamic process management -- File I/O -- Shared memory +<br/> -See live demo. +- Using an IDE is recommended: [IntelliJ IDEA](https://www.jetbrains.com/idea/download/), [Eclipse](https://www.eclipse.org/downloads/) +- Install JDK: [Oracle JDK 17](https://docs.oracle.com/en/java/javase/17/install/overview-jdk-installation.html#GUID-8677A77F-231A-40F7-98B9-1FD0B48C346A) (same version on Andorra) +- Use [Gradle](https://gradle.org/) for building and running your project, or use the IDE's built-in tools diff --git a/slides/slides.md b/slides/slides.md index 80898177c5bc5d711c6731c745b25f538d676273..ff1548027490065c5662de3218bff499e18cc29e 100644 --- a/slides/slides.md +++ b/slides/slides.md @@ -16,7 +16,7 @@ transition: fade-out css: unocss --- -# ALP4 Tutorial 8 +# ALP4 Tutorial 9 ## Chao Zhan