Update README.md

Commit d1ac5cd9 authored 1 year ago by mishraa94
--- a/README.md
+++ b/README.md
@@ -3,8 +3,11 @@

 The module is divided into three blocks, namely Data Science, Complex Systems, and Advanced Algorithms. You can find the project reports for each block, and the presentation on **Project 1** (given on 18th November 2022), in the respective folders.

-## Data Science 
-Contributors: Jule Brenningmeyer, Maike Herkenrath, Abhinav Mishra, Se Yeon Kim
+## Data Science  
+
+Contributors: Jule Brenningmeyer, Maike Herkenrath, Abhinav Mishra, Se Yeon Kim 
+Language: Python, R 
+
 ### Project 1 
 
 The data and figures used in writing the report section for **Project 1** can be found in the respective subfolders. The scripts have run successfully on a system, and the compiled R Markdown PDF has been added for better understanding. Here is a short description of the script, if you'd like to read it:
@@ -33,6 +36,60 @@ Since the classes were not that imbalanced, there was no change made in the part

 ### Project 2  
 
-## Complex Systems 
-Contributors: Jule Brenningmeyer, Maike Herkenrath, Abhinav Mishra 
-### Exercise 1
+## Complex Systems  
+
+Contributors: Jule Brenningmeyer, Maike Herkenrath, Abhinav Mishra  
+Language: Python  
+
+### Exercise 1 
+  
+**Pen & Paper** - Modelling
+ 
+**Implementation** - Write a small program to generate the stoichiometric matrix and the propensity function vector. Regarding the former, use only the stoichiometric coefficients -1, 0, or 1. 
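
As an illustration of the implementation task, here is a minimal sketch that builds a stoichiometric matrix and a propensity vector. The reaction network (an SIR-like infection and recovery), the rate constants, and all names (`SPECIES`, `REACTIONS`, `stoichiometric_matrix`, `propensities`) are assumptions made for this example and are not taken from the exercise sheet. Mass-action kinetics is assumed, with each species appearing at most once as a reactant per reaction.

```python
# Hypothetical network: R1: S + I -> 2 I (infection), R2: I -> R (recovery)
SPECIES = ["S", "I", "R"]
REACTIONS = [
    {"reactants": ["S", "I"], "products": ["I", "I"], "rate": 0.3},
    {"reactants": ["I"], "products": ["R"], "rate": 0.1},
]


def stoichiometric_matrix(species, reactions):
    """Rows are species, columns are reactions, entries are net changes."""
    matrix = [[0] * len(reactions) for _ in species]
    for j, reaction in enumerate(reactions):
        for name in reaction["reactants"]:
            matrix[species.index(name)][j] -= 1
        for name in reaction["products"]:
            matrix[species.index(name)][j] += 1
    return matrix


def propensities(state, species, reactions):
    """Mass-action propensity of every reaction for the given copy numbers.

    Assumes each species occurs at most once among a reaction's reactants.
    """
    values = []
    for reaction in reactions:
        a = reaction["rate"]
        for name in reaction["reactants"]:
            a *= state[species.index(name)]
        values.append(a)
    return values


matrix = stoichiometric_matrix(SPECIES, REACTIONS)
props = propensities([99, 1, 0], SPECIES, REACTIONS)
```

For this network every net coefficient is -1, 0, or 1, which matches the restriction stated in the task.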
+ 
+### Exercise 2 
+  
+**Programming** 
+ 
+1. Write a program implementing the SIR (susceptible-infected-recovered) model and generate trajectories using the stochastic simulation algorithm (SSA; also called Gillespie’s algorithm). 
+2. Write a program implementing the Schlögl model and generate trajectories using the stochastic simulation algorithm.  
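
The SSA named in this exercise can be sketched as follows for the SIR model. This is an illustrative version of Gillespie's direct method, not the repository's script; the function name `ssa_sir`, the initial state, and the rate constants are assumptions chosen for the example.

```python
import random


def ssa_sir(s, i, r, beta, gamma, t_end, rng):
    """Gillespie direct method for S + I -> 2 I (beta) and I -> R (gamma)."""
    t = 0.0
    trajectory = [(t, s, i, r)]
    while t < t_end:
        a_infect = beta * s * i      # propensity of an infection event
        a_recover = gamma * i        # propensity of a recovery event
        a_total = a_infect + a_recover
        if a_total == 0:             # no infected individuals left
            break
        t += rng.expovariate(a_total)            # waiting time to next event
        if rng.random() * a_total < a_infect:    # pick the reaction that fires
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        trajectory.append((t, s, i, r))
    return trajectory


# Hypothetical parameters; the seed only makes the run reproducible.
trajectory = ssa_sir(99, 1, 0, beta=0.005, gamma=0.1, t_end=100.0,
                     rng=random.Random(0))
```

Each entry of `trajectory` is a `(time, S, I, R)` tuple. The last event may fall slightly after `t_end`, since the loop only checks the time before drawing the next waiting time.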
+  
+### Exercise 3 
+
+1. Predator-prey model:
+   - Write a program implementing the predator-prey model. 
+   - Extend your program above by solving the ODEs with the built-in ODE solver `scipy.integrate.solve_ivp`, using `'RK45'` for integration. 
+   - Plot the simulation output of the explicit Euler method and superimpose the solution obtained from the adaptive step-size method (`'RK45'`).  
+2. Pharmacokinetic model:
+   - Implement a pharmacokinetic model and perform simulations using the stochastic simulation algorithm (SSA). 
+   - Simulate the corresponding ordinary differential equations (ODEs) with the built-in ODE solver `scipy.integrate.solve_ivp`, using `'RK45'` for integration.  
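
To make the Euler versus `'RK45'` comparison concrete, here is a minimal sketch assuming the predator-prey model is the classic Lotka-Volterra system. The parameter values, initial populations, step size, and the helper names `lotka_volterra` and `explicit_euler` are assumptions for illustration only.

```python
import numpy as np
from scipy.integrate import solve_ivp


def lotka_volterra(t, y, a, b, c, d):
    """Right-hand side: prey grows at rate a, is eaten at rate b;
    predators die at rate c and grow by predation at rate d."""
    prey, predator = y
    return [a * prey - b * prey * predator,
            d * prey * predator - c * predator]


def explicit_euler(rhs, y0, t_end, dt, args):
    """Fixed-step explicit Euler integration of y' = rhs(t, y, *args)."""
    steps = int(round(t_end / dt))
    t = np.linspace(0.0, steps * dt, steps + 1)
    y = np.empty((steps + 1, len(y0)))
    y[0] = y0
    for k in range(steps):
        y[k + 1] = y[k] + dt * np.asarray(rhs(t[k], y[k], *args))
    return t, y


params = (1.0, 0.1, 1.5, 0.075)   # hypothetical a, b, c, d
y0 = [10.0, 5.0]                  # hypothetical initial prey and predators

t_euler, y_euler = explicit_euler(lotka_volterra, y0, 15.0, 0.001, params)
sol = solve_ivp(lotka_volterra, (0.0, 15.0), y0, method="RK45", args=params)
```

The arrays `t_euler`, `y_euler` and `sol.t`, `sol.y` can then be drawn on the same axes, for example with matplotlib, to superimpose the two solutions.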
+ 
+### Exercise 4 
+ 
+1. Parameter estimation using `curve_fit`. 
+2. Perform stochastic simulations with the SSA to study how the infection probability depends on the number of viruses that an individual is exposed to. 
+3. Perform 300 stochastic simulations. Stop each simulation when   
+a) either the viral infection has been eliminated, or
+b) the number of viruses reaches 50, i.e. X2(t) ≥ 50. Count the former as an elimination event and the latter as an infection event. Based on the outcome of the 300 simulations, estimate the infection probability.  
+c) Plot the infection probability (y-axis) as a function of the viral exposure (x-axis).
+d) Can you come up with a simple formula that estimates the elimination probability after exposure to n viruses, based on the elimination probability after exposure to a single virus?  
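
The simulation protocol in item 3 can be sketched as below. The exercise's actual viral dynamics model is not given in this README, so the sketch assumes a hypothetical single-species linear birth-death process (replication rate `birth`, clearance rate `clearance`); only the stopping rules and the 300 runs per exposure come from the text.

```python
import random


def infection_occurs(n_viruses, birth, clearance, threshold, rng):
    """One SSA run; True if the virus count reaches the threshold,
    False if the virus is eliminated first."""
    x, t = n_viruses, 0.0
    while 0 < x < threshold:
        a_birth = birth * x
        a_clear = clearance * x
        a_total = a_birth + a_clear
        t += rng.expovariate(a_total)   # elapsed time, not needed for the outcome
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
    return x >= threshold


def infection_probability(n_viruses, rng, runs=300, birth=2.0,
                          clearance=1.0, threshold=50):
    """Fraction of runs that end in infection (hypothetical rate constants)."""
    hits = sum(infection_occurs(n_viruses, birth, clearance, threshold, rng)
               for _ in range(runs))
    return hits / runs


rng = random.Random(1)
exposures = list(range(1, 11))
probabilities = [infection_probability(n, rng) for n in exposures]
```

Under the additional assumption that each virus founds an independent lineage, the elimination probability after exposure to n viruses would be the single-virus elimination probability raised to the power n, which the estimates in `probabilities` can be checked against.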
+  
+## Advanced Algorithms  
+  
+Contributors: Jule Brenningmeyer, Maike Herkenrath, Abhinav Mishra  
+Language: C++, R, Perl 
+
+### Project 3
+ 
+1. You are given a reference text that contains parts of the first chromosome of the human genome. You are also given a list of markers, ’illumina reads XYZ.fasta.gz’. You need to figure out how many of these markers appear in the reference.
+2. Implement a naive search algorithm (don’t use an index).
+3. Implement a suffix array based search.
+4. Benchmark (runtime and memory) your solutions for 1’000, 10’000, 100’000, and 1’000’000 queries of length 100.
+5. Benchmark (runtime) queries of length 40, 60, 80, and 100 with a suitable number of queries.
+6. Implement an FM-index based search.
+7. Benchmark (runtime and memory) your solutions for 1’000, 10’000, 100’000, and 1’000’000 queries of length 100.
+8. Which algorithm is best suited?
+9. Download the Human Reference Genome from https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/ (1. click on "Download Assembly", 2. Source database: RefSeq, 3. File type: Genomic FASTA (.fna), 4. Download).
+10. Benchmark (runtime) queries of length 40, 60, 80, and 100 with k = 0, k = 1, and k = 2 errors, using a suitable number of queries.
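
The difference between the naive search and the suffix array based search can be sketched as follows. The project itself lists C++, R, and Perl; this sketch uses Python only for illustration, and the function names and the toy reference string are assumptions. The suffix array construction shown here is deliberately simple and would be far too slow for a real chromosome.

```python
def count_naive(reference, query):
    """Count (possibly overlapping) occurrences by testing every position."""
    m = len(query)
    return sum(reference[i:i + m] == query
               for i in range(len(reference) - m + 1))


def build_suffix_array(reference):
    """Start positions of all suffixes in lexicographic order (simple, slow)."""
    return sorted(range(len(reference)), key=lambda i: reference[i:])


def count_suffix_array(reference, suffix_array, query):
    """Count occurrences with two binary searches over the suffix array."""
    m = len(query)

    def prefix(k):
        start = suffix_array[k]
        return reference[start:start + m]

    # First suffix whose length-m prefix is >= query.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < query:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # First suffix whose length-m prefix is > query.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= query:
            lo = mid + 1
        else:
            hi = mid
    return lo - first


reference = "ACGTACGTGACG"        # toy stand-in for the reference text
suffix_array = build_suffix_array(reference)
hits_naive = count_naive(reference, "ACG")
hits_sa = count_suffix_array(reference, suffix_array, "ACG")
```

Both functions return the same counts. The naive version scans the whole reference for every query, while the suffix array version pays a one-time construction cost and then needs only a logarithmic number of string comparisons per query.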
+