From d1ac5cd990bf2a3fb728ae376460abc8ad9ff79c Mon Sep 17 00:00:00 2001
From: mishraa94 <mishraa94@mi.fu-berlin.de>
Date: Thu, 28 Mar 2024 01:02:56 +0000
Subject: [PATCH] Update README.md

---
 README.md | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 62 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index c6e19aa..1617116 100644
--- a/README.md
+++ b/README.md
@@ -3,8 +3,11 @@
 The module is divided into three blocks namely, Data Science, Complex systems, and Advanced algorithms. You can find the project reports for each block, and presentation on **Project 1** (given on 18th November 2022) in the respective folders.
-## Data Science
-Contributors: Jule Brenningmeyer, Maike Herkenrath, Abhinav Mishra, Se Yeon Kim
+## Data Science
+
+Contributors: Jule Brenningmeyer, Maike Herkenrath, Abhinav Mishra, Se Yeon Kim
+Language: Python, R
+
 ### Project 1
 The data, and figures used in writing the report section for **Project 1** can be found in the respective subfolders. The scripts has ran successfully on a system, and the compiled R markdown pdf has been added for better understanding. Here's some description of the script if you'd like to read:
@@ -33,6 +36,60 @@ Since the classes were not that imbalanced, there was no change made in the part
 ### Project 2
-## Complex Systems
-Contributors: Jule Brenningmeyer, Maike Herkenrath, Abhinav Mishra
-### Exercise 1
+## Complex Systems
+
+Contributors: Jule Brenningmeyer, Maike Herkenrath, Abhinav Mishra
+Language: Python
+
+### Exercise 1
+
+**Pen & Paper** - Modelling
+
+**Implementation** - Write a small program to generate the stoichiometric matrix and the propensity function vector. For the former, use only the stoichiometric coefficients -1, 0, or 1.
+
+### Exercise 2
+
+**Programming**
+
+1. Write a program implementing the SIR (susceptible-infected-recovered) model and generate trajectories using the stochastic simulation algorithm (SSA; also called Gillespie’s algorithm).
+2. Write a program implementing the Schlögl model and generate trajectories using the stochastic simulation algorithm.
+
+### Exercise 3
+
+1.
+- Write a program implementing the predator-prey model.
+- Extend your program above by solving the ODEs with the built-in ODE solver `scipy.integrate.solve_ivp` using ’RK45’ for integration.
+- Plot the simulation output of the explicit Euler method and superimpose the solution obtained from the adaptive step-size method (’RK45’).
+2.
+- Implement a pharmacokinetic model and perform simulations using the stochastic simulation algorithm (SSA).
+- Simulate the corresponding ordinary differential equations (ODEs) with the built-in ODE solver `scipy.integrate.solve_ivp` using ’RK45’ for integration.
+
+### Exercise 4
+
+1. Parameter estimation using `curve_fit`.
+2. Perform stochastic simulations with the SSA to study how the infection probability depends on the number of viruses that an individual is
+exposed to.
+3. Perform 300 stochastic simulations. Stop your simulation when
+a) either the viral infection has been eliminated, or
+b) the number of viruses reaches or exceeds 50, i.e. X2(t) ≥ 50. Count the former as an elimination event and the latter as an infection event. Based on the outcome of the 300 simulations, estimate the infection probability.
+c) Plot the infection probability (y-axis) as a function of the viral exposure (x-axis).
+d) Can you come up with a simple formula that estimates the elimination probability after exposure to n viruses, based on the elimination probability after exposure to a single virus?
+
+## Project 3
+
+Contributors: Jule Brenningmeyer, Maike Herkenrath, Abhinav Mishra
+Language: C++, R, Perl
+
+### Advanced Algorithms
+
+1. You are given a reference text, which contains parts of the first chromosome of the human genome. You are also given a list of markers ’illumina reads XYZ.fasta.gz’. You need to figure out how many of these markers appear in the reference.
+2. Implement a naive search algorithm (don’t use an index).
+3. Implement a suffix-array-based search.
+4. Benchmark (runtime and memory) your solutions for 1’000, 10’000, 100’000, and 1’000’000 queries of length 100.
+5. Benchmark (runtime) queries of length 40, 60, 80, and 100 with a suitable number of queries.
+6. Implement an FM-index-based search.
+7. Benchmark (runtime and memory) your solutions for 1’000, 10’000, 100’000, and 1’000’000 queries of length 100.
+8. Which algorithm is best suited?
+9. Download the Human Reference Genome https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/ (1. Click on Download Assembly, 2. Source database: RefSeq, 3. File type: Genomic FASTA (.fna), 4. Download)
+10. Benchmark (runtime) queries of length 40, 60, 80, and 100 with k = 0, k = 1, and k = 2 errors, using a suitable number of queries.
+
-- 
GitLab