From c287309bbbe526de7607efc7c7897755918eaf1a Mon Sep 17 00:00:00 2001
From: fisched99 <fisched99@mi.fu-berlin.de>
Date: Wed, 14 Jun 2023 08:31:25 +0000
Subject: [PATCH] update readme

---
 README.md          | 13 ++++---------
 project2/README.md |  2 +-
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index dcc3016..3512d56 100644
--- a/README.md
+++ b/README.md
@@ -2,13 +2,8 @@
 ## Group 2
 Project code for the masters course applied sequence analysis.
 
-## Run pipeline
-To run the pipeline first change into the project1 directory and then define either the `samples.tsv` containing the samplenames and paths to the different sequence files, or the directory containing the sequence files, either via the config file (`project1/config/config.yaml`) or as a command line argument using the respective flags `input_directory` and `samples`. You then also have to provide a reference genome file for the mapping rule:
+## Project 1
+This project covers the basics of Snakemake and was part of the course's Snakemake tutorial.
 
-```
-snakemake --profile config/local --config samples=input.tsv ref=path/to/ref.fa
-```
-or
-```
-snakemake --profile config/local --config input_directory=path/to/sequence/files ref=path/to/ref.fa
-```
\ No newline at end of file
+## Project 2
+This project contains a workflow to process (viral) NGS data, including quality control, trimming, decontamination, de novo assembly, assembly polishing, scaffolding, clustering of the assemblies, and per-base variance analysis.
diff --git a/project2/README.md b/project2/README.md
index 0e33337..a408e78 100644
--- a/project2/README.md
+++ b/project2/README.md
@@ -5,7 +5,7 @@ Project code for the masters course applied sequence analysis.
 ## Test data
 Test data (SARS-CoV2 sequencing data and human reference genome) can be found [here](https://box.fu-berlin.de/s/dt2d5MbwaxjfWtZ).
 
-The SARS-CoV2 reference genome can be found in the resources directory in `ref.fasta`.
+The SARS-CoV2 reference genomes (different variants for optimal reference selection in the scaffolding process) can be found under `resources/references` in the respective `.fasta` files.
 
 ## Run pipeline
 To run the pipeline define either the `samples.tsv` containing the samplenames and paths to the different sequence files, or the directory containing the sequence files, either via the config file (`project1/config/config.yaml`) or as a command line argument using the respective flags `input_directory` and `samples`. You then also have to provide a reference genome file of your target organism for the scaffolding rule and a kraken2 database for the screening (please provide a host sequence if you want to run the decontamination step):
-- 
GitLab