update readme

Commit c287309b authored 2 years ago by fisched99
--- a/README.md
+++ b/README.md
@@ -2,13 +2,8 @@
 ## Group 2
 Project code for the masters course applied sequence analysis.
-## Run pipeline
+## Project 1
-To run the pipeline first change into the project1 directory and then define either the `samples.tsv` containing the samplenames and paths to the different sequence files, or the directory containing the sequence files, either via the config file (`project1/config/config.yaml`) or as a command line argument using the respective flags `input_directory` and `samples`. You then also have to provide a reference genome file for the mapping rule:
+This project contains the basics of Snakemake and was part of the Snakemake tutorial of the course.
-```
+## Project 2
-snakemake --profile config/local --config samples=input.tsv ref=path/to/ref.fa
+This project contains a workflow to process (viral) NGS data, including quality control, trimming, decontamination, de novo assembly, assembly polishing, scaffolding, clustering of the assemblies, and per-base variance analysis.
-```
-or
-```
-snakemake --profile config/local --config input_directory=path/to/sequence/files ref=path/to/ref.fa
-```
\ No newline at end of file
--- a/project2/README.md
+++ b/project2/README.md
@@ -5,7 +5,7 @@ Project code for the masters course applied sequence analysis.
 ## Test data
 Test data (SARS-CoV2 sequencing data and human reference genome) can be found [here](https://box.fu-berlin.de/s/dt2d5MbwaxjfWtZ).
-The SARS-CoV2 reference genome can be found in the resources directory in `ref.fasta`.
+The SARS-CoV2 reference genomes (different variants for optimal reference selection in the scaffolding process) can be found under `resources/references` in the respective `.fasta` files.
 ## Run pipeline
 To run the pipeline, define either the `samples.tsv` file containing the sample names and paths to the sequence files, or the directory containing the sequence files. Either can be set in the config file (`project1/config/config.yaml`) or passed as a command line argument using the respective flags `samples` and `input_directory`. You also have to provide a reference genome file of your target organism for the scaffolding rule and a kraken2 database for the screening (please provide a host sequence if you want to run the decontamination step):