Commit c287309b authored by fisched99

update readme

parent 3124e86a
@@ -2,13 +2,8 @@
 ## Group 2
 Project code for the masters course applied sequence analysis.
-## Run pipeline
-To run the pipeline first change into the project1 directory and then define either the `samples.tsv` containing the samplenames and paths to the different sequence files, or the directory containing the sequence files, either via the config file (`project1/config/config.yaml`) or as a command line argument using the respective flags `input_directory` and `samples`. You then also have to provide a reference genome file for the mapping rule:
-```
-snakemake --profile config/local --config samples=input.tsv ref=path/to/ref.fa
-```
-or
-```
-snakemake --profile config/local --config input_directory=path/to/sequence/files ref=path/to/ref.fa
-```
+## Project 1
+This project contains the basics of Snakemake and was part of the Snakemake tutorial of the course.
+## Project 2
+This project contains a workflow to process (viral) NGS data including quality control, trimming, decontamination, denovo assembly, assembly polishing, scaffolding, clustering of the assemblies and per base variance analysis.
\ No newline at end of file
@@ -5,7 +5,7 @@ Project code for the masters course applied sequence analysis.
 ## Test data
 Test data (SARS-CoV2 sequencing data and human reference genome) can be found [here](https://box.fu-berlin.de/s/dt2d5MbwaxjfWtZ).
-The SARS-CoV2 reference genome can be found in the resources directory in `ref.fasta`.
+The SARS-CoV2 reference genomes (different variants for optimal reference selection in the scaffolding process) can be found under `resources/references` in the respective `.fasta` files.
 ## Run pipeline
 To run the pipeline define either the `samples.tsv` containing the samplenames and paths to the different sequence files, or the directory containing the sequence files, either via the config file (`project1/config/config.yaml`) or as a command line argument using the respective flags `input_directory` and `samples`. You then also have to provide a reference genome file of your target organism for the scaffolding rule and a kraken2 database for the screening (please provide a host sequence if you want to run the decontamination step):
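
Note on the `samples.tsv` option described in the README text above: the diff does not show the layout the workflow expects, so the following is only a minimal sketch of how such a sheet could be generated from a directory of sequence files. The column names (`sample`, `fq1`, `fq2`), the `_R1`/`_R2` file naming scheme, and both helper functions are assumptions made for illustration and are not taken from the repository.

```python
import csv
import re
from pathlib import Path

# Assumed (hypothetical) naming scheme: <sample>_R1.fastq.gz / <sample>_R2.fastq.gz
READ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<mate>[12])\.(fastq|fq)(\.gz)?$")


def collect_samples(input_directory):
    """Group the read files found in input_directory by sample name."""
    samples = {}
    for path in sorted(Path(input_directory).iterdir()):
        match = READ_PATTERN.match(path.name)
        if match is None:
            continue  # skip files that do not look like read files
        mates = samples.setdefault(match["sample"], {})
        mates["fq" + match["mate"]] = str(path)
    return samples


def write_samples_tsv(samples, out_path):
    """Write one row per sample; the column names are an assumption."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["sample", "fq1", "fq2"])
        for name in sorted(samples):
            mates = samples[name]
            # A missing mate (single-end data) is written as an empty field.
            writer.writerow([name, mates.get("fq1", ""), mates.get("fq2", "")])
```

A call such as `write_samples_tsv(collect_samples("path/to/sequence/files"), "samples.tsv")` would produce a sheet that could then be supplied through the `samples` config key the README mentions, provided the workflow accepts this column layout.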