From ce97404ae7e68b2e055fbf9caa948e5e57290f06 Mon Sep 17 00:00:00 2001
From: mdriller <mdriller@mi.fu-berlin.de>
Date: Sun, 27 Oct 2019 14:14:24 +0000
Subject: [PATCH] Update README.md

---
 README.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index 39bb20c..b8f7aaa 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# REAPRLong : A Tool to Scaffold and Quality Control genome assemblies using (low coverage) long reads
+# REAPRLong: Improvement and QC of genome assemblies using (low coverage) long reads
 <p align="center"> 
 <img src="figures/simple_workflow.svg">
 </p>
@@ -37,16 +37,16 @@ REAPRLong can be used as follows:
   
 REAPRLong generates multiple output files in the specified output directory.
 
-1. scaffolds.fasta - the generated scaffolds in fasta format  
-2. scaffolds.stats - statistics generated for scaffolds.fasta (total basepairs in the assembly, number of scaffolds, longest scaffold, average length and N10/20/30/40/ 50/60/70/80/90/100 values)  
-3. scaffolds.gff - gff3 file describing the regions of each new scaffold. Regions can either come from previous contigs or from reads if a gap was filled.  
-4. duplicates.fasta - fasta file containing contigs that were fully part of another contig and thus removed from the assembly.  
-5. adjusted\_contigs\_it\*.fa - fasta file containing adjusted contigs, if the QC identified misassemblies and broke the previous input. The \* is an integer value indicating the iteration of QC, starting with 0.   
-6. coverage\_map\_it\*.gff - a "coverage" map for the input assembly of each iteration. Regions are summarised giving a start and end position and the support given for the region. The support describes the amount of reads mapping continuously in the region substracted by the amount of reads mapping dis-continuously. Negative numbers indicate misassemblies.  
-7. deletions\_it\*.txt - identified deletions (within the genome compared to the reads). The \* is an integer value indicating the iteration of QC, starting with 0 which represents the original assembly. Every subsequent number relates to the adjusted\_contigs\_it\*.fa of the previous iteration.  
-8. insertions\_it\*.txt - identified insertions (within the genome compared to the reads). The \* is an integer value indicating the iteration of QC, starting with 0 which represents the original assembly. Every subsequent number relates to the adjusted\_contigs\_it\*.fa of the previous iteration.  
-9. inversions\_it\*.txt - identified inversions (within the genome compared to the reads). The \* is an integer value indicating the iteration of QC, starting with 0 which represents the original assembly. Every subsequent number relates to the adjusted\_contigs\_it\*.fa of the previous iteration.  
-10. misjoins\_it\*.txt - identified misjoins (within the genome compared to the reads). The \* is an integer value indicating the iteration of QC, starting with 0 which represents the original assembly. Every subsequent number relates to the adjusted\_contigs\_it\*.fa of the previous iteration.  
+1. **scaffolds.fasta** - the generated scaffolds in fasta format  
+2. **scaffolds.stats** - statistics generated for scaffolds.fasta (total base pairs in the assembly, number of scaffolds, longest scaffold, average length and N10/20/30/40/50/60/70/80/90/100 values)  
+3. **scaffolds.gff** - gff3 file describing the regions of each new scaffold. Regions can either come from previous contigs or from reads if a gap was filled.  
+4. **duplicates.fasta** - fasta file containing contigs that were fully part of another contig and thus removed from the assembly.  
+5. **adjusted\_contigs\_it\*.fa** - fasta file containing adjusted contigs, if the QC identified misassemblies and broke the previous input. The \* is an integer value indicating the iteration of QC, starting with 0.   
+6. **coverage\_map\_it\*.gff** - a "coverage" map for the input assembly of each iteration. Each region is summarised by its start and end position and its support. The support is the number of reads mapping continuously across the region minus the number of reads mapping discontinuously. Negative numbers indicate misassemblies.  
+7. **deletions\_it\*.txt** - identified deletions (within the genome compared to the reads). The \* is an integer value indicating the iteration of QC, starting with 0, which represents the original assembly. Every subsequent number relates to the adjusted\_contigs\_it\*.fa of the previous iteration.  
+8. **insertions\_it\*.txt** - identified insertions (within the genome compared to the reads). The \* is an integer value indicating the iteration of QC, starting with 0, which represents the original assembly. Every subsequent number relates to the adjusted\_contigs\_it\*.fa of the previous iteration.  
+9. **inversions\_it\*.txt** - identified inversions (within the genome compared to the reads). The \* is an integer value indicating the iteration of QC, starting with 0, which represents the original assembly. Every subsequent number relates to the adjusted\_contigs\_it\*.fa of the previous iteration.  
+10. **misjoins\_it\*.txt** - identified misjoins (within the genome compared to the reads). The \* is an integer value indicating the iteration of QC, starting with 0, which represents the original assembly. Every subsequent number relates to the adjusted\_contigs\_it\*.fa of the previous iteration.  
 ### Workflow
 <p align="center"> 
 <img src="figures/workflow_noOptional.svg">
-- 
GitLab