# DSLS Project
## Shared genetic traits in psychiatric disorders
This project analyzes genetic traits shared by three psychiatric disorders (autism, depression, and schizophrenia) using publicly available RNA-Seq and DNA methylation datasets.
This repository contains the notebooks used to analyze the RNA-Seq and DNA methylation datasets.
All data needed to run the notebooks can be downloaded from the GEO database and from this [link](https://drive.google.com/drive/folders/1V1I6pUEiTr2J5Ixma6nM69cd_u1aqiFp?usp=drive_link).
### Statistical analysis
#### Differential Expression Analysis (limma)
`differential_expression_analysis.Rmd`
To run the analysis, download the following datasets from the GEO database:
* Autism dataset - [GSE25507 Series matrix](https://ftp.ncbi.nlm.nih.gov/geo/series/GSE25nnn/GSE25507/matrix/)
* Schizophrenia dataset - [GSE27383 Series matrix](https://ftp.ncbi.nlm.nih.gov/geo/series/GSE27nnn/GSE27383/matrix/)
* Depression dataset - [GSE98793 Series matrix](https://ftp.ncbi.nlm.nih.gov/geo/series/GSE98nnn/GSE98793/matrix/)
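The downloads listed above can also be scripted. The following is a minimal sketch, not part of the repository: it assumes each matrix directory contains a file named `<accession>_series_matrix.txt.gz` (the usual GEO naming, but worth checking in the directory listings linked above), and the `data` target folder and the helper names are hypothetical.

```python
import re
import urllib.request
from pathlib import Path

GEO_FTP = "https://ftp.ncbi.nlm.nih.gov/geo/series"

# Accessions taken from the list above.
ACCESSIONS = {
    "autism": "GSE25507",
    "schizophrenia": "GSE27383",
    "depression": "GSE98793",
}


def series_matrix_url(accession):
    """Build the series matrix URL for a GEO series accession.

    Assumption: the file is named <accession>_series_matrix.txt.gz.
    """
    match = re.fullmatch(r"GSE(\d+)", accession)
    if match is None:
        raise ValueError(f"not a GEO series accession: {accession!r}")
    # GEO groups series into directories such as GSE25nnn,
    # i.e. the accession with its last three digits masked.
    stub = "GSE" + match.group(1)[:-3] + "nnn"
    return f"{GEO_FTP}/{stub}/{accession}/matrix/{accession}_series_matrix.txt.gz"


def download_all(target_dir="data"):
    """Download every series matrix into target_dir (hypothetical location)."""
    out = Path(target_dir)
    out.mkdir(parents=True, exist_ok=True)
    for disorder, accession in ACCESSIONS.items():
        url = series_matrix_url(accession)
        destination = out / url.rsplit("/", 1)[-1]
        if not destination.exists():
            urllib.request.urlretrieve(url, str(destination))
        print(f"{disorder}: {destination}")


if __name__ == "__main__":
    download_all()
```

Where `differential_expression_analysis.Rmd` expects these files is not stated here, so adjust `target_dir` to match the paths used in the notebook.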
### Machine Learning
The following notebooks contain code for multiclass classification based on RNA-Seq and DNA methylation data:
* `rnaseq_ml.ipynb` [RNA-Seq]
* `methylation_ml.ipynb` [DNA Methylation]

To run the notebooks, choose one of two options:
* Run `differential_expression_analysis.Rmd` and `differential_methylation_analysis.Rmd` to generate the necessary input data
* (recommended) Download the pre-generated input data from this [link](https://drive.google.com/drive/folders/19xE-Op_HhuKsD_RzS7DQ4XFe3gN7WDv6?usp=drive_link)
  * `/rna-seq` folder for `rnaseq_ml.ipynb`
  * `/dna-methylation` folder for `methylation_ml.ipynb`
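
Before opening the notebooks, it can help to confirm that the input folders are in place. This is a minimal sketch under stated assumptions: the folders are assumed to keep their names from the shared drive and to sit under a single root directory, and `missing_inputs` is a hypothetical helper, not something the notebooks provide.

```python
from pathlib import Path

# Assumption: folder names mirror the shared drive. Where the notebooks
# actually look for them is not stated in this README.
EXPECTED_INPUTS = {
    "rnaseq_ml.ipynb": "rna-seq",
    "methylation_ml.ipynb": "dna-methylation",
}


def missing_inputs(data_root):
    """Return {notebook: folder} for each input folder that is absent or empty."""
    root = Path(data_root)
    missing = {}
    for notebook, folder in EXPECTED_INPUTS.items():
        path = root / folder
        if not path.is_dir() or not any(path.iterdir()):
            missing[notebook] = folder
    return missing


if __name__ == "__main__":
    for notebook, folder in missing_inputs(".").items():
        print(f"{notebook}: input folder '{folder}' is missing or empty")
```

An empty result means both folders exist and contain at least one file; it does not verify that the files themselves are the ones the notebooks expect.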
### Annotation
`Methylation_Postprocessing.ipynb`
The following files are declared in the DMP part of the notebook as `df_mdd`, `df_asd`, and `df_scz`: