diff --git a/README.md b/README.md
index 61659138a36d4312ac8ee34315cb33e3e4606274..2bd5b11310d20988efaf81ec17892654ad16f743 100644
--- a/README.md
+++ b/README.md
@@ -7,6 +7,8 @@ The input files from folder 'liver' are used each step.
 ### Step 2 : Do differential expression analysis for feature selection
 Run the file "liver cancer mirna_mrna and methylation differential analysis.R" using R or R Studio. It will use the files from "liver" folder as input. Differentially expressed columns will be found for each type of data(mrna, mirna, methylation). Files named "Diff_mrnaupdated.csv", "Diff_methylupdated.csv" and "Diff_mirnaupdated.csv" will be generated and will be used in Step 4.
 
+(Some of the output files generated for the Differential expression data produced could be commented out, to generate the output for those data please uncomment those lines).
+
 ### Step 3 : Do logistic regression analysis for feature selection + Machie Learling classification models for it
 Run the python notebook "Logreg_part.ipynb" using Jupyter Notebook or Colab. It will use the files from "liver" folder as input. Number of features will be reduced using logistic regression. 2 binary classification models will be trained. "imp_features_svm_logreg.csv" and "imp_features_gb_logreg.csv" will be generated as output files, but only the first one will be used in Step 5 as results of the best model. The latter is provided if wanted for additional evaluation