@@ -7,6 +7,8 @@ The input files from folder 'liver' are used each step.
### Step 2 : Do differential expression analysis for feature selection
Run the file "liver cancer mirna_mrna and methylation differential analysis.R" in R or RStudio. It uses the files from the "liver" folder as input and finds the differentially expressed columns for each type of data (mRNA, miRNA, methylation). It generates the files "Diff_mrnaupdated.csv", "Diff_methylupdated.csv" and "Diff_mirnaupdated.csv", which are used in Step 4.
(Some of the lines that write the differential expression output files may be commented out in the script. To generate the output for those data, uncomment those lines.)
### Step 3 : Do logistic regression analysis for feature selection + Machine Learning classification models on the selected features
Run the Python notebook "Logreg_part.ipynb" using Jupyter Notebook or Colab. It uses the files from the "liver" folder as input. The number of features is reduced using logistic regression, and two binary classification models are trained on the reduced feature set. "imp_features_svm_logreg.csv" and "imp_features_gb_logreg.csv" are generated as output files. Only the first one is used in Step 5, as the result of the best model; the second is provided in case additional evaluation is wanted.
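
For orientation, the general idea of this step can be sketched as below. This is a minimal illustration, not the contents of "Logreg_part.ipynb": it assumes scikit-learn is installed, uses synthetic data instead of the "liver" files, and guesses from the output file names that the two classifiers are an SVM and a gradient boosting model. The number of kept features (`N_KEEP`) and the feature names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a samples x features matrix (NOT the "liver" data).
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=8, random_state=0
)
feature_names = np.array([f"feature_{i}" for i in range(X.shape[1])])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale using training statistics only, so the test split stays unseen.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Keep the N_KEEP features with the largest absolute logistic regression
# coefficients. N_KEEP is a hypothetical value chosen for this sketch.
N_KEEP = 10
selector = SelectFromModel(
    LogisticRegression(max_iter=1000),
    max_features=N_KEEP,
    threshold=-np.inf,  # select purely by rank, up to max_features
)
selector.fit(X_train_s, y_train)
selected = feature_names[selector.get_support()]

X_train_sel = selector.transform(X_train_s)
X_test_sel = selector.transform(X_test_s)

# Two binary classifiers trained on the reduced feature set (assumed choices).
models = {
    "svm": SVC(kernel="linear"),
    "gb": GradientBoostingClassifier(random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    scores[name] = model.score(X_test_sel, y_test)  # test-set accuracy

print("Selected features:", list(selected))
print("Test accuracy:", scores)
```

Comparing the two accuracies on held-out data is one simple way to decide which model's feature list to carry forward; the notebook itself may use a different selection rule or evaluation metric.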