hist(tT2$adj.P.Val, col = "grey", border = "white", xlab = "P-adj",
ylab = "Number of genes", main = "P-adj value distribution")
```
The plot above shows the distribution of adjusted p-values across genes. The p-values for this experiment were adjusted using the Benjamini-Hochberg (BH) method. Genes falling to the left of the 0.05 significance threshold are considered statistically significant and may be biologically relevant.
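As a minimal sketch of what the BH adjustment does, base R's `p.adjust` can be applied to a small vector of p-values and compared with the same calculation written out by hand. The p-values below are made up for illustration and are not taken from this experiment.

```r
# Illustrative raw p-values (hypothetical, not from this dataset)
p_raw <- c(0.001, 0.008, 0.039, 0.041, 0.20)

# Benjamini-Hochberg adjustment as implemented in base R
p_adj <- p.adjust(p_raw, method = "BH")

# The same adjustment by hand: sort the p-values, scale each by n / rank,
# then enforce monotonicity working down from the largest p-value
n <- length(p_raw)
ord <- order(p_raw)
p_manual <- numeric(n)
p_manual[ord] <- pmin(1, rev(cummin(rev(p_raw[ord] * n / seq_len(n)))))

# Genes that pass the 0.05 threshold after adjustment
significant <- p_adj < 0.05
print(data.frame(p_raw, p_adj, significant))
```

Note that a raw p-value below 0.05 (such as 0.039 here) can end up above the threshold once it has been adjusted for multiple testing.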
```{r}
# summarize test results as "up", "down" or "not expressed"
```
The Venn diagram visualizes how many genes fall into each category ("up", "down", and "not expressed") and whether any genes overlap between these categories. In the plot above, 6928 genes were significantly up- or down-regulated; the remaining 15349 were not differentially expressed.
```{r}
# create Q-Q plot for t-statistic
t.good <- which(!is.na(fit2$F)) # filter out bad probes
qqt(fit2$t[t.good], fit2$df.total[t.good], main="Moderated t statistic")
```
Q-Q plots (quantile-quantile plots) are useful for visually assessing whether observed data follow an expected theoretical distribution. Here, `qqt` compares the moderated t-statistics with the quantiles of a Student's t-distribution. In the plot above, the data points fall approximately along a straight line, which suggests that the moderated t-statistics approximately follow the expected t-distribution.
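The idea behind such a plot can be sketched in base R: the sorted observed statistics are plotted against the theoretical t quantiles at the same plotting positions. This is only an illustration with simulated values; the degrees of freedom and the simulated statistics are assumptions standing in for `fit2$df.total` and `fit2$t`.

```r
set.seed(1)

df_assumed <- 10                      # hypothetical degrees of freedom
t_obs <- rt(500, df = df_assumed)     # simulated statistics standing in for fit2$t

# Theoretical t quantiles at the same plotting positions as the sorted sample
t_theory <- qt(ppoints(length(t_obs)), df = df_assumed)

# If the statistics follow the t-distribution, points lie close to the line y = x
plot(t_theory, sort(t_obs),
     xlab = "Theoretical quantiles", ylab = "Sample quantiles",
     main = "Q-Q plot against a t-distribution (simulated)")
abline(0, 1, lty = 2)
```

Systematic departures from the dashed line, especially in the tails, would indicate that the statistics do not follow the assumed distribution.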
Volcano plots are commonly used in genomics to identify differentially expressed genes based on both statistical significance and magnitude of change. The resulting volcano plot shows the log fold change on the x-axis (cutoff 0.263) against the negative logarithm of the adjusted p-value on the y-axis (cutoff 0.05). Genes with a statistically significant change in expression (based on adjusted p-values) appear higher on the y-axis, while genes with a substantial fold change appear farther from the center along the x-axis.
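A minimal sketch of how such a plot can be built by hand in base R is shown below. The data frame is hypothetical; its column names `logFC` and `adj.P.Val` follow the limma results table used in this report, and treating 0.263 and 0.05 as cutoffs is an assumption based on the values quoted above.

```r
# Hypothetical results table (values invented for illustration)
res <- data.frame(
  logFC     = c(1.2, -0.9, 0.1, 0.4, -0.05),
  adj.P.Val = c(0.001, 0.01, 0.03, 0.60, 0.90)
)

lfc_cut <- 0.263   # assumed log fold change cutoff
p_cut   <- 0.05    # assumed adjusted p-value cutoff

# y-axis value: larger means more significant
res$neg_log10_p <- -log10(res$adj.P.Val)

# A gene is called only if it passes both cutoffs
res$status <- "not significant"
res$status[res$adj.P.Val < p_cut & res$logFC >  lfc_cut] <- "up"
res$status[res$adj.P.Val < p_cut & res$logFC < -lfc_cut] <- "down"

status_cols <- c(up = "red", down = "blue", "not significant" = "grey")
plot(res$logFC, res$neg_log10_p, col = status_cols[res$status], pch = 20,
     xlab = "log fold change", ylab = "-log10(adjusted p-value)")
abline(v = c(-lfc_cut, lfc_cut), h = -log10(p_cut), lty = 2)
```

The third gene in this toy table illustrates why both axes matter: its adjusted p-value is below 0.05, but its fold change is too small to pass the cutoff.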
```{r}
# MD plot (log fold change vs mean log expression)
```
MD (mean-difference) plots are commonly used to assess the magnitude and direction of gene expression changes between groups. Each point on the plot represents a gene: its vertical position gives the log fold change and its horizontal position gives the mean log expression level for that gene.
1. Box-and-Whisker Plot: depicts the distribution of gene expression values across distinct sample groups (here, normal and cancer). The boxes show the interquartile range (IQR), while the center line inside each box marks the median expression value. The whiskers extend from the edges of the boxes and represent the variability of the data. Outliers are points that fall outside the whiskers. The figure allows us to compare the expression distributions of the two groups and spot any differences in their central tendency and spread.\
2. Expression Value Distribution Plot: shows the density of gene expression values for each sample group. The plot shows how gene expression values are spread within each group, so we can judge whether the distributions of the normal and cancer groups are similar or dissimilar. Higher density marks expression values shared by many genes, whereas lower density marks values that few genes take.\
3. UMAP Plot (Dimensionality Reduction): displays a reduced-dimensional representation of the gene expression data. The UMAP method projects the high-dimensional gene expression data onto a 2D space. Each point on the plot represents a sample, and the colors indicate whether the sample is normal or cancer. The map allows us to see how samples separate or group according to their gene expression patterns.\
4. Mean-Variance Trend Plot: depicts the relationship between mean expression and variance in the gene expression data. The plot shows how the variance of gene expression changes with the mean expression level.\