Explaining Predictive Uncertainty in Transformer-Based Text Classification