The problem with this is that NNs are, in general, overconfident in their predictions, even when they are wrong. This is a well-known problem in the AI/ML literature. Averaging an ensemble of overconfident predictors is not the same as getting an unbiased estimate of the uncertainty.
Yes, definitely. Though there have been some interesting papers recently using different methods to approximate Bayesian posteriors in NNs. One of the most recent (Mi et al., 2019; https://arxiv.org/abs/1910.04858) benchmarks a few different methods -- infer-dropout, infer-transformation, and infer-noise -- which look promising for different applications and levels of model access (black-, grey-, and white-box).
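For intuition, here's a minimal pure-Python sketch of the infer-dropout idea: keep dropout active at inference time, run many stochastic forward passes, and read the spread of the predictions as an uncertainty estimate. The toy one-layer network, its weights, and the dropout rate below are all made up for illustration; this is not the paper's implementation.

```python
import random
import statistics

def forward(x, w, drop_p=0.5, rng=random):
    """One stochastic forward pass with dropout left on at inference."""
    h = [max(0.0, wi * x) for wi in w]  # ReLU hidden layer (toy weights)
    # Inverted dropout: zero a unit with prob drop_p, rescale survivors
    h = [hi / (1 - drop_p) if rng.random() > drop_p else 0.0 for hi in h]
    return sum(h) / len(h)              # mean-pool to a scalar output

random.seed(0)
w = [0.3, -0.1, 0.8, 0.5]               # illustrative hidden weights
samples = [forward(2.0, w) for _ in range(200)]
mean = statistics.fmean(samples)        # point prediction
std = statistics.stdev(samples)         # uncertainty proxy
print(f"prediction ~ {mean:.3f} +/- {std:.3f}")
```

The std here only reflects the model's sensitivity to dropout noise, which is why (per the discussion above) it shouldn't be confused with a calibrated posterior -- it's a cheap proxy, and the benchmark paper is essentially asking how good a proxy methods like this are.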