The evaluation panel also presents the results you would get if you used two baseline prediction methods: mode (for classifications) or mean (for regressions), and random:
- Mode: always predicting the most frequent value of the dataset.
- Mean: always predicting the mean value of the test dataset.
- Random: predicting random values drawn from the test dataset.
For each of these three cases, BigML calculates the evaluation metrics separately so that you can compare them with your model's. If your model's evaluation metrics outperform the mode/mean and random baselines, its predictions are better than simply predicting the mode/mean or predicting at random.
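For illustration, here is a minimal Python sketch of that comparison (not BigML's actual code; the small test sets, the model predictions, and the use of scikit-learn metrics are assumptions made for the example). It builds mode, random, and mean baseline predictions from the test data and scores them alongside a hypothetical model's predictions:

```python
# Illustrative sketch only: comparing a model's metrics against the
# mode/mean and random baselines described above. All data are made up.
import random
from collections import Counter
from statistics import mean

from sklearn.metrics import accuracy_score, r2_score

# --- Classification: mode and random baselines --------------------------
y_test = ["yes", "yes", "no", "yes", "no", "yes"]        # hypothetical test labels
model_preds = ["yes", "yes", "no", "yes", "yes", "yes"]  # hypothetical model output

mode_value = Counter(y_test).most_common(1)[0][0]        # most frequent class
mode_preds = [mode_value] * len(y_test)                  # always predict the mode
random_preds = [random.choice(y_test) for _ in y_test]   # random values from the test set

print("model accuracy :", accuracy_score(y_test, model_preds))
print("mode accuracy  :", accuracy_score(y_test, mode_preds))
print("random accuracy:", accuracy_score(y_test, random_preds))

# --- Regression: mean baseline -------------------------------------------
y_test_reg = [3.1, 2.4, 5.0, 4.2, 3.8]                   # hypothetical targets
model_preds_reg = [3.0, 2.6, 4.7, 4.0, 3.9]              # hypothetical model output

mean_preds = [mean(y_test_reg)] * len(y_test_reg)        # always predict the mean

print("model R2:", r2_score(y_test_reg, model_preds_reg))
print("mean  R2:", r2_score(y_test_reg, mean_preds))
```

A model that is actually learning something should clearly beat these baselines; if it only matches them, it is doing no better than a trivial guess.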
Please take a look at this blog post about evaluations; it may help you understand the process. There are also a few lessons about evaluations posted as part of the 2015 Summer School event: