Overview
In this article, you will learn how DORA organizes your Learning Points into train-test clusters to train and validate a predictive model. Understanding this process helps you interpret your Prediction Accuracy results with confidence and know what the model has been tested on.
What Are Train-Test Clusters?
When you click Run Prediction in Step 4: Select Input Features, DORA automatically groups your Learning Points into spatial clusters, which are sets of geographically nearby points. These clusters are used to separate training data from testing data.
DORA applies a method called spatial cross-validation. Rather than training on some points and testing on others chosen at random, DORA holds out one cluster at a time: each cluster takes a turn as the test set while the model trains on the remaining clusters. The process repeats until every cluster has been held out once.
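The rotation can be sketched in a few lines of Python. This is only an illustration: the grid-based grouping, the function names, and the sample coordinates are assumptions made for the example and are not DORA's actual clustering method.

```python
from collections import defaultdict


def assign_grid_clusters(points, cell_size):
    """Group (x, y) points into clusters by the grid cell they fall in.

    Grid binning is a stand-in chosen for brevity; it is not DORA's method.
    """
    clusters = defaultdict(list)
    for index, (x, y) in enumerate(points):
        cell = (int(x // cell_size), int(y // cell_size))
        clusters[cell].append(index)
    return dict(clusters)


def leave_one_cluster_out(clusters):
    """Yield (held_out_cluster, train_indices, test_indices), one fold per cluster."""
    for held_out in sorted(clusters):
        test = list(clusters[held_out])
        train = [
            i
            for cell, members in sorted(clusters.items())
            if cell != held_out
            for i in members
        ]
        yield held_out, train, test


# Hypothetical Learning Points as (easting, northing) pairs in metres.
points = [(10, 12), (15, 18), (120, 30), (130, 25), (40, 140), (45, 150)]
clusters = assign_grid_clusters(points, cell_size=100)
folds = list(leave_one_cluster_out(clusters))

for held_out, train, test in folds:
    print(f"test cluster {held_out}: test={test}, train={train}")
```

Each fold tests on one cluster and trains on all the others, so every point is tested exactly once and never appears in the training set of its own fold.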
You can view train-test clusters on the Prediction Map from the 3D Layers List, on their own or overlaid with other data such as Learning Points.
Why DORA Uses This Method
Many machine learning workflows use a random split, for example 70% of points for training and 30% for testing. DORA takes a different approach because random splits are not well suited to geospatial data.
Mineralization is spatially correlated. Points that are geographically close tend to share similar geological characteristics. If training and test points are chosen at random, they often end up as neighbours. The model can then be evaluated on patterns it has effectively already seen nearby, which can produce results that look strong but do not hold up on genuinely unexplored ground.
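The effect can be illustrated with a small Python sketch that measures how far each test point sits from its nearest training point under the two kinds of split. The coordinates are synthetic and the helper name is hypothetical; nothing here reflects DORA's internal implementation.

```python
import math
import random


def nearest_train_distance(points, train, test):
    """Mean distance from each test point to its closest training point."""
    distances = [
        min(math.dist(points[t], points[r]) for r in train) for t in test
    ]
    return sum(distances) / len(distances)


# Synthetic points: two tight groups roughly 1 km apart (assumed for illustration).
rng = random.Random(0)
group_a = [(rng.uniform(0, 50), rng.uniform(0, 50)) for _ in range(20)]
group_b = [(rng.uniform(1000, 1050), rng.uniform(0, 50)) for _ in range(20)]
points = group_a + group_b

# Random 70/30 split: training and test points are mixed within both groups.
indices = list(range(len(points)))
rng.shuffle(indices)
cut = len(indices) * 7 // 10
random_gap = nearest_train_distance(points, indices[:cut], indices[cut:])

# Spatial split: hold out group B entirely and train only on group A.
spatial_gap = nearest_train_distance(
    points, list(range(20)), list(range(20, 40))
)

print(f"random split:  mean distance to nearest training point = {random_gap:.1f} m")
print(f"spatial split: mean distance to nearest training point = {spatial_gap:.1f} m")
```

Under the random split every test point has a training neighbour within its own group, a few tens of metres away at most. Under the spatial split the nearest training point is hundreds of metres away, which is closer to the situation of predicting on unexplored ground.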
By using spatial cross-validation, DORA keeps each test set geographically independent from the set used for training. This gives a more realistic measure of how the model will perform when predicting in new, unsampled areas, which is the core purpose of prospectivity mapping.
How Performance Is Measured
The performance metrics shown in the Prediction Accuracy and Performance Breakdown outputs reflect how the model performed across geographically independent test clusters. These metrics include accuracy, precision, recall, and F1 score.
Because the test clusters are spatially distinct, these metrics represent the model's ability to generalize across different parts of your Area of Interest (AOI), not just areas it has already seen.
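As a rough guide to how such metrics are derived from held-out predictions, the sketch below computes them for each test cluster and for all clusters pooled together. The labels and cluster names are made up, and pooling is only one possible way to combine folds; how DORA aggregates results across clusters is not stated here.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical held-out results: (true labels, predicted labels) per test cluster.
fold_results = {
    "cluster_1": ([1, 0, 1, 0], [1, 0, 0, 0]),
    "cluster_2": ([1, 1, 0, 0], [1, 1, 1, 0]),
    "cluster_3": ([0, 0, 1, 1], [0, 0, 1, 1]),
}

# Metrics for each held-out cluster on its own.
per_cluster = {
    name: classification_metrics(true, pred)
    for name, (true, pred) in fold_results.items()
}

# Metrics over all held-out predictions pooled together (an assumed aggregation).
pooled_true = [label for true, _ in fold_results.values() for label in true]
pooled_pred = [label for _, pred in fold_results.values() for label in pred]
overall = classification_metrics(pooled_true, pooled_pred)

for name, metrics in per_cluster.items():
    print(name, {key: round(value, 2) for key, value in metrics.items()})
print("overall", {key: round(value, 2) for key, value in overall.items()})
```

Comparing the per-cluster rows against the pooled figures shows whether a strong overall score hides weak performance in one part of the area.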
Still Have Questions?
Reach out to your dedicated DORA contact or email support@VRIFY.com for more information.

