Overview
Learning data defines which elements you're targeting and separates sample points into positive (mineralized) and negative (unmineralized) examples. These labeled points train the model to recognize patterns in your dataset and make accurate predictions.
In this step, you'll select your target elements and set thresholds to classify your learning points.
What is Learning Data?
Learning data comes from the Learning Points Shapefile that you upload as part of your project data. This file contains sample points, including attributes such as element grades and elevation values.
At a minimum, each learning point must include assay data, and each assay value must be accompanied by the following location information:
Coordinates (easting and northing)
Elevation
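As a rough illustration of the minimum requirements above, the following Python sketch checks whether each point record carries its location fields. The field names (easting, northing, elevation, Cu_pct) and the sample values are hypothetical, and this is not how DORA itself validates a shapefile.

```python
# Hypothetical attribute names; your shapefile may use different ones.
REQUIRED_FIELDS = ("easting", "northing", "elevation")


def missing_fields(point):
    """Return the required location fields that are absent or empty in a point record."""
    return [name for name in REQUIRED_FIELDS if point.get(name) is None]


# Made-up example records, for illustration only.
points = [
    {"easting": 512340.0, "northing": 5431200.0, "elevation": 312.5, "Cu_pct": 0.42},
    {"easting": 512410.0, "northing": 5431185.0, "Cu_pct": 0.08},  # no elevation
]

for index, point in enumerate(points):
    gaps = missing_fields(point)
    print(index, "ok" if not gaps else f"missing: {', '.join(gaps)}")
```

Running a check like this before upload can help catch points that lack a usable 3D location.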
Most learning points are sourced from drill assays (which are de-surveyed to place them accurately in 3D space) or in-situ surface rock assays. Precise locations are essential for the model to make accurate predictions.
From this file, DORA generates learning points that are labeled as positive or negative based on the thresholds you define:
Positive learning points represent samples that meet or exceed a chosen grade threshold.
Negative learning points fall below that threshold.
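The labeling rule above can be sketched in a few lines of Python. The grades and the 0.30 threshold are invented for illustration, and label_points is a hypothetical helper, not a DORA function.

```python
# Hypothetical sample grades (for example, Cu %); not from a real project.
grades = [0.05, 0.42, 1.10, 0.30, 0.12, 2.75]


def label_points(values, threshold):
    """Label a value "positive" if it meets or exceeds the threshold, else "negative"."""
    return ["positive" if value >= threshold else "negative" for value in values]


labels = label_points(grades, threshold=0.30)
print(labels)
```

Note that a sample exactly at the threshold (0.30 here) counts as positive, matching the "meet or exceed" wording.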
In early-stage exploration projects where assay data may be limited, you can optionally include synthetic learning points based on publicly available geological information to guide model learning.
Why This Step Matters
Geoscience Perspective
Separating mineralized from unmineralized data mirrors the way geoscientists interpret anomalies. It ensures the model is trained on examples that reflect real-world exploration targets, leading to more reliable predictions.
AI Perspective
From a data science perspective, learning data forms the foundation of a two-class classification model, where each sample is labeled as either positive or negative. Clear thresholds help the model distinguish meaningful patterns from irrelevant data.
A workable class balance is recommended, typically around 10–20% positively labeled learning points with the rest negative. This proportion helps limit model bias and supports both training accuracy and predictive performance.
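To make the 10–20% guideline concrete, here is a minimal Python sketch that computes the share of positive labels and compares it with that range. The helper names and the example label counts are assumptions made for illustration.

```python
def positive_ratio(labels):
    """Return the fraction of labels equal to "positive" (0.0 for an empty list)."""
    if not labels:
        return 0.0
    return sum(1 for label in labels if label == "positive") / len(labels)


def in_recommended_range(ratio, low=0.10, high=0.20):
    """Check a ratio against the 10-20% guideline described in the text."""
    return low <= ratio <= high


# Made-up example: 3 positive and 17 negative learning points.
labels = ["positive"] * 3 + ["negative"] * 17
ratio = positive_ratio(labels)
print(f"{ratio:.0%} positive, within guideline: {in_recommended_range(ratio)}")
```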
💡 Learning Data is the foundation for the Confusion Matrix, one of three output graphs generated by DORA alongside the VRIFY Prospectivity Score (VPS). The Confusion Matrix evaluates how well your labeled examples helped the model distinguish between mineralized and unmineralized zones.
Learn more in Prediction Accuracy (Confusion Matrix).
Step-by-Step Instructions
Open Set Up Learning Data
From the experiment setup panel, click Step 3: Set up Learning Data.
Select Learning Points File
Select the Learning Points file from the dropdown. If only one file exists, it is used automatically and this option is not available.
Select Elevation Field
Configure Learning Data Filters
Select a Target Element from the dropdown.
Set the Greater Than/Less Than toggle depending on how the element should be classified.
Enter a threshold grade. With Greater Than selected, samples meeting or exceeding this grade are labeled positive, and all others are labeled negative.
The minimum and maximum values for your element are shown below the input.
Note: The Learning Data Breakdown table will only appear after you click Generate Learning Data. If unsure where to start, try any value within the allowed range and adjust based on the real-time results after completing step 5 below.
(Optional) Add More Target Elements
Click Add to include additional elements and set individual thresholds. If using multiple elements, choose how they should be evaluated.
AND: All thresholds must be met to classify a point as positive. Use this when targeting a specific combination of elements.
OR: Any threshold met will classify a point as positive. Use this for broader targeting across multiple elements.
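The difference between AND and OR can be sketched as follows. This is an illustrative Python example only: the element names, thresholds, and the classify helper are hypothetical, and every element is assumed to use a "greater than or equal" rule.

```python
def classify(sample, thresholds, combine="AND"):
    """Label a sample by checking each element grade against its threshold.

    AND requires every threshold to be met; OR requires at least one.
    """
    if combine not in ("AND", "OR"):
        raise ValueError("combine must be 'AND' or 'OR'")
    checks = [sample[element] >= cutoff for element, cutoff in thresholds.items()]
    passed = all(checks) if combine == "AND" else any(checks)
    return "positive" if passed else "negative"


# Hypothetical thresholds and one made-up sample.
thresholds = {"Cu_pct": 0.30, "Au_gpt": 0.50}
sample = {"Cu_pct": 0.45, "Au_gpt": 0.20}

print(classify(sample, thresholds, "AND"))  # copper passes, gold fails
print(classify(sample, thresholds, "OR"))   # copper alone is enough
```

In this example the same sample is negative under AND and positive under OR, which is why OR tends to produce more positive learning points.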
Review Learning Data Breakdown
Complete Step
Click Proceed to save and continue.
Tips & Considerations
Learning Data Breakdown Tips
The proportion of positive to negative learning points significantly affects how well DORA can learn from your data:
A positive ratio below 10% gives the model too few examples to learn from, increasing the risk of skewed or unreliable predictions.
Too few positives can distort clustering, leading to target areas with no positive samples at all.
Setting a very high threshold (e.g., top 2% copper grades) narrows focus but limits variation, reducing the model’s ability to generalize.
Aim for a 10–20% positive ratio to balance grade quality with enough examples for effective learning.
If your learning points are tightly clustered in one area, you may need a higher proportion of positives to teach the model local mineralization patterns. This will become apparent when you reach Step 6: Build Predictive Model.
Expanding Coverage with Regional Examples
To capture broader targets, consider adding learning points near known deposits or documented mineral occurrences, especially if they align with the mineralization style you’re exploring.
In some cases, synthetic learning points (positive or negative) can help train the model using relevant external signals when data is limited.
Final Trade-off Reminder
High thresholds focus the model on elite mineralization, but may result in too few positives to learn from. Lower thresholds give you more positives, but may include lower-quality samples.
Think strategically: what is the geologically meaningful target, and how many examples do you need to represent it well?
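One way to explore this trade-off before committing to a value is to try several thresholds and see how the positive share changes. The Python sketch below does this for a small set of invented grades. It is an illustration of the idea, not a reproduction of the Learning Data Breakdown table.

```python
# Hypothetical grades for one target element; not from a real dataset.
grades = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.25, 0.40, 0.90, 2.10]


def ratio_at(values, threshold):
    """Return the fraction of values that meet or exceed the threshold."""
    return sum(1 for value in values if value >= threshold) / len(values)


# Higher thresholds leave fewer positive examples to learn from.
for threshold in (0.10, 0.40, 2.00):
    print(f"threshold {threshold:.2f}: {ratio_at(grades, threshold):.0%} positive")
```

With these made-up values, raising the threshold from 0.10 to 2.00 drops the positive share from 70% to 10%.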
Learn More
Previous Steps:
Next Steps:
Still Have Questions?
Reach out to your dedicated DORA contact or email support@VRIFY.com for more information.







