Overview
The beauty of AI lies in machines finding patterns and making predictions within a massive data set, work that would otherwise take many people a long time, if they managed it at all.
In VRIFY AI’s context (supervised machine learning), we need to guide the model toward these patterns and predictions without being overly or insufficiently prescriptive. If we are too prescriptive, the model will miss true predictions that exist in reality (overfitting). If we are not prescriptive enough, the model will produce too many inaccurate predictions (underfitting).
When building AI models, especially in tasks like predicting mineral-rich sites, it's important to strike a balance between capturing the right patterns in the data and ensuring the model performs well on your project-specific data.
Overfitting and underfitting are two common problems that occur when this balance is off.
What is Overfitting?
Overfitting occurs when a model becomes too reliant on the training data, learning not only the real patterns but also the noise, or irrelevant data. This can happen when:
The training data set is too small and does not contain enough samples to accurately represent the geological conditions in the field.
The training data contains large amounts of irrelevant information (noisy data).
Overfitting can cause the model to find only mineralization patterns that resemble its supervised learning examples. The model may then fail to identify other patterns that exist in reality, which defeats the purpose and benefit of AI.
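While an expected accuracy of 99% looks compelling, it can be a sign that the model is overfitted. The clearer signal is a model that scores much higher on its training data than on data it has not seen. Typically, an ideal accuracy would be 90-95%, which indicates a model that is highly accurate but not overfitted.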
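The warning sign described here can be reproduced on a small scale. The sketch below is only an illustration under stated assumptions, not how VRIFY AI models are built: it uses made-up two-dimensional samples, a made-up labeling rule with 20% of labels flipped as noise, and a 1-nearest-neighbor classifier chosen because it memorizes its training data. All function and variable names are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest_neighbor_predict(train_points, train_labels, point):
    """Return the label of the single closest training sample (1-nearest-neighbor)."""
    best_index = min(
        range(len(train_points)),
        key=lambda i: squared_distance(train_points[i], point),
    )
    return train_labels[best_index]


def accuracy(train_points, train_labels, points, labels):
    """Fraction of points whose predicted label matches the given label."""
    correct = sum(
        nearest_neighbor_predict(train_points, train_labels, p) == y
        for p, y in zip(points, labels)
    )
    return correct / len(points)


def make_samples(n, rng, noise_rate=0.2):
    """Hypothetical data: the true label is 1 when x + y > 1, and a share of labels is flipped."""
    points, labels = [], []
    for _ in range(n):
        p = (rng.random(), rng.random())
        label = 1 if p[0] + p[1] > 1 else 0
        if rng.random() < noise_rate:
            label = 1 - label  # simulate noisy, unreliable labels
        points.append(p)
        labels.append(label)
    return points, labels


rng = random.Random(0)
train_points, train_labels = make_samples(200, rng)
test_points, test_labels = make_samples(200, rng)

# Each training sample is its own nearest neighbor, so the model reproduces
# every training label, including the flipped ones.
train_acc = accuracy(train_points, train_labels, train_points, train_labels)
test_acc = accuracy(train_points, train_labels, test_points, test_labels)

print(f"training accuracy: {train_acc:.2f}")
print(f"held-out accuracy: {test_acc:.2f}")
```

A near-perfect training score paired with a clearly lower held-out score is the pattern to watch for: the model has learned the noise along with the real signal.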
What is Underfitting?
Underfitting occurs when a model is too simple to capture the key relationships within the geological data. As a result, the model may perform poorly on both its training data and the specific site data you input. Instead of making predictions based on actual geological patterns, the model might produce results that resemble random guesses. This can lead to predictions that are unreliable and inaccurate.
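As a rough illustration of underfitting, the sketch below fits a straight line to made-up data whose response depends on the square of the input. Because the model is too simple for that relationship, it scores poorly on both the data it was fitted to and on new data. The data, the function names, and the choice of R-squared as the score are assumptions made for this example only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """1.0 is a perfect fit; 0 or below is no better than always predicting the mean."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: the response is the square of the input.
train_x = [float(x) for x in range(-5, 6)]
test_x = [x + 0.5 for x in range(-5, 5)]
train_y = [x ** 2 for x in train_x]
test_y = [x ** 2 for x in test_x]

# Too simple: a straight line on the raw input cannot follow the curve.
slope, intercept = fit_line(train_x, train_y)
simple_train = r_squared(train_x, train_y, slope, intercept)
simple_test = r_squared(test_x, test_y, slope, intercept)

# Same kind of model, but given a more relevant feature (the squared input).
train_f = [x ** 2 for x in train_x]
test_f = [x ** 2 for x in test_x]
slope_f, intercept_f = fit_line(train_f, train_y)
better_train = r_squared(train_f, train_y, slope_f, intercept_f)
better_test = r_squared(test_f, test_y, slope_f, intercept_f)

print(f"raw input:     train R^2 = {simple_train:.3f}, test R^2 = {simple_test:.3f}")
print(f"squared input: train R^2 = {better_train:.3f}, test R^2 = {better_test:.3f}")
```

The straight line on the raw input does no better than predicting the average everywhere. The same simple model performs well once it is given a feature that carries the real relationship.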
Avoiding Overfitting and Underfitting
Balancing the complexity of your model is crucial to achieving good performance. Here are some tips:
Avoid Overfitting: Use techniques like cross-validation, simplify the model architecture, and add regularization to prevent the model from learning noise.
Avoid Underfitting: Ensure the model is complex enough to capture the underlying patterns by adding more relevant features or adjusting model parameters.
Achieving the right balance helps create models that are both accurate and generalizable to new data.
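Two of the techniques named above, cross-validation and regularization, can be sketched together in a few lines. The example below is a minimal illustration under stated assumptions, not a description of VRIFY AI's pipeline: it uses made-up one-dimensional data, a simple ridge (L2) penalty on the slope of a line, and 5-fold cross-validation to compare a hypothetical list of penalty strengths.

```python
import random


def fit_ridge_line(xs, ys, penalty):
    """Least-squares line with an L2 penalty on the slope (1-D ridge regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / (sxx + penalty)  # a larger penalty shrinks the slope toward 0
    return slope, mean_y - slope * mean_x


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared difference between predictions and observed values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validated_error(xs, ys, penalty, k=5):
    """Average held-out error over k folds; every sample is held out exactly once."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        val_x = [xs[i] for i in sorted(held_out)]
        val_y = [ys[i] for i in sorted(held_out)]
        slope, intercept = fit_ridge_line(train_x, train_y, penalty)
        fold_errors.append(mean_squared_error(val_x, val_y, slope, intercept))
    return sum(fold_errors) / k


# Hypothetical data: a linear trend plus random noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1.0) for x in xs]

# Hypothetical candidate strengths, from no regularization to very heavy.
candidate_penalties = [0.0, 1.0, 10.0, 100.0, 1000.0]
scores = {p: cross_validated_error(xs, ys, p) for p in candidate_penalties}
best_penalty = min(scores, key=scores.get)

for p in candidate_penalties:
    print(f"penalty={p:>7}: cross-validated MSE = {scores[p]:.3f}")
print("selected penalty:", best_penalty)
```

Scoring each setting only on samples that were held out of fitting gives an estimate of how the model behaves on new data. In this sketch, a penalty that is far too strong flattens the line and the held-out error rises, which is the underfitting side of the same balance.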
Still have questions?
Reach out to your dedicated VRIFY AI Contact or email Support@VRIFY.com for more information.