Welcome to the CSC Q&A, on our server named in honor of Ada Lovelace. Write great code! Get help and give help!

+14 votes

So if a model has 100% training accuracy but only 70% testing accuracy, can we still use that model to make predictions? It is overfit, but the testing accuracy is still roughly 70% (which is not too bad).

asked in DATA360_Spring2019 by (1 point)

1 Answer

+7 votes
 
Best answer

Yes, it might be fine to use an overfit model like that -- but you can only claim/expect that it is 70% accurate, not 100%.

However, you should probably work to reduce the overfitting: refine your model, get more data, or remove features that are not useful to the prediction task. You will likely be able to do BETTER on the test set (e.g. maybe 75%) if your algorithm isn't focusing on irrelevant details that happen to work in the training set but won't work in general.
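To make the training vs. testing gap concrete, here is a minimal sketch in plain Python. The `MemorizingClassifier` class and the tiny dataset are made up for illustration (they are not from the course or any library); the class simply memorizes its training rows, which is an extreme stand-in for an overfit model.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


class MemorizingClassifier:
    """Hypothetical toy model: memorizes training rows exactly."""

    def fit(self, X, y):
        # Remember the label of every training row.
        self.table = {tuple(row): label for row, label in zip(X, y)}
        # For rows never seen before, fall back to the most common label.
        self.default = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.table.get(tuple(row), self.default) for row in X]


# Made-up data: the test rows do not appear in the training set.
X_train = [(1, 0), (2, 1), (3, 0), (4, 1), (5, 0)]
y_train = [0, 1, 0, 1, 0]
X_test = [(6, 1), (7, 0), (8, 1), (9, 0)]
y_test = [1, 0, 1, 0]

model = MemorizingClassifier().fit(X_train, y_train)
train_acc = accuracy(y_train, model.predict(X_train))
test_acc = accuracy(y_test, model.predict(X_test))

print(f"training accuracy: {train_acc:.2f}")
print(f"testing accuracy:  {test_acc:.2f}")
print(f"gap:               {train_acc - test_acc:.2f}")
```

The testing accuracy is the number you can reasonably claim for new data. A large gap between the two is the usual sign that the model has latched onto details specific to the training set.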

Now, whether 70% is good or bad is entirely context-dependent. If you're trying to predict which one out of 100 stocks will perform the best, 70% accuracy would be amazing, since picking one at random would be right only about 1% of the time.

However, if you're trying to predict whether an Augustana student is going to drop out of college during their senior year, then 70% might be awful, because you should be able to get well over 90% by always guessing "no".
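One quick way to check this for your own data is to compute the accuracy of always guessing the most common label and compare your model against it. The sketch below uses made-up labels (95 "no" and 5 "yes"); the real dropout rate is an assumption here, not a figure from actual data, and `majority_baseline_accuracy` is just a helper name chosen for this example.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy you would get by always predicting the most common label."""
    counts = Counter(labels)
    _, most_common_count = counts.most_common(1)[0]
    return most_common_count / len(labels)


# Hypothetical labels: assume 5 out of 100 seniors drop out.
labels = ["no"] * 95 + ["yes"] * 5

baseline = majority_baseline_accuracy(labels)
model_accuracy = 0.70  # the testing accuracy from the question

beats_baseline = model_accuracy > baseline
print(f"baseline accuracy: {baseline:.2f}")
print(f"model accuracy:    {model_accuracy:.2f}")
print(f"model beats the always-'no' baseline: {beats_baseline}")
```

If the model cannot beat this baseline, its accuracy number is not telling you much, however respectable it looks on its own.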

In general, accuracy is a simplified measure of machine learning performance. You may care more about which way your algorithm makes mistakes -- is it worse for your algorithm to predict that someone has cancer when they don't, or predict that they don't have cancer when they do?
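As a rough illustration of looking at which way the mistakes go, the sketch below counts the four kinds of outcomes for a binary classifier and derives precision and recall from them. The labels and predictions are made up (1 means "has the condition"), and `confusion_counts` is a helper name chosen for this example rather than a library function.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (true pos, false pos, false neg, true neg) counts."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # predicted the condition, but it is absent
        elif truth == positive:
            fn += 1  # missed a real case
        else:
            tn += 1
    return tp, fp, fn, tn


# Made-up example: 3 real positive cases out of 10.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)  # of the predicted positives, how many were real
recall = tp / (tp + fn)     # of the real positives, how many were found

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case the accuracy is 70%, yet the model misses two of the three real positive cases. That kind of mistake is hidden by the accuracy number alone.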

For a more detailed description of ways to measure machine learning performance, see: https://medium.com/thalus-ai/performance-metrics-for-classification-problems-in-machine-learning-part-i-b085d432082b

answered by (508 points)
...