Machine learning classification and prediction in Weka
I am very new to machine learning. Sorry if there are any mistakes in my English. I am using Weka's J48 classifier to predict a true/false class. I have a training set of almost 999K instances, with 11 attributes that are a mix of nominal and numeric fields. I trained the model using 3-fold cross-validation, which gives an accuracy of ~84%. After storing the model, I tested it on a 50k dataset, and the results are very bad: about 50% of the predictions are mismatches. I don't know why this is happening. I have two questions: how can I train the model so that it performs better on the test set, and what could the possible issues be? I am using the Weka API in Java.
This suggests that your model is overfit to your 999k training set and doesn't generalize well to your 50k test set. You should look into cross-validating with a good portion (but not all) of your 50k dataset in addition to your 999k, keeping the remainder held out for the final test. You may also want to try k-fold cross-validation with something higher than k=3, because 3 folds may be too "coarse": each model is trained on only two thirds of the data. Good luck!
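To make the effect of k concrete, here is a minimal plain-Java sketch of how k-fold cross-validation partitions the data. It is only an illustration: `KFoldSketch`, `assignFolds`, and `trainSize` are hypothetical names, not part of the Weka API, and the instance count of 999,000 is an assumed round figure for "almost 999K". In Weka itself, `Evaluation.crossValidateModel` does the splitting for you, so you would not normally write this by hand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of the Weka API.
class KFoldSketch {

    // Shuffles the instance indices and returns, for each instance,
    // the fold (0 .. k-1) in which it is held out for validation.
    static int[] assignFolds(int n, int k, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        Collections.shuffle(order, new Random(seed));

        int[] fold = new int[n];
        for (int pos = 0; pos < n; pos++) {
            // Round-robin over the shuffled order keeps fold sizes within 1 of each other.
            fold[order.get(pos)] = pos % k;
        }
        return fold;
    }

    // Number of instances used for training when the given fold is held out.
    static int trainSize(int[] fold, int heldOut) {
        int count = 0;
        for (int f : fold) {
            if (f != heldOut) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 999_000; // assumed size, see the note above
        for (int k : new int[] {3, 10}) {
            int[] fold = assignFolds(n, k, 1L);
            int train = trainSize(fold, 0);
            System.out.println("k=" + k + ": train on " + train
                    + ", validate on " + (n - train));
        }
    }
}
```

With these assumed numbers, k=3 trains each model on 666,000 instances and validates on 333,000, while k=10 trains on 899,100 and validates on 99,900. A larger k means each model sees more of the data and the accuracy estimate is averaged over more splits, at the cost of more training runs.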