### classification

#### Suggested unsupervised feature selection / extraction method for 2 class classification?

I've got a set of F features, e.g. Lab color space and entropy. Concatenating all features gives a feature vector of dimension d (between 12 and 50, depending on which features are selected). I usually collect between 1000 and 5000 new samples, denoted x. A Gaussian Mixture Model is then trained on these vectors, but I don't know which class each feature vector comes from. What I do know is that there are only 2 classes. From the GMM prediction I get the probability of a feature vector belonging to class 1 or class 2.

My question: how do I obtain the best subset of features (for instance, only entropy and normalized RGB) that gives the best classification accuracy? I assume this is achieved if the feature subset increases class separability. Could I use Fisher's linear discriminant analysis, since I already have the means and covariance matrices from the GMM? But wouldn't I then have to calculate the score for every combination of features? It would be nice to get some help on whether this is an unrewarding approach and I'm on the wrong track, and/or any other suggestions.
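For reference, the setup described above can be sketched with scikit-learn's `GaussianMixture`. The data here is synthetic stand-in data (the real feature vectors and dimensions are assumptions for illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical stand-in for the real feature vectors: 1000 samples, d = 12.
X = np.vstack([rng.normal(0.0, 1.0, size=(500, 12)),
               rng.normal(3.0, 1.0, size=(500, 12))])

# Two mixture components, one per assumed class, trained without labels.
gmm = GaussianMixture(n_components=2, covariance_type="full",
                      random_state=0).fit(X)

# Posterior probability of each sample under component 1 or 2.
probs = gmm.predict_proba(X)   # shape (1000, 2), each row sums to 1
labels = gmm.predict(X)        # hard 0/1 assignment per sample
```

The component means and covariances mentioned in the question are then available as `gmm.means_` and `gmm.covariances_`.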

One way of finding "informative" features is to choose the subset that maximises the log-likelihood; you could evaluate this with cross-validation. See https://www.cs.cmu.edu/~kdeng/thesis/feature.pdf

Another idea is to use an unsupervised algorithm that selects features automatically, such as a clustering forest: http://research.microsoft.com/pubs/155552/decisionForests_MSR_TR_2011_114.pdf In that case the clustering algorithm splits the data based on information gain.

Note that Fisher LDA will not select features; it projects your original data into a lower-dimensional subspace. If you are looking into subspace methods, another interesting approach is spectral clustering, which also operates in a subspace, or unsupervised neural networks such as autoencoders.

Hope that helps.
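The first suggestion (pick the feature subset whose GMM achieves the highest held-out log-likelihood) can be sketched as below. The feature groups and column indices are hypothetical; substitute your own mapping of named features to columns:

```python
import itertools
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Synthetic stand-in data: 800 samples, 6 columns.
X = np.vstack([rng.normal(0.0, 1.0, size=(400, 6)),
               rng.normal(4.0, 1.0, size=(400, 6))])

# Hypothetical grouping: which columns belong to each named feature.
feature_groups = {"Lab": [0, 1, 2], "entropy": [3], "norm_rgb": [4, 5]}

X_train, X_val = train_test_split(X, test_size=0.3, random_state=0)

best_subset, best_ll = None, -np.inf
names = list(feature_groups)
for r in range(1, len(names) + 1):
    for subset in itertools.combinations(names, r):
        cols = [c for name in subset for c in feature_groups[name]]
        gmm = GaussianMixture(n_components=2,
                              random_state=0).fit(X_train[:, cols])
        ll = gmm.score(X_val[:, cols])  # mean held-out log-likelihood
        if ll > best_ll:
            best_subset, best_ll = subset, ll

print(best_subset, best_ll)
```

One caveat: log-likelihoods of models over different numbers of dimensions are not strictly comparable, so treat this as a heuristic (or normalise per dimension / use a criterion like BIC) rather than an exact score.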

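The spectral clustering alternative mentioned in the answer is also a one-liner in scikit-learn; it embeds the sample affinity graph into a low-dimensional eigenvector subspace and clusters there. Again, the data here is a synthetic stand-in:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(2)
# Synthetic stand-in data: 400 samples, d = 12.
X = np.vstack([rng.normal(0.0, 0.5, size=(200, 12)),
               rng.normal(3.0, 0.5, size=(200, 12))])

# Builds an RBF affinity graph, embeds it spectrally, then clusters
# the embedding (k-means by default).
sc = SpectralClustering(n_clusters=2, affinity="rbf", random_state=0)
labels = sc.fit_predict(X)   # 0/1 cluster assignment per sample
```

Unlike the GMM, this gives only hard cluster assignments, not class posteriors, so it suits exploration of the subspace structure rather than probabilistic classification.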
### Related Links

Classification results interpretation (TFlearn, Keras)

discretization for feature selection in weka

ROC result interpretation

Classification using Mallet and MaxEntropy

Measuring Error Correlation of Classifiers

caffe: Confused about regression

How to cut a dendrogram in r

Building weka classifier

Does Orange data mining software has multi-layer perceptron classification?

User Classification in RapidMiner - output should be the user based on a fed test data

Error in building mean image file(Caffe)

caffe: probability distribution for regression / expanding classification (softmax layer) to allow 3D output

Does MLE produce a generative or discriminative classifier?

Basic Hidden Markov Model, Viterbi algorithm

Where do I write the code for LIBSVM?

How to understand the output of ADTree classification in WEKA