How to normalize close range data?
I am using logistic regression with several features. Each feature is produced by a function whose output can range from 0 to 1, but in both my training and test data the maximum observed value is very low (e.g. 0.11), so all values are small and close to each other. What is the best standard way to normalize/transform the feature values to a normal scale (between 0 and 1) so that the logistic regression isn't affected by inappropriate values? Any help would be highly appreciated.
There are different methods for feature scaling/normalization. If you just want the feature values to be in the range [0..1], apply min-max scaling to each feature: subtract the feature's minimum value and divide by the difference between its maximum and minimum. Some tutorials recommend scaling features into the range [-0.5 .. 0.5] instead. I prefer to scale features by their standard deviation, as explained in the Stanford lectures (see the chapter "Preprocessing your data").
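The three options mentioned above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names are made up for this example, the [-0.5 .. 0.5] variant is implemented as min-max scaling shifted by 0.5 (some tutorials subtract the mean instead), and the standard-deviation variant also subtracts the mean first, which is the common form of standardization.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: there is no spread to rescale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def centered_scale(values):
    """Assumed variant: min-max scaling shifted into [-0.5, 0.5].

    A constant feature maps to -0.5 here, because min_max_scale returns zeros.
    """
    return [s - 0.5 for s in min_max_scale(values)]


def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Constant feature: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature whose raw values never exceed 0.11, as in the question.
feature = [0.02, 0.05, 0.08, 0.11]

print(min_max_scale(feature))
print(centered_scale(feature))
print(standardize(feature))
```

In practice the minimum, maximum, mean and standard deviation would normally be computed on the training data only and then reused to transform the test data, so that both sets are scaled the same way.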