Mahout currently has
* Collaborative Filtering
* User and Item based recommenders
* K-Means, Fuzzy K-Means clustering
* Mean Shift clustering
* Dirichlet process clustering
* Latent Dirichlet Allocation
* Singular value decomposition
* Parallel Frequent Pattern mining
* Complementary Naive Bayes classifier
* Random forest decision tree based classifier
* High performance Java collections (previously Colt collections)
* A vibrant community
* and much more to come by this summer, thanks to Google Summer of Code

LIBLINEAR is a linear classifier for data with millions of instances and features. It supports L2-regularized logistic regression (LR), L2-loss linear SVM, and L1-loss linear SVM.
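
To make the three supported formulations concrete, here is a minimal sketch of the primal objectives as they are usually written: an L2 regularizer `0.5 * w.w` plus `C` times a per-instance loss (logistic, hinge, or squared hinge). The function names, the `loss` keyword values, and the dense list representation are assumptions made for this illustration; they are not part of LIBLINEAR's interface, and this is not how the library solves the problem internally.

```python
import math


def dot(a, b):
    """Dense dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def objective(w, X, y, C=1.0, loss="lr"):
    """Evaluate an L2-regularized linear objective (illustrative only).

    w    -- weight vector
    X    -- list of dense feature vectors
    y    -- list of labels in {-1, +1}
    C    -- penalty parameter weighting the loss term
    loss -- hypothetical selector: "lr", "l1_svm", or "l2_svm"
    """
    regularizer = 0.5 * dot(w, w)
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * dot(w, xi)
        if loss == "lr":
            # Logistic loss: log(1 + exp(-y * w.x))
            total += math.log1p(math.exp(-margin))
        elif loss == "l1_svm":
            # L1-loss (hinge): max(0, 1 - y * w.x)
            total += max(0.0, 1.0 - margin)
        elif loss == "l2_svm":
            # L2-loss (squared hinge): max(0, 1 - y * w.x)^2
            total += max(0.0, 1.0 - margin) ** 2
        else:
            raise ValueError("unknown loss: " + loss)
    return regularizer + C * total
```

The three variants differ only in the per-instance loss term, so comparing them on the same `w` shows how each one penalizes instances inside or beyond the margin.
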
Main features of LIBLINEAR include:
* Same data format as LIBSVM, our general-purpose SVM solver, and also similar usage
* Multi-class classification: 1) one-vs-the-rest, 2) Crammer & Singer
* Cross validation for model selection
* Probability estimates (logistic regression only)
* Weights for unbalanced data
* MATLAB/Octave, Java interfaces
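
As a small illustration of the shared data format mentioned in the list above, the sketch below parses one line of the sparse text format used by LIBSVM, where each line is a label followed by `index:value` pairs. The helper name `parse_libsvm_line` is hypothetical and only for this example; the packages ship their own readers.

```python
def parse_libsvm_line(line):
    """Parse one 'label index:value index:value ...' line.

    Returns (label, features), where features maps the integer
    feature index to its float value. Indices that do not appear
    on the line are implicitly zero (sparse representation).
    """
    parts = line.split()
    if not parts:
        raise ValueError("empty line")
    label = float(parts[0])
    features = {}
    for token in parts[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, features


# Example: a positive instance with two non-zero features.
label, features = parse_libsvm_line("+1 1:0.5 3:2")
```

Keeping only the non-zero entries is what makes the format practical for data with millions of features.
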