**Polynomial Cascades for Data Mining** - Classification and Regression for large, high dimensional datasets

Polynomial Cascades were originally developed by Greg Grudic et al. for high-dimensional regression problems. An extension for classification, Polynomial MPMC Cascades, was developed by Sander Bohte, Greg Grudic and myself. On this page I list some of the features of the regression and the classification methods.

## Features (Regression)

- **Fast** - can handle large, high-dimensional problems.
- **Accurate models** - minimizes mean-squared error.
- Can fit non-linear functions without the use of kernels.
- **No parameter tuning required** - can be run by non-experts.
- **Interpretable model** - the learned hypothesis is one high-degree, multivariate polynomial. The hypothesis is **not** an ensemble of classifiers.
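To make the cascade idea concrete, here is a minimal sketch of greedy polynomial-cascade regression. It is an illustration of the general scheme, not Grudic's exact algorithm: each level picks the feature most correlated with the current residual (an assumed selection rule) and composes the running prediction with that feature through a degree-2 bivariate polynomial fitted by least squares, so the final hypothesis is one high-degree polynomial in the inputs rather than an ensemble.

```python
import numpy as np

def _basis(h, x):
    """Degree-2 bivariate basis in (current prediction h, feature x)."""
    return np.column_stack([np.ones(len(h)), h, x, h * h, h * x, x * x])

def fit_cascade(X, y, levels=10, eps=1e-12):
    """Illustrative polynomial-cascade regression (assumed simplification)."""
    n, d = X.shape
    h = np.full(n, y.mean())          # level-0 hypothesis: the constant mean
    model = [y.mean()]                # store the constant, then (feature, weights) per level
    for _ in range(levels):
        r = y - h                     # current residual
        if np.mean(r * r) < eps:      # already fitted (numerically) exactly
            break
        # feature selection: largest |correlation| with the residual
        rc = r - r.mean()
        Xc = X - X.mean(axis=0)
        score = np.abs(Xc.T @ rc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(rc) + eps)
        j = int(np.argmax(score))
        # compose: fit y with a degree-2 polynomial in (h, x_j)
        B = _basis(h, X[:, j])
        w, *_ = np.linalg.lstsq(B, y, rcond=None)
        h = B @ w
        model.append((j, w))
    return model, h

def predict(model, X):
    """Evaluate the composed polynomial level by level."""
    h = np.full(len(X), model[0])
    for j, w in model[1:]:
        h = _basis(h, X[:, j]) @ w
    return h
```

Because each level re-expands the previous output, ten levels of degree-2 compositions already encode a polynomial of degree up to 2^10 in the original inputs, which is why no explicit kernel is needed for non-linear fits.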

## Features (Classification)

- **Fast** - runtime using the features only is **O(L x N x d)**, with L being the number of levels (data-dependent, but usually < 400), N the number of training examples and d the dimensionality of the problem. The method scales to very large problems with millions of examples (read: millions of rows in terms of SQL databases).
- **No parameter tuning required** - you can just run the algorithm and it will give you a good model. This means no cross-validation is needed to determine a suitable kernel and kernel parameters.
- Can handle **non-linearly separable** classification problems without fine-tuning of parameters.
- **Competitive performance** on benchmarks with the Minimax Probability Machine for Classification (MPMC) and Support Vector Machines (SVM).
- **Incremental building of the model** - you can stop the learning at any point in time and have a usable model. This can be useful if you have time constraints and a classifier must be learned in real time.
- **Bounded misclassification error** - the PMC generates a worst-case bound on the misclassification error for future, unseen data.
- **Minimal underlying assumptions** - no specific distributions are assumed.
- **Interpretable model** - the learned hypothesis is one high-degree, multivariate polynomial. The hypothesis is **not** an ensemble of classifiers.
- **Feature selection** - one feature per level is used, chosen by a theoretically well-founded metric.
- **Kernels** - the algorithm can be easily extended with kernels.
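The per-level structure behind the O(L x N x d) runtime and the distribution-free error bound can be sketched as follows. This is an assumed simplification, not the published PMC: each level fits its degree-2 polynomial by least squares to the +/-1 labels instead of solving the exact MPMC program, and feature selection picks the feature whose resulting score minimizes the MPMC-style worst-case bound 1/(1 + m^2), where m is the margin between class means in units of the summed class standard deviations.

```python
import numpy as np

def mpmc_bound(scores, y):
    """Distribution-free worst-case error bound 1/(1 + m^2) from MPMC theory,
    evaluated on a 1-D score: m = |mu_plus - mu_minus| / (sigma_plus + sigma_minus)."""
    s1, s0 = scores[y == 1], scores[y == -1]
    m = abs(s1.mean() - s0.mean()) / (s1.std() + s0.std() + 1e-12)
    return 1.0 / (1.0 + m * m)

def fit_pmc_sketch(X, y, levels=5):
    """Illustrative cascade classifier; returns the model, the signed
    predictions, and the final worst-case bound. One feature per level,
    chosen over all d candidates, giving the O(L x N x d) shape."""
    n, d = X.shape
    h = np.zeros(n)
    model, bound = [], 1.0
    for _ in range(levels):
        best = None
        for j in range(d):                      # O(d) candidates per level
            B = np.column_stack([np.ones(n), h, X[:, j],
                                 h * h, h * X[:, j], X[:, j] ** 2])
            w, *_ = np.linalg.lstsq(B, y.astype(float), rcond=None)
            b = mpmc_bound(B @ w, y)
            if best is None or b < best[0]:
                best = (b, j, w, B @ w)
        bound, j, w, h = best
        model.append((j, w))
    return model, np.sign(h), bound
```

Note how stopping the loop early still leaves a usable `(model, predictions, bound)` triple, which is the incremental-building property, and how `bound` is a worst-case guarantee that assumes only the class means and covariances, not any specific distribution.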

## Future Work

Possible extensions:

- Handling multiple classes - currently the PMC only handles binary classification problems.
- Handling missing values by not accounting for them during the computation of mean and covariance.
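The missing-value idea above amounts to pairwise deletion when estimating the class statistics. A minimal sketch, assuming NaN marks a missing entry: each mean uses only the observed values of its feature, and each covariance entry uses only the rows where both features are observed.

```python
import numpy as np

def masked_mean_cov(X):
    """Mean vector and (biased) covariance matrix that skip NaN entries.
    Pairwise deletion: cov[i, k] is averaged over rows where both
    feature i and feature k are observed."""
    d = X.shape[1]
    obs = ~np.isnan(X)
    mu = np.array([X[obs[:, j], j].mean() for j in range(d)])
    cov = np.empty((d, d))
    for i in range(d):
        for k in range(d):
            both = obs[:, i] & obs[:, k]
            cov[i, k] = np.mean((X[both, i] - mu[i]) * (X[both, k] - mu[k]))
    return mu, cov
```

One caveat of pairwise deletion is that the resulting matrix is not guaranteed to be positive semi-definite, since different entries are estimated from different subsets of rows.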

## Code

You can download Matlab code for the classification cascade from here: Download Polynomial MPMC Cascades source code.

## Papers

- **Linear Time Nonparametric Classification and Feature Selection with Polynomial MPMC Cascades for large datasets** - available as Technical Report CU-CS-977-04.
- **Nonparametric Classification with Polynomial MPMC Cascades** - ICML 2004.
- **Is Nonparametric Learning Practical in Very High Dimensional Spaces?** - IJCAI 1997.