NetMaker neural engine features:
Neural network types:
MLP - feed-forward multi-layer perceptron; RMLP - recurrent multi-layer perceptron (back-propagation through time with teacher forcing); Cascade-Correlation, based on S. Fahlman's papers (plus a simplified version).
Training algorithms:
standard steepest descent with a momentum term (off-line and on-line training); conjugate gradients (update formula: Polak-Ribiere, Fletcher-Reeves or no conjugation; reset condition: standard or Powell-Beale); scaled conjugate gradient; quick-prop; Levenberg-Marquardt (still under development, but already quite efficient);
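The three conjugate-gradient update options named above differ only in the coefficient that mixes the previous search direction into the new one. The following is a minimal sketch of the textbook formulas on a small quadratic problem, not NetMaker's implementation: the helper names (`beta`, `minimize_quadratic`), the test matrix, and the exact line search are assumptions made for illustration, and the reset conditions (standard, Powell-Beale) are not shown.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def beta(g_new, g_old, formula):
    """Conjugation coefficient for the next search direction."""
    if formula == "fletcher-reeves":
        return dot(g_new, g_new) / dot(g_old, g_old)
    if formula == "polak-ribiere":
        diff = [n - o for n, o in zip(g_new, g_old)]
        return dot(g_new, diff) / dot(g_old, g_old)
    return 0.0  # "no conjugation": plain steepest descent

def gradient(A, b, w):
    # Gradient of f(w) = 0.5 * w^T A w - b^T w for symmetric A
    return [dot(row, w) - bi for row, bi in zip(A, b)]

def minimize_quadratic(A, b, w, formula, steps):
    g = gradient(A, b, w)
    d = [-gi for gi in g]
    for _ in range(steps):
        if dot(g, g) < 1e-30:  # already at the minimum
            break
        Ad = [dot(row, d) for row in A]
        alpha = -dot(g, d) / dot(d, Ad)  # exact line search (quadratic only)
        w = [wi + alpha * di for wi, di in zip(w, d)]
        g_new = gradient(A, b, w)
        bt = beta(g_new, g, formula)
        d = [-gn + bt * di for gn, di in zip(g_new, d)]
        g = g_new
    return w

# Hypothetical 2x2 test problem; its minimum is at (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
start = [2.0, 1.0]
w_fr = minimize_quadratic(A, b, start, "fletcher-reeves", 2)
w_pr = minimize_quadratic(A, b, start, "polak-ribiere", 2)
w_sd = minimize_quadratic(A, b, start, "none", 2)
```

On a two-dimensional quadratic with exact line search, both conjugate variants reach the minimum in two steps, while the unconjugated version is still approaching it. Network error surfaces are not quadratic, which is why the choice of formula and reset condition matters in practice.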
regularization: weight decay (with an option to exclude biases from regularization); Bayesian framework (MLP networks, function approximation tasks): network output uncertainties, online optimization of the regularization factor, Hessian calculation modes (exact, approximated around the network error minimum, finite differences);
automated training stop; dynamic network size adjustment: smart insertion of new hidden units; pruning of twin, dead and constant hidden units; Optimal Brain Surgeon (OBS) for weight elimination (based on the original paper by B. Hassibi et al.).
All structure modifications preserve the network state (no increase in error should be observed).
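To illustrate why pruning twin hidden units need not increase the error, here is a minimal sketch, assuming a one-hidden-layer network with tanh hidden units and a single linear output. The function names (`forward`, `merge_twins`), the tolerance, and the example weights are hypothetical and do not reflect NetMaker's internals. Two hidden units with identical incoming weights and bias always produce the same activation, so one can be removed if its outgoing weight is added to the other's.

```python
import math

def forward(x, hidden, out_w, out_b):
    """One-hidden-layer MLP: tanh hidden units, single linear output.

    hidden is a list of (incoming_weights, bias) pairs; out_w holds one
    outgoing weight per hidden unit.
    """
    acts = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
            for ws, b in hidden]
    return sum(v * a for v, a in zip(out_w, acts)) + out_b

def merge_twins(hidden, out_w, tol=1e-12):
    """Merge hidden units whose incoming weights and biases coincide."""
    kept_hidden, kept_out = [], []
    for (ws, b), v in zip(hidden, out_w):
        for i, (kws, kb) in enumerate(kept_hidden):
            same = abs(b - kb) <= tol and all(
                abs(p - q) <= tol for p, q in zip(ws, kws))
            if same:
                # Twin found: fold its outgoing weight into the survivor.
                kept_out[i] += v
                break
        else:
            kept_hidden.append((ws, b))
            kept_out.append(v)
    return kept_hidden, kept_out

# Hypothetical network in which units 0 and 2 are twins.
hidden = [([0.5, -1.0], 0.1), ([0.3, 0.8], -0.2), ([0.5, -1.0], 0.1)]
out_w = [1.5, -0.7, 0.4]
out_b = 0.05
small_hidden, small_out = merge_twins(hidden, out_w)
```

The same idea extends to the other two cases: a dead unit (output always zero) can simply be dropped, and a constant unit can be replaced by adding its constant contribution to the output bias.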
Activation functions:
standard sigmoid (logistic); softmax (to be used with the cross-entropy error function only); zero-centered sigmoid; hyperbolic tangent; Elliott function (plus a unipolar version); arctangent (scaled to unipolar and bipolar ranges); linear.
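For reference, several of the functions listed above can be sketched in a few lines. The definitions below are the commonly used textbook forms. The exact scalings used by NetMaker for the unipolar Elliott and arctangent variants are not stated here, so the ones shown (mapping to the range 0 to 1) are assumptions, as are all function names.

```python
import math

def logistic(x):
    # Numerically stable form of 1 / (1 + exp(-x))
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def elliott(x):
    # Bipolar Elliott function, range (-1, 1); needs no exponentials
    return x / (1.0 + abs(x))

def elliott_unipolar(x):
    # Assumed unipolar variant: bipolar form rescaled to (0, 1)
    return 0.5 * x / (1.0 + abs(x)) + 0.5

def atan_bipolar(x):
    # Arctangent scaled to (-1, 1)
    return (2.0 / math.pi) * math.atan(x)

def atan_unipolar(x):
    # Arctangent scaled to (0, 1)
    return math.atan(x) / math.pi + 0.5

def softmax(zs):
    # Subtract the maximum before exponentiating to avoid overflow
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]
```

Softmax outputs are positive and sum to one, so they can be read as class probabilities, which is what makes the pairing with the cross-entropy error function natural.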
Error functions:
standard MSE; cross-entropy (used together with a softmax output layer); Pow4 and integrated inverse hyperbolic tangent for improved sensitivity to the tails of the network error distribution; integrated hyperbolic tangent for training data polluted with outliers and gross errors; various asymmetric functions for different costs of signal-to-background (sig->bkg) and background-to-signal (bkg->sig) misidentification; sample-weighted, which allows χ² minimization when MSE is used with 1/σ² weights; user-defined.
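Two of the points above can be made concrete with a short sketch. First, a squared error weighted per sample by 1/σ² is exactly the χ² sum. Second, the integral of tanh(e) is ln cosh(e), which behaves like e²/2 for small residuals but grows only linearly for large ones, hence its tolerance of outliers, whereas a fourth-power error does the opposite and emphasizes the tails. The function names and any normalization (no 1/N or 1/2 factor) are assumptions, not NetMaker's definitions.

```python
import math

def weighted_sse(outputs, targets, weights):
    """Sum of per-sample weighted squared errors."""
    return sum(w * (y - t) ** 2
               for y, t, w in zip(outputs, targets, weights))

def chi2(outputs, targets, sigmas):
    """Chi-square: squared errors weighted by 1 / sigma^2."""
    weights = [1.0 / (s * s) for s in sigmas]
    return weighted_sse(outputs, targets, weights)

def log_cosh(e):
    """Integrated tanh, ln(cosh(e)), written to avoid overflow in cosh."""
    a = abs(e)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def pow4(e):
    """Fourth-power error: penalizes large residuals far more than e^2."""
    return e ** 4

# Hypothetical data: two samples with different measurement uncertainties.
outputs = [1.0, 2.0]
targets = [0.0, 4.0]
sigmas = [1.0, 2.0]
total_chi2 = chi2(outputs, targets, sigmas)
```

For a residual of 10, `pow4` gives 10000 while `log_cosh` gives a little over 9, which shows how differently the two treat the same gross error.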
Preprocessing:
normalization (scaling) to zero mean and unit standard deviation; QSVD / ICA transformations for eliminating redundant data dimensions and obtaining a better data representation; FFT filtering.
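The normalization step can be sketched as follows for a single input variable. This is a minimal illustration, assuming the population standard deviation (division by N) and a hypothetical function name `standardize`; whether NetMaker divides by N or N-1 is not stated here.

```python
import math

def standardize(column):
    """Scale one input variable to zero mean and unit standard deviation.

    Returns the scaled values together with the mean and standard
    deviation, which must be reused to transform any new data.
    """
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    if std == 0.0:
        # Constant variable: carries no information, map it to zeros.
        return [0.0] * n, mean, std
    return [(v - mean) / std for v in column], mean, std

# Hypothetical sample with mean 5 and standard deviation 2.
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled, mean, std = standardize(raw)
```

The mean and standard deviation should be computed on the training set only and then applied unchanged to validation and test data, so that all sets share the same scale.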
Standard classification algorithms:
kNN (k-nearest neighbors); SVM classification and regression, based on the LIBSVM v2.89 library; probability density estimation.
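As a reminder of how the simplest of these classifiers works, here is a minimal kNN sketch using Euclidean distance and a majority vote. The function name `knn_classify`, the default k, and the toy signal/background points are assumptions for illustration only, not NetMaker's interface.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors.

    train is a list of (point, label) pairs; points are equal-length
    tuples of floats.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-class training set.
train = [
    ((0.0, 0.0), "bkg"), ((0.1, 0.2), "bkg"), ((0.2, 0.1), "bkg"),
    ((1.0, 1.0), "sig"), ((0.9, 1.1), "sig"), ((1.1, 0.9), "sig"),
]
near_signal = knn_classify(train, (0.95, 1.0))
near_background = knn_classify(train, (0.05, 0.1))
```

Because kNN relies directly on distances, it benefits from the input normalization listed under preprocessing: otherwise a variable with a large numeric range dominates the vote.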