Bookcover of A Unifying Theory of Learning: DL Meets Kernel Methods
Booktitle:
A Unifying Theory of Learning: DL Meets Kernel Methods
ETH Zürich
LAP LAMBERT Academic Publishing
(2021-06-21)
ISBN-13:
978-620-3-92495-4
ISBN-10:
6203924954
EAN:
9786203924954
Book language:
English
Blurb/Shorttext:
We introduce a framework for using kernel approximations in the mini-batch setting with Stochastic Gradient Descent (SGD) as an alternative to Deep Learning. Building on Random Kitchen Sinks, we provide a C++ library for large-scale Machine Learning. It contains a CPU-optimized implementation of the algorithm of Le et al. (2013) that computes approximated kernel expansions in log-linear time. The algorithm requires computing products with Walsh-Hadamard matrices; to this end, we developed a cache-friendly Fast Walsh-Hadamard transform that achieves compelling speed and outperforms current state-of-the-art methods.

McKernel lays the foundation of a new learning architecture that achieves large-scale non-linear classification by combining fast kernel expansions with a linear classifier. It operates in the mini-batch setting, working analogously to Neural Networks.

We also propose a new architecture to reduce over-parametrization in Neural Networks. It introduces an operand for rapid computation in the framework of Deep Learning that leverages learned weights. The formalism is described in detail, providing both an accurate elucidation of the mechanics and the theoretical implications.
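As a rough illustration of the log-linear building block the blurb refers to (not the library's cache-friendly, CPU-optimized code), a textbook in-place Fast Walsh-Hadamard transform in C++ looks like the sketch below; the function name fwht, the use of std::vector, and the power-of-two input length are assumptions for this example.

#include <cstddef>
#include <iostream>
#include <vector>

// In-place Fast Walsh-Hadamard transform: O(n log n) butterflies.
// Assumes x.size() is a power of two.
void fwht(std::vector<float>& x) {
    const std::size_t n = x.size();
    for (std::size_t h = 1; h < n; h <<= 1) {
        for (std::size_t i = 0; i < n; i += (h << 1)) {
            for (std::size_t j = i; j < i + h; ++j) {
                const float a = x[j];
                const float b = x[j + h];
                x[j]     = a + b;   // butterfly: sum
                x[j + h] = a - b;   // butterfly: difference
            }
        }
    }
}

int main() {
    std::vector<float> v = {1, 0, 1, 0, 0, 1, 1, 0};
    fwht(v);                        // transform in place
    for (float t : v) std::cout << t << ' ';
    std::cout << '\n';
}

In a Random Kitchen Sinks style pipeline, transforms of this kind replace dense Gaussian projections, which is how the kernel expansion is computed in log-linear rather than quadratic time.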