A Mathematical Exploration of Neural Networks Using Sparsity, Entropy, and Multi-Valued Neurons
Artificial neural networks (ANNs) are a major component of the artificial intelligence (AI) field, designed based on the present understanding of their biological counterpart: the neuron. ANNs draw on many disciplines, including mathematics, statistics, computer science, and engineering. The complexity of the various constructions of neural networks makes analytical mathematical work next to impossible; as a result, computation and simulation are the principal methods used to study them. Our mathematical work focuses on two aspects. First, we study performance improvements obtained by adding (generally small) perturbation terms, mathematically linked to the notions of sparsity and entropy, to the cost or objective function of the network. This formulation fits naturally within the realm of multicriteria decision making and is practically linked to overfitting. Overfitting is a common problem when dealing with neural networks, especially as computers continue to grow more powerful and we gain the capability to train larger networks with many free parameters; there is therefore a pressing need to develop and explore techniques that reduce it. Loosely speaking, the thesis work suggests that including either a sparsity criterion or an entropy criterion yields a performance improvement, and that including both criteria yields a further improvement. Second, we explore multi-valued neurons, particularly mathematical formulations in which the neurons are vector-valued, interval-valued, or convex-set-valued. Some potential applications are discussed and examples are explored.
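The idea of perturbing the cost function with sparsity and entropy terms can be illustrated with a minimal sketch. The penalty functions, the mean-squared-error base cost, and the weighting parameters `lam_s` and `lam_e` below are illustrative choices of ours, not the thesis's exact formulation: we take the L1 norm as a sparsity criterion and the Shannon entropy of the normalized absolute weights as an entropy criterion.

```python
import numpy as np

def sparsity_penalty(weights):
    # L1 norm: a common proxy for sparsity, since it pushes weights toward zero.
    return np.sum(np.abs(weights))

def entropy_penalty(weights, eps=1e-12):
    # Shannon entropy of the normalized absolute weights;
    # eps guards against division by zero and log(0).
    p = np.abs(weights) / (np.sum(np.abs(weights)) + eps)
    return -np.sum(p * np.log(p + eps))

def regularized_cost(y_true, y_pred, weights, lam_s=1e-3, lam_e=1e-3):
    # Base cost (mean squared error) plus small sparsity and entropy
    # perturbation terms weighted by lam_s and lam_e.
    mse = np.mean((y_true - y_pred) ** 2)
    return mse + lam_s * sparsity_penalty(weights) + lam_e * entropy_penalty(weights)
```

Each criterion can be switched off by setting its weight to zero, which makes it easy to compare the base cost, either single criterion, or both criteria together, mirroring the comparison described above.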