Abstract:
It is proved that a neural network with sigmoidal activation functions is a Morse function for almost all parameter (weight) values, with respect to the Lebesgue measure, provided that the network architecture has no bottlenecks, i.e., layers with fewer neurons than the adjacent layers. Examples show that the no-bottleneck requirement is essential.
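
For reference, the Morse condition in the statement above can be written out as follows. This is the standard definition, given under the assumption that the network is viewed as a smooth scalar-valued function of its input for a fixed parameter vector. The symbols $f_w$, $x$, $n$, $w$, $g$, $a$, $c$ and $\sigma$ are notation introduced here for illustration and are not taken from the text. The second display is a hypothetical sketch of why a bottleneck can break the condition (assuming the input is counted as a layer); it is not necessarily one of the examples referred to in the abstract.

```latex
% Standard definition: f_w : R^n -> R is a Morse function if every
% critical point is nondegenerate, i.e. the Hessian is invertible there.
\[
  f_w \text{ is Morse}
  \iff
  \Bigl( \nabla_x f_w(x^*) = 0 \;\Longrightarrow\; \det \nabla_x^2 f_w(x^*) \neq 0 \Bigr)
  \quad \text{for all } x^* \in \mathbb{R}^n .
\]

% Hypothetical illustration (assumption): layer widths 2 -> 1 -> 2 -> 1,
% so the first hidden layer is a bottleneck. The network then factors as
%   f_w(x) = g(\sigma(a \cdot x + c)),  x, a \in R^2,  c \in R,
% where g : R -> R collects the remaining layers. With u = a \cdot x + c:
\[
  \nabla_x f_w(x) = g'(\sigma(u))\,\sigma'(u)\,a ,
  \qquad
  \nabla_x^2 f_w(x)
  = \bigl[\, g''(\sigma(u))\,\sigma'(u)^2 + g'(\sigma(u))\,\sigma''(u) \,\bigr]\, a a^{\top} .
\]
% The Hessian is a scalar multiple of a a^T, so its rank is at most 1 < 2
% and its determinant vanishes everywhere. Hence any critical point of f_w
% is degenerate, and f_w fails to be Morse whenever a critical point exists.
```

In this sketch a critical point exists whenever $g'$ vanishes somewhere in the range of the first hidden layer, a condition that can hold on an open set of parameters, which is consistent with the claim that the no-bottleneck requirement cannot simply be dropped.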