Fusion of Neural Networks, Fuzzy Systems and Genetic Algorithms: Industrial Applications
by Lakhmi C. Jain; N.M. Martin
CRC Press LLC
ISBN: 0849398045   Pub Date: 11/01/98
  



4.3 TDNN Computational Overhead

To compare the performance of networks of different sizes, the number of floating-point operations executed during the forward phase (validation) is a valid and homogeneous yardstick. This number is directly correlated with the capabilities of the network, since it is proportional to the network size (the number of free variables) and, consequently, to the complexity of the problems the network can solve.

The number of operations can be computed in two steps. Let us first define maxW, the number of input patterns necessary to compute each output of the network, that is, the time window "seen" from the input layer:

maxW = 1 + m·D(0)

The number of operations Nop1 necessary to initialize the network, that is, to fill all the neuron buffers with data derived from the first maxW input patterns and to produce the first valid output value, is equal to

while the number of operations NopS associated with each new input vector, necessary to produce the subsequent output values, is equal to

If Npat denotes the number of input patterns, the number of floating-point operations (sums and multiplications) required by the feed-forward phase is

NopT = Nop1 + (Npat - maxW)·NopS

The value of NopT provides a measure of the computational speed of the network and, therefore, of its performance. Notice also that, since NopS is equivalent to the number Nw of network weights, it also represents an estimate of the TDNN memory requirements.
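As a minimal sketch of this bookkeeping (the function names, and the assumption that every one of the m layers uses the same buffer depth D(0), are illustrative; the chapter gives the counts, not code):

```python
def time_window(num_layers, buffer_depth):
    """Input time window maxW = 1 + m * D(0), assuming a uniform
    memory-buffer depth D(0) across the m layers."""
    return 1 + num_layers * buffer_depth

def feedforward_flops(n_pat, max_w, n_op1, n_op_s):
    """Total floating-point operations (sums and multiplications) of the
    feed-forward phase: Nop1 to fill the buffers and emit the first valid
    output, then NopS for each of the remaining (Npat - maxW) patterns."""
    return n_op1 + (n_pat - max_w) * n_op_s
```

With these two helpers, the trade-off between memory (NopS, equivalent to Nw) and throughput (NopT) can be tabulated for each candidate configuration.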

4.4 Learning Criteria for TDNN Training

The quality of TDNN learning can be assessed through different methods, depending on the kind of performance required. To optimize the network size and parameters, the cost curves defined in the previous section have been analyzed. This analysis, however, loses validity when the performances of structurally different networks must be compared. In this case it is necessary to define performance figures, independent of the network structure itself, for validating the estimates provided by the network. Moreover, secondary figures must be considered, such as the computational speed (the number of operations executed in the forward phase), the required storage and, optionally, a threshold on the maximum acceptable error.

Indicating with u_t and d_t the TDNN output and target vectors at time t, the following costs can be defined.

Mean square error:

MSE = (1/Npat) Σ_t ||u_t - d_t||²

This figure, despite being the most commonly used error, turns out to be unreliable for the specific estimates required of the network. What is basically required of the network is, in fact, the ability to follow the modes of the parameters without necessarily tracking their trajectories point by point. A linear phase distortion, for example, producing a constant time delay between the pattern and target sequences, yields a high MSE value but can often correspond to acceptable or even very good estimates.

Maximum absolute error:

MAX = max_t |u_t - d_t|

Even if the network output values are limited to the interval [-1.0, +1.0], this kind of error does not provide reliable indications, since it depends on the distribution of d_t. The analysis of MAX during learning indicates that, after an initial decreasing phase, it increases, in contrast with the MSE value, which is progressively reduced (corresponding to an improvement of the global network performance).

Cross-correlation coefficient:

r = Σ_t (u_t - ū)(d_t - d̄) / [Σ_t (u_t - ū)² · Σ_t (d_t - d̄)²]^(1/2)

In contrast to the previous figures, the cross-correlation coefficient measures the similarity between the two sequences with invariance to translations and to the particular distribution of the sequence samples.

Cross-correlation function:

R_ud[τ] = Σ_t u_t · d_(t+τ)

It provides the same advantages as the cross-correlation coefficient; in addition, time translations are detected when the maximum of R_ud[τ] is shifted with respect to τ = 0. For evaluating the reliability of the estimate, the curve R_ud[τ] can be compared with the self-correlation R_dd[τ].
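These figures are straightforward to compute; a NumPy-based sketch follows (the unnormalized form of R_ud[τ] is an assumption, since the chapter's exact normalization is not shown):

```python
import numpy as np

def learning_figures(u, d):
    """MSE, MAX and correlation coefficient r between the output
    sequence u and the target sequence d (1-D arrays, equal length)."""
    err = u - d
    mse = float(np.mean(err ** 2))        # mean square error
    max_err = float(np.max(np.abs(err)))  # maximum absolute error
    r = float(np.corrcoef(u, d)[0, 1])    # cross-correlation coefficient
    return mse, max_err, r

def cross_correlation(u, d):
    """Unnormalized cross-correlation R_ud[tau]; a peak shifted away
    from tau = 0 reveals a constant time delay between the sequences."""
    r_ud = np.correlate(u, d, mode="full")   # lags from -(N-1) to N-1
    taus = np.arange(-(len(d) - 1), len(u))
    return taus, r_ud
```

Comparing cross_correlation(u, d) against cross_correlation(d, d) reproduces the R_ud[τ] versus R_dd[τ] reliability check described above.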

4.5 Multi-Output vs. Single-Output Architecture

Keeping the global computational overhead of each examined system configuration constant, the performance figures described above have been adopted to check whether a single multi-output TDNN is preferable to multiple single-output TDNNs. To this end, a parametric optimization has been applied to find the most suitable configuration for a 5-output TDNN charged with estimating a vector of 5 mouth articulatory parameters including, as shown in Figure 10, the vertical offset of the upper lip Lup and the contact segment between the upper and lower lips dw.


Figure 15  Optimization of the number of neurons in the first hidden layer by minimization of the MSE. In this case, for both data-base 1 (vowels) and data-base 2 (V/C/V transitions), the configuration net2, with 15 neurons in the first hidden layer, provides the least MSE (1051-8215/97$10.00 © 1997 IEEE).

Constraining the investigated TDNN configurations to have constant complexity, 2 hidden layers, and a pattern-target delay DT = 8, the speed of cost descent (correlated with the level of learning) has been analyzed by varying the number of neurons in the first and in the second hidden layer and by changing the size of the neuron memory buffer. Although these tests have been carried out separately for minimizing the MSE, minimizing MAX, and maximizing r, the optimal TDNN configuration has proven to be almost independent of the specific cost functional adopted. Figures 15 and 16 provide an example of the optimization procedure followed in the case of MSE minimization; a sketch of such a sweep is given after Figure 16.


Figure 16  Optimization of the number of neurons in the second hidden layer by minimization of the MSE. In this case the number of neurons in the first hidden layer has been fixed at 15 for all the configurations. For both data-base 1 (vowels) and data-base 2 (V/C/V transitions), the configuration net2, with 10 neurons in the second hidden layer, provides the least MSE (1051-8215/97$10.00 © 1997 IEEE).
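A hedged sketch of such a constant-complexity sweep follows; the weight-count estimate, the default sizes, and the train_and_evaluate placeholder are all hypothetical stand-ins for the chapter's actual training runs:

```python
def num_weights(n_in, n1, n2, n_out, depth):
    """Rough weight count for a 2-hidden-layer TDNN whose neurons keep
    a memory buffer of `depth` past samples (illustrative assumption)."""
    taps = depth + 1
    return taps * (n_in * n1 + n1 * n2 + n2 * n_out)

def sweep_first_layer(candidates, train_and_evaluate, n_in=16, n2=10,
                      n_out=5, depth=8, budget=5000, tol=0.15):
    """Vary the first hidden layer, keep only configurations whose
    complexity stays within `tol` of the weight budget, and return the
    (mse, n1) pair with the lowest validation MSE."""
    results = []
    for n1 in candidates:
        if abs(num_weights(n_in, n1, n2, n_out, depth) - budget) / budget > tol:
            continue  # violates the constant-complexity constraint
        results.append((train_and_evaluate(n1, n2), n1))
    return min(results)
```

The same loop, applied to the second hidden layer with the first fixed at 15 neurons, mirrors the procedure illustrated in Figure 16.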



Copyright © CRC Press LLC