IEEE—ITJ, Vol 11 20(2024)
The performance of artificial neural network (ANN) regression models on small or scarce data sets, such as wireless network positioning data, can be improved by simplifying the task. One such approach implements the regression model as a classifier, followed by a probabilistic mapping algorithm that transforms class probabilities into the multidimensional regression output. In this work, we propose the so-called classification-to-regression model (C2R), a novel ANN-based architecture that transforms the classification model into a robust regressor while enabling end-to-end training. The proposed solution can remove the impact of less likely classes from the probabilistic mapping by implementing a novel, trainable differential thresholded rectified linear unit (dtReLU) layer. The proposed solution is introduced and evaluated in the indoor positioning application domain, using 23 real-world, openly available positioning data sets. The proposed C2R model is shown to achieve significant improvements in positioning accuracy over numerous benchmark methods. Specifically, when averaged across the 23 data sets, the proposed C2R improves the mean positioning error by 7.9% compared to weighted k-nearest neighbors (kNN) with $k = 3$, from 5.43 m to 5.00 m, and by 15.4% compared to a dense neural network (DNN), from 5.91 m to 5.00 m, while adapting the learned threshold. Finally, the proposed method adds only a single trainable parameter to the ANN; thus, as shown through analytical and empirical means in the article, there is no significant increase in computational complexity.
Fig. 1: General system model illustrating the considered overall structure with an ANN as the classification model, in an example case with four classes. The softmax layer transforms the hidden variables x1–x4 into the probabilities of classes 1–4, followed by thresholding and normalization functions (or layers). For thresholding, we consider the thresholded ReLU (thrReLU) or dtReLU functions described in (4) and (7), respectively, while for normalization, the L1 (Manhattan) norm is adopted. The likelihood-based matching algorithm sums the products of the label coordinates L1–L4 and the corresponding class probabilities 1–4 to obtain the final estimates.
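The processing chain described in the Fig. 1 caption (softmax, thresholding, L1 normalization, and probability-weighted summation of label coordinates) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the article's implementation: the function names, the hard-threshold form of thrReLU (output equals input above the threshold, zero otherwise), and the example logits and label coordinates are all hypothetical choices made for this sketch.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D vector of hidden variables."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def thr_relu(p, thr):
    """Hard thresholding (assumed form): keep values above thr, zero the rest."""
    return np.where(p > thr, p, 0.0)

def l1_normalize(p, eps=1e-12):
    """Rescale so the absolute values sum to one (L1 / Manhattan norm).
    If every entry was thresholded away, the result stays all-zero."""
    return p / max(np.sum(np.abs(p)), eps)

def estimate_position(logits, labels, thr):
    """Map class scores to a coordinate estimate.

    logits: shape (n_classes,), hidden variables before the softmax
    labels: shape (n_classes, n_dims), coordinates attached to each class
    thr:    probability threshold below which a class is ignored
    """
    p = softmax(logits)
    w = l1_normalize(thr_relu(p, thr))
    return w @ labels  # sum of label coordinates weighted by probability

# Hypothetical four-class example with 2-D label coordinates.
logits = np.array([2.0, 1.0, -1.0, -2.0])
labels = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
estimate = estimate_position(logits, labels, thr=0.1)
print(estimate)
```

In this example the two least likely classes fall below the threshold, so only the first two labels contribute to the estimate; with `thr=0.0` all four labels would be blended.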
Fig. 2: Illustration of the step function approximation using sigmoid with varying k, shown in (a), and comparison of thrReLU and dtReLU with varying hyperparameter k, shown in (b).
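The idea behind the Fig. 2 caption can be illustrated numerically: a sigmoid with a steepness factor k approaches a step function as k grows, and multiplying the input by that soft step gives a smooth stand-in for hard thresholding. The sketch below is only an assumption-laden illustration. The article's exact dtReLU definition in (7) is not reproduced in this section, so `smooth_thr_relu` is a hypothetical form chosen for this example, as are all function names and the values of the threshold and k.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, written via tanh to avoid overflow for large |z|."""
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def soft_step(x, thr, k):
    """Sigmoid approximation of a unit step located at thr; larger k is steeper."""
    return sigmoid(k * (x - thr))

def thr_relu(x, thr):
    """Hard thresholding (assumed form): pass x where x > thr, else output zero."""
    return np.where(x > thr, x, 0.0)

def smooth_thr_relu(x, thr, k):
    """Hypothetical smooth variant: gate x with the soft step.
    Unlike thr_relu, this is differentiable with respect to thr."""
    return x * soft_step(x, thr, k)

x = np.linspace(0.0, 1.0, 101)
thr = 0.3
away = np.abs(x - thr) > 0.05  # compare away from the discontinuity itself

for k in (10, 100, 1000):
    gap = np.max(np.abs(smooth_thr_relu(x, thr, k) - thr_relu(x, thr))[away])
    print(f"k={k}: max gap away from threshold = {gap:.3e}")
```

The gap to the hard threshold shrinks as k increases, while the smooth form keeps a usable gradient with respect to the threshold, which is what makes a threshold learnable by gradient descent in a construction of this kind.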
Fig. 3: General illustration of the different solutions. The gray box encapsulates the model elements that are trained as a classification ANN, while the blue boxes indicate the regression ANNs. The yellow box denotes the single-neuron layer determining the value of γthr in the vC2R model.
Fig. 4: Distributions of the percentage error differences relative to the best performing model, illustrating the robust and reliable performance of the proposed C2R approach.
Fig. 5: Individual ECDFs of the positioning errors with the different utilized models on all data sets.
Fig. 6: Comparison of the averaged sample-wise localization times for the different solutions across the available data sets. The x-axis uses a logarithmic scale.