Z. W. Zhang, S. T. Ye, Y. R. Zhang, W. P. Ding, and H. Wang, “Belief combination of classifiers for incomplete data,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 4, pp. 652–667, Apr. 2022. doi: 10.1109/JAS.2022.105458
Belief Combination of Classifiers for Incomplete Data

Funds: This work was supported in part by the Center-initiated Research Project and Research Initiation Project of Zhejiang Laboratory (113012-AL2201, 113012-PI2103), the National Natural Science Foundation of China (61300167, 61976120), the Natural Science Foundation of Jiangsu Province (BK20191445), the Natural Science Key Foundation of Jiangsu Education Department (21KJA510004), and the Qing Lan Project of Jiangsu Province.
  • Data with missing values, or incomplete information, poses challenges for classification, because incompleteness can significantly degrade classifier performance. In this paper, we handle missing values in both the training and test sets through reasoning under uncertainty and imprecision, and propose a new belief combination of classifiers (BCC) method based on evidence theory. BCC aims to improve the classification of incomplete data by characterizing the uncertainty and imprecision that incompleteness introduces. In BCC, each attribute is regarded as an independent source, and the values collected for each attribute form a subset. A classifier is then trained on each subset independently, so that each observed attribute provides a sub-classification result for the query pattern. Finally, these sub-classification results are weighted by discounting factors and combined, serving as complementary information that jointly determines the final class of the query pattern. Each weight has two components: global and local. The global weight, computed by an optimization function, represents the reliability of each classifier; the local weight, obtained by mining attribute distribution characteristics, quantifies the importance of each observed attribute to the classification of the pattern. Extensive comparative experiments involving seven methods on twelve datasets show that BCC outperforms all baseline methods in terms of accuracy, precision, recall, and F1 measure, at a reasonable computational cost.
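
The weighting and fusion step described in the abstract can be illustrated with the two standard operations of evidence theory: Shafer discounting and Dempster's rule of combination. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. It assumes each per-attribute classifier outputs masses on single classes only, and that the global and local weights have already been merged into one discounting factor per classifier (how they are merged is not stated here). The names `OMEGA`, `discount`, `dempster`, and `combine` are hypothetical.

```python
from functools import reduce

OMEGA = "Omega"  # hypothetical key for total ignorance (the whole frame of classes)


def discount(mass, alpha):
    """Shafer discounting: keep a fraction alpha of each class mass and
    transfer the remainder to total ignorance (OMEGA)."""
    out = {c: alpha * m for c, m in mass.items() if c != OMEGA}
    out[OMEGA] = 1.0 - sum(out.values())
    return out


def dempster(m1, m2):
    """Dempster's rule for masses defined on single classes plus OMEGA."""
    classes = (set(m1) | set(m2)) - {OMEGA}
    o1, o2 = m1.get(OMEGA, 0.0), m2.get(OMEGA, 0.0)
    fused = {}
    for c in classes:
        a, b = m1.get(c, 0.0), m2.get(c, 0.0)
        # {c} results from {c} & {c}, {c} & OMEGA, and OMEGA & {c}
        fused[c] = a * b + a * o2 + o1 * b
    fused[OMEGA] = o1 * o2
    total = sum(fused.values())  # equals 1 minus the conflicting mass
    if total == 0.0:
        raise ValueError("sources are in total conflict")
    return {k: v / total for k, v in fused.items()}


def combine(outputs, weights):
    """Discount each sub-classification result, fuse them all, and
    return the class with the largest fused mass."""
    discounted = [discount(m, w) for m, w in zip(outputs, weights)]
    fused = reduce(dempster, discounted)
    label = max((c for c in fused if c != OMEGA), key=fused.get)
    return label, fused


# Two observed attributes; the first classifier is assumed more reliable.
outputs = [{"a": 0.9, "b": 0.1}, {"a": 0.2, "b": 0.8}]
weights = [0.9, 0.3]  # assumed discounting factors in [0, 1]
label, fused = combine(outputs, weights)
```

In this example the second classifier favors class `b`, but its low discounting factor moves most of its mass to ignorance, so the fused decision follows the more reliable first classifier. A missing attribute could be represented in the same framework by a source that assigns all mass to `OMEGA`, which leaves the combination unchanged.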


  • 1 To avoid ambiguity, we use the term incomplete data for a dataset with missing values, and incomplete pattern for a pattern with missing values.
    2 In evidence theory, the term evidential refers to variables with both uncertainty and imprecision.
    3 This paper focuses on the classification of incomplete data, which means that the reliability weight can be obtained by the proposed optimization strategy quickly based on the training set.
    4 It is used to measure the similarity between two sets of variables. Other correlation coefficients, such as Spearman’s correlation coefficient [55] and Kendall’s correlation [56], are also applicable.
    5 All results reported in this paper are average values.
    6 The differences between the chosen classifiers are beyond the scope of this paper.
    • We propose a method to handle missing values in both training and test patterns.
    • The method can characterize uncertainty and imprecision brought by incompleteness.
    • The method can be regarded as a multiple-classifier fusion strategy.
    • We consider each attribute as a sub-source to provide complementary information.
    • We provide a method to optimize the weight of each sub-source.


