
ILLM measure for covering quality

It is generally accepted that good models satisfy many positive examples and no, or only a very small number of, negative examples. This principle is easy to apply when comparing two models that satisfy no negative examples, one of which satisfies more positive examples than the other. But if the model that satisfies more positive examples also satisfies one or two negative examples, the decision is not so easy.

ILLM solves this problem by defining a measure of covering quality:

covering_quality = TP / (FP + g)

where
TP is the total number of positive training examples correctly classified by the model,
FP is the total number of negative training examples incorrectly classified by the model as positive cases, and
g is the generalization level selected by the user.
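
The formula can be sketched in a few lines of Python. This is only an illustration, not ILLM code: the function name `covering_quality`, the requirement that g be positive (so the denominator cannot be zero when FP is 0), and the example counts are all assumptions.

```python
def covering_quality(tp: int, fp: int, g: float) -> float:
    """Return TP / (FP + g) for one model.

    tp: positive training examples the model classifies correctly
    fp: negative training examples the model classifies as positive
    g:  user-selected generalization level (assumed here to be > 0)
    """
    if g <= 0:
        # Assumption of this sketch, not a rule stated in the source.
        raise ValueError("g is assumed to be positive")
    return tp / (fp + g)


# Hypothetical counts for two models, both scored with g = 5.
quality_a = covering_quality(tp=20, fp=0, g=5)  # 20 / 5 = 4.0
quality_b = covering_quality(tp=30, fp=2, g=5)  # 30 / 7, about 4.29
```

With these made-up counts, the second model scores higher at g = 5 even though it covers two negative examples.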

All potentially good models must satisfy the condition of high-quality generalization of the submitted examples. The covering quality measure serves to select among potentially good models. A few important points about this formula should be noted:

  • No model is absolutely the best for a given problem. The selection of the best model depends on the value of the parameter 'g', which should reflect all user preferences in the specific domain. Changing the value of 'g' corresponds to moving along the ROC curve: small 'g' values correspond to the lower left corner and high 'g' values to the upper right corner of the ROC space.
  • A high value of the generalization parameter means that the relative importance of incorrect negative example predictions is low. As a consequence, models covering many positive examples are preferred.
  • The covering_quality defined by the formula is only a relative measure; it enables comparison of the covering properties of different models.
  • TP and FP are absolute numbers, so the value of the 'g' parameter should be adjusted to the properties of each problem.
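
The dependence of the preferred model on 'g' can be shown with a small Python sketch. The two models, their counts, and the helper name `best_model` are hypothetical and chosen only to make the effect visible; they do not come from ILLM.

```python
def covering_quality(tp, fp, g):
    """TP / (FP + g), as defined in the formula above."""
    return tp / (fp + g)


# Hypothetical candidate models as (name, TP, FP).
models = [
    ("A", 20, 0),  # fewer positives covered, no negatives covered
    ("B", 30, 2),  # more positives covered, two negatives covered
]


def best_model(candidates, g):
    """Return the name of the candidate with the highest covering quality."""
    name, _, _ = max(candidates, key=lambda m: covering_quality(m[1], m[2], g))
    return name


# Small g: covered negatives weigh heavily, so the cautious model wins.
print(best_model(models, g=1))   # A  (20 / 1 = 20.0 vs 30 / 3 = 10.0)

# Large g: covered negatives matter less, so the broader model wins.
print(best_model(models, g=10))  # B  (20 / 10 = 2.0 vs 30 / 12 = 2.5)
```

For these particular counts the two models tie at g = 4. Because TP and FP are absolute counts, that crossover point would shift if the training set were larger or smaller, which is why 'g' has to be tuned per problem.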

    © 2001 LIS - Rudjer Boskovic Institute
    Last modified: January 23 2018 13:51:13.