Currently Being Moderated
Hi Brendan,
So you are correct, there is a change in how AI scales the results.
Here is the explanation from the algorithm developer to clarify the intent.
Thanks, Mark
The raw score for attribute Importance is a simple two-part code MDL measure. It views a model as an attempt to reduce communication costs, measured in transmission bits. The cost is the sum of the costs of transmitting the model and transmitting the data using the model to compress the data. This gives a way of comparing a set of different models, in particular, a model consisting of the of the target probability conditioned on a binned set of attribute values versus the prior. The benefit is measured, using an idealized codes, the entropies, p log p. The best possible code has a cost equal to within a bit of the entropy. The benefit is equal to the reduction in communication cost when the attribute model is chosen relative to the prior model. It is not a good thing, if that reduction is negative. In that respect, the measure differs from correlation, where the sign is a direction and the magnitude, a strength. Negative values represent uninteresting attributes, so these were set to benefit 0.
The problem with the raw measure is that the range of values depends on the problem. The higher the entropy of the target (prior model), the greater the scale of raw values. This makes it difficult for users to interpret. To simplify, we re-scaled the values. The rescaling is the per-row benefit.