Variable importances
Variable importance (also known as feature importance) is a score that indicates how "important" a feature is to the model. For example, if a model with two input features "f1" and "f2" has the variable importances {f1=5.8, f2=2.5}, then feature "f1" is more "important" to the model than feature "f2". As with other machine learning models, variable importance is a simple way to understand how a decision tree works.
You can apply model-agnostic variable importances, such as permutation variable importances, to decision trees.
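As a concrete illustration, the following is a minimal NumPy sketch of permutation variable importance. The `model` object, the `score_fn(model, X, y)` scoring callback, and the array layout are assumptions made for this example; the idea is simply to shuffle one feature column at a time and measure how much the model's score drops.

```python
import numpy as np

def permutation_importance(model, X, y, score_fn, rng=None):
    """Sketch of model-agnostic permutation importance.

    Shuffles one feature column at a time and reports the drop in the
    model's score; a larger drop suggests a more important feature.
    """
    rng = rng or np.random.default_rng(0)
    baseline = score_fn(model, X, y)
    importances = {}
    for j in range(X.shape[1]):
        X_shuffled = X.copy()
        # Shuffle column j in place to break its link with the label.
        rng.shuffle(X_shuffled[:, j])
        importances[j] = baseline - score_fn(model, X_shuffled, y)
    return importances
```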
Decision trees also have specific variable importances, such as the following (the last two are sketched in code after the list):
- The sum of the split scores of the conditions that use a given variable.
- The number of nodes with a given variable.
- The average depth of the first occurrence of a feature across all the tree paths.
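The toy sketch below (not YDF's implementation) shows how the last two importances could be computed for a single tree: counting the nodes that test each feature, and averaging the depth of a feature's first occurrence over all root-to-leaf paths. The `Node` structure is an assumption made for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """Toy decision-tree node; `feature` is None for leaves."""
    feature: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def node_count_importance(root: Node) -> Counter:
    """Number of nodes that test each feature."""
    counts = Counter()
    def visit(node):
        if node is None or node.feature is None:
            return
        counts[node.feature] += 1
        visit(node.left)
        visit(node.right)
    visit(root)
    return counts

def mean_first_occurrence_depth(root: Node) -> dict:
    """Average depth of the first occurrence of each feature over the
    root-to-leaf paths that contain it (smaller means more important)."""
    depths = defaultdict(list)
    def visit(node, depth, seen):
        if node is None or node.feature is None:  # reached a leaf: record this path
            for feature, d in seen.items():
                depths[feature].append(d)
            return
        seen = dict(seen)
        seen.setdefault(node.feature, depth)  # first occurrence on this path
        visit(node.left, depth + 1, seen)
        visit(node.right, depth + 1, seen)
    visit(root, 0, {})
    return {f: sum(ds) / len(ds) for f, ds in depths.items()}

# Example: "f1" at the root, "f2" one level deeper on the right branch.
tree = Node("f1", left=Node(), right=Node("f2", left=Node(), right=Node()))
print(node_count_importance(tree))        # Counter({'f1': 1, 'f2': 1})
print(mean_first_occurrence_depth(tree))  # {'f1': 0.0, 'f2': 1.0}
```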
Variable importances can differ by qualities such as:
- semantics
- scale
- properties
Furthermore, variable importances provide different types of information about:
- the model
- the dataset
- the training process
For example, the number of conditions containing a specific feature indicates how much a decision tree is looking at this specific feature, which might indicate variable importance. After all, the learning algorithm would not have used a feature in multiple conditions if it did not matter. However, the same feature appearing in multiple conditions might also indicate that a model is trying but failing to generalize the pattern of a feature. For example, this can happen when a feature is just an example identifier with no information to generalize.
On the other hand, a high permutation variable importance indicates that removing a feature hurts the model, which is an indication of variable importance. However, if the model is robust, removing any one feature might not hurt the model.
Because different variable importances inform about different aspects of the model, looking at several variable importances at the same time is informative. For example, if a feature is important according to all the variable importances, this feature is likely important. As another example, if a feature has a high "number of nodes" variable importance and a small "permutation" variable importance, then this feature might be hard to generalize and can hurt the model quality.
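For instance, a small helper along these lines (hypothetical names, taking dictionaries such as those produced by the sketches above) could flag features whose rank differs strongly between two importance measures:

```python
def compare_importances(importance_a: dict, importance_b: dict,
                        name_a: str = "node count", name_b: str = "permutation"):
    """Flag features whose rank disagrees between two importance measures.

    Both inputs are hypothetical dicts mapping feature name to score.
    """
    def rank(scores):
        ordered = sorted(scores, key=scores.get, reverse=True)
        return {feature: position for position, feature in enumerate(ordered)}

    rank_a, rank_b = rank(importance_a), rank(importance_b)
    for feature in sorted(set(rank_a) & set(rank_b)):
        if abs(rank_a[feature] - rank_b[feature]) >= 2:
            print(f"{feature}: rank {rank_a[feature]} by {name_a} vs "
                  f"rank {rank_b[feature]} by {name_b} -- worth a closer look")
```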
YDF Code
In YDF, you can see the variable importance of a model by calling model.describe() and looking at the "variable importance" tab. See the Model understanding tutorial for more details.
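A minimal end-to-end sketch is shown below. The toy data and column names are made up for this example, and the programmatic variable_importances() call is an assumption about the YDF Python API; model.describe() is the call mentioned above.

```python
import pandas as pd
import ydf  # pip install ydf

# Toy training data with hypothetical column names.
train_ds = pd.DataFrame({
    "f1": [0.1, 0.9, 0.4, 0.8, 0.2, 0.7],
    "f2": [1.0, 0.2, 0.6, 0.3, 0.9, 0.1],
    "label": ["no", "yes", "no", "yes", "no", "yes"],
})

model = ydf.RandomForestLearner(label="label").train(train_ds)

# In a notebook, this renders a report that includes a "variable importance" tab.
model.describe()

# Variable importances can also be read programmatically; this method name is
# an assumption about the YDF Python API.
print(model.variable_importances())
```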