Best Split Algorithm—Gini Impurity Measure

When a decision tree is defined with a target variable and the Best Split algorithm is applied, the algorithm aims to partition the data so that the records in each resulting node are as pure as possible. A node with high impurity contains a substantial share of records for several different values of the target variable, which indicates that the parent split has not segmented the data effectively.

When you minimize impurity, you want the observations in each node to have the same value of the target variable. The homogeneity, or purity, of a partition increases with the proportion of observations that share the same target value. The Best Split algorithm in Xpress Insight uses Gini impurity, which measures the heterogeneity, or impurity, of a node. When the Gini impurity value is 0.0 (the minimum), the partition is homogeneous, or pure. When the Gini impurity value is at its maximum, the node is heterogeneous, or impure. The maximum Gini impurity value differs for binary and multinomial target variables.
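
As an illustration of the measure described above, here is a minimal Python sketch of the standard Gini impurity formula, 1 minus the sum of squared class proportions. It assumes Xpress Insight follows this textbook definition, which the source does not spell out. The function name gini_impurity and the sample labels are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Standard Gini impurity: 1 minus the sum of squared class proportions.

    Assumption: this is the textbook formula, not a confirmed description
    of the Xpress Insight implementation.
    """
    n = len(labels)
    if n == 0:
        return 0.0  # convention chosen here for an empty node
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# A node where every record has the same target value is pure.
pure = gini_impurity(["good"] * 10)

# A binary target split evenly reaches the binary maximum.
binary_max = gini_impurity(["good"] * 5 + ["bad"] * 5)

# Three equally represented classes reach a higher maximum.
three_class_max = gini_impurity(["a", "b", "c"] * 4)
```

Under this formula the pure node scores 0.0, the evenly split binary node scores 0.5, and k equally represented classes score 1 - 1/k. This is why the maximum differs between binary and multinomial targets.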

Settings for Target-Driven Decision Trees

When you define a decision tree with a target variable, you can specify the decision tree settings when you define the tree, or you can adjust your original settings while you are editing the tree. After the tree is created and you select Best Split from the Tree View, Xpress Insight uses the values of these settings to decide when to stop searching for the optimal split.

The resulting list of predictors, sorted by gain in purity, helps inform your decisions about the predictors you want to insert in your decision tree.
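
The ranking by gain in purity can be sketched as follows. This sketch assumes that gain in purity is the parent node's Gini impurity minus the size-weighted impurity of the child nodes, and that each distinct predictor value forms its own branch. Neither detail is stated in the source. The names gain_in_purity and rank_predictors and the sample records are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Standard Gini impurity of a list of target values."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gain_in_purity(parent, children):
    """Parent impurity minus the size-weighted impurity of the child nodes.

    Assumption: a common definition of gain, not confirmed by the source.
    """
    n = len(parent)
    weighted = sum(len(ch) / n * gini_impurity(ch) for ch in children)
    return gini_impurity(parent) - weighted


def rank_predictors(rows, target, predictors):
    """Split on each distinct value of each predictor; sort by gain, best first."""
    parent = [r[target] for r in rows]
    gains = {}
    for p in predictors:
        groups = {}
        for r in rows:
            groups.setdefault(r[p], []).append(r[target])
        gains[p] = gain_in_purity(parent, list(groups.values()))
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical records: owns_home separates the target perfectly, region does not.
rows = [
    {"owns_home": "yes", "region": "north", "default": "no"},
    {"owns_home": "yes", "region": "south", "default": "no"},
    {"owns_home": "no", "region": "north", "default": "yes"},
    {"owns_home": "no", "region": "south", "default": "yes"},
]
ranking = rank_predictors(rows, "default", ["region", "owns_home"])
```

In this toy data, owns_home ranks first with a gain of 0.5 because both of its branches are pure. region ranks last with a gain of 0.0 because each of its branches is as mixed as the parent.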

Default Settings and Descriptions for Target-Driven Decision Trees

The following table lists and describes each setting for a target-driven decision tree. The Best Split algorithm continues splitting while all the following conditions are true. When any one of the conditions becomes false, the algorithm stops searching for the optimal split.

Settings for Target-Driven Decision Trees
Setting Default Value Description

Impurity is greater than

The valid values for this setting range from 0.0 to 1.0.

The Best Split algorithm does not attempt to split a node if the impurity value of the node is less than or equal to this value. For example, if this setting is 0.5 and a node has an impurity value of 0.4, the algorithm does not try to split that node further, which helps reduce overfitting.

Tip: A value of 0.0 allows splitting of any node that is not already pure, while values above 0.5 may inhibit splitting completely.

Gain in purity is greater than

0.0001

The valid values for this setting range from 0.0 to 0.5.

The Best Split algorithm stops when any further splitting would not improve the purity by more than this value. After the Best Split algorithm runs, the predictors are ranked in order of their gain in purity.

Tip: Lower values lead to more splitting, while higher values may inhibit splitting completely.

Raw counts are greater than

100

The valid values for this setting are greater than 0.

The Best Split algorithm stops searching for the optimal split when the raw counts are less than or equal to the value specified in this setting.

Splits are less than or equal to

4

The valid values for this setting range from 2 through 256.

The Best Split algorithm stops searching for the optimal split when the number of splits is greater than the value specified in this setting.
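
Taken together, the four settings act as a combined stopping test: splitting continues only while every condition holds. The following Python sketch shows one way to express that rule. The class SplitSettings, the function keep_splitting, and all argument names are hypothetical and are not part of any Xpress Insight API. The source lists no default for the impurity threshold, so the sketch requires the caller to supply it. How the number of splits is counted is also an assumption left to the caller.

```python
from dataclasses import dataclass


@dataclass
class SplitSettings:
    # The source lists no default for the impurity threshold, so it is required.
    impurity_gt: float
    gain_gt: float = 0.0001   # "Gain in purity is greater than"
    raw_counts_gt: int = 100  # "Raw counts are greater than"
    splits_le: int = 4        # "Splits are less than or equal to"


def keep_splitting(s, node_impurity, best_gain, raw_count, n_splits):
    """Return True only while every condition in the settings table holds."""
    return (
        node_impurity > s.impurity_gt
        and best_gain > s.gain_gt
        and raw_count > s.raw_counts_gt
        and n_splits <= s.splits_le
    )


# The example from the table: threshold 0.5, node impurity 0.4, so no split.
strict = SplitSettings(impurity_gt=0.5)
strict_result = keep_splitting(strict, 0.4, 0.1, 500, 2)

# With a lower threshold the same node is still eligible for splitting.
lenient = SplitSettings(impurity_gt=0.1)
lenient_result = keep_splitting(lenient, 0.4, 0.1, 500, 2)
```

Because the conditions are joined with a logical AND, a single failing condition, such as a node with only 100 records under the default raw count setting, is enough to stop the search.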

© 2001-2019 Fair Isaac Corporation. All rights reserved. This documentation is the property of Fair Isaac Corporation (“FICO”). Receipt or possession of this documentation does not convey rights to disclose, reproduce, make derivative works, use, or allow others to use it except solely for internal evaluation purposes to determine whether to purchase a license to the software described in this documentation, or as otherwise set forth in a written software license agreement between you and FICO (or a FICO affiliate). Use of this documentation and the software described in it must conform strictly to the foregoing permitted uses, and no other use is permitted.
