Out-of-bag evaluation
---------------------
Random forests do not require a validation dataset. Most random forests use a
technique called **out-of-bag evaluation** (**OOB evaluation**) to evaluate
the quality of the model. OOB evaluation treats the training set as if it were
the test set of a cross-validation.
As explained earlier, each decision tree in a random forest is typically trained
on ~67% of the training examples. Therefore, each decision tree does not see
~33% of the training examples. The core idea of OOB evaluation is as follows:

- Evaluate the random forest on the training set.
- For each example, only use the decision trees that did not see the example
  during training.
The following table illustrates OOB evaluation of a random forest with 3
decision trees trained on 6 examples. (Yes, this is the same table as in
the Bagging section). The table shows which decision tree is used with
which example during OOB evaluation.
**Table 7. OOB evaluation. The numbers represent the number of times each
training example (#1 to #6) is used during the training of a given decision
tree.**
|                  | #1 | #2 | #3 | #4 | #5 | #6 | Examples for OOB evaluation |
|------------------|----|----|----|----|----|----|-----------------------------|
| original dataset | 1  | 1  | 1  | 1  | 1  | 1  |                             |
| decision tree 1  | 1  | 1  | 0  | 2  | 1  | 1  | #3                          |
| decision tree 2  | 3  | 0  | 1  | 0  | 2  | 0  | #2, #4, and #6              |
| decision tree 3  | 0  | 1  | 3  | 1  | 0  | 1  | #1 and #5                   |
In the example shown in Table 7, the OOB prediction for training example #1
is computed with decision tree #3 only (since decision trees #1 and #2 used
this example for training). In practice, on a dataset of reasonable size and
with a few decision trees, all the examples have an OOB prediction.
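To make this concrete, here is a minimal illustrative sketch (plain NumPy, not
YDF) that applies the core idea to the bag counts of Table 7. The per-tree
predictions and labels are invented purely for the demonstration:

```python
import numpy as np

# Number of times each training example (#1..#6) appears in each tree's
# training bag, copied from Table 7.
bag_counts = np.array([
    [1, 1, 0, 2, 1, 1],  # decision tree 1
    [3, 0, 1, 0, 2, 0],  # decision tree 2
    [0, 1, 3, 1, 0, 1],  # decision tree 3
])

# Hypothetical binary predictions of each tree on each example, and labels.
tree_predictions = np.array([
    [1, 0, 1, 1, 0, 1],  # decision tree 1
    [1, 1, 0, 1, 0, 0],  # decision tree 2
    [0, 0, 1, 1, 1, 0],  # decision tree 3
])
labels = np.array([1, 0, 1, 1, 0, 0])

# An example is "out of bag" for a tree if the tree never saw it (count == 0).
oob_mask = bag_counts == 0

# OOB prediction: majority vote over the OOB trees only.
votes = (tree_predictions * oob_mask).sum(axis=0)
n_oob_trees = oob_mask.sum(axis=0)
oob_predictions = votes * 2 > n_oob_trees  # majority of OOB trees vote "1"

# Examples seen by every tree have no OOB prediction and are skipped.
has_oob = n_oob_trees > 0
oob_accuracy = (oob_predictions[has_oob] == labels[has_oob]).mean()
print(f"OOB accuracy: {oob_accuracy:.2f}")
```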
YDF Code
In YDF, the OOB evaluation is available in the training logs if the model is
trained with `compute_oob_performances=True`.
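A minimal usage sketch, assuming a pandas DataFrame with a label column named
`label` (the dataset and column names below are placeholders):

```python
import pandas as pd
import ydf  # pip install ydf

# Placeholder dataset; in practice, use your own training DataFrame.
train_df = pd.DataFrame({
    "f1": [0.1, 0.9, 0.4, 0.8, 0.2, 0.7],
    "f2": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    "label": [0, 1, 0, 1, 0, 1],
})

model = ydf.RandomForestLearner(
    label="label",
    compute_oob_performances=True,  # record OOB evaluation in the training logs
).train(train_df)

# The OOB evaluation appears in the training logs shown by describe().
model.describe()
```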
OOB evaluation is also an effective way to compute permutation variable
importance for random forest models. Remember from
[Variable importances](/machine-learning/decision-forests/variable-importances)
that permutation variable importance measures the importance of a variable by
measuring the drop in model quality when this variable is shuffled. The random
forest "OOB permutation variable importance" is a permutation variable
importance computed using the OOB evaluation.
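The mechanism can be sketched independently of any library. In this
illustrative snippet, `oob_accuracy_fn` is a hypothetical stand-in for a
routine that returns the model's OOB accuracy on the given feature matrix:

```python
import numpy as np

def permutation_importance(oob_accuracy_fn, features, labels,
                           column, n_repeats=10, seed=0):
    """Mean drop in OOB accuracy when one feature column is shuffled.

    `features` is a NumPy feature matrix and `oob_accuracy_fn(features,
    labels)` is assumed to return the model's OOB accuracy; both are
    placeholders for the real data and evaluation routine.
    """
    rng = np.random.default_rng(seed)
    baseline = oob_accuracy_fn(features, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = features.copy()
        rng.shuffle(shuffled[:, column])  # destroy the feature/label relation
        drops.append(baseline - oob_accuracy_fn(shuffled, labels))
    # A large positive mean drop indicates an important feature.
    return float(np.mean(drops))
```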
YDF Code
In YDF, the OOB permutation variable importances are available in the training
logs if the model is trained with
`compute_oob_variable_importances=True`.
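A usage sketch under the same placeholder assumptions as above;
`model.variable_importances()` is assumed to expose the computed importances
(check your YDF version's API):

```python
model = ydf.RandomForestLearner(
    label="label",
    compute_oob_variable_importances=True,  # compute OOB permutation importances
).train(train_df)

# Variable importances, including the OOB permutation variable importances,
# are available on the trained model.
print(model.variable_importances())
```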
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-25 UTC."],[[["\u003cp\u003eRandom forests utilize out-of-bag (OOB) evaluation, eliminating the need for a separate validation dataset by treating the training set as a test set in a cross-validation-like approach.\u003c/p\u003e\n"],["\u003cp\u003eOOB evaluation leverages the fact that each decision tree in the forest is trained on approximately 67% of the training data, allowing the remaining 33% to be used for evaluation, similar to a test set.\u003c/p\u003e\n"],["\u003cp\u003eDuring OOB evaluation, predictions for a specific example are generated using only the decision trees that did not include that example in their training process.\u003c/p\u003e\n"],["\u003cp\u003eYDF provides access to OOB evaluation metrics and OOB permutation variable importances within the training logs, offering insights into model performance and feature relevance.\u003c/p\u003e\n"]]],[],null,["\u003cbr /\u003e\n\nOut-of-bag evaluation\n---------------------\n\nRandom forests do not require a validation dataset. Most random forests use a\ntechnique called **out-of-bag-evaluation** (**OOB** **evaluation**) to evaluate\nthe quality of the model. OOB evaluation treats the training set as if it were\non the test set of a cross-validation.\n\nAs explained earlier, each decision tree in a random forest is typically trained\non \\~67% of the training examples. Therefore, each decision tree does not see\n\\~33% of the training examples. The core idea of OOB-evaluation is as follows:\n\n- To evaluate the random forest on the training set.\n- For each example, only use the decision trees that did not see the example during training.\n\nThe following table illustrates OOB evaluation of a random forest with 3\ndecision trees trained on 6 examples. (Yes, this is the same table as in\nthe Bagging section). The table shows which decision tree is used with\nwhich example during OOB evaluation.\n\n**Table 7. OOB Evaluation - the numbers represent the number of times a given\ntraining example is used during training of the given example**\n\n| | Training examples |||||| Examples for OOB Evaluation |\n| | #1 | #2 | #3 | #4 | #5 | #6 | |\n| original dataset | 1 | 1 | 1 | 1 | 1 | 1 |\n| decision tree 1 | 1 | 1 | 0 | 2 | 1 | 1 | #3 |\n| decision tree 2 | 3 | 0 | 1 | 0 | 2 | 0 | #2, #4, and #6 |\n| decision tree 3 | 0 | 1 | 3 | 1 | 0 | 1 | #1 and #5 |\n|------------------|----|----|----|----|----|----|-----------------------------|\n\nIn the example shown in Table 7, the OOB predictions for training example 1\nwill be computed with decision tree #3 (since decision trees #1 and #2 used\nthis example for training). In practice, on a reasonable size dataset and\nwith a few decision trees, all the examples have an OOB prediction. \nYDF Code\nIn YDF, the OOB-evaluation is available in the training logs if the model is trained with `compute_oob_performances=True`.\n\nOOB evaluation is also effective to compute permutation variable importance for\nrandom forest models. 
Remember from [Variable\nimportances](/machine-learning/decision-forests/variable-importances) that\npermutation variable importance measures the importance of a variable by\nmeasuring the drop of model quality when this variable is shuffled. The random\nforest \"OOB permutation variable importance\" is a permutation variable\nimportance computed using the OOB evaluation. \nYDF Code\nIn YDF, the OOB permutation variable importances are available in the training logs if the model is trained with `compute_oob_variable_importances=True`."]]