Nested cross-validation: which implementation to use? Do they serve different purposes?

Asked 5 months ago

Modified 5 months ago

Viewed 40 times

I am learning machine learning and exploring nested cross-validation. I don't understand the example given in scikit-learn: the model seems to learn from the whole dataset, and the evaluation does not appear to be performed on a hold-out set.

scikit documentation

scikit implementation

From what I read in Applied Predictive Modeling by Kuhn & Johnson, the model resulting from the inner loop should be evaluated on the hold-out set of the outer loop. The following post follows this approach: machinelearningmastery blog
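To make the procedure described above concrete, here is a minimal sketch of an explicit nested loop. It is not the code from the book or the blog post; the iris dataset, the `SVC` estimator, the parameter grid, and the fold counts are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}  # hypothetical grid

outer_cv = KFold(n_splits=5, shuffle=True, random_state=0)
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)

outer_scores = []
for train_idx, test_idx in outer_cv.split(X):
    X_train, y_train = X[train_idx], y[train_idx]
    X_test, y_test = X[test_idx], y[test_idx]

    # Inner loop: tune hyper-parameters using only the outer training part.
    search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=inner_cv)
    search.fit(X_train, y_train)

    # Outer loop: score the refit best model on the untouched hold-out part.
    outer_scores.append(search.score(X_test, y_test))

outer_scores = np.array(outer_scores)
print(f"nested CV accuracy: {outer_scores.mean():.3f} +/- {outer_scores.std():.3f}")
```

Each outer fold may select different hyper-parameters, so the mean score estimates the performance of the whole tuning procedure rather than of one fixed model.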

As I am far from a Python expert, could you tell me the advantages, drawbacks and purposes of each of these implementations?

asked Jun 4 at 20:23 by SamGG
edited Jun 5 at 6:30 by Guna

I don't see a difference between the methods in your three links. Can you clarify what difference you're asking about?
– Ben Reiniger ♦, commented Jun 4 at 22:11

In short, I feel the first approach leaks data because its inner loop uses the whole dataset. The first two links show the same approach: an inner loop that tunes hyper-parameters on the whole dataset, and an outer loop that evaluates model performance, also on the whole dataset. The third link shows what I consider a truly nested approach: the outer loop splits the dataset into a training part, which feeds the inner loop that tunes hyper-parameters, and a hold-out part, which is used to evaluate the performance of the tuned model.
– SamGG, commented Jun 5 at 5:11
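
One way to check which rows the inner search actually sees is to log the training-set size on every `fit` call. The sketch below assumes the scikit-learn example boils down to passing a `GridSearchCV` object to `cross_val_score`; `RecordingSVC` and `seen_sizes` are hypothetical helpers introduced only for this check, and the dataset, grid, and fold counts are likewise assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

seen_sizes = []  # hypothetical log: number of rows passed to each fit call


class RecordingSVC(SVC):
    """SVC that records how many rows each fit receives (illustration only)."""

    def fit(self, X, y, sample_weight=None):
        seen_sizes.append(len(X))
        return super().fit(X, y, sample_weight=sample_weight)


X, y = load_iris(return_X_y=True)  # 150 rows
param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}  # hypothetical grid

outer_cv = KFold(n_splits=5, shuffle=True, random_state=0)
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)

# Compact form: cross_val_score clones the search and refits it per outer fold.
clf = GridSearchCV(RecordingSVC(kernel="rbf"), param_grid, cv=inner_cv)
nested_scores = cross_val_score(clf, X, y, cv=outer_cv)

print("distinct training sizes seen:", sorted(set(seen_sizes)))
print("rows in full dataset:", len(X))
```

With 150 rows, 5 outer folds, and 3 inner folds, the logged sizes should be 80 (inner training folds) and 120 (refit on the outer training part), never 150. Under these assumptions, the search is fitted only on each outer training part and scored on the matching hold-out part.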
