Why does the same algorithm give very different metrics on similar datasets?

I am fairly new to ML.

I used a Random Forest and tuned its hyperparameters for a binary classification problem on a dataset (dataset A). I got an F1 score of 0.78. I then used a second dataset (dataset B). It was very similar to dataset A (same variables and the same distribution of classes in the target variable). I built and trained a separate Random Forest model for dataset B. I expected the F1 score to be around 0.78, but the F1 score for dataset B was 0.50.

Why could there be such a large difference between the F1 scores of the two datasets?

Both datasets (A & B) are very similar to each other and I trained separate models on both of them.
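
The workflow described above can be sketched roughly as follows. This is a minimal illustration assuming scikit-learn, not the exact code used: the synthetic data from `make_classification`, the helper name `tune_and_score`, and the small parameter grid are all hypothetical stand-ins for the real datasets and search space.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split


def tune_and_score(X, y, seed=0):
    """Tune a Random Forest on a train split and return F1 on a held-out test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Hypothetical, deliberately small grid; the real search space is not given in the post.
    param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid,
        scoring="f1",
        cv=3,
    )
    search.fit(X_train, y_train)
    # GridSearchCV refits the best model on the full training split by default.
    return f1_score(y_test, search.predict(X_test))


# Synthetic stand-ins for datasets A and B: same number of features and the
# same class balance, but generated independently of each other.
X_a, y_a = make_classification(
    n_samples=400, n_features=10, weights=[0.7, 0.3], random_state=1
)
X_b, y_b = make_classification(
    n_samples=400, n_features=10, weights=[0.7, 0.3], random_state=2
)

f1_a = tune_and_score(X_a, y_a)
f1_b = tune_and_score(X_b, y_b)
print(f"F1 on A: {f1_a:.2f}, F1 on B: {f1_b:.2f}")
```

Running the same function on both datasets keeps the split, tuning, and scoring identical, so a remaining gap is more likely to come from the data than from the procedure.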

Any input on how to approach this issue would be greatly appreciated. Thanks!
