I'm evaluating a model on an internal test set (about 750 samples), where accuracy is about 88%, and then on an external data set (about 180 samples).

The problem is that the performance is completely reversed. For example, where the internal data gave mostly true positives and true negatives, the external data gives mostly false positives and false negatives.

I am wondering why this happens.

I can think of two possible reasons. First, the external data may have different local minima from the internal data set. Second, there may not be enough data.

  • What does "performance is completely reversed" mean? Are your factor labels reversed? – user2974951 Oct 16 at 6:18
  • For example, if there were 400 true positives and 300 true negatives in the internal test data (about 750 samples), there were 80 false positives and 60 false negatives in the external data (about 180 samples). – Touch Too Oct 16 at 8:42
  • Why are you comparing true positives / negatives in your training data set with false positives / negatives in your test set? That does not make sense. You should use one metric for them all, e.g. accuracy or F-score. – user2974951 Oct 16 at 8:45
  • I'm not comparing training data with test data. I'm testing how well the model generalizes, first on the internal test data and then on the external data. – Touch Too Oct 16 at 8:53
  • "I'm testing to generalize..." Yeah, you are testing, and my previous point still holds: your comparisons do not make sense. – user2974951 Oct 16 at 9:04
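
The suggestion in the comments to use one metric for both sets can be sketched roughly as below. This is only an illustration in plain Python: the helper names `accuracy` and `f1_score` are made up here, and the counts the comment does not give (the internal errors and the external correct predictions) are hypothetical splits chosen only so the totals come to about 750 and 180.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Counts taken from the example in the comments where given.
# ASSUMPTION: internal fp/fn and external tp/tn are not stated in the thread;
# the values below are hypothetical fillers that match the stated totals.
internal = dict(tp=400, tn=300, fp=25, fn=25)  # 750 samples
external = dict(tp=20, tn=20, fp=80, fn=60)    # 180 samples

# Report the same two metrics for both data sets so they can be compared directly.
for name, c in (("internal", internal), ("external", external)):
    acc = accuracy(**c)
    f1 = f1_score(c["tp"], c["fp"], c["fn"])
    print(f"{name}: accuracy={acc:.3f}, F1={f1:.3f}")
```

Reporting the same metric on both sets makes the gap directly comparable. If accuracy on a binary task falls well below 0.5 on the external set, it may be worth checking the label coding first, as the first comment asks.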
