Yeah, but if you wait a month and test it again, you've got your 10-25% out-of-sample data.

I really don't see a benefit.

What if we pick January 2022 as our out-of-sample data?
Why should that data be allowed to trump everything else just because it was ignored during fine-tuning?