This source would argue the other way:
https://cissm.umd.edu/sites/default/files/2019-08/2000-UCS-C...
It seems that a warhead or decoy would fill only a single pixel on the sensor until you got very close to the target, 1 km or so, at which point you have to execute a high-g maneuver in a few ms. The state of the art is a "two color" system, but you could make either the warhead or the decoys any "color" you want with surface treatments and/or thermal management (worst case: put the warhead inside a shroud cooled to liquid nitrogen temperature and fire the weapon at night, when it won't be illuminated by the sun).
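To make the "color" point concrete, here is a rough sketch of what a two-band ratio measures for an unresolved object. It is not how any real seeker works. The band centers (4 µm and 10 µm), the emissivity values, and all function names are my own assumptions for illustration. It uses only Planck's law: for a single-pixel target the area and range cancel in the ratio, so what is left depends on temperature and on the emissivity in each band.

```python
import math

H = 6.62607015e-34   # Planck constant, J*s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def planck_radiance(wavelength_m, temp_k):
    """Blackbody spectral radiance from Planck's law (W / m^2 / sr / m)."""
    x = H * C / (wavelength_m * K_B * temp_k)
    return (2.0 * H * C**2 / wavelength_m**5) / math.expm1(x)


def two_color_ratio(temp_k, emis_short=1.0, emis_long=1.0,
                    lam_short=4e-6, lam_long=10e-6):
    """Short-band / long-band signal ratio for an unresolved object.

    Band centers and emissivities are hypothetical. Area and range
    cancel, leaving temperature and the two band emissivities.
    """
    short = emis_short * planck_radiance(lam_short, temp_k)
    long_ = emis_long * planck_radiance(lam_long, temp_k)
    return short / long_


def apparent_temperature(ratio, lam_short=4e-6, lam_long=10e-6,
                         lo=50.0, hi=1000.0):
    """Temperature a gray-body assumption would infer from a band ratio.

    The gray-body ratio rises monotonically with temperature, so a
    simple bisection over [lo, hi] kelvin is enough for a sketch.
    """
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if two_color_ratio(mid, lam_short=lam_short, lam_long=lam_long) < ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    gray = two_color_ratio(300.0)                  # plain 300 K gray body
    coated = two_color_ratio(300.0, 0.2, 0.9)      # same 300 K, made-up coating
    print(apparent_temperature(gray))
    print(apparent_temperature(coated))
```

With these made-up numbers, the coated object at the same physical temperature gives a lower ratio and so looks colder to a sensor that assumes a gray body. A shroud that is actually cold (say 77 K) shifts the ratio further still. That is the sense in which either object can be given whatever "color" the attacker wants.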
There was a test in the early 2000s I read about where they were able to pick out a warhead that was intermediate in properties among a set of decoys with various surface treatments. That's great, but they knew exactly what they were up against, which we wouldn't if it were a North Korean missile.
I'd have more faith in the discrimination abilities of ground-based radars in the 12 GHz range than in the "two color" focal plane imager system.
This gets deep into the classified domain, so no one is going to be talking about it here. :)