Well, the real answer is that it is usually calculated as a convolution of the corresponding kernel function(s) with the signal in the spatial domain, which is essentially a sum of multiplications over a 2D sliding window. That is equivalent to a complex multiplication (phase and amplitude) in the frequency domain. I'm not sure how helpful that is as a reply, though, even if it is reasonably accurate.
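If it helps to see that equivalence with numbers, here is a minimal sketch. To keep it short it uses 1D instead of 2D, a circular (wrap-around) convolution, and a naive DFT written with the standard library. The signal and the blur kernel are made-up values, not measurements of any real lens.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (fine for a handful of samples)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

def circular_convolve(signal, kernel):
    """Sliding-window sum of products, with wrap-around at the edges."""
    n = len(signal)
    return [sum(signal[j] * kernel[(i - j) % n] for j in range(n))
            for i in range(n)]

# Hypothetical test data: a short bright bar and a small symmetric blur
# kernel centred on index 0 (weights 0.25, 0.5, 0.25).
signal = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

# Route 1: convolve directly in the spatial domain.
spatial = circular_convolve(signal, kernel)

# Route 2: multiply the complex spectra, then transform back.
spectral = idft([a * b for a, b in zip(dft(signal), dft(kernel))])

# Largest disagreement between the two routes (rounding error only).
max_diff = max(abs(s - t.real) for s, t in zip(spatial, spectral))
print(max_diff < 1e-9)
```

Both routes give the same blurred bar up to floating-point rounding, which is why the per-frequency multiplication is a legitimate shortcut.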
Resolution is often measured as the frequency where, for example, half the contrast remains (MTF50). You can multiply the values of two transfer functions at the same frequency to get their combined response at that frequency, but the new MTF50 will be the frequency where the product of the two individual transfer functions drops to 50%. So it's not just multiplying two MTF50 values together, nor adding them somehow.
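A rough sketch of that procedure, under the assumption that both transfer functions have a simple Gaussian shape (real MTF curves generally do not) and with made-up MTF50 values for the lens and the air. The function names are just for illustration.

```python
import math

def gaussian_mtf(f, mtf50):
    """Hypothetical Gaussian-shaped MTF that equals 0.5 at f == mtf50."""
    return math.exp(-math.log(2.0) * (f / mtf50) ** 2)

def combined_mtf(f, mtf50_values):
    """System MTF at frequency f: product of the component MTFs at that f."""
    result = 1.0
    for m in mtf50_values:
        result *= gaussian_mtf(f, m)
    return result

def find_mtf50(mtf, f_lo=0.0, f_hi=1000.0, tol=1e-9):
    """Bisection for the frequency where a steadily falling MTF crosses 0.5."""
    while f_hi - f_lo > tol:
        mid = 0.5 * (f_lo + f_hi)
        if mtf(mid) > 0.5:
            f_lo = mid   # still above 50% contrast, crossing is higher
        else:
            f_hi = mid   # already below 50% contrast, crossing is lower
    return 0.5 * (f_lo + f_hi)

# Assumed example values, in arbitrary frequency units (e.g. cycles/mm).
lens_mtf50 = 40.0
air_mtf50 = 30.0

system_mtf50 = find_mtf50(lambda f: combined_mtf(f, [lens_mtf50, air_mtf50]))
print(round(system_mtf50, 3))  # 24.0
```

With these numbers the combined MTF50 comes out at 24, which is lower than either component and is neither the product nor the sum of 40 and 30.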
In this particular case, the heat distortion will most often be dominant, so there will be only small benefits from a better lens when the conditions get bad.
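To put rough numbers on that: if you assume Gaussian-shaped MTF curves (a simplification), the combined MTF50 has the closed form 1 / sqrt(sum of 1 / MTF50_i^2). The values below are invented for illustration, with the air much worse than either lens.

```python
import math

def combined_mtf50_gaussian(*mtf50_values):
    """Combined MTF50, valid only under the Gaussian-shape assumption."""
    return 1.0 / math.sqrt(sum(1.0 / m ** 2 for m in mtf50_values))

# Assumed values in arbitrary frequency units: bad heat shimmer, two lenses.
air = 10.0
ok_lens = combined_mtf50_gaussian(40.0, air)      # decent lens
great_lens = combined_mtf50_gaussian(80.0, air)   # lens twice as sharp

# Relative improvement in system MTF50 from doubling the lens MTF50.
gain_percent = 100.0 * (great_lens / ok_lens - 1.0)
print(round(ok_lens, 2), round(great_lens, 2), round(gain_percent, 1))
```

Under these assumptions the system goes from about 9.7 to about 9.9, a gain of roughly 2%, even though the second lens is twice as sharp on its own.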