In general, I'm deeply suspicious of comparisons where the approach is, "I touched no settings other than to swap between the different components being tested". First, I think it makes assumptions about the non-interaction between the components, and about the effects of swapping them out, that haven't been proven.
So if the question is, "Does the modeler sound just like the real power amp?", we still don't know. All we do know is that the answer to the question, "Does the modeler sound just like the real power amp if you add the artificial constraint that you can't adjust any parameters?", is "No".
Second, and most important, it ignores the fact that this doesn't reflect any real-world application. Those other settings are there, and any normal user would adjust them to get the best sound possible out of the rig that they're using. It's entirely possible that you can get a sound which is MORE to your liking than the real power amp by messing with power amp settings like compression, or with transformer settings.
A more interesting test to me would be to fiddle with each setup to get the best possible sound out of it, then compare all three and ask which you like more.
One last thing: this test was done with the power amp into a resistive load. There's no way of knowing if that load presents exactly the same to the power amp as a real speaker cabinet. So if the argument is, "The default settings on the power amp model in the AF2 should give exactly the same sound as the real power amp", you would have to know whether the power amp tone recorded in the test was the same as it was when Fractal modeled it, which might have been through a real cabinet or through a different dummy load.
All that being said, I always find it interesting when people post these real-world tests, because it makes it clear that the differences we hear are always highly nuanced. This software is pretty damned close to the real thing, and at this point I'd say that the listener's playback device has more impact on the sound than the differences we were hearing in this test.