Benchmarking Reasoning Reliability in Artificial Intelligence Models for Energy-System Analysis