GPT-4 is included for some of the benchmarks, not all, and where it is included it scores much higher.
That is not surprising, since GPT-4 is still state-of-the-art and much bigger. Mistral is particularly impressive once you take the size of the model into account.