New MACE-MP-0 benchmark shows 10x speedup over DFT for bulk properties
The benchmark compares MACE-MP-0 against standard PBE calculations for lattice constants, bulk moduli, and phonon-derived descriptors across 220 inorganic compounds. The authors report roughly one order of magnitude improvement in wall-clock throughput while keeping median lattice-constant error below 0.5%. What stood out to me is that they explicitly separate in-distribution oxides from out-of-distribution intermetallics, and the gap between MACE-MP-0 and PBE is much larger for the intermetallics.
I am curious how people here interpret this in practice. For high-throughput screening, we usually care less about absolute energy and more about ranking and filtering. If MACE can preserve ordering in unstable candidates, this could replace early-stage DFT in many pipelines. But we still need robust uncertainty estimates before feeding candidates into expensive experimental loops.
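To make "preserving ordering" concrete, here is a minimal sketch of a Spearman rank correlation between model and DFT energies for the same candidates. It is not taken from the paper. The function names and the example energies are made up for illustration, and it uses only NumPy.

```python
import numpy as np


def rankdata_average(values):
    """Return 1-based ranks; tied values share their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks of tied values with their mean rank.
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks


def spearman_rho(model_energies, dft_energies):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    r_model = rankdata_average(model_energies)
    r_dft = rankdata_average(dft_energies)
    return float(np.corrcoef(r_model, r_dft)[0, 1])


# Illustrative numbers only (eV/atom), not benchmark data.
dft = [0.00, 0.05, 0.12, 0.30, 0.45]
model = [0.01, 0.07, 0.10, 0.41, 0.33]
rho = spearman_rho(model, dft)
```

A value near 1 means the model ranks candidates almost the same way DFT does, even if the absolute energies are off. In this toy case only the last two candidates swap places.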
Paper Reference
DOI: 10.48550/arXiv.2602.04219
arXiv: 2602.04219
Comments
We tested MACE-MP-0 on oxide slabs and the speedup is real, but error bars widen for undercoordinated surface atoms. For screening this is still great if you re-rank top candidates with DFT.
Did you calibrate uncertainty with an ensemble or rely on latent distance? We saw latent metrics underestimate risk on defect-rich structures.
Ensemble of five checkpoints plus disagreement threshold. Latent distance alone was too optimistic for oxygen-vacancy migration barriers.
Same finding here. We now trigger DFT whenever force disagreement exceeds 0.12 eV/Å during active learning.
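For anyone who wants to try the same gate, a rough sketch of an ensemble-disagreement trigger follows. The thread does not say how "force disagreement" is defined. This sketch assumes it is the largest per-atom norm of the component-wise standard deviation across ensemble members. The function names are hypothetical, and the forces are assumed to arrive as a plain array from whatever calculator you use.

```python
import numpy as np

# Threshold quoted in this thread, in eV/Å.
FORCE_DISAGREEMENT_THRESHOLD = 0.12


def max_force_disagreement(ensemble_forces):
    """Largest per-atom disagreement across an ensemble.

    ensemble_forces: array-like of shape (n_models, n_atoms, 3), in eV/Å.
    """
    forces = np.asarray(ensemble_forces, dtype=float)
    if forces.ndim != 3 or forces.shape[2] != 3:
        raise ValueError("expected shape (n_models, n_atoms, 3)")
    # Standard deviation over models for every force component.
    per_component_std = forces.std(axis=0)
    # Collapse x, y, z into a single disagreement value per atom.
    per_atom = np.linalg.norm(per_component_std, axis=1)
    return float(per_atom.max())


def needs_dft(ensemble_forces, threshold=FORCE_DISAGREEMENT_THRESHOLD):
    """True when the structure should be sent to DFT for labeling."""
    return max_force_disagreement(ensemble_forces) > threshold


# Five models, two atoms: the models agree everywhere except one
# model's x-component on atom 0.
agree = np.zeros((5, 2, 3))
disagree = agree.copy()
disagree[4, 0, 0] = 1.0
```

Taking the maximum over atoms means a single poorly described site (a vacancy neighbor, say) is enough to trigger DFT. A mean over atoms would dilute exactly the cases discussed above.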
The paper's intermetallic benchmark is the most important part. Community datasets are still oxide-heavy, so this kind of stress test is overdue.
Anonymous industry perspective: we care about ranking stability much more than absolute energy error. If top-20 candidates stay consistent, 10x speedup is transformative.
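One simple way to quantify "top-20 stays consistent" is the overlap between the two shortlists. This is a sketch under my own assumptions: scores are stored as dicts keyed by candidate ID, lower is better (for example energy above hull), and the function name is hypothetical.

```python
def top_k_overlap(scores_a, scores_b, k=20):
    """Fraction of the k best (lowest-score) candidates shared by both rankings."""
    if set(scores_a) != set(scores_b):
        raise ValueError("both rankings must cover the same candidates")
    k = min(k, len(scores_a))
    if k <= 0:
        raise ValueError("need at least one candidate and k >= 1")
    top_a = set(sorted(scores_a, key=scores_a.get)[:k])
    top_b = set(sorted(scores_b, key=scores_b.get)[:k])
    return len(top_a & top_b) / k


# Illustrative values only, not benchmark data.
model_scores = {"A": 0.00, "B": 0.10, "C": 0.20, "D": 0.30}
dft_scores = {"A": 0.05, "B": 0.30, "C": 0.10, "D": 0.20}
overlap = top_k_overlap(model_scores, dft_scores, k=2)
```

An overlap of 1.0 means the shortlist would be identical whichever method produced it. Here the two methods agree on only one of their top two candidates.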
I can add a weekly benchmark digest if people want side-by-side updates for MACE, NequIP, and ALIGNN variants.
Would love benchmarks on hybrid perovskites too. Most force fields still break on soft rotational modes.