The benchmark compares MACE-MP-0 against standard PBE calculations for lattice constants, bulk moduli, and phonon-derived descriptors across 220 inorganic compounds. The authors report roughly one order of magnitude improvement in wall-clock throughput while keeping median lattice-constant error below 0.5%. What stood out to me is that they explicitly separate in-distribution oxides from out-of-distribution intermetallics, and the performance gap is much larger in the latter. I am curious how people here interpret this in practice. For high-throughput screening, we usually care less about absolute energies and more about ranking and filtering. If MACE can preserve the relative energy ordering of candidates, including unstable ones, it could replace early-stage DFT in many pipelines. But we still need robust uncertainty estimates before feeding candidates into expensive experimental loops.
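To make "preserve ordering" concrete, this is the kind of check I have in mind; a minimal sketch assuming two aligned energy arrays for the same candidate set, with an illustrative top_fraction rather than anything from the paper:

```python
# Sketch of a ranking-preservation check between a surrogate model and DFT.
# Assumes aligned arrays of energies for the same candidates; top_fraction is illustrative.
import numpy as np
from scipy.stats import spearmanr

def ranking_metrics(e_surrogate, e_dft, top_fraction=0.1):
    """Spearman rank correlation plus recall of the DFT top fraction by the surrogate."""
    e_surrogate = np.asarray(e_surrogate)
    e_dft = np.asarray(e_dft)
    rho, _ = spearmanr(e_surrogate, e_dft)
    k = max(1, int(top_fraction * len(e_dft)))
    top_dft = set(np.argsort(e_dft)[:k])         # most stable candidates by DFT
    top_sur = set(np.argsort(e_surrogate)[:k])   # most stable candidates by surrogate
    recall = len(top_dft & top_sur) / k
    return rho, recall
```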
The new model combines message passing with symmetry-aware tokenization and reports stronger performance on formation energy and elastic tensor prediction benchmarks than CGCNN and MEGNet baselines. Their ablation suggests most gains come from enforcing space-group constraints during training rather than raw parameter count. I would still like to see broader transfer tests. Many benchmark sets overlap heavily with public repositories used in pretraining, so true out-of-domain generalization is hard to judge. If anyone has tried this model on low-symmetry organic-inorganic hybrids, please share failure cases.
The release is exciting, but I want to temper expectations. A large fraction of the candidate structures are labeled as potentially synthesizable based on formation-energy filters and model predictions, not a full thermodynamic phase-space analysis. For discovery workflows this is great, yet downstream teams should not treat every listed structure as synthesis-ready. For people who have already integrated GNoME into active-learning loops, what validation protocol are you using? We currently cross-check with Materials Project hull distances and then run a smaller DFT relaxation set before any Bayesian optimization step. Curious whether others see systematic biases in nitrides or chalcogenides.
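For concreteness, a stripped-down sketch of our hull-distance cross-check step, assuming pymatgen plus the mp-api client; the candidate entry is a placeholder and the chemsys is just an example:

```python
# Sketch of the hull-distance cross-check; MPRester() reads the MP_API_KEY environment
# variable. In practice you would also apply MP-compatible energy corrections before
# mixing your own entries with Materials Project entries.
from mp_api.client import MPRester
from pymatgen.analysis.phase_diagram import PhaseDiagram
from pymatgen.entries.computed_entries import ComputedEntry

def e_above_hull(candidate: ComputedEntry, chemsys: list[str]) -> float:
    with MPRester() as mpr:
        ref_entries = mpr.get_entries_in_chemsys(chemsys)  # e.g. ["Li", "P", "S"]
    pd = PhaseDiagram(ref_entries)
    return pd.get_e_above_hull(candidate)                  # eV/atom above the convex hull
```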
After three years as a postdoc, I moved to an industrial battery startup. The biggest change was timeline pressure: in academia we optimize for novelty, while in industry we optimize for decisions under uncertainty. I still run DFT and surrogate models, but now success means reducing experimental iteration cycles, not adding one more figure to a paper. Compensation and work-life balance improved for me, but I miss mentoring students and longer exploratory projects. If you are considering the switch, ask teams how they validate models experimentally and who owns failed predictions. The answer reveals whether data science is strategic or just a service function.
We fine-tuned Llama-3 on ~120k crystal structure descriptions, ICSD entries, and Materials Project documentation. The model can answer questions about space groups, Wyckoff positions, and common synthesis routes. It is far from replacing domain expertise but useful as a quick lookup tool and teaching aid. Weights and eval notebook available on HuggingFace.
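A minimal loading sketch with transformers; the repo id below is a placeholder, so check the HuggingFace page for the actual path:

```python
# Placeholder repo id; substitute the actual HuggingFace path from the release page.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "your-org/crystal-llama-3-8b"  # hypothetical identifier
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = "Which Wyckoff positions are occupied in rutile TiO2 (space group P4_2/mnm)?"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```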
We keep publishing impressive MAE numbers, but independent groups still struggle to reproduce even basic baselines. Missing random seeds, undocumented filtering, and leakage across compositional families are everywhere. This does not just hurt credibility; it wastes months of graduate student time and inflates confidence in fragile models. I think conferences and journals should require executable artifacts, data version hashes, and explicit split definitions for materials datasets. If software engineering standards stay optional, materials informatics will keep repeating avoidable mistakes. What lightweight standards could the community realistically adopt this year?
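As an example of how lightweight this could be, something along these lines would already help; file names and fields below are illustrative:

```python
# Example of a minimal reproducibility artifact: a content hash of the exact dataset
# file plus an explicit, committed split definition.
import hashlib
import json

def sha256_of(path: str, chunk: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

split = {
    "dataset_file": "formation_energies_v3.csv",
    "dataset_sha256": sha256_of("formation_energies_v3.csv"),
    "split_type": "composition-family",
    "random_seed": 42,
    "test_families": ["Fe-Li-O", "Mn-Na-O"],  # illustrative family keys
}
with open("split_definition.json", "w") as f:
    json.dump(split, f, indent=2)
```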
We just open-sourced our workflow for training equivariant force fields from VASP trajectories. The pipeline handles dataset ingestion, neighbor-list caching, distributed training, and active-learning uncertainty triggers. We spent most of our time on data cleaning because mislabeled stress tensors were silently hurting training stability. The repository includes ready-to-run templates for silicon, Li-ion electrolyte clusters, and oxide surfaces. Feedback is very welcome, especially on experiment tracking and model-card sections. If there is interest, I can post a companion notebook showing integration with ASE geometry optimization.
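Here is roughly the shape of the ASE integration I would show in the companion notebook; EMT stands in for the trained force field so the snippet runs on its own, and in practice you would attach the calculator the repo exports:

```python
# Minimal ASE relaxation sketch; EMT is a stand-in for the trained force field.
from ase.build import bulk
from ase.calculators.emt import EMT
from ase.optimize import BFGS

atoms = bulk("Cu", "fcc", a=3.6)            # toy structure; use read("POSCAR") in real use
atoms.rattle(stdev=0.05, seed=0)            # perturb so there is something to relax
atoms.calc = EMT()                          # placeholder for the trained force field
opt = BFGS(atoms, trajectory="relax.traj", logfile="relax.log")
opt.run(fmax=0.02)                          # eV/Å convergence threshold on forces
```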
We launched a continuously updated leaderboard that ranks ML models on their ability to predict thermodynamic stability across a held-out test set derived from WBM. The site includes interactive Pareto plots, per-model error breakdowns, and downloadable prediction files. Source code is fully open and contributions to add new models are welcome via pull request.
This workflow combines a pretrained graph model with uncertainty-aware acquisition to rank 2D candidates for hydrogen evolution and thermal stability. They evaluate around 70,000 hypothetical structures and only run expensive DFT for the top uncertainty-calibrated subset. The hit rate seems significantly better than random or heuristic filtering. One concern is that the candidate generator may be biased toward known motifs, limiting novelty. Still, the closed-loop process is a nice template for groups that cannot afford brute-force DFT on full candidate spaces.
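For anyone who has not seen this pattern before, a rough sketch of an uncertainty-aware ranking step; this is the general idea, not the paper's exact acquisition function:

```python
# Trade predicted stability off against model uncertainty, then send only the top
# slice of candidates to DFT.
import numpy as np

def acquisition_rank(pred_e_hull, uncertainty, kappa=1.0, dft_budget=500):
    """Lower predicted energy above hull is better; subtracting kappa * sigma
    also favors candidates the model is unsure about (exploration)."""
    score = np.asarray(pred_e_hull) - kappa * np.asarray(uncertainty)
    order = np.argsort(score)        # most promising candidates first
    return order[:dft_budget]        # indices to pass to DFT
```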
After a few failed projects, our internal checklist now starts with data balance across local environments rather than model architecture. Equivariant models overfit fast when rare coordination motifs are underrepresented, and force-label noise from unconverged SCF steps can dominate the loss. We also found that monitoring physically meaningful validation metrics, like energy ranking consistency for polymorphs, is more useful than aggregate MAE alone. Curious what diagnostics others track during training to catch failure modes early.
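As an example of the polymorph-ranking diagnostic, we track something like the following; the input format (composition mapped to predicted and reference energy arrays) is just for illustration:

```python
# Kendall tau between predicted and reference energies within each composition's
# polymorph set; averaged over compositions with at least two polymorphs.
import numpy as np
from scipy.stats import kendalltau

def polymorph_rank_consistency(groups):
    """groups: {composition: (e_pred, e_ref)} with at least two polymorphs each."""
    taus = []
    for e_pred, e_ref in groups.values():
        if len(e_ref) < 2:
            continue
        tau, _ = kendalltau(e_pred, e_ref)
        taus.append(tau)
    return float(np.mean(taus))      # 1.0 means the orderings always agree
```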
This paper compiles more than 400 calculations across Mn, Fe, Co, and Ni oxides and evaluates how the choice of Hubbard U affects oxidation energetics and magnetic ordering. Their biggest contribution is a consistent protocol for fitting U against both formation enthalpy and band-gap constraints, rather than matching only one observable. I appreciate that the supplementary information includes full INCAR sets and pseudopotential choices. Reproducing literature values has been frustrating because many papers omit these details. Has anyone here tried applying their fitted U values to mixed-anion systems like oxyfluorides?
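For anyone setting this up, a sketch of the relevant INCAR tags via pymatgen; the U/J values below are placeholders rather than the paper's fitted numbers, and the list order follows the species order in the POSCAR (here: transition metal, then O):

```python
# Sketch of DFT+U tags written through pymatgen; values are placeholders.
from pymatgen.io.vasp.inputs import Incar

incar = Incar({
    "LDAU": True,
    "LDAUTYPE": 2,        # Dudarev scheme, effective U = U - J
    "LDAUL": [2, -1],     # apply U to d orbitals on the TM site, none on O
    "LDAUU": [3.9, 0.0],  # placeholder U values in eV
    "LDAUJ": [0.0, 0.0],
    "LMAXMIX": 4,         # mix the d-channel density matrix as well
})
incar.write_file("INCAR")
```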
Short answer: yes. Composition overlap across train and test sets can create a false sense of model robustness, especially when crystal prototypes are near-duplicates. We recently reran a published benchmark with composition-family splits and observed performance drops of 30-50% depending on target property. Leakage is not always malicious; many datasets were never designed for ML benchmarking. But if we do not define clear split protocols, we cannot compare papers meaningfully. I would love to see a community-maintained suite of leakage-resistant evaluation splits.
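A toy sketch of what I mean by a composition-family split: group rows by chemical system so near-duplicate prototypes of the same family never straddle train and test. Column names are illustrative.

```python
import pandas as pd
from pymatgen.core import Composition
from sklearn.model_selection import GroupShuffleSplit

df = pd.DataFrame({
    "formula": ["LiFeO2", "Fe2O3", "FeO", "NaCl", "KCl", "MgO"],
    "target":  [0.10, 0.15, 0.12, 0.20, 0.25, 0.30],
})
# Group key: the alphabetically sorted element set, e.g. "Fe-Li-O".
df["chemsys"] = df["formula"].map(lambda f: Composition(f).chemical_system)

gss = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(gss.split(df, groups=df["chemsys"]))
```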
Built a web viewer for the OC20/OC22 datasets so you can browse adsorption configurations without downloading the full 800 GB archive. You can filter by adsorbate, surface element, and Miller index, then inspect geometries in an interactive 3D viewer. Backend uses a pre-indexed DuckDB file for sub-second queries.
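Example query against the pre-indexed file; the table and column names below are illustrative, so see the README for the actual schema:

```python
# Query the pre-indexed DuckDB file directly; schema shown here is illustrative.
import duckdb

con = duckdb.connect("oc20_index.duckdb", read_only=True)
rows = con.execute(
    """
    SELECT system_id, adsorbate, miller_index, adsorption_energy
    FROM configurations
    WHERE adsorbate = ? AND surface_elements LIKE ?
    ORDER BY adsorption_energy
    LIMIT 20
    """,
    ["*OH", "%Pt%"],
).fetchall()
print(rows[:3])
```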
I wrote a practical walkthrough for students who want to run GPAW from Python without juggling too many environment variables. The tutorial covers reproducible conda setup, PAW dataset checks, k-point path generation, and plotting with matplotlib. The main pain point was making sure MPI launch commands behave consistently across local and cluster environments. I included a minimal silicon example and a section on common numerical pitfalls, especially basis-set convergence. Happy to extend this with spin-polarized examples if that helps newcomers.
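The minimal silicon example looks roughly like this; the cutoff and k-point values here are illustrative, not converged settings:

```python
# Plane-wave GPAW calculation on bulk silicon with PBE.
from ase.build import bulk
from gpaw import GPAW, PW

si = bulk("Si", "diamond", a=5.43)
si.calc = GPAW(mode=PW(400),          # 400 eV plane-wave cutoff
               xc="PBE",
               kpts=(8, 8, 8),
               txt="si_pbe.txt")      # log file; handy when comparing MPI launches
energy = si.get_potential_energy()    # eV
si.calc.write("si_pbe.gpw")
print(f"Total energy: {energy:.3f} eV")
```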
I am running slab relaxations for mixed-halide perovskites and keep hitting charge sloshing when vacuum exceeds 18 Å. I tried reducing mixing amplitude and switching between ALGO = Normal and ALGO = Fast, but ionic steps still oscillate. K-point density is moderate (4×4×1), ENCUT is 1.3× ENMAX, and I include a dipole correction. Has anyone found a stable recipe for these systems, especially when spin-orbit coupling is enabled? I can converge bulk cells reliably, but slab geometries with defects are painful. Any tips on pre-relaxation strategies or good default mixing parameters would save me days of queue time.
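For reference, the slab-specific tags I described, written out via pymatgen so the discussion has something concrete; ENCUT and the mixing values are omitted or left as comments, since the mixing settings are exactly what I am asking about:

```python
# Slab-specific settings as described above; mixing and cutoff values intentionally
# not pinned down here because they are part of the question.
from pymatgen.io.vasp.inputs import Incar, Kpoints

incar = Incar({
    "ALGO": "Normal",     # also tried ALGO = Fast
    "LDIPOL": True,       # dipole correction
    "IDIPOL": 3,          # applied along the slab normal (z)
    "LSORBIT": True,      # spin-orbit coupling enabled
    # ENCUT set to 1.3x the POTCAR ENMAX; AMIX/BMIX lowered from the defaults,
    # with the exact values varied between runs.
})
kpoints = Kpoints.gamma_automatic((4, 4, 1))
```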
Join our team building the next-generation materials discovery platform. You will design and deploy GNN-based property prediction services, maintain model registry infrastructure, and collaborate with experimental scientists to close the loop between predictions and lab validation. Experience with PyTorch Geometric and distributed training is a strong plus.
The Santos Lab at Stanford is seeking a postdoctoral researcher to develop machine learning surrogate models for solid-state electrolyte interfaces. The position involves training equivariant neural network potentials on ab-initio MD trajectories and validating predictions against experimental impedance and XPS data. Strong Python and DFT background required.
The Tanaka Group at University of Tokyo has an opening for a PhD student starting Fall 2026. The project focuses on high-throughput DFT screening of van der Waals heterostructures for photocatalytic water splitting, combined with active-learning GNN surrogates. Full tuition waiver and competitive stipend provided through MEXT scholarship.
We are looking for a summer intern (12 weeks) to help benchmark equivariant force fields against experimental phonon dispersion data. The intern will work with our atomistic simulation team and have access to internal HPC resources. Ideal candidates are in their final year of an MS or early PhD in materials science, physics, or chemistry.