Materialist
847
Papers · Posted by u/materialsbot · 1 day ago
New MACE-MP-0 benchmark shows 10x speedup over DFT for bulk properties

The benchmark compares MACE-MP-0 against standard PBE calculations for lattice constants, bulk moduli, and phonon-derived descriptors across 220 inorganic compounds. The authors report roughly one order of magnitude improvement in wall-clock throughput while keeping median lattice-constant error below 0.5%. What stood out to me is that they explicitly separate in-distribution oxides from out-of-distribution intermetallics, and the performance gap is much larger in the latter. I am curious how people here interpret this in practice. For high-throughput screening, we usually care less about absolute energy and more about ranking and filtering. If MACE can preserve ordering in unstable candidates, this could replace early-stage DFT in many pipelines. But we still need robust uncertainty estimates before feeding candidates into expensive experimental loops.
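
One way to make "preserving ordering" concrete is a rank correlation between DFT and MLFF energies for the same candidates. The sketch below computes Spearman's rho in plain Python. The function names and the toy energies are illustrative assumptions, not values or code from the paper.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with >= 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Toy energies above hull in eV/atom (made up for illustration).
dft_energies = [0.00, 0.03, 0.08, 0.15, 0.30]
mlff_energies = [0.01, 0.02, 0.10, 0.12, 0.35]
print(spearman_rho(dft_energies, mlff_energies))
```

For screening, it is probably more informative to compute this separately for the in-distribution and out-of-distribution subsets than for the pooled set.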

#MLFF #benchmark #bulk-properties #inference-speed · arXiv:2602.04219
756
Papers · Posted by u/gnn_enthusiast · 4 days ago
DeepMind's new crystal structure prediction model outperforms CGCNN

The new model combines message passing with symmetry-aware tokenization and reports stronger performance on formation energy and elastic tensor prediction benchmarks than CGCNN and MEGNet baselines. Their ablation suggests most gains come from enforcing space-group constraints during training rather than raw parameter count. I would still like to see broader transfer tests. Many benchmark sets overlap heavily with public repositories used in pretraining, so true out-of-domain generalization is hard to judge. If anyone has tried this model on low-symmetry organic-inorganic hybrids, please share failure cases.

#crystal-structure #GNN #benchmark · arXiv:2601.08812
623
Papers · Posted by u/maria_santos · 2 days ago
GNoME dataset: 380k stable structures - how reliable is it?

The release is exciting, but I want to temper expectations. A large fraction of candidate structures are labeled as potentially synthesizable based on formation-energy filters and model predictions, not full thermodynamic phase-space analysis. For discovery workflows this is great, yet downstream teams should not treat every listed structure as synthesis-ready. For people who already integrated GNoME into active-learning loops, what validation protocol are you using? We currently cross-check with Materials Project hull distances and then run a smaller DFT relaxation set before any Bayesian optimization step. Curious whether others see systematic biases in nitrides or chalcogenides.

#GNoME #dataset #stability · arXiv:2311.12345
534
Forum · Career · Posted by Anonymous · 3 days ago
PhD vs Industry: My experience switching after 3 years in academia

After three years as a postdoc, I moved to an industrial battery startup. The biggest change was timeline pressure: in academia we optimize for novelty, while in industry we optimize for decisions under uncertainty. I still run DFT and surrogate models, but now success means reducing experimental iteration cycles, not adding one more figure to a paper. Compensation and work-life balance improved for me, but I miss mentoring students and longer exploratory projects. If you are considering the switch, ask teams how they validate models experimentally and who owns failed predictions. The answer reveals whether data science is strategic or just a service function.

#career #industry #academia
512
Showcase · Posted by u/sarah_chen · 4 days ago
CrystalGPT — fine-tuned LLM for crystal structure Q&A

We fine-tuned Llama-3 on ~120k crystal structure descriptions, ICSD entries, and Materials Project documentation. The model can answer questions about space groups, Wyckoff positions, and common synthesis routes. It is far from replacing domain expertise but useful as a quick lookup tool and teaching aid. Weights and eval notebook available on HuggingFace.

#LLM #crystal-structure #fine-tuning
Python · Transformers · LoRA · vLLM
445
Forum · Discussion · Posted by u/maria_santos · 5 days ago
The materials informatics reproducibility crisis - let's talk about it

We keep publishing impressive MAE numbers, but independent groups still struggle to reproduce even basic baselines. Missing random seeds, undocumented filtering, and leakage across compositional families are everywhere. This does not just hurt credibility; it wastes months of graduate student time and inflates confidence in fragile models. I think conferences and journals should require executable artifacts, data version hashes, and explicit split definitions for materials datasets. If software engineering standards are optional, materials informatics will keep reinventing avoidable mistakes. What lightweight standards could the community realistically adopt this year?
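
As a starting point for discussion, here is a minimal sketch of what a "data version hash plus explicit split definition" could look like using only the Python standard library. The function names, manifest fields, and sample records are my own assumptions, not an existing community standard.

```python
import hashlib
import json
import random


def dataset_hash(records):
    """SHA-256 of a canonical JSON dump of the records.

    Dict keys are sorted, but record order still matters, so sort the
    records first if their order carries no meaning.
    """
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_split_manifest(ids, data_sha256, seed=0, test_fraction=0.2):
    """Build a manifest that lists every train/test ID explicitly."""
    shuffled = sorted(ids)  # sort first so input order cannot change the split
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return {
        "data_sha256": data_sha256,
        "seed": seed,
        "test_fraction": test_fraction,
        "test": sorted(shuffled[:n_test]),
        "train": sorted(shuffled[n_test:]),
    }


# Hypothetical records, made up for illustration.
records = [
    {"id": "entry-1", "formation_energy": -1.20},
    {"id": "entry-2", "formation_energy": -0.85},
    {"id": "entry-3", "formation_energy": -2.10},
    {"id": "entry-4", "formation_energy": -0.40},
    {"id": "entry-5", "formation_energy": -1.75},
]
ids = [r["id"] for r in records]
manifest = make_split_manifest(ids, dataset_hash(records), seed=42)
print(json.dumps(manifest, indent=2))
```

Listing the IDs in the manifest, rather than publishing only the seed, means the split survives changes in the shuffling code or library version.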

#reproducibility #benchmarking #open-science
412
Showcase · Posted by u/sarah_chen · 3 days ago
Our group just released an open-source MLFF training pipeline

We just open-sourced our workflow for training equivariant force fields from VASP trajectories. The pipeline handles dataset ingestion, neighbor-list caching, distributed training, and active-learning uncertainty triggers. We spent most of our time on data cleaning because mislabeled stress tensors were silently hurting training stability. The repository includes ready-to-run templates for silicon, Li-ion electrolyte clusters, and oxide surfaces. Feedback is very welcome, especially on experiment tracking and model-card sections. If there is interest, I can post a companion notebook showing integration with ASE geometry optimization.

#open-source #MLFF #training-pipeline
Python · PyTorch · ASE · VASP
389
Showcase · Posted by u/gnn_enthusiast · 1 day ago
MatBench-Discovery: an open leaderboard for ML crystal stability

We launched a continuously updated leaderboard that ranks ML models on their ability to predict thermodynamic stability across a held-out test set derived from WBM. The site includes interactive Pareto plots, per-model error breakdowns, and downloadable prediction files. Source code is fully open and contributions to add new models are welcome via pull request.

#leaderboard #benchmark #crystal-stability
Python · Pymatviz · Next.js · DuckDB
367
Papers · Posted by u/sarah_chen · 6 days ago
Automated high-throughput screening of 2D materials using GNNs

This workflow combines a pretrained graph model with uncertainty-aware acquisition to rank 2D candidates for hydrogen evolution and thermal stability. They evaluate around 70,000 hypothetical structures and only run expensive DFT for the top uncertainty-calibrated subset. The hit rate seems significantly better than random or heuristic filtering. One concern is that the candidate generator may bias towards known motifs, limiting novelty. Still, the closed-loop process is a nice template for groups that cannot afford brute-force DFT on full candidate spaces.
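
For readers unfamiliar with uncertainty-aware acquisition, the selection step can be sketched roughly as below. This summary does not state which acquisition function the paper uses, so the lower-confidence-bound score, the `kappa` parameter, and the toy candidates are stand-in assumptions for illustration only.

```python
def select_for_dft(candidates, k, kappa=1.0):
    """Pick k candidates to send to DFT.

    candidates: iterable of (name, predicted_mean, predicted_std), where a
    lower predicted_mean is better (e.g. an energy-like target).
    The score mean - kappa * std favors candidates that are either
    predicted to be good or uncertain enough that they might be.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(candidates, key=lambda c: c[1] - kappa * c[2])
    return [name for name, _, _ in ranked[:k]]


# Toy surrogate predictions (made up): name, mean, standard deviation.
toy_candidates = [
    ("A", 0.10, 0.01),
    ("B", 0.20, 0.15),
    ("C", 0.05, 0.02),
    ("D", 0.30, 0.05),
]

print(select_for_dft(toy_candidates, k=2, kappa=1.0))  # exploration-weighted
print(select_for_dft(toy_candidates, k=2, kappa=0.0))  # pure exploitation
```

With `kappa=0` the selection ignores uncertainty entirely. Raising `kappa` pulls in uncertain candidates such as "B", which is where miscalibrated uncertainties would hurt most.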

#high-throughput #2D-materials #active-learning · 10.1021/acscentsci.6c00101
301
Forum · Discussion · Posted by u/gnn_enthusiast · 6 days ago
Best practices for training equivariant neural network potentials

After a few failed projects, our internal checklist now starts with data balance across local environments rather than model architecture. Equivariant models overfit fast when rare coordination motifs are underrepresented, and force-label noise from unconverged SCF steps can dominate the loss. We also found that monitoring physically meaningful validation metrics, like energy ranking consistency for polymorphs, is more useful than aggregate MAE alone. Curious what diagnostics others track during training to catch failure modes early.
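
To show what I mean by energy ranking consistency for polymorphs, here is a minimal sketch: for every pair of polymorphs sharing a composition, check whether the model orders them the same way as the reference. The function name, the `tol` parameter, and the toy energies are illustrative assumptions, not our internal tooling.

```python
from collections import defaultdict
from itertools import combinations


def polymorph_ranking_consistency(entries, tol=0.0):
    """Fraction of same-composition pairs ordered correctly by the model.

    entries: iterable of (composition, reference_energy, predicted_energy).
    Pairs whose reference energies differ by <= tol are skipped, because
    the reference itself does not define an ordering for them.
    """
    groups = defaultdict(list)
    for composition, e_ref, e_pred in entries:
        groups[composition].append((e_ref, e_pred))

    agree = total = 0
    for members in groups.values():
        for (r1, p1), (r2, p2) in combinations(members, 2):
            if abs(r1 - r2) <= tol:
                continue
            total += 1
            # Same sign of the energy difference means same ordering.
            if (r1 - r2) * (p1 - p2) > 0:
                agree += 1
    if total == 0:
        raise ValueError("no comparable polymorph pairs found")
    return agree / total


# Toy relative energies in eV/atom (made up for illustration).
toy_entries = [
    ("TiO2", 0.00, 0.010),
    ("TiO2", 0.03, 0.020),
    ("TiO2", 0.05, 0.015),
    ("SiO2", 0.00, 0.000),
    ("SiO2", 0.02, 0.040),
]
print(polymorph_ranking_consistency(toy_entries))
```

In the toy data the model swaps two TiO2 polymorphs even though every individual error is small, which is exactly the kind of failure an aggregate MAE hides.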

#equivariant-nn #MLFF #training
289
Papers · Posted by u/kenji_tanaka · 4 days ago
DFT+U parameters for transition metal oxides - comprehensive comparison

This paper compiles more than 400 calculations across Mn, Fe, Co, and Ni oxides and evaluates how U choices affect oxidation energetics and magnetic ordering. Their biggest contribution is a consistent protocol for fitting U against both formation enthalpy and band-gap constraints, rather than matching only one observable. I appreciate that the supplementary information includes full INCAR sets and pseudopotential choices. Reproducing literature values has been frustrating because many papers omit these details. Has anyone here tried applying their fitted U values to mixed-anion systems like oxyfluorides?

#DFT+U #transition-metals #oxide-chemistry · 10.1038/s41524-026-01001-5
278
Forum · Discussion · Posted by u/kenji_tanaka · 7 days ago
Should we worry about data leakage in materials property prediction?

Short answer: yes. Composition overlap across train and test sets can create a false sense of model robustness, especially when crystal prototypes are near-duplicates. We recently reran a published benchmark with composition-family splits and observed performance drops of 30-50% depending on target property. Leakage is not always malicious; many datasets were never designed for ML benchmarking. But if we do not define clear split protocols, we cannot compare papers meaningfully. I would love to see a community-maintained suite of leakage-resistant evaluation splits.
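
For anyone who wants to try this on their own data, a leakage-resistant split can be sketched as follows. Here I take "composition family" to mean the chemical system (the set of elements), which is one possible definition and an assumption on my part. The element regex is deliberately naive, and the function names are illustrative.

```python
import hashlib
import re


def chemical_system(formula):
    """Return the sorted element set of a formula, e.g. 'Fe-Li-O-P'.

    Naive parsing: any capital letter followed by an optional lowercase
    letter counts as an element symbol. Symbols are not validated.
    """
    elements = set(re.findall(r"[A-Z][a-z]?", formula))
    if not elements:
        raise ValueError(f"no element symbols found in {formula!r}")
    return "-".join(sorted(elements))


def family_split(formulas, test_percent=20):
    """Assign whole chemical systems to train or test.

    A stable hash of the chemical system decides the side, so every
    compound from one system lands in the same split on every run.
    """
    train, test = [], []
    for formula in formulas:
        key = chemical_system(formula)
        bucket = int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16) % 100
        (test if bucket < test_percent else train).append(formula)
    return train, test


formulas = ["LiFePO4", "Li3Fe2(PO4)3", "Fe2O3", "Fe3O4", "NaCl", "TiO2"]
train, test = family_split(formulas)
print("train:", train)
print("test:", test)
```

Hashing by group means the realized test fraction only approximates `test_percent`, especially with few chemical systems. A stricter protocol would also need to handle near-duplicate crystal prototypes, which this sketch ignores.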

#data-leakage #evaluation #materials-ml
274
Showcase · Posted by u/quantum_cataly · 3 days ago
OpenCatalyst Dataset Explorer — interactive 3D structure browser

Built a web viewer for the OC20/OC22 datasets so you can browse adsorption configurations without downloading the full 800 GB archive. You can filter by adsorbate, surface element, and Miller index, then inspect geometries in an interactive 3D viewer. Backend uses a pre-indexed DuckDB file for sub-second queries.

#catalysis #dataset-explorer #visualization
TypeScript · Three.js · DuckDB · Cloudflare Workers
198
Showcase · Posted by u/quantum_cataly · 5 days ago
Tutorial: Setting up ASE with GPAW for band structure calculations

I wrote a practical walkthrough for students who want to run GPAW from Python without juggling too many environment variables. The tutorial covers reproducible conda setup, PAW dataset checks, k-point path generation, and plotting with matplotlib. The main pain point was making sure MPI launch commands behave consistently across local and cluster environments. I included a minimal silicon example and a section on common numerical pitfalls, especially basis-set convergence. Happy to extend this with spin-polarized examples if that helps newcomers.

#ASE #GPAW #tutorial
Python · ASE · GPAW · matplotlib
156
Forum · Question · Posted by Anonymous · 2 days ago
Is anyone else struggling with VASP convergence on perovskite surfaces?

I am running slab relaxations for mixed-halide perovskites and keep hitting charge sloshing when vacuum exceeds 18 Å. I tried reducing mixing amplitude and switching between ALGO = Normal and ALGO = Fast, but ionic steps still oscillate. K-point density is moderate (4x4x1), ENCUT is 1.3x ENMAX, and I include dipole correction. Has anyone found a stable recipe for these systems, especially when spin-orbit coupling is enabled? I can converge bulk cells reliably, but slab geometries with defects are painful. Any tips on pre-relaxation strategies or good default mixing parameters would save me days of queue time.

#VASP #perovskites #surface-science
156
Jobs · Posted by u/materialsbot · 2 days ago
Senior ML Engineer — Materials Informatics Platform

Join our team building the next-generation materials discovery platform. You will design and deploy GNN-based property prediction services, maintain model registry infrastructure, and collaborate with experimental scientists to close the loop between predictions and lab validation. Experience with PyTorch Geometric and distributed training is a strong plus.

#industry #ML-engineer #GNN
Citrine Informatics · Remote (US) · Full-time · $160,000 – $210,000
89
Jobs · Posted by u/maria_santos · 1 day ago
Postdoctoral Researcher — Machine Learning for Battery Materials

The Santos Lab at Stanford is seeking a postdoctoral researcher to develop machine learning surrogate models for solid-state electrolyte interfaces. The position involves training equivariant neural network potentials on ab initio MD trajectories and validating predictions against experimental impedance and XPS data. A strong Python and DFT background is required.

#postdoc #battery #MLFF
Stanford University — Santos Lab · Stanford, CA · Postdoc · $70,000 – $85,000
67
Jobs · Posted by u/kenji_tanaka · 3 days ago
PhD Position — Computational Design of 2D Heterostructures

The Tanaka Group at the University of Tokyo has an opening for a PhD student starting Fall 2026. The project focuses on high-throughput DFT screening of van der Waals heterostructures for photocatalytic water splitting, combined with active-learning GNN surrogates. A full tuition waiver and competitive stipend are provided through the MEXT scholarship.

#PhD #2D-materials #DFT
University of Tokyo — Tanaka Group · Tokyo, Japan · PhD Position · MEXT Scholarship (~¥145,000/month)
43
Jobs · Posted by u/materialsbot · 5 days ago
Summer Intern — Atomistic Simulation Team

We are looking for a summer intern (12 weeks) to help benchmark equivariant force fields against experimental phonon dispersion data. The intern will work with our atomistic simulation team and have access to internal HPC resources. Ideal candidates are in their final year of an MS or early PhD in materials science, physics, or chemistry.

#internship #simulation #phonons
Microsoft Research · Redmond, WA (Hybrid) · Internship · $8,500/month