Abstract

Machine learning potentials (MLPs) have become indispensable for conducting accurate large-scale atomistic simulations and for the efficient prediction of crystal structures. Polynomial MLPs, defined by polynomial rotational invariants, have been systematically developed for a wide range of elemental, alloy, and ionic systems. This study introduces a highly efficient and robust methodology for enumerating globally stable and metastable structures utilizing polynomial MLPs. To develop MLPs that are sufficiently robust for structure enumeration, an iterative process involving random structure searches and subsequent updates to the MLPs is employed. This methodology has been systematically applied to elemental systems such as As, Bi, Ga, In, La, P, Sb, Sn, and Te, where numerous local minima present significant competition with the global minimum in terms of energy. The proposed approach facilitates robust global structure searches and structure enumerations by markedly accelerating the search process and expanding the search space.

1 Introduction↩︎

Machine learning potentials (MLPs) are becoming increasingly popular for performing accurate calculations that are prohibitively expensive if conducted solely using the density functional theory (DFT) calculation, such as large-scale atomistic simulations [1]–[25]. MLPs represent the short-range part of interatomic interactions with systematic structural features using machine learning models, including artificial neural network models, Gaussian process models, and linear models. These MLP models are typically estimated using extensive datasets generated by DFT calculations. MLPs are designed to be more accurate than conventional interatomic potentials while also being computationally much more efficient than DFT calculations.

Global crystal structure optimizations employing MLPs have become increasingly common (e.g., [26]–[31]). In such optimizations, heuristic methods such as multi-start approaches and evolutionary algorithms are typically utilized [32]–[35]. To reliably identify global minimum structures using heuristic methods, it is essential to enumerate local minima until no new minima are discovered. This necessity is highlighted by the fact that stopping criteria in multi-start methods often rely on the number of trials and the count of local minima identified [36]. Furthermore, enumerating local minima with energy values close to the global minimum is crucial for predicting stable structures at finite temperatures and for constructing temperature-dependent phase diagrams. MLPs can significantly improve the efficiency of the energy calculations required for such local minimum enumeration. To robustly enumerate local minima, these MLPs must provide both high predictive accuracy across a broad range of structures and computational efficiency.

This study presents an iterative procedure that uses MLPs and random structure searches to enumerate both global and local minimum structures. The performance of the current procedure for enumerating global and local minimum structures is systematically evaluated in the elemental systems of As, Bi, Ga, In, La, P, Sb, Sn, and Te, which possess complex potential energy surfaces with numerous local minimum structures that have energy values comparable to the global minimum energy. The systematic application of this procedure to these complex systems highlights the reliability of the current procedure.

This study employs polynomial MLPs, which have been demonstrated to accurately predict properties across a diverse range of structures in various elemental and alloy systems [37]–[39]. The polynomial MLPs used in this study are derived from datasets encompassing various prototype structures, their derivatives, and local minima iteratively discovered through random structure searches, as described below. These polynomial MLPs are expressed as polynomials of rotational invariants, systematically derived from order parameters involving radial and spherical harmonic functions. Although the descriptive capacity for potential energy is limited by the use of simple polynomial functions rather than more complex models such as artificial neural networks or Gaussian process models, efficient model estimation is still achievable through linear regression methods, facilitated by powerful linear algebra libraries. Additionally, force and stress tensor components from DFT training datasets can be incorporated straightforwardly. Even when these force and stress tensor components are included as training data, model coefficients can be efficiently estimated using rapid linear regression techniques.

This paper is organized as follows: Section 2 demonstrates the formulation of the polynomial MLP and the current procedure for developing the MLP. Section 3 introduces the current iterative procedure for performing structure enumeration, which involves the revision of MLPs. In Sec. 3.3, a structural similarity used for eliminating duplicate structures in the structure search, which is derived from the formulation of the polynomial MLP, is shown. Section 4.1 provides the prediction errors associated with the polynomial MLPs developed using the current iterative procedure. Section 4.2 lists the global and local minimum structures found in the elemental systems and analyzes their preference. Finally, Section 5 provides a summary of this study.

2 Polynomial Machine Learning Potentials↩︎

2.1 Formulation↩︎

This section provides the formulation of the polynomial MLP in elemental systems, which can be simplified from the formulation in multi-component systems [38], [39]. The short-range part of the potential energy for a structure, \(E\), is assumed to be decomposed as \(E = \sum_i E^{(i)}\), where \(E^{(i)}\) denotes the contribution of interactions between atom \(i\) and its neighboring atoms within a given cutoff radius \(r_c\), referred to as the atomic energy. The atomic energy is then approximately given by a function of invariants \(\{d_{m}^{(i)}\}\) with any rotations centered at the position of atom \(i\) as \[\label{mlp-go:Eqn-atomic-energy-features} E^{(i)} = F \left( d_1^{(i)}, d_2^{(i)}, \cdots \right),\tag{1}\] where \(d_{m}^{(i)}\) can be referred to as a structural feature for modeling the potential energy.

The polynomial MLP adopts polynomial invariants of the order parameters representing the neighboring atomic density as structural features and employs polynomial functions as function \(F\). In other words, the atomic energy is modeled as a polynomial function of polynomial invariants in the polynomial MLP. Thus, the polynomial MLPs can be regarded as a generalization of embedded atom method (EAM) potentials, modified EAM potentials [14], a spectral neighbor analysis potential (SNAP) [15], and a quadratic SNAP [16]. In addition, linear polynomial models using polynomial invariants are analogous to the formulation of the atomic cluster expansion [25].

When the neighboring atomic density is expanded in radial functions \(\{f_n\}\) and spherical harmonics \(\{Y_{lm}\}\), a \(p\)th-order polynomial invariant for radial index \(n\) and set of angular numbers \(\{l_1,l_2,\cdots,l_p\}\) is given by a linear combination of products of \(p\) order parameters, expressed as

\[\label{mlp-go:Eqn-invariant-form} d_{nl_1l_2\cdots l_p,(\sigma)}^{(i)} = \sum_{m_1,m_2,\cdots, m_p} c^{l_1l_2\cdots l_p,(\sigma)}_{m_1m_2\cdots m_p} a_{nl_1m_1}^{(i)} a_{nl_2m_2}^{(i)} \cdots a_{nl_pm_p}^{(i)},\tag{2}\]

where order parameter \(a^{(i)}_{nlm}\) is component \(nlm\) of the neighboring atomic density of atom \(i\). Coefficient set \(\{c^{l_1l_2\cdots l_p,(\sigma)}_{m_1m_2\cdots m_p}\}\) ensures that the linear combinations are invariant for arbitrary rotations, which can be enumerated using group theoretical approaches such as the projection operator method [37], [40]. In terms of fourth- and higher-order polynomial invariants, multiple linear combinations are linearly independent for most of the set \(\{l_1,l_2,\cdots,l_p\}\). They are distinguished by index \(\sigma\) if necessary.

Here, the radial functions are Gaussian-type ones expressed by \[f_{n}(r)=\exp\left[-\beta_n(r-r_n)^{2}\right] f_c(r),\] where \(\beta_n\) and \(r_n\) denote given parameters. Cutoff function \(f_c\) ensures the smooth decay of the radial function. The current MLP employs a cosine-based cutoff function expressed as \[\begin{align} f_c(r) = \left\{ \begin{aligned} & \frac{1}{2} \left[ \cos \left( \pi \frac{r}{r_c} \right) + 1\right] & (r \le r_c)\\ & 0 & (r > r_c) \end{aligned} \right .. \end{align}\] The order parameter of atom \(i\), \(a_{nlm}^{(i)}\), is approximately evaluated from the neighboring atomic distribution of atom \(i\) as \[a_{nlm}^{(i)} = \sum_{\{j | r_{ij} \leq r_c\} } f_n(r_{ij}) Y_{lm}^* (\theta_{ij}, \phi_{ij}),\] where \((r_{ij}, \theta_{ij}, \phi_{ij})\) denotes the spherical coordinates of neighboring atom \(j\) centered at the position of atom \(i\). Note that this approximation for the order parameters ignores the non-orthonormality of the Gaussian-type radial functions, but it is acceptable in developing the polynomial MLP [37].

Given a set of structural features \(D^{(i)} = \{d_1^{(i)},d_2^{(i)},\cdots\}\), polynomial function \(F_\xi\) composed of all combinations of \(\xi\) structural features is represented as \[\begin{align} \label{mlp-go:Eqn-polynomials} F_1 \left(D^{(i)}\right) &=& \sum_{s} w_{s} d_{s}^{(i)}, \nonumber \\ F_2 \left(D^{(i)}\right) &=& \sum_{\{st\}} w_{st} d_{s}^{(i)} d_{t}^{(i)}, \\ F_3 \left(D^{(i)}\right) &=& \sum_{\{stu\}} w_{stu} d_{s}^{(i)} d_{t}^{(i)} d_{u}^{(i)} \nonumber, \end{align}\tag{3}\] where \(w\) denotes a regression coefficient. A polynomial of the polynomial invariants \(D^{(i)}\) is then described as \[\label{Eqn-polynomial-model1} E^{(i)} = F_1 \left( D^{(i)} \right) + F_2 \left( D^{(i)} \right) + F_3 \left( D^{(i)} \right) + \cdots.\tag{4}\] The current model has no constant term, which means that the atomic energy is measured from the energies of isolated atoms. In addition to the model given by Eqn. (4 ), simpler models composed of a linear polynomial of structural features and a polynomial of a subset of the structural features are also introduced, such as \[\begin{align} \label{Eqn-polynomial-model2} E^{(i)} &=& F_1 \left( D^{(i)} \right) + F_2 \left( D_{\rm pair}^{(i)} \cup D_2^{(i)} \right), \end{align}\tag{5}\] where subsets of \(D^{(i)}\) are denoted by \[\begin{align} D_{\rm pair}^{(i)} = \{d_{n0}^{(i)}\}, D_2^{(i)} = \{d_{nll}^{(i)}\}. \end{align}\]

2.2 Datasets↩︎

fig: — Figure 1: (a) Distribution of the coordination numbers around atoms in structures consisting of the training and test datasets. (b) Distribution of the volumes for structures in the datasets. The distribution is displayed in the logarithmic scale for visibility.

Datasets for each elemental system were generated as follows. First, the atomic positions and lattice constants of 86 prototype structures [37] using the DFT calculation were fully optimized. They comprise single elements with zero oxidation state from the Inorganic Crystal Structure Database (ICSD) [41], including metallic closed-packed structures, covalent structures, layered structures, and structures reported as high-pressure phases. Then, 13000–15000 structures were generated by randomly introducing lattice expansions, lattice distortions, and atomic displacements into supercells of optimized prototype structures. The entire set of structures was randomly divided into training and test datasets at a ratio of nine to one. These datasets were used to develop initial MLPs for the iterative procedure aimed at creating polynomial MLPs robust for structure enumeration, as described in Sec 3.2. During the iterative process, structures predicted to be local minima are incorporated into these datasets.

Figure 1 (a) displays the distribution of coordination numbers around atoms in structures that are present in both the training and test datasets. The datasets for the elements Ga, In, La, and Sn include structures with coordination numbers ranging from 2 to 12. In contrast, the datasets for the elements As, Bi, P, Sn, and Te are biased and predominantly composed of structures with coordination numbers between 2 and 8. The distribution of coordination numbers is system-dependent, reflecting the diversity of prototype structures optimized using DFT calculations. Even though local geometry optimizations are performed starting from the same prototype structure in different systems, the converged structures may differ due to the varying preferences of the local environment.

Figure 1 (b) shows the volume distribution of the datasets, displayed in a logarithmic scale for better visibility. Most of the structures are distributed around the equilibrium volumes of the prototype structures. On the other hand, the maximum value of the volume in the distribution is nearly ten times the equilibrium volumes.

DFT calculations were performed for structures in the datasets using the plane-wave-basis projector augmented wave (PAW) method [42] within the Perdew–Burke–Ernzerhof exchange-correlation functional [43] as implemented in the vasp code [44]–[46]. The cutoff energy was set to 300 eV. The allowed spacing between \(k\)-points was approximately set to 0.09 Å\(^{-1}\). The total energies converged to less than 10\(^{-3}\) meV/supercell. The atomic positions and lattice constants of the prototype structures were optimized until the residual forces were less than 10\(^{-2}\) eV/Å. The configurations of the valence electrons in the PAW potentials are \(4s^2 4p^3\) for As, \(6s^2 6p^3\) for Bi, \(4s^2 4p^1\) for Ga, \(5s^2 5p^1\) for In, \(5s^2 5p^6 5d^1 6s^2\) for La, \(3s^2 3p^3\) for P, \(5s^2 5p^3\) for Sb, \(5s^2 5p^2\) for Sn, and \(5s^2 5p^4\) for Te. These PAW potentials include scalar-relativistic corrections, and spin-orbit coupling was not considered in all the systems.

2.3 Regression↩︎

Weighted linear ridge regression was used to determine the regression coefficients for potential energy models. The energy values and force components in the training dataset were employed as observation entries in the regression. The forces acting on atoms are derived as linear models with the coefficients of the potential energy model, and the predictor matrix \(\boldsymbol{X}\) and observation vector \(\boldsymbol{y}\) can be written in a submatrix form as \[\boldsymbol{X} = \begin{bmatrix} \boldsymbol{X}_{\rm energy} \\ \boldsymbol{X}_{\rm force} \\ \end{bmatrix} ,\qquad \boldsymbol{y} = \begin{bmatrix} \boldsymbol{e} \\ \boldsymbol{f} \\ \end{bmatrix}.\] The matrix \(\boldsymbol{X}_{\rm energy}\) comprises structural features and their products, and the elements of \(\boldsymbol{X}_{\rm force}\) are related to the derivatives of structural features with respect to the Cartesian coordinates of atoms, which were derived in Ref. [37]. The observation vector \(\boldsymbol{y}\) includes components of \(\boldsymbol{e}\) and \(\boldsymbol{f}\), which contain the total energies and the forces acting on atoms in the training dataset, respectively. These values were obtained from DFT calculations.

Weighted linear ridge regression is a technique that can be used to shrink the regression coefficients by imposing a penalty. The approach involves minimizing the penalized residual sum of squares, which is given by \[L(\boldsymbol{w}) = ||\boldsymbol{W} (\boldsymbol{X}\boldsymbol{w} - \boldsymbol{y}) ||_2^2 + \lambda ||\boldsymbol{w}||_2^2,\] where \(\lambda\) and \(\boldsymbol{W}\) denote the magnitude of the penalty and the diagonal matrix where non-zero elements correspond to weights for data entries, respectively. The solution to the minimization problem is represented by \[\boldsymbol{\hat{w}} = (\boldsymbol{X}^\top \boldsymbol{W}^2 \boldsymbol{X} + \lambda \boldsymbol{I})^{-1} \boldsymbol{X}^\top \boldsymbol{W}^2 \boldsymbol{y},\] where \(\boldsymbol{I}\) denotes the unit matrix. The solution can be easily obtained using fast linear algebra algorithms while avoiding the well-known multicollinearity problem. In this study, the magnitude of the penalty was determined so that the estimated regression coefficients result in the lowest root mean square (RMS) error for an extensive test dataset.

The energy and forces acting on atoms have different units that resemble weights for data entries. Specifically, the units eV/cell and eV/Åwere used for defining energy and force, respectively. In addition, weights were applied to data entries depending on their values, which aims to develop robust MLPs for essential structures. This is achieved by decreasing the influence of less significant data entries that have large values. The weight for the energy entry of \(i\)-th structure \(W(e_{[i]})\) was given as \[W(e_{[i]}) = \begin{cases} 0.1 & (e_{[i]} \geq 0)\\ 1 & (e_{[i]} < 0) \end{cases},\] where \(e_{[i]}\) denotes the total energy of structure \(i\). Small weights were applied to energy entries of structures that are less stable than isolated atoms. The weight for the force entry on \(\alpha\)-component of atom \(j\) in structure \(i\), \(W(f_{[i],j\alpha})\), was given as \[W(f_{[i],j\alpha}) = \begin{dcases} \frac{\varepsilon}{\left|f_{[i],j\alpha}\right|} & (|f_{[i],j\alpha}| \geq \varepsilon)\\ 1 & (|f_{[i],j\alpha}| < \varepsilon) \end{dcases}.\] Here, \(\varepsilon\) was set to 1 eV/Å. Smaller weights were assigned to strong force components, whereas larger weights were given to weak force components observed in structures near local minima. This weighting strategy can naturally enhance the performance of MLPs in predicting structures close to local minima while reducing the risk of anomalous predictions for structures located far from local minima with high energy values.

2.4 Model selection↩︎

The accuracy and computational efficiency of MLPs are influenced by several input parameters. To determine optimal MLPs that achieve a balance between these factors, a systematic grid search is performed. The parameters considered in the grid search include the cutoff radius, the type of potential energy model, the number of radial functions, and the truncation of polynomial invariants, specifically the maximum angular momentum of spherical harmonics \(\{l_{\rm max}^{(2)}, l_{\rm max}^{(3)}, l_{\rm max}^{(4)}, \cdots \}\), and the polynomial order of the invariants. Since the accuracy and computational efficiency are conflicting properties in polynomial MLPs, Pareto-optimal MLPs are considered suitable candidates for enumerating crystal structures.

For each system, an efficient MLP is selected from the Pareto-optimal candidates. The chosen MLP is desired to meet the following several criteria to effectively perform crystal structure enumerations, although it is nearly impossible to develop MLPs that perfectly satisfy all these criteria. First, the MLP must accurately reconstruct the shape of the potential energy surface, including the locations of local minima, and should be capable of predicting a broad range of local minima structures, ideally encompassing all relevant minima. Second, the MLP should be able to identify unrealistic and hypothetical structures with high DFT energy values as energetically unstable. Numerous such structures are evaluated during crystal structure enumerations. Third, the potential energy surface of the MLP should contain no spurious local minima to ensure efficient structure predictions. Lastly, the MLP should perform energy and force calculations efficiently; thus, MLPs requiring approximately 1 ms/atom for a single point calculation are preferred.

3 Structure enumeration↩︎

3.1 General statement↩︎

In a structure optimization problem with \(N\) atoms at a given composition, a crystal structure can be represented by \(3N + 3\) independent variables \(\boldsymbol{x}\). They do not contain the translational and rotational degrees of freedom. The goal of global structure optimization is to minimize the structure-dependent potential energy \(E(\boldsymbol{x})\), formulated as \[E^* = \min_{\boldsymbol{x} \in \mathbb{D}} E(\boldsymbol{x}),\] within the nonempty feasible region \(\mathbb{D}\) for crystal structure representation \(\boldsymbol{x}\). The crystal structure that yields the minimum energy \(E^*\) is denoted as the global minimum structure \(\boldsymbol{x}^*\), hence \(E^* = E(\boldsymbol{x}^*)\). Similarly, a local minimum structure is denoted as \(\boldsymbol{x}_l^*\), and a set of local minimum structures with energy values lower than a given energy threshold \(\theta\) is denoted as \[\mathbb{D}^*_\theta = \left \{ \boldsymbol{x}_l^* \in \mathbb{D} \:|\: E(\boldsymbol{x}_l^*) \leq \theta \right \}.\]

Depending on the choice of the crystal structure representation, multiple representations can correspond to the same structure, making the entire feasible region redundant. In addition, there may be subregions within the feasible region where it is impossible to form crystal structures. Furthermore, the feasible region contains physically unrealistic structures. To search for crystal structures efficiently, these unrealistic structures are often eliminated from the feasible region in advance. These subregions can be defined by constraints applied to the entire feasible region, particularly linear constraints. Given a set of constraints \(\{g_1(\boldsymbol{x}) \leq 0, \cdots, g_m(\boldsymbol{x}) \leq 0\}\), the reduced feasible region \(\Delta\) is described as \[\label{mlp-go:Eq-reduced-region} \Delta = \bigcap_m \Delta_m,\tag{6}\] where \(\Delta_m\) denotes the feasible region reduced by constraint \(g_m(\boldsymbol{x}) \leq 0\), written as \[\Delta_m = \{\boldsymbol{x} \in \mathbb{D} \:| \: g_m(\boldsymbol{x}) \leq 0 \}.\] When the feasible region is reduced, the problems of finding the global minimum structure and enumerating local minimum structures are restated as \[E^* = \min_{\boldsymbol{x} \in \Delta} E(\boldsymbol{x})\] and \[\Delta^*_\theta = \left \{ \boldsymbol{x}_l^* \in \Delta \:|\: E(\boldsymbol{x}_l^*) \leq \theta \right \},\] respectively.

3.2 Computational procedures↩︎

This study employs a random structure search method [33], which corresponds to the multi-start approach for global optimization [47]. This method involves repeatedly performing local geometry optimizations starting from uniformly sampled structures within the feasible region \(\mathbb{D}\) or a reduced feasible region \(\Delta\). The random structure search is an effective method for enumerating both global and local minimum structures, owing to its uniform sampling strategy, straightforward implementation, and ease of parallelization.

In the current procedure, most of the \(ab\) \(initio\) calculations in the \(ab\) \(initio\) random structure search (AIRSS) method [33] are replaced with MLP calculations. However, there are concerns associated with using MLPs for crystal structure enumeration. Firstly, MLPs often exhibit significant prediction errors for structures that lie outside the training dataset and sometimes produce spurious local minimum structures. Since it is infeasible to prepare a training dataset that covers all possible local minimum structures in advance, an iterative approach involving updates to polynomial MLPs is employed. Secondly, even with iterative updates to polynomial MLPs that reduce prediction errors, some small but non-negligible errors may persist. This can complicate the assessment of stability between local minimum structures based on energy values predicted by MLPs. Consequently, the final assessment of stability between local minimum structures is carried out using DFT calculations. Local geometry optimizations for the local minimum structures identified by the MLP are systematically performed using DFT calculations.

The following is the current procedure for finding the global and local minimum structures: (1) A polynomial MLP is developed for the chosen model using DFT training and test datasets. (2) A large number of initial structures (\(\sim 10^5\)) are randomly and uniformly sampled from the reduced feasible region \(\Delta\), which is limited to structures with up to twelve atoms. (3) Local geometry optimizations are systematically performed on the initial structures using the polynomial MLP. (4) Duplicate local minimum structures are identified and removed using space-group identification and a similarity measure that is closely related to the polynomial MLP (as described in Sec. 3.3). A small tolerance value is used to eliminate duplicate structures, while ensuring that all borderline structures are kept as candidates for local minimum structures. (5) Single-point DFT calculations are performed for local minimum structures with low energy values, and their results are added to the training dataset. Steps (1)–(5) are repeated iteratively until a robust random structure search becomes possible. (6) Local geometry optimizations are then performed using the DFT calculation. These optimizations begin with a subset of local minimum structures with MLP energy values lower than a given threshold value \(\theta^{\rm MLP}\). (7) Duplicate local minimum structures are again identified and removed using a tolerance value that is slightly larger than the one used in step (4).

The feasible region is defined by the metric tensor of lattice basis vectors and the fractional coordinates of the atomic positions with respect to the lattice basis vectors. This feasible region is then reduced by applying the main conditions for defining the Niggli reduced cell [48]. To create initial structures, a set of three diagonal components of the metric tensor is uniformly sampled, given an upper bound for the diagonal elements. Following the main conditions for the Niggli cell determined by the diagonal components, the non-diagonal components of the metric tensor are then randomly sampled between the upper and lower bounds. Although the fractional coordinates of the atomic positions in the reduced feasible region have duplicate subregions, the efficiency of the random structure search remains unaffected.

Note that the AIRSS can be computationally expensive because it requires numerous local geometry optimizations using \(ab\) \(initio\) calculations. To reduce the computational cost, initial structures are usually sampled within a feasible region based on prior knowledge of crystal structure stability. On the other hand, the current approach does not rely on any prior knowledge about the initial crystal structure, except for the upper bound of the metric tensor of lattice basis vectors. This approach allows for robust enumeration of both global and local minimum structures without preconceived assumptions about the initial structures.

3.3 Duplicate structure elimination↩︎

Numerous local geometry optimizations, as outlined in steps (3) and (6) of the current procedure described in Sec. 3.2, can lead to the generation of many duplicate structures. This issue has also been noted in global structure searches conducted in many other studies. In this study, both space group identification and a distance measure based on structural features are employed to eliminate duplicate structures. The structural features, detailed in Eq. (2 ), are derived from the selected polynomial MLP. These features are then normalized using the regression coefficients of the chosen polynomial MLP.

When indices of the structural features are written by single index \(s = \{n,l_1,l_2,\cdots,l_p,(\sigma)\}\), the averages of the normalized structural features and their products over atoms in structure \(i\), \(\boldsymbol{\bar d}_{[i]}\), are described as \[\boldsymbol{\bar d}_{[i]} = \left[\cdots,\bar d_{[i],s},\cdots,\bar d_{[i],st},\cdots,\bar d_{[i],stu},\cdots\right],\] where \[\begin{align} \bar d_{[i],s} &=& \frac{1}{N_{[i]}} \sum_k w_s d_{[i],s}^{(k)}, \nonumber \\ \bar d_{[i],st} &=& \frac{1}{N_{[i]}} \sum_k w_{st} d_{[i],s}^{(k)} d_{[i],t}^{(k)}, \\ \bar d_{[i],stu} &=& \frac{1}{N_{[i]}} \sum_k w_{stu} d_{[i],s}^{(k)} d_{[i],t}^{(k)} d_{[i],u}^{(k)},\nonumber \end{align}\] and \(N_{[i]}\) denotes the number of atoms in structure \(i\). The distance between structures, \(i\) and \(j\), is then measured as the Euclidean distance of their corresponding \(\boldsymbol{\bar d}_{[i]}\) and \(\boldsymbol{\bar d}_{[j]}\), \(\|\boldsymbol{\bar d}_{[i]}-\boldsymbol{\bar d}_{[j]}\|_2\). If the distance between the structures is small, they are considered similar in terms of the local neighboring atomic distribution. It is important to note that the current set of structural features is a generalization of radial distribution functions and angular distribution functions. These functions represent local neighboring atomic distributions and have often been used to eliminate duplicate structures (e.g., [49]).

In the current structure searches using the MLPs, the number of local minimum structures frequently exceeds 50,000, which makes it impractical to compute distances for all pairs of structures. Therefore, an efficient procedure is introduced to eliminate duplicate structures. (1) Structures with different energy values are considered to be different. The energy value can act as a concise key to determine whether structures are different or not. This is based on the fact that the energy of structure \(i\) is expressed using its normalized structural features as \[E_{[i]} = \sum_s \bar d_{[i],s} + \sum_{st} \bar d_{[i],st} + \sum_{stu} \bar d_{[i],stu} + \cdots.\] If the absolute difference between the energy values of structures \(i\) and \(j\) is more than a given tolerance parameter \(\varepsilon_E\), \(|E_{[i]} - E_{[j]}| > \varepsilon_E\), they are regarded as different. Here, the tolerance parameter is set to \(\varepsilon_E = 0.01\) meV/atom. If \(|E_{[i]} - E_{[j]}| \leq \varepsilon_E\), we proceed to step (2). (2) Structures with different space groups are considered different. Because the space group identification depends on the tolerance parameter, multiple tolerance parameters are used. A single structure is identified with the set of space groups from the multiple tolerance parameters. If the sets of space groups for structures \(i\) and \(j\), \(\{\mathbb{G}\}_{[i]}\) and \(\{\mathbb{G}\}_{[j]}\), have no common elements, i.e., \(\{\mathbb{G}\}_{[i]} \cap \{\mathbb{G}\}_{[j]} = \varnothing\), the structures are regarded to be different. Otherwise, we proceed to step (3). (3) Two structures separated by a distance larger than a tolerance parameter \(\varepsilon_d\), i.e., \(\|\boldsymbol{\bar d}_{[i]}-\boldsymbol{\bar d}_{[j]}\|_2 > \varepsilon_d\), are considered different. Here, the tolerance parameter is set to \(\varepsilon_d = 0.001\). Otherwise, they are regarded to be identical.

Note that the distances between structures that are clearly identical are less than \(10^{-7}\). In contrast, the distances between structures that are visibly different but have similar energy values are greater than 0.1. Most structure pairs fall into either of these categories, although some pairs may lie near the borderline. For local minimum structures obtained through DFT calculations, tolerance parameters of \(\varepsilon_E = 1\) meV/atom and \(\varepsilon_d = 0.01\) are used. These values are larger than those applied to structures obtained using MLPs due to the larger convergence criteria employed in local geometry optimizations with DFT.

4 Results and discussion↩︎

4.1 Machine learning potentials↩︎

Table 1: Accuracy and computational efficiency of the polynomial MLPs used for enumerating the global and local minimum structures.The RMS errors are calculated for test datasets, excluding structures with exceptionally high positive energy values.The computational efficiency is evaluated by measuring the elapsed time to compute the energy, forces, and stress tensors of a structure consisting of many atoms.The elapsed time is normalized by the number of atoms because it is proportional to the number of atoms.The elapsed time for a single point calculation is estimated using a single core of IntelXeonE5-2695 v4 (2.10 GHz) and an implementation of the polynomial MLP to the lammps code [50].
	RMS error (energy)	RMS error (force)	Time
	(meV/atom)	(eV/Å)	(ms/atom/step)
As	5.7	0.117	1.4
Bi	4.0	0.065	1.0
Ga	2.1	0.030	0.6
In	1.7	0.019	0.5
La	3.1	0.052	1.0
P	6.2	0.142	1.3
Sb	4.5	0.075	1.4
Sn	2.6	0.037	1.0
Te	6.0	0.098	2.4

fig: — Figure 2: Prediction errors of the selected MLPs for 86 prototype structures.

fig: — Figure 3: Distribution of the energy values for the training and test datasets, predicted using the DFT calculation and the selected MLPs.

Table 1 lists the RMS errors of the updated MLPs at the final iteration. The RMS errors are assessed for the test datasets, excluding structures with exceptionally high positive energy values, which offers a more practical measure of the accuracy. Figure 2 shows the prediction errors of the selected MLPs for 86 prototype structures. As depicted in Fig. 2, the MLPs have a high predictive power for various structures. Figure 3 displays the distributions of energies of structures in both the training and test datasets, computed using the DFT calculation and the polynomial MLP. The distributions demonstrate that the polynomial MLP is accurate for many typical structures and their derivatives containing diverse neighborhood environments and coordination numbers, as evidenced by a narrow distribution of errors. Overall, these results indicate that the polynomial MLPs are reliable for predicting the energies of a wide range of structures.

Polynomial MLPs developed using the final datasets including local minimum structures can be accessed through the Polynomial machine learning potential repository [39], [51]. While only the predictive power for energy has been demonstrated in this study, the repository website includes predictions for other properties as well. Additionally, the predictive power for liquid states in the elemental Bi, Ga, In, and Sn can be found elsewhere [52].

4.2 Structure enumeration↩︎

Structure enumeration begins with polynomial MLPs developed according to the procedure outlined in Sec. 3.2. These MLPs are constructed from datasets that include various prototype structures, their derivatives, and structures predicted to be local minima by the MLPs. Subsequently, random structure searches are performed using these MLPs, involving systematic repetitions of local geometry optimizations. From these searches, unique local minimum structures with MLP energy values, \(E^{\rm MLP}\), lower than a given energy threshold \(\theta^{\rm MLP}\), \(E^{\rm MLP} \leq \theta^{\rm MLP}\), are extracted. To ensure reliable structure enumerations, the following threshold values are used: 25 meV/atom for Ga, 50 meV/atom for As, Bi, and In, 75 meV/atom for La, P, Sb, and Sn, and 100 meV/atom for Te, which are measured from the lowest energy.

When it is assumed that the Bayesian estimates on the number of local minima in multi-start approaches [36] are applicable to the current structure search, the estimates are calculated for \(E^{\rm MLP} \leq \theta^{\rm MLP}\) as follows: 465.9, 189.8, 1566.7, 520.1, 298.9, 282.7, 407.6, 403.3, 418.9 for As, Bi, Ga, In, La, P, Sb, Sn, and Te, respectively. These estimates are very close to the numbers of the local minimum structures found in the current random structure searches, i.e., 464, 189, 1529, 516, 297, 282, 406, 400, and 417, respectively. This indicates that most of the local minimum structures with \(E^{\rm MLP} \leq \theta^{\rm MLP}\) existing in the feasible region should be found in the current structure searches using the MLPs, including the globally stable structure.

fig: — Figure 4: Energy distribution for the global and local minimum structures. The MLP energy is calculated using the updated MLP at the final iteration. The energy scales in this figure are significantly smaller compared to those used in Figs. 2 and 3.

Once a set of local minimum structures with \(E^{\rm MLP} \leq \theta^{\rm MLP}\) is obtained using the MLP, local geometry optimizations are performed using the DFT calculation. This allows us to evaluate the stability of the structures more accurately. Figure 4 shows the energy distribution of the local minimum structures in the nine elemental systems. The energy values of the local minimum structures predicted using the DFT calculation and the polynomial MLP show strong correlations as a result of the iterative update of the MLPs. This result suggests that the updated MLPs can enumerate most of the local minimum structures with low energy values. However, there are some local minimum structures whose energies predicted by the polynomial MLPs deviate from their DFT energies. While the MLPs are less reliable in predicting the energy of such structures, they can still play an adequate role in finding local minimum structures with low energy values. If we aim to improve the predictive power for such structures, more complex models for the MLPs are necessary.

Table 2: Number of the global and local minimum structures for several energy threshold values \(\theta\).The energy values of the local minimum structures are computed using the DFT calculation.The energy threshold values are described in the unit of meV/atom.
	\(\theta=5\)	\(\theta=10\)	\(\theta=15\)	\(\theta=20\)	\(\theta=25\)
Ga	1	101	616	902	1142
	\(\theta=10\)	\(\theta=20\)	\(\theta=30\)	\(\theta=40\)	\(\theta=50\)
As	269	336	387	407	415
Bi	2	11	27	54	101
In	99	113	135	336	474
	\(\theta=15\)	\(\theta=30\)	\(\theta=45\)	\(\theta=60\)	\(\theta=75\)
La	73	150	205	220	253
P	112	191	255	274	276
Sb	2	24	101	220	302
Sn	1	4	13	81	310
	\(\theta=20\)	\(\theta=40\)	\(\theta=60\)	\(\theta=80\)	\(\theta=100\)
Te	7	41	81	253	339

Table 2 presents the number of unique local minimum structures with DFT energy values below various thresholds. It is observed that many local minimum structures have energy values very close to the global minimum energy in these systems. The set of global and local minimum structures identified for a small threshold is reliable; however, the set of global and local minimum structures for thresholds near \(\theta^{\rm MLP}\) may be incomplete. Due to prediction errors of the MLPs, some structures with MLP energy values exceeding \(\theta^{\rm MLP}\) may still exhibit DFT energy values below the thresholds.

Furthermore, the number of local minimum structures identified using DFT calculations is smaller than that obtained using MLPs, even though the majority of structures identified through MLPs converge to unique local minima upon DFT geometry optimization. This discrepancy occurs because some structures with MLP energy values below \(\theta^{\rm MLP}\) may have DFT energy values exceeding \(\theta^{\rm MLP}\). Additionally, the reduction in the number of structures is influenced by the tolerance parameters used to eliminate duplicate structures. Specifically, a larger tolerance value is applied to structures identified through DFT calculations compared to those obtained using MLPs, as detailed in Section 3.2.

4.2.1 As↩︎

fig: — Figure 5: (a) Distribution of the local minimum structures and equilibrium prototype structures in As. The prototype structures are used to derive structures in the training dataset. These structures are mapped into a two-dimensional plane using an MDS for visibility. The open and closed circles show the local minimum structures and prototype structures, respectively. (b) Average coordination numbers of the local minimum structures. The cutoff radius is given as 1.2 times the nearest neighbor distance.

In the elemental As, it is found that there are 415 local minimum structures that have relative energy values lower than \(\theta = 50\) meV/atom. Specifically, 269 of those structures have relative energy values less than \(\theta = 10\) meV/atom. Figure 5 (a) displays the distribution of local minimum structures and equilibrium prototype structures that are used to create structures in the training dataset. The structural features of the polynomial MLP for these structures are mapped onto a two-dimensional plane using multi-dimensional scaling (MDS) [53]. Therefore, this figure provides a visual representation of the similarity measure between the local minimum structures and training dataset. As shown in the figure, many local minimum structures are located far from the prototype structures. Figure 5 (b) shows the distribution of average coordination numbers for the local minimum structures. The cutoff radius used is 1.2 times the nearest neighbor distance. The global minimum structure and most of the local minimum structures are three-coordinated, which is consistent with the experimental structure.

Table 3: Global and local minimum structures in As.The relative energy \(\Delta E\) calculated using the DFT calculation is expressed in the unit of meV/atom.\(Z\) denotes the number of atoms included in the unitcell.Parentheses in the columns of ICSD identifiers and prototype structures indicate that the corresponding structure is close to the prototype in the parentheses.However, its space group is different from that of the prototype structure and/or large tolerance values are required for the prototype identification.
Space group	ICSD-ID	Prototype	\(Z\)	\(\Delta E\)
\(R\bar3m\)	9859	As	6	0.0
\(P\bar3m1\)	(1425)	(HCP)	2	1.9
\(C2/m\)	(1425)	(HCP)	8	2.0
\(P\bar3m1\)	(1425)	(HCP)	4	2.0
\(R\bar3m\)	\(-\)	\(-\)	24	2.1
\(R\bar3m\)	\(-\)	\(-\)	30	2.1
\(P3m1\)	\(-\)	\(-\)	6	2.1
\(P3m1\)	(31170)	(C(\(P6_3mc\)))	4	2.1
\(R\bar3m\)	\(-\)	\(-\)	18	2.2
\(R\bar3m\)	\(-\)	\(-\)	12	2.2
\(P6_3mc\)	31170	C(\(P6_3mc\))	4	2.2
\(P\bar3m1\)	\(-\)	\(-\)	8	2.2
\(R3m\)	\(-\)	\(-\)	12	2.2
\(R\bar3m\)	\(-\)	\(-\)	18	2.2
\(R\bar3m\)	\(-\)	\(-\)	18	2.2
\(R\bar3m\)	\(-\)	\(-\)	12	2.3
\(R\bar3m\)	\(-\)	\(-\)	18	2.3
\(R3m\)	\(-\)	\(-\)	24	2.3
\(R3m\)	\(-\)	\(-\)	24	2.4
\(R3m\)	\(-\)	\(-\)	30	2.4
\(R3m\)	\(-\)	\(-\)	24	2.4
\(R3m\)	\(-\)	\(-\)	24	2.5
\(R3m\)	\(-\)	\(-\)	24	2.5
\(R3m\)	\(-\)	\(-\)	30	2.5
\(R3m\)	\(-\)	\(-\)	36	2.5
\(P3m1\)	\(-\)	\(-\)	10	2.5
\(P1\)	\(-\)	\(-\)	10	2.5
\(R3m\)	\(-\)	\(-\)	30	2.5
\(R3m\)	\(-\)	\(-\)	18	2.5
\(C2/m\)	2284	O\(_2\)(\(mS4\))	4	3.2
\(R\bar3m\)	29213	Graphite(3R)	6	3.2
\(R\bar3m\)	29213	Graphite(3R)	6	3.8
\(P6_3mc\)	31170	C(\(P6_3mc\))	4	4.1
\(Cmce\)	23836	black-P	8	18.7

Table 3 provides a list of local minimum structures found in the elemental As. This list includes only the structures that have low energy values and can be identified with prototype structures and experimental structures in As. Prototype identifiers that are consistent with the ICSD are given. The global minimum structure is the three-coordinated As-type structure, which is consistent with the experimental structure at low temperatures [54], [55]. Apart from the global minimum structure, three-coordinated structures such as C(\(P6_3mc\))-, graphite(3R)-, and black-P-type are also found as local minimum structures. Among these structures, only the black-P-type structure is an experimental polymorph [56].

4.2.2 Bi↩︎

fig: — Figure 6: (a) Distribution of the local minimum structures and equilibrium prototype structures in Bi. These structures are mapped into a two-dimensional plane using an MDS. The open and closed circles show the local minimum structures and prototype structures, respectively. (b) Average coordination numbers of the local minimum structures. The cutoff radius is given as 1.2 times the nearest neighbor distance.

In the elemental Bi, 101 different local minimum structures with relative energy values lower than \(\theta =\) 50 meV/atom have been discovered. The distribution of these local minimum structures and equilibrium prototype structures in Bi is shown in Fig. 6 (a). The majority of the local minimum structures are located around the equilibrium prototype structures. Figure 6 (b) shows the distribution of average coordination numbers for the local minimum structures in Bi. The global minimum structure is six-coordinated, and the average coordination numbers of the local minimum structures in Bi range from three to eight.

Table 4: Global and local minimum structures in Bi.The DFT relative energy \(\Delta E\) is expressed in the unit of meV/atom.\(Z\) denotes the number of atoms included in the unitcell.
Space group	ICSD-ID	Prototype	\(Z\)	\(\Delta E\)
\(R\bar3m\)	9859	As	6	0.0
\(Imma\)	(9859)	(As)	4	1.8
\(R\bar3m\)	\(-\)	\(-\)	36	12.5
\(P\bar3m1\)	\(-\)	\(-\)	12	12.5
\(P3m1\)	\(-\)	\(-\)	10	13.8
\(P\bar3m1\)	\(-\)	\(-\)	10	14.6
\(R3m\)	\(-\)	\(-\)	24	16.7
\(P\bar3m1\)	\(-\)	\(-\)	8	16.7
\(P6_3mc\)	\(-\)	\(-\)	8	16.8
\(R\bar3m\)	\(-\)	\(-\)	24	17.6
\(C2/m\)	\(-\)	\(-\)	24	19.8
\(Cm\)	\(-\)	\(-\)	24	20.4
\(R3m\)	\(-\)	\(-\)	18	20.9
\(R\bar3m\)	\(-\)	\(-\)	18	21.4
\(R3m\)	\(-\)	\(-\)	36	21.8
\(R\bar3m\)	\(-\)	\(-\)	18	22.0
\(P\bar3m1\)	\(-\)	\(-\)	6	22.2
\(P2_1/m\)	\(-\)	\(-\)	10	23.5
\(R3m\)	\(-\)	\(-\)	30	24.2
\(R3m\)	\(-\)	\(-\)	30	25.0
\(P3m1\)	\(-\)	\(-\)	10	25.0
\(P6_3mc\)	31180	C(\(P6_3mc\))	4	28.9
\(P2_1/m\)	(43211)	(SC)	8	38.7
\(C2\)	(43211)	(SC)	18	39.6
\(P1\)	(43211)	(SC)	12	39.7
\(Pm\bar3m\)	43211	SC	1	40.0
\(I4_1/amd\)	40037	\(\beta\)-Sn	4	41.1
\(Pnma\)	165995	MnP	8	41.6
\(C2/m\)	409752	Bi(II)	20	42.9
\(Cmce\)	23836	black-P	8	49.4
\(I4_1/amd\)	109018	Cs(HP)	4	49.8

Table 4 lists the local minimum structures found in the elemental Bi. The global minimum structure is the As-type structure, which is consistent with the experimental structure observed at low temperatures [55], [57], [58]. It is worth noting that the global minimum structure can be considered as six-coordinated, which is different from the three-coordinate As-type structure observed in the elemental As. This inconsistency is due to the fact that the distribution of interatomic distances in the As-type structure for Bi is different from that in the As-type structure for As.

Moreover, some of the local minimum structures are assigned as prototype structures, such as C(\(P6_3mc\))-, simple-cubic (SC)-, \(\beta\)-Sn-, MnP-, Bi(II)-, black-P-, and Cs(HP)-type structures. Among these structures, the SC-type [59] and Bi(II)-type [60] structures were reported as high-pressure polymorphs in the literature. The body-centered-cubic (BCC)-type structure, which is known as a high-pressure structure [61], is also found in the structure list obtained from the structure search using the MLPs. However, it exhibits a relative energy value larger than \(\theta =\) 50 meV/atom. In the elemental Bi, Bi(\(I4/mcm\))-type [62], Bi(III)-type [62], Si(\(oS16\))-type [63], and Sb(\(mP4\))-type [64] structures were also reported as high-pressure phases, but they are not found in the current structure list.

4.2.3 Ga↩︎

fig: — Figure 7: (a) Distribution of the local minimum structures and equilibrium prototype structures in Ga. These structures are mapped into a two-dimensional plane using an MDS. The open and closed circles show the local minimum structures and prototype structures, respectively. (b) Average coordination numbers of the local minimum structures. The cutoff radius is given as 1.2 times the nearest neighbor distance.

There are 1142 local minimum structures in the elemental Ga with relative energy values less than \(\theta =\) 25 meV/atom. Among these structures, 616 structures are highly competitive in terms of energy, with relative energy values less than 15 meV/atom. Figure 7 (a) shows the distribution of the local minimum structures and equilibrium prototype structures in Ga. The local minimum structures are located inside the distribution of the prototype structures, while many of them are distant from the prototype structures. Figure 7 (b) shows the distribution of average coordination numbers for the local minimum structures in Ga. The global minimum structure has a coordination number of seven, while 12-coordinated local minimum structures similar to the FCC structure are also competing with the global minimum structure. The average coordination numbers of the local minimum structures in Ga range from six to twelve.

Table 5: Global and local minimum structures in Ga.The DFT relative energy \(\Delta E\) is expressed in the unit of meV/atom.\(Z\) denotes the number of atoms included in the unitcell.
Space group	ICSD-ID	Prototype	\(Z\)	\(\Delta E\)
\(Cmce\)	43388	Ga(I)	8	0.0
\(P\bar1\)	(12174)	(In)	10	5.2
\(C2/m\)	(12174)	(In)	16	5.4
\(Cmcm\)	\(-\)	\(-\)	8	5.5
\(P\bar1\)	(12174)	(In)	12	5.5
\(Imma\)	\(-\)	\(-\)	16	5.6
\(P\bar1\)	(12174)	(In)	12	5.8
\(P1\)	(12174)	(In)	12	5.8
\(C2/m\)	(54338)	(Pr)	24	5.9
\(C2/m\)	(54338)	(Pr)	18	6.1
\(P\bar1\)	(12174)	(In)	12	6.1
\(Cmcm\)	\(-\)	\(-\)	24	6.9
\(R\bar3m\)	(12174)	(In)	3	7.3
\(Cmmm\)	\(-\)	\(-\)	22	7.6
\(Cmcm\)	\(-\)	\(-\)	4	7.6
\(Cm\)	\(-\)	\(-\)	24	7.8
\(Immm\)	\(-\)	\(-\)	22	7.8
\(Fm\bar3m\)	20502	FCC	4	7.9
\(Cm\)	\(-\)	\(-\)	16	8.0
\(Cm\)	\(-\)	\(-\)	20	8.1
\(R\bar3m\)	\(-\)	\(-\)	30	8.1
\(Fddd\)	44866	Pu(\(oF8\))	8	8.1
\(C2/c\)	\(-\)	\(-\)	20	8.2
\(Immm\)	\(-\)	\(-\)	22	8.2
\(Cmmm\)	\(-\)	\(-\)	18	8.2
\(P4_22_12\)	\(-\)	\(-\)	8	8.3
\(R\bar3m\)	\(-\)	\(-\)	27	8.3
\(Cm\)	(52496)	(Tb)	12	8.3
\(Cmce\)	\(-\)	\(-\)	16	8.4
\(Cmcm\)	\(-\)	\(-\)	16	8.4
\(Cm\)	\(-\)	\(-\)	16	8.4
\(Cm\)	(43573)	(La)	16	8.7
\(P6_3/mmc\)	43573	La	4	9.5
\(Cmcm\)	43539	Ga(\(Cmcm\))	4	9.6
\(P6_3/mmc\)	52496	Tb	6	9.8
\(C2/m\)	2284	O\(_2\)(\(mS4\))	20	10.1
\(Pnma\)	165995	MnP	8	10.8
\(P6_3/mmc\)	164724	O\(_2\)	4	11.2
\(R3c\)	(12173)	(Ga(II))	18	11.7
\(P2_1/c\)	189806	Bi(\(P2_1/c\))	8	12.3
\(R\bar3m\)	52497	Sm	9	12.9
\(I\bar43d\)	109012	Li(\(cI16\))	16	15.0
\(P6_3/mmc\)	1425	HCP	2	16.1
\(P3_221\)	281124	S	9	22.0
\(P2_12_12\)	16817	H\(_2\)O(III,IX)	12	23.8

Table 5 presents a list of the local minimum structures in the elemental Ga. The global minimum structure is the Ga(I)-type structure with the space group of \(Cmce\), which is the experimental structure observed at low temperatures [55], [65]. One of the local minimum structures, exhibiting the fourth lowest energy value and the space group of \(Cmcm\), corresponds to a metastable structure determined through first-principle prediction [66].

In the elemental Ga, there are numerous competing close-packed and open structures. The close-packed structures include In-, face-centered-cubic (FCC)-, La-, and Sm-type structures, while the open structures with coordination numbers of seven and eight correspond to Ga(I)- and Ga(\(Cmcm\))-type structures. Intermediate structures between the open and close-packed structures, namely Pu(\(oF8\))-, Ga(II)-, and Li(\(cI16\))-types, are also found as local minimum structures.

Experimental studies reported Ga(\(Cmcm\))- [67], In- (Ga(III)-) [68], Ga(II)- [68], and Ga(\(R\bar3m\))(\(\delta\)Ga)-type [69] structures. The first three structures are present in the list of local minimum structures. Although the Ga(II)-type structure is not equivalent to but close to a local minimum structure with a different space group, \(R3c\). The Ga(\(R\bar3m\))(\(\delta\)Ga)-type structure requires at least 22 atoms to represent, which cannot be discovered in the current procedure.

4.2.4 In↩︎

fig: — Figure 8: (a) Distribution of the local minimum structures and equilibrium prototype structures in In. These structures are mapped into a two-dimensional plane using an MDS. The open and closed circles show the local minimum structures and prototype structures, respectively. (b) Average coordination numbers of the local minimum structures. The cutoff radius is given as 1.2 times the nearest neighbor distance.

A total of 474 local minimum structures with relative energy values less than \(\theta =\) 50 meV/atom have been discovered in the elemental In. Among them, 99 structures are found for \(\theta = 5\) meV/atom. Figure 8 (a) shows the distribution of the local minimum structures and equilibrium prototype structures in the elemental In. Some local minimum structures are close to the equilibrium prototype structures, while many others are far from them. Figure 8 (b) shows the distribution of average coordination numbers for the local minimum structures in the elemental In. The global minimum structure is 12-coordinated, and the local minimum structures also exhibit large coordination numbers ranging from 10 to 12.

Table 6: Global and local minimum structures in In.The DFT relative energy \(\Delta E\) is expressed in the unit of meV/atom.\(Z\) denotes the number of atoms included in the unitcell.
Space group	ICSD-ID	Prototype	\(Z\)	\(\Delta E\)
\(C2/m\)	(12174)	(In)	20	0.0
	(57392)	(In(\(Fmmm\)))
	(41824)	(Ce)
\(I4/mmm\)	12174	In	2	0.1
\(C2/m\)	54338	Pr	4	0.4
\(C2/m\)	(12174)	(In)	16	0.4
\(C2/m\)	41824	Ce	2	0.6
\(P\bar1\)	(41824)	(Ce)	1	1.0
\(Pmmn\)	(12174)	(In)	10	1.2
\(C2/m\)	\(-\)	\(-\)	24	1.6
\(C2/m\)	\(-\)	\(-\)	18	1.7
\(R\bar3m\)	\(-\)	\(-\)	24	1.8
\(C2/m\)	\(-\)	\(-\)	24	1.8
\(P6_3/mmc\)	\(-\)	\(-\)	8	1.8
\(Cmcm\)	\(-\)	\(-\)	20	1.9
\(R\bar3m\)	\(-\)	\(-\)	27	1.9
\(P\bar6m2\)	\(-\)	\(-\)	12	1.9
\(C2/m\)	\(-\)	\(-\)	20	1.9
\(R\bar3m\)	\(-\)	\(-\)	24	2.0
\(C2/m\)	\(-\)	\(-\)	20	2.0
\(P\bar3m1\)	\(-\)	\(-\)	9	2.1
\(C2/m\)	\(-\)	\(-\)	22	2.1
\(C2/m\)	\(-\)	\(-\)	20	2.2
\(R\bar3m\)	\(-\)	\(-\)	18	2.2
\(P6_3/mmc\)	43573	La	4	2.3
\(C2/m\)	\(-\)	\(-\)	18	2.3
\(C2/m\)	\(-\)	\(-\)	16	2.3
\(R\bar3m\)	\(-\)	\(-\)	30	2.3
\(P\bar3m1\)	\(-\)	\(-\)	7	2.4
\(R3m\)	\(-\)	\(-\)	36	2.4
\(R3m\)	\(-\)	\(-\)	33	2.4
\(P6_3/mmc\)	52496	Tb	6	2.6
\(R\bar3m\)	52497	Sm	9	5.3
\(P6_3/mmc\)	1425	HCP	2	5.7
\(I\bar43d\)	109012	Li(\(cI16\))	16	8.7
\(P1\)	(5248)	(BCC)	11	10.3
\(P3_221\)	281124	S	9	38.3

Table 6 provides a list of the local minimum structures present in the elemental In. The global minimum structure has the \(C2/m\) space group, is 12-coordinated, and is almost identical to the In-type [70] and In(\(Fmmm\))-type [71] structures. However, the high-pressure In(\(Fmmm\))-type structure, reported in the literature, is quite different from the global minimum structure in terms of the lattice constants. The local minimum structure with the second lowest energy can be identified with the In-type structure, which is the experimental structure at room temperature [55], [70]. Table 6 shows that there are many local minimum structures with low relative energy values, including La-, Tb-, Sm-, and hexagonal-close-packed (HCP)-type structures. In the literature, the In- and In(\(Fmmm\))-type structures are included in the ICSD. They are found in the list of local minimum structures.

4.2.5 La↩︎

fig: — Figure 9: (a) Distribution of the local minimum structures and equilibrium prototype structures in La. These structures are mapped into a two-dimensional plane using an MDS. The open and closed circles show the local minimum structures and prototype structures, respectively. (b) Average coordination numbers of the local minimum structures. The cutoff radius is given as 1.2 times the nearest neighbor distance.

In the case of the elemental La, 253 local minimum structures have been identified, all of which have relative energy values less than \(\theta =\) 75 meV/atom. Among these structures, 73 structures have been discovered for \(\theta = 15\) meV/atom. Figure 9 (a) shows the distribution of the local minimum structures and equilibrium prototype structures in La. Many local minimum structures are distributed around the equilibrium prototype structures. Figure 9 (b) depicts the distribution of average coordination numbers for the local minimum structures in La. The global minimum structure and many local minimum structures are 12-coordinated, while some local minimum structures have smaller coordination numbers ranging from six to eight.

Table 7: Global and local minimum structures in La.The DFT relative energy \(\Delta E\) is expressed in the unit of meV/atom.\(Z\) denotes the number of atoms included in the unitcell.
Space group	ICSD-ID	Prototype	\(Z\)	\(\Delta E\)
\(P6_3/mmc\)	43573	La	4	0.0
\(R\bar3m\)	\(-\)	\(-\)	27	0.3
\(R\bar3m\)	\(-\)	\(-\)	15	0.7
\(P\bar6m2\)	\(-\)	\(-\)	10	1.2
\(R3m\)	\(-\)	\(-\)	33	1.3
\(R\bar3m\)	\(-\)	\(-\)	33	1.4
\(R\bar3m\)	\(-\)	\(-\)	18	1.6
\(P\bar3m1\)	\(-\)	\(-\)	7	1.7
\(R\bar3m\)	\(-\)	\(-\)	24	1.9
\(R\bar3m\)	\(-\)	\(-\)	33	1.9
\(R3m\)	\(-\)	\(-\)	36	2.1
\(R\bar3m\)	\(-\)	\(-\)	21	2.3
\(R\bar3m\)	\(-\)	\(-\)	27	2.3
\(P\bar3m1\)	\(-\)	\(-\)	9	2.4
\(R\bar3m\)	\(-\)	\(-\)	21	2.8
\(P6_3/mmc\)	\(-\)	\(-\)	8	2.8
\(P6_3/mmc\)	52496	Tb	6	2.9
\(P6_3/mmc\)	\(-\)	\(-\)	10	2.9
\(P\bar3m1\)	\(-\)	\(-\)	8	3.0
\(P3m1\)	\(-\)	\(-\)	9	3.4
\(R3m\)	\(-\)	\(-\)	33	3.4
\(R3m\)	\(-\)	\(-\)	30	3.7
\(P3m1\)	\(-\)	\(-\)	12	3.8
\(R3m\)	\(-\)	\(-\)	24	4.1
\(R3m\)	\(-\)	\(-\)	36	4.1
\(P3m1\)	\(-\)	\(-\)	10	4.4
\(P3m1\)	\(-\)	\(-\)	11	4.4
\(R3m\)	\(-\)	\(-\)	33	4.6
\(Fm\bar3m\)	20502	FCC	4	4.7
\(R3m\)	\(-\)	\(-\)	27	4.8
\(P\bar3m1\)	\(-\)	\(-\)	9	5.0
\(R\bar3m\)	52497	Sm	9	6.2
\(P6_3/mmc\)	1425	HCP	2	29.5
\(P6/mmm\)	52521	CaHg\(_2\)	3	29.7

The local minimum structures in the elemental La are listed in Table 7. The 12-coordinated La-type structure is the global minimum structure, which is consistent with the experimental structure at low temperatures [55], [72]. Additionally, several close-packed local minimum structures, such as the FCC-, Sm-, and HCP-type structures, have low energy values near the global minimum energy. Among these structures, the FCC-type structure is recognized as a high-temperature polymorph [55], [72]. Although the BCC-type structure is also known as a high-temperature structure [55], [72], it is not found in the structure list since it is dynamically unstable at zero temperature. Moreover, the Sn(\(tI2\))-type structure [73] is not included in the structure list.

4.2.6 P↩︎

fig: — Figure 10: (a) Distribution of the local minimum structures and equilibrium prototype structures in P. These structures are mapped into a two-dimensional plane using an MDS. The open and closed circles show the local minimum structures and prototype structures, respectively. (b) Average coordination numbers of the local minimum structures. The cutoff radius is given as 1.2 times the nearest neighbor distance.

In this study, 276 local minimum structures with relative energy values less than \(\theta =\) 75 meV/atom have been identified for the elemental P. In particular, 112 structures are discovered for \(\theta = 15\) meV/atom. Figure 10 (a) shows the distribution of the local minimum structures and equilibrium prototype structures in P. Most of the local minimum structures are located far from the equilibrium prototype structures. Figure 10 (b) presents the distribution of average coordination numbers for the local minimum structures in P. The global minimum structure and most of the local minimum structures have three-coordinated configurations.

Table 8: Global and local minimum structures in P.The DFT relative energy \(\Delta E\) is expressed in the unit of meV/atom.\(Z\) denotes the number of atoms included in the unitcell.
Space group	ICSD-ID	Prototype	\(Z\)	\(\Delta E\)
\(Cmce\)	23836	black-P	8	0.0
\(Ibam\)	\(-\)	\(-\)	16	3.0
\(C2/m\)	\(-\)	\(-\)	20	3.1
\(P\bar1\)	\(-\)	\(-\)	12	3.1
\(C2/m\)	\(-\)	\(-\)	12	4.1
\(P\bar1\)	\(-\)	\(-\)	12	4.1
\(P\bar1\)	\(-\)	\(-\)	8	4.5
\(Cm\)	\(-\)	\(-\)	24	4.6
\(C2/m\)	\(-\)	\(-\)	24	4.7
\(C2/m\)	\(-\)	\(-\)	24	4.8
\(P\bar1\)	\(-\)	\(-\)	6	5.3
\(C2/m\)	\(-\)	\(-\)	20	5.5
\(Cccm\)	\(-\)	\(-\)	8	5.9
\(P2/m\)	\(-\)	\(-\)	12	6.1
\(C2/m\)	\(-\)	\(-\)	16	6.3
\(C2/m\)	\(-\)	\(-\)	16	6.3
\(P2/m\)	\(-\)	\(-\)	12	6.3
\(P\bar1\)	\(-\)	\(-\)	12	7.2
\(Cm\)	\(-\)	\(-\)	24	7.2
\(Pm\)	\(-\)	\(-\)	12	7.3
\(C2/m\)	\(-\)	\(-\)	20	7.4
\(C2/m\)	\(-\)	\(-\)	20	7.7
\(P\bar1\)	\(-\)	\(-\)	8	7.7
\(C2/m\)	\(-\)	\(-\)	24	8.0
\(P\bar3m1\)	(1425)	(HCP)	2	10.4
\(P6_3mc\)	31170	C(\(P6_3mc\))	4	10.5
\(R\bar3m\)	29123	Graphite(3R)	6	10.7
\(I2_13\)	187643	N	8	35.2
\(R\bar3m\)	9859	As	6	50.3

Table 8 lists the local minimum structures in the elemental P. The global minimum structure is the three-coordinated black-P-type structure, also known as the most stable experimental structure at room temperature [55], [74]. The structure list includes many other three-coordinated local minimum structures such as the C(\(P6_3mc\))-, graphite(3R)-, and As-type structures. However, only the As-type structure is known as a high-pressure polymorph [55], [75]. Several other experimental structures such as the SC- [55], [76], Ca(III)- [77], P(\(Amm2\))-types [77] have also been reported as high-pressure polymorphs in the literature. Among these structures, only the SC-type structure [55], [76] is included in the list of local minimum structures obtained using the structure search using the MLP, but it shows an energy value larger than the threshold.

4.2.7 Sb↩︎

fig: — Figure 11: (a) Distribution of the local minimum structures and equilibrium prototype structures in Sb. These structures are mapped into a two-dimensional plane using an MDS. The open and closed circles show the local minimum structures and prototype structures, respectively. (b) Average coordination numbers of the local minimum structures. The cutoff radius is given as 1.2 times the nearest neighbor distance.

In the elemental Sb, 302 structures in the elemental Sb that have relative energy values less than \(\theta =\) 75 meV/atom have been discovered. However, for \(\theta = 15\) and 30 meV/atom, only two and 24 structures have been found, respectively. Figure 11 (a) shows the distribution of the local minimum structures and equilibrium prototype structures in Sb. Most of the local minimum structures are far from the equilibrium prototype structures. Figure 11 (b) shows the distribution of average coordination numbers for the local minimum structures in Sb. The elemental Sb prefers open structures, and the global minimum structure is six-coordinated. The coordination numbers of the local minimum structures range from three to six.

Table 9: Global and local minimum structures in Sb.The DFT relative energy \(\Delta E\) is expressed in the unit of meV/atom.\(Z\) denotes the number of atoms included in the unitcell.
Space group	ICSD-ID	Prototype	\(Z\)	\(\Delta E\)
\(R\bar3m\)	9859	As	6	0.0
\(Pm\)	\(-\)	\(-\)	12	12.9
\(R\bar3m\)	\(-\)	\(-\)	36	17.3
\(P6_3mc\)	\(-\)	\(-\)	12	17.3
\(P\bar3m1\)	\(-\)	\(-\)	8	19.3
\(P\bar3m1\)	\(-\)	\(-\)	10	20.1
\(R\bar3m\)	\(-\)	\(-\)	30	20.6
\(P3m1\)	\(-\)	\(-\)	10	20.6
\(P6_3mc\)	\(-\)	\(-\)	8	23.9
\(Imma\)	\(-\)	\(-\)	24	24.3
\(R\bar3m\)	\(-\)	\(-\)	24	24.3
\(R\bar3m\)	\(-\)	\(-\)	18	24.4
\(R\bar3m\)	\(-\)	\(-\)	24	24.4
\(P1\)	\(-\)	\(-\)	8	24.4
\(C2/m\)	\(-\)	\(-\)	12	24.5
\(P1\)	\(-\)	\(-\)	8	24.6
\(R3m\)	\(-\)	\(-\)	24	24.7
\(P\bar3m1\)	\(-\)	\(-\)	8	25.8
\(Cmce\)	\(-\)	\(-\)	24	26.7
\(Pmma\)	\(-\)	\(-\)	10	27.4
\(P3m1\)	\(-\)	\(-\)	12	27.8
\(R3m\)	\(-\)	\(-\)	36	27.9
\(Cccm\)	\(-\)	\(-\)	24	28.1
\(C2/m\)	\(-\)	\(-\)	16	29.4
\(P6_3mc\)	31170	C(\(P6_3mc\))	4	40.0
\(Pm\bar3m\)	43211	SC	1	46.4
\(Cmce\)	23836	black-P	8	47.2
\(Pnma\)	165995	MnP	8	51.0

Table 9 provides a list of the local minimum structures observed in the elemental Sb. The global minimum structure is the six-coordinated As-type structure, which is also the experimental structure at low temperatures [55], [78]. In addition, the coordination numbers for all atoms in the As-type structure are obtained as six, similar to the elemental Bi. Other structures such as the C(\(P6_3mc\))-, SC-, black-P- and MnP-type structures are also identified. The SC-type structure is the only known high-pressure polymorph [79] among these structures. Additionally, the BCC- [80], HCP- [79], Sb(\(mP4\))- [64], Bi(\(I4/mcm\))- [81], and Bi(III)-type structures [81] are known as high-pressure polymorphs and have been experimentally observed. Among these structures, the BCC-, HCP-, Sb(\(mP4\))-, and Bi(\(I4/mcm\))-type structures are included in the list of local minimum structures obtained using the random structure search using the MLP.

4.2.8 Sn↩︎

fig: — Figure 12: (a) Distribution of the local minimum structures and equilibrium prototype structures in Sn. These structures are mapped into a two-dimensional plane using an MDS. The open and closed circles show the local minimum structures and prototype structures, respectively. (b) Average coordination numbers of the local minimum structures. The cutoff radius is given as 1.2 times the nearest neighbor distance.

There are 310 local minimum structures in the elemental Sn, each with relative energy values lower than \(\theta =\) 75 meV/atom. However, only one, four, and thirteen structures are discovered for \(\theta =\) 15, 30, and 45 meV/atom, respectively. Figure 12 (a) illustrates the distribution of the local minimum structures and equilibrium prototype structures in Sn. Most of the local minimum structures are located around the equilibrium prototype structures. Figure 12 (b) shows the distribution of average coordination numbers for the local minimum structures in Sn. The global minimum structure and the local minimum structures with low energy values are four-coordinated, while the coordination number of the local minimum structures ranges from four to twelve.

Table 10: Global and local minimum structures in Sn.The DFT relative energy \(\Delta E\) is expressed in the unit of meV/atom.\(Z\) denotes the number of atoms included in the unitcell.
Space group	ICSD-ID	Prototype	\(Z\)	\(\Delta E\)
\(Fd\bar3m\)	28857	Diamond-C(\(cF8\))	8	0.0
\(P6_3/mmc\)	30101	Wurtzite-ZnS(2H)	4	16.3
\(P6_122\)	\(-\)	\(-\)	6	29.8
\(P6_522\)	\(-\)	\(-\)	6	29.8
\(P6/mmm\)	52456	Ca\(_{.15}\)Sn\(_{.85}\)	1	37.0
\(Cmmm\)	(52456)	(Ca\(_{.15}\)Sn\(_{.85}\))	10	37.4
\(Cmm2\)	(52456)	(Ca\(_{.15}\)Sn\(_{.85}\))	22	37.5
\(P\bar3m1\)	(52456)	(Ca\(_{.15}\)Sn\(_{.85}\))	3	37.8
\(Pmma\)	(52456)	(Ca\(_{.15}\)Sn\(_{.85}\))	8	37.8
\(Pmna\)	(52456)	(Ca\(_{.15}\)Sn\(_{.85}\))	10	37.8
\(P1\)	(40037)	(\(\beta\)-Sn)	8	38.4
\(Pmc2_1\)	(40037)	(\(\beta\)-Sn)	8	38.9
\(I4_1/amd\)	40037	\(\beta\)-Sn	4	39.0
\(Fdd2\)	\(-\)	\(-\)	48	45.6
\(Cm\)	\(-\)	\(-\)	24	46.0
\(P2_1/c\)	\(-\)	\(-\)	10	47.6
\(I\bar43d\)	109012	Li(\(cI16\))	16	48.0
\(I\bar43d\)	109012	Li(\(cI16\))	16	48.0
\(C2/c\)	\(-\)	\(-\)	16	48.9
\(C2/m\)	\(-\)	\(-\)	24	49.8
\(Cmce\)	89414	Si(\(oS16\))	16	54.0
\(Cm\)	(1425)	(HCP)	24	54.8
\(Cmcm\)	(12174)	(In)	4	56.0
\(I4/mmm\)	12174	In	2	56.4
	(236711)	(I(\(Immm\)))
\(I4/mmm\)	24622	Sn(\(tI2\))	2	56.8
\(I4/mmm\)	181908	Si(\(I4/mmm\))	8	67.1
\(Pm\bar3m\)	43211	SC	1	74.2

The local minimum structures in the elemental Sn are listed in Table 10. The global minimum structure is the four-coordinated diamond-type structure, which is the experimental structure at low temperatures [55], [82]. The local minimum structures also include the wurtzite-, \(\beta\)-Sn-, Li(\(cI16\))-, Si(\(oS16\))-, In-, and Sn(\(tI2\))-type structures. Experimentally, the \(\beta\)-Sn- [55], [83], Sn(\(tI2\))- [84], I(\(Immm\))- [85], and BCC-type [85] structures have been reported as high-pressure polymorphs, and the \(\beta\)-Sn-type structure is also known as the high-temperature structure [55]. Among these structures, the \(\beta\)-Sn-, Sn(\(tI2\))-, and I(\(Immm\))-type structures are included in the current structure list.

4.2.9 Te↩︎

fig: — Figure 13: (a) Distribution of the local minimum structures and equilibrium prototype structures in Te. These structures are mapped into a two-dimensional plane using an MDS. The open and closed circles show the local minimum structures and prototype structures, respectively. (b) Average coordination numbers of the local minimum structures. The cutoff radius is given as 1.2 times the nearest neighbor distance.

In the elemental Te, 339 local minimum structures with relative energy values less than \(\theta =\) 100 meV/atom have been discovered. On the other hand, only seven structures are discovered for \(\theta =\) 20 meV/atom. Figure 13 (a) shows the distribution of the local minimum structures and equilibrium prototype structures in Te. Most of the local minimum structures are far from the equilibrium prototype structures. Figure 13 (b) shows the distribution of average coordination numbers for the local minimum structures in Te. The global minimum structure is two-coordinated, and the coordination numbers of the local minimum structures range from two to six.

Table 11: Global and local minimum structures in Te.The DFT relative energy \(\Delta E\) is expressed in the unit of meV/atom.\(Z\) denotes the number of atoms included in the unitcell.
Space group	ICSD-ID	Prototype	\(Z\)	\(\Delta E\)
\(P3_121\)	23059	\(\gamma\)-Se	3	0.0
\(P3_221\)	653048	Te	3	0.0
\(P2\)	\(-\)	\(-\)	9	18.9
\(C2/c\)	(23059)	(\(\gamma\)-Se)	12	19.9
\(P2_1/c\)	(23059)	(\(\gamma\)-Se)	12	19.9
\(P\bar1\)	(23059)	(\(\gamma\)-Se)	9	19.9
\(C2/c\)	(23059)	(\(\gamma\)-Se)	12	20.0
\(P2\)	\(-\)	\(-\)	12	20.7
\(R\bar3\)	27495	S\(_6\)	18	23.4
\(Pm\)	\(-\)	\(-\)	6	23.8
\(P1\)	\(-\)	\(-\)	12	24.3
\(P2_1/c\)	\(-\)	\(-\)	12	24.3
\(C2/m\)	\(-\)	\(-\)	18	24.4
\(P2_12_12\)	\(-\)	\(-\)	6	24.5
\(P2\)	\(-\)	\(-\)	12	24.5
\(C2/m\)	\(-\)	\(-\)	24	25.0
\(P2/m\)	\(-\)	\(-\)	9	26.5
\(C2/m\)	\(-\)	\(-\)	24	27.5
\(C2/m\)	\(-\)	\(-\)	12	28.0
\(P1\)	\(-\)	\(-\)	12	28.4
\(C2\)	\(-\)	\(-\)	18	29.4
\(P2/m\)	\(-\)	\(-\)	9	30.0
\(R\bar3m\)	(43211)	(SC)	3	40.8

Table 11 shows the local minimum structures in the elemental Te. The global minimum structures are the two-coordinated \(\gamma\)-Se-type chiral structures, which are the experimental structures at low temperatures [55], [86]. However, most of the local minimum structures do not have assigned prototype structures. In the literature, As- [87], Sb\(_2\)Te\(_3\)- [88], GaSb- [89], Te(\(mP4\))- [87], and Po(\(hR1\))-types [87] are known as high-pressure structures. The first four structures are included in the list of local minimum structures obtained through the random structure search using MLP, showing energy values greater than the threshold.

5 Conclusion↩︎

This study has developed an efficient iterative procedure utilizing polynomial MLPs to enumerate both globally stable and metastable structures with high accuracy. The procedure involves updating the MLP and conducting a random structure search using the updated MLP. This approach has enabled the creation of polynomial MLP models that are both reliable and efficient for structure enumeration.

The current procedure has been applied to nine different elemental systems: As, Bi, Ga, In, La, P, Sb, Sn, and Te. In each of these systems, metastable structures that compete energetically with the globally stable structures have been identified. The study demonstrates the importance of exhaustively enumerating metastable structures with low energy values to achieve a comprehensive search for globally stable structures in such systems. By employing this procedure, globally stable structures consistent with experimental data at low temperatures have been identified. Additionally, many previously unknown metastable structures that compete with the globally stable structures have been discovered, as well as several metastable structures that correspond to experimental observations and prototype structures reported in other systems. Therefore, this procedure will be valuable for conducting structure searches in complex systems characterized by potential energy surfaces with numerous local minima.

This work was supported by a Grant-in-Aid for Scientific Research (B) (Grant Number 22H01756) and a Grant-in-Aid for Scientific Research on Innovative Areas (Grant Number 19H05787) from the Japan Society for the Promotion of Science (JSPS).

References↩︎

[1]

S. Lorenz, A. Groß, and M. Scheffler, http://dx.doi.org/http://dx.doi.org/10.1016/j.cplett.2004.07.076.

[2]

J. Behler and M. Parrinello, Phys. Rev. Lett. 98, 146401 (2007).

[3]

J. Behler, J. Chem. Phys. 134, 074106 (2011).

[4]

J. Han, L. Zhang, R. Car, and E. Weinan, Commun. Comput. Phys. 23, 629 (2018).

[5]

N. Artrith and A. Urban, http://dx.doi.org/10.1016/j.commatsci.2015.11.047.

[6]

N. Artrith, A. Urban, and G. Ceder, http://dx.doi.org/10.1103/PhysRevB.96.014112.

[7]

A. P. Bartók, M. C. Payne, R. Kondor, and G. Csányi, Phys. Rev. Lett. 104, 136403 (2010).

[8]

W. J. Szlachta, A. P. Bartók, and G. Csányi, http://dx.doi.org/10.1103/PhysRevB.90.104108.

[9]

A. P. Bartók, J. Kermode, N. Bernstein, and G. Csányi, http://dx.doi.org/10.1103/PhysRevX.8.041048.

[10]

Z. Li, J. R. Kermode, and A. De Vita, http://dx.doi.org/10.1103/PhysRevLett.114.096405.

[11]

A. Glielmo, P. Sollich, and A. De Vita, http://dx.doi.org/10.1103/PhysRevB.95.214302.

[12]

A. Seko, A. Takahashi, and I. Tanaka, http://dx.doi.org/10.1103/PhysRevB.90.024101.

[13]

A. Seko, A. Takahashi, and I. Tanaka, http://dx.doi.org/10.1103/PhysRevB.92.054113.

[14]

A. Takahashi, A. Seko, and I. Tanaka, http://dx.doi.org/10.1103/PhysRevMaterials.1.063801.

[15]

A. Thompson, L. Swiler, C. Trott, S. Foiles, and G. Tucker, http://dx.doi.org/ https://doi.org/10.1016/j.jcp.2014.12.018.

[16]

M. A. Wood and A. P. Thompson, J. Chem. Phys. 148, 241721 (2018).

[17]

C. Chen, Z. Deng, R. Tran, H. Tang, I.-H. Chu, and S. P. Ong, http://dx.doi.org/ 10.1103/PhysRevMaterials.1.043603.

[18]

A. V. Shapeev, http://dx.doi.org/10.1137/15M1054183.

[19]

T. Mueller, A. Hernandez, and C. Wang, http://dx.doi.org/10.1063/1.5126336.

[20]

A. Khorshidi and A. A. Peterson, Comput. Phys. Commun. 207, 310 (2016).

[21]

G. Ferré, J.-B. Maillet, and G. Stoltz, http://dx.doi.org/10.1063/1.4930541.

[22]

S. A. Ghasemi, A. Hofstetter, S. Saha, and S. Goedecker, http://dx.doi.org/ 10.1103/PhysRevB.92.045131.

[23]

V. Botu and R. Ramprasad, http://dx.doi.org/10.1002/qua.24836.

[24]

R. Freitas and Y. Cao, http://dx.doi.org/10.1557/s43579-022-00221-5.

[25]

R. Drautz, http://dx.doi.org/10.1103/PhysRevB.99.014104.

[26]

V. L. Deringer, C. J. Pickard, and G. Csányi, http://dx.doi.org/10.1103/PhysRevLett.120.156001.

[27]

E. V. Podryabinkin, E. V. Tikhonov, A. V. Shapeev, and A. R. Oganov, http://dx.doi.org/10.1103/PhysRevB.99.064114.

[28]

K. Gubaev, E. V. Podryabinkin, G. L. Hart, and A. V. Shapeev, http://dx.doi.org/ https://doi.org/10.1016/j.commatsci.2018.09.031.

[29]

S. Kharabadze, A. Thorn, E. A. Koulakova, and A. N. Kolmogorov, http://dx.doi.org/10.1038/s41524-022-00825-4.

[30]

H. Wakai, A. Seko, and I. Tanaka, http://dx.doi.org/https://doi.org/10.2109/jcersj2.23053.

[31]

A. Thorn, D. Gochitashvili, S. Kharabadze, and A. N. Kolmogorov, http://dx.doi.org/10.1039/D3CP02817H.

[32]

A. R. Oganov, C. J. Pickard, Q. Zhu, and R. J. Needs, Nature Rev. Mater. 4, 331 (2019).

[33]

C. J. Pickard and R. J. Needs, http://dx.doi.org/10.1088/0953-8984/23/5/053201.

[34]

Y. Wang, J. Lv, L. Zhu, and Y. Ma, http://dx.doi.org/ https://doi.org/10.1016/j.cpc.2012.05.008.

[35]

C. W. Glass, A. R. Oganov, and N. Hansen, http://dx.doi.org/https://doi.org/10.1016/j.cpc.2006.07.020.

[36]

C. Boender and A. Rinnooy Kan, Mathematical Programming 37, 59 (1987).

[37]

A. Seko, A. Togo, and I. Tanaka, http://dx.doi.org/10.1103/PhysRevB.99.214108.

[38]

A. Seko, http://dx.doi.org/10.1103/PhysRevB.102.174104.

[39]

A. Seko, http://dx.doi.org/10.1063/5.0129045.

[40]

M. El-Batanouny and F. Wooten, http://dx.doi.org/10.1017/CBO9780511755736(Cambridge University Press, 2008).

[41]

G. Bergerhoff and I. D. Brown, in Crystallographic Databases, edited by F. H. Allen et al.(International Union of Crystallography, Chester, 1987).

[42]

P. E. Blöchl, Phys. Rev. B 50, 17953 (1994).

[43]

J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996).

[44]

G. Kresse and J. Hafner, Phys. Rev. B 47, 558 (1993).

[45]

G. Kresse and J. Furthmüller, Phys. Rev. B 54, 11169 (1996).

[46]

G. Kresse and D. Joubert, Phys. Rev. B 59, 1758 (1999).

[47]

E. M. Hendrix, G. Boglárka, et al., Introduction to nonlinear and global optimization, Vol. 37(Springer, 2010).

[48]

IUCr, International Tables for Crystallography, Volume A: Space Group Symmetry, 5th ed., International Tables for Crystallography (Kluwer Academic Publishers, Dordrecht, Boston, London, 2002).

[49]

M. Valle and A. R. Oganov, Acta Crystallogr. A 66, 507 (2010).

[50]

https://github.com/sekocha/lammps-polymlp-packageA. Seko, lammps-polymlp-package, https://github.com/sekocha/lammps-polymlp-package.

[51]

https://sekocha.github.ioA. Seko, Polynomial Machine Learning Potential Repository at Kyoto University, https://sekocha.github.io.

[52]

H. Wakai, A. Seko, H. Izuta, T. Nishiyama, and I. Tanaka, http://dx.doi.org/ 10.1103/PhysRevB.109.214207.

[53]

I. Borg and P. J. Groenen, Modern multidimensional scaling: Theory and applications(Springer Science & Business Media, 2005).

[54]

D. Schiferl and C. S. Barrett, J. Appl. Crystallogr. 2, 30 (1969).

[55]

D. A. Young, Phase diagrams of the elements(Univ of California Press, 1991).

[56]

P. M. Smith, A. J. Leadbetter, and A. J. Apling, Phil. Mag. 31, 57 (1975).

[57]

W. P. Davey, http://dx.doi.org/10.1103/PhysRev.25.753.

[58]

P. Cucka and C. S. Barrett, Acta Crystallogr. 15, 865 (1962).

[59]

R. Jaggi, in Helvetica Physica Acta, Vol. 37(1964) p. 618.

[60]

L. G. Akselrud, M. Hanflandm, and U. Schwarz, http://dx.doi.org/doi:10.1524/ncrs.2003.218.jg.447.

[61]

P. Schaufelberger, H. Merx, and M. Contre, High Temp. High Press. 5, 221 (1973).

[62]

O. Degtyareva, M. I. McMahon, and R. J. Nelmes, in Mater. Sci. Forum, Vol. 378(Trans Tech Publ, 2001) pp. 469–475.

[63]

W. Chaimayo, L. F. Lundegaard, I. Loa, G. W. Stinton, A. R. Lennie, and M. I. McMahon, High Press. Res. 32, 442 (2012).

[64]

S. S. Kabalkina, T. N. Kolobyanina, and L. F. Vereshchagin, Sov. Phys. JETP 31, 259 (1970).

[65]

B. D. Sharma and J. Donohue, http://dx.doi.org/doi:10.1524/zkri.1962.117.4.293.

[66]

M. de Koning, A. Antonelli, and D. A. C. Jara, http://dx.doi.org/10.1103/PhysRevB.80.045209.

[67]

A. Defrain, H. Curien, and A. Rimsky, http://dx.doi.org/10.3406/bulmi.1961.5486.

[68]

L. Bosio, http://dx.doi.org/10.1063/1.435841.

[69]

L. Bosio, H. Curien, M. Dupont, and A. Rimsky, http://dx.doi.org/ https://doi.org/10.1107/S0567740873002530.

[70]

A. W. Hull and W. P. Davey, http://dx.doi.org/10.1103/PhysRev.17.549.

[71]

K. Takemura and H. Fujihisa, http://dx.doi.org/10.1103/PhysRevB.47.8465.

[72]

F. H. Spedding, J. J. Hanak, and A. H. Daane, J. Less-common Met. 3, 110 (1961).

[73]

K. Nomura, H. Hayakawa, and S. Ono, J. Less-common Met. 52, 259 (1977).

[74]

A. Brown and S. Rundqvist, Acta Crystallogr. 19, 684 (1965).

[75]

J. C. Jamieson, Science 139, 1291 (1963).

[76]

T. Kikegawa and H. Iwasaki, Acta Crystallogr. B 39, 158 (1983).

[77]

M. Marqués, G. J. Ackland, L. F. Lundegaard, S. Falconi, C. Hejny, M. I. McMahon, J. Contreras-Garcı́a, and M. Hanfland, http://dx.doi.org/ 10.1103/PhysRevB.78.054120.

[78]

C. S. Barrett, P. Cucka, and K. J. A. C. Haefner, Acta Crystallogr. 16, 451 (1963).

[79]

L. F. Vereschchagin and S. S. Kabalkina, Sov. Phys. JETP 20, 274 (1965).

[80]

K. Aoki, S. Fujiwara, and M. Kusakabe, Solid State Commun. 45, 161 (1983).

[81]

U. Schwarz, L. Akselrud, H. Rosner, A. Ormeci, Y. Grin, and M. Hanfland, http://dx.doi.org/ 10.1103/PhysRevB.67.214101.

[82]

H. E. Swanson, Standard X-ray diffraction powder patterns, Vol. 25(US Department of Commerce, National Bureau of Standards, 1953).

[83]

J. A. Lee and G. V. Raynor, Proc. Phys. Soc. B 67, 737 (1954).

[84]

J. D. Barnett, V. E. Bean, and H. T. Hall, J. Appl. Phys. 37, 875 (1966).

[85]

A. Salamat, R. Briggs, P. Bouvier, S. Petitgirard, A. Dewaele, M. E. Cutler, F. Corà, D. Daisenberger, G. Garbarino, and P. F. McMillan, http://dx.doi.org/ 10.1103/PhysRevB.88.104104.

[86]

A. J. Bradley, http://dx.doi.org/10.1080/14786442408634511.

[87]

K. Aoki, O. Shimomura, and S. Minomura, J. Phys. Soc. Japan 48, 551 (1980).

[88]

M. Takumi, T. Masamitsu, and K. Nagata, J. Phys.: Condens. Matter 14, 10609 (2002).

[89]

G. C. Vezzoli, Z. Krist. - Cryst. Mater. 134, 305 (1971).

Globally-stable and metastable crystal structure enumeration using polynomial machine learning potentials in elemental As, Bi, Ga, In, La, P, Sb, Sn, and Te