Allegro Model
Allegro hyperparameters
The core hyperparameters of the Allegro model include:
- r_max is the cutoff radius used for the strictly local Allegro model.
- l_max governs the angular resolution of the tensor features. Reasonable l_max values to try include 1, 2, and 3, where raising l_max may improve accuracy at the cost of speed. NOTE that the computational cost does not increase linearly with l_max, but rather scales polynomially due to the \(O(\ell_\text{max}^6)\) scaling of the Clebsch-Gordan tensor products. Raising l_max will also increase the number of tensor paths taken, and the growth of paths is also tied to the choice of num_layers.
- num_layers is the number of Allegro layers, which corresponds to the body-ordering (num_layers=1 corresponds to three-body tensor features, num_layers=2 corresponds to four-body tensor features, and so on). A Clebsch-Gordan tensor product occurs at each Allegro layer. It is usually appropriate to use 1, 2, or 3 layers.
- num_scalar_features and num_tensor_features correspond to the number of scalar track channels and tensor track channels respectively. They are separate parameters in the Allegro model because of its two-track system. It is often useful to keep num_tensor_features small and raise num_scalar_features to improve the learning capacity of the model. For num_scalar_features, 16, 32, 64, 128, 256 are good options to try depending on the dataset. For num_tensor_features, 8, 16, 32, 64 are good options to try depending on the dataset.

Each Allegro layer contains a neural network or multilayer perceptron (MLP). These are governed by the allegro_mlp parameters.
- allegro_mlp_hidden_layers_depth is the depth and defaults to 1. One could try reducing it to 0 (making the MLP a linear layer), or raising it to 2 or 3 (it is unhelpful to go beyond 3).
- allegro_mlp_hidden_layers_width is the width. A reasonable starting value is a multiple of 16 or 32 (for performance) that is larger than num_scalar_features (maybe 2 to 4 times as large to start).
- allegro_mlp_nonlinearity is silu by default (which is recommended).
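The rules of thumb above can be sketched in a few lines of plain Python. This is only an illustration: the helper names (body_order, relative_tp_cost, suggest_mlp_width) are hypothetical and not part of the allegro package, and the cost figure is the asymptotic \(O(\ell_\text{max}^6)\) estimate rather than a measured runtime.

```python
def body_order(num_layers: int) -> int:
    """Body order of the tensor features: 1 layer -> 3-body, 2 layers -> 4-body."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    return num_layers + 2


def relative_tp_cost(l_max: int, baseline_l_max: int = 1) -> float:
    """Asymptotic l_max^6 tensor-product cost relative to a baseline l_max.

    This is a scaling estimate only; real timings depend on hardware,
    num_layers, and the number of tensor paths.
    """
    return (l_max / baseline_l_max) ** 6


def suggest_mlp_width(num_scalar_features: int, factor: int = 2, multiple: int = 32) -> int:
    """Round factor * num_scalar_features up to the next multiple of `multiple`."""
    target = factor * num_scalar_features
    return -(-target // multiple) * multiple  # ceiling division, then scale back


if __name__ == "__main__":
    for layers in (1, 2, 3):
        print(f"num_layers={layers}: {body_order(layers)}-body features")
    for l in (1, 2, 3):
        print(f"l_max={l}: ~{relative_tp_cost(l):.0f}x tensor-product cost vs l_max=1")
    print("suggested allegro_mlp_hidden_layers_width:", suggest_mlp_width(64))
```

The width helper starts at the low end of the suggested 2 to 4 times range; passing factor=4 gives the upper end.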
The above core hyperparameters are the most important to set correctly for most use cases. The following are some advanced hyperparameters that can be tuned, though doing so is discouraged unless one is comfortable with the process of hyperparameter tuning.
The initial scalar embedding has some level of configurability.
- For radial_chemical_embed, refer to the TwoBodyBesselScalarEmbed() below as a starting point (it's usually fine to just use the defaults). The output features of the radial-chemical embedding module are then put through a scalar embedding MLP.
- radial_chemical_embed_dim is the dimensionality of the feature vector output by the radial-chemical module, which is used as input to the scalar embedding MLP. It is typical to set radial_chemical_embed_dim to num_scalar_features, but it can be tuned to other values.
- scalar_embed_mlp_hidden_layers_depth defaults to 1.
- scalar_embed_mlp_hidden_layers_width can also be set to num_scalar_features, and tuned as desired.
- scalar_embed_mlp_nonlinearity defaults to silu.

- parity determines whether to use the full set of allowed irreps (i.e. the default behavior of true), or to use a set restricted to spherical harmonic irreps (i.e. the false option).
- tp_path_channel_coupling determines whether the tensor product weights couple paths and channels or not. The default of true is expected to be more expressive.

After all the Allegro layers comes the readout MLP, which produces the energy predictions. It is governed by the readout_mlp parameters. The following default behavior is recommended as a starting point, but can be tuned as desired.
- readout_mlp_hidden_layers_depth defaults to 1.
- readout_mlp_hidden_layers_width can be set to num_scalar_features.
- readout_mlp_nonlinearity defaults to silu.
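As a rough sketch, the advanced defaults described above can be collected in one place and overridden selectively. The function advanced_defaults is hypothetical and not part of allegro. The key names are copied from the prose in this section, and the parameter list in the API section uses different names for some of the embedding options, so check them against the installed version before use.

```python
from typing import Any, Dict


def advanced_defaults(num_scalar_features: int) -> Dict[str, Any]:
    """Collect the advanced-hyperparameter defaults described in the text.

    Key names follow the prose of this page and are an assumption about
    what a given allegro version accepts.
    """
    return {
        # scalar embedding
        "radial_chemical_embed_dim": num_scalar_features,
        "scalar_embed_mlp_hidden_layers_depth": 1,
        "scalar_embed_mlp_hidden_layers_width": num_scalar_features,
        "scalar_embed_mlp_nonlinearity": "silu",
        # irreps and tensor product behavior
        "parity": True,
        "tp_path_channel_coupling": True,
        # readout MLP
        "readout_mlp_hidden_layers_depth": 1,
        "readout_mlp_hidden_layers_width": num_scalar_features,
        "readout_mlp_nonlinearity": "silu",
    }


config = advanced_defaults(num_scalar_features=64)
config["readout_mlp_hidden_layers_width"] = 128  # override a single default
```

Starting from the defaults and changing one entry at a time keeps any tuning experiment easy to attribute to a single hyperparameter.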
API
- allegro.model.AllegroModel(l_max: int, parity: bool = True, **kwargs)
Allegro model that predicts energies and forces (and stresses if cell is provided).
- Parameters:
  - seed (int) – seed for reproducibility
  - model_dtype (str) – float32 or float64
  - r_max (float) – cutoff radius
  - per_edge_type_cutoff (Dict) – one can optionally specify cutoffs for each edge type [must be smaller than r_max] (default None)
  - type_names (Sequence[str]) – list of atom type names
  - l_max (int) – maximum order \(\ell\) to use in spherical harmonics embedding, 1 is baseline (fast), 2 is more accurate, but slower, 3 highly accurate but slow
  - parity (bool) – whether to include features with odd mirror parity (default True)
  - radial_chemical_embed – an Allegro-compatible two-body radial-chemical embedding module, e.g. allegro.nn.TwoBodyBesselScalarEmbed
  - two_body_mlp_hidden_layers_depth (int) – number of hidden layers of two-body MLP (default 1)
  - two_body_mlp_hidden_layers_width (int) – width of hidden layers of two-body MLP
  - two_body_mlp_nonlinearity (str) – silu, mish, gelu, or None (default silu)
  - scalar_embed_output_dim (int) – output dimension of the scalar embedding module (default None will use num_scalar_features)
  - num_layers (int) – number of Allegro layers
  - num_scalar_features (int) – multiplicity of scalar features in the Allegro layers
  - num_tensor_features (int) – multiplicity of tensor features in the Allegro layers
  - allegro_mlp_hidden_layers_depth (int) – number of hidden layers in the Allegro scalar MLPs (default 1)
  - allegro_mlp_hidden_layers_width (int) – width of hidden layers in the Allegro scalar MLPs (reasonable to set it to be the same as num_scalar_features)
  - allegro_mlp_nonlinearity (str) – silu, mish, gelu, or None (default silu)
  - tp_path_channel_coupling (bool) – whether Allegro tensor product weights couple the paths with the channels or not, True is expected to be more expressive than False (default True)
  - readout_mlp_hidden_layers_depth (int) – number of hidden layers in the readout MLP (default 1)
  - readout_mlp_hidden_layers_width (int) – width of hidden layers in the readout MLP (reasonable to set it to be the same as num_scalar_features)
  - readout_mlp_nonlinearity (str) – silu, mish, gelu, or None (default silu)
  - avg_num_neighbors (float/Dict[str, float]) – used to normalize edge sums for better numerics (default None)
  - per_type_energy_scales (float/List[float]) – per-atom energy scales, which could be derived from the force RMS of the data (default None)
  - per_type_energy_shifts (float/List[float]) – per-atom energy shifts, which should generally be isolated atom reference energies or estimated from average per-atom energies of the data (default None)
  - per_type_energy_scales_trainable (bool) – whether the per-atom energy scales are trainable (default False)
  - per_type_energy_shifts_trainable (bool) – whether the per-atom energy shifts are trainable (default False)
  - pair_potential (torch.nn.Module) – additional pair potential term, e.g. nequip.nn.pair_potential.ZBL (default None)
  - do_derivatives (bool) – whether to compute forces and stresses via autograd (default True)
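A few of the constraints in the parameter list can be checked before a model is built. The sketch below is a hypothetical pre-flight checker in plain Python, not something allegro provides. It assumes the settings are available as a dict, walks per_edge_type_cutoff recursively because the exact layout of that dict is not specified here, and treats a cutoff equal to r_max as acceptable, which is also an assumption.

```python
from typing import Any, Dict, List

ALLOWED_DTYPES = ("float32", "float64")
ALLOWED_NONLINEARITIES = ("silu", "mish", "gelu", None)


def _cutoff_values(obj: Any) -> List[float]:
    """Collect the numeric leaves of a possibly nested dict of cutoffs."""
    if isinstance(obj, dict):
        values: List[float] = []
        for child in obj.values():
            values.extend(_cutoff_values(child))
        return values
    return [float(obj)]


def check_allegro_kwargs(kwargs: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: List[str] = []

    dtype = kwargs.get("model_dtype")
    if dtype is not None and dtype not in ALLOWED_DTYPES:
        problems.append(f"model_dtype={dtype!r} is not one of {ALLOWED_DTYPES}")

    # Every *_nonlinearity option documented above shares the same choices.
    for key, value in kwargs.items():
        if key.endswith("_nonlinearity") and value not in ALLOWED_NONLINEARITIES:
            problems.append(f"{key}={value!r} is not one of {ALLOWED_NONLINEARITIES}")

    cutoffs = kwargs.get("per_edge_type_cutoff")
    if cutoffs is not None:
        r_max = kwargs.get("r_max")
        if r_max is None:
            problems.append("per_edge_type_cutoff given without r_max")
        else:
            for cutoff in _cutoff_values(cutoffs):
                if cutoff > r_max:
                    problems.append(f"per-edge-type cutoff {cutoff} exceeds r_max={r_max}")
    return problems


if __name__ == "__main__":
    settings = {
        "r_max": 5.0,
        "model_dtype": "float32",
        "allegro_mlp_nonlinearity": "silu",
        "per_edge_type_cutoff": {"H": 2.0, "C": {"H": 6.0, "C": 3.5}},
    }
    for problem in check_allegro_kwargs(settings):
        print(problem)
```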
- allegro.nn.TwoBodyBesselScalarEmbed(type_names: Sequence[str], num_bessels: int = 8, bessel_trainable: bool = False, polynomial_cutoff_p: int = 6, module_output_dim: int = 64, forward_weight_init: bool = True, scalar_embed_field: str = 'edge_embedding', irreps_in=None) -> SequentialGraphNetwork
Two-body Bessel scalar embedding.
The radial edge lengths are encoded with a Bessel basis, which is then projected to two_body_embedding_dim. The center-neighbor atom types are embedded with weights to the same two_body_embedding_dim. The radial embedding and center-neighbor type embedding are multiplied.

This module can be used for the scalar_embed argument of the AllegroModel in the config as follows.

    model:
      _target_: allegro.model.AllegroModel
      # other Allegro model parameters
      scalar_embed:
        _target_: allegro.nn.TwoBodyBesselScalarEmbed
        num_bessels: 8
        bessel_trainable: false
        polynomial_cutoff_p: 6
- class allegro.nn.TwoBodySplineScalarEmbed(type_names: Sequence[str], num_splines: int = 16, spline_span: int = 12, module_output_dim: int = 64, forward_weight_init: bool = True, scalar_embed_field: str = 'edge_embedding', edge_type_field: str = 'edge_type_flat', norm_length_field: str = 'normed_edge_lengths', irreps_in=None)
Two-body spline scalar embedding.
This module can be used for the scalar_embed argument of the AllegroModel in the config as follows.

    model:
      _target_: allegro.model.AllegroModel
      # other Allegro model parameters
      scalar_embed:
        _target_: allegro.nn.TwoBodySplineScalarEmbed
        num_splines: 16
        spline_span: 12