Model Packaging

The NequIP packaging system creates portable, version-independent model archives using PyTorch’s torch.package infrastructure.

Overview

Packaging converts checkpoint files (.ckpt) into package files (.nequip.zip) to address the portability problem of checkpoints. A checkpoint file is tied to specific software versions and may not load in a different environment, whereas a package file bundles the model together with the code needed to run it, which makes it largely version-independent.

A package file contains:

  • Model weights and architecture

  • Snapshot of the implementation code

  • Metadata and configuration

  • Example data for compilation

For user-facing information on packaging workflows and CLI usage, see the packaging workflow and package files overview in the user guide.

Developer Notes

Package Format Versioning

The packaging system uses _CURRENT_NEQUIP_PACKAGE_VERSION to track when the packaging mechanism has changed. This counter is incremented whenever breaking changes are made to the package format, as these changes represent the main barrier to maintaining backwards compatibility of packaged models. The ModelFromPackage() loader includes logic to handle different package format versions based on this counter.
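
As a rough illustration of version-gated loading, the sketch below dispatches on a stored format counter. It is not NequIP's actual loader: `CURRENT_PACKAGE_VERSION`, the reader functions, the `package_version` key, and both metadata layouts are hypothetical stand-ins chosen for the example.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for _CURRENT_NEQUIP_PACKAGE_VERSION (value is illustrative).
CURRENT_PACKAGE_VERSION = 2


def _read_v1(metadata: Dict[str, Any]) -> Dict[str, Any]:
    # Assumed layout: format 1 stored the model name under "name".
    return {"model_name": metadata["name"]}


def _read_v2(metadata: Dict[str, Any]) -> Dict[str, Any]:
    # Assumed layout: format 2 renamed the key to "model_name".
    return {"model_name": metadata["model_name"]}


# One reader per supported package format version.
_READERS: Dict[int, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    1: _read_v1,
    2: _read_v2,
}


def read_package_metadata(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Normalize metadata written by any supported package format version."""
    version = int(metadata["package_version"])
    if version > CURRENT_PACKAGE_VERSION:
        raise RuntimeError(
            f"package format {version} is newer than the supported "
            f"format {CURRENT_PACKAGE_VERSION}; upgrade the software"
        )
    reader = _READERS.get(version)
    if reader is None:
        raise RuntimeError(f"unsupported package format version {version}")
    return reader(metadata)
```

Keeping one reader per counter value means that a breaking format change only adds a new entry, while older packages keep loading through their original reader.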

Dependency Management

The packaging system handles dependencies by categorizing Python modules into three types:

  • Internal: Core NequIP code (nequip, e3nn) - gets packaged with the model

  • External: Large libraries (numpy, triton) - expected to be available in the target environment

  • Mock: Optional dependencies (matplotlib) - imports are allowed but runtime usage raises errors
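
The three categories can be pictured as a lookup on a module's top-level name. The sketch below is illustrative only: the sets contain just the examples named in the list above (not the real defaults), and `categorize_module` is a hypothetical helper that is not part of NequIP.

```python
# Example members taken from the list above; the real defaults may differ.
INTERNAL = {"nequip", "e3nn"}
EXTERNAL = {"numpy", "triton"}
MOCK = {"matplotlib"}


def categorize_module(module_name: str) -> str:
    """Classify a dotted module name by its top-level library."""
    top_level = module_name.split(".", 1)[0]
    if top_level in INTERNAL:
        return "internal"  # code is copied into the package
    if top_level in EXTERNAL:
        return "external"  # must be installed where the package is loaded
    if top_level in MOCK:
        return "mock"  # importable, but raises if actually used
    # Assumption: anything else would need a category before it can be packaged.
    return "uncategorized"
```

Because the decision is made on the top-level name, `numpy.linalg` and `numpy.random` fall into the same category as `numpy` itself.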

Developers of NequIP extension packages can register custom dependencies with register_libraries_as_external_for_packaging(). This function lets extension packages categorize their dependencies so that they are handled correctly during packaging. Libraries with custom C++/CUDA ops and large, stable third-party libraries should typically be registered as external, while optional dependencies can be mocked.

nequip.scripts._package_utils.register_libraries_as_external_for_packaging(extern_modules: Iterable[str] | None = None, mock_modules: Iterable[str] | None = None) -> None

Register libraries as external (“extern”) or mock modules for packaging.

Registering an entire top-level library as “extern” prevents any code from that library from being included in the package file.

Two primary types of libraries should be registered as external:

  1. Libraries that provide custom C++ or CUDA ops in PyTorch, for example OpenEquivariance.

  2. Large and stable third-party, non-PyTorch libraries like NumPy.

NequIP extension packages should never be registered as extern. If an issue seems to require it, the fix is almost certainly to refactor the code so that it can be interned.

Warning

Registering a library as extern means that a compatible version of that library must be installed in the environment where the package is run or used. This significantly complicates dependency management for packaged models and should be avoided as much as possible.

Mocking is useful for libraries that are imported by the packaged code but are not required to run the model. Code that imports a mocked module can still be packaged, but any attempt to actually use the mocked module raises an error. For example, matplotlib is mocked by default.
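
The import-succeeds, use-fails behaviour can be sketched with a small stand-in module. This is a simplified imitation written for illustration, not torch.package's implementation: `_MockedObject`, `_MockModule`, and the library name `fake_plotting_lib` are all hypothetical.

```python
import sys
import types


class _MockedObject:
    """Placeholder returned for any attribute of a mocked module."""

    def __init__(self, qualified_name: str) -> None:
        self._qualified_name = qualified_name

    def __call__(self, *args, **kwargs):
        raise NotImplementedError(
            f"{self._qualified_name} was mocked out during packaging "
            "but is being used at runtime"
        )


class _MockModule(types.ModuleType):
    """Module whose attributes can be imported but not used."""

    def __getattr__(self, name: str):
        # Leave dunder lookups (e.g. __path__) to the import machinery.
        if name.startswith("__"):
            raise AttributeError(name)
        return _MockedObject(f"{self.__name__}.{name}")


# "fake_plotting_lib" is a made-up name standing in for a mocked dependency.
sys.modules["fake_plotting_lib"] = _MockModule("fake_plotting_lib")

from fake_plotting_lib import plot  # importing succeeds


def maybe_plot(values, show: bool = False):
    """Model-side code that only touches the mocked library when asked to."""
    if show:
        plot(values)  # raises NotImplementedError
    return sum(values)
```

Calling `maybe_plot(values)` works because the mocked function is never invoked, whereas `maybe_plot(values, show=True)` raises. This is why mocking suits dependencies that appear in imports but not on the model's execution path.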

Tip

Refactoring code to avoid unnecessary imports in packaged code is always preferred over registering libraries as external or mock modules.

See _DEFAULT_EXTERNAL_MODULES and _DEFAULT_MOCK_MODULES for the defaults.

Parameters:
  • extern_modules (Optional[Iterable[str]]) – libraries to register as external modules

  • mock_modules (Optional[Iterable[str]]) – libraries to register as mock modules
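
A minimal sketch of how an extension package might call this function, using the signature documented above. Wrapping the call in a helper and the module names `my_cuda_ops` and `my_plotting_helpers` are assumptions made for illustration. They are not real libraries or NequIP conventions.

```python
def register_packaging_dependencies() -> None:
    """Tell the NequIP packaging system how to treat this extension's dependencies."""
    # Imported inside the function so the example can be defined on its own.
    from nequip.scripts._package_utils import (
        register_libraries_as_external_for_packaging,
    )

    register_libraries_as_external_for_packaging(
        # hypothetical library providing custom C++/CUDA ops
        extern_modules=["my_cuda_ops"],
        # hypothetical optional dependency that is imported but not needed at runtime
        mock_modules=["my_plotting_helpers"],
    )
```

An extension would need to run such a helper before packaging starts, for example when the extension is imported. Where exactly to call it is left open here.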

Repackaging Support

The packaging system also supports creating packages from other packages (repackaging), which requires maintaining correct importer chains. During repackaging, shared importers ensure all models come from the same source, which is required by PyTorch’s packaging infrastructure.

Sharp Edges with Model Modifiers

nequip-package will pick up files as long as there are no errors loading the files. However, certain coding patterns can cause loading errors that prevent packaging. Common pitfalls include:

External dependencies not installed at package-time: Top-level imports of optional dependencies will cause packaging to fail if those dependencies aren’t available during packaging. Use lazy imports instead:

# bad: top-level import
from openequivariance import TensorProductConv  # fails if not installed

# good: lazy import in __init__
def __init__(self, ...):
    super().__init__()
    from openequivariance import TensorProductConv
    self.tp_conv = TensorProductConv(...)

Triton/GPU dependencies: Triton decorators like @triton.autotune cause errors if the code is loaded on machines without GPUs. Wrap GPU-dependent code in conditional blocks:

# hide GPU-dependent code to allow packaging on CPU-only systems
if torch.cuda.is_available():
    @triton.autotune(...)
    def gpu_kernel(...):
        # GPU implementation

Type checking with isinstance(): PyTorch’s torch.package creates isolated type systems where types are not shared between packages and the loading environment (see torch.package documentation). This means isinstance(obj, MyClass) checks will fail when comparing objects loaded from a package against class definitions in the environment.

To support packaged models, use class attribute identifiers instead of isinstance() checks:

# bad: isinstance check fails across package boundaries
if isinstance(module, SomeNequIPNNModule):
    # ...

# good: attribute-based check works with torch.package
if getattr(module, "_is_some_nequip_nn_module", False):
    # ...

Classes should define identifying attributes:

from typing import Final

class SomeNequIPNNModule:
    _is_some_nequip_nn_module: Final[bool] = True
    # ...

Note: isinstance() checks are fine when both the class definition and instance come from the same package (e.g., model modifiers checking their own module types).
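
The effect can be reproduced without torch.package by executing the same class definition twice in separate namespaces, which yields two distinct class objects, much like a packaged copy and an environment copy. This only mimics the isolation. `load_isolated_copy` and the module names are hypothetical.

```python
import types

# The same source is loaded twice to imitate two independent copies of a class.
SOURCE = """
class SomeNequIPNNModule:
    _is_some_nequip_nn_module = True
"""


def load_isolated_copy(name: str) -> types.ModuleType:
    """Execute SOURCE in a fresh module so its classes are unique to that module."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module


environment = load_isolated_copy("environment_copy")  # class defined in the environment
packaged = load_isolated_copy("packaged_copy")  # class as loaded from a package

obj = packaged.SomeNequIPNNModule()

# Same name and source, but different class objects: the check fails.
crosses_boundary = isinstance(obj, environment.SomeNequIPNNModule)

# Instance and class come from the same copy: the check succeeds.
same_package = isinstance(obj, packaged.SomeNequIPNNModule)

# The attribute-based check does not depend on class identity.
attribute_check = getattr(obj, "_is_some_nequip_nn_module", False)
```

Here `crosses_boundary` is `False`, while `same_package` and `attribute_check` are both `True`, matching the guidance above.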