Optimize your Azure CycleCloud HPC clusters with a custom Slurm image for secure, internet-free environments and faster scaling. This step-by-step guide covers VM setup, package installation, repo configuration, image capturing, and template modification to streamline HPC deployments efficiently.

Speed Up Your Azure HPC Clusters with a Custom Slurm Image
If you’re managing HPC clusters with Azure CycleCloud, you know how crucial security and startup speed are. Microsoft just shared a neat way to build a custom Slurm image that locks down your environment and slashes node startup times. Let’s break down what you need to know.
What’s New?
The new approach lets admins pre-install Slurm packages directly onto an AlmaLinux HPC image. This means your cluster nodes don’t waste time downloading and installing Slurm during scaling. Plus, it supports locked-down environments with no internet access, which is perfect for sensitive workloads.
“Many CycleCloud users or admins need to run their HPC cluster in a secure environment without internet access.”
Using Azure CycleCloud 8.7.1 and Slurm 23.110-2, you create a standalone VM with all required packages. Then, capture this as a custom image for your cluster nodes.
Major Updates and Steps
Prepare Your VM
- Create an Azure VM from the AlmaLinux HPC 8.10 Gen 2 image.
- Download and install Slurm packages from the official GitHub release.
- Add Slurm and Munge users with specific user IDs for consistency.
- Install CycleCloud dependencies like Chef-required packages (xfsdump, cryptsetup, etc.).
- Adjust OS repositories to enable only those needed in your locked environment.
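Here is a rough sketch of what the user, dependency, and repository steps might look like on the image VM. It is not the original guide's script. The UID/GID values (11100 and 11101), the repo ID, and the run helper are assumptions for illustration; use the IDs your cluster template expects and the repo IDs that exist on your image. By default it only prints the commands; set DRY_RUN=0 and run as root to execute them.

```shell
#!/usr/bin/env bash
# Sketch only: prints each command unless DRY_RUN=0 is set.
set -eu
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create a group and a no-login account with a fixed UID/GID.
# $1 = account name, $2 = numeric ID used for both UID and GID
create_service_user() {
    run groupadd -g "$2" "$1"
    run useradd -u "$2" -g "$2" -M -s /sbin/nologin "$1"
}

# Assumed IDs: they must match the values set in the cluster template.
SLURM_ID="${SLURM_ID:-11100}"
MUNGE_ID="${MUNGE_ID:-11101}"

create_service_user slurm "$SLURM_ID"
create_service_user munge "$MUNGE_ID"

# Only the two packages named in the guide; the full dependency list is longer.
run dnf install -y xfsdump cryptsetup

# Hypothetical repo ID: disable whichever repos are unreachable in your network.
# dnf config-manager is provided by the dnf-plugins-core package.
run dnf config-manager --set-disabled "${UNREACHABLE_REPO:-example-external-repo}"
```

Fixing the IDs up front matters because every node built from the image must agree on who owns the Slurm and Munge files.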
Generalize and Capture the Image
- Run waagent --deprovision+user --force to clean user data.
- Use Azure CLI to generalize the VM and capture it as a managed image.
- Remember to select a shared image gallery, not just a managed image, for CycleCloud compatibility.
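The capture steps could be scripted along these lines, once waagent has been run inside the VM. Treat this as a sketch under assumptions: the resource group, VM, image, gallery, and image definition names are placeholders, and the gallery and its image definition are assumed to exist already. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch only: prints each az command unless DRY_RUN=0 is set.
set -eu
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop, generalize, and capture the VM as a Gen 2 managed image.
# $1 = resource group, $2 = VM name, $3 = image name
capture_image() {
    run az vm deallocate --resource-group "$1" --name "$2"
    run az vm generalize --resource-group "$1" --name "$2"
    run az image create --resource-group "$1" --name "$3" \
        --source "$2" --hyper-v-generation V2
}

# Publish the managed image as a version in an existing image gallery.
# $1 = resource group, $2 = gallery, $3 = image definition,
# $4 = version, $5 = managed image name (same resource group)
publish_to_gallery() {
    run az sig image-version create --resource-group "$1" \
        --gallery-name "$2" --gallery-image-definition "$3" \
        --gallery-image-version "$4" --managed-image "$5"
}

# Placeholder names; replace with your own.
RG="${RG:-my-hpc-rg}"
VM="${VM:-slurm-image-vm}"
IMAGE="${IMAGE:-slurm-custom-image}"

capture_image "$RG" "$VM" "$IMAGE"
publish_to_gallery "$RG" "${GALLERY:-myHpcGallery}" "${IMAGE_DEF:-slurm-almalinux}" "1.0.0" "$IMAGE"
```

A generalized VM cannot be started again, so keep a snapshot if you expect to rebuild the image later.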
Modify Your CycleCloud Template
- Prevent CycleCloud from reinstalling Slurm by setting slurm.do_install = false in your cluster template.
- Update user IDs in the template to match your custom image setup.
- Import the updated template and create your cluster using the custom image resource ID.
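Before importing, it can help to sanity-check the edited template. The sketch below is one way to do that, not part of the original guide. It assumes the template uses plain key = value lines and that the user ID key is named slurm.user.uid; check the key names against your own template.

```shell
#!/usr/bin/env bash
# Sketch only: read-only checks on a CycleCloud template file.
set -eu

# Print the value assigned to a key in "key = value" form (first match only).
# $1 = template file, $2 = key, for example slurm.do_install
template_value() {
    awk -F= -v key="$2" '
        {
            k = $1
            gsub(/[ \t]/, "", k)               # strip indentation and padding
            if (k == key) {
                v = $2
                gsub(/^[ \t]+|[ \t]+$/, "", v) # trim the value
                print v
                exit
            }
        }' "$1"
}

# Succeed only if the template sets slurm.do_install to false.
template_disables_install() {
    [ "$(template_value "$1" slurm.do_install)" = "false" ]
}

# Optional: pass a template path as the first argument to run the check.
if [ "$#" -ge 1 ]; then
    if template_disables_install "$1"; then
        # slurm.user.uid is an assumed key name.
        echo "slurm.do_install is false; slurm uid: $(template_value "$1" slurm.user.uid)"
    else
        echo "slurm.do_install is not set to false in $1" >&2
        exit 1
    fi
fi
```

Compare the UID it reports with the one baked into the image. A mismatch is easy to miss until the nodes are already up.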
Why It Matters
This method drastically reduces cluster node startup times by skipping redundant Slurm installs. Also, it ensures your HPC environment stays secure without internet access. If you can briefly connect to the internet during initial startup, CycleCloud caches project data, smoothing future launches.
“If you want to know more about that command… Deprovision or generalize a VM before creating an image.”
For CycleCloud versions earlier than 8.7.1, a patch resolves issues with the slurm.do_install = false flag. Newer versions include this fix by default.
Final Thoughts
Building a custom Slurm image for Azure CycleCloud is a game-changer for HPC admins aiming for speed and security. Follow the step-by-step guide, tweak your templates, and enjoy faster, locked-down cluster deployments. This is a must-try for anyone running HPC workloads on Azure.
From the New blog articles in Microsoft Community Hub