VMware Cloud Foundation 9.1: Safely Shutting Down an Entire VCF Services Runtime Cluster

With VMware Cloud Foundation 9.1, Broadcom introduced a new automated method for safely shutting down all nodes within a VCF Services Runtime cluster. While this may sound like a small operational improvement, anyone who has ever performed a full management domain maintenance window knows how important a coordinated shutdown process can be.

As more VCF services are delivered through the VCF Services Runtime platform, traditional "power off the VMs" approaches become increasingly risky. Kubernetes workloads, platform services, databases, and supporting components all have dependencies that should be handled gracefully.

To address this, Broadcom now provides a supported shutdown utility that orchestrates the entire process.

Why a Graceful Shutdown Matters

Modern VMware Cloud Foundation environments contain multiple interconnected services that continuously exchange data and maintain cluster state.

An unplanned or poorly coordinated shutdown can result in:

  • Incomplete database transactions
  • Kubernetes workload inconsistencies
  • Service startup delays
  • Failed platform services after reboot
  • Additional troubleshooting during recovery

For planned maintenance windows, power outages, datacenter migrations, or infrastructure upgrades, a controlled shutdown process is strongly recommended.

Introducing the VCF Services Runtime Shutdown Utility

Starting with VCF Services Runtime 9.1, administrators can use the following utility:

./vcf_services_runtime_shutdown.sh

The script performs an orchestrated shutdown of the entire runtime cluster by communicating directly with the VCF Services Runtime APIs.

During execution, the utility:

  • Performs pre-flight validation checks
  • Discovers runtime cluster nodes
  • Identifies active workloads
  • Handles service dependencies automatically
  • Preserves recovery information
  • Coordinates shutdown sequencing
  • Powers down nodes in the correct order

This significantly reduces the operational risk associated with manual shutdown procedures.

Where to Download the Script

The shutdown utility is provided directly by Broadcom and can be downloaded from the official Knowledge Base article:

Broadcom KB 440874

The article contains:

  • The latest version of the shutdown script
  • Detailed prerequisites
  • Usage examples
  • Supported command-line parameters
  • Recovery guidance

After downloading the script, make it executable:

chmod +x vcf_services_runtime_shutdown.sh

Prerequisites

Before executing the script, ensure the following tools are available:

curl
jq
govc
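The tool requirement above can be verified before the maintenance window starts. The following is a minimal sketch, not part of the Broadcom utility; the function name check_tools is made up for illustration, and it only confirms that each command is on the PATH, not that its version is suitable.

```shell
#!/usr/bin/env bash
# Sketch only: check_tools is a hypothetical helper, not part of the Broadcom script.
# Prints one status line per tool and returns non-zero if any tool is missing.
check_tools() {
  local tool missing=0
  for tool in "$@"; do
    # command -v succeeds only when the shell can resolve the command
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

check_tools curl jq govc || echo "Install the missing tools before running the shutdown script."
```

Running this on the workstation you plan to use for the shutdown gives a quick answer before you rely on it during the actual maintenance window.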

You will also need:

  • Network access to a VCF Services Runtime Control Plane node
  • Connectivity to port 5480
  • Appropriate administrative credentials
  • Access to the runtime kubeconfig file

The script can be executed from any management workstation that satisfies these requirements.
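Two of these requirements, reachability of port 5480 and access to the kubeconfig file, can be checked from the workstation in advance. The sketch below is an illustration under assumptions: the function names and the NODE_IP and KUBECONFIG_FILE variables are hypothetical, and it relies on bash's /dev/tcp redirection and the coreutils timeout command being available. It only tests that a TCP connection opens; it does not validate credentials or the contents of the kubeconfig.

```shell
#!/usr/bin/env bash
# Sketch only: helper and variable names are hypothetical.

# Succeeds when the given path is a readable, non-empty file.
kubeconfig_readable() {
  [ -r "$1" ] && [ -s "$1" ]
}

# Succeeds when a TCP connection to host ($1) and port ($2) opens within 5 seconds.
# Uses bash's /dev/tcp pseudo-device inside a child shell bounded by `timeout`.
port_open() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Runs only when NODE_IP is set, for example:
#   NODE_IP=<CONTROL_PLANE_IP> KUBECONFIG_FILE=<kubeconfig-file> bash preflight.sh
if [ -n "${NODE_IP:-}" ]; then
  port_open "$NODE_IP" 5480 || echo "cannot reach ${NODE_IP}:5480"
  kubeconfig_readable "${KUBECONFIG_FILE:-}" || echo "kubeconfig missing or unreadable"
fi
```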

Performing a Dry Run

Before initiating an actual shutdown, Broadcom recommends validating the environment using the dry-run option.

Example:

./vcf_services_runtime_shutdown.sh \
  --node-ip <CONTROL_PLANE_IP> \
  --dry-run \
  --kubeconfig <kubeconfig-file>

A dry run allows administrators to verify connectivity, permissions, and cluster discovery without impacting running services.

This is especially useful when performing the procedure for the first time or when preparing for a large maintenance event.
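If the dry run is part of a runbook, it can be wrapped so that its result is recorded explicitly. The sketch below uses only the three parameters shown in the example above. It assumes the utility returns a non-zero exit code when a check fails, which should be confirmed against KB 440874; the function name run_dry_run and the environment variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the shutdown utility signals failure through its exit code.
# Arguments: path to the script, control plane node IP, kubeconfig file.
run_dry_run() {
  local script=$1 node_ip=$2 kubeconfig=$3
  if "$script" --node-ip "$node_ip" --dry-run --kubeconfig "$kubeconfig"; then
    echo "dry run passed"
  else
    local rc=$?   # exit code of the utility itself
    echo "dry run failed with exit code $rc" >&2
    return "$rc"
  fi
}

# Runs only when the hypothetical SHUTDOWN_SCRIPT variable is set, for example:
#   SHUTDOWN_SCRIPT=./vcf_services_runtime_shutdown.sh NODE_IP=<CONTROL_PLANE_IP> \
#   KUBECONFIG_FILE=<kubeconfig-file> bash dry_run.sh
if [ -n "${SHUTDOWN_SCRIPT:-}" ]; then
  run_dry_run "$SHUTDOWN_SCRIPT" "${NODE_IP:-}" "${KUBECONFIG_FILE:-}"
fi
```

The wrapper deliberately stops after the dry run. Starting the real shutdown remains a separate, manual step.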

Typical Use Cases

The shutdown utility is particularly useful for:

  • Datacenter power maintenance
  • UPS replacement projects
  • Infrastructure migrations
  • Hardware refresh activities
  • Full-environment backup operations
  • Disaster recovery testing
  • Planned maintenance windows

In each scenario, the automated workflow provides a safer alternative to manually shutting down runtime nodes.

Benefits Compared to Manual Shutdown Procedures

Historically, VMware administrators often relied on lengthy shutdown runbooks that required services to be powered down in a very specific sequence.

These procedures were:

  • Time-consuming
  • Error-prone
  • Difficult to troubleshoot
  • Dependent on administrator experience

The new Runtime Shutdown Utility simplifies operations considerably.

Traditional Approach          | Automated Runtime Shutdown
Manual service sequencing     | Fully orchestrated workflow
Dependency tracking required  | Dependencies handled automatically
Higher risk of human error    | Consistent execution
Longer maintenance windows    | Faster operational workflow
More troubleshooting          | Predictable recovery process

For organizations operating production VCF environments, this can significantly reduce operational complexity.

Operational Best Practices

Before initiating a shutdown, verify that:

  • No lifecycle operations are currently running
  • SDDC Manager tasks have completed
  • Backup jobs have finished successfully
  • Active maintenance activities are complete
  • Stakeholders have been notified

Remember that all services hosted within the VCF Services Runtime cluster will become unavailable during the shutdown process.

Proper communication and planning remain essential.
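The checklist above can also be turned into an explicit confirmation gate in a runbook script. This is a small sketch with a hypothetical helper, confirm_checklist. It does not query SDDC Manager or any backup system; it only asks the operator to confirm each item and stops at the first one that is not answered with "y".

```shell
#!/usr/bin/env bash
# Sketch only: confirm_checklist is a hypothetical helper that records operator
# confirmation. It does not verify the state of the environment.
confirm_checklist() {
  local item answer
  for item in "$@"; do
    printf '%s? [y/N] ' "$item" >&2
    read -r answer || answer=""   # treat end of input as "no"
    case "$answer" in
      y|Y) ;;
      *) echo "not confirmed: $item"; return 1 ;;
    esac
  done
  echo "all items confirmed"
}

# Example call (commented out so the file can be sourced without prompting):
# confirm_checklist \
#   "No lifecycle operations running" \
#   "SDDC Manager tasks completed" \
#   "Backup jobs finished successfully" \
#   "Maintenance activities complete" \
#   "Stakeholders notified"
```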

Recovery Considerations

Broadcom has also published a companion article that explains how to recover services if the VCF Management Services Runtime cluster does not fully recover after startup.

Recovery KB:

Broadcom KB 440862

While most environments should start cleanly after a controlled shutdown, it is always worth bookmarking the recovery documentation before beginning any maintenance activity.

Final Thoughts

The introduction of the vcf_services_runtime_shutdown.sh utility is one of those operational improvements that administrators will immediately appreciate.

Instead of relying on manual runbooks and shutdown sequences, VMware Cloud Foundation 9.1 now provides a supported, automated, and repeatable process for safely shutting down an entire VCF Services Runtime cluster.

If you are operating VCF 9.1 or later, this procedure should become part of your standard maintenance and datacenter shutdown runbook. It reduces operational risk, simplifies maintenance activities, and helps ensure a predictable startup experience when services are brought back online.

References

Broadcom Knowledge Base Article 440874
https://knowledge.broadcom.com/external/article/440874/how-to-safely-shutdown-all-nodes-within.html

Broadcom Recovery Procedure for VCF Management Services Runtime
https://knowledge.broadcom.com/external/article/440862/vcf-management-services-cluster-or-the-v.html
