Aks upgrade

From UVOO Tech Wiki
Revision as of 17:37, 4 February 2025 by Busk

# Azure Kubernetes Service (AKS) Upgrade Guide
Current Version: 1.29.7 - Upgrading to Newer Versions

## Table of Contents
- [Prerequisites](#prerequisites)
- [Pre-upgrade Checklist](#pre-upgrade-checklist)
- [Upgrade Process](#upgrade-process)
- [Rollback Procedure](#rollback-procedure)
- [Monitoring and Verification](#monitoring-and-verification)
- [Troubleshooting](#troubleshooting)

## Prerequisites

### Required Access and Permissions
- Azure CLI installed and configured
- kubectl installed and configured
- Cluster Admin access
- Access to Azure subscription
- Required RBAC permissions for resource groups

### Environment Validation
```bash
# Verify current cluster version (should be 1.29.7)
az aks show --resource-group <resource-group> --name <cluster-name> --output table

# Check for available upgrades from 1.29.7
az aks get-upgrades --resource-group <resource-group> --name <cluster-name>

# Review the supported versions for upgrade
az aks get-versions --location <location> --output table

# Confirm the target version appears in the get-upgrades output above.
# Note: a supported AKS cluster cannot skip minor versions; upgrade one
# minor version at a time (for example 1.29 -> 1.30 -> 1.31).
```

### Version Compatibility Matrix

Current version: 1.29.7

Potential upgrade paths:

- Patch upgrades:
  - 1.29.x (patch updates within the same minor version)
- Minor version upgrades (one minor version per upgrade):
  - 1.30.x (next minor version, and the only minor version reachable directly from 1.29.x)
  - 1.31.x (reachable only after the cluster is on 1.30.x)
  - 1.32.x (reachable only after the cluster is on 1.31.x)
  - 1.33.x (future minor version, when available)
- Major version upgrades:
  - Not applicable until version 2.x is released

Note: Always check the AKS version support policy for the most current information about supported versions and upgrade paths.
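
The one-minor-version-at-a-time rule can be checked locally before running an upgrade. The following is a minimal sketch in plain shell; the helper names `major_of`, `minor_of`, and `is_valid_step` are hypothetical and not part of the Azure CLI, and the output of `az aks get-upgrades` remains the authoritative answer.

```bash
# Hypothetical helpers for illustration only.

# Print the major component of a version string, e.g. 1.29.7 -> 1.
major_of() {
    echo "$1" | cut -d. -f1
}

# Print the minor component of a version string, e.g. 1.29.7 -> 29.
minor_of() {
    echo "$1" | cut -d. -f2
}

# Succeed when the target has the same major version and is either on the
# same minor version (patch upgrade) or exactly one minor version ahead.
# Patch ordering is not checked here.
is_valid_step() {
    current="$1"
    target="$2"
    [ "$(major_of "$current")" = "$(major_of "$target")" ] || return 1
    step=$(( $(minor_of "$target") - $(minor_of "$current") ))
    [ "$step" -eq 0 ] || [ "$step" -eq 1 ]
}

# Example: is_valid_step 1.29.7 1.30.3 && echo "direct upgrade possible"
```

With this check, moving from 1.29.7 to 1.31.x is reported as invalid, which is the cue to plan an intermediate stop on 1.30.x.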

## Pre-upgrade Checklist

### 1. Resource Assessment

- [ ] Document current node pool configuration
- [ ] List all namespaces and their resource usage
- [ ] Review PodDisruptionBudgets (PDBs)
- [ ] Check StorageClass configurations
- [ ] Validate StatefulSet configurations
- [ ] Review CustomResourceDefinitions (CRDs)

### 2. Workload Analysis

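```bash
# List the common workload types across namespaces
# ("all" excludes ConfigMaps, Secrets, PVCs, and custom resources)
kubectl get all --all-namespaces -o wide

# List the API groups and versions registered with the API server
kubectl get apiservices

# Check whether any deprecated APIs have been requested
kubectl get --raw /metrics | grep apiserver_requested_deprecated_apis

# Review PodDisruptionBudgets
kubectl get pdb --all-namespaces
```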
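
Deprecated APIs can also be caught in source manifests before they reach the cluster. The sketch below assumes manifests are kept in a local directory; `REMOVED_APIS` and `scan_manifests` are hypothetical names. The single entry in the list (`flowcontrol.apiserver.k8s.io/v1beta3`, no longer served from Kubernetes 1.32) is only a starting point and should be extended from the Kubernetes deprecation guide for the chosen target version.

```bash
# API versions to flag (space-separated). Extend as needed for the target version.
REMOVED_APIS="flowcontrol.apiserver.k8s.io/v1beta3"

# Print file:line:match for every manifest under a directory that uses a
# flagged apiVersion. Returns 0 when at least one match is found, 1 otherwise.
scan_manifests() {
    dir="$1"
    found=1
    for api in $REMOVED_APIS; do
        if grep -rn "apiVersion: *$api" "$dir"; then
            found=0
        fi
    done
    return "$found"
}

# Example: scan_manifests ./manifests && echo "update these manifests first"
```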

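### 3. Backup Critical Components

Export the Kubernetes resources before upgrading:

```bash
# Back up the common workload resources ("all" does not cover every resource type)
kubectl get all --all-namespaces -o yaml > pre_upgrade_backup.yaml

# Back up specific resource types, including those that "all" omits
# (the output contains Secret data, so store the file securely)
kubectl get statefulset,pv,pvc,configmap,secret --all-namespaces -o yaml > critical_resources_backup.yaml
```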
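
Writing one file per resource type makes a backup easier to inspect and restore selectively. This is a sketch under assumptions: `BACKUP_TYPES` and `backup_resources` are hypothetical names, and the list of types should be adjusted to what the cluster actually runs (for example, add CRD-backed resources).

```bash
# Resource types to export, one YAML file per type.
BACKUP_TYPES="deployment statefulset daemonset service configmap secret pvc pv"

# Export each resource type across all namespaces into <outdir>/<type>.yaml.
backup_resources() {
    outdir="$1"
    mkdir -p "$outdir"
    for kind in $BACKUP_TYPES; do
        kubectl get "$kind" --all-namespaces -o yaml > "$outdir/$kind.yaml"
    done
}

# Example: backup_resources "./backup-$(date +%Y%m%d)"
```

The `secret.yaml` file holds base64-encoded secret values, so the output directory needs the same protection as the secrets themselves.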

## Upgrade Process

### 1. Pre-upgrade Steps

```bash
# Validate cluster health
az aks show -g <resource-group> -n <cluster-name> --query 'provisioningState'

# Check node status
kubectl get nodes
kubectl describe nodes

# Verify pod health
kubectl get pods --all-namespaces -o wide
```
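
The node check can be turned into a quick gate that lists only the nodes needing attention. A minimal sketch, assuming the default column layout of `kubectl get nodes` (NAME, STATUS, ...); `not_ready_nodes` is a hypothetical helper name.

```bash
# Read 'kubectl get nodes --no-headers' output on stdin and print the names
# of nodes whose STATUS column is not exactly "Ready". Cordoned nodes
# ("Ready,SchedulingDisabled") are reported as well.
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}

# Example: kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node is Ready and schedulable, which is a reasonable precondition for starting the upgrade.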

### 2. Control Plane Upgrade

```bash
# Start the upgrade to the target version (replace <target-version> with the desired version)
az aks upgrade \
    --resource-group <resource-group> \
    --name <cluster-name> \
    --kubernetes-version <target-version> \
    --control-plane-only

# Monitor upgrade progress
az aks show -g <resource-group> -n <cluster-name> --query 'currentKubernetesVersion'
```
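
Rather than re-running the status query by hand, the cluster state can be polled until the operation settles. The sketch below is one possible approach; `wait_for_cluster` is a hypothetical helper, and the default of 60 attempts at 30-second intervals is an arbitrary assumption to tune per cluster.

```bash
# Poll the cluster until provisioningState is Succeeded or Failed.
# Arguments: resource group, cluster name, max attempts, seconds between polls.
# Returns 0 on Succeeded, 1 on Failed, 2 on timeout.
wait_for_cluster() {
    rg="$1"; name="$2"; attempts="${3:-60}"; interval="${4:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        state=$(az aks show -g "$rg" -n "$name" --query provisioningState -o tsv)
        echo "provisioningState: $state"
        case "$state" in
            Succeeded) return 0 ;;
            Failed)    return 1 ;;
        esac
        i=$((i + 1))
        sleep "$interval"
    done
    echo "timed out waiting for $name" >&2
    return 2
}

# Example: wait_for_cluster <resource-group> <cluster-name> && echo "control plane ready"
```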

### 3. Node Pool Upgrade

```bash
# Upgrade node pools one at a time to the same version as the control plane
az aks nodepool upgrade \
    --resource-group <resource-group> \
    --cluster-name <cluster-name> \
    --name <nodepool-name> \
    --kubernetes-version <target-version>

# Monitor node pool status
az aks nodepool show \
    --resource-group <resource-group> \
    --cluster-name <cluster-name> \
    --name <nodepool-name> \
    --query provisioningState
```
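
For clusters with several node pools, the one-at-a-time upgrade can be scripted. This is a sketch, not a hardened procedure: `upgrade_all_nodepools` is a hypothetical helper, it upgrades pools in the order the CLI lists them, and it stops at the first failure so the remaining pools stay on the old version.

```bash
# Upgrade every node pool in a cluster sequentially.
# Arguments: resource group, cluster name, target Kubernetes version.
upgrade_all_nodepools() {
    rg="$1"; cluster="$2"; version="$3"
    pools=$(az aks nodepool list \
        --resource-group "$rg" \
        --cluster-name "$cluster" \
        --query '[].name' -o tsv)
    for pool in $pools; do
        echo "Upgrading node pool $pool to $version"
        # Without --no-wait the command blocks until this pool is done.
        az aks nodepool upgrade \
            --resource-group "$rg" \
            --cluster-name "$cluster" \
            --name "$pool" \
            --kubernetes-version "$version" || return 1
    done
}

# Example: upgrade_all_nodepools <resource-group> <cluster-name> <target-version>
```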

### 4. Post-upgrade Validation

```bash
# Verify client and server versions
# (the --short flag was removed in kubectl 1.28; the default output is now concise)
kubectl version

# Check node versions
kubectl get nodes -o wide

# List pods that are not Running or Completed
kubectl get pods --all-namespaces | grep -v "Running\|Completed"

# Validate StorageClass functionality
kubectl get storageclass

# Check StatefulSet status
kubectl get statefulset --all-namespaces
```
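
The node version check can be made explicit by comparing each node's kubelet version against the expected one. A minimal sketch, assuming the default `kubectl get nodes` columns (NAME, STATUS, ROLES, AGE, VERSION); `nodes_off_version` is a hypothetical helper name.

```bash
# Read 'kubectl get nodes --no-headers' on stdin and print "name version" for
# each node whose VERSION column (5th field) differs from the expected value.
# Exit status is 0 only when every node matches.
nodes_off_version() {
    awk -v want="$1" '$5 != want { print $1, $5; bad = 1 } END { exit bad }'
}

# Example: kubectl get nodes --no-headers | nodes_off_version v<target-version>
```

Note that the VERSION column carries a leading `v`, so the expected value must include it.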

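## Rollback Procedure

### Emergency Rollback Steps

AKS does not support downgrading the Kubernetes version of a control plane or of an existing node pool; an `az aks upgrade` or `az aks nodepool upgrade` request with a lower version is rejected. Plan the rollback around that constraint:

- Upgrade the control plane first (`--control-plane-only`) and validate it before touching the node pools, which keep running `<previous-version>` until they are upgraded.
- If workloads fail on upgraded nodes, add a replacement node pool on the previous version (it must still be supported and within the allowed version skew of the control plane) and move the workloads to it.
- If the control plane itself must go back, create a new cluster on `<previous-version>` and restore the workloads from the pre-upgrade backups.

```bash
# Add a replacement node pool on the previous version
az aks nodepool add \
    --resource-group <resource-group> \
    --cluster-name <cluster-name> \
    --name <new-nodepool-name> \
    --kubernetes-version <previous-version>
```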
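
Because AKS rejects requests that lower the Kubernetes version, it can help to catch an accidental downgrade in automation before the CLI call is made. The following is a small sketch; `is_downgrade` is a hypothetical helper that compares plain `major.minor.patch` strings and does not handle pre-release suffixes.

```bash
# Succeed when the target version is lower than the current version,
# comparing major, minor, and patch in that order.
is_downgrade() {
    cur="$1"; tgt="$2"
    for field in 1 2 3; do
        c=$(echo "$cur" | cut -d. -f"$field")
        t=$(echo "$tgt" | cut -d. -f"$field")
        if [ "$t" -lt "$c" ]; then
            return 0
        elif [ "$t" -gt "$c" ]; then
            return 1
        fi
    done
    return 1
}

# Example: is_downgrade 1.30.3 1.29.7 && echo "downgrade requested; plan a new cluster or node pool instead"
```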

## Monitoring and Verification

### Key Metrics to Monitor

- Node health status
- Pod scheduling and distribution
- Resource utilization
- API server response times
- etcd performance
- Network connectivity

### Monitoring Commands

```bash
# Monitor node conditions
kubectl get nodes -w

# Watch pod status across all namespaces
kubectl get pods --all-namespaces -w

# Check system pod status
kubectl get pods -n kube-system

# Monitor events
kubectl get events --all-namespaces --sort-by='.metadata.creationTimestamp'
```
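
Watching every pod is noisy on a busy cluster, so a filter that surfaces only the unhealthy ones can be more practical. A minimal sketch, assuming the default column layout of `kubectl get pods --all-namespaces` (NAMESPACE, NAME, READY, STATUS, ...); `unhealthy_pods` is a hypothetical helper name.

```bash
# Read 'kubectl get pods --all-namespaces --no-headers' on stdin and print
# "namespace/name status" for pods that are neither Running nor Completed.
unhealthy_pods() {
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Example: kubectl get pods --all-namespaces --no-headers | unhealthy_pods
```

The event stream can be narrowed in a similar way with `kubectl get events --all-namespaces --field-selector type=Warning`.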

## Troubleshooting

### Common Issues and Solutions

1. Pod Scheduling Issues

   ```bash
   # Check pod status and events
   kubectl describe pod <pod-name> -n <namespace>

   # Verify node capacity and allocation
   kubectl describe node <node-name>
   ```

2. Storage Issues

   ```bash
   # Check PV/PVC status
   kubectl get pv,pvc --all-namespaces
   kubectl describe pv <pv-name>
   kubectl describe pvc <pvc-name> -n <namespace>
   ```

3. Network Issues

   ```bash
   # Verify network policies
   kubectl get networkpolicies --all-namespaces

   # Check service endpoints
   kubectl get endpoints -A
   ```
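
When checking service endpoints after an upgrade, the entries of interest are usually those with no backing pods. The sketch below assumes the default `kubectl get endpoints -A` columns (NAMESPACE, NAME, ENDPOINTS, AGE), where an empty set is shown as `<none>`; `empty_endpoints` is a hypothetical helper name.

```bash
# Read 'kubectl get endpoints -A --no-headers' on stdin and print
# "namespace/name" for entries whose ENDPOINTS column is <none>.
empty_endpoints() {
    awk '$3 == "<none>" { print $1 "/" $2 }'
}

# Example: kubectl get endpoints -A --no-headers | empty_endpoints
```

Some services have no endpoints by design (for example, services without a selector), so the output is a list to review rather than a list of failures.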

## Support Resources

## Notes

- Maintain at least N+1 node capacity during upgrades
- Schedule upgrades during low-traffic periods
- Consider using cluster autoscaler during upgrade
- Monitor application-specific metrics
- Keep backup of all critical configurations
- Test upgrade procedure in dev/staging environment first
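
Spare capacity during a node pool upgrade is typically provided by surge nodes, configured per pool with `az aks nodepool update --max-surge <value>`. The sketch below estimates how many extra nodes a percentage setting adds, assuming fractional results are rounded up as the AKS documentation describes for percentage values; `surge_nodes` is a hypothetical helper name.

```bash
# Print the number of surge nodes for a pool of <count> nodes and a
# max-surge percentage, rounding any fraction up.
surge_nodes() {
    count="$1"; percent="$2"
    echo $(( (count * percent + 99) / 100 ))
}

# Example: surge_nodes 5 33   # a 5-node pool at 33% max surge
```

The result also indicates how much regional quota and subnet address space must be free before the upgrade starts.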