Extending Kubernetes with Custom Resource Definitions (CRDs): A Step-by-Step Guide

Kubernetes has emerged as a pivotal platform for container orchestration, and one of its more powerful features is the Custom Resource Definition (CRD), which lets you extend the Kubernetes API with resource types of your own. In this blog post, we will walk through creating and managing a CRD step by step, using a worked example and the commands needed at each stage.

What is a Custom Resource Definition (CRD)?

A Custom Resource Definition (CRD) registers a new resource type with the Kubernetes API server. Once it is registered, you can create, read, update, and delete objects of that type (Custom Resources) with kubectl and the API, just as you would built-in resources such as Pods. On its own, a CRD only stores and validates data; the behavior comes from a controller that reacts to those objects.

Use Case: Managing Database Backups

Imagine you want to manage database backups for your applications using Kubernetes. You can create a Custom Resource Definition for database backups, allowing you to define and manage backup configurations as Kubernetes resources. Here's how you can achieve that.

Step 1: Define the Custom Resource Definition (CRD)

Create a CRD YAML file named databasebackup-crd.yaml to define the schema for the database backup custom resource:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databasebackups.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                databaseName:
                  type: string
                backupSchedule:
                  type: string
                backupRetention:
                  type: integer
  scope: Namespaced
  names:
    plural: databasebackups
    singular: databasebackup
    kind: DatabaseBackup
    shortNames:
    - dbbackup

Apply the CRD to your Kubernetes cluster:

kubectl apply -f databasebackup-crd.yaml

Step 2: Create a Custom Resource

Now that the CRD is defined, you can create custom resources based on it. Create a YAML file named my-database-backup.yaml to define a database backup resource:

apiVersion: example.com/v1
kind: DatabaseBackup
metadata:
  name: my-database-backup
spec:
  databaseName: my-database
  backupSchedule: "0 0 * * *"
  backupRetention: 7

Apply the custom resource to your Kubernetes cluster:

kubectl apply -f my-database-backup.yaml
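
When this manifest is applied, the API server checks the object against the openAPIV3Schema from Step 1 and rejects values of the wrong type. The following is a minimal sketch of an equivalent local pre-flight check, useful for seeing what the schema does and does not enforce. The names SPEC_SCHEMA and validate_spec are hypothetical, and the authoritative validation is always the one the API server performs.

```python
# Hypothetical local pre-flight check that mirrors the field types declared
# in the CRD schema from Step 1. The API server remains the source of truth.
SPEC_SCHEMA = {
    "databaseName": str,
    "backupSchedule": str,
    "backupRetention": int,
}


def validate_spec(spec):
    """Return a list of type errors for a DatabaseBackup spec (empty if none)."""
    errors = []
    for field, expected in SPEC_SCHEMA.items():
        if field not in spec:
            continue  # the CRD in Step 1 marks no field as required
        value = spec[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(
                f"spec.{field}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors


if __name__ == "__main__":
    good = {"databaseName": "my-database",
            "backupSchedule": "0 0 * * *",
            "backupRetention": 7}
    bad = {"databaseName": "my-database", "backupRetention": "seven"}
    print(validate_spec(good))  # []
    print(validate_spec(bad))   # ['spec.backupRetention: expected int, got str']
```

Note that the second example passes without a backupSchedule: because the schema lists no required fields, a missing field is not an error.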

Step 3: Implement a Controller

To manage the lifecycle of the custom resources, you need to implement a custom controller. A controller listens for changes to custom resources and triggers the desired operations. Below is a simplified example of a Python-based controller using the Kubernetes Python client:

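from kubernetes import client, config, watch


def main():
    # Use the pod's service account when running inside the cluster;
    # fall back to the local kubeconfig during development.
    try:
        config.load_incluster_config()
    except config.ConfigException:
        config.load_kube_config()

    api = client.CustomObjectsApi()
    w = watch.Watch()

    # Stream events (ADDED, MODIFIED, DELETED) for DatabaseBackup objects
    # in the "default" namespace.
    for event in w.stream(api.list_namespaced_custom_object,
                          group="example.com", version="v1",
                          namespace="default", plural="databasebackups"):
        cr_object = event['object']
        cr_event_type = event['type']

        # The CRD schema marks no field as required, so read defensively
        # instead of assuming every key is present.
        spec = cr_object.get('spec', {})
        db_name = spec.get('databaseName')
        schedule = spec.get('backupSchedule')
        retention = spec.get('backupRetention')

        print(f"Event: {cr_event_type}")
        print(f"Managing backup for database: {db_name}")
        print(f"Schedule: {schedule}, Retention: {retention} days")

        # Add logic here to handle backups based on the event type and custom resource data


if __name__ == '__main__':
    main()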
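
The controller above leaves the handling logic open. One way to structure it, sketched here as an assumption rather than a prescribed design, is to map each watch event type to a small handler function. The handler names and the strings they return are hypothetical placeholders for real backup operations.

```python
# Hypothetical handlers: each returns a description of the action a real
# controller would perform for that event type.
def on_added(spec):
    return (f"schedule backups for {spec.get('databaseName')} "
            f"at '{spec.get('backupSchedule')}'")


def on_modified(spec):
    return (f"reschedule backups for {spec.get('databaseName')} "
            f"at '{spec.get('backupSchedule')}'")


def on_deleted(spec):
    return f"cancel backups for {spec.get('databaseName')}"


HANDLERS = {
    "ADDED": on_added,
    "MODIFIED": on_modified,
    "DELETED": on_deleted,
}


def handle_event(event):
    """Dispatch one watch event (a dict with 'type' and 'object') to its handler."""
    handler = HANDLERS.get(event["type"])
    if handler is None:
        # Other event types, such as BOOKMARK, carry nothing to act on here.
        return f"ignoring event type {event['type']}"
    spec = event["object"].get("spec", {})
    return handler(spec)


if __name__ == "__main__":
    sample = {
        "type": "ADDED",
        "object": {"spec": {"databaseName": "my-database",
                            "backupSchedule": "0 0 * * *"}},
    }
    print(handle_event(sample))
```

Inside the watch loop, the body would then reduce to a single call such as handle_event(event), which keeps the per-event logic testable without a cluster.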

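To keep the controller running continuously, package it in a container image and run it as a Kubernetes Deployment. The pod's service account needs RBAC permission to list and watch databasebackups in the example.com API group.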

Step 4: Monitoring and Observability

Monitor the controller's health and performance as you would any other workload. Kubernetes events and kubectl describe show what is happening to individual custom resources, while Prometheus and Grafana can track controller metrics over time. The example below uses a ServiceMonitor, which is itself a custom resource provided by the Prometheus Operator, so it assumes the operator is installed in the cluster.

# Example: scraping custom metrics with Prometheus
# Define a ServiceMonitor for your custom controller.
# Assumes a Service labeled app=dbbackup-controller with a port named "metrics".
kubectl apply -f - <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: dbbackup-controller-monitor
spec:
  selector:
    matchLabels:
      app: dbbackup-controller
  endpoints:
    - port: metrics
EOF
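
The ServiceMonitor only tells Prometheus where to scrape; the controller still has to serve metrics over HTTP. The following is a minimal standard-library sketch of a /metrics endpoint in the Prometheus text format. The metric name dbbackup_events_total and port 8000 are assumptions made for illustration, and a production controller would more commonly use a Prometheus client library instead of formatting the output by hand.

```python
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory counter of watch events, keyed by event type.
EVENT_COUNTS = Counter()


def record_event(event_type):
    """Increment the counter for one watch event type (e.g. 'ADDED')."""
    EVENT_COUNTS[event_type] += 1


def render_metrics():
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP dbbackup_events_total Watch events seen by the controller.",
        "# TYPE dbbackup_events_total counter",
    ]
    for event_type, count in sorted(EVENT_COUNTS.items()):
        lines.append(f'dbbackup_events_total{{type="{event_type}"}} {count}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve_metrics(port=8000):
    """Serve /metrics until interrupted; run this in a background thread."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


if __name__ == "__main__":
    record_event("ADDED")
    print(render_metrics(), end="")
```

In the controller, record_event would be called once per watch event, and serve_metrics would run in a separate thread so that it does not block the watch loop.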

Lessons Learned and Best Practices

While extending Kubernetes with CRDs offers tremendous flexibility, it also introduces complexity. Here are a few lessons learned and best practices:

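  • Thorough Validation: Define strict validation rules in your CRD schema (types, required fields, value ranges) so that invalid resources are rejected when they are applied instead of causing runtime issues in the controller.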
  • Robust Error Handling: Implement robust error handling in your controllers to manage failures gracefully.
  • Resource Constraints: Be mindful of resource constraints and performance impacts when dealing with a large number of custom resources.
  • Continuous Monitoring: Continuously monitor your CRDs and controllers to ensure they are functioning as expected.
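
As an illustration of the error-handling advice above, here is a minimal sketch of retrying a failing operation with exponential backoff. The function name retry_with_backoff and its default values are assumptions made for this example; which exceptions are worth retrying depends on the operation.

```python
import time


def retry_with_backoff(operation, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on failure wait base_delay * 2**n seconds and retry.

    Re-raises the last exception once all attempts are used up. The sleep
    parameter is injectable so the function can be tested without waiting.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_backup():
        # Hypothetical operation that fails twice before succeeding.
        calls["count"] += 1
        if calls["count"] < 3:
            raise RuntimeError("backup target unavailable")
        return "backup complete"

    print(retry_with_backoff(flaky_backup, base_delay=0.1))
```

Wrapping each backup operation this way keeps a transient failure from crashing the watch loop, while a persistent failure still surfaces after the final attempt.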

Conclusion

Custom Resource Definitions (CRDs) are a powerful way to extend Kubernetes to meet the specific needs of your applications. By defining custom resources and implementing controllers to manage their lifecycle, you can leverage Kubernetes' robust orchestration capabilities for your bespoke requirements. This blog post provided a step-by-step guide to creating and managing CRDs, along with practical examples and lessons learned. We encourage you to experiment with CRDs to unlock new capabilities within your Kubernetes environment.

Have you implemented CRDs in your Kubernetes cluster? Share your experiences and insights in the comments below!