Taking Control of Your Database with Kubernetes Operators

Managing stateful applications like databases on Kubernetes can be challenging. Operators encode practical operational knowledge into software that automates management tasks.

Using a MySQL operator simplifies the deployment and scaling of a production-ready MySQL cluster. It also provides high availability by replicating the pods that host the database, and it supports backups and recovery.

InnoDB Cluster

Haroon Khan is a data professional who builds and maintains databases that support high-performance applications. He has over a decade of experience with a variety of technologies, including MySQL, Microsoft SQL Server, DB2, and open source tools.

He is passionate about helping companies adopt a modern data architecture that can handle large data requirements. InnoDB Cluster is a critical component in this effort, providing high availability through group replication. However, implementing it can be challenging if you are not familiar with its various components.

Additionally, InnoDB Cluster may require multiple tools to manage the topology. It may not be suitable for your application if you don’t want to invest development, integration, and testing time in each technology. This is where Haroon’s Kubernetes Compass shines.

He believes that scaling MySQL with InnoDB Cluster should not mean scaling its complexity. That’s why he advocates pairing it with the MySQL Operator for Kubernetes. This combination brings together the power of InnoDB Cluster and the elegance of Kubernetes, simplifying management and orchestration at once.

If an instance is rebooted, it leaves the group and must rejoin before it is added back to the default replica set. To rejoin an instance, you can use the cluster.rejoinInstance() command, which takes the instance's connection URI as a parameter. This command ensures that the instance is ONLINE and can be added to the group.

It also checks that the instance has not diverged from the group (a split-brain scenario) by comparing its executed GTIDs with the executed and purged GTIDs of the other ONLINE members. The command may fail if the instance is unreachable, but you can try again later.
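Because a rejoin attempt can fail while the rebooted instance is still coming up, a script that automates it usually retries. The following is a minimal sketch of that idea, not the operator's own logic. The helper name, the retry counts, and the URI in the usage note are illustrative assumptions; the helper only expects a callable that raises an exception on failure.

```python
import time


def rejoin_with_retry(rejoin, uri, attempts=5, delay_seconds=10.0, sleep=time.sleep):
    """Call rejoin(uri) until it succeeds or the attempts run out.

    `rejoin` is any callable that raises on failure (hypothetical helper
    contract). Returns the number of the attempt that succeeded.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            rejoin(uri)
            return attempt
        except Exception as exc:  # e.g. the instance is still unreachable
            last_error = exc
            if attempt < attempts:
                sleep(delay_seconds)  # wait before the next try
    raise RuntimeError(
        f"could not rejoin {uri} after {attempts} attempts"
    ) from last_error
```

In MySQL Shell's Python mode, where AdminAPI methods use snake_case names, this could be called as rejoin_with_retry(dba.get_cluster().rejoin_instance, "clusteradmin@mysql-1:3306"). The account and host in that URI are placeholders.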
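The GTID comparison can be illustrated with a small sketch. It assumes the classic untagged GTID set text format (uuid:1-5:8,uuid2:3) and treats an instance as diverged when it has executed any transaction the group has not. The function names are hypothetical, and this is a simplified illustration rather than MySQL Shell's actual check; on a real server the built-in GTID_SUBSET() SQL function does this comparison.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:8,uuid2:3' into {uuid: set of transaction numbers}.

    Assumes untagged GTIDs. Intervals are expanded into Python sets, which
    is fine for illustration but wasteful for very large ranges.
    """
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            numbers.update(range(int(start), int(end or start) + 1))
    return result


def has_diverged(instance_executed, group_executed):
    """Return True if the instance holds transactions the group never saw."""
    instance = parse_gtid_set(instance_executed)
    group = parse_gtid_set(group_executed)
    return any(
        not txns <= group.get(uuid, set()) for uuid, txns in instance.items()
    )
```

An instance that is merely behind the group (its set is a subset of the group's) can catch up and rejoin; one with extra transactions cannot be rejoined safely without manual intervention.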

PodDisruptionBudget

A Pod Disruption Budget (PDB) specifies the minimum number or percentage of Pods in a collection that must stay available during voluntary disruptions. It is a valuable tool when dealing with applications that require high availability, such as quorum-based applications or web front ends.

Pod disruption budgets limit how many Pods of a service can be taken down at the same time by node operations such as draining. They are enforced through the eviction API that tools like kubectl drain use, and they complement scaling mechanisms such as the Horizontal Pod Autoscaler in protecting against unnecessary downtime.

To use a PDB, you need to create a YAML file defining the desired availability for your application. Kubernetes maintains a PodDisruptionBudgetStatus object on each PDB that reports its current status; you read it rather than define it. Keep in mind that in rapidly changing situations the reported status may lag behind the actual state.

Pod disruption budgets are useful for applications that need to maintain their availability while cluster upgrades occur. For example, a telecommunications company might want to ensure that its VoIP services remain available during system maintenance or patching.
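The arithmetic behind a budget can be sketched as follows. This is a simplified model, not the controller's code: it assumes the budget is expressed as a minimum available count or percentage, that a percentage is rounded up to a whole Pod as the Kubernetes documentation describes, and that the number of evictions permitted is the healthy Pod count minus the required minimum. The function names are hypothetical.

```python
def desired_healthy(min_available, expected_pods):
    """Translate a minimum given as an int or an 'NN%' string into a Pod count."""
    if isinstance(min_available, str) and min_available.endswith("%"):
        percent = int(min_available[:-1])
        # Integer ceiling division: percentages round up to a whole Pod.
        return (expected_pods * percent + 99) // 100
    return int(min_available)


def disruptions_allowed(min_available, expected_pods, current_healthy):
    """How many Pods may be evicted right now without breaking the budget."""
    required = desired_healthy(min_available, expected_pods)
    return max(0, current_healthy - required)
```

With three healthy replicas and a minimum of two, one Pod can be evicted at a time. With seven replicas and a minimum of 50%, four must stay up, so at most three can be evicted.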
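As a minimal sketch of such a definition, the snippet below builds a PodDisruptionBudget manifest and prints it as JSON, which kubectl apply accepts alongside YAML. The resource name mysql-pdb, the app: mysql label, and the minimum of 2 are assumptions chosen for illustration.

```python
import json


def pod_disruption_budget(name, match_labels, min_available):
    """Build a policy/v1 PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # Keep at least this many matching Pods up during voluntary
            # disruptions; an 'NN%' string is also accepted by Kubernetes.
            "minAvailable": min_available,
            "selector": {"matchLabels": match_labels},
        },
    }


if __name__ == "__main__":
    # Hypothetical values: a three-member database that needs two members up.
    manifest = pod_disruption_budget("mysql-pdb", {"app": "mysql"}, 2)
    print(json.dumps(manifest, indent=2))
```

Saving the output to a file and running kubectl apply -f on it creates the budget; kubectl get pdb then shows the status that Kubernetes maintains for it.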

This allows telcos to implement features that can improve the end-user experience without sacrificing availability. PodDisruptionBudget is also useful for software-as-a-service (SaaS) providers who must balance availability with system updates and maintenance. This is especially true when a SaaS provider needs to maintain availability while launching a new feature or deploying updates to their service.

Replica sets

ReplicaSets are Kubernetes control objects that maintain a specified number of pod replicas. They are great for handling simple scalability needs and can help ensure high availability and resilience.

ReplicaSets also react to deviations from the desired state, such as a pod failing or being deleted manually, by creating new pods to replace the missing ones or terminating excess replicas to bring the system back to the desired state.

ReplicaSets are configured in a YAML manifest file, including specifications for the number of replicas and the pod template. They also use pod labels to identify and select the pods they manage.
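A rough sketch of such a manifest, built as a Python dict and printed as JSON (which kubectl also accepts), is shown below together with a simplified version of the label matching a ReplicaSet performs. The name, labels, and container image are placeholders invented for this example.

```python
import json


def replica_set(name, replicas, labels, image):
    """Build an apps/v1 ReplicaSet manifest as a plain dict."""
    return {
        "apiVersion": "apps/v1",
        "kind": "ReplicaSet",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # The pod template; its labels must satisfy the selector.
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def selects(match_labels, pod_labels):
    """Simplified matchLabels check: every selector pair must be on the pod."""
    return all(pod_labels.get(k) == v for k, v in match_labels.items())


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    labels = {"tier": "frontend", "environment": "prod"}
    print(json.dumps(replica_set("frontend", 3, labels, "example/frontend:1.0"), indent=2))
```

A pod that carries extra labels is still selected, while one missing any of the selector's labels is ignored.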

For example, a ReplicaSet could acquire all pods that carry the labels tier=frontend and environment=prod.

Note that a Kubernetes ReplicaSet is not the same thing as a database replica set. In a database replica set, such as MongoDB's, secondaries track changes by reading the primary's operation log (oplog). When the primary fails, the secondary that is elected as the new primary catches up by applying the logged operations to its data set. Such a replica set can also pre-warm the caches of electable secondaries by mirroring read queries to them.

This reduces the impact of primary elections after a primary outage or during planned maintenance. Kubernetes ReplicaSets, for their part, can spread their pods across multiple nodes and availability zones to enhance redundancy.

Backups

There are many solutions for running stateful databases on Kubernetes, but running them in production also requires day-2 operations such as backups and updates.

Kubernetes Operators fill this gap by automating and simplifying complex, domain-specific operations, including deploying, scaling, and updating databases.

The best MySQL operators are designed to be fully integrated with existing infrastructure-as-code tools and CI/CD pipelines to enable scalable, secure, and production-ready database deployments.

Oracle's MySQL Operator is a self-healing solution that supports MySQL InnoDB Cluster on Kubernetes. It uses a Kubernetes StatefulSet to manage MySQL server instances and assigns each of them a PersistentVolumeClaim for storage.

The operator also deploys MySQL Router pods to route queries to the cluster. The operator is generally available and distributed under the Universal Permissive License (UPL), a permissive license similar to the MIT license.
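To give a sense of the declarative input, here is a sketch that builds an InnoDBCluster custom resource as a dict and prints it as JSON. The field names follow the sample published in the operator's documentation as best recalled and should be checked against the version you install; the cluster name mycluster and the secret name mypwds are placeholders, and the secret holding the root credentials is assumed to exist already.

```python
import json


def innodb_cluster(name, secret_name, instances=3, routers=1):
    """Build an InnoDBCluster resource for Oracle's MySQL Operator.

    Field names are assumed from the operator's published sample manifest.
    """
    return {
        "apiVersion": "mysql.oracle.com/v2",
        "kind": "InnoDBCluster",
        "metadata": {"name": name},
        "spec": {
            "secretName": secret_name,        # Secret with root credentials
            "tlsUseSelfSigned": True,         # let the operator generate TLS certs
            "instances": instances,           # MySQL server pods (StatefulSet)
            "router": {"instances": routers}, # MySQL Router pods
        },
    }


if __name__ == "__main__":
    print(json.dumps(innodb_cluster("mycluster", "mypwds"), indent=2))
```

Changing the instances value and re-applying the resource is how the cluster would be scaled under this model; the operator reconciles the StatefulSet to match.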

The Bitpoke MySQL Operator is an open-source, easy-to-use MySQL operator for Kubernetes that requires only Helm and kubectl to be installed. It is designed to be a simple, stable, and production-ready solution that provides monitoring, replication, backups, upgrades, and other features. The operator uses a declarative YAML file to deploy and scale MySQL clusters on Kubernetes.

Unlike a traditional stored procedure, the abstraction provided by OpenStack's nova-conductor service prevents direct access to the database. This reduces the risk of attacks on the data. However, it is still possible to attack the database through buffer overflows, memory leaks, and malicious software.
