Many compliance-minded organizations are seeking to capitalize on the benefits of production-grade Kubernetes in air-gapped environments. An air gap (also called air-gapping or a disconnected network) is a security measure that physically isolates a computer network from insecure networks, such as the public internet or insecure local networks. Although air-gapped environments can have varying degrees of internet connectivity, a majority of them have zero internet access.
Air-gapped environments offer many advantages to security-conscious organizations such as federal agencies, law enforcement agencies, and military organizations. Because the network is offline, air-gapped environments can keep critical systems and sensitive information safe from potential data theft or security breaches. Organizations can vet the container images that they allow to run on their clusters, which reduces the risk of a malicious attack and adds another layer of protection. Because images are served locally, organizations are also not exposed to rate limiting on image downloads. Finally, they can operate in low-bandwidth environments or over poor internet connections while keeping their mission-critical applications continuously available.
However, while air-gapped environments offer many security and workflow advantages, they also introduce two new challenges when utilizing production-grade Kubernetes.
Challenge One: Setup Is Manual and Time-Consuming
Organizations looking to implement Kubernetes are faced with a myriad of choices from the cloud-native landscape. To start, organizations not only have to determine which base Kubernetes distribution to use, but also have to choose the supporting services for production-ready operations, such as security, networking, storage, observability, and developer tooling. The cloud-native landscape moves fast and is crowded with options in each of these areas, which makes it difficult to make decisions that protect the organization’s long-term investment.
One of the benefits of Kubernetes is that it radically simplifies and automates infrastructure and application deployment by leveraging declarative APIs, resources, and software repositories. Running Kubernetes in offline, air-gapped environments means having private registries and repositories in place for Kubernetes, Docker, and all of the open-source components an organization needs to run Kubernetes in production.
The organization’s environment will need its own container image repository, Helm chart repository, OS package repository, and add-on repository. It will need to configure all of its workloads to pull their Docker images from its internal registries and repositories. And all of its software and open-source components will need to be tightly integrated, secured, tested for vulnerabilities, and made locally accessible to its application and deployment environment.
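Pointing workloads at an internal registry largely comes down to rewriting image references. The following is a minimal Python sketch of that rewrite, not a description of any particular tool. The registry host `registry.internal.example:5000` and the function names are hypothetical, and the sketch assumes the internal registry mirrors images under the same repository path while dropping the original registry host, so two images with the same path on different public registries would collide.

```python
# Hypothetical internal registry address; replace with the real one.
INTERNAL_REGISTRY = "registry.internal.example:5000"


def split_registry(image: str) -> tuple[str, str]:
    """Split an image reference into (registry, remainder).

    Follows the usual Docker convention: the first path component is a
    registry host only if it contains '.' or ':' or is 'localhost'.
    """
    first, sep, rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    # No explicit registry: Docker Hub is implied, and bare names
    # such as 'nginx' live under the 'library/' namespace.
    remainder = image if sep else f"library/{image}"
    return "docker.io", remainder


def to_internal(image: str, internal_registry: str = INTERNAL_REGISTRY) -> str:
    """Return the same image reference, served from the internal registry."""
    _, remainder = split_registry(image)
    return f"{internal_registry}/{remainder}"


if __name__ == "__main__":
    for ref in ("nginx:1.25", "bitnami/redis:7.2", "quay.io/prometheus/node-exporter:v1.7.0"):
        print(ref, "->", to_internal(ref))
```

The same mapping would have to be applied consistently wherever image references appear, for example in manifests and Helm chart values, which is part of why the setup is so labor-intensive.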
This process is not only highly manual but also involves many steps, which makes it difficult to build a robust production platform that can support mission-critical workloads. And most organizations can’t afford to spend months on the engineering effort necessary to set this up.
Although cloud providers offer out-of-the-box services that are accessible in air-gapped environments, many aren’t built for air-gapped use from the start. Instead, they come with a predetermined set of services, tools, and distributions that an organization is unable to customize or install on its local network to meet its needs, inhibiting its freedom of choice. As a result, organizations are left with holes in their capabilities and delayed production deployments. And this challenge grows in complexity when software needs to be upgraded, and as projects and demand multiply.
Challenge Two: A Lack of Two-Way Connectivity
Kubernetes simplifies and automates many operational tasks by providing a communication path between the control plane and clusters. However, in an air-gapped environment, the control plane may not have easy access to an organization’s clusters because they sit behind a firewall, NAT gateway, or proxy, or within a DMZ. When two-way connectivity isn’t available, the organization has no reliable way to keep clusters running in line with their original specifications, which can lead to more failures, more downtime, and higher operational costs.
Operators need to have centralized visibility of the organization’s entire stack of services. There might be dozens, even hundreds, of pieces that need to work together to form one application. And when something fails, it’s usually at the intersection of these pieces. So it’s important not only to get all of that data from a centralized location, but also to have a consistent way to manage the infrastructure and obtain insights about it.
Without two-way connectivity, it becomes increasingly difficult for the organization to know where clusters exist and how they are performing, and to govern the usage and versions of the cloud-native software that supports its application efforts. If a cluster goes down, the organization can’t troubleshoot problems without losing valuable time. It can’t easily obtain insights on cluster performance to deliver better resource utilization. And if there are dozens of software versions in use and teams that need access to them, it becomes incredibly challenging to create consistency across clusters. All of this leads to a list of operational responsibilities that keeps expanding in scope and complexity.
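To make the version-consistency problem concrete, here is a small Python sketch of the kind of check a central operator would want to run. It assumes an inventory of component versions per cluster has already been collected somehow, which is exactly the step that becomes hard without two-way connectivity. The cluster names, component names, and version strings are hypothetical.

```python
from collections import defaultdict


def find_version_drift(
    inventory: dict[str, dict[str, str]],
    approved: dict[str, str],
) -> dict[str, list[str]]:
    """Return, per cluster, the components that differ from the approved versions.

    inventory maps cluster name -> {component name: installed version}.
    approved maps component name -> the version the organization has vetted.
    Clusters with no deviations are omitted from the result.
    """
    drift: dict[str, list[str]] = defaultdict(list)
    for cluster, components in inventory.items():
        for name, wanted in approved.items():
            found = components.get(name)
            if found != wanted:
                drift[cluster].append(
                    f"{name}: found {found or 'missing'}, approved {wanted}"
                )
    return dict(drift)


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    approved = {"kubernetes": "1.28.4", "ingress-controller": "1.9.5"}
    inventory = {
        "site-a": {"kubernetes": "1.28.4", "ingress-controller": "1.9.5"},
        "site-b": {"kubernetes": "1.27.8"},
    }
    for cluster, issues in find_version_drift(inventory, approved).items():
        print(cluster, issues)
```

The comparison itself is trivial; the difficulty described above lies in gathering a trustworthy inventory from clusters that the central control plane cannot reach.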
Both Kubernetes and air-gapped environments have a lot to offer, but combining them can leave a great deal of value on the table for compliance-minded organizations. Organizations want to take advantage of the scalability and resiliency of Kubernetes to enable a broader set of advanced use cases. However, getting Kubernetes to work in an air-gapped environment is incredibly complex because infrastructure restrictions limit the organization’s ability to deploy and operate it effectively.