Building an Internal Kubernetes Platform

An Internal Kubernetes Platform is a bespoke service offered to employees of an organization that provides access to a Kubernetes environment. The main differentiator between an Internal Kubernetes Platform and a public Kubernetes platform such as GKE is that an Internal Kubernetes Platform is built and designed for a specific organization. However, public platforms often play the crucial role of providing a battle-tested foundation that an Internal Kubernetes Platform can be built on top of.

An Internal Kubernetes Platform can (and should) be thought of as an internal product with users of the platform (e.g. developer or application teams) being the platform’s customers. A key goal here is to abstract organizational complexities away from customers and provide a well-trodden path for using Kubernetes. For organizations that are committed to using Kubernetes at scale, investment into an Internal Kubernetes Platform is critical to achieving efficient innovation.

Our cloud native experts have worked with a number of organizations to design, develop and support Internal Kubernetes Platforms from the ground up, as well as build key extensions to existing platforms. Due to the bespoke nature of these platforms, the variation in strategies is large and depends heavily on the requirements, goals and resources of the organization. In this post, however, we present some of our opinions on the benefits, key considerations and pitfalls of designing and building such a platform.

Benefits

There are a huge number of benefits that can come from building an Internal Kubernetes Platform; here we describe the ones that we have found to be the most impactful.

Streamlined access to accredited infrastructure

Even with the extensive capabilities offered by public Kubernetes platforms, gaining access to accredited Kubernetes infrastructure can be incredibly challenging for developers at some organizations. This could be due to security restrictions requiring that clusters are configured in a very particular way, complex network topologies requiring complicated network or proxy setups, or even more fundamental requirements such as installing and configuring kubectl according to organizational policy.

An internal platform can provide a tested, well-documented, supported and ideally self-service process for gaining access to compliant Kubernetes clusters, taking into account and abstracting away from the particular idiosyncrasies of the organization.

Consistent policy enforcement

Many organizations need to implement large numbers of controls in order to meet their security and compliance requirements. In addition, there are typically organizational policies that need to be enforced to ensure configuration standards. Managing clusters through an Internal Kubernetes Platform provides a powerful opportunity for these policies to be defined centrally and applied consistently, with input from across the organization and with guidance for customers on how to comply.

We have found that Rego and Gatekeeper coupled with OPA’s testing framework, constraint generation with Konstraint and a GitOps tool such as Flux is a powerful combination for writing, testing, generating, distributing and enforcing policies across a fleet of managed clusters.
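To make the "write and test" part of that workflow concrete, here is a minimal sketch of the kind of rule such a policy might express, together with the shape of its check. It is written in plain Python purely for illustration; in the setup described above the rule itself would be written in Rego and enforced by Gatekeeper. The required label names and the function names are hypothetical assumptions, not taken from any real policy library.

```python
from typing import Any

# Hypothetical organizational standard: every Namespace must carry these labels.
REQUIRED_NAMESPACE_LABELS = {"team", "cost-centre"}


def missing_labels(obj: dict[str, Any], required: set[str]) -> list[str]:
    """Return the required labels that are absent from a Kubernetes object, sorted."""
    labels = (obj.get("metadata") or {}).get("labels") or {}
    return sorted(required - set(labels))


def violations(obj: dict[str, Any]) -> list[str]:
    """Return human-readable violations; an empty list means the object passes."""
    if obj.get("kind") != "Namespace":
        return []  # this illustrative rule only applies to Namespaces
    name = (obj.get("metadata") or {}).get("name", "<unnamed>")
    return [
        f"Namespace {name} is missing required label '{label}'"
        for label in missing_labels(obj, REQUIRED_NAMESPACE_LABELS)
    ]
```

Keeping each rule as a small function from an object to a list of violation messages makes it straightforward to unit test against example manifests before it is distributed to clusters, which is the same habit the OPA testing framework encourages for Rego.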

Guidance, support and community

For very large organizations, silos of information are common. Investment into an Internal Kubernetes Platform can help centralize and distribute organizational best-practices and provide a basis for an internal community, providing guidance and support around topics such as:

  • Security and compliance
  • Application deployment archetypes
  • Networking
  • Monitoring and alerting
  • Incident response
  • Change management
  • Cost optimization

Managed addons

Public Kubernetes platforms aim to cater for a huge range of Kubernetes use cases across many organizations; an Internal Kubernetes Platform has the advantage that it only needs to cater for the needs of a single organization. This means that managed clusters can bring many extensions out of the box for customers to use that are configured specifically for that organization. Some powerful examples include:

All of these extensions can be configured or consumed through built-in or custom Kubernetes resources which can be coupled with Gatekeeper policies to allow the platform team to control which functionality to expose and support for customers.

Key considerations

Here we discuss some key considerations when building an Internal Kubernetes Platform.

Cost-benefit analysis

Building an Internal Kubernetes Platform is a significant investment for any organization. The main factor that determines whether this investment will pay off is whether the organization can benefit from the economies of scale that a platform can bring by handling common requirements for many customers; of course, for this to work, the customer base needs to be large enough for these benefits to be realized.

Some rules of thumb for deciding when such an investment would pay off are having more than 15 developers that could onboard onto the platform, or having the potential for five or more customer teams. However, the true turning point depends heavily on the complexity of the environment and so is specific to each organization. For example, if it is likely to take months to provision a compliant cluster for the average developer team, investment into a centralized solution pays off much more quickly per customer than it would at an organization with more lenient compliance standards.
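The dependence on environment complexity can be illustrated with a deliberately simple break-even model. This is a sketch under stated assumptions: all costs are in engineer-weeks, every figure is hypothetical, and the model ignores ongoing support costs and benefits that are hard to quantify.

```python
import math
from typing import Optional


def break_even_teams(
    platform_cost: float,
    per_team_diy_cost: float,
    per_team_onboarding_cost: float,
) -> Optional[int]:
    """Smallest number of customer teams at which the platform pays for itself.

    platform_cost: one-off cost of building the platform (hypothetical figure).
    per_team_diy_cost: what a team would spend obtaining a compliant cluster alone.
    per_team_onboarding_cost: what a team spends onboarding onto the platform.
    Returns None when onboarding saves nothing, so the platform never breaks even.
    """
    saving_per_team = per_team_diy_cost - per_team_onboarding_cost
    if saving_per_team <= 0:
        return None
    return math.ceil(platform_cost / saving_per_team)


# Strict compliance: going it alone is expensive, so few teams are needed.
strict = break_even_teams(platform_cost=200, per_team_diy_cost=50, per_team_onboarding_cost=2)

# Lenient compliance: going it alone is cheap, so many more teams are needed.
lenient = break_even_teams(platform_cost=200, per_team_diy_cost=10, per_team_onboarding_cost=2)

print(strict, lenient)  # prints: 5 25
```

With these made-up numbers the same platform breaks even at 5 teams in the strict environment but needs 25 teams in the lenient one, which is the effect described above.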

With these factors in mind, organizations must determine whether an Internal Kubernetes Platform is the right option or whether it would be more valuable to invest into other areas such as directly into developer automation or tooling.

Level of abstraction

When designing an Internal Kubernetes Platform, perhaps the most technically significant consideration is the level of abstraction you wish to offer, mainly because once you start onboarding customers it can be very disruptive to change. A popular pattern is for the platform team to run large multi-tenanted clusters and provide self-service access to Namespaces. Another option is to offer a more serverless experience using a tool such as Knative.
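As a rough illustration of the Namespace-per-tenant pattern, the sketch below builds the objects a self-service onboarding process might create for a new team: a Namespace, a ResourceQuota and a RoleBinding to the built-in edit ClusterRole. The object kinds and fields are standard Kubernetes API; the function name, object names, label key, group naming scheme and quota defaults are all hypothetical assumptions.

```python
from typing import Any


def tenant_manifests(team: str, cpu: str = "8", memory: str = "16Gi") -> list[dict[str, Any]]:
    """Build the manifests for one tenant of a shared cluster (illustrative only)."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": team, "labels": {"team": team}},
    }
    # Cap the total resources the tenant can request in its Namespace.
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": team},
        "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory}},
    }
    # Grant the team's group (hypothetical naming scheme) edit rights,
    # scoped to its own Namespace only.
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "tenant-edit", "namespace": team},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": "edit",
        },
        "subjects": [
            {
                "apiGroup": "rbac.authorization.k8s.io",
                "kind": "Group",
                "name": f"{team}-developers",
            }
        ],
    }
    return [namespace, quota, binding]
```

Serializing the returned objects (for example with json.dumps) and committing them to the repository watched by a GitOps tool is one way such onboarding could be automated. A real platform would likely add further controls, such as network policies, that are omitted here.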

Yet another model that we have found to work well is to provision a cluster per customer. This level of abstraction comes with higher overhead per customer and potentially requires greater investment into the automation around cluster lifecycle management. However, it offers much stronger isolation between customers and opens up the opportunity for features that are much more difficult to support on multi-tenanted clusters (e.g. allowing customers to deploy and manage their own CustomResourceDefinitions and make other cluster-scoped configuration changes).

Pitfalls

Here we discuss some common pitfalls that we have helped mitigate when working with customers.

Not working with prospective customers early

The success of an Internal Kubernetes Platform is typically proportional to the number of customers that are actively using it. Early collaboration with customers is critical to ensuring the platform prioritizes features that bring the most value and encourage customers to onboard (and remain onboarded); the earlier these discussions happen, the easier it is to build trust and create effective feedback loops.

Platform team becoming a general support team

The platform team is responsible for supporting the platform but is typically not responsible for supporting customer applications. For this reason, customers should ensure platform SLOs and guarantees are appropriate for their own requirements, but they should also have their own SLOs and on-call rota for incident response; the messaging around this shared responsibility model needs to be clear from the start to avoid unwelcome surprises.

Failure to be a product

The platform should be built and managed as an internal product and so should stand up to the same level of scrutiny as public products. This includes the following features:

  • Easy onboarding and offboarding process
  • Clear, well-written documentation with getting started guides; we have found that a great place to start is to structure the platform documentation in a similar way to open source products such as Kubernetes, Istio and Tekton
  • Customer discussion forums
  • Support and feature request process
  • Announcement channels
  • Feature roadmap

Failure to empower the platform team to maximize customer value

There will likely be prospective customers who have very specific feature requirements compared to the requirements of the wider organization. Such features can require high levels of investment to build and support and this may impede prioritizing the features that will bring value to the majority of customers. The platform team needs to have support from management to push back on pressure to implement such features in order to maximize total value.

That being said, this is a very difficult line to tread as it is important not to deter customers from making feature requests; feature requests and customer feedback in general are essential for determining pain points and how to prioritize work, so push back sparingly!

Summary

An Internal Kubernetes Platform can be an incredibly powerful tool for improving technical innovation and development velocity at an organization. By pushing common application requirements down to the platform, the cognitive load of developer teams can be significantly reduced and more time can be spent on building and innovating on business logic. However, such an investment is a serious undertaking and options need to be considered carefully to reduce risk.

Luke Addison is a solutions engineer for cloud native at CyberArk.