Kubernetes Infrastructure Requirements

About Kubernetes

Kubernetes is an open-source system that lets you automate the deployment and management of applications. To support our clients who have Kubernetes instances, or clusters, we have made Hyperscience available as a Kubernetes application.

Kubernetes is known for being flexible and modular in its implementation, due in large part to the packaging of its applications. No Kubernetes application is deployed on its own. Rather, each application runs with a set of libraries and dependencies in a structure called a container. Shipping applications in this way allows them to run alongside other applications and in a variety of environments without compatibility concerns.

When an application runs, its code runs in its container, which resides in a pod. Pods generally house containers that work together and share resources, but they can also hold a single container. A pod represents the smallest executable, or runnable, unit in a Kubernetes cluster. External traffic can reach a pod only if an ingress is created for it.
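For illustration only, a minimal pod manifest appears below. It wraps a single container; the names and image are generic placeholders and are not part of a Hyperscience deployment.

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: example-container
    image: nginx:1.25   # the application code and its dependencies ship inside this image
    ports:
    - containerPort: 80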

Through the automatic creation of new pods, Kubernetes supports the scaling of resources as demand increases. Similarly, if demand decreases, Kubernetes can automatically scale those resources back. This management of resources helps control costs for your organization.
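As a generic Kubernetes illustration (not a Hyperscience-specific configuration), a HorizontalPodAutoscaler like the sketch below adds or removes pod replicas for a deployment based on observed CPU usage; the deployment name and thresholds are placeholders.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment   # placeholder workload to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # create new pods when average CPU exceeds 70%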

Pods run on nodes, which can be physical servers or virtual machines. A cluster can have many nodes, and all nodes are controlled by a master node (or master). The master manages the state of the cluster. When you interact with your cluster—for example, when configuring the desired state of an application—the master processes your commands and routes them to the appropriate node, pod, or container.

A simplified view of the elements we’ve discussed appears below.

Kubernetes overview

How the Hyperscience Platform uses Kubernetes

The Hyperscience Platform architecture is based on workflow orchestration. Multiple task-processing units are needed to execute each of the steps, or tasks, inside a workflow and move the workflow execution forward. We call these task-processing units blocks. Hyperscience uses Kubernetes to orchestrate the blocks, effectively making Kubernetes a workload orchestrator. To provide this functionality, Hyperscience built the HyperOperator, which bridges the Hyperscience Platform and Kubernetes container orchestration to provide seamless workload orchestration in our product. The AWS reference design provides a high-level overview of how this process works.

Hyperscience Platform deployment includes several steps:

  1. The customer should provide information about the available infrastructure: container orchestration, file storage, database, and container image registry, and ensure that these components meet the requirements described below.

  2. The customer should copy all required container images into an internal container registry accessible from the Kubernetes cluster. A Hyperscience CS team member will provide access to the Hyperscience public container-image repository.

  3. Based on the input from the previous steps, the customer should populate the values.yaml file used by the Helm chart and install the chart (a hypothetical sketch appears below).
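The exact structure of values.yaml is defined by the Hyperscience Helm chart and described in the Helm Chart article; the sketch below is only a hypothetical illustration of the kind of information gathered in steps 1 and 2, and the key names are placeholders rather than the chart's actual schema.

# Hypothetical values.yaml sketch; the real keys and structure come from the Hyperscience Helm chart.
database:
  host: db.example.internal:5432   # placeholder DB endpoint and port
  name: hyperscience               # placeholder DB name
  user: hs_app                     # placeholder DB username
fileStore:
  bucket: hs-example-bucket        # placeholder S3-compatible bucket
registry:
  url: 123456789012.dkr.ecr.us-east-1.amazonaws.com   # placeholder internal container registry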

To help customers with steps 1 and 2, we built hsk8s (Hyperscience Kubernetes CLI). The tool simplifies streaming container images to internal container-image registries, and it eases the process of collecting a support bundle with diagnostic information when a support ticket is opened. hsk8s should be executed on a workstation with access to the Kubernetes cluster. The workstation should also have Internet access when the customer needs to gather application diagnostic data. More information can be found in Kubernetes Troubleshooting and Tweaks.

To help with the initial deployment steps and future support of the Hyperscience Platform, customers are advised to:

  1. have a workstation with access to both the Internet (https://cloudsmith.io and https://support.hyperscience.com) and internal systems (the Kubernetes cluster and container registry). This workstation can be used for the deployment and for support cases later on.

  2. keep their values.yaml file in a source-control system, such as Git. This file describes all deployment parameters for Hyperscience, and it needs to be available for future application upgrades and support.

Infrastructure Requirements

Customers need to install Hyperscience on an existing Kubernetes cluster that is already configured to their specifications. Customers should reserve a namespace that is dedicated to the Hyperscience application.
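For example, a dedicated namespace can be reserved with a manifest like the one below; the namespace name is a placeholder, so use whatever name your organization assigns to Hyperscience.

apiVersion: v1
kind: Namespace
metadata:
  name: hyperscience   # placeholder; a namespace dedicated to the Hyperscience application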

An SQL database and a file store are required for the Hyperscience application's backend. Note that the Hyperscience application does not install the database or the file store. We recommend using database and file store services external to the Kubernetes cluster. An example in the context of AWS is shown below.

Supported infrastructure components

Below is a list of supported Kubernetes versions, file stores, databases, and container image registries:

  • Container orchestration: Kubernetes v1.28

  • File storage: AWS S3 or compatible, Azure Blob or compatible

  • Database: PostgreSQL 12.x, 13.x, 14.x; MSSQL 2016, 2017, 2019

  • Container image registry: Docker Registry HTTP API V2 compatible (AWS ECR, Google Artifact Registry, etc.)

Hyperscience deprecated support for new installations of Kubernetes v1.27 as of August 2024. Existing customers running Kubernetes 1.27 are strongly advised to upgrade.

Kubernetes version support calendar

Once a version of Kubernetes is no longer supported, Hyperscience may introduce breaking changes in the deployment tooling in later releases. Note that the Kubernetes version support calendar is not related to Hyperscience Platform version support.

| Kubernetes version | Hyperscience end of support |
| --- | --- |
| 1.24 | February 2024 |
| 1.25 | June 2024 |
| 1.26 | July 2024 |
| 1.27 | August 2024 |
| 1.28 | December 2024 |

PodSecurityPolicy

Certain security features in our application require the involvement of a second user. To allow for this second user, the capabilities listed below are required. Some flavors of Kubernetes, such as Rancher and OpenShift, block these capabilities by default, and they need to be additionally whitelisted in the PodSecurityPolicy.

requiredCapabilities:
- SETUID
- SETGID
fsGroup:
  type: RunAsAny
allowPrivilegeEscalation: true
readOnlyRootFilesystem: false
runAsUser:
  type: MustRunAsNonRoot
volumes:
- configMap
- emptyDir
- persistentVolumeClaim
- secret

Reference architecture

A diagram of the different deployment components and how they interact with each other appears below. The specific services used will depend on the cloud provider, but the overall concept remains the same.

A separate namespace in the Kubernetes cluster is recommended for the Hyperscience Platform and all associated resources.

AWS Architecture

Nodes

We require at least 2 separate node groups for optimal processing-load distribution. Each node group may have one or more nodes attached to it. The node sizing will change based on your desired performance and individual workflow characteristics.

We use a nodeSelector (see Kubernetes's Assigning Pods to Nodes) to isolate the trainer from the rest of the application for performance reasons. For this setup to work, you need to assign one node group as the "platform" group and the other as the "trainer" group. See AWS's Organize Amazon EKS resources with tags and GCP's Create and manage cluster and node pool labels for more information on how to apply the tags below (a simplified illustration follows the list):

  • hs-component=platform (platform node group only)

  • hs-component=trainer (trainer node group only)
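
The sketch below is a simplified illustration of how trainer pods are pinned to the trainer node group through a nodeSelector; the actual pod specifications are rendered by the Hyperscience Helm chart, and the pod name and image reference are placeholders.

apiVersion: v1
kind: Pod
metadata:
  name: trainer-example   # placeholder name; real pods are created by the Helm chart
spec:
  nodeSelector:
    hs-component: trainer   # schedules the pod only onto nodes labeled hs-component=trainer
  containers:
  - name: trainer
    image: registry.example.internal/hyperscience/trainer:latest   # placeholder image reference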

Considering that such fine-grained control over nodes is necessary, we recommend choosing Kubernetes providers that offer this capability. For GKE, this means selecting a Standard cluster rather than an Autopilot one.

Minimum requirements:

| Node type | Node vCPUs | Node RAM (GB) | EC2 instance type (AWS) | GCE instance type (GCP) | Number of nodes |
| --- | --- | --- | --- | --- | --- |
| platform | 8 | 32 | m5.2xlarge | n4-standard-8 | 2 |
| trainer | 16 | 64 | m5.4xlarge | n4-standard-16 | 1 |

Docker repositories

To store Hyperscience container images, four repositories in an internal registry are required, as illustrated in the diagram above. These repositories must be accessible to the cluster's nodes through IAM permissions, enabling them to pull the images and initiate deployments.

Database

The Hyperscience Platform requires the use of a SQL database to store key application data.

After a database instance has been created in your preferred cloud provider, you need to collect the following pieces of connection-related information, which are later used to set the proper Helm Chart values (see Helm Chart for more details):

  1. DB server endpoint (including the port)

  2. DB Username

  3. DB Password

  4. DB Name

File store

The Hyperscience Platform uses S3 as the primary blob store. You will need to create an S3-compatible bucket for your Hyperscience deployments. Hyperscience pods will need access to read from and write to this bucket.

Service access control

You'll need to create IAM roles, policies, and identity providers to be used by the pods via IAM roles for service accounts (see AWS's IAM roles for service accounts) or HMAC keys (see Google Cloud's HMAC keys). More details can be found in our Helm Chart article.
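
On AWS, for example, a pod typically assumes an IAM role through a service-account annotation such as the one below; the service-account name, namespace, and role ARN are placeholders, and the service account actually used by the Hyperscience pods is configured through the Helm chart.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: hyperscience        # placeholder service-account name
  namespace: hyperscience   # placeholder namespace
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/hyperscience-pods   # placeholder IAM role ARN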