Understanding Kubernetes workload node objects

 

Kubernetes has a variety of objects to manage your cluster and your applications. Your applications run in workload nodes (virtual machines) and the containers are managed by the control plane.

You use manifests to tell the control plane how you want to configure your Kubernetes objects, and the control plane changes the state of the cluster to match your desired state.

In other words, you tell the control plane how to configure the workload nodes with your containers, networking, security, and storage. And the control plane makes it happen.

In this article, learn the definitions of the workload objects. And learn some initial best practices to use when defining your Kubernetes objects.

The following diagram from the Kubernetes documentation (where you can also get the icons for your architecture diagrams) shows the key Kubernetes objects arranged in a workload node. 

[Diagram: k8s-exposed-pod]

The illustration shows:

  • An Ingress (ing) routes traffic from outside Kubernetes to the specified Service.
  • The Service routes the traffic to the Pod (pod).
  • The Pod runs one or more containers.
  • The Deployment (dep) specifies the container image and the number of Pods it wants as replicas.
  • A ReplicaSet (rs) maintains a stable set of replica Pods running at any given time.
  • The Horizontal Pod Autoscaler (hpa) scales the number of Pods based on observed CPU utilization or another metric.
  • A ResourceQuota (quota) object provides constraints that limit aggregate resource consumption per namespace.
  • The Namespace (ns) creates a virtual cluster to limit resource consumption and to control who has access to the resources in the namespace.

Let’s begin by defining the Pod, Service, and Ingress. 

Pod

A Pod is a group of one or more containers, with shared storage/network resources, and a specification for how to run the containers.

Kubernetes Pods are created and destroyed to match the state of your cluster. Pods are nonpermanent resources. If you use a Deployment to run your app, it can create and destroy Pods dynamically.

Pods are most often created through Deployment objects rather than defined directly.

You typically describe your Pod in a manifest as a PodSpec, the portion of a manifest where you define the containers. For example, the following part of a manifest shows how to deploy a SQL Server container.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  # ... omitted ...
  template:
    # ... omitted ...
    spec:
      # ... omitted ...
      containers:
      - name: mssql
        image: mcr.microsoft.com/mssql/server:2019-latest
        ports:
        - containerPort: 1433
        env:
        - name: MSSQL_PID
          value: "Developer"
        - name: ACCEPT_EULA
          value: "Y"
        # The password is read from a Secret instead of being written in the manifest
        - name: SA_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mssql
              key: SA_PASSWORD
      # ... omitted ...

Service

A Service resource lets you expose an application running in Pods to be reachable from outside your cluster. Or you can choose to publish services only for consumption inside your cluster.

A Service exposes an application running on a set of Pods as a network service. For example, if you want access to a Pod, you would expose the Pod as a Service.

A Service can also route traffic based on the Node topology of the cluster. For example, a Service can specify that traffic be preferentially routed to endpoints that are on the same Node as the client, or in the same availability zone. You can define a Service to route to only local endpoints.

Every Service defined in the cluster (including the DNS server itself) is assigned a DNS name. By default, a client Pod’s DNS search list includes the Pod’s own namespace and the cluster’s default domain.

“Normal” (not headless) Services are assigned a DNS A or AAAA record, depending on the IP family of the service, for a name of the form my-svc.my-namespace.svc.cluster-domain.example. This resolves to the cluster IP of the Service.

As such, the name of a Service object must be a valid DNS label name.

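The following manifest defines a Service named mssql-deployment that forwards TCP port 1433 to port 1433 on every Pod that carries the label app=mssql. The LoadBalancer type exposes the Service outside the cluster.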

apiVersion: v1
kind: Service
metadata:
  name: mssql-deployment
spec:
  selector:
    app: mssql
  ports:
  - protocol: TCP
    port: 1433
    targetPort: 1433
  type: LoadBalancer

Ingress

Ingress exposes HTTP and HTTPS routes from outside the cluster to Services within the cluster. Kubernetes as a project currently supports and maintains GCE and nginx controllers. A common choice is the NGINX ingress controller, which acts as a reverse proxy and load balancer in front of your Services.

For your users to be able to reach the application through an Ingress, the cluster must have an ingress controller running.

The following sample shows a minimal Ingress resource:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minimal-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          service:
            name: test
            port:
              number: 80

(You can also integrate an Azure Application Gateway to load balance between your AKS cluster and virtual machines using the AKS Application Gateway Ingress Controller.)

The following illustration shows how Ingress, Pods, and Services are related.

For more information, see What Makes NGINX’s controllers different.

Deployment

The Deployment object defines the desired state of the Pods in your cluster. Use Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments. In the following example, a Deployment named mssql-deployment is deployed with a single replica.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  template:
    metadata:
      labels:
        app: mssql
    spec:
      containers:
      - name: mssql
        image: mcr.microsoft.com/mssql/server:2019-latest
        ports:
        - containerPort: 1433
        # ... environment and volume and other features omitted ...

The Deployment creates a single Pod, as set by the .spec.replicas field. The image is specified by the .spec.template.spec.containers[0].image field as mcr.microsoft.com/mssql/server:2019-latest.

ReplicaSet

A ReplicaSet maintains a stable set of replica Pods running at any given time. You define the ReplicaSet with the number of Pods to maintain and a Pod template that describes how each Pod operates.

Important: Manage ReplicaSets using Deployments.

DaemonSet

When you want to run a Pod on each Node, use a DaemonSet. For example, you may want to run a single Pod on each node so you can:

  • Run a cluster storage daemon on every node
  • Run a logs collection daemon on every node
  • Run a node monitoring daemon on every node

For more information, see DaemonSet.
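As an illustration of the logs collection case above, the following is a minimal sketch of a DaemonSet manifest. The name log-collector, the label, and the image example.com/log-collector:1.0 are placeholders invented for this example, not a real collector; substitute the agent you actually run.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector                # hypothetical name
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector           # must match the selector above
    spec:
      containers:
      - name: log-collector
        image: example.com/log-collector:1.0   # placeholder image
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      # Mount the node's own log directory so the Pod can read node-level logs
      - name: varlog
        hostPath:
          path: /var/log
```

Note that a DaemonSet has no replicas field: the number of Pods follows the number of eligible nodes.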

Limits

Within a namespace, a Pod or Container can consume as much CPU and memory as defined by the namespace’s resource quota. There is a concern that one Pod or Container could monopolize all available resources.

You may want to restrict resource consumption and creation on a namespace basis.

A LimitRange is a policy to constrain resource allocations (to Pods or Containers) in a namespace.
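A sketch of how these two policies might look for one namespace follows. The namespace name demo, the object names, and all of the numbers are assumptions chosen for illustration; size them for your own workloads.

```yaml
# Per-container defaults and ceilings inside the namespace
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits               # hypothetical name
  namespace: demo                    # hypothetical namespace
spec:
  limits:
  - type: Container
    defaultRequest:                  # applied when a container sets no request
      cpu: 250m
      memory: 128Mi
    default:                         # applied when a container sets no limit
      cpu: 500m
      memory: 256Mi
    max:                             # no single container may exceed this
      cpu: "1"
      memory: 512Mi
---
# Aggregate ceiling for everything in the namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: demo-quota                   # hypothetical name
  namespace: demo
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
    limits.cpu: "4"
    limits.memory: 4Gi
    pods: "10"
```

The LimitRange keeps any one container from monopolizing resources, while the ResourceQuota caps the namespace as a whole.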

PersistentVolumes

When Kubernetes restarts a Pod, which it can do at any time, the current state of the Pod is lost, including any data stored in files in the Pod. In addition, containers in a Pod often need to share files. Kubernetes Volumes address both problems.

A Docker volume is a directory on disk or in another container. Docker provides volume drivers, but the functionality is somewhat limited.

At its core, a volume is just a directory, possibly with some data in it, which is accessible to the containers in a pod. How that directory comes to be, the medium that backs it, and the contents of it are determined by the particular volume type used.

A Kubernetes volume can be specified as a directory on the local machine, awsElasticBlockStore, azureDisk, azureFile, nfs, glusterfs, or a local volume that is a mounted local storage device such as a disk, partition, or directory. This means you can decide where your data resides when you deploy the Kubernetes manifest.

The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed.

It has two parts.

  • PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
  • PersistentVolumeClaim (PVC) is a request for storage by a user.

See Configure a Pod to Use a PersistentVolume for Storage for the steps in creating a PersistentVolume.
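To make the claim side concrete, here is a minimal sketch of a PersistentVolumeClaim and a Pod that mounts it. It assumes the cluster has a default StorageClass that can provision the volume dynamically; the names data-claim and pvc-demo, the 8Gi size, and the image are placeholders.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim                   # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce                    # mounted read-write by a single node
  resources:
    requests:
      storage: 8Gi                   # illustrative size
---
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo                     # hypothetical name
spec:
  containers:
  - name: app
    image: example.com/app:1.0       # placeholder image
    volumeMounts:
    - name: data
      mountPath: /data               # files written here outlive the Pod
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-claim          # refers to the claim defined above
```

The Pod names only the claim, not the underlying disk, which is the separation between how storage is provided and how it is consumed.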

Labels

Labels are key/value pairs that are attached to objects, such as Pods.

Each object can have a set of key/value labels defined. Each key must be unique for a given object.

Use labels to select groups of resources to watch or list.

The following illustration shows the concepts of labels among several other Kubernetes objects.

In the illustration, the colors of the labels show how the Services are related to the Pods.
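The relationship between labels and Services can also be sketched in a manifest. In this hypothetical example, the Pod carries two labels and the Service selects on one of them; the names, label values, ports, and image are all invented for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-1                        # hypothetical name
  labels:
    app: web                         # the Service below selects on this label
    environment: staging             # extra label, useful for listing and filtering
spec:
  containers:
  - name: web
    image: example.com/web:1.0       # placeholder image
    ports:
    - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web                         # matches every Pod labeled app=web
  ports:
  - port: 80
    targetPort: 8080
```

The same labels work on the command line: kubectl get pods -l app=web,environment=staging lists only the Pods that carry both labels.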

Secrets

Kubernetes Secrets let you store and manage sensitive information, such as passwords, OAuth tokens, and ssh keys. Storing confidential information in a Secret is safer and more flexible than putting it verbatim in a Pod definition or in a container image.

  • SSH. The built-in type kubernetes.io/ssh-auth is provided for storing data used in SSH authentication. When using this Secret type, you must specify an ssh-privatekey key-value pair in the data (or stringData) field as the SSH credential to use.
  • TLS. The built-in type kubernetes.io/tls is provided for storing a certificate and its associated key, which are typically used for TLS.
  • Encryption at rest. The data used in the application may need to be encrypted at rest. See Encrypting Secret Data at Rest.
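The SQL Server manifest earlier in this article reads SA_PASSWORD from a Secret named mssql. A minimal sketch of such a Secret might look like the following. The value shown is a placeholder, and keeping a real password in a manifest file under source control is something to avoid.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mssql                        # matches secretKeyRef.name in the Deployment
type: Opaque
stringData:                          # plain text here; stored base64-encoded under data
  SA_PASSWORD: "<replace-with-a-strong-password>"   # placeholder, not a real value
```

The stringData field is a convenience for writing values unencoded; the API server converts them to base64 in the data field when the Secret is created.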

Jobs

A Job creates one or more Pods and ensures that a specified number of them successfully terminate.

A simple case is to create one Job object in order to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example due to a node hardware failure or a node reboot).

You can run Jobs in the following ways:

  • Non-parallel Jobs
  • Parallel Jobs with a fixed completion count
  • Parallel Jobs with a work queue

You can create a CronJob to run Jobs on a repeating schedule.
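The following sketch shows a non-parallel Job and a CronJob side by side. The object names, the schedule, and the shell commands are assumptions made up for this example; busybox stands in for whatever image does the real work.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: one-off-task                 # hypothetical name
spec:
  completions: 1                     # one successful Pod finishes the Job
  backoffLimit: 3                    # retry a failed Pod up to three times
  template:
    spec:
      restartPolicy: Never           # Jobs require Never or OnFailure
      containers:
      - name: task
        image: busybox:1.36
        command: ["sh", "-c", "echo processing && sleep 5"]
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-task                 # hypothetical name
spec:
  schedule: "0 2 * * *"              # standard cron syntax: every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: task
            image: busybox:1.36
            command: ["sh", "-c", "echo nightly run"]
```

For a parallel Job with a fixed completion count, you would raise completions and add a parallelism value to the Job spec.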

Initial best practices

The following are best practices to start out in developing your applications for Kubernetes:

  • Start each Pod with a single container.
  • Deploy each Pod using a Deployment object.
  • Deploy a Service for each Pod that needs to be reachable from other Pods or from outside Kubernetes.
  • Apply labels to every Kubernetes resource object.
  • Initially, you will not need to use namespaces, but implement them when you want to reuse names for multiple deployments or to implement authorization policies (especially Azure RBAC).

Be aware of the constraints on resources within your namespace, and begin setting memory and CPU resource limits on Pods and containers. Configure quality of service for Pods.
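As a starting point, the sketch below sets requests and limits on a single container. Because every container in the Pod has CPU and memory limits that equal its requests, Kubernetes assigns the Pod the Guaranteed quality of service class. The Pod name, image, and numbers are placeholders for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: qos-demo                     # hypothetical name
spec:
  containers:
  - name: app
    image: example.com/app:1.0       # placeholder image
    resources:
      requests:                      # what the scheduler reserves on a node
        cpu: 500m
        memory: 256Mi
      limits:                        # equal to requests, so QoS class is Guaranteed
        cpu: 500m
        memory: 256Mi
```

If the limits were higher than the requests, the Pod would be classed as Burstable; with no requests or limits at all, it would be BestEffort and among the first evicted under node pressure.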

Summary

In this article, you saw some snippets of PodSpec, Service, Deployment, and Ingress manifests. You learned some high-level definitions of some Kubernetes objects. And you learned about some best practices for getting started with Kubernetes.