Kubernetes Basics Explained
Kubernetes has become the de‑facto standard for orchestrating containers at scale, but its terminology can feel like a foreign language at first. In this guide we’ll demystify the core building blocks, walk through a few hands‑on examples, and show how real teams use Kubernetes to power everything from simple web apps to massive data pipelines.
What Is Kubernetes?
At its heart, Kubernetes (often shortened to K8s) is an open‑source platform that automates the deployment, scaling, and management of containerized workloads. Think of it as a sophisticated traffic controller that decides where each container should run, how many copies to keep alive, and how they talk to each other.
Unlike a single Docker host, a Kubernetes cluster consists of multiple machines—called nodes—working together as a single logical unit. This distributed design gives you high availability, fault tolerance, and the ability to grow or shrink resources on demand.
Why Use Kubernetes?
- Self‑healing: Failed containers are automatically restarted or rescheduled.
- Horizontal scaling: Increase or decrease replicas with a single command or metric‑based rule.
- Declarative configuration: Desired state is stored in YAML/JSON, and the control plane continuously works to match reality.
- Extensibility: Custom resources and operators let you embed domain‑specific logic directly into the cluster.
Pro tip: Start with a small “single‑node” cluster (e.g., Minikube or Kind) to experiment safely before moving to a multi‑node production setup.
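On such a test cluster you can try the bullets above directly: kubectl scale deployment <name> --replicas=5 changes the replica count with a single command, while a HorizontalPodAutoscaler expresses a metric‑based rule. Below is a minimal sketch of the latter, targeting the flask-api Deployment built later in this guide; it assumes the metrics-server add‑on is installed and that the target containers declare CPU requests, and the 70% threshold is an arbitrary example:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: flask-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: flask-api        # the Deployment created later in this guide
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out when average CPU exceeds 70% of requests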
Core Concepts at a Glance
Before diving into code, let’s get familiar with the key abstractions you’ll encounter daily. These concepts form the language of every Kubernetes manifest you’ll write.
- Node: A worker machine (VM or bare metal) that runs containers.
- Pod: The smallest deployable unit; one or more tightly coupled containers sharing network and storage.
- ReplicaSet: Ensures a specified number of pod replicas are running.
- Deployment: A higher‑level controller that manages ReplicaSets and provides rolling updates.
- Service: An abstract way to expose a set of pods as a network service.
- ConfigMap & Secret: Store configuration data and sensitive information respectively.
- Ingress: Rules for external HTTP/S traffic routing into the cluster.
Understanding how these pieces fit together makes it easier to reason about cluster behavior and troubleshoot issues.
Kubernetes Architecture
The control plane is the brain of the cluster. It runs a set of components that maintain the desired state, schedule workloads, and expose the API server.
- kube-apiserver: Central REST endpoint that all components interact with.
- etcd: Distributed key‑value store where the cluster’s configuration and state are persisted.
- kube-scheduler: Assigns pods to nodes based on resource availability and constraints.
- kube-controller-manager: Runs controllers (e.g., Deployment, Node) that reconcile actual state with desired state.
On each node, the kubelet ensures containers described by pod specs are running, while the kube-proxy handles network routing for Services.
Control Plane vs. Data Plane
The control plane (API server, scheduler, controller manager) typically runs on dedicated control‑plane nodes (historically called masters) for high reliability. The data plane consists of the worker nodes where your application containers actually execute. Separating these concerns lets you scale compute resources independently of management overhead.
Pods: The Basic Execution Unit
A pod is more than just a container; it provides a shared network namespace (IP address and ports) and optional shared storage volumes. All containers in a pod can communicate via localhost, which simplifies multi‑container patterns such as sidecars.
Here’s a minimal pod manifest that runs an Nginx web server:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-demo
spec:
  containers:
  - name: nginx
    image: nginx:1.25-alpine
    ports:
    - containerPort: 80
Deploy it with kubectl apply -f pod.yaml and then inspect the pod’s IP using kubectl get pod nginx-demo -o wide. You can curl that IP from any node in the same cluster.
Sidecar Pattern
Sidecars are auxiliary containers that augment the main container’s functionality—think log shippers, metrics collectors, or authentication proxies. Because they share the same pod, they have low‑latency access to the main container’s filesystem and network.
Pro tip: Keep sidecar containers lightweight and stateless. If they need persistent data, mount a shared volume so both containers can read/write safely.
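As a minimal sketch of the pattern, the pod below pairs Nginx with a busybox container that tails the access log from a shared emptyDir volume. A real deployment would use a proper log shipper such as Fluent Bit; the names here are illustrative:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-sidecar
spec:
  volumes:
  - name: logs
    emptyDir: {}              # shared scratch space, lives as long as the pod
  containers:
  - name: nginx
    image: nginx:1.25-alpine
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
  - name: log-tailer          # stand-in for a real log shipper
    image: busybox:1.36
    command: ["sh", "-c", "touch /var/log/nginx/access.log && tail -f /var/log/nginx/access.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx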
Deployments: Managing Rollouts
While you can create pods directly, most production workloads use Deployments. A Deployment abstracts away the underlying ReplicaSet and gives you declarative updates, rollbacks, and scaling with a single YAML file.
Below is a complete Deployment that runs a simple Python Flask API, exposes it on port 5000, and ensures three replicas are always available:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-api
  labels:
    app: flask
spec:
  replicas: 3
  selector:
    matchLabels:
      app: flask
  template:
    metadata:
      labels:
        app: flask
    spec:
      containers:
      - name: flask
        image: python:3.11-slim
        # Demo shortcut: install Flask at container start (requires network access).
        # For real workloads, bake dependencies into a custom image instead.
        command: ["sh", "-c", "pip install flask && python -m flask run --host=0.0.0.0"]
        workingDir: /app
        env:
        - name: FLASK_APP
          value: app.py
        ports:
        - containerPort: 5000
        volumeMounts:
        - name: app-code
          mountPath: /app
      volumes:
      - name: app-code
        configMap:
          name: flask-app-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: flask-app-config
data:
  app.py: |
    from flask import Flask

    app = Flask(__name__)

    @app.route('/')
    def hello():
        return "Hello from Flask on Kubernetes!"
Apply the file with kubectl apply -f deployment.yaml. Kubernetes creates a ReplicaSet, which in turn spawns three pods that run the Flask container. If you update the image tag or change the replica count, the Deployment performs a rolling update without downtime.
Rolling Updates & Rollbacks
To trigger a new version, simply edit the image field (e.g., python:3.11-slim → python:3.12-slim) and re‑apply. Kubernetes will replace pods incrementally, respecting the maxUnavailable and maxSurge defaults. If something goes wrong, roll back with kubectl rollout undo deployment/flask-api.
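Both knobs can be set explicitly in the Deployment spec. A sketch of the relevant fragment (25% is already the default for both fields; it is shown here only to indicate where they live):
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%   # at most this fraction of pods may be unavailable during the update
      maxSurge: 25%         # at most this many extra pods may be created above the desired count
You can watch an update in progress with kubectl rollout status deployment/flask-api.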
Services: Stable Networking
Pods are ephemeral; their IP addresses can change whenever they are recreated. A Service provides a stable endpoint (a virtual IP) that forwards traffic to the current set of pods matching a label selector.
Here’s a Service of type NodePort that exposes the Flask Deployment: other pods reach it on port 80 via its internal ClusterIP, while external clients reach it on port 30080 on every node:
apiVersion: v1
kind: Service
metadata:
  name: flask-service
spec:
  selector:
    app: flask
  type: NodePort
  ports:
  - protocol: TCP
    port: 80          # Service port
    targetPort: 5000  # Pod container port
    nodePort: 30080   # Port on each node
After applying, you can reach the API at http://<node-ip>:30080/. Inside the cluster, other pods can call http://flask-service (the Service name resolves via DNS).
Service Types Overview
- ClusterIP: Default, reachable only inside the cluster.
- NodePort: Exposes the Service on a static port on each node.
- LoadBalancer: Provisions an external load balancer (supported by cloud providers); see the sketch after this list.
- ExternalName: Maps a Service to an external DNS name.
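A sketch of a LoadBalancer variant for the Flask app; the name flask-lb is illustrative, and it assumes your environment can provision external load balancers (a cloud provider, or an add‑on such as MetalLB):
apiVersion: v1
kind: Service
metadata:
  name: flask-lb
spec:
  type: LoadBalancer
  selector:
    app: flask
  ports:
  - port: 80
    targetPort: 5000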
ConfigMaps and Secrets
Hard‑coding configuration values inside container images makes updates painful. ConfigMaps let you inject non‑sensitive data (e.g., feature flags, URLs) into pods as environment variables or files.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "debug"
  API_ENDPOINT: "https://api.example.com"
Reference the ConfigMap in a pod:
envFrom:
- configMapRef:
    name: app-config
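The same ConfigMap can instead be projected as files, as mentioned above. A sketch of the two pod-spec fragments involved (the mount path is arbitrary):
# In the pod spec:
volumes:
- name: config
  configMap:
    name: app-config
# In the container spec:
volumeMounts:
- name: config
  mountPath: /etc/app-config   # each key becomes a file, e.g. /etc/app-config/LOG_LEVEL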
Secrets work the same way but store data in base64‑encoded form and are kept separate from regular ConfigMaps. For example, a database password:
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  username: bXl1c2Vy   # echo -n "myuser" | base64
  password: c2VjcmV0   # echo -n "secret" | base64
Mount the Secret as environment variables or as a volume. Keep in mind that kubectl describe secret hides the values, but they are only base64‑encoded, not encrypted by default, so control who can read Secret objects.
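For example, a sketch of injecting the password into a container as an environment variable (the variable name is illustrative):
env:
- name: DB_PASSWORD
  valueFrom:
    secretKeyRef:
      name: db-secret
      key: password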
Pro tip: kubectl get secret -o yaml still prints the base64‑encoded values, so avoid exposing raw base64 strings in CI pipeline logs and decode them only where actually needed.
Persistent Storage with PersistentVolumes
Stateless containers are easy to scale, but many applications need durable storage (databases, logs, media). Kubernetes abstracts storage through PersistentVolumes (PV) and PersistentVolumeClaims (PVC).
A typical workflow:
- A cluster admin creates a PersistentVolume that points to a physical storage backend (e.g., AWS EBS, GCE PD, NFS).
- Developers request storage by creating a PersistentVolumeClaim that specifies size and access mode.
- The control plane binds a matching PV to the PVC, and pods can mount the claim as a volume.
Example PVC for a 10Gi PostgreSQL data directory:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
Attach it to a pod:
# In the container spec:
volumeMounts:
- name: pgdata
  mountPath: /var/lib/postgresql/data
# In the pod spec:
volumes:
- name: pgdata
  persistentVolumeClaim:
    claimName: pg-data
Dynamic Provisioning
Most managed Kubernetes services support StorageClasses, which enable on‑the‑fly PV creation when a PVC is submitted. Simply add storageClassName: standard (or the cloud‑specific name) to the claim, and the cluster will provision the underlying disk automatically.
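A sketch of the earlier claim rewritten for dynamic provisioning; the class name standard is an assumption, so check kubectl get storageclass for the names available in your cluster:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data
spec:
  storageClassName: standard   # assumed class name; varies by cluster
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi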
Helm: The Package Manager for Kubernetes
Writing raw YAML for every component can become repetitive. Helm introduces templating, versioning, and dependency management, allowing you to package an entire application stack as a chart.
A minimal Chart.yaml for our Flask app might look like:
apiVersion: v2
name: flask-api
description: A simple Flask API chart
type: application
version: 0.1.0
appVersion: "1.0"
The values.yaml file provides default configuration that users can override at install time:
replicaCount: 3
image:
  repository: python
  tag: "3.11-slim"
  pullPolicy: IfNotPresent
service:
  type: NodePort
  port: 80
  nodePort: 30080
Templates reference these values with Go‑style syntax, e.g., {{ .Values.image.repository }}:{{ .Values.image.tag }}. Install the chart with helm install flask-api ./flask-api, upgrade with helm upgrade, and roll back with helm rollback. Helm also tracks release history, making disaster recovery straightforward.
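A sketch of how a template in templates/deployment.yaml might consume those values (an excerpt, not a complete manifest):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
      - name: flask
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}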
Pro tip: Store Helm charts in a private OCI registry (e.g., GitHub Packages) to enforce version control and access policies across teams.
Real‑World Use Case: Blue/Green Deployments
Imagine an e‑commerce site that must avoid any downtime during releases. Using Deployments and Services, you can implement a blue/green strategy without external tooling.
- Create two Deployments, frontend-blue and frontend-green, each with its own label (version: blue or version: green).
- Define a single Service that selects pods on the labels app: frontend and version: blue (initially).
- When the new version is ready, update the Service selector to version: green. Traffic shifts instantly, and you can monitor the new version before decommissioning the old one.
This approach leverages native Kubernetes objects, keeps the process declarative, and eliminates the need for complex load balancer tricks.
Step‑by‑Step Manifest Snippet
# Blue deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-blue
spec:
  replicas: 4
  selector:
    matchLabels:
      app: frontend
      version: blue
  template:
    metadata:
      labels:
        app: frontend
        version: blue
    spec:
      containers:
      - name: web
        image: myshop/web:1.0
---
# Green deployment (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-green
spec:
  replicas: 4
  selector:
    matchLabels:
      app: frontend
      version: green
  template:
    metadata:
      labels:
        app: frontend
        version: green
    spec:
      containers:
      - name: web
        image: myshop/web:2.0
---
# Service that points to blue initially
apiVersion: v1
kind: Service
metadata:
  name: frontend-svc
spec:
  selector:
    app: frontend
    version: blue   # Change to "green" to switch traffic
  ports:
  - port: 80
    targetPort: 8080
After testing the green pods, run kubectl patch service frontend-svc -p '{"spec":{"selector":{"app":"frontend","version":"green"}}}' to flip traffic.
Observability: Metrics, Logs, and Traces
Running containers without insight is like flying blind. Kubernetes integrates seamlessly with the CNCF observability ecosystem: Prometheus for metrics, Fluentd or Fluent Bit for logs, and Jaeger with OpenTelemetry for traces.