Docker images and containers: understanding the relationship

7-IV-26

You pulled ubuntu, you pulled nginx, you’ve been running containers and destroying them. But here’s the thing — do you actually know what an image is? Not the one-liner (“a template”), but what’s actually happening on your disk? Why does pulling node:22 take a while the first time and almost nothing the second? Why can you run ten containers from the same image and none of them affect each other?

If you got through lesson 2 with a vague intuition of how this works, this lesson is about turning that intuition into a mental model you can rely on. Because everything in Docker — Dockerfiles, image caching, volumes, multi-stage builds — makes a lot more sense once you understand what’s happening underneath.

What is a Docker image, really?

Let’s start with what an image is not. It’s not a file. It’s not a zip archive. It’s not a snapshot of a running system.

A Docker image is a stack of layers. Each layer is a set of filesystem changes — files added, files modified, files deleted. When you stack them all together, you get a complete filesystem that a container can run. Think of it like a Git commit history: each commit (layer) describes what changed, and the full history gives you the current state of the project.

The mechanism that does the stacking is called a union filesystem (OverlayFS is the most common implementation on Linux). Docker merges these read-only layers into a single coherent view, and that’s what the container sees as its filesystem.

Let’s make it concrete. Imagine an image built like this:

Layer 1 (Base): Ubuntu 22.04 — adds ~80MB of OS files
Layer 2: apt-get install python3 — adds Python interpreter
Layer 3: pip install flask — adds Flask and its dependencies
Layer 4: COPY app.py /app/ — adds your application code

Each layer only stores the diff from the previous one. And here’s where it gets efficient: if you have ten different images that all start from ubuntu:22.04, they all share that first layer on disk. Docker doesn’t download or store it ten times — it references the same layer. Pull a new image that shares base layers with something you already have, and Docker will say “already exists” for those layers.

docker pull python:3.12

3.12: Pulling from library/python
fa9b7e77b6bf: Already exists       ← base layers shared with other images
48be9699aae1: Already exists
...
b0c4ce42c5d0: Pull complete        ← new layers specific to python:3.12
Digest: sha256:...
Status: Downloaded newer image for python:3.12

That’s the layer cache doing its job.
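
If the stacking idea still feels abstract, here is a toy sketch of it in plain shell. It is not how Docker works internally (Docker mounts layers with OverlayFS rather than copying files, and this sketch ignores deletions), and the directory and file names are made up for illustration. It only shows the outcome: layers applied bottom to top, with later layers winning.

```shell
#!/bin/sh
# Toy model of layer stacking: each "layer" is a directory of changes,
# and the merged view is built by applying them in order.
# Assumption: copies stand in for the real OverlayFS mount.
set -eu

work=$(mktemp -d)
mkdir -p "$work/layer1/etc" "$work/layer2/usr/bin" "$work/layer3/etc" "$work/merged"

echo "ubuntu base" > "$work/layer1/etc/os-release"
echo "default"     > "$work/layer1/etc/app.conf"
echo "python3"     > "$work/layer2/usr/bin/python3"
echo "customized"  > "$work/layer3/etc/app.conf"   # overrides the layer1 file

# Apply layers bottom to top; later layers win on conflicts.
for layer in layer1 layer2 layer3; do
  cp -R "$work/$layer/." "$work/merged/"
done

merged_conf=$(cat "$work/merged/etc/app.conf")
merged_os=$(cat "$work/merged/etc/os-release")
echo "app.conf in merged view:   $merged_conf"
echo "os-release in merged view: $merged_os"

rm -rf "$work"
```

The merged view contains the layer 3 version of app.conf, while os-release comes through untouched from layer 1. That is the same "each layer stores only the diff" behavior described above, just at a very small scale.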

Exploring layers

You can inspect exactly what layers an image contains:

docker image inspect nginx

This dumps a large JSON object. The relevant section is RootFS.Layers, which lists one SHA256 digest per layer. You can also see a higher-level view with:

docker image history nginx

IMAGE          CREATED        CREATED BY                                      SIZE
a7be6198544f   2 weeks ago    CMD ["nginx" "-g" "daemon off;"]               0B
<missing>      2 weeks ago    STOPSIGNAL SIGQUIT                             0B
<missing>      2 weeks ago    EXPOSE 80                                      0B
<missing>      2 weeks ago    COPY /etc/nginx /etc/nginx /                   4.61kB
<missing>      2 weeks ago    RUN /bin/sh -c apt-get update && apt-get ...   91.4MB
<missing>      2 weeks ago    ENV NGINX_VERSION=1.27.4                       0B
<missing>      2 weeks ago    FROM debian:bookworm-slim                      0B

Read it bottom to top: start with the base debian:bookworm-slim, install packages, copy Nginx config, set the startup command. Each line is a layer (or a metadata instruction that adds 0 bytes).
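
To check which layers two images share, you can compare their RootFS.Layers lists. The sketch below assumes you have saved each list to a file, one digest per line, for example with docker image inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' nginx > nginx.layers. The helper name common_layers and the digests in the demo are hypothetical placeholders, not real layer IDs.

```shell
#!/bin/sh
# common_layers FILE_A FILE_B
# Prints the layer digests that appear in both files (one digest per line).
# Hypothetical helper for illustration; not part of the Docker CLI.
common_layers() {
  grep -Fx -f "$1" "$2" || true   # no shared layers is not an error
}

# Made-up digests standing in for real RootFS.Layers output.
a=$(mktemp)
b=$(mktemp)
printf '%s\n' sha256:aaa sha256:bbb sha256:ccc > "$a"
printf '%s\n' sha256:aaa sha256:ddd > "$b"

shared=$(common_layers "$a" "$b")
echo "shared layers: $shared"

rm -f "$a" "$b"
```

In practice, shared layers tend to sit at the bottom of both lists, because images usually share a base and diverge from there.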

Image tags: this is not optional knowledge

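When you run docker pull nginx, you’re actually running docker pull nginx:latest. The part after the colon is the tag, and latest is the default when you leave it out. Tags are mutable pointers: today’s nginx:latest is not the same image as nginx:latest from six months ago. By convention it points to whatever build the maintainers most recently published under that name, which is usually, but not necessarily, the newest release.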

In production, you never use latest. You pin to a specific version:

docker pull nginx:1.27.4         # Specific patch version: the Nginx version never changes
docker pull nginx:1.27           # Minor version: picks up patch releases
docker pull nginx:1              # Major version: picks up minor releases too
docker pull nginx:latest         # ❌ Fine for local experiments, not for prod

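Why? Because with latest, what you deploy can change every time you pull. Upgrading Nginx should be a deliberate decision, not a side effect of running docker pull. If you need a reference that can never change, pin by digest (nginx@sha256:…) instead of by tag.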
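
One way to enforce this habit is a small check in your deploy scripts. The sketch below is a hypothetical helper (is_pinned is not a Docker command) that treats a reference as pinned when it carries an explicit tag other than latest, or a digest. It makes no attempt to verify that the tag exists, and an explicit tag can still be re-pushed by the maintainer.

```shell
#!/bin/sh
# is_pinned IMAGE_REF
# Returns 0 if the reference has an explicit tag other than "latest"
# or a digest; returns 1 if it would resolve to "latest".
# Hypothetical helper for illustration only.
is_pinned() {
  case $1 in
    *@sha256:*) return 0 ;;   # digest reference
  esac
  name=${1##*/}               # drop registry/namespace (a registry may contain a port colon)
  case $name in
    *:latest) return 1 ;;
    *:*)      return 0 ;;
    *)        return 1 ;;     # no tag: Docker assumes :latest
  esac
}

for ref in nginx nginx:latest nginx:1.27.4 localhost:5000/nginx; do
  if is_pinned "$ref"; then
    echo "$ref: pinned"
  else
    echo "$ref: floating"
  fi
done
```

Stripping everything up to the last slash before looking for a colon matters: without it, localhost:5000/nginx would be misread as an image tagged 5000/nginx.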

There’s also a pattern you’ll see often: variant tags. These are the same version built on different base images or with different contents:

nginx:1.27.4-alpine         # Alpine Linux base (the base itself is ~5MB), minimal, fast to pull
nginx:1.27.4                # Debian base (full image ~190MB), more compatible
nginx:1.27.4-alpine-slim    # Alpine with fewer bundled Nginx modules, smaller still

Alpine is popular for production because of its tiny size. The tradeoff: it uses musl libc instead of glibc, which can cause subtle compatibility issues with software built or tested against glibc. For most things it’s fine; for others it means a frustrating debugging session that traces back to a C library mismatch.

The container lifecycle

An image is static. A container is alive — and it has a lifecycle.

When you run docker run, the container doesn’t just appear in a running state. It goes through a series of states:

Created → Running → (Paused) → Stopped → Removed

Created: the container exists but hasn’t started yet. You can do this explicitly with docker create, or it’s an implicit step inside docker run.

Running: the main process is executing. The container consumes CPU, RAM, and filesystem resources.

Paused: every process in the container is frozen. On Linux, Docker does this with the cgroup freezer rather than by sending SIGSTOP, so the processes can’t observe or intercept the suspension. The container keeps its memory but does no work until you resume it. The commands are docker pause and docker unpause. Rarely used, but handy for debugging: freeze a container in place and inspect its state.

Stopped: the main process has exited. The container still exists as a stopped entity (docker ps -a lists it as Exited): its filesystem is preserved and its logs are still accessible. You can restart it with docker start.

Removed: the container is gone. Filesystem, logs, everything. You can’t bring it back.

# Walk through the lifecycle manually
docker create --name demo nginx          # Created
docker start demo                        # Running
docker pause demo                        # Paused
docker unpause demo                      # Running again
docker stop demo                         # Stopped (image's stop signal, SIGTERM by default → SIGKILL after 10s)
docker rm demo                           # Removed

Or skip the ceremony and do everything at once:

docker run --rm nginx    # Creates, starts, and removes when done
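
The lifecycle above can also be written down as a small state table. This is a deliberately simplified model with a hypothetical next_state function: real Docker has extra states (such as restarting and dead) and edge cases this sketch ignores. It uses exited for the stopped state, matching what docker inspect --format '{{.State.Status}}' reports.

```shell
#!/bin/sh
# next_state CURRENT COMMAND
# Prints the state a container moves to, or returns 1 if this simplified
# model does not allow the transition. Illustration only.
next_state() {
  case "$1:$2" in
    none:create)          echo created ;;
    created:start)        echo running ;;
    running:pause)        echo paused ;;
    paused:unpause)       echo running ;;
    running:stop)         echo exited ;;
    exited:start)         echo running ;;
    created:rm|exited:rm) echo removed ;;
    *) echo "cannot $2 a container in state $1" >&2; return 1 ;;
  esac
}

# Replay the manual walkthrough: create, start, pause, unpause, stop, rm.
state=none
for cmd in create start pause unpause stop rm; do
  state=$(next_state "$state" "$cmd")
  echo "$cmd -> $state"
done
```

Notice that rm is only allowed from created or exited in this model. That mirrors the real CLI, where removing a running container requires docker rm -f.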

The writable layer: why containers don’t modify images

Here’s a key detail that confuses people. If images are read-only, how can a container write files?

When Docker creates a container from an image, it adds a thin writable layer on top of all the read-only image layers. This is where everything the container writes goes — log files, temporary data, any apt-get install you do inside the container. The image layers beneath are never touched.

┌─────────────────────────────┐
│   Container writable layer  │  ← changes here only
├─────────────────────────────┤
│   Image layer 4 (app.py)    │  read-only
├─────────────────────────────┤
│   Image layer 3 (Flask)     │  read-only
├─────────────────────────────┤
│   Image layer 2 (Python)    │  read-only
├─────────────────────────────┤
│   Image layer 1 (Ubuntu)    │  read-only
└─────────────────────────────┘

When the container is removed, the writable layer disappears with it. The image is exactly as it was. Start another container from the same image and you get a fresh writable layer — a clean slate.

This is what makes containers ephemeral by design. They’re not meant to be permanent homes for data. Anything you want to persist beyond a container’s lifetime goes in a volume — but that’s a topic for later in the course.
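
Here is a rough simulation of that behavior using plain directories. It is a sketch under stated assumptions: one directory plays the read-only image, one directory per container plays the writable layer, and the read_file and write_file helpers are invented for this example. Real Docker does this at the filesystem level with OverlayFS.

```shell
#!/bin/sh
set -eu
root=$(mktemp -d)
mkdir -p "$root/image" "$root/c1" "$root/c2"
echo "hello from the image" > "$root/image/index.html"

# read_file CONTAINER NAME: look in the container's own layer first,
# then fall back to the shared read-only image layer.
read_file() {
  if [ -f "$root/$1/$2" ]; then
    cat "$root/$1/$2"
  else
    cat "$root/image/$2"
  fi
}

# write_file CONTAINER NAME CONTENT: writes never touch the image directory.
write_file() {
  echo "$3" > "$root/$1/$2"
}

write_file c1 index.html "modified in c1"
from_c1=$(read_file c1 index.html)
from_c2=$(read_file c2 index.html)
echo "c1 sees: $from_c1"
echo "c2 sees: $from_c2"

# "Removing" c1 discards its layer; the image is unchanged.
rm -rf "$root/c1"
image_after=$(cat "$root/image/index.html")
echo "image still has: $image_after"

rm -rf "$root"
```

On a real container, docker diff lists what has changed in the writable layer, which is a quick way to see this in action.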

Ten containers, one image

Let’s make this tangible. You can have ten containers running from a single image simultaneously:

# Start ten nginx containers, each on a different port
for i in $(seq 1 10); do
  docker run -d -p "$((8080 + i)):80" --name "nginx-${i}" nginx
done

docker ps

CONTAINER ID   IMAGE   COMMAND                  PORTS                  NAMES
a1b2c3d4e5f6   nginx   "/docker-entrypoint.…"   0.0.0.0:8081->80/tcp   nginx-1
b2c3d4e5f6a7   nginx   "/docker-entrypoint.…"   0.0.0.0:8082->80/tcp   nginx-2
...

Each one gets its own isolated filesystem and process space. Writing to the filesystem of nginx-1 doesn’t affect nginx-2. They all share the read-only Nginx image layers. On disk, the overhead of ten containers is ten tiny writable layers, not ten full copies of Nginx.
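
A quick back-of-the-envelope calculation shows the difference. It uses the 192MB size that the docker images listing later in this lesson reports for nginx, and assumes (purely for illustration) that each container’s writable layer holds about 1MB.

```shell
#!/bin/sh
image_mb=192      # nginx image size from the docker images listing
writable_mb=1     # assumed size of each container's writable layer
containers=10

# Shared layers: one copy of the image plus one writable layer per container.
shared_total=$((image_mb + containers * writable_mb))
# Hypothetical alternative: every container carries its own full copy.
copied_total=$((containers * (image_mb + writable_mb)))

echo "shared layers: ${shared_total} MB"
echo "full copies:   ${copied_total} MB"
```

To see the real numbers on your machine, docker ps -s adds a size column showing each container’s writable layer.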

Clean up when done:

for i in $(seq 1 10); do
  docker stop "nginx-${i}" && docker rm "nginx-${i}"
done

Or more directly:

# Force-removes every container whose name contains "nginx-", running or not
docker rm -f $(docker ps -aq --filter "name=nginx-")

Pulling, listing, and removing images

A few commands you’ll use constantly:

# Download an image without running it
docker pull node:22-alpine

# List all locally available images
docker images

REPOSITORY   TAG          IMAGE ID       CREATED        SIZE
nginx        latest       a7be6198544f   2 weeks ago    192MB
node         22-alpine    f7d2a4e85d2c   3 weeks ago    141MB
ubuntu       22.04        3db8720ecbf5   4 weeks ago    77.9MB
python       3.12         b3a18c9e2f1d   3 weeks ago    1.02GB

# Remove an image (only works if no containers — running or stopped — reference it)
docker rmi nginx

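# Force remove: untags and removes the image even if stopped containers still reference it.
# It does not remove those containers, and it still fails if a running container uses the image.
docker rmi -f nginx

# Remove dangling images (untagged leftovers from rebuilds and re-pulls)
docker image prune
docker image prune -a    # Also removes every image not used by any container (more aggressive)

⚠️ Without -f, you can’t remove an image while a container (even a stopped one) references it. Remove the container first, then the image.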

Naming containers: don’t skip this

If you don’t name your containers, Docker assigns random adjective-noun combinations (ecstatic_hopper, confident_kepler). Endearing, but useless for anything beyond a quick experiment.

# Always name your containers
docker run -d --name web-server nginx
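docker run -d --name api-backend node:22-alpine node app.js    # needs an image that actually contains app.js
docker run -d --name db-primary -e POSTGRES_PASSWORD=change-me postgres:16    # the postgres image won't start without a password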

Named containers are easier to reference in every subsequent command:

docker logs web-server
docker exec -it web-server bash
docker stop web-server

And when you use Docker Compose (coming later), names matter even more, because services discover each other by service name over Docker’s internal network.

The image-container relationship is the foundation of everything Docker does. Layers explain the cache behavior, the efficiency of pulling images, and why modifying a container doesn’t affect others sharing the same image. The lifecycle explains why containers are designed to be disposable — and why data persistence requires a different mechanism.

In the next lesson we’ll go hands-on with interactive containers and exec: running shells inside containers, copying files in and out, and understanding when to use -it vs -d and why the difference matters.

Never stop coding!

💡 Challenge: Pull three different images (nginx, node:22-alpine, python:3.12-slim). Create two containers from nginx, stop one, remove the other. View the logs of the remaining container. Check what layers nginx and python:3.12-slim have in common using docker image inspect. Clean everything up with docker system prune (add -a if you also want the pulled images removed).
