40+ Docker Interview Questions 2025: Images, Dockerfile & Multi-Stage Builds

22 min read
docker · containers · devops · dockerfile · docker-compose · interview-preparation

Docker has become essential knowledge for developers. Whether you're deploying applications, setting up local development environments, or working with CI/CD pipelines, interviewers expect you to understand containerization fundamentals.

This guide covers the Docker questions that actually come up in interviews—from basic concepts to production-ready Dockerfile patterns.

Table of Contents

  1. Docker Fundamentals Questions
  2. Image and Container Questions
  3. Dockerfile CMD vs ENTRYPOINT Questions
  4. Layer Caching Questions
  5. Multi-Stage Build Questions
  6. COPY vs ADD Questions
  7. Docker Compose Questions
  8. Volume and Bind Mount Questions
  9. Docker Networking Questions
  10. Environment Variables and Secrets Questions
  11. Docker Commands Questions
  12. Dockerfile Best Practices Questions

Docker Fundamentals Questions

Understanding Docker fundamentals is essential for any developer interview involving containerization.

What is Docker and why is it important for modern development?

Docker is a platform that packages applications and their dependencies into standardized units called containers. Unlike virtual machines that virtualize hardware, Docker containers share the host operating system kernel, making them lightweight and fast to start.

The importance of Docker stems from solving the "works on my machine" problem. By packaging everything an application needs—code, runtime, libraries, and system tools—into a container, developers ensure consistent behavior across development, testing, and production environments.

Key benefits:

  • Consistency: Same container runs identically everywhere
  • Isolation: Applications don't interfere with each other
  • Portability: Move containers between any Docker host
  • Efficiency: Containers share OS kernel, using fewer resources than VMs

What is the difference between Docker and a virtual machine?

Docker containers and virtual machines both provide isolation, but they achieve it through fundamentally different approaches. Virtual machines include a full operating system with its own kernel, running on a hypervisor that virtualizes hardware. Docker containers share the host's kernel and only package the application layer.

This architectural difference has significant implications. VMs typically use gigabytes of memory and take minutes to boot because they load an entire OS. Containers use megabytes and start in seconds because they only initialize the application process.

| Aspect | Docker Container | Virtual Machine |
| --- | --- | --- |
| Size | MBs | GBs |
| Startup | Seconds | Minutes |
| Resource usage | Shares host kernel | Full OS per VM |
| Isolation | Process-level | Hardware-level |
| Portability | High (any Docker host) | Lower (hypervisor dependent) |

Image and Container Questions

The distinction between images and containers is foundational to understanding Docker.

What is the difference between a Docker image and a container?

This is often the first Docker question in an interview. Many candidates give vague answers like "an image is a template and a container is running," but interviewers expect more depth.

An image is a read-only, layered filesystem containing everything needed to run an application: code, runtime, libraries, environment variables, and configuration. Images are built from Dockerfiles and can be shared via registries like Docker Hub. Because images are immutable, they provide reproducibility—the same image produces the same behavior every time.

A container is a running instance of an image. When you start a container, Docker adds a thin writable layer on top of the image layers using a copy-on-write mechanism. This is where runtime changes like logs, temp files, and application state are stored. The underlying image remains unchanged, allowing you to create multiple containers from one image, each with its own isolated writable layer.

# Image: the blueprint
docker pull node:20-alpine
 
# Container: running instance
docker run -d --name my-app node:20-alpine
 
# Multiple containers from same image
docker run -d --name my-app-2 node:20-alpine
docker run -d --name my-app-3 node:20-alpine
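
A quick way to see the writable layer in action is docker diff, which lists the files a container has added (A), changed (C), or deleted (D) on top of its image:

# Show filesystem changes in the container's writable layer
docker diff my-app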

What are Docker image layers and how do they work?

Docker images are built as a series of layers, where each layer represents an instruction in the Dockerfile. This layered architecture enables efficient storage and transfer because layers can be shared between images. If two images use the same base layer, Docker stores it only once.

When Docker builds an image, it creates a read-only layer for each instruction. These layers stack on top of each other. The union filesystem makes them appear as a single coherent filesystem to the container.

Layer creation example:

FROM node:20-alpine     # Layer 1: base image
WORKDIR /app            # Layer 2: sets working directory
COPY package.json ./    # Layer 3: copies package.json
RUN npm install         # Layer 4: installs dependencies
COPY . .                # Layer 5: copies source code
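
Once an image is built, docker history shows its layers along with the instruction and size that produced each one:

# Inspect the layers of an image
docker history node:20-alpine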

What is the Docker build context?

The build context is the set of files and directories that Docker sends to the daemon when building an image. When you run docker build ., the dot specifies the current directory as the build context. Docker packages everything in that directory and sends it to the daemon.

Understanding build context matters for two reasons. First, a large build context slows down builds because all files must be transferred. Second, only files within the build context can be accessed by COPY and ADD instructions—you cannot copy files from outside the context.

The .dockerignore file excludes files from the build context, similar to .gitignore. This speeds up builds and prevents accidentally including sensitive files or large directories like node_modules.

# .dockerignore
node_modules
.git
*.log
.env
dist

Dockerfile CMD vs ENTRYPOINT Questions

Understanding the difference between CMD and ENTRYPOINT is crucial for writing flexible Dockerfiles.

What is the difference between CMD and ENTRYPOINT?

This trips up many developers because both seem to "run a command." The key difference lies in how they handle arguments passed at runtime.

CMD provides default arguments that can be completely overridden when running the container. If you specify a command after docker run, it replaces CMD entirely.

FROM node:20-alpine
CMD ["npm", "start"]
# Uses CMD default
docker run my-app
 
# Overrides CMD entirely - runs npm test instead
docker run my-app npm test

ENTRYPOINT defines the main executable that always runs. Arguments passed to docker run are appended to ENTRYPOINT rather than replacing it.

FROM node:20-alpine
ENTRYPOINT ["node"]
CMD ["app.js"]
# Runs: node app.js
docker run my-app
 
# Runs: node server.js (CMD overridden, ENTRYPOINT stays)
docker run my-app server.js

What is the best practice for combining CMD and ENTRYPOINT?

The recommended pattern uses ENTRYPOINT for the main executable and CMD for default arguments. This gives users flexibility to change arguments while keeping the core process consistent.

ENTRYPOINT ["node"]
CMD ["app.js"]

This lets users change the script file while keeping node as the process. For production applications, you might use an entrypoint script that handles initialization:

ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["start"]

The entrypoint script can perform tasks like running database migrations, validating environment variables, or setting up configuration before executing the CMD.
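
A minimal sketch of such an entrypoint script (the migration step and the dist/index.js path are illustrative placeholders); the key detail is using exec so the final process replaces the shell and receives signals as PID 1:

#!/bin/sh
set -e
 
# Fail fast if required configuration is missing
: "${DATABASE_URL:?DATABASE_URL must be set}"
 
# Placeholder for initialization work such as migrations
# npm run migrate
 
# Translate the default CMD ("start") into the real process
if [ "$1" = "start" ]; then
  exec node dist/index.js
fi
 
# Otherwise run whatever command was passed (e.g. "sh" for debugging)
exec "$@"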

What is the difference between shell form and exec form?

Dockerfile instructions like CMD, ENTRYPOINT, and RUN can be written in two forms. The exec form uses JSON array syntax and executes commands directly. The shell form uses a plain string and runs commands through a shell.

Exec form is preferred because it doesn't invoke a shell, which means the process receives signals properly and PID 1 is your application, not /bin/sh. This matters for graceful shutdown handling.

# Exec form (preferred) - runs node directly as PID 1
CMD ["node", "app.js"]
 
# Shell form - runs /bin/sh -c "node app.js"
CMD node app.js

The shell form is useful when you need shell features like variable expansion or piping, but for most cases exec form provides cleaner signal handling.
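
For example, environment-variable expansion only happens when a shell is involved (PORT here is a hypothetical runtime variable):

# Shell form: /bin/sh expands $PORT when the container starts
CMD node app.js --port $PORT
 
# Exec form has no shell, so invoke one explicitly if you need expansion
CMD ["sh", "-c", "node app.js --port $PORT"]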


Layer Caching Questions

Understanding layer caching separates junior from senior Docker users.

How does Docker layer caching work?

Docker executes each Dockerfile instruction and caches the result as a layer. On subsequent builds, if an instruction and all previous layers haven't changed, Docker reuses the cached layer instead of executing the instruction again.

The cache invalidation rule is critical: when any layer changes, all subsequent layers must rebuild. Docker cannot skip to a later layer if an earlier one changed.
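
Two standard docker build flags help when working with the cache: --no-cache forces a full rebuild, and --progress=plain prints each step so you can see which ones were cached:

# Rebuild every layer, ignoring the cache
docker build --no-cache -t myapp .
 
# Verbose output showing which steps were CACHED
docker build --progress=plain -t myapp .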

How do you optimize a Dockerfile for layer caching?

The key principle is ordering instructions from least frequently changed to most frequently changed. This maximizes cache hits and minimizes rebuild time.

A common mistake is copying all source files before installing dependencies:

# Bad: cache busted on every code change
FROM node:20-alpine
WORKDIR /app
COPY . .
RUN npm install
CMD ["npm", "start"]

Every code change copies new files, invalidating the npm install cache. The optimized approach separates dependency installation from source code:

# Optimized: dependencies cached separately
FROM node:20-alpine
WORKDIR /app
 
# Dependencies change less often than code
COPY package*.json ./
RUN npm ci
 
# Code changes frequently - only this layer rebuilds
COPY . .
 
CMD ["npm", "start"]

Now npm ci only reruns when package.json or package-lock.json change.

Why should you use npm ci instead of npm install in Docker?

The npm ci command is designed for automated environments like Docker builds and CI pipelines. It provides reproducible builds by installing exact versions from package-lock.json rather than resolving versions from package.json.

Unlike npm install, npm ci removes existing node_modules before installing, ensuring a clean state. It also fails if the lock file is out of sync with package.json, catching dependency issues early.

# Reproducible, clean dependency installation
COPY package*.json ./
RUN npm ci

Multi-Stage Build Questions

Multi-stage builds are essential for production-ready images.

What are multi-stage builds and why use them?

Multi-stage builds use multiple FROM statements in a single Dockerfile, allowing you to use full build toolchains without shipping them in your final image. Each FROM instruction starts a new stage, and you can copy artifacts from earlier stages into later ones.

The problem with single-stage builds is image bloat. A typical Node.js image with build tools, dev dependencies, and source code might be over a gigabyte:

# Single-stage: 1.2GB image
FROM node:20
WORKDIR /app
COPY . .
RUN npm ci && npm run build
CMD ["node", "dist/index.js"]

Multi-stage builds separate the build environment from the runtime environment:

# Stage 1: Build
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
 
# Stage 2: Production (150MB)
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/index.js"]

How do you create minimal production images with multi-stage builds?

For the smallest possible images, install only production dependencies in the final stage rather than copying all node_modules from the build stage:

# Stage 1: Build
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
 
# Stage 2: Production deps only (80MB)
FROM node:20-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/index.js"]

This approach eliminates build tools, dev dependencies, and source code from the production image, reducing attack surface and deployment time.
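
Because stages are named, docker build --target lets you build up to a specific stage, for example to produce just the builder stage for running tests:

# Build only the "builder" stage from the Dockerfile above
docker build --target builder -t myapp:build .
 
# Build the final "production" stage (also the default result)
docker build --target production -t myapp:prod .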

What are the security benefits of multi-stage builds?

Multi-stage builds improve security by reducing the attack surface of production images. Fewer packages mean fewer potential vulnerabilities. Build tools, compilers, and dev dependencies that aren't needed at runtime don't exist in the final image.

Additionally, source code doesn't ship in production images—only compiled artifacts. This prevents accidental exposure of sensitive code or configuration that might be in the source tree.


COPY vs ADD Questions

Understanding when to use COPY versus ADD demonstrates attention to best practices.

What is the difference between COPY and ADD in a Dockerfile?

COPY does exactly what it says—copies files from the build context to the image. It's straightforward and predictable.

COPY package.json ./
COPY src/ ./src/

ADD has two extra features beyond COPY. First, it automatically extracts tar archives. Second, it can download files from URLs, though this isn't recommended.

# Extracts the tar into /app
ADD app.tar.gz /app/
 
# Downloads file (avoid this - not cached, no checksum)
ADD https://example.com/file.txt /app/

When should you use COPY versus ADD?

Best practice is to always use COPY unless you specifically need tar extraction. COPY is more explicit and its behavior is predictable—it only copies files.

The URL download feature of ADD is problematic because it doesn't cache well and provides no way to verify checksums. For downloads, use curl or wget in a RUN instruction where you can verify integrity:

# Better than ADD for downloads
RUN curl -fsSL https://example.com/file.txt -o /app/file.txt \
    && echo "expected-checksum /app/file.txt" | sha256sum -c -

Docker Compose Questions

Docker Compose is essential for local development environments.

What is Docker Compose and when would you use it?

Docker Compose defines multi-container applications in a single YAML file. Instead of running multiple docker run commands with complex options, you declare your entire application stack and start it with one command.

The primary use case is local development environments that mirror production. A typical web application needs an app server, database, and cache—Compose orchestrates all three with proper networking and volume mounts.

# docker-compose.yml
version: '3.8'
 
services:
  app:
    build: .
    ports:
      - "3000:3000"
    volumes:
      - .:/app
      - /app/node_modules
    environment:
      - NODE_ENV=development
      - DATABASE_URL=postgres://user:pass@db:5432/myapp
    depends_on:
      - db
      - redis
 
  db:
    image: postgres:15-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=myapp
 
  redis:
    image: redis:7-alpine
 
volumes:
  postgres_data:

How does service discovery work in Docker Compose?

Docker Compose creates a default network for each project, and services can reach each other by service name. In the example above, the app service connects to the database using db:5432 because Compose's internal DNS resolves db to the database container's IP.

This automatic service discovery eliminates hardcoded IP addresses. You reference other services by their name in the compose file, and Docker handles the networking.
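
To see this resolution from inside a running container, you can look up the service name with tools already present in the image, for example Node's dns module (assuming the services are named app and db as above):

# Resolve the "db" service name from inside the app container
docker-compose exec app node -e "require('dns').lookup('db', (err, addr) => console.log(err || addr))"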

# Start all services
docker-compose up -d
 
# View logs
docker-compose logs -f app
 
# Rebuild after Dockerfile changes
docker-compose up -d --build
 
# Clean shutdown
docker-compose down
 
# Remove volumes too
docker-compose down -v

What is the purpose of depends_on in Docker Compose?

The depends_on option controls startup order—Compose starts dependencies before the dependent service. However, it only waits for containers to start, not for applications inside them to be ready.

For example, depends_on: db ensures the database container starts before the app container, but it doesn't wait for PostgreSQL to accept connections. For true readiness checking, use healthchecks:

services:
  db:
    image: postgres:15-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 5s
      timeout: 5s
      retries: 5
 
  app:
    depends_on:
      db:
        condition: service_healthy

Volume and Bind Mount Questions

Understanding data persistence options is crucial for production Docker usage.

What is the difference between volumes and bind mounts?

Both persist data outside the container's writable layer, but they work differently and serve different purposes.

Bind mounts link a host path directly to a container path. The container sees the host filesystem at that location. This is ideal for development because file changes on the host immediately appear in the container.

docker run -v /host/path:/container/path myapp
# or explicitly
docker run --mount type=bind,source=/host/path,target=/container/path myapp

Volumes are managed by Docker and stored in Docker's storage area (/var/lib/docker/volumes/). They're portable between hosts, easier to backup, and can use volume drivers for cloud storage.

docker volume create mydata
docker run -v mydata:/container/path myapp
# or explicitly
docker run --mount type=volume,source=mydata,target=/container/path myapp
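
Because volumes live in Docker's storage area rather than at an arbitrary host path, a common way to back one up is to mount it into a throwaway container alongside a bind mount and tar its contents—a minimal sketch using the mydata volume from above:

# Archive the "mydata" volume into ./backup on the host
docker run --rm -v mydata:/data -v "$(pwd)/backup:/backup" alpine \
  tar czf /backup/mydata.tar.gz -C /data .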

When should you use volumes versus bind mounts?

The choice depends on your use case. Bind mounts excel in development where you need live code synchronization. Volumes are better for production data that needs to persist and potentially move between hosts.

| Use Case | Choice |
| --- | --- |
| Development (code sync) | Bind mount |
| Database data | Volume |
| Shared config files | Bind mount |
| Production data persistence | Volume |
| CI/CD caching | Volume |

What is the anonymous volume pattern in Docker Compose?

When using bind mounts for development, you often need to prevent the host's node_modules from overwriting the container's. The anonymous volume pattern preserves the container's directory:

volumes:
  - .:/app              # Bind mount for code sync
  - /app/node_modules   # Anonymous volume preserves container's modules

The second mount creates a volume at /app/node_modules inside the container, which takes precedence over the bind mount. This ensures the container uses its own installed dependencies while still syncing source code from the host.


Docker Networking Questions

Container networking is fundamental to multi-service architectures.

How do containers communicate with each other?

Docker provides several network modes, each suited to different scenarios. The default bridge network allows containers to communicate using IP addresses, but custom bridge networks enable DNS-based service discovery.

When containers join the same custom network, they can reach each other by container name. Docker's embedded DNS server resolves names to container IP addresses.

# Create custom network
docker network create mynet
 
# Containers can reach each other by name
docker run -d --name api --network mynet myapi
docker run -d --name web --network mynet myweb
 
# From 'web', can reach: http://api:3000

What are the different Docker network types?

Docker supports several network drivers, each providing different isolation and connectivity characteristics.

Bridge is the default for standalone containers. Containers on the same bridge can communicate. Custom bridge networks add DNS resolution by container name.

Host removes network isolation—the container shares the host's network stack. No port mapping needed, but no isolation either. Useful for performance-sensitive applications.

docker run --network host myapp
# App on port 3000 is directly on host:3000

None disables networking entirely, providing complete isolation for security-sensitive workloads.

Overlay enables communication across multiple Docker hosts, used in Swarm mode for distributed applications.
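
You can list the networks on a host and inspect one to see its driver, subnet, and attached containers:

# List networks and their drivers (bridge, host, none, overlay, ...)
docker network ls
 
# Show subnet, gateway, and connected containers
docker network inspect mynet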

How does port mapping work in Docker?

Port mapping exposes container ports to the host, allowing external access. The format is host-port:container-port. The container listens on its port, and Docker forwards traffic from the host port.

# Map host port 8080 to container port 3000
docker run -p 8080:3000 myapp
 
# Map to specific interface
docker run -p 127.0.0.1:8080:3000 myapp
 
# Random host port
docker run -p 3000 myapp

A common troubleshooting issue: if you can't connect to a mapped port, verify the application inside the container binds to 0.0.0.0, not localhost. Binding to localhost inside a container means the app only accepts connections from within that container.


Environment Variables and Secrets Questions

Configuration management is critical for containerized applications.

How do you pass environment variables to Docker containers?

Environment variables can be set at build time in the Dockerfile or at runtime when starting containers. Runtime variables are more flexible and should be used for configuration that varies between environments.

# Default in Dockerfile
ENV NODE_ENV=production
ENV PORT=3000
# Override at runtime
docker run -e NODE_ENV=development -e PORT=8080 myapp
 
# From file
docker run --env-file .env myapp

In Docker Compose, environment variables can come from the compose file, an env_file, or the host environment:

services:
  app:
    environment:
      - NODE_ENV=development
    env_file:
      - .env

How do you handle secrets securely in Docker?

Secrets should never be baked into images or passed as regular environment variables in production. Even if you delete secrets in a later layer, they exist in earlier layers and can be extracted.

For development, .env files with Docker Compose are acceptable. For production, use dedicated secret management:

Docker Swarm secrets:

echo "mysecret" | docker secret create db_password -

Mount from secret manager:

volumes:
  - /run/secrets/db_password:/run/secrets/db_password:ro

Inject at runtime through an orchestrator or secret manager such as Kubernetes Secrets, AWS Secrets Manager, or HashiCorp Vault.
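
For secrets needed only at build time (such as a private registry token), BuildKit secret mounts expose a file to a single RUN instruction without storing it in any layer—a sketch assuming the token lives in a local .npmrc file:

# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# The secret is available only during this RUN step and never written to a layer
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci
 
# Build, supplying the secret from the local file
docker build --secret id=npmrc,src=.npmrc -t myapp .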

What is the ARG instruction and how does it differ from ENV?

ARG defines build-time variables that are only available during image building. They don't persist in the final image. ENV sets environment variables that persist in the image and are available to running containers.

# ARG: only available during build
ARG NODE_VERSION=20
 
FROM node:${NODE_VERSION}-alpine
 
# ENV: persists in image and container
ENV NODE_ENV=production

Use ARG for build configuration like version numbers. Use ENV for runtime configuration. You can pass ARG values at build time:

docker build --build-arg NODE_VERSION=18 -t myapp .

Docker Commands Questions

Interviewers often test your practical Docker command knowledge.

What are the essential Docker commands for managing images?

Understanding the image lifecycle commands demonstrates practical Docker experience. Images are built from Dockerfiles, can be tagged for organization, pushed to registries for sharing, and removed when no longer needed.

# Build an image with tag
docker build -t myapp:v1 .
 
# List local images
docker images
 
# Remove an image
docker rmi myapp:v1
 
# Remove unused images
docker image prune
 
# Pull from registry
docker pull node:20-alpine
 
# Push to registry
docker push myregistry/myapp:v1

What are the essential Docker commands for managing containers?

Container management involves the full lifecycle: creating, starting, stopping, inspecting, and removing containers. The distinction between create, start, and run is important—run combines create and start.

# Run container (create + start)
docker run -d -p 3000:3000 --name app myapp
 
# List running containers
docker ps
 
# List all containers (including stopped)
docker ps -a
 
# Stop container
docker stop app
 
# Remove container
docker rm app
 
# View logs
docker logs app
 
# Follow logs in real-time
docker logs -f app

How do you debug a running Docker container?

Debugging containers requires different approaches than debugging local applications. The exec command lets you run commands inside a running container, including opening an interactive shell.

# Shell into container
docker exec -it app sh
 
# Run specific command
docker exec app cat /app/config.json
 
# View full container details
docker inspect app
 
# Monitor resource usage
docker stats
 
# View processes in container
docker top app
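
Since docker inspect prints a large JSON document, its --format (-f) flag is handy for pulling out specific fields:

# Exit code and error message from the container's last run
docker inspect -f '{{.State.ExitCode}} {{.State.Error}}' app
 
# IP addresses on the container's networks
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' app
 
# Published port mappings
docker port app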

Common troubleshooting scenarios:

  • Container exits immediately? Check docker logs and ensure the process runs in foreground
  • Can't connect to port? Verify port mapping with docker ps, check if app binds to 0.0.0.0
  • Build slow? Check .dockerignore, optimize layer order
  • Image too large? Use multi-stage builds, alpine base images

Dockerfile Best Practices Questions

Following best practices demonstrates production experience.

What are the most important Dockerfile best practices?

Writing production-ready Dockerfiles requires attention to security, performance, and maintainability. These practices represent industry consensus on building reliable container images.

| Practice | Example |
| --- | --- |
| Use specific base image tags | node:20-alpine, not node:latest |
| Run as non-root user | USER node |
| Use multi-stage builds | Separate build and runtime stages |
| Order for cache efficiency | Dependencies before source code |
| Combine RUN commands | Reduce layers with && |
| Use .dockerignore | Exclude node_modules, .git, logs |
| Prefer COPY over ADD | Unless you need tar extraction |
| Set explicit WORKDIR | WORKDIR /app |
| Include health checks | HEALTHCHECK CMD curl -f http://localhost/health |

Why should you avoid using the latest tag?

The latest tag seems convenient but creates reproducibility problems. When you build with FROM node:latest, you get whatever version is current at build time. Two builds on different days might produce different images with different Node.js versions.

Using specific tags like node:20.10-alpine ensures reproducible builds. You control when to upgrade dependencies by explicitly changing the tag.

# Bad: unpredictable version
FROM node:latest
 
# Good: explicit, reproducible
FROM node:20.10-alpine

How do you run containers as non-root users?

By default, processes in Docker containers run as root. This is a security risk—if an attacker escapes the container, they have root access to the host.

Best practice is to create and switch to a non-root user. Many official images include a non-root user you can use:

FROM node:20-alpine
 
WORKDIR /app
COPY --chown=node:node . .
 
# Switch to non-root user
USER node
 
CMD ["node", "app.js"]

The --chown flag ensures copied files are owned by the non-root user, preventing permission issues.
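
For base images that don't ship a non-root user, you can create one yourself—an illustrative sketch using Alpine's addgroup and adduser (the user, group, and binary names are arbitrary):

FROM alpine:3.20
 
# Create an unprivileged group and user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
 
WORKDIR /app
COPY --chown=appuser:appgroup . .
 
USER appuser
# Placeholder for the application's start command
CMD ["./myapp"]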

How do you implement health checks in Docker?

Health checks let Docker monitor whether your application is functioning correctly, not just whether the process is running. Docker periodically executes the health check command and marks containers as healthy or unhealthy.

FROM node:20-alpine
 
# Alpine-based images don't include curl, so install it for the health check
RUN apk add --no-cache curl
 
WORKDIR /app
COPY . .
 
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1
 
CMD ["node", "app.js"]

Orchestrators like Kubernetes and Docker Swarm use health status to make decisions about container replacement and load balancing.


Quick Reference

| Topic | Key Point |
| --- | --- |
| Image vs Container | Image is a read-only template; a container is a running instance with a writable layer |
| CMD vs ENTRYPOINT | CMD provides overridable defaults; ENTRYPOINT defines the fixed executable |
| Layer caching | Order Dockerfile from least to most frequently changed |
| Multi-stage | Build in one stage, run in another for smaller images |
| COPY vs ADD | Use COPY; ADD only for tar extraction |
| Volumes vs Bind mounts | Volumes for production data; bind mounts for development |
| Networking | Custom bridge networks enable DNS by container name |
| Secrets | Never bake into images; use secret managers in production |
