Episode 6 — Scaling Reliability Microservices Web3 / 6.3 — AWS Cloud Native Deployment

6.3.a — ECR and Container Images

In one sentence: Amazon Elastic Container Registry (ECR) is a fully managed Docker registry where you store, manage, and deploy container images — the first step in any AWS container deployment pipeline.



1. What Is Amazon ECR?

Amazon Elastic Container Registry (ECR) is AWS's managed container image registry. Think of it as a private Docker Hub that lives inside your AWS account. Instead of pushing images to Docker Hub (public, shared), you push to ECR (private, integrated with AWS IAM, encrypted at rest).

┌───────────────────────────────────────────────┐
│                Docker Hub                      │
│    Public registry — anyone can pull           │
│    docker pull nginx:latest                    │
└───────────────────────────────────────────────┘

┌───────────────────────────────────────────────┐
│                Amazon ECR                      │
│    Private registry — IAM-controlled access    │
│    docker pull 123456789012.dkr.ecr.us-east-1 │
│                .amazonaws.com/my-app:v1.2      │
└───────────────────────────────────────────────┘

Why ECR over Docker Hub?

Feature                 Docker Hub (Free)     Amazon ECR
----------------------  --------------------  ------------------------------------
Privacy                 Public by default     Private by default
Pull limits             100 pulls / 6 hours   Unlimited within AWS
Auth                    Docker credentials    IAM roles (no passwords in CI/CD)
Encryption              Not at rest           AES-256 at rest by default
Vulnerability scanning  Basic (paid)          Built-in with Amazon Inspector
Integration             Manual                Native ECS, EKS, Lambda integration
Lifecycle policies      Manual cleanup        Automated image expiration
Cost                    Free (with limits)    ~$0.10/GB/month storage

2. ECR Concepts

Repositories

A repository in ECR holds images for a single application or service. One repository per microservice is the standard pattern.

ECR Account (123456789012)
├── user-service          ← repository
│   ├── user-service:v1.0
│   ├── user-service:v1.1
│   └── user-service:latest
├── order-service         ← repository
│   ├── order-service:v2.3
│   └── order-service:latest
└── api-gateway           ← repository
    └── api-gateway:v1.0

Image Tags

Tags identify specific versions of an image within a repository. Best practices:

  • Never rely solely on latest — it's mutable and you can't tell which version is running
  • Use semantic versions: v1.2.3
  • Use git SHA: abc1234 — ties the image to a specific commit
  • Use both: v1.2.3 + abc1234 on the same image
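The version-plus-SHA convention above can be sketched as a small shell helper. This is only a sketch under stated assumptions: the function name image_tags is hypothetical (it is not part of Docker or the AWS CLI), versions are assumed to look like v1.2.3, and a 7-character short SHA is assumed to be enough to identify a commit.

```shell
# image_tags <version> [git-sha]
# Prints the tags to apply to one image, one per line:
# the semantic version first, then the short commit SHA.
# (Hypothetical helper, for illustration only.)
image_tags() {
  local version="$1"
  # Without a second argument, ask git for the current commit.
  # That fallback only works inside a git checkout.
  local sha="${2:-$(git rev-parse HEAD)}"

  # Accept only vMAJOR.MINOR.PATCH so mutable tags such as "latest" are rejected.
  if ! printf '%s\n' "$version" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "version must look like v1.2.3, got: $version" >&2
    return 1
  fi

  printf '%s\n' "$version"
  printf '%.7s\n' "$sha"   # keep the first 7 characters of the SHA
}

# Possible usage: apply every tag to the same local image.
#   for tag in $(image_tags v1.2.3); do
#     docker tag user-service:v1.2.3 "user-service:${tag}"
#   done
```

Both tags then point at the same image ID, so the version tells you what was released and the SHA tells you which commit produced it.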

Image URI

Every ECR image has a fully qualified URI:

<account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>

Example:
123456789012.dkr.ecr.us-east-1.amazonaws.com/user-service:v1.2.3
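Because the URI format is fixed, it can be composed and split with plain shell string handling. The helper names below (ecr_image_uri, ecr_uri_repository, ecr_uri_tag) are assumptions made for this sketch, not AWS tooling.

```shell
# ecr_image_uri <account-id> <region> <repository> <tag>
# Prints the fully qualified image URI. (Hypothetical helper.)
ecr_image_uri() {
  local account_id="$1" region="$2" repository="$3" tag="$4"
  # AWS account IDs are exactly 12 digits; fail early on anything else.
  case "$account_id" in
    *[!0-9]*) echo "account id must be numeric: $account_id" >&2; return 1 ;;
  esac
  if [ "${#account_id}" -ne 12 ]; then
    echo "account id must have 12 digits: $account_id" >&2
    return 1
  fi
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' \
    "$account_id" "$region" "$repository" "$tag"
}

# ecr_uri_repository <image-uri>: prints the repository name.
ecr_uri_repository() {
  local without_tag="${1%:*}"         # drop the trailing ":<tag>"
  printf '%s\n' "${without_tag#*/}"   # drop the leading "<registry-host>/"
}

# ecr_uri_tag <image-uri>: prints the tag (text after the last colon).
ecr_uri_tag() {
  printf '%s\n' "${1##*:}"
}
```

Calling ecr_image_uri 123456789012 us-east-1 user-service v1.2.3 reproduces the example URI shown above.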

3. Creating an ECR Repository

Via AWS CLI

# Create a repository for the user-service
aws ecr create-repository \
  --repository-name user-service \
  --image-scanning-configuration scanOnPush=true \
  --encryption-configuration encryptionType=AES256 \
  --region us-east-1

# Output includes the repository URI:
# 123456789012.dkr.ecr.us-east-1.amazonaws.com/user-service

Via AWS Management Console

  1. Open the ECR console
  2. Click Create repository
  3. Set name: user-service
  4. Enable Scan on push
  5. Select AES-256 encryption
  6. Click Create repository

Creating Multiple Repositories (for microservices)

# Create repos for each service in your architecture
for service in user-service order-service payment-service api-gateway; do
  aws ecr create-repository \
    --repository-name "$service" \
    --image-scanning-configuration scanOnPush=true \
    --region us-east-1
done
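The loop above fails on a second run because create-repository returns an error for a repository that already exists. One way to make it repeatable is to check first, as in this sketch. The function name ensure_repository is hypothetical, and the check treats any describe-repositories failure (including a permissions error) as "missing", which is a simplification.

```shell
# ensure_repository <name> [region]
# Creates the repository only when it does not exist yet. (Hypothetical helper.)
ensure_repository() {
  local name="$1"
  local region="${2:-us-east-1}"

  # describe-repositories exits non-zero when the repository is not found.
  if aws ecr describe-repositories \
       --repository-names "$name" \
       --region "$region" >/dev/null 2>&1; then
    echo "exists: $name"
  else
    aws ecr create-repository \
      --repository-name "$name" \
      --image-scanning-configuration scanOnPush=true \
      --region "$region" >/dev/null
    echo "created: $name"
  fi
}

# Possible usage:
#   for service in user-service order-service payment-service api-gateway; do
#     ensure_repository "$service"
#   done
```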

4. Building Docker Images (Dockerfile Best Practices for Node.js)

Before pushing to ECR, you need a well-crafted Docker image. Here is a production-ready Dockerfile for a Node.js microservice:

Basic Dockerfile

# --- Stage 1: Install dependencies ---
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# --- Stage 2: Build (if using TypeScript) ---
FROM node:20-alpine AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# --- Stage 3: Production image ---
FROM node:20-alpine AS production
WORKDIR /app

# Security: run as non-root user
RUN addgroup -g 1001 -S appgroup && \
    adduser -S appuser -u 1001 -G appgroup

# Copy only what we need
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
COPY package.json ./

# Set environment
ENV NODE_ENV=production
ENV PORT=3000

# Expose port
EXPOSE 3000

# Switch to non-root user
USER appuser

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

# Start the application
CMD ["node", "dist/server.js"]

Why Multi-Stage Builds Matter

Single-stage image:
  node:20          → 1.1 GB base
  + devDependencies → 400 MB
  + source code     → 50 MB
  = ~1.55 GB final image

Multi-stage image:
  node:20-alpine   → 180 MB base
  + production deps → 80 MB
  + compiled output → 10 MB
  = ~270 MB final image   (83% smaller!)

Smaller images mean:

  • Faster pulls — ECS starts your container faster
  • Lower ECR storage costs — less GB stored
  • Smaller attack surface — fewer packages = fewer vulnerabilities
  • Faster CI/CD — less to build, push, and pull
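The size and cost arithmetic behind these points can be checked with a few lines of shell. This sketch reuses the illustrative sizes from the comparison above and the ~$0.10/GB/month storage price quoted earlier. The function names are made up for the example, and real ECR bills depend on the compressed size of the stored layers, so treat the output as a rough estimate.

```shell
# size_reduction_percent <before-mb> <after-mb>
# Prints how much smaller the second image is, as a whole percentage.
size_reduction_percent() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (before - after) / before * 100 }'
}

# monthly_storage_usd <image-mb> <image-count> [usd-per-gb]
# Uses 1 GB = 1000 MB, matching the arithmetic in the comparison above.
monthly_storage_usd() {
  awk -v mb="$1" -v n="$2" -v rate="${3:-0.10}" \
    'BEGIN { printf "%.2f\n", mb * n / 1000 * rate }'
}

size_reduction_percent 1550 270   # prints 83
monthly_storage_usd 1550 10       # prints 1.55 (ten single-stage images)
monthly_storage_usd 270 10        # prints 0.27 (ten multi-stage images)
```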

Dockerfile Best Practices Checklist

Practice                       Why
-----------------------------  --------------------------------------------------
Use alpine base images         Roughly 5x smaller than full Debian-based images
Use npm ci (not npm install)   Deterministic installs from lockfile
Multi-stage builds             Exclude dev dependencies and build tools
COPY package*.json first       Leverage Docker layer caching
Non-root USER                  Security: limits container privileges
.dockerignore file             Exclude node_modules, .git, .env
HEALTHCHECK instruction        Docker reports container health; ECS and ALB ignore
                               it and need their own checks (task definition,
                               target group)
Pin exact base image versions  node:20.11.0-alpine not node:20-alpine
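A few of these checklist items can be checked mechanically before a build. The sketch below is a rough text-based check, not a real Dockerfile parser: the function name lint_dockerfile is hypothetical, it assumes the image reference is the second word of each FROM line (so FROM --platform=... lines are misread), and it covers only three of the practices.

```shell
# lint_dockerfile <path>
# Prints one "warn:" line per finding and returns non-zero if there were any.
# (Hypothetical helper; simple pattern matching only.)
lint_dockerfile() {
  local file="$1"
  local problems=0

  # Base images should carry an exact version such as node:20.11.0-alpine.
  if awk '/^FROM / && $2 !~ /:[0-9]+\.[0-9]+\.[0-9]+/ { found = 1 }
          END { exit !found }' "$file"; then
    echo "warn: base image not pinned to an exact version"
    problems=1
  fi

  # Without a USER instruction the container process runs as root.
  if ! grep -Eq '^USER ' "$file"; then
    echo "warn: no USER instruction, container runs as root"
    problems=1
  fi

  # npm ci installs exactly what the lockfile specifies.
  if grep -q 'npm install' "$file"; then
    echo "warn: use npm ci instead of npm install"
    problems=1
  fi

  return "$problems"
}

# Possible usage in CI:
#   lint_dockerfile Dockerfile || exit 1
```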

Essential .dockerignore

node_modules
npm-debug.log
.git
.gitignore
.env
.env.*
Dockerfile
docker-compose*.yml
.dockerignore
README.md
.vscode
coverage
.nyc_output

5. Building and Tagging Images

# Build the image
docker build -t user-service:v1.2.3 .

# Verify the image was created
docker images user-service
# REPOSITORY     TAG     IMAGE ID       SIZE
# user-service   v1.2.3  a1b2c3d4e5f6   268MB

# Tag for ECR (add the full ECR URI)
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
REGION=us-east-1
ECR_URI="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com"

docker tag user-service:v1.2.3 ${ECR_URI}/user-service:v1.2.3
docker tag user-service:v1.2.3 ${ECR_URI}/user-service:latest

# Verify tags
docker images ${ECR_URI}/user-service
# REPOSITORY                                        TAG     IMAGE ID       SIZE
# 123456789012.dkr.ecr.us-east-1.../user-service   v1.2.3  a1b2c3d4e5f6   268MB
# 123456789012.dkr.ecr.us-east-1.../user-service   latest  a1b2c3d4e5f6   268MB

6. Authenticating and Pushing to ECR

ECR uses temporary authorization tokens: you log Docker in to the registry before pushing, and each token is valid for 12 hours.

Step-by-step push workflow

# Step 1: Authenticate Docker with ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  "${ECR_URI}"

# Output: "Login Succeeded"

# Step 2: Push the image
docker push ${ECR_URI}/user-service:v1.2.3
docker push ${ECR_URI}/user-service:latest

# Step 3: Verify in ECR
aws ecr describe-images \
  --repository-name user-service \
  --region us-east-1

# Output shows image digest, tags, size, push timestamp
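On a long-lived build machine you may want to skip the login step while the 12-hour token is still valid. The sketch below records the time of the last successful login in a stamp file and logs in again only when that time is too old. The function names, the stamp file, and the 5-minute safety margin are all assumptions made for illustration; on short-lived CI runners it is simpler to log in on every run.

```shell
ECR_TOKEN_TTL_SECONDS=$((12 * 60 * 60))   # tokens are valid for 12 hours

# token_is_fresh <login-epoch> [now-epoch]
# Succeeds while the token still has more than 5 minutes left.
token_is_fresh() {
  local login_epoch="$1"
  local now_epoch="${2:-$(date +%s)}"
  [ $((now_epoch - login_epoch)) -lt $((ECR_TOKEN_TTL_SECONDS - 300)) ]
}

# ecr_login_if_needed <registry-host> <region> <stamp-file>
# Logs Docker in to ECR unless the stamp file shows a recent login.
ecr_login_if_needed() {
  local registry="$1" region="$2" stamp="$3"

  if [ -f "$stamp" ] && token_is_fresh "$(cat "$stamp")"; then
    echo "reusing existing login"
    return 0
  fi

  # Record the login time only when docker login succeeded.
  aws ecr get-login-password --region "$region" |
    docker login --username AWS --password-stdin "$registry" &&
    date +%s > "$stamp"
}

# Possible usage:
#   ecr_login_if_needed "${ECR_URI}" us-east-1 "${HOME}/.ecr-login-stamp"
```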

Complete Build-and-Push Script

#!/bin/bash
# deploy-image.sh — Build and push a service image to ECR
set -euo pipefail

SERVICE_NAME=${1:?"Usage: $0 <service-name> <version>"}
VERSION=${2:?"Usage: $0 <service-name> <version>"}
REGION=${AWS_REGION:-us-east-1}

ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
ECR_URI="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com"
IMAGE_URI="${ECR_URI}/${SERVICE_NAME}"

echo "==> Building ${SERVICE_NAME}:${VERSION}"
docker build -t "${SERVICE_NAME}:${VERSION}" .

echo "==> Tagging for ECR"
docker tag "${SERVICE_NAME}:${VERSION}" "${IMAGE_URI}:${VERSION}"
docker tag "${SERVICE_NAME}:${VERSION}" "${IMAGE_URI}:latest"

echo "==> Authenticating with ECR"
aws ecr get-login-password --region "${REGION}" | \
  docker login --username AWS --password-stdin "${ECR_URI}"

echo "==> Pushing to ECR"
docker push "${IMAGE_URI}:${VERSION}"
docker push "${IMAGE_URI}:latest"

echo "==> Done! Image: ${IMAGE_URI}:${VERSION}"

Usage:

chmod +x deploy-image.sh
./deploy-image.sh user-service v1.2.3

7. Image Lifecycle Policies

Without cleanup, ECR repositories grow indefinitely. Lifecycle policies automatically expire old images.

Example: Keep only the 10 most recent version-tagged images

{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Keep only the 10 most recent images tagged v*",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["v"],
        "countType": "imageCountMoreThan",
        "countNumber": 10
      },
      "action": {
        "type": "expire"
      }
    },
    {
      "rulePriority": 2,
      "description": "Remove untagged images after 1 day",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 1
      },
      "action": {
        "type": "expire"
      }
    }
  ]
}
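To build intuition for rule 1, the imageCountMoreThan selection can be simulated locally: matching images are ordered by push time, the newest N are kept, and the rest expire. The sketch below mimics that behavior on a plain text list and never calls ECR. The function name and the "<pushed-epoch> <tag>" input format are made up for the example.

```shell
# expired_by_count <tag-prefix> <keep>
# Reads "<pushed-epoch> <tag>" lines on stdin and prints the tags that a
# count-based rule with that prefix and count would expire.
expired_by_count() {
  local prefix="$1" keep="$2"
  # 1) keep only tags that start with the prefix
  # 2) sort by push time, newest first
  # 3) print everything after the newest <keep> entries
  awk -v p="$prefix" 'index($2, p) == 1' |
    sort -k1,1nr |
    awk -v keep="$keep" 'NR > keep { print $2 }'
}

# Four version tags and one unrelated tag, keeping the newest two versions:
printf '%s\n' \
  '1700000300 v1.3.0' \
  '1700000100 v1.1.0' \
  '1700000400 latest' \
  '1700000200 v1.2.0' \
  '1700000000 v1.0.0' |
  expired_by_count v 2
# prints:
# v1.1.0
# v1.0.0
```

The latest tag is untouched here because it does not start with the v prefix, which mirrors the tagPrefixList setting in the policy.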

Applying the policy

aws ecr put-lifecycle-policy \
  --repository-name user-service \
  --lifecycle-policy-text file://lifecycle-policy.json \
  --region us-east-1

Why lifecycle policies matter

  • Cost control — you pay for every GB stored in ECR
  • Security — old images may have known vulnerabilities
  • Hygiene — hundreds of unused images make debugging harder

8. Vulnerability Scanning

ECR can scan images for known CVEs (Common Vulnerabilities and Exposures). Basic scanning is built into ECR; enhanced scanning, which is enabled at the registry level, integrates with Amazon Inspector.

Scan on push (recommended)

Because the repository was created with --image-scanning-configuration scanOnPush=true, every image is scanned automatically when it is pushed.

Manual scan

aws ecr start-image-scan \
  --repository-name user-service \
  --image-id imageTag=v1.2.3 \
  --region us-east-1

Check scan results

aws ecr describe-image-scan-findings \
  --repository-name user-service \
  --image-id imageTag=v1.2.3 \
  --region us-east-1

Interpreting results

Severity Levels:
  CRITICAL   → Actively exploitable, patch immediately
  HIGH       → Serious vulnerability, patch within days
  MEDIUM     → Moderate risk, patch in next release
  LOW        → Informational, monitor
  UNDEFINED  → Not yet classified
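These severity levels can drive a simple CI gate that blocks a deployment when an image has CRITICAL or HIGH findings. The sketch below reads the per-severity counts from describe-image-scan-findings using the CLI's --query option. The function names and the "block on CRITICAL or HIGH" threshold are assumptions for illustration, and the scan is assumed to have finished before the gate runs.

```shell
# severity_count <repository> <tag> <severity> [region]
# Prints how many findings of one severity (for example CRITICAL) an image has.
severity_count() {
  local count
  count=$(aws ecr describe-image-scan-findings \
    --repository-name "$1" \
    --image-id imageTag="$2" \
    --region "${4:-us-east-1}" \
    --query "imageScanFindings.findingSeverityCounts.$3" \
    --output text)
  # With --output text the CLI prints "None" when the key is absent,
  # which here means there are no findings of that severity.
  if [ "$count" = "None" ]; then
    count=0
  fi
  printf '%s\n' "$count"
}

# scan_gate <repository> <tag> [region]
# Succeeds only when the image has no CRITICAL and no HIGH findings.
scan_gate() {
  local critical high
  critical=$(severity_count "$1" "$2" CRITICAL "${3:-us-east-1}")
  high=$(severity_count "$1" "$2" HIGH "${3:-us-east-1}")
  echo "CRITICAL=${critical} HIGH=${high}"
  [ "$critical" -eq 0 ] && [ "$high" -eq 0 ]
}

# Possible usage in a pipeline step:
#   scan_gate user-service v1.2.3 || { echo "blocked by scan findings"; exit 1; }
```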

Reducing vulnerabilities

  1. Use minimal base images: node:20-alpine has far fewer CVEs than node:20
  2. Update base images regularly: rebuild and push weekly
  3. Remove unnecessary packages: don't install curl, wget, git in production images
  4. Use multi-stage builds: build tools don't end up in the final image
  5. Pin exact versions: FROM node:20.11.0-alpine3.19 instead of FROM node:20-alpine

9. Cross-Account and Cross-Region Access

Cross-region replication

If you deploy in multiple AWS regions, configure ECR replication:

aws ecr put-replication-configuration \
  --replication-configuration '{
    "rules": [
      {
        "destinations": [
          {
            "region": "eu-west-1",
            "registryId": "123456789012"
          }
        ]
      }
    ]
  }' \
  --region us-east-1

Cross-account access (resource policy)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCrossAccountPull",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::999888777666:root"
      },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}
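A repository policy like the one above is attached with aws ecr set-repository-policy. The sketch below generates the same policy for a given consumer account and applies it. The function names are hypothetical, and the example assumes the consuming account separately grants its own roles permission to pull, including ecr:GetAuthorizationToken.

```shell
# cross_account_pull_policy <consumer-account-id>
# Prints a repository policy that lets the given account pull images.
cross_account_pull_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCrossAccountPull",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::$1:root" },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}
EOF
}

# grant_cross_account_pull <repository> <consumer-account-id> [region]
# Attaches the generated policy to the repository.
grant_cross_account_pull() {
  aws ecr set-repository-policy \
    --repository-name "$1" \
    --policy-text "$(cross_account_pull_policy "$2")" \
    --region "${3:-us-east-1}"
}

# Possible usage:
#   grant_cross_account_pull user-service 999888777666
```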

10. Complete ECR Workflow Summary

┌─────────────────────────────────────────────────────────────────┐
│                    ECR WORKFLOW                                   │
│                                                                  │
│  1. Create repository                                            │
│     aws ecr create-repository --repository-name my-service       │
│                                                                  │
│  2. Write Dockerfile (multi-stage, alpine, non-root)             │
│                                                                  │
│  3. Build image                                                  │
│     docker build -t my-service:v1.0 .                            │
│                                                                  │
│  4. Tag for ECR                                                  │
│     docker tag my-service:v1.0 <account>.dkr.ecr.<region>       │
│       .amazonaws.com/my-service:v1.0                             │
│                                                                  │
│  5. Authenticate                                                 │
│     aws ecr get-login-password | docker login ...                │
│                                                                  │
│  6. Push                                                         │
│     docker push <ecr-uri>/my-service:v1.0                        │
│                                                                  │
│  7. Verify (scan results, image list)                            │
│     aws ecr describe-images --repository-name my-service         │
│                                                                  │
│  8. Set lifecycle policy (cleanup old images)                    │
│     aws ecr put-lifecycle-policy ...                             │
└─────────────────────────────────────────────────────────────────┘

11. Key Takeaways

  1. ECR is a private, managed Docker registry — integrated with IAM, encrypted at rest, native to ECS/EKS/Lambda.
  2. One repository per microservice — clean separation, independent lifecycle policies.
  3. Multi-stage Docker builds on a slim base image can reduce image size by 80%+ and shrink the attack surface.
  4. Always tag with version + git SHA — never rely on latest alone for production deployments.
  5. aws ecr get-login-password gives a 12-hour token — automate this in CI/CD scripts.
  6. Lifecycle policies prevent unbounded storage growth and cost.
  7. Scan on push catches vulnerabilities before they reach production.

Explain-It Challenge

  1. A teammate asks "why can't we just use Docker Hub?" — explain three concrete advantages of ECR for an AWS-hosted microservices architecture.
  2. Your CI/CD pipeline fails with "no basic auth credentials" when pushing to ECR. Walk through the debugging steps.
  3. Explain to a junior developer why the production Docker image should NOT include devDependencies, TypeScript source files, or .env files.
