Google Cloud Platform has become a major player in cloud computing, particularly for organizations leveraging data analytics, machine learning, and Kubernetes. GCP interviews test your understanding of core services and how they fit together—not just whether you've clicked through the console.
This guide covers 50+ GCP interview questions across compute, containers, networking, data services, and security—the depth interviewers expect from cloud engineers and DevOps professionals.
Table of Contents
- GCP Fundamentals Questions
- Compute Engine Questions
- Serverless Compute Questions
- GKE and Container Questions
- IAM and Security Questions
- Networking Questions
- Data and Analytics Questions
- Cost Optimization Questions
GCP Fundamentals Questions
Understanding GCP's organizational structure and core concepts is essential before diving into specific services.
How is GCP's resource hierarchy organized?
GCP organizes resources in a hierarchical structure that flows from organization to folders to projects to individual resources. This hierarchy determines how policies, permissions, and billing are inherited and applied throughout your cloud environment.
The organization is the root node, typically representing your company. Folders provide optional grouping for departments or environments. Projects are the fundamental organizing unit where you enable APIs, manage billing, and control IAM. Resources like VMs, storage buckets, and databases live within projects.
flowchart TB
ORG["Organization"]
FOLD["Folders (optional)"]
PROJ["Projects"]
RES["Resources<br/>(VMs, buckets, etc.)"]
ORG --> FOLD
FOLD --> PROJ
PROJ --> RES
How does GCP's resource hierarchy differ from AWS?
In AWS, accounts serve as the primary boundary for resources, billing, and permissions. Each AWS account is relatively isolated, and organizations use AWS Organizations to group accounts together. GCP takes a different approach with projects as the primary unit within a single organization.
GCP's model makes multi-project architectures more straightforward for large organizations. A single organization can have thousands of projects with centralized IAM and billing. Folders provide additional grouping that AWS doesn't have natively—you'd need AWS Control Tower or custom solutions to achieve similar organizational structures.
Key differences:
| Aspect | GCP | AWS |
|---|---|---|
| Primary boundary | Project | Account |
| Grouping mechanism | Folders | Organizations/OUs |
| Billing scope | Organization-wide | Per account |
| IAM inheritance | Top-down through hierarchy | Per account with cross-account roles |
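Setting up this hierarchy comes down to a handful of commands—create folders under the organization, then projects under folders. A minimal sketch, assuming placeholder organization, folder, and billing-account IDs:
# Create a folder under the organization (IDs are placeholders)
gcloud resource-manager folders create \
--display-name="Engineering" \
--organization=123456789012
# Create a project inside that folder
gcloud projects create eng-payments-prod --folder=345678901234
# Link the project to a billing account
gcloud billing projects link eng-payments-prod \
--billing-account=0X0X0X-0X0X0X-0X0X0X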
What are labels in GCP and how do you use them?
Labels are key-value pairs that you attach to GCP resources for organization, filtering, and cost allocation. Unlike the resource hierarchy which is structural, labels provide flexible metadata that can categorize resources across organizational boundaries.
You might label resources by environment (prod, staging, dev), team (platform, data, frontend), cost center, or application. Labels appear in billing exports, letting you break down costs by any dimension you define. They also enable filtering in the console and CLI—find all production resources across projects, or all resources owned by a specific team.
# Add labels when creating resources
gcloud compute instances create my-vm \
--labels=env=prod,team=platform,cost-center=engineering
# Filter resources by label
gcloud compute instances list --filter="labels.env=prod"
# Labels appear in billing exports for cost allocation
What is the difference between regions and zones in GCP?
Regions are independent geographic areas containing multiple zones, while zones are isolated locations within a region. This hierarchy provides both geographic distribution and fault isolation for your applications.
A region like us-central1 represents a geographic area (Iowa, USA) and contains multiple zones (us-central1-a, us-central1-b, etc.). Zones within a region have low-latency network connections but are isolated from each other's failures—a power outage in one zone doesn't affect others. Regional resources span zones automatically, while zonal resources exist in a single zone.
Choosing regions and zones:
- Deploy across multiple zones for high availability within a region
- Deploy across multiple regions for disaster recovery
- Choose regions close to your users for lower latency
- Consider data residency requirements for compliance
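The CLI makes it easy to see which regions and zones are available before you deploy; a quick sketch:
# List all available regions
gcloud compute regions list
# List the zones within a specific region
gcloud compute zones list --filter="region:us-central1"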
Compute Engine Questions
Compute Engine provides virtual machines in GCP, comparable to AWS EC2. Understanding VM options and configurations is fundamental for any GCP interview.
What are the different machine type families in Compute Engine?
GCP offers several machine type families optimized for different workloads, each with distinct price-performance characteristics. Choosing the right family can significantly impact both cost and application performance.
General-purpose machines (E2, N2, N2D) balance compute, memory, and networking for most workloads. Compute-optimized machines (C2, C2D) provide the highest per-core performance for CPU-intensive applications. Memory-optimized machines (M2, M3) offer high memory-to-CPU ratios for in-memory databases. Accelerator-optimized machines (A2, G2) include GPUs for machine learning and graphics workloads.
| Family | Use Case | Key Feature |
|---|---|---|
| E2 | Cost-effective general purpose | Lowest cost, shared-core options |
| N2/N2D | Balanced workloads | Good price-performance |
| C2/C2D | CPU-intensive | Highest per-core performance |
| M2/M3 | Memory-intensive | Up to 12TB RAM |
| A2/G2 | ML and graphics | NVIDIA GPUs attached |
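Before committing to a family, you can list the machine types offered in a zone and inspect their vCPU and memory; a quick sketch:
# List machine types available in a zone
gcloud compute machine-types list --filter="zone:us-central1-a"
# Show the vCPU and memory of a specific machine type
gcloud compute machine-types describe n2-standard-4 --zone=us-central1-a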
What are preemptible VMs and when should you use them?
Preemptible VMs are short-lived compute instances available at 60-91% discount compared to regular VMs. Google can terminate them at any time with 30 seconds notice, and they automatically shut down after 24 hours. This significant cost reduction comes with the trade-off of uncertain availability.
Use preemptible VMs for fault-tolerant workloads that can handle interruption—batch processing jobs, CI/CD build agents, distributed data processing, rendering farms, or development environments. Don't use them for databases, user-facing services, or any workload requiring guaranteed availability.
# Create a preemptible instance
gcloud compute instances create batch-worker \
--machine-type=n2-standard-4 \
--preemptible \
--no-restart-on-failure \
--maintenance-policy=terminate
# Spot VMs (newer, similar concept without 24-hour limit)
gcloud compute instances create spot-worker \
--machine-type=n2-standard-4 \
--provisioning-model=SPOT \
--instance-termination-action=STOP
What is the difference between preemptible VMs and Spot VMs?
Spot VMs are the evolution of preemptible VMs with some key differences. Both offer significant discounts for interruptible compute capacity, but Spot VMs provide more flexibility in how you handle termination.
Preemptible VMs always stop after 24 hours and are always stopped when preempted—you can't choose the termination behavior. Spot VMs have no maximum runtime and can be configured to either stop or delete when capacity is reclaimed. Neither type supports live migration. For new workloads, Google recommends Spot VMs as the modern replacement.
What is live migration and why does it matter?
Live migration moves running VMs between physical hosts without visible downtime to your applications. When Google needs to perform hardware maintenance, update host software, or respond to impending hardware failures, live migration keeps your VMs running uninterrupted.
This feature sets GCP apart from some other cloud providers where maintenance events cause brief outages. Your VM's memory, network connections, and local state transfer seamlessly to a new host. Live migration is enabled by default for most VM types and works transparently—you don't need to configure anything, and your applications continue running during the migration.
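You can confirm or change a VM's maintenance behavior explicitly—MIGRATE keeps it running through host maintenance, while TERMINATE stops it instead. A brief sketch using a hypothetical VM name:
# Check the current maintenance policy
gcloud compute instances describe my-vm \
--zone=us-central1-a \
--format="get(scheduling.onHostMaintenance)"
# Explicitly set live migration (the default for most machine types)
gcloud compute instances set-scheduling my-vm \
--zone=us-central1-a \
--maintenance-policy=MIGRATE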
When would you use sole-tenant nodes?
Sole-tenant nodes provide dedicated physical servers where only your VMs run, isolating your workloads from other customers at the hardware level. This physical isolation addresses compliance requirements and licensing considerations that shared infrastructure cannot satisfy.
Use sole-tenant nodes when you have bring-your-own-license (BYOL) software with per-core or per-socket licensing that requires dedicated hardware. They're also necessary for compliance frameworks requiring physical isolation, such as certain financial or healthcare regulations. The trade-off is higher cost and reduced flexibility compared to shared infrastructure.
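Provisioning sole tenancy takes three steps: a node template, a node group built from it, and VMs placed on the group. A sketch with placeholder names (node types and availability vary by region):
# Create a node template defining the dedicated hardware
gcloud compute sole-tenancy node-templates create my-node-template \
--node-type=n1-node-96-624 \
--region=us-central1
# Create a node group from the template
gcloud compute sole-tenancy node-groups create my-node-group \
--node-template=my-node-template \
--target-size=1 \
--zone=us-central1-a
# Place a VM on the sole-tenant node group
gcloud compute instances create licensed-vm \
--zone=us-central1-a \
--node-group=my-node-group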
Serverless Compute Questions
GCP offers multiple serverless compute options, each suited to different use cases. Understanding when to use each is a common interview topic.
What is Cloud Run and when should you use it?
Cloud Run is a fully managed container platform that runs stateless containers without requiring you to manage infrastructure. It automatically scales from zero to thousands of instances based on incoming requests, and you pay only for the compute time you actually use.
Cloud Run excels for HTTP-driven workloads like REST APIs, web applications, and microservices. Because it runs containers, you can use any language, framework, or binary—anything that can be containerized. The concurrency model lets a single instance handle multiple concurrent requests, making it more efficient than function-as-a-service for many workloads.
# Cloud Run service configuration
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
name: api-service
spec:
template:
metadata:
annotations:
autoscaling.knative.dev/minScale: "1" # Avoid cold starts
autoscaling.knative.dev/maxScale: "100"
spec:
containerConcurrency: 80 # Requests per instance
containers:
- image: gcr.io/my-project/api:latest
resources:
limits:
cpu: "2"
memory: "1Gi"
What is the difference between Cloud Run and Cloud Functions?
Cloud Run and Cloud Functions are both serverless, but they target different use cases and have different execution models. The choice depends on whether you need container flexibility or event-driven simplicity.
Cloud Run runs containers, supports any runtime, and handles concurrent requests within a single instance. It's ideal for web applications, APIs, and services where you need full control over your runtime environment. Cloud Functions runs code snippets triggered by events, supports specific runtimes (Node.js, Python, Go, Java), and processes one event per instance in Gen 1 (Gen 2 supports concurrency).
| Aspect | Cloud Run | Cloud Functions |
|---|---|---|
| Unit of deployment | Container | Function code |
| Trigger types | HTTP (primarily) | HTTP, Pub/Sub, Storage, Firestore, etc. |
| Concurrency | Multiple requests per instance | One per instance (Gen 1) |
| Cold starts | Generally faster | Can be slower for infrequent invocations |
| Best for | Web apps, APIs, microservices | Event processing, webhooks, automation |
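For contrast with the Cloud Run YAML above, deploying a function is a single command; a sketch assuming a hypothetical processEvent handler in the current directory:
# HTTP-triggered function (2nd gen)
gcloud functions deploy process-event \
--gen2 \
--runtime=nodejs20 \
--region=us-central1 \
--entry-point=processEvent \
--trigger-http \
--allow-unauthenticated
# Event-driven variant: fire on messages published to a Pub/Sub topic
gcloud functions deploy process-topic-event \
--gen2 \
--runtime=nodejs20 \
--region=us-central1 \
--entry-point=processEvent \
--trigger-topic=user-events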
How do you handle cold starts in Cloud Run?
Cold starts occur when Cloud Run needs to spin up a new instance to handle a request, adding latency while the container initializes. For latency-sensitive applications, several strategies can minimize or eliminate cold start impact.
The most direct solution is setting a minimum instance count so at least one instance is always warm and ready. You can also shorten container startup time by using smaller base images, lazy-loading dependencies, and minimizing initialization work. Finally, a startup probe ensures traffic only routes to instances that have fully initialized.
# Set minimum instances to avoid cold starts
gcloud run deploy my-service \
--image=gcr.io/my-project/my-image \
--min-instances=1 \
--max-instances=100
# Startup probe ensures traffic waits for initialization
# In your service.yaml:
# startupProbe:
# httpGet:
# path: /healthz
# initialDelaySeconds: 0
# periodSeconds: 1
# failureThreshold: 30
How do you process Cloud Storage uploads automatically?
Cloud Functions excels at responding to GCP events like file uploads. When a file lands in a Cloud Storage bucket, a function can automatically process it—resizing images, parsing data files, or triggering downstream workflows.
The function receives event metadata including the bucket name and object path. You download the file, process it, and optionally write results back to storage. Handle idempotency carefully since cloud functions may execute multiple times for the same event in rare cases.
// Cloud Function triggered by Cloud Storage upload
const sharp = require('sharp');
const { Storage } = require('@google-cloud/storage');
exports.processImage = async (event, context) => {
const storage = new Storage();
const bucket = storage.bucket(event.bucket);
const file = bucket.file(event.name);
// Skip if already processed (avoid infinite loops)
if (event.name.startsWith('processed/')) return;
const [buffer] = await file.download();
const processed = await sharp(buffer)
.resize(800, 600)
.jpeg({ quality: 80 })
.toBuffer();
await bucket.file(`processed/${event.name}`).save(processed);
console.log(`Processed: ${event.name}`);
};
What is Cloud Run Jobs and how does it differ from Cloud Run services?
Cloud Run Jobs runs containers to completion without serving HTTP requests, designed for batch processing, data migrations, scheduled tasks, and other run-to-completion workloads. Unlike Cloud Run services which stay running to handle requests, Jobs execute once and terminate.
Jobs support parallelism—you can run multiple instances of your container simultaneously to process work faster. They integrate with Cloud Scheduler for cron-like scheduling and can be triggered manually or programmatically. Use Jobs when you need container flexibility for batch work that doesn't fit the request-response model.
# Create a Cloud Run Job
gcloud run jobs create data-processor \
--image=gcr.io/my-project/processor:latest \
--tasks=10 \
--parallelism=5 \
--max-retries=3
# Execute the job
gcloud run jobs execute data-processor
# Schedule with Cloud Scheduler
gcloud scheduler jobs create http daily-process \
--schedule="0 2 * * *" \
--uri="https://us-central1-run.googleapis.com/apis/run.googleapis.com/v1/namespaces/my-project/jobs/data-processor:run" \
--http-method=POST \
--oauth-service-account-email=scheduler@my-project.iam.gserviceaccount.com
GKE and Container Questions
Google Kubernetes Engine is GCP's managed Kubernetes service and often receives significant focus in cloud interviews.
What is the difference between GKE Autopilot and Standard mode?
GKE offers two operational modes with fundamentally different management models. Standard mode gives you full control over node configuration and management, while Autopilot abstracts away the node layer entirely, letting you focus purely on workloads.
In Standard mode, you create and manage node pools, choose machine types, configure autoscaling, and handle node security and updates. In Autopilot, Google manages all node infrastructure automatically—you just deploy pods, and GKE provisions appropriate nodes transparently. Autopilot enforces security best practices and charges per pod resource request rather than per node.
| Aspect | Standard | Autopilot |
|---|---|---|
| Node management | You manage | Google manages |
| Pricing model | Per node (VM cost) | Per pod resources |
| Customization | Full control | Limited (security-hardened) |
| Node access | SSH available | No node access |
| Best for | Complex/custom workloads | Simplified operations |
# Create Autopilot cluster (simpler)
gcloud container clusters create-auto my-cluster \
--region=us-central1
# Create Standard cluster (more control)
gcloud container clusters create my-cluster \
--region=us-central1 \
--num-nodes=3 \
--machine-type=e2-standard-4 \
--enable-autoscaling \
--min-nodes=1 \
--max-nodes=10
What is Workload Identity and why is it the recommended authentication method?
Workload Identity provides a secure way for GKE workloads to authenticate to Google Cloud APIs without managing service account keys. It maps Kubernetes service accounts to GCP service accounts, allowing pods to inherit cloud permissions based on their Kubernetes identity.
Service account key files are a security risk—they can be leaked, don't expire automatically, and are difficult to rotate across distributed systems. Workload Identity eliminates keys entirely. When a pod uses a mapped Kubernetes service account, GKE automatically provides GCP credentials through the metadata server, following the same pattern as Compute Engine instance authentication.
# Kubernetes ServiceAccount with Workload Identity annotation
apiVersion: v1
kind: ServiceAccount
metadata:
name: my-app
annotations:
iam.gke.io/gcp-service-account: my-app@my-project.iam.gserviceaccount.com
---
# Pod using the ServiceAccount
apiVersion: v1
kind: Pod
metadata:
name: my-app
spec:
serviceAccountName: my-app
containers:
- name: app
image: gcr.io/my-project/my-app
# Bind Kubernetes SA to GCP SA
gcloud iam service-accounts add-iam-policy-binding \
my-app@my-project.iam.gserviceaccount.com \
--role=roles/iam.workloadIdentityUser \
--member="serviceAccount:my-project.svc.id.goog[default/my-app]"What are node pools in GKE and why would you use multiple pools?
Node pools are groups of nodes within a GKE cluster that share the same configuration—machine type, disk size, labels, and taints. A single cluster can have multiple node pools with different configurations, allowing you to optimize for different workload requirements.
You might have a general-purpose pool for most workloads, a high-memory pool for databases and caching, and a GPU pool for machine learning jobs. Node pools can scale independently and have different autoscaling configurations. Taints and tolerations ensure pods schedule to appropriate pools.
# Create cluster with default pool
gcloud container clusters create my-cluster \
--num-nodes=3 \
--machine-type=e2-standard-4
# Add high-memory pool for databases
gcloud container node-pools create high-mem-pool \
--cluster=my-cluster \
--machine-type=n2-highmem-8 \
--num-nodes=2 \
--node-taints=workload=database:NoSchedule
# Add GPU pool for ML workloads
gcloud container node-pools create gpu-pool \
--cluster=my-cluster \
--machine-type=n1-standard-8 \
--accelerator=type=nvidia-tesla-t4,count=1 \
--num-nodes=0 \
--enable-autoscaling \
--min-nodes=0 \
--max-nodes=5
What is Artifact Registry and how does it differ from Container Registry?
Artifact Registry is GCP's universal package repository that stores container images, language packages (npm, Maven, Python), and OS packages. It replaces Container Registry with additional features and broader format support.
Container Registry (gcr.io) only stores Docker images and has limited access control. Artifact Registry supports multiple formats, provides fine-grained IAM permissions at the repository level, integrates vulnerability scanning, and offers regional and multi-regional options. Google recommends migrating from Container Registry to Artifact Registry for new projects.
# Create Docker repository in Artifact Registry
gcloud artifacts repositories create my-repo \
--repository-format=docker \
--location=us-central1 \
--description="Docker images"
# Configure Docker authentication
gcloud auth configure-docker us-central1-docker.pkg.dev
# Push image (note the different domain)
docker tag my-image us-central1-docker.pkg.dev/my-project/my-repo/my-image:v1
docker push us-central1-docker.pkg.dev/my-project/my-repo/my-image:v1
How do you implement a rolling update strategy in GKE?
Rolling updates gradually replace old pods with new ones, ensuring your application remains available throughout the deployment. Kubernetes manages this process based on parameters you configure, balancing update speed against availability guarantees.
The maxSurge parameter controls how many extra pods can exist during the update, while maxUnavailable controls how many pods can be unavailable. A typical configuration allows one extra pod and zero unavailable, ensuring capacity never drops below 100% during updates. Readiness probes ensure traffic only routes to pods that are actually ready to serve.
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1 # One extra pod during update
maxUnavailable: 0 # Never reduce below desired count
template:
spec:
containers:
- name: app
image: gcr.io/my-project/my-app:v2
readinessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
IAM and Security Questions
Identity and Access Management is fundamental to GCP security. Interviewers often probe deeply into IAM concepts and best practices.
How does GCP IAM work?
GCP IAM follows a simple model: it defines who (principal) can do what (role) on which resource. Policies bind principals to roles at specific points in the resource hierarchy, and permissions flow down from organization to folders to projects to individual resources.
Principals can be Google accounts, service accounts, Google groups, or domains. Roles are collections of permissions—predefined roles cover common use cases, while custom roles let you define precise permission sets. The principle of least privilege guides IAM design: grant only the permissions necessary for each principal's function.
# Grant a predefined role at project level
gcloud projects add-iam-policy-binding my-project \
--member="user:developer@example.com" \
--role="roles/compute.instanceAdmin"
# Grant at resource level (more restrictive)
gcloud storage buckets add-iam-policy-binding gs://my-bucket \
--member="serviceAccount:my-app@my-project.iam.gserviceaccount.com" \
--role="roles/storage.objectViewer"What is the difference between primitive, predefined, and custom roles?
GCP offers three categories of roles with increasing specificity. Understanding when to use each type is essential for implementing least-privilege access control.
Primitive roles (Owner, Editor, Viewer) are legacy roles with broad permissions across all services—avoid them for production. Predefined roles are service-specific and granular, like roles/storage.objectViewer or roles/compute.networkAdmin, designed for common use cases. Custom roles let you select individual permissions to create precisely-scoped access when predefined roles are too broad.
| Role Type | Example | Use Case |
|---|---|---|
| Primitive | roles/editor | Quick prototyping (avoid in production) |
| Predefined | roles/bigquery.dataViewer | Most production scenarios |
| Custom | Your defined permissions | When predefined roles grant too much |
# Create custom role with specific permissions
gcloud iam roles create limitedStorageReader \
--project=my-project \
--title="Limited Storage Reader" \
--description="Can only list and read objects" \
--permissions=storage.objects.get,storage.objects.list
What are service accounts and when should you use them?
Service accounts are special accounts for applications and services rather than humans. They authenticate workloads to GCP APIs and can be used to run Compute Engine instances, GKE pods, Cloud Functions, and other services with specific permissions.
Every GCP project has default service accounts that are often overprivileged. Best practice is to create dedicated service accounts for each application with only the permissions it needs. Avoid service account keys when possible—use Workload Identity for GKE, attached service accounts for Compute Engine, and the default credentials chain for other services.
# Create dedicated service account
gcloud iam service-accounts create my-app-sa \
--display-name="My Application Service Account"
# Grant specific permissions
gcloud projects add-iam-policy-binding my-project \
--member="serviceAccount:my-app-sa@my-project.iam.gserviceaccount.com" \
--role="roles/storage.objectViewer"
gcloud projects add-iam-policy-binding my-project \
--member="serviceAccount:my-app-sa@my-project.iam.gserviceaccount.com" \
--role="roles/pubsub.publisher"
# Attach to Compute Engine instance (no key needed)
gcloud compute instances create my-vm \
--service-account=my-app-sa@my-project.iam.gserviceaccount.com \
--scopes=cloud-platform
How do you implement service account impersonation?
Service account impersonation lets you temporarily act as a service account without downloading its key. This is useful for local development, CI/CD pipelines, and administrative tasks where you need elevated permissions temporarily.
Your user account needs the roles/iam.serviceAccountTokenCreator role on the target service account. Then you can generate short-lived credentials or configure tools to impersonate the account. This is more secure than key files because credentials are temporary and auditable.
# Grant impersonation permission
gcloud iam service-accounts add-iam-policy-binding \
my-app-sa@my-project.iam.gserviceaccount.com \
--member="user:developer@example.com" \
--role="roles/iam.serviceAccountTokenCreator"
# Use impersonation for gcloud commands
gcloud auth application-default login \
--impersonate-service-account=my-app-sa@my-project.iam.gserviceaccount.com
# Or set for current session
gcloud config set auth/impersonate_service_account \
my-app-sa@my-project.iam.gserviceaccount.com
What are IAM conditions and when would you use them?
IAM conditions add contextual rules to policy bindings, granting access only when specified conditions are met. This enables fine-grained access control based on attributes like time, resource properties, or request context.
Common uses include temporary access (grant permissions only during business hours or until a specific date), resource-based restrictions (access only resources with certain tags), and IP-based restrictions (access only from corporate network). Conditions use Common Expression Language (CEL) syntax.
# Grant access only during business hours
gcloud projects add-iam-policy-binding my-project \
--member="user:contractor@example.com" \
--role="roles/compute.instanceAdmin" \
--condition='expression=request.time.getHours("America/New_York") >= 9 && request.time.getHours("America/New_York") < 17,title=business-hours-only'
# Grant access until a specific date
gcloud projects add-iam-policy-binding my-project \
--member="user:contractor@example.com" \
--role="roles/storage.objectViewer" \
--condition='expression=request.time < timestamp("2025-03-01T00:00:00Z"),title=temporary-access'
Networking Questions
GCP networking has unique characteristics, particularly the global VPC model. Understanding these differences is important for cloud interviews.
How does GCP VPC differ from AWS VPC?
GCP VPCs are global by default, which fundamentally changes how you design multi-region architectures. In AWS, VPCs are regional, requiring peering or Transit Gateway to connect resources across regions. In GCP, a single VPC spans all regions automatically—instances in different regions can communicate directly using internal IPs.
GCP uses firewall rules at the VPC level with network tags or service accounts for targeting, while AWS uses security groups attached to individual resources. GCP's approach is more centralized and can be simpler for consistent policies, while AWS provides more instance-specific control.
| Aspect | GCP VPC | AWS VPC |
|---|---|---|
| Scope | Global | Regional |
| Subnets | Regional | Zonal (AZ-specific) |
| Firewall model | VPC-level rules with tags | Security groups per resource |
| Cross-region communication | Direct (same VPC) | Requires peering/Transit Gateway |
How do you create a multi-region network in GCP?
Because GCP VPCs are global, creating a multi-region network is straightforward—you simply create subnets in different regions within the same VPC. Instances in different regions can communicate directly using internal IP addresses without any peering configuration.
Custom mode VPCs give you explicit control over subnet creation and IP ranges. Enable Private Google Access on each subnet so instances without public IPs can still reach GCP APIs. The global VPC model significantly simplifies multi-region architectures compared to other cloud providers.
# Create custom VPC (global by default)
gcloud compute networks create my-vpc --subnet-mode=custom
# Create subnets in different regions
gcloud compute networks subnets create us-subnet \
--network=my-vpc \
--region=us-central1 \
--range=10.0.1.0/24 \
--enable-private-ip-google-access
gcloud compute networks subnets create eu-subnet \
--network=my-vpc \
--region=europe-west1 \
--range=10.0.2.0/24 \
--enable-private-ip-google-access
gcloud compute networks subnets create asia-subnet \
--network=my-vpc \
--region=asia-east1 \
--range=10.0.3.0/24 \
--enable-private-ip-google-access
# Instances in all three regions can now communicate directly
How do firewall rules work in GCP?
GCP firewall rules control traffic at the VPC level, specifying what traffic is allowed or denied based on source, destination, protocol, and port. Rules can target instances using network tags or service accounts, providing flexibility in how you apply security policies.
Every VPC has implied rules: deny all ingress and allow all egress. You create explicit rules to allow necessary traffic. Rules have priorities (0-65535, lower number wins), and the highest-priority matching rule determines whether traffic is allowed. Like AWS security groups, GCP firewall rules are stateful; the difference is that they're defined at the VPC level and targeted with tags or service accounts rather than attached to individual instances.
# Allow HTTP/HTTPS to instances tagged 'web'
gcloud compute firewall-rules create allow-web-traffic \
--network=my-vpc \
--allow=tcp:80,tcp:443 \
--target-tags=web \
--source-ranges=0.0.0.0/0 \
--priority=1000
# Allow internal communication within VPC
gcloud compute firewall-rules create allow-internal \
--network=my-vpc \
--allow=tcp,udp,icmp \
--source-ranges=10.0.0.0/8 \
--priority=1000
# Allow SSH only from specific IP (admin access)
gcloud compute firewall-rules create allow-ssh-admin \
--network=my-vpc \
--allow=tcp:22 \
--target-tags=allow-ssh \
--source-ranges=203.0.113.0/24 \
--priority=900
What load balancer should you use for a global web application?
The Global External HTTP(S) Load Balancer is the right choice for web applications serving users worldwide. It provides a single anycast IP address that routes users to the nearest healthy backend, SSL termination at the edge, integration with Cloud CDN, and Cloud Armor for DDoS protection.
This load balancer operates at Layer 7, understanding HTTP and HTTPS traffic. It can route based on URL paths, host headers, and other request attributes. The anycast IP means the same address resolves to different points of presence globally, minimizing latency for users everywhere.
# Create health check
gcloud compute health-checks create http http-health-check \
--port=80 \
--request-path=/health
# Create backend service
gcloud compute backend-services create web-backend \
--global \
--protocol=HTTP \
--health-checks=http-health-check \
--port-name=http
# Create URL map (routing rules)
gcloud compute url-maps create web-map \
--default-service=web-backend
# Create HTTPS proxy with SSL certificate
gcloud compute target-https-proxies create web-proxy \
--url-map=web-map \
--ssl-certificates=my-cert
# Create forwarding rule (the public IP)
gcloud compute forwarding-rules create web-rule \
--global \
--target-https-proxy=web-proxy \
--ports=443
What is Private Google Access and why is it important?
Private Google Access enables instances without public IP addresses to reach Google APIs and services through internal networks. Without it, a VM without a public IP cannot access Cloud Storage, BigQuery, or other GCP services—even within GCP's network.
This feature is essential for security-conscious architectures where you don't want instances exposed to the internet. Enable it at the subnet level, and instances in that subnet can reach Google APIs and services using their internal IPs as the source. Combined with Cloud NAT for outbound internet access, you can run workloads entirely without public IPs.
# Enable Private Google Access on a subnet
gcloud compute networks subnets update my-subnet \
--region=us-central1 \
--enable-private-ip-google-access
# Verify it's enabled
gcloud compute networks subnets describe my-subnet \
--region=us-central1 \
--format="get(privateIpGoogleAccess)"Data and Analytics Questions
GCP's data and analytics services, particularly BigQuery, are major differentiators. These questions come up frequently in interviews.
What is BigQuery and how does it differ from traditional data warehouses?
BigQuery is a serverless, highly scalable data warehouse that separates storage from compute. Unlike traditional data warehouses where you provision and manage clusters, BigQuery automatically allocates resources for each query—you don't manage infrastructure, and you pay per query or for reserved capacity.
The columnar storage format is optimized for analytical queries that scan specific columns across billions of rows. BigQuery's architecture allows it to scan terabytes in seconds by parallelizing work across thousands of nodes transparently. Standard SQL support means analysts can use familiar syntax without learning a new query language.
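Because there is no cluster to provision, running a query is a single CLI call; a quick sketch against a public sample dataset:
# Query a public dataset with no infrastructure to set up
bq query --use_legacy_sql=false \
'SELECT word, SUM(word_count) AS total
FROM `bigquery-public-data.samples.shakespeare`
GROUP BY word
ORDER BY total DESC
LIMIT 5'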
How does BigQuery pricing work?
BigQuery has two pricing models to fit different usage patterns. On-demand pricing charges $5 per TB of data scanned, with the first 1TB per month free. Flat-rate pricing provides dedicated query capacity (slots) for predictable monthly costs, better for heavy, consistent usage.
Storage costs $0.02/GB/month for active data (queried in last 90 days) and $0.01/GB/month for long-term storage. Understanding this model is crucial for cost optimization—the amount of data scanned directly impacts cost, making query optimization and table design essential.
| Pricing Model | Cost | Best For |
|---|---|---|
| On-demand | $5/TB scanned | Variable usage, exploration |
| Flat-rate | Fixed monthly (slots) | Heavy, predictable usage |
| Storage (active) | $0.02/GB/month | Frequently queried data |
| Storage (long-term) | $0.01/GB/month | Data not queried in 90+ days |
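A practical way to estimate on-demand cost is a dry run plus simple arithmetic against the rates above; a sketch using the hypothetical events table from the next question:
# Dry run reports bytes processed without actually running the query
bq query --dry_run --use_legacy_sql=false \
'SELECT user_id FROM my_dataset.events WHERE DATE(event_timestamp) = "2025-01-15"'
# If the dry run reports ~500 GB processed:
# 0.5 TB x $5/TB = $2.50 for this query (before the 1 TB monthly free tier)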
How do you optimize BigQuery costs and performance?
Cost optimization in BigQuery revolves around reducing the amount of data scanned per query. Partitioning divides tables by date or integer range, so queries can skip irrelevant partitions entirely. Clustering sorts data within partitions by specified columns, improving filter efficiency and reducing bytes scanned.
Always select only the columns you need—SELECT * scans every column and dramatically increases cost. Use the query validator or --dry-run flag to preview bytes scanned before running expensive queries. For large analytical workloads, consider materialized views to pre-compute common aggregations.
-- Create partitioned and clustered table
CREATE TABLE my_dataset.events
PARTITION BY DATE(event_timestamp)
CLUSTER BY user_id, event_type
AS SELECT * FROM raw_events;
-- Query uses partition pruning (only scans one day)
SELECT user_id, event_type, COUNT(*)
FROM my_dataset.events
WHERE DATE(event_timestamp) = '2025-01-15' -- Partition filter
AND event_type = 'purchase' -- Clustering benefits this
GROUP BY user_id, event_type;
-- Check bytes scanned before running
-- Use --dry_run with bq command or check validator in console
bq query --dry_run \
'SELECT * FROM my_dataset.events WHERE DATE(event_timestamp) = "2025-01-15"'
What are the Cloud Storage classes and when should you use each?
Cloud Storage offers four storage classes with different price and access characteristics. Standard storage is for frequently accessed data with no minimum storage duration. Nearline is for data accessed less than once per month with a 30-day minimum. Coldline suits quarterly access patterns with a 90-day minimum. Archive is for yearly access with a 365-day minimum.
Each tier reduces storage cost but increases retrieval cost. Use lifecycle policies to automatically transition objects between classes as they age—start in Standard, move to Nearline after 30 days, Coldline after 90, Archive after 180, and delete after a year, matching the lifecycle policy shown below. This automation optimizes costs without manual intervention.
| Class | Storage Cost | Retrieval Cost | Min Duration | Use Case |
|---|---|---|---|---|
| Standard | $0.020/GB | Free | None | Active data |
| Nearline | $0.010/GB | $0.01/GB | 30 days | Monthly backups |
| Coldline | $0.004/GB | $0.02/GB | 90 days | Quarterly archives |
| Archive | $0.0012/GB | $0.05/GB | 365 days | Compliance archives |
# Create bucket with lifecycle policy
cat > lifecycle.json << 'EOF'
{
"rule": [
{
"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
"condition": {"age": 30}
},
{
"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
"condition": {"age": 90}
},
{
"action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
"condition": {"age": 180}
},
{
"action": {"type": "Delete"},
"condition": {"age": 365}
}
]
}
EOF
gcloud storage buckets update gs://my-bucket --lifecycle-file=lifecycle.json
What is Pub/Sub and when would you use it?
Pub/Sub is a fully managed messaging service for asynchronous communication between services. Publishers send messages to topics, and subscribers receive messages through subscriptions. This decouples producers from consumers, enabling scalable, fault-tolerant architectures.
Use Pub/Sub for event-driven systems, data streaming pipelines, and service integration. It provides at-least-once delivery (messages may be delivered multiple times, so design for idempotency), supports both push and pull delivery models, and scales automatically to handle millions of messages per second.
# Publisher
from google.cloud import pubsub_v1
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path('my-project', 'user-events')
data = '{"event": "user_signup", "user_id": "123"}'
future = publisher.publish(topic_path, data.encode('utf-8'))
print(f'Published message ID: {future.result()}')
# Subscriber
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path('my-project', 'user-events-sub')
def callback(message):
print(f'Received: {message.data}')
# Process message idempotently
message.ack()
streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
streaming_pull_future.result()  # Block the main thread and keep listening
Cost Optimization Questions
Cost optimization is a practical concern that interviewers use to assess real-world cloud experience.
What are the main strategies for reducing GCP costs?
GCP cost optimization combines resource right-sizing, commitment-based discounts, and architectural decisions. Right-sizing means matching resource allocation to actual usage—use the Recommender API to identify oversized instances. Committed use discounts provide 1-3 year commitments for predictable workloads at significant savings.
For variable workloads, use preemptible or spot VMs for fault-tolerant jobs. Serverless services like Cloud Run and BigQuery eliminate idle capacity costs. Storage lifecycle policies automatically move data to cheaper tiers. Implement cost allocation using labels and set up budget alerts to catch unexpected spending.
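Two of these strategies map directly to CLI workflows—right-sizing via the Recommender and guardrails via budget alerts. A sketch with placeholder project and billing-account IDs:
# List VM right-sizing recommendations for a zone
gcloud recommender recommendations list \
--project=my-project \
--location=us-central1-a \
--recommender=google.compute.instance.MachineTypeRecommender
# Alert at 90% of a $1,000 monthly budget
gcloud billing budgets create \
--billing-account=0X0X0X-0X0X0X-0X0X0X \
--display-name="monthly-cap" \
--budget-amount=1000USD \
--threshold-rule=percent=0.9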
How do committed use discounts work?
Committed use discounts (CUDs) provide significant savings (up to 57% for 3 years) in exchange for committing to minimum resource usage. You commit to a certain amount of vCPU and memory in a region, and GCP automatically applies the discount to matching usage.
CUDs are use-it-or-lose-it—you pay the committed amount whether you use the resources or not. Analyze your baseline usage carefully before committing. Resource-based commitments apply to Compute Engine and GKE, while spend-based commitments cover services like Cloud SQL and Cloud Run.
# View commitment options
gcloud compute commitments list-regions
# Create a commitment (example: 3-year commitment)
gcloud compute commitments create my-commitment \
--region=us-central1 \
--resources=vcpu=100,memory=400GB \
--plan=36-month
# View active commitments
gcloud compute commitments list
How do you track and allocate GCP costs across teams?
Labels provide the foundation for cost allocation in GCP. Apply consistent labels to all resources indicating team, project, environment, and cost center. These labels appear in billing exports, enabling detailed cost breakdown and chargeback to teams.
Export billing data to BigQuery for custom analysis and dashboards. Set up budget alerts at the project or label level to notify teams when spending approaches limits. The Cloud Billing Reports in the console provide quick insights, while BigQuery exports enable sophisticated analysis and integration with business intelligence tools.
# Create resources with cost allocation labels
gcloud compute instances create my-vm \
--labels=team=platform,env=prod,cost-center=eng-123
# Export billing to BigQuery (via console or API)
# Then query for cost breakdown
-- BigQuery cost analysis by team
SELECT
labels.value AS team,
SUM(cost) AS total_cost
FROM `my-project.billing_export.gcp_billing_export_v1_*`
CROSS JOIN UNNEST(labels) AS labels
WHERE labels.key = 'team'
AND _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY team
ORDER BY total_cost DESC;
Quick Reference
| Topic | Key Points |
|---|---|
| Resource Hierarchy | Organization → Folders → Projects → Resources |
| Compute Options | Compute Engine (VMs), Cloud Run (containers), Cloud Functions (events), GKE (Kubernetes) |
| GKE Modes | Standard (you manage nodes), Autopilot (Google manages) |
| IAM Model | Who (principal) + What (role) + Which (resource) |
| VPC | Global by default, subnets are regional |
| BigQuery | Serverless warehouse, pay per TB scanned or flat-rate |
| Storage Classes | Standard → Nearline → Coldline → Archive |
| Cost Optimization | Right-size, CUDs, spot VMs, lifecycle policies |
Related Resources
- Kubernetes Interview Guide - Core concepts that apply to GKE
- Docker Interview Guide - Container fundamentals
- AWS Interview Guide - Compare cloud platforms
- CI/CD & GitHub Actions Guide - Deploy to GCP
- System Design Interview Guide - Architecture patterns
