Google Cloud Platform has become a major player in cloud computing, particularly for organizations leveraging data analytics, machine learning, and Kubernetes. GCP interviews test your understanding of core services and how they fit together—not just whether you've clicked through the console.
This guide covers 50+ GCP interview questions across compute, containers, networking, data services, and security—the depth interviewers expect from cloud engineers and DevOps professionals.
Table of Contents
- GCP Fundamentals Questions
- Compute Engine Questions
- Serverless Compute Questions
- GKE and Container Questions
- IAM and Security Questions
- Networking Questions
- Data and Analytics Questions
- Cost Optimization Questions
GCP Fundamentals Questions
Understanding GCP's organizational structure and core concepts is essential before diving into specific services.
How is GCP's resource hierarchy organized?
GCP organizes resources in a hierarchical structure that flows from organization to folders to projects to individual resources. This hierarchy determines how policies, permissions, and billing are inherited and applied throughout your cloud environment.
The organization is the root node, typically representing your company. Folders provide optional grouping for departments or environments. Projects are the fundamental organizing unit where you enable APIs, manage billing, and control IAM. Resources like VMs, storage buckets, and databases live within projects.
flowchart TB
ORG["Organization"]
FOLD["Folders (optional)"]
PROJ["Projects"]
RES["Resources<br/>(VMs, buckets, etc.)"]
ORG --> FOLD
FOLD --> PROJ
PROJ --> RES
How does GCP's resource hierarchy differ from AWS?
In AWS, accounts serve as the primary boundary for resources, billing, and permissions. Each AWS account is relatively isolated, and organizations use AWS Organizations to group accounts together. GCP takes a different approach with projects as the primary unit within a single organization.
GCP's model makes multi-project architectures more straightforward for large organizations. A single organization can have thousands of projects with centralized IAM and billing. Folders provide additional grouping that AWS doesn't have natively—you'd need AWS Control Tower or custom solutions to achieve similar organizational structures.
Key differences:
| Aspect | GCP | AWS |
|---|---|---|
| Primary boundary | Project | Account |
| Grouping mechanism | Folders | Organizations/OUs |
| Billing scope | Organization-wide | Per account |
| IAM inheritance | Top-down through hierarchy | Per account with cross-account roles |
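Setting up this hierarchy comes down to a handful of commands—create folders under the organization, then projects under folders. A minimal sketch, assuming placeholder organization, folder, and billing-account IDs:
# Create a folder under the organization (IDs are placeholders)
gcloud resource-manager folders create \
--display-name="Engineering" \
--organization=123456789012
# Create a project inside that folder
gcloud projects create eng-payments-prod --folder=345678901234
# Link the project to a billing account
gcloud billing projects link eng-payments-prod \
--billing-account=0X0X0X-0X0X0X-0X0X0X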
What are labels in GCP and how do you use them?
Labels are key-value pairs that you attach to GCP resources for organization, filtering, and cost allocation. Unlike the resource hierarchy which is structural, labels provide flexible metadata that can categorize resources across organizational boundaries.
You might label resources by environment (prod, staging, dev), team (platform, data, frontend), cost center, or application. Labels appear in billing exports, letting you break down costs by any dimension you define. They also enable filtering in the console and CLI—find all production resources across projects, or all resources owned by a specific team.
# Add labels when creating resources
gcloud compute instances create my-vm \
--labels=env=prod,team=platform,cost-center=engineering
# Filter resources by label
gcloud compute instances list --filter="labels.env=prod"
# Labels appear in billing exports for cost allocation
What is the difference between regions and zones in GCP?
Regions are independent geographic areas containing multiple zones, while zones are isolated locations within a region. This hierarchy provides both geographic distribution and fault isolation for your applications.
A region like us-central1 represents a geographic area (Iowa, USA) and contains multiple zones (us-central1-a, us-central1-b, etc.). Zones within a region have low-latency network connections but are isolated from each other's failures—a power outage in one zone doesn't affect others. Regional resources span zones automatically, while zonal resources exist in a single zone.
Choosing regions and zones:
- Deploy across multiple zones for high availability within a region
- Deploy across multiple regions for disaster recovery
- Choose regions close to your users for lower latency
- Consider data residency requirements for compliance
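The CLI makes it easy to see which regions and zones are available before you deploy; a quick sketch:
# List all available regions
gcloud compute regions list
# List the zones within a specific region
gcloud compute zones list --filter="region:us-central1"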
Compute Engine Questions
Compute Engine provides virtual machines in GCP, comparable to AWS EC2. Understanding VM options and configurations is fundamental for any GCP interview.
What are the different machine type families in Compute Engine?
GCP offers several machine type families optimized for different workloads, each with distinct price-performance characteristics. Choosing the right family can significantly impact both cost and application performance.
General-purpose machines (E2, N2, N2D) balance compute, memory, and networking for most workloads. Compute-optimized machines (C2, C2D) provide the highest per-core performance for CPU-intensive applications. Memory-optimized machines (M2, M3) offer high memory-to-CPU ratios for in-memory databases. Accelerator-optimized machines (A2, G2) include GPUs for machine learning and graphics workloads.
| Family | Use Case | Key Feature |
|---|---|---|
| E2 | Cost-effective general purpose | Lowest cost, shared-core options |
| N2/N2D | Balanced workloads | Good price-performance |
| C2/C2D | CPU-intensive | Highest per-core performance |
| M2/M3 | Memory-intensive | Up to 12TB RAM |
| A2/G2 | ML and graphics | NVIDIA GPUs attached |
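Before committing to a family, you can list the machine types offered in a zone and inspect their vCPU and memory; a quick sketch:
# List machine types available in a zone
gcloud compute machine-types list --filter="zone:us-central1-a"
# Show the vCPU and memory of a specific machine type
gcloud compute machine-types describe n2-standard-4 --zone=us-central1-a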
What are preemptible VMs and when should you use them?
Preemptible VMs are short-lived compute instances available at 60-91% discount compared to regular VMs. Google can terminate them at any time with 30 seconds notice, and they automatically shut down after 24 hours. This significant cost reduction comes with the trade-off of uncertain availability.
Use preemptible VMs for fault-tolerant workloads that can handle interruption—batch processing jobs, CI/CD build agents, distributed data processing, rendering farms, or development environments. Don't use them for databases, user-facing services, or any workload requiring guaranteed availability.
# Create a preemptible instance
gcloud compute instances create batch-worker \
--machine-type=n2-standard-4 \
--preemptible \
--no-restart-on-failure \
--maintenance-policy=terminate
# Spot VMs (newer, similar concept without 24-hour limit)
gcloud compute instances create spot-worker \
--machine-type=n2-standard-4 \
--provisioning-model=SPOT \
--instance-termination-action=STOP
What is the difference between preemptible VMs and Spot VMs?
Spot VMs are the evolution of preemptible VMs with some key differences. Both offer significant discounts for interruptible compute capacity, but Spot VMs provide more flexibility in how you handle termination.
Preemptible VMs always stop after 24 hours and are always stopped when preempted—you can't choose the termination behavior. Spot VMs have no maximum runtime and can be configured to either stop or delete when capacity is reclaimed. Neither type supports live migration. For new workloads, Google recommends Spot VMs as the modern replacement.
What is live migration and why does it matter?
Live migration moves running VMs between physical hosts without visible downtime to your applications. When Google needs to perform hardware maintenance, update host software, or respond to impending hardware failures, live migration keeps your VMs running uninterrupted.
This feature sets GCP apart from some other cloud providers where maintenance events cause brief outages. Your VM's memory, network connections, and local state transfer seamlessly to a new host. Live migration is enabled by default for most VM types and works transparently—you don't need to configure anything, and your applications continue running during the migration.
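You can confirm or change a VM's maintenance behavior explicitly—MIGRATE keeps it running through host maintenance, while TERMINATE stops it instead. A brief sketch using a hypothetical VM name:
# Check the current maintenance policy
gcloud compute instances describe my-vm \
--zone=us-central1-a \
--format="get(scheduling.onHostMaintenance)"
# Explicitly set live migration (the default for most machine types)
gcloud compute instances set-scheduling my-vm \
--zone=us-central1-a \
--maintenance-policy=MIGRATE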
When would you use sole-tenant nodes?
Sole-tenant nodes provide dedicated physical servers where only your VMs run, isolating your workloads from other customers at the hardware level. This physical isolation addresses compliance requirements and licensing considerations that shared infrastructure cannot satisfy.
Use sole-tenant nodes when you have bring-your-own-license (BYOL) software with per-core or per-socket licensing that requires dedicated hardware. They're also necessary for compliance frameworks requiring physical isolation, such as certain financial or healthcare regulations. The trade-off is higher cost and reduced flexibility compared to shared infrastructure.
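Provisioning sole tenancy takes three steps: a node template, a node group built from it, and VMs placed on the group. A sketch with placeholder names (node types and availability vary by region):
# Create a node template defining the dedicated hardware
gcloud compute sole-tenancy node-templates create my-node-template \
--node-type=n1-node-96-624 \
--region=us-central1
# Create a node group from the template
gcloud compute sole-tenancy node-groups create my-node-group \
--node-template=my-node-template \
--target-size=1 \
--zone=us-central1-a
# Place a VM on the sole-tenant node group
gcloud compute instances create licensed-vm \
--zone=us-central1-a \
--node-group=my-node-group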
Serverless Compute Questions
GCP offers multiple serverless compute options, each suited to different use cases. Understanding when to use each is a common interview topic.
What is Cloud Run and when should you use it?
Cloud Run is a fully managed container platform that runs stateless containers without requiring you to manage infrastructure. It automatically scales from zero to thousands of instances based on incoming requests, and you pay only for the compute time you actually use.
Cloud Run excels for HTTP-driven workloads like REST APIs, web applications, and microservices. Because it runs containers, you can use any language, framework, or binary—anything that can be containerized. The concurrency model lets a single instance handle multiple concurrent requests, making it more efficient than function-as-a-service for many workloads.
# Cloud Run service configuration
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
name: api-service
spec:
template:
metadata:
annotations:
autoscaling.knative.dev/minScale: "1" # Avoid cold starts
autoscaling.knative.dev/maxScale: "100"
spec:
containerConcurrency: 80 # Requests per instance
containers:
- image: gcr.io/my-project/api:latest
resources:
limits:
cpu: "2"
memory: "1Gi"
What is the difference between Cloud Run and Cloud Functions?
Cloud Run and Cloud Functions are both serverless, but they target different use cases and have different execution models. The choice depends on whether you need container flexibility or event-driven simplicity.
Cloud Run runs containers, supports any runtime, and handles concurrent requests within a single instance. It's ideal for web applications, APIs, and services where you need full control over your runtime environment. Cloud Functions runs code snippets triggered by events, supports specific runtimes (Node.js, Python, Go, Java), and processes one event per instance in Gen 1 (Gen 2 supports concurrency).
| Aspect | Cloud Run | Cloud Functions |
|---|---|---|
| Unit of deployment | Container | Function code |
| Trigger types | HTTP (primarily) | HTTP, Pub/Sub, Storage, Firestore, etc. |
| Concurrency | Multiple requests per instance | One per instance (Gen 1) |
| Cold starts | Generally faster | Can be slower for infrequent invocations |
| Best for | Web apps, APIs, microservices | Event processing, webhooks, automation |
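For contrast with the Cloud Run YAML above, deploying a function is a single command; a sketch assuming a hypothetical processEvent handler in the current directory:
# HTTP-triggered function (2nd gen)
gcloud functions deploy process-event \
--gen2 \
--runtime=nodejs20 \
--region=us-central1 \
--entry-point=processEvent \
--trigger-http \
--allow-unauthenticated
# Event-driven variant: fire on messages published to a Pub/Sub topic
gcloud functions deploy process-topic-event \
--gen2 \
--runtime=nodejs20 \
--region=us-central1 \
--entry-point=processEvent \
--trigger-topic=user-events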
How do you handle cold starts in Cloud Run?
Cold starts occur when Cloud Run needs to spin up a new instance to handle a request, adding latency while the container initializes. For latency-sensitive applications, several strategies can minimize or eliminate cold start impact.
The most direct solution is setting a minimum instance count so at least one instance is always warm and ready. You can also shorten container startup time by using smaller base images, lazy-loading dependencies, and minimizing initialization work. Finally, a startup probe ensures traffic only routes to instances that have fully initialized.
# Set minimum instances to avoid cold starts
gcloud run deploy my-service \
--image=gcr.io/my-project/my-image \
--min-instances=1 \
--max-instances=100
# Startup probe ensures traffic waits for initialization
# In your service.yaml:
# startupProbe:
# httpGet:
# path: /healthz
# initialDelaySeconds: 0
# periodSeconds: 1
# failureThreshold: 30
How do you process Cloud Storage uploads automatically?
Cloud Functions excels at responding to GCP events like file uploads. When a file lands in a Cloud Storage bucket, a function can automatically process it—resizing images, parsing data files, or triggering downstream workflows.
The function receives event metadata including the bucket name and object path. You download the file, process it, and optionally write results back to storage. Handle idempotency carefully since cloud functions may execute multiple times for the same event in rare cases.
// Cloud Function triggered by Cloud Storage upload
const sharp = require('sharp');
const { Storage } = require('@google-cloud/storage');
exports.processImage = async (event, context) => {
const storage = new Storage();
const bucket = storage.bucket(event.bucket);
const file = bucket.file(event.name);
// Skip if already processed (avoid infinite loops)
if (event.name.startsWith('processed/')) return;
const [buffer] = await file.download();
const processed = await sharp(buffer)
.resize(800, 600)
.jpeg({ quality: 80 })
.toBuffer();
await bucket.file(`processed/${event.name}`).save(processed);
console.log(`Processed: ${event.name}`);
};
What is Cloud Run Jobs and how does it differ from Cloud Run services?
Cloud Run Jobs runs containers to completion without serving HTTP requests, designed for batch processing, data migrations, scheduled tasks, and other run-to-completion workloads. Unlike Cloud Run services which stay running to handle requests, Jobs execute once and terminate.
Jobs support parallelism—you can run multiple instances of your container simultaneously to process work faster. They integrate with Cloud Scheduler for cron-like scheduling and can be triggered manually or programmatically. Use Jobs when you need container flexibility for batch work that doesn't fit the request-response model.
# Create a Cloud Run Job
gcloud run jobs create data-processor \
--image=gcr.io/my-project/processor:latest \
--tasks=10 \
--parallelism=5 \
--max-retries=3
# Execute the job
gcloud run jobs execute data-processor
# Schedule with Cloud Scheduler
gcloud scheduler jobs create http daily-process \
--schedule="0 2 * * *" \
--uri="https://us-central1-run.googleapis.com/apis/run.googleapis.com/v1/namespaces/my-project/jobs/data-processor:run" \
--http-method=POST \
--oauth-service-account-email=scheduler@my-project.iam.gserviceaccount.com
GKE and Container Questions
Google Kubernetes Engine is GCP's managed Kubernetes service and often receives significant focus in cloud interviews.
What is the difference between GKE Autopilot and Standard mode?
GKE offers two operational modes with fundamentally different management models. Standard mode gives you full control over node configuration and management, while Autopilot abstracts away the node layer entirely, letting you focus purely on workloads.
In Standard mode, you create and manage node pools, choose machine types, configure autoscaling, and handle node security and updates. In Autopilot, Google manages all node infrastructure automatically—you just deploy pods, and GKE provisions appropriate nodes transparently. Autopilot enforces security best practices and charges per pod resource request rather than per node.
| Aspect | Standard | Autopilot |
|---|---|---|
| Node management | You manage | Google manages |
| Pricing model | Per node (VM cost) | Per pod resources |
| Customization | Full control | Limited (security-hardened) |
| Node access | SSH available | No node access |
| Best for | Complex/custom workloads | Simplified operations |
# Create Autopilot cluster (simpler)
gcloud container clusters create-auto my-cluster \
--region=us-central1
# Create Standard cluster (more control)
gcloud container clusters create my-cluster \
--region=us-central1 \
--num-nodes=3 \
--machine-type=e2-standard-4 \
--enable-autoscaling \
--min-nodes=1 \
--max-nodes=10
What is Workload Identity and why is it the recommended authentication method?
Workload Identity provides a secure way for GKE workloads to authenticate to Google Cloud APIs without managing service account keys. It maps Kubernetes service accounts to GCP service accounts, allowing pods to inherit cloud permissions based on their Kubernetes identity.
Service account key files are a security risk—they can be leaked, don't expire automatically, and are difficult to rotate across distributed systems. Workload Identity eliminates keys entirely. When a pod uses a mapped Kubernetes service account, GKE automatically provides GCP credentials through the metadata server, following the same pattern as Compute Engine instance authentication.
# Kubernetes ServiceAccount with Workload Identity annotation
apiVersion: v1
kind: ServiceAccount
metadata:
name: my-app
annotations:
iam.gke.io/gcp-service-account: my-app@my-project.iam.gserviceaccount.com
---
# Pod using the ServiceAccount
apiVersion: v1
kind: Pod
metadata:
name: my-app
spec:
serviceAccountName: my-app
containers:
- name: app
image: gcr.io/my-project/my-app
# Bind Kubernetes SA to GCP SA
gcloud iam service-accounts add-iam-policy-binding \
my-app@my-project.iam.gserviceaccount.com \
--role=roles/iam.workloadIdentityUser \
--member="serviceAccount:my-project.svc.id.goog[default/my-app]"What are node pools in GKE and why would you use multiple pools?
Node pools are groups of nodes within a GKE cluster that share the same configuration—machine type, disk size, labels, and taints. A single cluster can have multiple node pools with different configurations, allowing you to optimize for different workload requirements.
You might have a general-purpose pool for most workloads, a high-memory pool for databases and caching, and a GPU pool for machine learning jobs. Node pools can scale independently and have different autoscaling configurations. Taints and tolerations ensure pods schedule to appropriate pools.
# Create cluster with default pool
gcloud container clusters create my-cluster \
--num-nodes=3 \
--machine-type=e2-standard-4
# Add high-memory pool for databases
gcloud container node-pools create high-mem-pool \
--cluster=my-cluster \
--machine-type=n2-highmem-8 \
--num-nodes=2 \
--node-taints=workload=database:NoSchedule
# Add GPU pool for ML workloads
gcloud container node-pools create gpu-pool \
--cluster=my-cluster \
--machine-type=n1-standard-8 \
--accelerator=type=nvidia-tesla-t4,count=1 \
--num-nodes=0 \
--enable-autoscaling \
--min-nodes=0 \
--max-nodes=5
What is Artifact Registry and how does it differ from Container Registry?
Artifact Registry is GCP's universal package repository that stores container images, language packages (npm, Maven, Python), and OS packages. It replaces Container Registry with additional features and broader format support.
Container Registry (gcr.io) only stores Docker images and has limited access control. Artifact Registry supports multiple formats, provides fine-grained IAM permissions at the repository level, integrates vulnerability scanning, and offers regional and multi-regional options. Google recommends migrating from Container Registry to Artifact Registry for new projects.
# Create Docker repository in Artifact Registry
gcloud artifacts repositories create my-repo \
--repository-format=docker \
--location=us-central1 \
--description="Docker images"
# Configure Docker authentication
gcloud auth configure-docker us-central1-docker.pkg.dev
# Push image (note the different domain)
docker tag my-image us-central1-docker.pkg.dev/my-project/my-repo/my-image:v1
docker push us-central1-docker.pkg.dev/my-project/my-repo/my-image:v1
How do you implement a rolling update strategy in GKE?
Rolling updates gradually replace old pods with new ones, ensuring your application remains available throughout the deployment. Kubernetes manages this process based on parameters you configure, balancing update speed against availability guarantees.
The maxSurge parameter controls how many extra pods can exist during the update, while maxUnavailable controls how many pods can be unavailable. A typical configuration allows one extra pod and zero unavailable, ensuring capacity never drops below 100% during updates. Readiness probes ensure traffic only routes to pods that are actually ready to serve.
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1 # One extra pod during update
maxUnavailable: 0 # Never reduce below desired count
template:
spec:
containers:
- name: app
image: gcr.io/my-project/my-app:v2
readinessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
IAM and Security Questions
Identity and Access Management is fundamental to GCP security. Interviewers often probe deeply into IAM concepts and best practices.
How does GCP IAM work?
GCP IAM follows a simple model: it defines who (principal) can do what (role) on which resource. Policies bind principals to roles at specific points in the resource hierarchy, and permissions flow down from organization to folders to projects to individual resources.
Principals can be Google accounts, service accounts, Google groups, or domains. Roles are collections of permissions—predefined roles cover common use cases, while custom roles let you define precise permission sets. The principle of least privilege guides IAM design: grant only the permissions necessary for each principal's function.
# Grant a predefined role at project level
gcloud projects add-iam-policy-binding my-project \
--member="user:developer@example.com" \
--role="roles/compute.instanceAdmin"
# Grant at resource level (more restrictive)
gcloud storage buckets add-iam-policy-binding gs://my-bucket \
--member="serviceAccount:my-app@my-project.iam.gserviceaccount.com" \
--role="roles/storage.objectViewer"What is the difference between primitive, predefined, and custom roles?
GCP offers three categories of roles with increasing specificity. Understanding when to use each type is essential for implementing least-privilege access control.
Primitive roles (Owner, Editor, Viewer) are legacy roles with broad permissions across all services—avoid them for production. Predefined roles are service-specific and granular, like roles/storage.objectViewer or roles/compute.networkAdmin, designed for common use cases. Custom roles let you select individual permissions to create precisely-scoped access when predefined roles are too broad.
| Role Type | Example | Use Case |
|---|---|---|
| Primitive | roles/editor | Quick prototyping (avoid in production) |
| Predefined | roles/bigquery.dataViewer | Most production scenarios |
| Custom | Your defined permissions | When predefined roles grant too much |
# Create custom role with specific permissions
gcloud iam roles create limitedStorageReader \
--project=my-project \
--title="Limited Storage Reader" \
--description="Can only list and read objects" \
--permissions=storage.objects.get,storage.objects.list
What are service accounts and when should you use them?
Service accounts are special accounts for applications and services rather than humans. They authenticate workloads to GCP APIs and can be used to run Compute Engine instances, GKE pods, Cloud Functions, and other services with specific permissions.
Every GCP project has default service accounts that are often overprivileged. Best practice is to create dedicated service accounts for each application with only the permissions it needs. Avoid service account keys when possible—use Workload Identity for GKE, attached service accounts for Compute Engine, and the default credentials chain for other services.
# Create dedicated service account
gcloud iam service-accounts create my-app-sa \
--display-name="My Application Service Account"
# Grant specific permissions
gcloud projects add-iam-policy-binding my-project \
--member="serviceAccount:my-app-sa@my-project.iam.gserviceaccount.com" \
--role="roles/storage.objectViewer"
gcloud projects add-iam-policy-binding my-project \
--member="serviceAccount:my-app-sa@my-project.iam.gserviceaccount.com" \
--role="roles/pubsub.publisher"
# Attach to Compute Engine instance (no key needed)
gcloud compute instances create my-vm \
--service-account=my-app-sa@my-project.iam.gserviceaccount.com \
--scopes=cloud-platform
How do you implement service account impersonation?
Service account impersonation lets you temporarily act as a service account without downloading its key. This is useful for local development, CI/CD pipelines, and administrative tasks where you need elevated permissions temporarily.
Your user account needs the roles/iam.serviceAccountTokenCreator role on the target service account. Then you can generate short-lived credentials or configure tools to impersonate the account. This is more secure than key files because credentials are temporary and auditable.
# Grant impersonation permission
gcloud iam service-accounts add-iam-policy-binding \
my-app-sa@my-project.iam.gserviceaccount.com \
--member="user:developer@example.com" \
--role="roles/iam.serviceAccountTokenCreator"
# Use impersonation for gcloud commands
gcloud auth application-default login \
--impersonate-service-account=my-app-sa@my-project.iam.gserviceaccount.com
# Or set for current session
gcloud config set auth/impersonate_service_account \
my-app-sa@my-project.iam.gserviceaccount.com
What are IAM conditions and when would you use them?
IAM conditions add contextual rules to policy bindings, granting access only when specified conditions are met. This enables fine-grained access control based on attributes like time, resource properties, or request context.
Common uses include temporary access (grant permissions only during business hours or until a specific date), resource-based restrictions (access only resources with certain tags), and IP-based restrictions (access only from corporate network). Conditions use Common Expression Language (CEL) syntax.
# Grant access only during business hours
gcloud projects add-iam-policy-binding my-project \
--member="user:contractor@example.com" \
--role="roles/compute.instanceAdmin" \
--condition='expression=request.time.getHours("America/New_York") >= 9 && request.time.getHours("America/New_York") < 17,title=business-hours-only'
# Grant access until a specific date
gcloud projects add-iam-policy-binding my-project \
--member="user:contractor@example.com" \
--role="roles/storage.objectViewer" \
--condition='expression=request.time < timestamp("2025-03-01T00:00:00Z"),title=temporary-access'
Networking Questions
GCP networking has unique characteristics, particularly the global VPC model. Understanding these differences is important for cloud interviews.
How does GCP VPC differ from AWS VPC?
GCP VPCs are global by default, which fundamentally changes how you design multi-region architectures. In AWS, VPCs are regional, requiring peering or Transit Gateway to connect resources across regions. In GCP, a single VPC spans all regions automatically—instances in different regions can communicate directly using internal IPs.
GCP uses firewall rules at the VPC level with network tags or service accounts for targeting, while AWS uses security groups attached to individual resources. GCP's approach is more centralized and can be simpler for consistent policies, while AWS provides more instance-specific control.
| Aspect | GCP VPC | AWS VPC |
|---|---|---|
| Scope | Global | Regional |
| Subnets | Regional | Zonal (AZ-specific) |
| Firewall model | VPC-level rules with tags | Security groups per resource |
| Cross-region communication | Direct (same VPC) | Requires peering/Transit Gateway |
How do you create a multi-region network in GCP?
Because GCP VPCs are global, creating a multi-region network is straightforward—you simply create subnets in different regions within the same VPC. Instances in different regions can communicate directly using internal IP addresses without any peering configuration.
Custom mode VPCs give you explicit control over subnet creation and IP ranges. Enable Private Google Access on each subnet so instances without public IPs can still reach GCP APIs. The global VPC model significantly simplifies multi-region architectures compared to other cloud providers.
# Create custom VPC (global by default)
gcloud compute networks create my-vpc --subnet-mode=custom
# Create subnets in different regions
gcloud compute networks subnets create us-subnet \
--network=my-vpc \
--region=us-central1 \
--range=10.0.1.0/24 \
--enable-private-ip-google-access
gcloud compute networks subnets create eu-subnet \
--network=my-vpc \
--region=europe-west1 \
--range=10.0.2.0/24 \
--enable-private-ip-google-access
gcloud compute networks subnets create asia-subnet \
--network=my-vpc \
--region=asia-east1 \
--range=10.0.3.0/24 \
--enable-private-ip-google-access
# Instances in all three regions can now communicate directly
How do firewall rules work in GCP?
GCP firewall rules control traffic at the VPC level, specifying what traffic is allowed or denied based on source, destination, protocol, and port. Rules can target instances using network tags or service accounts, providing flexibility in how you apply security policies.
Every VPC has implied rules: deny all ingress and allow all egress. You create explicit rules to allow necessary traffic. Rules have priorities (0-65535, lower number wins), and the highest-priority matching rule determines whether traffic is allowed. Like AWS security groups, GCP firewall rules are stateful; the difference is that they're defined at the VPC level and targeted with tags or service accounts rather than attached to individual instances.
# Allow HTTP/HTTPS to instances tagged 'web'
gcloud compute firewall-rules create allow-web-traffic \
--network=my-vpc \
--allow=tcp:80,tcp:443 \
--target-tags=web \
--source-ranges=0.0.0.0/0 \
--priority=1000
# Allow internal communication within VPC
gcloud compute firewall-rules create allow-internal \
--network=my-vpc \
--allow=tcp,udp,icmp \
--source-ranges=10.0.0.0/8 \
--priority=1000
# Allow SSH only from specific IP (admin access)
gcloud compute firewall-rules create allow-ssh-admin \
--network=my-vpc \
--allow=tcp:22 \
--target-tags=allow-ssh \
--source-ranges=203.0.113.0/24 \
--priority=900
What load balancer should you use for a global web application?
The Global External HTTP(S) Load Balancer is the right choice for web applications serving users worldwide. It provides a single anycast IP address that routes users to the nearest healthy backend, SSL termination at the edge, integration with Cloud CDN, and Cloud Armor for DDoS protection.
This load balancer operates at Layer 7, understanding HTTP and HTTPS traffic. It can route based on URL paths, host headers, and other request attributes. The anycast IP means the same address resolves to different points of presence globally, minimizing latency for users everywhere.
# Create health check
gcloud compute health-checks create http http-health-check \
--port=80 \
--request-path=/health
# Create backend service
gcloud compute backend-services create web-backend \
--global \
--protocol=HTTP \
--health-checks=http-health-check \
--port-name=http
# Create URL map (routing rules)
gcloud compute url-maps create web-map \
--default-service=web-backend
# Create HTTPS proxy with SSL certificate
gcloud compute target-https-proxies create web-proxy \
--url-map=web-map \
--ssl-certificates=my-cert
# Create forwarding rule (the public IP)
gcloud compute forwarding-rules create web-rule \
--global \
--target-https-proxy=web-proxy \
--ports=443
What is Private Google Access and why is it important?
Private Google Access enables instances without public IP addresses to reach Google APIs and services through internal networks. Without it, a VM without a public IP cannot access Cloud Storage, BigQuery, or other GCP services—even within GCP's network.
This feature is essential for security-conscious architectures where you don't want instances exposed to the internet. Enable it at the subnet level, and instances in that subnet can reach Google APIs and services using their internal IPs as the source. Combined with Cloud NAT for outbound internet access, you can run workloads entirely without public IPs.
# Enable Private Google Access on a subnet
gcloud compute networks subnets update my-subnet \
--region=us-central1 \
--enable-private-ip-google-access
# Verify it's enabled
gcloud compute networks subnets describe my-subnet \
--region=us-central1 \
--format="get(privateIpGoogleAccess)"Data and Analytics Questions
GCP's data and analytics services, particularly BigQuery, are major differentiators. These questions come up frequently in interviews.
What is BigQuery and how does it differ from traditional data warehouses?
BigQuery is a serverless, highly scalable data warehouse that separates storage from compute. Unlike traditional data warehouses where you provision and manage clusters, BigQuery automatically allocates resources for each query—you don't manage infrastructure, and you pay per query or for reserved capacity.
The columnar storage format is optimized for analytical queries that scan specific columns across billions of rows. BigQuery's architecture allows it to scan terabytes in seconds by parallelizing work across thousands of nodes transparently. Standard SQL support means analysts can use familiar syntax without learning a new query language.
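Because there is no cluster to provision, running a query is a single CLI call; a quick sketch against a public sample dataset:
# Query a public dataset with no infrastructure to set up
bq query --use_legacy_sql=false \
'SELECT word, SUM(word_count) AS total
FROM `bigquery-public-data.samples.shakespeare`
GROUP BY word
ORDER BY total DESC
LIMIT 5'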
How does BigQuery pricing work?
BigQuery has two pricing models to fit different usage patterns. On-demand pricing charges $5 per TB of data scanned, with the first 1TB per month free. Flat-rate pricing provides dedicated query capacity (slots) for predictable monthly costs, better for heavy, consistent usage.
Storage costs $0.02/GB/month for active data (queried in last 90 days) and $0.01/GB/month for long-term storage. Understanding this model is crucial for cost optimization—the amount of data scanned directly impacts cost, making query optimization and table design essential.
| Pricing Model | Cost | Best For |
|---|---|---|
| On-demand | $5/TB scanned | Variable usage, exploration |
| Flat-rate | Fixed monthly (slots) | Heavy, predictable usage |
| Storage (active) | $0.02/GB/month | Frequently queried data |
| Storage (long-term) | $0.01/GB/month | Data not queried in 90+ days |
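A practical way to estimate on-demand cost is a dry run plus simple arithmetic against the rates above; a sketch using the hypothetical events table from the next question:
# Dry run reports bytes processed without actually running the query
bq query --dry_run --use_legacy_sql=false \
'SELECT user_id FROM my_dataset.events WHERE DATE(event_timestamp) = "2025-01-15"'
# If the dry run reports ~500 GB processed:
# 0.5 TB x $5/TB = $2.50 for this query (before the 1 TB monthly free tier)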
How do you optimize BigQuery costs and performance?
Cost optimization in BigQuery revolves around reducing the amount of data scanned per query. Partitioning divides tables by date or integer range, so queries can skip irrelevant partitions entirely. Clustering sorts data within partitions by specified columns, improving filter efficiency and reducing bytes scanned.
Always select only the columns you need—SELECT * scans every column and dramatically increases cost. Use the query validator or --dry-run flag to preview bytes scanned before running expensive queries. For large analytical workloads, consider materialized views to pre-compute common aggregations.
-- Create partitioned and clustered table
CREATE TABLE my_dataset.events
PARTITION BY DATE(event_timestamp)
CLUSTER BY user_id, event_type
AS SELECT * FROM raw_events;
-- Query uses partition pruning (only scans one day)
SELECT user_id, event_type, COUNT(*)
FROM my_dataset.events
WHERE DATE(event_timestamp) = '2025-01-15' -- Partition filter
AND event_type = 'purchase' -- Clustering benefits this
GROUP BY user_id, event_type;
-- Check bytes scanned before running
-- Use --dry_run with bq command or check validator in console
bq query --dry_run \
'SELECT * FROM my_dataset.events WHERE DATE(event_timestamp) = "2025-01-15"'
What are the Cloud Storage classes and when should you use each?
Cloud Storage offers four storage classes with different price and access characteristics. Standard storage is for frequently accessed data with no minimum storage duration. Nearline is for data accessed less than once per month with a 30-day minimum. Coldline suits quarterly access patterns with a 90-day minimum. Archive is for yearly access with a 365-day minimum.
Each tier reduces storage cost but increases retrieval cost. Use lifecycle policies to automatically transition objects between classes as they age—start in Standard, move to Nearline after 30 days, Coldline after 90, Archive after 180, and delete after a year, matching the lifecycle policy shown below. This automation optimizes costs without manual intervention.
| Class | Storage Cost | Retrieval Cost | Min Duration | Use Case |
|---|---|---|---|---|
| Standard | $0.020/GB | Free | None | Active data |
| Nearline | $0.010/GB | $0.01/GB | 30 days | Monthly backups |
| Coldline | $0.004/GB | $0.02/GB | 90 days | Quarterly archives |
| Archive | $0.0012/GB | $0.05/GB | 365 days | Compliance archives |
# Create bucket with lifecycle policy
cat > lifecycle.json << 'EOF'
{
"rule": [
{
"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
"condition": {"age": 30}
},
{
"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
"condition": {"age": 90}
},
{
"action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
"condition": {"age": 180}
},
{
"action": {"type": "Delete"},
"condition": {"age": 365}
}
]
}
EOF
gcloud storage buckets update gs://my-bucket --lifecycle-file=lifecycle.json
What is Pub/Sub and when would you use it?
Pub/Sub is a fully managed messaging service for asynchronous communication between services. Publishers send messages to topics, and subscribers receive messages through subscriptions. This decouples producers from consumers, enabling scalable, fault-tolerant architectures.
Use Pub/Sub for event-driven systems, data streaming pipelines, and service integration. It provides at-least-once delivery (messages may be delivered multiple times, so design for idempotency), supports both push and pull delivery models, and scales automatically to handle millions of messages per second.
# Publisher
from google.cloud import pubsub_v1
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path('my-project', 'user-events')
data = '{"event": "user_signup", "user_id": "123"}'
future = publisher.publish(topic_path, data.encode('utf-8'))
print(f'Published message ID: {future.result()}')
# Subscriber
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path('my-project', 'user-events-sub')
def callback(message):
print(f'Received: {message.data}')
# Process message idempotently
message.ack()
streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
streaming_pull_future.result()  # Block the main thread and keep listening
Cost Optimization Questions
Cost optimization is a practical concern that interviewers use to assess real-world cloud experience.
What are the main strategies for reducing GCP costs?
GCP cost optimization combines resource right-sizing, commitment-based discounts, and architectural decisions. Right-sizing means matching resource allocation to actual usage—use the Recommender API to identify oversized instances. Committed use discounts provide 1-3 year commitments for predictable workloads at significant savings.
For variable workloads, use preemptible or spot VMs for fault-tolerant jobs. Serverless services like Cloud Run and BigQuery eliminate idle capacity costs. Storage lifecycle policies automatically move data to cheaper tiers. Implement cost allocation using labels and set up budget alerts to catch unexpected spending.
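Two of these strategies map directly to CLI workflows—right-sizing via the Recommender and guardrails via budget alerts. A sketch with placeholder project and billing-account IDs:
# List VM right-sizing recommendations for a zone
gcloud recommender recommendations list \
--project=my-project \
--location=us-central1-a \
--recommender=google.compute.instance.MachineTypeRecommender
# Alert at 90% of a $1,000 monthly budget
gcloud billing budgets create \
--billing-account=0X0X0X-0X0X0X-0X0X0X \
--display-name="monthly-cap" \
--budget-amount=1000USD \
--threshold-rule=percent=0.9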
How do committed use discounts work?
Committed use discounts (CUDs) provide significant savings (up to 57% for 3 years) in exchange for committing to minimum resource usage. You commit to a certain amount of vCPU and memory in a region, and GCP automatically applies the discount to matching usage.
CUDs are use-it-or-lose-it—you pay the committed amount whether you use the resources or not. Analyze your baseline usage carefully before committing. Resource-based commitments apply to Compute Engine and GKE, while spend-based commitments cover services like Cloud SQL and Cloud Run.
# View commitment options
gcloud compute commitments list-regions
# Create a commitment (example: 3-year commitment)
gcloud compute commitments create my-commitment \
--region=us-central1 \
--resources=vcpu=100,memory=400GB \
--plan=36-month
# View active commitments
gcloud compute commitments list
How do you track and allocate GCP costs across teams?
Labels provide the foundation for cost allocation in GCP. Apply consistent labels to all resources indicating team, project, environment, and cost center. These labels appear in billing exports, enabling detailed cost breakdown and chargeback to teams.
Export billing data to BigQuery for custom analysis and dashboards. Set up budget alerts at the project or label level to notify teams when spending approaches limits. The Cloud Billing Reports in the console provide quick insights, while BigQuery exports enable sophisticated analysis and integration with business intelligence tools.
# Create resources with cost allocation labels
gcloud compute instances create my-vm \
--labels=team=platform,env=prod,cost-center=eng-123
# Export billing to BigQuery (via console or API)
# Then query for cost breakdown
-- BigQuery cost analysis by team
SELECT
labels.value AS team,
SUM(cost) AS total_cost
FROM `my-project.billing_export.gcp_billing_export_v1_*`
CROSS JOIN UNNEST(labels) AS labels
WHERE labels.key = 'team'
AND _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY team
ORDER BY total_cost DESC;
Quick Reference
| Topic | Key Points |
|---|---|
| Resource Hierarchy | Organization → Folders → Projects → Resources |
| Compute Options | Compute Engine (VMs), Cloud Run (containers), Cloud Functions (events), GKE (Kubernetes) |
| GKE Modes | Standard (you manage nodes), Autopilot (Google manages) |
| IAM Model | Who (principal) + What (role) + Which (resource) |
| VPC | Global by default, subnets are regional |
| BigQuery | Serverless warehouse, pay per TB scanned or flat-rate |
| Storage Classes | Standard → Nearline → Coldline → Archive |
| Cost Optimization | Right-size, CUDs, spot VMs, lifecycle policies |
Related Resources
- Kubernetes Interview Guide - Core concepts that apply to GKE
- Docker Interview Guide - Container fundamentals
- AWS Interview Guide - Compare cloud platforms
- CI/CD & GitHub Actions Guide - Deploy to GCP
- System Design Interview Guide - Architecture patterns
