Google Cloud Integration

Connect to Google Cloud services including BigQuery for data warehousing and Cloud Storage for file management. Perfect for analytics, data lakes, backup storage, and large-scale data processing workflows.

Overview

This integration package provides access to Google Cloud Platform services, enabling you to insert data into BigQuery tables, query large datasets, and manage files in Cloud Storage buckets. Authenticate using service account credentials for secure, automated access to your Google Cloud resources.

Key Features

📊 BigQuery Integration

Insert, query, and manage data in Google's serverless data warehouse.

☁️ Cloud Storage

Upload and manage files in Google Cloud Storage buckets.

🔐 Service Account Auth

Secure authentication using Google Cloud service account credentials.

🗑️ Upsert Support

Delete existing records before inserting to prevent duplicates.

🔍 Advanced Querying

Filter, order, and select specific fields when retrieving BigQuery data.

📁 Bulk Import

Load data from Cloud Storage files directly into BigQuery tables.

Authentication Setup

To use this integration, you must create a Google Cloud service account with appropriate permissions. Follow Google's documentation to generate a JSON key file.

Service Account Key Structure

{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "key-id",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "service-account@your-project.iam.gserviceaccount.com",
  "client_id": "123456789",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
}

⚠️ Required Permissions

Ensure your service account has the following roles:

  • BigQuery Data Editor - For inserting and querying data
  • BigQuery Job User - For running queries
  • Storage Object Admin - For Cloud Storage operations

Common Use Cases

📊 Analytics Data Warehouse

Stream e-commerce events (orders, page views, conversions) to BigQuery for real-time analytics and reporting.

Flow: Shopify Order → Transform → Insert to BigQuery → Looker Dashboard
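
As an illustrative sketch of the "Insert to BigQuery" step (using the official Python client rather than this integration's own configuration, with made-up table and field names):

from google.cloud import bigquery

client = bigquery.Client()  # assumes credentials are already configured

# One event row; keys must match the destination table's column names.
event = {"order_id": "A-1001", "total": 59.90, "created_at": "2024-05-01T12:00:00Z"}

# Streaming insert; returns a list of per-row errors (empty on success).
errors = client.insert_rows_json("your-project.analytics.orders", [event])
if errors:
    raise RuntimeError(f"Insert failed: {errors}")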

🔄 Data Lake Integration

Centralize data from multiple sources into BigQuery for unified reporting and machine learning.

Flow: Multiple APIs → Normalize → BigQuery Tables → Data Studio Reports

💾 Backup & Archive

Archive production data to Cloud Storage for compliance, disaster recovery, or long-term retention.

Flow: Database Export → Compress → Upload to Cloud Storage → Lifecycle Policy

📈 Historical Reporting

Build historical datasets by syncing daily snapshots of transactional data to BigQuery.

Flow: Daily Batch → Insert Rows → Partition by Date → Query Historical Trends

🔍 Customer 360

Combine customer data from multiple touchpoints into a single BigQuery view for comprehensive analysis.

Flow: CRM + Orders + Support Tickets → BigQuery → Customer Segmentation

📁 Large File Processing

Upload large CSV/JSON files to Cloud Storage, then bulk import into BigQuery for processing.

Flow: Generate CSV → Upload to Storage → Insert from File → Transform in BigQuery
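
For illustration, the "Upload to Storage" step with the official google-cloud-storage Python client might look like this (bucket, object, and file names are placeholders):

from google.cloud import storage

client = storage.Client()  # assumes credentials are already configured

# Upload a local CSV into the bucket under the exports/ prefix.
bucket = client.bucket("your-backup-bucket")
blob = bucket.blob("exports/orders-2024-05-01.csv")
blob.upload_from_filename("orders-2024-05-01.csv")

print(f"Uploaded to gs://{bucket.name}/{blob.name}")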

Best Practices

Use Partitioned Tables

Partition BigQuery tables by date or timestamp to improve query performance and reduce costs. Queries then scan only the relevant partitions instead of the entire table.
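
A minimal sketch of creating a day-partitioned table with the official Python client (the table name and schema are examples only):

from google.cloud import bigquery

client = bigquery.Client()

table = bigquery.Table(
    "your-project.analytics.events",
    schema=[
        bigquery.SchemaField("event_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("event_type", "STRING"),
        bigquery.SchemaField("event_time", "TIMESTAMP", mode="REQUIRED"),
    ],
)

# Partition by day on the event_time column; queries that filter on
# event_time scan only the matching partitions.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="event_time",
)

client.create_table(table)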

Implement Upsert Logic

Use the delete configuration to remove existing records before inserting updates, preventing duplicate data in your tables.
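
As a rough illustration of the delete-then-insert pattern (outside the integration itself, using the official Python client with example table and field names); note that rows added via streaming inserts sit in the streaming buffer for a short time and cannot be deleted until they are flushed:

from google.cloud import bigquery

client = bigquery.Client()
table_id = "your-project.analytics.orders"
rows = [{"order_id": "A-1001", "status": "shipped"}]

# 1. Delete any existing versions of the rows being re-inserted.
ids = [row["order_id"] for row in rows]
delete_job = client.query(
    f"DELETE FROM `{table_id}` WHERE order_id IN UNNEST(@ids)",
    job_config=bigquery.QueryJobConfig(
        query_parameters=[bigquery.ArrayQueryParameter("ids", "STRING", ids)]
    ),
)
delete_job.result()  # wait for the DELETE to finish

# 2. Insert the fresh versions.
errors = client.insert_rows_json(table_id, rows)
if errors:
    raise RuntimeError(f"Insert failed: {errors}")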

Batch Large Inserts

When inserting large datasets, batch rows together (100-1000 per request) to improve performance and reduce API quota usage.
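
A sketch of batching with the official Python client; the 500-row chunk size is arbitrary but falls within the range suggested above:

from google.cloud import bigquery

client = bigquery.Client()
table_id = "your-project.analytics.events"

def insert_in_batches(rows, batch_size=500):
    """Insert rows in fixed-size chunks to keep each request small."""
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        errors = client.insert_rows_json(table_id, batch)
        if errors:
            raise RuntimeError(f"Batch starting at row {start} failed: {errors}")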

Use File Imports for Bulk Data

For datasets larger than 10,000 rows, upload to Cloud Storage first and use rows_insert_from_file for better performance and reliability.
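
For comparison, the equivalent operation in the official Python client is a load job from a gs:// URI (bucket, file, and table names are placeholders; rows_insert_from_file is this integration's own action and may expose different options):

from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,   # skip the header row
    autodetect=True,       # infer the schema from the file
)

load_job = client.load_table_from_uri(
    "gs://your-backup-bucket/exports/orders-2024-05-01.csv",
    "your-project.analytics.orders",
    job_config=job_config,
)
load_job.result()  # wait for the load to complete
print(f"Loaded {load_job.output_rows} rows")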

Monitor Costs

BigQuery's on-demand pricing is based on the amount of data each query scans. Use query_fields to select only the columns you need and apply filters to reduce the amount of data scanned.
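
A quick way to see how much data a query would scan before running it is a dry run; this sketch uses the official Python client, and the query and table names are examples:

from google.cloud import bigquery

client = bigquery.Client()

# Select only the columns you need and filter on the partition column.
sql = """
    SELECT order_id, total
    FROM `your-project.analytics.orders`
    WHERE order_date >= '2024-01-01'
"""

# dry_run=True validates the query and reports bytes scanned without running it.
job = client.query(sql, job_config=bigquery.QueryJobConfig(dry_run=True, use_query_cache=False))
print(f"This query would process {job.total_bytes_processed / 1e9:.2f} GB")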

Secure Service Account Keys

Store service account credentials securely using environment variables or secret managers. Never commit keys to version control.
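
One common pattern is to keep the key JSON in an environment variable or secret manager and build credentials from it at runtime; a minimal sketch follows, where the variable name GCP_SERVICE_ACCOUNT_KEY is only an example:

import json
import os

from google.cloud import bigquery
from google.oauth2 import service_account

# The full key JSON is stored in an environment variable
# (or fetched from a secret manager) rather than committed to the repo.
key_info = json.loads(os.environ["GCP_SERVICE_ACCOUNT_KEY"])

credentials = service_account.Credentials.from_service_account_info(key_info)
client = bigquery.Client(credentials=credentials, project=credentials.project_id)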

Troubleshooting

🔴 Issue: 403 Permission Denied

Cause: Service account lacks required permissions.

Solution:

  • Verify service account has BigQuery Data Editor and Job User roles
  • Check that dataset and table exist in the specified project
  • Ensure the service account is granted these roles in the project's IAM policy
  • Confirm the project_id in the key matches your target project

🔴 Issue: Table Not Found

Cause: Incorrect dataset or table name.

Solution:

  • Verify datasetId and table names are correct
  • Check for typos or case sensitivity issues
  • Ensure table exists in BigQuery console before inserting
  • Create table schema manually if auto-creation is disabled

🔴 Issue: Invalid Credentials

Cause: Malformed or expired service account key.

Solution:

  • Verify the JSON key structure is complete and valid
  • Check that private_key includes BEGIN/END markers and line breaks
  • Regenerate service account key if it's expired or revoked
  • Ensure key hasn't been deleted in Google Cloud Console

🔴 Issue: Schema Mismatch

Cause: Data fields don't match table schema.

Solution:

  • Review table schema in BigQuery and ensure data matches field types
  • Check for missing required fields or extra fields not in schema
  • Use correct data types (STRING, INTEGER, FLOAT, TIMESTAMP, etc.)
  • Update table schema or transform data to match expected format
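
For reference, a table schema in the official Python client is declared with SchemaField objects, and row keys and value types must line up with it; the names below are examples only:

from google.cloud import bigquery

# Schema the table expects: field names, BigQuery types, and whether a field is required.
schema = [
    bigquery.SchemaField("order_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("quantity", "INTEGER"),
    bigquery.SchemaField("total", "FLOAT"),
    bigquery.SchemaField("created_at", "TIMESTAMP"),
]

# A row that matches the schema above; extra keys or mismatched types cause insert errors.
row = {"order_id": "A-1001", "quantity": 2, "total": 59.90, "created_at": "2024-05-01T12:00:00Z"}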