Architecture Overview

Retrieve is built on a scalable architecture designed to handle complex data integrations reliably. Each customer application runs on dedicated infrastructure, which provides isolation, security, and consistent performance.

Your Dedicated Infrastructure

Every Retrieve customer receives a dedicated infrastructure setup consisting of two core components:

1. Integrations Server

Your dedicated server where all your enabled integrations run. This server:

  • Runs your workflows: Executes all enabled integration packages
  • Processes your data: Handles all pull/push actions and transformations
  • Executes custom logic: Runs field mapping functions and rewrite functions
  • Manages credentials: Securely stores and uses your API credentials
  • Applies transformations: Processes data according to your configured mappings

2. Redis Message Server

Your dedicated Redis server that enables communication between integrations. This server:

  • Facilitates data exchange: Passes data between workflow nodes
  • Manages job queues: Handles job execution order and scheduling
  • Enables coordination: Coordinates multi-step workflows
  • Provides reliability: Ensures message delivery between integrations
  • Supports parallel processing: Allows multiple nodes to process simultaneously

System Architecture

Integration Packages

Each integration is a self-contained npm package installed on your integrations server:

@imagination-media/integrator-shopify
@imagination-media/integrator-bigcommerce
@imagination-media/integrator-odoo
... and more

Each package includes:

  • Action processors: Pull and push operations for the platform
  • API client: Pre-configured API communication layer
  • Data transformers: Built-in field mapping capabilities
  • Validators: Configuration and data validation
  • Error handlers: Platform-specific error handling

How Workflows Execute

Workflow Execution Flow

Step-by-Step Process

  1. Trigger: Workflow starts (scheduled, webhook, or manual)
  2. Node 1 Executes: First node pulls data from source system
  3. Data Published: Results sent to Redis for next node
  4. Field Mapping Applied: Data transformed according to UI configuration
  5. Custom Functions Run: Field-level JS functions run, followed by the After Origin Function
  6. Node 2 Executes: Receives data from Redis and processes it
  7. Rewrite Functions: Optional custom logic filters/enriches data
  8. Final Action: Data pushed to destination system
  9. Completion: Workflow completes, logs saved
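
The steps above can be pictured with a small simulation. This is a minimal sketch, not Retrieve's implementation: a plain array stands in for the Redis server, the transformation steps (4, 5, and 7) are left out for brevity, and the names runWorkflow, queue, and destination, along with the sample records, are hypothetical.

```javascript
// Illustrative simulation only; not Retrieve's actual code.
// A plain array stands in for the dedicated Redis server.
function runWorkflow(trigger) {
  const log = [`trigger:${trigger}`]; // step 1: scheduled, webhook, or manual
  const queue = [];                   // hypothetical stand-in for Redis
  const destination = [];             // hypothetical destination system

  // Steps 2-3: node 1 pulls from the source (sample data here) and publishes.
  const pulled = [{ id: 1, title: "Mug" }, { id: 2, title: "Cap" }];
  queue.push(pulled);
  log.push("node1:published");

  // Steps 6 and 8: node 2 receives the data and pushes it to the destination.
  const received = queue.shift();
  destination.push(...received);
  log.push("node2:pushed");

  // Step 9: the workflow completes and its log is kept.
  log.push("completed");
  return { destination, log };
}

const result = runWorkflow("manual");
console.log(result.log.join(" -> "));
// prints "trigger:manual -> node1:published -> node2:pushed -> completed"
```

The point of the sketch is the handoff: node 2 never calls node 1 directly, it only reads what node 1 left in the queue.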

Node Communication

Nodes in your workflow communicate through Redis messages:

  • Data Passing: Each node publishes its results to Redis
  • Sequential Processing: Next node subscribes to previous node's output
  • Parallel Branches: Multiple nodes can read the same data simultaneously
  • State Management: Redis maintains workflow state throughout execution
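
The publish/subscribe pattern described above, including parallel branches, can be sketched as follows. The Broker class is a hypothetical in-memory stand-in for Redis, and the channel name "node1:output" is invented for illustration; Retrieve's actual channel naming is not documented here.

```javascript
// Minimal in-memory broker; a hypothetical stand-in for Redis messaging.
class Broker {
  constructor() {
    this.subscribers = new Map(); // channel name -> array of handlers
  }
  subscribe(channel, handler) {
    const handlers = this.subscribers.get(channel) ?? [];
    handlers.push(handler);
    this.subscribers.set(channel, handlers);
  }
  publish(channel, message) {
    // Handlers are called one after another in this sketch; a real message
    // server would let separate processes consume the message concurrently.
    for (const handler of this.subscribers.get(channel) ?? []) {
      handler(message);
    }
  }
}

const broker = new Broker();
const received = { inventory: [], analytics: [] };

// Two branches subscribe to the same output from the first node.
broker.subscribe("node1:output", (orders) => received.inventory.push(...orders));
broker.subscribe("node1:output", (orders) => received.analytics.push(orders.length));

// Node 1 publishes its results once; every subscriber receives them.
broker.publish("node1:output", [{ id: 101 }, { id: 102 }]);
console.log(received);
```

Because the publisher knows nothing about its subscribers, adding another branch only requires another subscribe call.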

Data Transformation Pipeline

Data flows through multiple transformation layers within each node:

  1. Raw Data Retrieval: Integration package pulls data from API
  2. Field Mapping (UI): Visual mapping transforms data structure
  3. Field-Level Functions: JavaScript functions transform individual fields using originalValue, originalData, and integrationConfig
  4. After Origin Function: Transforms the complete array of records, with access to job, integration, integrationHelper, and queueManager
  5. Rewrite Function: Final custom logic for filtering/enrichment
  6. Output: Transformed data passed to next node or destination
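
The custom layers of this pipeline might look roughly like the sketch below. The variable names originalValue, originalData, and integrationConfig come from the list above, but the function signatures, the taxRate setting, and the sample records are assumptions made for illustration, not the documented API.

```javascript
// Hypothetical shapes: the variable names appear in the pipeline description,
// but the exact signatures below are assumptions made for illustration.

// Field-level function: transforms a single field of one record.
function priceField(originalValue, originalData, integrationConfig) {
  const rate = integrationConfig.taxRate ?? 0; // taxRate is an invented setting
  return Math.round(originalValue * (1 + rate) * 100) / 100;
}

// After Origin Function: transforms the complete array of records.
function afterOrigin(records, context) {
  // context would carry job, integration, integrationHelper, and queueManager;
  // none of them are used in this sketch.
  return records.map((record, index) => ({ ...record, position: index + 1 }));
}

// Rewrite function: final filtering or enrichment before output.
function rewrite(records) {
  return records.filter((record) => record.price > 0);
}

const raw = [{ id: 1, cost: 10 }, { id: 2, cost: 0 }];
const config = { taxRate: 0.2 };

// Field mapping: cost becomes price, passing through the field-level function.
const mapped = raw.map((r) => ({ id: r.id, price: priceField(r.cost, r, config) }));
const output = rewrite(afterOrigin(mapped, {}));
console.log(output);
```

Each layer sees the output of the one before it, so the rewrite function here filters on the already-mapped price field rather than the raw cost.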

Dedicated Infrastructure Benefits

Isolation

  • Your data stays private: No sharing of infrastructure with other customers
  • Independent scaling: Your performance isn't affected by others
  • Custom configurations: Infrastructure tuned for your needs

Performance

  • Dedicated resources: Full server capacity for your integrations
  • Optimized processing: No resource contention
  • Predictable performance: Consistent execution times

Security

  • Credential isolation: Your API keys are never shared or exposed
  • Network isolation: Dedicated network stack per customer
  • Encrypted communication: All data encrypted in transit

Scalability

Handling Growth

As your integration needs grow, your infrastructure can scale:

  • Vertical scaling: Increase server resources for higher throughput
  • Job prioritization: Critical workflows execute first
  • Parallel execution: Multiple workflows run simultaneously
  • Queue management: Redis efficiently handles large job volumes
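
Job prioritization can be illustrated with a small queue that always hands out the most critical job first. This is a sketch of the general idea only: the JobQueue class, the convention that a lower number means higher priority, and the job names are all hypothetical, and Retrieve's actual queueing may work differently.

```javascript
// Hypothetical job queue: a lower priority number runs first, and jobs with
// equal priority keep their arrival order.
class JobQueue {
  constructor() {
    this.jobs = [];
    this.sequence = 0; // arrival counter used to break ties
  }
  add(name, priority) {
    this.jobs.push({ name, priority, order: this.sequence++ });
  }
  next() {
    if (this.jobs.length === 0) return undefined;
    this.jobs.sort((a, b) => a.priority - b.priority || a.order - b.order);
    return this.jobs.shift().name;
  }
}

const jobQueue = new JobQueue();
jobQueue.add("nightly-report", 5);
jobQueue.add("order-sync", 1);
jobQueue.add("inventory-sync", 1);

console.log(jobQueue.next()); // prints "order-sync"
console.log(jobQueue.next()); // prints "inventory-sync"
console.log(jobQueue.next()); // prints "nightly-report"
```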

Reliability & Error Handling

Fault Tolerance

  • Automatic retries: Failed jobs automatically retry with backoff
  • Error logging: Detailed error information captured for debugging
  • Partial completion: Workflows can resume from failed nodes
  • Status tracking: Real-time monitoring of workflow execution
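
Retrying with backoff generally means waiting longer after each failed attempt. The sketch below shows one common form, exponential backoff; the function name retryWithBackoff, the attempt count, and the delays are illustrative assumptions, not Retrieve's actual settings.

```javascript
// Retry an async job with exponential backoff. The attempt count and delays
// are illustrative defaults, not Retrieve's actual settings.
async function retryWithBackoff(job, { attempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await job(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) {
        // The wait doubles after each failure: base, 2x base, 4x base, ...
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError; // every attempt failed
}

// Usage: a job that fails twice and succeeds on the third call.
let calls = 0;
const flakyJob = async () => {
  calls += 1;
  if (calls < 3) throw new Error("temporary failure");
  return "ok";
};

retryWithBackoff(flakyJob, { baseDelayMs: 10 }).then((value) => {
  console.log(value, "after", calls, "calls");
});
```

Spacing the attempts out gives a temporarily unavailable API time to recover instead of hitting it again immediately.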

Data Integrity

  • Transaction support: Ensures data consistency
  • Validation: Data validated at each transformation step
  • Audit trails: Complete history of data transformations

Next Steps