Query & Search Entities

Pull / Origin · Entity Retrieval

Overview

The Query & Search Entities action allows you to pull products, categories, items, and other entities from InRiver PIM using advanced filtering, pagination, and chunked processing. This is the primary action for extracting product data from InRiver for synchronization with e-commerce platforms, data warehouses, or content management systems.

This action supports complex query filters, incremental synchronization based on modified dates, optional media loading, and configurable chunking for handling large enterprise product catalogs efficiently.

Key Features

🔍 Advanced Filtering

Use InRiver query filters with multiple operators for precise data extraction

📊 Incremental Sync

Date-based filtering to only retrieve entities modified since last sync

⚡ Chunked Processing

Handle millions of entities with configurable batch sizes and delays

🖼️ Media Support

Optionally load product images, videos, and digital asset metadata

🔄 Automatic Pagination

Seamlessly handles InRiver API pagination to retrieve all matching records

🎯 Entity Types

Query any InRiver entity type: Products, Categories, Items, Specifications, etc.

Configuration

Required Fields

Ensure your InRiver integration is configured with valid credentials and query parameters:

{
  "inriver": {
    "base_url": "https://your-instance.productmarketingcloud.com",
    "api_key": "your_api_key_here",
    "query_filters": [
      {
        "type": "ChannelId",
        "value": "393",
        "operator": "Equal"
      }
    ],
    "query_filter_last_date": "2024-01-01T00:00:00",
    "load_values": true,
    "load_media_details": false,
    "chunk_waiting_time": 500,
    "chunk_size": 1000
  }
}

🔑 Configuration Requirements

base_url: Your InRiver REST API endpoint
api_key: InRiver API authentication key
query_filters: Array of filter objects to specify which entities to retrieve
query_filter_last_date: Optional date for incremental sync
load_values: Whether to include entity field values (recommended: true)
load_media_details: Whether to include media/asset metadata
chunk_size: Records per batch (500-2000 recommended)
chunk_waiting_time: Delay between batches in milliseconds
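A quick pre-flight check can catch configuration mistakes before a workflow runs. The sketch below is illustrative only: `check_inriver_config` and `REQUIRED_KEYS` are hypothetical names, and treating `base_url`, `api_key`, and `query_filters` as the mandatory keys is an assumption based on the list above, not a documented rule of the platform.

```python
# Hypothetical helper: the set of mandatory keys is an assumption, not a documented rule.
REQUIRED_KEYS = ("base_url", "api_key", "query_filters")


def check_inriver_config(config):
    """Return a list of human-readable problems; an empty list means none were found."""
    settings = config.get("inriver", {})
    problems = [
        f"missing required key: {key}" for key in REQUIRED_KEYS if not settings.get(key)
    ]

    # 500-2000 is the recommended range from this page, so this is a warning, not a hard limit.
    chunk_size = settings.get("chunk_size")
    if chunk_size is not None and not 500 <= chunk_size <= 2000:
        problems.append(f"chunk_size {chunk_size} is outside the recommended 500-2000 range")

    waiting = settings.get("chunk_waiting_time")
    if waiting is not None and waiting < 0:
        problems.append("chunk_waiting_time must not be negative")

    return problems
```

Calling `check_inriver_config(config)` on the sample configuration above should return an empty list; any returned strings can be logged or shown to the person setting up the integration.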

Query Filters

Building Effective Queries

Query filters are the core mechanism for specifying which entities to retrieve from InRiver:

Filter Anatomy

{
  "type": "ChannelId",
  "value": "393",
  "operator": "Equal"
}

type

The InRiver field or attribute to filter on (e.g., ChannelId, EntityTypeId, ModifiedDate, custom field names)

value

The value to compare against (string, number, or date depending on field type)

operator

The comparison operator: Equal, NotEqual, GreaterThan, LessThan, Contains, In, Between

Common Filter Patterns

Filter by Channel

Retrieve entities from a specific InRiver channel (e.g., web, print, mobile):

{
  "type": "ChannelId",
  "value": "393",
  "operator": "Equal"
}

Filter by Entity Type

Retrieve specific entity types (Products, Categories, Items):

{
  "type": "EntityTypeId",
  "value": "Product",
  "operator": "Equal"
}

Incremental Sync by Date

Retrieve only entities modified after a specific date:

{
  "type": "ModifiedDate",
  "value": "2024-01-25T00:00:00",
  "operator": "GreaterThan"
}

Multiple Filters Combined

Combine filters for precise queries:

[
  {
    "type": "ChannelId",
    "value": "393",
    "operator": "Equal"
  },
  {
    "type": "EntityTypeId",
    "value": "Product",
    "operator": "Equal"
  },
  {
    "type": "ModifiedDate",
    "value": "2024-01-25T00:00:00",
    "operator": "GreaterThan"
  }
]

💡 Filter Best Practices

Start broad, then add filters to narrow results
Always filter by ChannelId for channel-specific data
Use EntityTypeId to separate products, categories, and other entities
Test filters in InRiver UI or Postman before implementing
For incremental sync, add ModifiedDate filter with GreaterThan operator
Combine filters logically: all filters must match (AND logic)
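When filters are assembled in code rather than written by hand, a small builder keeps the `type` / `value` / `operator` shape consistent. This is a minimal sketch: `make_filter` and `build_query_filters` are hypothetical helper names, and the operator list is simply the one given in the Filter Anatomy section.

```python
from datetime import datetime

# Operators as listed in the Filter Anatomy section of this page.
ALLOWED_OPERATORS = {
    "Equal", "NotEqual", "GreaterThan", "LessThan", "Contains", "In", "Between",
}


def make_filter(field_type, value, operator="Equal"):
    """Build one filter object, rejecting operators this page does not list."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"Unsupported operator: {operator}")
    return {"type": field_type, "value": str(value), "operator": operator}


def build_query_filters(channel_id, entity_type=None, modified_after=None):
    """Combine channel, entity type, and date filters; all of them must match (AND)."""
    filters = [make_filter("ChannelId", channel_id)]
    if entity_type is not None:
        filters.append(make_filter("EntityTypeId", entity_type))
    if modified_after is not None:
        stamp = modified_after.strftime("%Y-%m-%dT%H:%M:%S")
        filters.append(make_filter("ModifiedDate", stamp, "GreaterThan"))
    return filters
```

For example, `build_query_filters(393, "Product", datetime(2024, 1, 25))` produces the same three-filter array shown under Multiple Filters Combined.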

Workflow Configuration

Example workflow configuration for querying InRiver entities:

{
  "name": "Sync Products from InRiver to E-commerce",
  "origin": {
    "microservice": "integrations",
    "action": "query_search",
    "type": "pull",
    "integration": {
      "name": "inriver"
    },
    "mapped_fields": [
      {
        "origin_field": "id",
        "destination_field": "sku"
      },
      {
        "origin_field": "fieldValues.ProductName",
        "destination_field": "title"
      },
      {
        "origin_field": "fieldValues.Description",
        "destination_field": "description"
      },
      {
        "origin_field": "fieldValues.Price",
        "destination_field": "price"
      }
    ]
  },
  "destination": {
    "microservice": "ecommerce",
    "action": "push_products",
    "chunk_size": 100,
    "chunk_waiting_time": 1500
  }
}

Workflow Parameters

action: Set to query_search
type: Set to pull (origin operation)
chunk_size: Number of entities to process per batch (500-2000)
chunk_waiting_time: Delay between batches in milliseconds (500-2000)
integration.name: Set to inriver
mapped_fields: Map InRiver fields to destination system format

Use Cases

🛒 E-commerce Product Sync

Scenario: Pull product catalog from InRiver and push to Shopify, BigCommerce, or Magento.

Benefit: Maintain single source of truth in InRiver while keeping e-commerce platforms updated.

📊 Data Warehouse Population

Scenario: Export InRiver product data to Snowflake, BigQuery, or Redshift for analytics.

Benefit: Combine product data with sales, inventory, and customer data for comprehensive reporting.

🌐 Multi-Channel Publishing

Scenario: Pull channel-specific product data for web, mobile apps, print catalogs.

Benefit: Distribute tailored product content to each channel from InRiver PIM.

🔄 Near Real-Time Product Updates

Scenario: Use incremental sync (every 15-30 minutes) to push product updates as they occur.

Benefit: Keep downstream systems synchronized with near real-time product data.

🖼️ Digital Asset Management

Scenario: Pull product images, videos, and documents for CDN or media library.

Benefit: Centralize digital asset distribution from InRiver to all touchpoints.

📱 Mobile App Content

Scenario: Export product data optimized for mobile applications.

Benefit: Deliver consistent product information across web and mobile experiences.

Data Structure

Entity Response Format

Each entity retrieved from InRiver has the following structure:

{
  "id": 12345,
  "entityTypeId": "Product",
  "fieldValues": [
    {
      "fieldId": "ProductName",
      "value": "Wireless Headphones Pro",
      "fieldTypeId": "String"
    },
    {
      "fieldId": "Description",
      "value": "Premium wireless headphones with noise cancellation",
      "fieldTypeId": "LocaleString"
    },
    {
      "fieldId": "Price",
      "value": 199.99,
      "fieldTypeId": "Double"
    },
    {
      "fieldId": "SKU",
      "value": "WHP-PRO-001",
      "fieldTypeId": "String"
    }
  ],
  "completeness": {
    "en-US": 95.5,
    "de-DE": 87.3
  },
  "segmentName": "Electronics",
  "segmentId": 789,
  "createdDate": "2023-01-15T10:30:00Z",
  "modifiedDate": "2024-01-25T14:22:00Z"
}

Key Fields

id: Unique entity identifier in InRiver
entityTypeId: Type of entity (Product, Category, Item, etc.)
fieldValues: Array of objects, each with a fieldId, value, and fieldTypeId (product attributes, descriptions, prices)
completeness: Data completeness scores by language
segmentName / segmentId: Segment/category classification
createdDate / modifiedDate: Timestamps for change tracking
fieldTypeId: Data type of each field (String, Integer, LocaleString, etc.)

💡 Working with Entity Data

Use load_values: true to include fieldValues array
fieldValues contains the actual product data (names, descriptions, prices)
LocaleString fields contain language-specific translations
modifiedDate is key for incremental synchronization
completeness scores help identify incomplete product data
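Because `fieldValues` arrives as an array of objects, it is often easier to work with once flattened into a dictionary keyed by `fieldId`. The sketch below assumes the entity shape shown in the Entity Response Format above; `flatten_field_values`, `incomplete_locales`, and the 90.0 threshold are hypothetical choices for illustration.

```python
def flatten_field_values(entity):
    """Return {fieldId: value}; empty when fieldValues is absent (e.g. load_values is false)."""
    return {fv["fieldId"]: fv["value"] for fv in entity.get("fieldValues", [])}


def incomplete_locales(entity, threshold=90.0):
    """Return the locales whose completeness score falls below the threshold."""
    scores = entity.get("completeness", {})
    return sorted(locale for locale, score in scores.items() if score < threshold)


# Trimmed copy of the sample entity from this page.
entity = {
    "id": 12345,
    "entityTypeId": "Product",
    "fieldValues": [
        {"fieldId": "ProductName", "value": "Wireless Headphones Pro", "fieldTypeId": "String"},
        {"fieldId": "Price", "value": 199.99, "fieldTypeId": "Double"},
    ],
    "completeness": {"en-US": 95.5, "de-DE": 87.3},
}

fields = flatten_field_values(entity)
print(fields["ProductName"])        # Wireless Headphones Pro
print(incomplete_locales(entity))   # ['de-DE']
```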

Media & Assets

Loading Product Media

Set load_media_details: true to include digital asset information:

{
  "id": 12345,
  "entityTypeId": "Product",
  "media": [
    {
      "id": 67890,
      "resourceId": "abc123def456",
      "filename": "headphones-main.jpg",
      "mimeType": "image/jpeg",
      "fileSize": 245678,
      "url": "https://cdn.inriver.com/resources/abc123def456",
      "width": 2000,
      "height": 2000
    },
    {
      "id": 67891,
      "resourceId": "xyz789uvw012",
      "filename": "headphones-side.jpg",
      "mimeType": "image/jpeg",
      "fileSize": 198432,
      "url": "https://cdn.inriver.com/resources/xyz789uvw012"
    }
  ]
}

⚠️ Performance Impact

Loading media details increases response size and processing time. Only enable if you need asset URLs, file metadata, or image dimensions. For large catalogs, consider separate media sync workflows.

📷 Media Best Practices

Set load_media_details: false for faster product data sync
Create separate workflow specifically for media sync if needed
Use media resourceId to download files from InRiver CDN
Cache media URLs to reduce repeated downloads
Filter media by type (images only, videos only) in post-processing
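Filtering media by type in post-processing can be as simple as matching on the `mimeType` prefix. This sketch assumes the `media` array shape shown above; `filter_media` is a hypothetical helper, not part of the integration.

```python
def filter_media(entity, mime_prefix="image/"):
    """Return the media entries whose mimeType starts with the given prefix."""
    return [
        item
        for item in entity.get("media", [])
        if item.get("mimeType", "").startswith(mime_prefix)
    ]
```

Passing `mime_prefix="video/"` would select videos instead; entities retrieved with `load_media_details: false` have no `media` key under this assumption and simply yield an empty list.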

Incremental Synchronization

Efficient Updates with Date Filtering

For large catalogs, use incremental sync to only process changed entities:

Implementation Steps

Initial Full Sync: First run with no date filter pulls all entities
Record Sync Time: Store timestamp when sync completes successfully
Subsequent Syncs: Set query_filter_last_date to last successful sync time
Add Date Filter: Include ModifiedDate filter with GreaterThan operator
Update Timestamp: Update stored timestamp after each successful sync

Incremental Config Example

{
  "inriver": {
    "query_filters": [
      {
        "type": "ChannelId",
        "value": "393",
        "operator": "Equal"
      },
      {
        "type": "ModifiedDate",
        "value": "2024-01-25T08:00:00",
        "operator": "GreaterThan"
      }
    ],
    "query_filter_last_date": "2024-01-25T08:00:00",
    "load_values": true,
    "chunk_size": 1000
  }
}

💡 Incremental Sync Tips

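Subtract a 5-minute buffer from the last sync time before using it in the date filter, so consecutive sync windows overlap and entities modified during processing are not missed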
Store last sync timestamp in database or persistent storage
Schedule frequent incremental syncs (every 15-60 minutes)
Implement fallback full sync weekly or monthly
Monitor for entities that might be missed due to timing
Test incremental logic thoroughly before production
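The steps and tips above can be sketched as a small state helper. Everything here is an assumption for illustration: the JSON state file, the function names, and the choice to subtract the buffer from the stored timestamp. A database row would serve equally well as persistent storage.

```python
import json
from datetime import datetime, timedelta
from pathlib import Path

BUFFER = timedelta(minutes=5)        # overlap between consecutive sync windows
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"   # same format as the config examples on this page


def load_last_sync(path):
    """Return the stored last-sync time, or None if no sync has completed yet."""
    state_file = Path(path)
    if not state_file.exists():
        return None
    stored = json.loads(state_file.read_text())
    return datetime.strptime(stored["last_sync"], STAMP_FORMAT)


def save_last_sync(path, when):
    """Persist the sync time; call this only after a sync finishes successfully."""
    Path(path).write_text(json.dumps({"last_sync": when.strftime(STAMP_FORMAT)}))


def incremental_filter(last_sync):
    """Return a ModifiedDate filter, or None to signal an initial full sync."""
    if last_sync is None:
        return None
    cutoff = last_sync - BUFFER
    return {
        "type": "ModifiedDate",
        "value": cutoff.strftime(STAMP_FORMAT),
        "operator": "GreaterThan",
    }
```

Because the windows overlap, some entities will be retrieved twice, so the destination should treat repeated updates as harmless (for example, by upserting on the entity `id`).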

Performance Optimization

⚡ Chunk Size Tuning

Optimize based on catalog size and entity complexity:

Simple products (few fields): chunk_size: 2000
Standard products: chunk_size: 1000
Complex products (many fields/relations): chunk_size: 500
With media loading: chunk_size: 200-500
Monitor memory usage and adjust accordingly
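To make the interaction between `chunk_size` and `chunk_waiting_time` concrete, here is a minimal sketch of chunked processing with a pause between batches. It illustrates the general technique only and is not the platform's internal implementation; `iter_chunks`, `process_in_chunks`, and the injectable `sleep` parameter are hypothetical.

```python
import time


def iter_chunks(items, chunk_size):
    """Yield consecutive slices of at most chunk_size items."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def process_in_chunks(items, handler, chunk_size=1000, chunk_waiting_time_ms=500,
                      sleep=time.sleep):
    """Pass each chunk to handler, pausing between chunks; return the item count."""
    processed = 0
    for index, chunk in enumerate(iter_chunks(items, chunk_size)):
        if index > 0:
            # Wait only between batches, not before the first one.
            sleep(chunk_waiting_time_ms / 1000)
        handler(chunk)
        processed += len(chunk)
    return processed
```

With 2,500 entities and `chunk_size=1000`, the handler sees batches of 1000, 1000, and 500, with two pauses in between. Smaller chunks lower peak memory at the cost of more pauses overall.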

⏱️ Rate Limiting

Manage API rate limits with appropriate delays:

Set chunk_waiting_time: 500-1000ms for standard operations
Increase to 1500-2000ms for large queries or media loading
Monitor InRiver API response times
Implement exponential backoff for retry logic
Coordinate with InRiver team for high-volume requirements
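The exponential backoff mentioned above could look like the following. This is a generic sketch under stated assumptions: `call_with_backoff` is a hypothetical wrapper, the default delays are arbitrary, and production code should narrow `retry_on` to errors that are actually transient (timeouts, rate-limit responses) rather than retrying every exception.

```python
import time


def call_with_backoff(func, max_attempts=5, base_delay_ms=500, max_delay_ms=8000,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func, doubling the wait after each failure, up to max_attempts tries."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            delay_ms = min(base_delay_ms * 2 ** attempt, max_delay_ms)
            sleep(delay_ms / 1000)
```

With the defaults, the waits between attempts are 0.5 s, 1 s, 2 s, and 4 s; the cap of 8 s only comes into play if `max_attempts` is raised.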

🎯 Query Optimization

Reduce data volume with effective filtering:

Filter by ChannelId to limit to specific channels
Use EntityTypeId to separate entity types into different workflows
Add custom field filters to reduce result set
Use incremental sync for ongoing operations
Consider separate workflows for products vs. categories vs. assets

📅 Scheduling Strategy

Optimize timing for best performance:

Full sync: Weekly or monthly during off-peak hours
Incremental sync: Every 15-60 minutes during business hours
Media sync: Nightly or as-needed basis
Category sync: Less frequent (categories change rarely)

Troubleshooting

🔴 Issue: "No entities returned"

Cause: Query filters too restrictive or no entities match criteria.

Solution: Test filters in InRiver UI. Verify ChannelId is correct. Check EntityTypeId matches expected entities. Remove date filters temporarily to test.

🔴 Issue: "Authentication failed"

Cause: API key invalid, expired, or lacks permissions.

Solution: Verify api_key in InRiver Control Center. Check base_url matches your instance. Ensure key has read permissions for entities.

🔴 Issue: "Timeout or slow performance"

Cause: Query returning too many entities or complex relationships.

Solution: Reduce chunk_size (try 500). Add more specific filters. Disable load_media_details if not needed. Increase chunk_waiting_time.

🔴 Issue: "Missing field values"

Cause: load_values set to false or fields not populated in InRiver.

Solution: Set load_values: true in configuration. Verify fields are filled in InRiver. Check completeness scores in response.

🔴 Issue: "Incremental sync missing updates"

Cause: Date filter too recent or timestamp not updating.

Solution: Move the date filter 5-10 minutes earlier than the last sync time so sync windows overlap. Verify timestamp storage/retrieval logic. Check ModifiedDate filter syntax. Test with a known recently-updated entity.

🔴 Issue: "Media URLs not working"

Cause: Media not published to channel or load_media_details disabled.

Solution: Set load_media_details: true. Verify media published to queried channel in InRiver. Check media download URLs have proper authentication.

🔴 Issue: "Memory errors with large catalogs"

Cause: chunk_size too large or insufficient memory allocation.

Solution: Reduce chunk_size to 200-500. Process entity types separately. Increase server memory allocation. Disable media loading.

API Response

Success Response

When entities are successfully retrieved:

{
  "status": "success",
  "total_entities": 1547,
  "entities_retrieved": 1547,
  "channel_id": "393",
  "query_time": "2024-01-25T10:15:00Z",
  "entities": [
    {
      "id": 12345,
      "entityTypeId": "Product",
      "fieldValues": [...]
    }
  ]
}

Error Response

If an error occurs:

{
  "status": "error",
  "message": "Authentication failed",
  "details": {
    "reason": "Invalid API key",
    "code": 401
  }
}
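Downstream code usually needs to branch on `status` before touching `entities`. The sketch below assumes exactly the two response shapes shown above; `QueryError` and `extract_entities` are hypothetical names.

```python
class QueryError(Exception):
    """Raised when the action reports status "error"."""


def extract_entities(response):
    """Return the entities list on success; raise QueryError otherwise."""
    if response.get("status") == "success":
        return response.get("entities", [])
    details = response.get("details", {})
    raise QueryError(
        f"{response.get('message', 'Unknown error')} "
        f"(code {details.get('code')}): {details.get('reason')}"
    )
```

On the success sample, comparing `entities_retrieved` with `total_entities` is a cheap way to spot a partially retrieved result set.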

Field Mapping

Mapping InRiver Fields to Destination

Map InRiver entity fields to your destination system format:

{
  "mapped_fields": [
    {
      "origin_field": "id",
      "destination_field": "sku",
      "convert_function": ""
    },
    {
      "origin_field": "fieldValues.ProductName",
      "destination_field": "title",
      "convert_function": ""
    },
    {
      "origin_field": "fieldValues.Description",
      "destination_field": "description",
      "convert_function": "stripHtml"
    },
    {
      "origin_field": "fieldValues.Price",
      "destination_field": "price",
      "convert_function": "parseFloat"
    },
    {
      "origin_field": "modifiedDate",
      "destination_field": "last_updated",
      "convert_function": ""
    }
  ]
}

💡 Mapping Tips

InRiver field names are stored in the fieldId of each fieldValues entry
Use origin_field path notation to access nested data
Apply convert_function for data transformation (formatting, parsing)
Handle LocaleString fields by specifying language code
Map modifiedDate for change detection in destination
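To show how a `mapped_fields` list might be applied to one entity, here is a rough sketch. It rests on several assumptions: that a path such as `fieldValues.ProductName` resolves to the entry whose `fieldId` is `ProductName`, that an empty `convert_function` means "pass through", and that `stripHtml` and `parseFloat` behave like the simple stand-ins below. The platform's real resolution and converter logic may differ.

```python
import re

# Stand-in converters; the platform's actual implementations are not documented here.
CONVERTERS = {
    "": lambda value: value,
    "stripHtml": lambda value: re.sub(r"<[^>]+>", "", value),  # crude tag removal
    "parseFloat": float,
}


def resolve(entity, path):
    """Look up a top-level key, or a fieldValues entry by fieldId for 'fieldValues.X'."""
    head, _, rest = path.partition(".")
    if head == "fieldValues" and rest:
        for field in entity.get("fieldValues", []):
            if field.get("fieldId") == rest:
                return field.get("value")
        return None
    return entity.get(path)


def apply_mapping(entity, mapped_fields):
    """Build the destination record; unknown convert_function names raise KeyError."""
    record = {}
    for mapping in mapped_fields:
        value = resolve(entity, mapping["origin_field"])
        convert = CONVERTERS[mapping.get("convert_function", "")]
        record[mapping["destination_field"]] = convert(value) if value is not None else None
    return record
```

Missing origin fields map to `None` rather than raising, so a partially populated entity still produces a record; whether that is desirable depends on the destination system.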