Query & Search Entities

Pull / Origin · Entity Retrieval

Overview

The Query & Search Entities action allows you to pull products, categories, items, and other entities from InRiver PIM using advanced filtering, pagination, and chunked processing. This is the primary action for extracting product data from InRiver for synchronization with e-commerce platforms, data warehouses, or content management systems.

This action supports complex query filters, incremental synchronization based on modified dates, optional media loading, and configurable chunking for handling large enterprise product catalogs efficiently.

Key Features

🔍 Advanced Filtering

Use InRiver query filters with multiple operators for precise data extraction

📊 Incremental Sync

Date-based filtering to only retrieve entities modified since last sync

⚡ Chunked Processing

Handle millions of entities with configurable batch sizes and delays

🖼️ Media Support

Optionally load product images, videos, and digital asset metadata

🔄 Automatic Pagination

Seamlessly handles InRiver API pagination to retrieve all matching records

🎯 Entity Types

Query any InRiver entity type: Products, Categories, Items, Specifications, etc.

Configuration

Required Fields

Ensure your InRiver integration is configured with valid credentials and query parameters:

{
  "inriver": {
    "base_url": "https://your-instance.productmarketingcloud.com",
    "api_key": "your_api_key_here",
    "query_filters": [
      {
        "type": "ChannelId",
        "value": "393",
        "operator": "Equal"
      }
    ],
    "query_filter_last_date": "2024-01-01T00:00:00",
    "load_values": true,
    "load_media_details": false,
    "chunk_waiting_time": 500,
    "chunk_size": 1000
  }
}

🔑 Configuration Requirements

  • base_url: Your InRiver REST API endpoint
  • api_key: InRiver API authentication key
  • query_filters: Array of filter objects to specify which entities to retrieve
  • query_filter_last_date: Optional date for incremental sync
  • load_values: Whether to include entity field values (recommended: true)
  • load_media_details: Whether to include media/asset metadata
  • chunk_size: Records per batch (500-2000 recommended)
  • chunk_waiting_time: Delay between batches in milliseconds
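
A quick pre-flight check of this configuration can catch typos before a run. The sketch below is only an illustration: `validate_inriver_config` is a hypothetical helper, not part of the integration, and it assumes that `base_url`, `api_key`, and `query_filters` are the mandatory keys while the rest are optional.

```python
from datetime import datetime

# Assumption: these three keys are mandatory; the remaining keys are treated as optional.
REQUIRED_KEYS = ("base_url", "api_key", "query_filters")
FILTER_KEYS = {"type", "value", "operator"}
# Assumption: a batch needs at least one record, and a zero delay is allowed.
MINIMUMS = {"chunk_size": 1, "chunk_waiting_time": 0}


def validate_inriver_config(config: dict) -> list:
    """Return a list of problems found; an empty list means nothing obvious is wrong."""
    problems = []
    inriver = config.get("inriver") or {}

    for key in REQUIRED_KEYS:
        if not inriver.get(key):
            problems.append(f"missing required key: {key}")

    for index, query_filter in enumerate(inriver.get("query_filters") or []):
        missing = FILTER_KEYS - set(query_filter)
        if missing:
            problems.append(f"query_filters[{index}] is missing {sorted(missing)}")

    last_date = inriver.get("query_filter_last_date")
    if last_date:
        try:
            datetime.fromisoformat(last_date)
        except ValueError:
            problems.append(f"query_filter_last_date is not an ISO 8601 timestamp: {last_date!r}")

    for key, minimum in MINIMUMS.items():
        value = inriver.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if value is not None and (isinstance(value, bool) or not isinstance(value, int) or value < minimum):
            problems.append(f"{key} must be an integer >= {minimum}")

    return problems
```

Running it against the sample configuration above should return an empty list; anything else is worth fixing before scheduling the workflow.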

Query Filters

Building Effective Queries

Query filters are the core mechanism for specifying which entities to retrieve from InRiver:

Filter Anatomy

{
  "type": "ChannelId",
  "value": "393",
  "operator": "Equal"
}

type

The InRiver field or attribute to filter on (e.g., ChannelId, EntityTypeId, ModifiedDate, custom field names)

value

The value to compare against (string, number, or date depending on field type)

operator

The comparison operator: Equal, NotEqual, GreaterThan, LessThan, Contains, In, Between

Common Filter Patterns

Filter by Channel

Retrieve entities from a specific InRiver channel (e.g., web, print, mobile):

{
  "type": "ChannelId",
  "value": "393",
  "operator": "Equal"
}

Filter by Entity Type

Retrieve specific entity types (Products, Categories, Items):

{
  "type": "EntityTypeId",
  "value": "Product",
  "operator": "Equal"
}

Incremental Sync by Date

Only retrieve entities modified after a specific date:

{
  "type": "ModifiedDate",
  "value": "2024-01-25T00:00:00",
  "operator": "GreaterThan"
}

Multiple Filters Combined

Combine filters for precise queries:

[
  {
    "type": "ChannelId",
    "value": "393",
    "operator": "Equal"
  },
  {
    "type": "EntityTypeId",
    "value": "Product",
    "operator": "Equal"
  },
  {
    "type": "ModifiedDate",
    "value": "2024-01-25T00:00:00",
    "operator": "GreaterThan"
  }
]

💡 Filter Best Practices

  • Start broad, then add filters to narrow results
  • Always filter by ChannelId for channel-specific data
  • Use EntityTypeId to separate products, categories, and other entities
  • Test filters in InRiver UI or Postman before implementing
  • For incremental sync, add ModifiedDate filter with GreaterThan operator
  • Combine filters logically: all filters must match (AND logic)
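
To make the AND semantics concrete, here is a small local model in Python. It is a sketch only: InRiver evaluates the real filters server-side, and `OPERATORS` and `matches_all` are hypothetical names that cover just a few of the operators listed earlier. The flat `record` dictionary is also an assumption made for illustration.

```python
import operator

# Local stand-ins for some of the documented operators. The real evaluation
# happens inside InRiver; this only models how filters combine.
OPERATORS = {
    "Equal": operator.eq,
    "NotEqual": operator.ne,
    "GreaterThan": operator.gt,
    "LessThan": operator.lt,
    "Contains": lambda field, value: value in field,
}


def matches_all(record: dict, filters: list) -> bool:
    """True only if the record satisfies every filter (AND logic)."""
    return all(
        f["type"] in record and OPERATORS[f["operator"]](record[f["type"]], f["value"])
        for f in filters
    )


filters = [
    {"type": "ChannelId", "value": "393", "operator": "Equal"},
    {"type": "EntityTypeId", "value": "Product", "operator": "Equal"},
    {"type": "ModifiedDate", "value": "2024-01-25T00:00:00", "operator": "GreaterThan"},
]

# ISO 8601 timestamps in the same format compare correctly as plain strings.
record = {"ChannelId": "393", "EntityTypeId": "Product", "ModifiedDate": "2024-01-26T09:00:00"}
print(matches_all(record, filters))
```

Changing any single value in `record` so that one filter fails makes the whole match fail, which is why an over-specified filter list can silently return nothing.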

Workflow Configuration

Example workflow configuration for querying InRiver entities:

{
  "name": "Sync Products from InRiver to E-commerce",
  "origin": {
    "microservice": "integrations",
    "action": "query_search",
    "type": "pull",
    "integration": {
      "name": "inriver"
    },
    "mapped_fields": [
      {
        "origin_field": "id",
        "destination_field": "sku"
      },
      {
        "origin_field": "fieldValues.ProductName",
        "destination_field": "title"
      },
      {
        "origin_field": "fieldValues.Description",
        "destination_field": "description"
      },
      {
        "origin_field": "fieldValues.Price",
        "destination_field": "price"
      }
    ]
  },
  "destination": {
    "microservice": "ecommerce",
    "action": "push_products",
    "chunk_size": 100,
    "chunk_waiting_time": 1500
  }
}

Workflow Parameters

  • action: Set to query_search
  • type: Set to pull (origin operation)
  • chunk_size: Number of entities to process per batch (500-2000)
  • chunk_waiting_time: Delay between batches in milliseconds (500-2000)
  • integration.name: Set to inriver
  • mapped_fields: Map InRiver fields to destination system format

Use Cases

🛒 E-commerce Product Sync

Scenario: Pull product catalog from InRiver and push to Shopify, BigCommerce, or Magento.

Benefit: Maintain a single source of truth in InRiver while keeping e-commerce platforms updated.

📊 Data Warehouse Population

Scenario: Export InRiver product data to Snowflake, BigQuery, or Redshift for analytics.

Benefit: Combine product data with sales, inventory, and customer data for comprehensive reporting.

🌐 Multi-Channel Publishing

Scenario: Pull channel-specific product data for web, mobile apps, and print catalogs.

Benefit: Distribute tailored product content to each channel from InRiver PIM.

🔄 Near Real-Time Product Updates

Scenario: Use incremental sync (every 15-30 minutes) to push product updates as they occur.

Benefit: Keep downstream systems synchronized with near real-time product data.

🖼️ Digital Asset Management

Scenario: Pull product images, videos, and documents for CDN or media library.

Benefit: Centralize digital asset distribution from InRiver to all touchpoints.

📱 Mobile App Content

Scenario: Export product data optimized for mobile applications.

Benefit: Deliver consistent product information across web and mobile experiences.

Data Structure

Entity Response Format

Each entity retrieved from InRiver has the following structure:

{
  "id": 12345,
  "entityTypeId": "Product",
  "fieldValues": [
    {
      "fieldId": "ProductName",
      "value": "Wireless Headphones Pro",
      "fieldTypeId": "String"
    },
    {
      "fieldId": "Description",
      "value": "Premium wireless headphones with noise cancellation",
      "fieldTypeId": "LocaleString"
    },
    {
      "fieldId": "Price",
      "value": 199.99,
      "fieldTypeId": "Double"
    },
    {
      "fieldId": "SKU",
      "value": "WHP-PRO-001",
      "fieldTypeId": "String"
    }
  ],
  "completeness": {
    "en-US": 95.5,
    "de-DE": 87.3
  },
  "segmentName": "Electronics",
  "segmentId": 789,
  "createdDate": "2023-01-15T10:30:00Z",
  "modifiedDate": "2024-01-25T14:22:00Z"
}

Key Fields

  • id: Unique entity identifier in InRiver
  • entityTypeId: Type of entity (Product, Category, Item, etc.)
  • fieldValues: Array of field name/value pairs (product attributes, descriptions, prices)
  • completeness: Data completeness scores by language
  • segmentName / segmentId: Segment/category classification
  • createdDate / modifiedDate: Timestamps for change tracking
  • fieldTypeId: Data type of each field, found inside each fieldValues entry (String, Integer, LocaleString, etc.)

💡 Working with Entity Data

  • Use load_values: true to include the fieldValues array
  • fieldValues contains the actual product data (names, descriptions, prices)
  • LocaleString fields contain language-specific translations
  • modifiedDate is key for incremental synchronization
  • completeness scores help identify incomplete product data
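
Because `fieldValues` is an array of entries rather than a keyed object, it is often convenient to flatten it before use. The following Python sketch assumes the entity shape shown above; `field_values_to_dict` and `incomplete_locales` are hypothetical helper names, and the 90.0 threshold is an arbitrary example value.

```python
def field_values_to_dict(entity: dict) -> dict:
    """Map each fieldId to its value; returns {} when fieldValues is absent."""
    return {fv["fieldId"]: fv["value"] for fv in entity.get("fieldValues") or []}


def incomplete_locales(entity: dict, threshold: float = 90.0) -> list:
    """List the locales whose completeness score falls below the threshold."""
    scores = entity.get("completeness") or {}
    return sorted(locale for locale, score in scores.items() if score < threshold)


entity = {
    "id": 12345,
    "entityTypeId": "Product",
    "fieldValues": [
        {"fieldId": "ProductName", "value": "Wireless Headphones Pro", "fieldTypeId": "String"},
        {"fieldId": "Price", "value": 199.99, "fieldTypeId": "Double"},
        {"fieldId": "SKU", "value": "WHP-PRO-001", "fieldTypeId": "String"},
    ],
    "completeness": {"en-US": 95.5, "de-DE": 87.3},
}

fields = field_values_to_dict(entity)
print(fields["SKU"], fields["Price"])
print(incomplete_locales(entity))
```

An entity fetched with load_values: false has no `fieldValues` array, so the first helper returns an empty dictionary rather than raising.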

Media & Assets

Loading Product Media

Set load_media_details: true to include digital asset information:

{
  "id": 12345,
  "entityTypeId": "Product",
  "media": [
    {
      "id": 67890,
      "resourceId": "abc123def456",
      "filename": "headphones-main.jpg",
      "mimeType": "image/jpeg",
      "fileSize": 245678,
      "url": "https://cdn.inriver.com/resources/abc123def456",
      "width": 2000,
      "height": 2000
    },
    {
      "id": 67891,
      "resourceId": "xyz789uvw012",
      "filename": "headphones-side.jpg",
      "mimeType": "image/jpeg",
      "fileSize": 198432,
      "url": "https://cdn.inriver.com/resources/xyz789uvw012"
    }
  ]
}

⚠️ Performance Impact

Loading media details increases response size and processing time. Only enable if you need asset URLs, file metadata, or image dimensions. For large catalogs, consider separate media sync workflows.

📷 Media Best Practices

  • Set load_media_details: false for faster product data sync
  • Create a separate workflow specifically for media sync if needed
  • Use media resourceId to download files from InRiver CDN
  • Cache media URLs to reduce repeated downloads
  • Filter media by type (images only, videos only) in post-processing
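
Filtering media by type in post-processing can be as simple as matching on the `mimeType` prefix. This is a minimal Python sketch assuming the `media` array shape shown above; `media_by_type` is a hypothetical helper.

```python
def media_by_type(entity: dict, prefix: str = "image/") -> list:
    """Return the media entries whose mimeType starts with the given prefix."""
    return [
        item
        for item in entity.get("media") or []
        if item.get("mimeType", "").startswith(prefix)
    ]


entity = {
    "id": 12345,
    "media": [
        {"id": 67890, "filename": "headphones-main.jpg", "mimeType": "image/jpeg"},
        {"id": 67892, "filename": "headphones-demo.mp4", "mimeType": "video/mp4"},
    ],
}

images = media_by_type(entity)                    # images only
videos = media_by_type(entity, prefix="video/")   # videos only
print([m["filename"] for m in images], [m["filename"] for m in videos])
```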

Incremental Synchronization

Efficient Updates with Date Filtering

For large catalogs, use incremental sync to only process changed entities:

Implementation Steps

  1. Initial Full Sync: First run with no date filter pulls all entities
  2. Record Sync Time: Store timestamp when sync completes successfully
  3. Subsequent Syncs: Set query_filter_last_date to last successful sync time
  4. Add Date Filter: Include ModifiedDate filter with GreaterThan operator
  5. Update Timestamp: Update stored timestamp after each successful sync

Incremental Config Example

{
  "inriver": {
    "query_filters": [
      {
        "type": "ChannelId",
        "value": "393",
        "operator": "Equal"
      },
      {
        "type": "ModifiedDate",
        "value": "2024-01-25T08:00:00",
        "operator": "GreaterThan"
      }
    ],
    "query_filter_last_date": "2024-01-25T08:00:00",
    "load_values": true,
    "chunk_size": 1000
  }
}

💡 Incremental Sync Tips

  • Subtract a 5-minute buffer from the stored timestamp before using it as the date filter, so entities modified while the previous sync was running are not missed
  • Store last sync timestamp in database or persistent storage
  • Schedule frequent incremental syncs (every 15-60 minutes)
  • Implement fallback full sync weekly or monthly
  • Monitor for entities that might be missed due to timing
  • Test incremental logic thoroughly before production
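
The steps and tips above can be sketched in Python as a function that derives the configuration for the next run from the stored timestamp. Everything here is illustrative: `build_incremental_config` is a hypothetical helper, and how the timestamp is persisted between runs is left to the caller.

```python
import copy
from datetime import datetime, timedelta
from typing import Optional

DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"  # matches the timestamps used in the examples above


def build_incremental_config(base: dict, last_sync: Optional[datetime], buffer_minutes: int = 5) -> dict:
    """Return a copy of the config with the date filter set for the next run.

    last_sync=None means no successful sync has happened yet, so the result
    carries no date filter and triggers a full sync.
    """
    config = copy.deepcopy(base)
    inriver = config["inriver"]

    # Drop any stale ModifiedDate filter before adding the fresh one.
    filters = [f for f in inriver.get("query_filters", []) if f["type"] != "ModifiedDate"]
    inriver.pop("query_filter_last_date", None)

    if last_sync is not None:
        # Move the cutoff back by the buffer to cover changes made mid-sync.
        since = (last_sync - timedelta(minutes=buffer_minutes)).strftime(DATE_FORMAT)
        filters.append({"type": "ModifiedDate", "value": since, "operator": "GreaterThan"})
        inriver["query_filter_last_date"] = since

    inriver["query_filters"] = filters
    return config


base = {
    "inriver": {
        "query_filters": [{"type": "ChannelId", "value": "393", "operator": "Equal"}],
        "load_values": True,
        "chunk_size": 1000,
    }
}
incremental = build_incremental_config(base, datetime(2024, 1, 25, 8, 0, 0))
print(incremental["inriver"]["query_filter_last_date"])
```

Because the buffer makes consecutive windows overlap, some entities will be delivered twice, so the destination side should treat updates as idempotent upserts.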

Performance Optimization

⚡ Chunk Size Tuning

Optimize based on catalog size and entity complexity:

  • Simple products (few fields): chunk_size: 2000
  • Standard products: chunk_size: 1000
  • Complex products (many fields/relations): chunk_size: 500
  • With media loading: chunk_size: 200-500
  • Monitor memory usage and adjust accordingly
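
As a mental model of how chunk_size and chunk_waiting_time interact, consider the Python generator below. It is an assumption about the general pattern, not a description of the platform's internals; `iter_chunks` is a hypothetical name, and the `sleep` parameter exists only so the delay can be swapped out in tests.

```python
import itertools
import time
from typing import Iterable, Iterator


def iter_chunks(items: Iterable, chunk_size: int, waiting_time_ms: int = 0, sleep=time.sleep) -> Iterator[list]:
    """Yield lists of up to chunk_size items, pausing between consecutive chunks."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    first = True
    while True:
        chunk = list(itertools.islice(iterator, chunk_size))
        if not chunk:
            return
        # Wait before every chunk except the first one.
        if not first and waiting_time_ms > 0:
            sleep(waiting_time_ms / 1000)
        first = False
        yield chunk


for batch in iter_chunks(range(5), chunk_size=2, waiting_time_ms=10):
    print(batch)
```

Only one chunk is held in memory at a time, which is why lowering chunk_size is the first lever to pull when memory is tight.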

⏱️ Rate Limiting

Manage API rate limits with appropriate delays:

  • Set chunk_waiting_time: 500-1000ms for standard operations
  • Increase to 1500-2000ms for large queries or media loading
  • Monitor InRiver API response times
  • Implement exponential backoff for retry logic
  • Coordinate with InRiver team for high-volume requirements
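
Exponential backoff for retries can be sketched as follows. This is a generic Python pattern rather than anything specific to InRiver: `with_backoff` is a hypothetical helper, the default delays are placeholder values, and in practice `retry_on` should be narrowed to the exceptions your HTTP client raises for timeouts or rate-limit responses.

```python
import random
import time


def with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=30.0,
                 jitter=0.1, retry_on=(Exception,), sleep=time.sleep):
    """Run call(), retrying on failure with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (0.5s, 1s, 2s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little random jitter avoids many clients retrying in lockstep.
            sleep(delay + random.uniform(0, delay * jitter))


attempts = {"count": 0}


def flaky_request():
    """Stand-in for an API call that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"


print(with_backoff(flaky_request, base_delay=0.01))
```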

🎯 Query Optimization

Reduce data volume with effective filtering:

  • Filter by ChannelId to limit to specific channels
  • Use EntityTypeId to separate entity types into different workflows
  • Add custom field filters to reduce result set
  • Use incremental sync for ongoing operations
  • Consider separate workflows for products vs. categories vs. assets

📅 Scheduling Strategy

Optimize timing for best performance:

  • Full sync: Weekly or monthly during off-peak hours
  • Incremental sync: Every 15-60 minutes during business hours
  • Media sync: Nightly or as-needed basis
  • Category sync: Less frequent (categories change rarely)

Troubleshooting

🔴 Issue: "No entities returned"

Cause: Query filters too restrictive or no entities match criteria.

Solution: Test filters in InRiver UI. Verify ChannelId is correct. Check EntityTypeId matches expected entities. Remove date filters temporarily to test.

🔴 Issue: "Authentication failed"

Cause: API key invalid, expired, or lacks permissions.

Solution: Verify api_key in InRiver Control Center. Check base_url matches your instance. Ensure key has read permissions for entities.

🔴 Issue: "Timeout or slow performance"

Cause: Query returning too many entities or complex relationships.

Solution: Reduce chunk_size (try 500). Add more specific filters. Disable load_media_details if not needed. Increase chunk_waiting_time.

🔴 Issue: "Missing field values"

Cause: load_values set to false or fields not populated in InRiver.

Solution: Set load_values: true in configuration. Verify fields are filled in InRiver. Check completeness scores in response.

🔴 Issue: "Incremental sync missing updates"

Cause: Date filter too recent or timestamp not updating.

Solution: Move the date filter back by a 5-10 minute buffer. Verify timestamp storage/retrieval logic. Check ModifiedDate filter syntax. Test with a known recently-updated entity.

🔴 Issue: "Media URLs not working"

Cause: Media not published to channel or load_media_details disabled.

Solution: Set load_media_details: true. Verify media published to queried channel in InRiver. Check media download URLs have proper authentication.

🔴 Issue: "Memory errors with large catalogs"

Cause: chunk_size too large or insufficient memory allocation.

Solution: Reduce chunk_size to 200-500. Process entity types separately. Increase server memory allocation. Disable media loading.

API Response

Success Response

When entities are successfully retrieved:

{
  "status": "success",
  "total_entities": 1547,
  "entities_retrieved": 1547,
  "channel_id": "393",
  "query_time": "2024-01-25T10:15:00Z",
  "entities": [
    {
      "id": 12345,
      "entityTypeId": "Product",
      "fieldValues": [...]
    }
  ]
}

Error Response

If an error occurs:

{
  "status": "error",
  "message": "Authentication failed",
  "details": {
    "reason": "Invalid API key",
    "code": 401
  }
}
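
A consumer of these responses might branch on `status` along the following lines. This Python sketch assumes only the two response shapes shown above; `InRiverQueryError` and `extract_entities` are hypothetical names.

```python
class InRiverQueryError(Exception):
    """Raised when a query response reports status 'error'."""

    def __init__(self, message, code=None, reason=None):
        super().__init__(message)
        self.code = code
        self.reason = reason


def extract_entities(response: dict) -> list:
    """Return the entities of a success response, or raise on an error response."""
    if response.get("status") != "success":
        details = response.get("details") or {}
        raise InRiverQueryError(
            response.get("message", "unknown error"),
            code=details.get("code"),
            reason=details.get("reason"),
        )
    return response.get("entities") or []


error_response = {
    "status": "error",
    "message": "Authentication failed",
    "details": {"reason": "Invalid API key", "code": 401},
}

try:
    extract_entities(error_response)
except InRiverQueryError as exc:
    print(f"{exc} (code={exc.code}, reason={exc.reason})")
```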

Field Mapping

Mapping InRiver Fields to Destination

Map InRiver entity fields to your destination system format:

{
  "mapped_fields": [
    {
      "origin_field": "id",
      "destination_field": "sku",
      "convert_function": ""
    },
    {
      "origin_field": "fieldValues.ProductName",
      "destination_field": "title",
      "convert_function": ""
    },
    {
      "origin_field": "fieldValues.Description",
      "destination_field": "description",
      "convert_function": "stripHtml"
    },
    {
      "origin_field": "fieldValues.Price",
      "destination_field": "price",
      "convert_function": "parseFloat"
    },
    {
      "origin_field": "modifiedDate",
      "destination_field": "last_updated",
      "convert_function": ""
    }
  ]
}

💡 Mapping Tips

  • InRiver field names are stored in the fieldValues array
  • Use origin_field path notation to access nested data
  • Apply convert_function for data transformation (formatting, parsing)
  • Handle LocaleString fields by specifying language code
  • Map modifiedDate for change detection in destination
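
To illustrate what a mapping like the one above does to a single entity, here is a rough Python model. It rests on several assumptions: that a path such as `fieldValues.ProductName` selects the `fieldValues` entry whose `fieldId` is `ProductName`, and that `stripHtml` and `parseFloat` behave like the local stand-ins in `CONVERTERS`. The platform's actual resolution and conversion logic may differ; `resolve` and `apply_mapping` are hypothetical names.

```python
import re

# Local stand-ins for the convert_function names used in the example above.
CONVERTERS = {
    "": lambda value: value,
    "stripHtml": lambda value: re.sub(r"<[^>]+>", "", value),
    "parseFloat": float,
}


def resolve(entity: dict, path: str):
    """Look up an origin_field path; fieldValues.X selects the entry with fieldId X."""
    head, _, rest = path.partition(".")
    if head == "fieldValues" and rest:
        for field_value in entity.get("fieldValues") or []:
            if field_value.get("fieldId") == rest:
                return field_value.get("value")
        return None
    return entity.get(path)


def apply_mapping(entity: dict, mapped_fields: list) -> dict:
    """Build a destination record from one entity; missing fields map to None."""
    record = {}
    for mapping in mapped_fields:
        value = resolve(entity, mapping["origin_field"])
        convert = CONVERTERS[mapping.get("convert_function", "")]
        record[mapping["destination_field"]] = None if value is None else convert(value)
    return record


entity = {
    "id": 12345,
    "fieldValues": [
        {"fieldId": "Description", "value": "<p>Premium wireless headphones</p>"},
        {"fieldId": "Price", "value": "199.99"},
    ],
    "modifiedDate": "2024-01-25T14:22:00Z",
}
mapped_fields = [
    {"origin_field": "id", "destination_field": "sku", "convert_function": ""},
    {"origin_field": "fieldValues.Description", "destination_field": "description", "convert_function": "stripHtml"},
    {"origin_field": "fieldValues.Price", "destination_field": "price", "convert_function": "parseFloat"},
    {"origin_field": "modifiedDate", "destination_field": "last_updated", "convert_function": ""},
]
print(apply_mapping(entity, mapped_fields))
```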
