Skip to main content

Insights

ETL Transform Tasks

The insights database is populated using the built-in RavenDB SQL ETL Task. Each database has one or more ETL tasks for collections we want to replicate. The transform scripts are managed in code via the RavenDB migrations process.

The tasks are defined as migration tasks, a migration task will be executed every time the migrations for the given database runs but only if the contents of the file have changed. We use a file hash to determine this, see components/ravendb/migrations.ts for details.

Migration tasks are kept in the tasks folder under migrations for each service, i.e. services/customers/api/migrations/tasks

You can read more about RavenDB SQL ETL here

You can see an example of setting up ETL transforms for Inventory here services/inventory/api/migrations/tasks/050-inv-transformers.js

Notes...

  1. Array types are handled by RavenDB ETL but you need to use Text data type for strings not varchar / char / varchar() etc and you must pass an empty array instead of null otherwise RavenDB won't be able to determine the type properly, example...
source_business_units: {
Type : 'Array | Text',
Value : m.source && m.source.businessUnitIds ? m.source.businessUnitIds.filter(b => b) : []
},

Insights Schema Management

The SQL schema is managed via migration script services. If you need to change any schema related to customer data for example you should add a migration script to the customers service.

services/inventory/api/migrations/sql

These scripts will only be run once but should be idempotent