Import and Migration Strategies – Post 2: Choosing Between Direct Import and Staging with Temporary Tables

Choosing Between Direct Import and Staging with Temporary Tables

Import and Migration Strategies – Post 2
:link: Read Post 1: Understanding Migration, Interfacing, and Hybrid Integration in Novulo


Introduction

When importing or migrating data into Novulo, you essentially have two technical strategies to choose from:

  1. Direct Import – importing data straight into the Novulo tables (e.g. Products, Contacts)
  2. Staging with Temporary Tables – first importing raw data into temporary structures (e.g. ImportProducts), then transforming and processing them into the Novulo application tables

This post provides guidance for making that choice. The key message:

Use Direct Import unless there are good reasons not to.

The Novulo Import Engine is highly capable and supports many transformation patterns out-of-the-box. But in some scenarios, staging is the more robust or even required path.


Direct Import – Fast, Flexible and Powerful

Direct Import in Novulo is often the fastest way to get data in. It uses configuration-based mappings that are:

  • Quick to set up
  • Easy to adjust
  • Deployed without code changes

Advanced features supported by the Import Engine

Novulo’s import engine isn’t a basic field-to-field mapper. It supports:

  • Normalization & denormalization, by mapping multiple input fields into multiple related tables
  • Derived foreign keys, e.g., combining two fields into a reference
  • Data validation, with automatic error handling on required fields, uniqueness, formats, etc.
  • Composite keys, e.g. using (CustomerNumber + DivisionCode) as a key
  • Fallback expressions, for defaulting or computing values on-the-fly

This makes Direct Import very suitable for most one-off migrations and many interfaces — especially when the source data structure closely matches the Novulo target model.


When to Use Temporary Staging Tables

There are certain scenarios where staging tables are preferable — or even necessary.

What is a Temporary Staging Table?

A staging table is typically a setup consisting of two database tables:

  1. Run Table – captures the metadata of an import run (e.g., timestamp, source, status).
  2. Records Table – stores the raw, unvalidated data associated with each run.

Data is inserted into the staging table in raw format — meaning that there’s typically no validation or interpretation applied during import. After the data is staged, Novulo logic processes it and moves it into the structured application tables, applying validations, mappings, and reference lookups as needed.

Here’s a simplified visualization of the structure:

StagingTables RunTable Run Table + RunID (PK)\l+ Timestamp\l+ SourceFileName\l+ ImportStatus\l+ ImportedBy\l RecordsTable Records Table + RecordID (PK)\l+ RunID (FK)\l+ Field_1_raw\l+ Field_2_raw\l+ Field_3_raw\l RunTable->RecordsTable RunID

Why Use Staging Tables?

We advise: use Direct Import by default, unless there are good reasons not to. Below are common reasons why staging tables are preferred:

1. Preview and Correction Needs

When you want to preview the data before it’s applied to the Novulo model — or correct it either in the staging table or using Novulo master data — a staging table gives you control and visibility. You can re-process the same data after fixing issues, without resubmitting the original file.

2. Handling Record Fragmentation

In some source systems, data is fragmented across records. For example:

  • A night shift might be split into two records: 22:00–00:00 and 00:00–06:00, but Novulo expects a single record: 22:00–06:00.
    This requires logic to detect and merge fragments — something that is more manageable in staging logic than in direct import mappings.

3. Complex Structure Mismatches

In some cases, a single logical object is represented in an unusual structure. For example:

  • A “shipment” might be defined across a header, multiple detail tables, and status records that need to be combined into one Novulo object.
    Such reconstruction often benefits from temporary staging logic to combine and validate the intermediate records.

4. Preview + Manual Correction (Again)

Staging tables allow you to pause, inspect, and validate the data before applying it. This is particularly useful when working with legacy or unreliable source systems.

5. Error Handling and Traceability

With staging tables, errors can be logged and investigated without rejecting the entire import. You can retry only the failed records or adjust logic based on observed patterns.


A Note on Performance and Scale

For large datasets (50,000+ records), using staging tables may become too heavy. Every import step (insert, validate, transform, write) becomes a separate database operation. Especially with frequent updates (e.g., daily or hourly), this can lead to high database load and performance issues. In such cases, direct import with smart pre-processing might be more efficient.


Direct and Staged Imports Can Coexist

In a typical Novulo implementation, you’ll often use both approaches side-by-side. Some data (e.g., Contacts, Products) may use direct import, while others (e.g., complex Projects, hybrid integrations, or record-fragmented imports) may require staging tables. The strategy depends on the data’s complexity, quality, and the degree of control required.


Summary: Choose Direct Import unless…

Criteria Direct Import Temporary Staging
Data is clean and structured :white_check_mark: :white_check_mark:
Structure closely matches Novulo :white_check_mark: :warning:
Data volume is high and frequent :white_check_mark: :warning:
Needs business logic / complex joins :warning: :white_check_mark:
High traceability needed :warning: :white_check_mark:
Complex master data lookup :warning: :white_check_mark:
Record fragmentation in source :cross_mark: :white_check_mark:
Requires preview/editing before import :cross_mark: :white_check_mark:

Let us know your thoughts or share your own use cases!

Next up in the series:

  • Post 4 – Data Characteristics that Influence Import Strategy
  • Post 5 – Handling Record Fragmentation and Structure Mismatches
  • Post 6 – Advanced Import Tuning: Scheduling, Error Recovery, and Traceability
1 Like