Choosing Between Direct Import and Staging with Temporary Tables
Import and Migration Strategies – Post 2
Read Post 1: Understanding Migration, Interfacing, and Hybrid Integration in Novulo
Introduction
When importing or migrating data into Novulo, you essentially have two technical strategies to choose from:
- Direct Import – importing data straight into the Novulo tables (e.g.
Products
,Contacts
) - Staging with Temporary Tables – first importing raw data into temporary structures (e.g.
ImportProducts
), then transforming and processing them into the Novulo application tables
This post provides guidance for making that choice. The key message:
Use Direct Import unless there are good reasons not to.
The Novulo Import Engine is highly capable and supports many transformation patterns out-of-the-box. But in some scenarios, staging is the more robust or even required path.
Direct Import – Fast, Flexible and Powerful
Direct Import in Novulo is often the fastest way to get data in. It uses configuration-based mappings that are:
- Quick to set up
- Easy to adjust
- Deployed without code changes
Advanced features supported by the Import Engine
Novulo’s import engine isn’t a basic field-to-field mapper. It supports:
- Normalization & denormalization, by mapping multiple input fields into multiple related tables
- Derived foreign keys, e.g., combining two fields into a reference
- Data validation, with automatic error handling on required fields, uniqueness, formats, etc.
- Composite keys, e.g. using
(CustomerNumber + DivisionCode)
as a key - Fallback expressions, for defaulting or computing values on-the-fly
This makes Direct Import very suitable for most one-off migrations and many interfaces — especially when the source data structure closely matches the Novulo target model.
When to Use Temporary Staging Tables
There are certain scenarios where staging tables are preferable — or even necessary.
What is a Temporary Staging Table?
A staging table is typically a setup consisting of two database tables:
- Run Table – captures the metadata of an import run (e.g., timestamp, source, status).
- Records Table – stores the raw, unvalidated data associated with each run.
Data is inserted into the staging table in raw format — meaning that there’s typically no validation or interpretation applied during import. After the data is staged, Novulo logic processes it and moves it into the structured application tables, applying validations, mappings, and reference lookups as needed.
Here’s a simplified visualization of the structure:
Why Use Staging Tables?
We advise: use Direct Import by default, unless there are good reasons not to. Below are common reasons why staging tables are preferred:
1. Preview and Correction Needs
When you want to preview the data before it’s applied to the Novulo model — or correct it either in the staging table or using Novulo master data — a staging table gives you control and visibility. You can re-process the same data after fixing issues, without resubmitting the original file.
2. Handling Record Fragmentation
In some source systems, data is fragmented across records. For example:
- A night shift might be split into two records:
22:00–00:00
and00:00–06:00
, but Novulo expects a single record:22:00–06:00
.
This requires logic to detect and merge fragments — something that is more manageable in staging logic than in direct import mappings.
3. Complex Structure Mismatches
In some cases, a single logical object is represented in an unusual structure. For example:
- A “shipment” might be defined across a header, multiple detail tables, and status records that need to be combined into one Novulo object.
Such reconstruction often benefits from temporary staging logic to combine and validate the intermediate records.
4. Preview + Manual Correction (Again)
Staging tables allow you to pause, inspect, and validate the data before applying it. This is particularly useful when working with legacy or unreliable source systems.
5. Error Handling and Traceability
With staging tables, errors can be logged and investigated without rejecting the entire import. You can retry only the failed records or adjust logic based on observed patterns.
A Note on Performance and Scale
For large datasets (50,000+ records), using staging tables may become too heavy. Every import step (insert, validate, transform, write) becomes a separate database operation. Especially with frequent updates (e.g., daily or hourly), this can lead to high database load and performance issues. In such cases, direct import with smart pre-processing might be more efficient.
Direct and Staged Imports Can Coexist
In a typical Novulo implementation, you’ll often use both approaches side-by-side. Some data (e.g., Contacts, Products) may use direct import, while others (e.g., complex Projects, hybrid integrations, or record-fragmented imports) may require staging tables. The strategy depends on the data’s complexity, quality, and the degree of control required.
Summary: Choose Direct Import unless…
Criteria | Direct Import | Temporary Staging |
---|---|---|
Data is clean and structured | ![]() |
![]() |
Structure closely matches Novulo | ![]() |
![]() |
Data volume is high and frequent | ![]() |
![]() |
Needs business logic / complex joins | ![]() |
![]() |
High traceability needed | ![]() |
![]() |
Complex master data lookup | ![]() |
![]() |
Record fragmentation in source | ![]() |
![]() |
Requires preview/editing before import | ![]() |
![]() |
Let us know your thoughts or share your own use cases!
Next up in the series:
- Post 4 – Data Characteristics that Influence Import Strategy
- Post 5 – Handling Record Fragmentation and Structure Mismatches
- Post 6 – Advanced Import Tuning: Scheduling, Error Recovery, and Traceability