Data collections

Introduction

This article introduces Data Collections, a feature that allows technical users to query arbitrary data from the system’s database and extract the results for further use. Rather than focusing on predefined outputs, Data Collections is built around flexible querying and reuse, making it suitable for analysis, investigation, supervision and integration scenarios.

Prerequisites:

M5159 - Novulo Dataverzamelingen

M7606 - Novulo Applicatielogs bij dataverzameling obv expressie (optional, but useful for saving application logs based on Data Collections and pairing them with mail subscriptions)

M6704 - Novulo Mailen van applicatielogs (optional, but useful for having an email sent to a specific address when certain logs are created for a Data Collection)

To create a data collection

Go to Management → Data collections and press the ‘+’ button:

  • Insert a “Description”
  • Select a “Type”:
    • A Snapshot Data Collection captures a point-in-time view of the selected data, preserving the results exactly as they existed at the defined moment or interval.
    • A Period Data Collection evaluates data over a calculated time window. The window is defined by period_from and period_to, derived from the selected period (for example, daily, weekly or monthly) and its starting point.
  • Select a “Period”:
    • Manual means that the data are calculated when the data collection is run.
    • Daily, Weekly, … mean that you can filter your data by the corresponding time frame.

The data definition section is where the magic happens:

  • Cardinality specifies whether a Data Collection produces a single measurement or multiple measurements.
    • If set to Single, you enter one expression that retrieves a single measurement.
    • With Multiple cardinality, the Data Collection operates on a record set. The configured expressions are evaluated individually for each selected record, producing multiple results.
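
The difference between the two cardinalities can be sketched in Python as follows. This is only a conceptual model: the record structure, the function names and the use of Python callables in place of Novulo expressions are assumptions for the example.

```python
from typing import Any, Callable

Record = dict[str, Any]


def evaluate_single(expression: Callable[[], Any]) -> Any:
    # Single cardinality: one expression, one measurement.
    return expression()


def evaluate_multiple(
    records: list[Record],
    expressions: dict[str, Callable[[Record], Any]],
) -> list[dict[str, Any]]:
    # Multiple cardinality: every expression is evaluated once per record,
    # so the output holds one group of results per record.
    return [
        {name: expr(record) for name, expr in expressions.items()}
        for record in records
    ]


# Hypothetical sample data standing in for a record set.
orders = [
    {"id": 1, "amount": 120.0, "open": True},
    {"id": 2, "amount": 80.0, "open": False},
]

if __name__ == "__main__":
    print(evaluate_single(lambda: sum(1 for o in orders if o["open"])))
    print(evaluate_multiple(orders, {"id": lambda o: o["id"],
                                     "amount": lambda o: o["amount"]}))
```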

You can run data collections in two ways: with a scheduled task or manually through the UI.

Running Data Collections - Manual

Generate creates the Data Collection sets according to the configured start date and interval. When a past start date is selected (for example, last year with a daily interval), multiple sets are created, one per interval, up to the current date. Keep performance in mind in this case. The expressions are not evaluated during this step.
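
To get a feel for how many sets a past start date produces, the generation step can be approximated with a short Python sketch. The function name is hypothetical, and whether the system also creates a set for the interval that is still in progress is an assumption here.

```python
from datetime import date, timedelta


def generate_set_starts(start: date, today: date, interval: timedelta) -> list[date]:
    """List the start date of each set from start up to and including today.

    Only the sets are created; no expression is evaluated, which mirrors
    the Generate action described above.
    """
    starts = []
    current = start
    while current <= today:
        starts.append(current)
        current += interval
    return starts


if __name__ == "__main__":
    daily = generate_set_starts(date(2024, 1, 1), date(2024, 12, 31), timedelta(days=1))
    weekly = generate_set_starts(date(2024, 1, 1), date(2024, 12, 31), timedelta(weeks=1))
    print(len(daily), len(weekly))
```

A daily interval over a full year yields several hundred sets, each of which has its expressions evaluated when it is processed, so a heavy query multiplies quickly.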

Generate and process performs the same generation step and immediately executes the expressions for all generated sets.

Running Data Collections - Scheduled task

If you are not familiar with scheduled tasks, you can read How to add a new scheduled task.

After a Data Collection is created, automated processing can be configured through the scheduler by using the Process generate_sets_for_collectioncontrollercomponent_xxxxx task. This allows Data Collections to be generated and processed at a predefined interval.

In practice, these tasks are commonly scheduled during low-usage periods, such as after business hours (especially for heavy queries), but depending on your business case you could also schedule them to run every 10 minutes.

Reading the result

For a simple, manual snapshot data collection that runs every 15 minutes, the result looks something like this, with the value shown on the same line as the data collection set:

With Multiple cardinality, the values are nested per record, as configured earlier:

Data collection for monitoring

To use Data Collections for monitoring and notifications, components M7606 and M6704 are required. They allow log rules to be defined for Data Collections and enable notification via application logs and mail subscriptions.

When defining a Log rule, we can provide a level of criticality for that rule:

The outcome is recorded in the application log. By creating a mail subscription on “Data collection logging”, you can receive emails whenever the result of a data collection is noteworthy.

To learn more about subscribing to a mail service, read Subscribe for e mails on certain application logs.
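
Conceptually, a log rule pairs a condition on the data collection result with a criticality level. The Python sketch below models that idea only. The class, the level names "warning" and "error" and the thresholds are hypothetical and do not reflect the actual configuration options of the component.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LogRule:
    description: str
    level: str  # hypothetical level name, for illustration only
    applies: Callable[[float], bool]  # condition on the collected result


def evaluate_rules(result: float, rules: list[LogRule]) -> list[tuple[str, str]]:
    # Return one (level, description) log entry per rule whose condition holds.
    return [(rule.level, rule.description) for rule in rules if rule.applies(result)]


rules = [
    LogRule("More than 100 open orders", "warning", lambda v: v > 100),
    LogRule("More than 500 open orders", "error", lambda v: v > 500),
]

if __name__ == "__main__":
    for level, description in evaluate_rules(150, rules):
        print(level, description)
```

Each entry produced this way corresponds to an application log line that a mail subscription could then pick up.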

Hi Marco,

We tested this and all seems to work fine.

We have set this up in the task scheduler and would like the result by email so we created a mail subscription.

However, we still need to manually press the button below to receive the e-mail.

The process behind this button (Application log - send to subscribers) is produced.

Of course, we can create another task to call this process 5 minutes after the data collection task, but we would prefer to have the e-mail sent right after the data collection task.

Do we need to create a component with a process that runs these two processes and put that in the task scheduler, or is there a better way?