[UPDATED] Docebo Data Lake 2.5: Automated data refresh update

  • 20 September 2022
  • 51 replies
  • 4165 views

Userlevel 4

Update as of Dec 20, 2022: 

As you may already know, we slowed down the rollout process of the new data refresh in order to add a few improvements to the update and make it more robust for our customers.

 

We have followed the plan for all regions except the US, where the update is not yet complete for a small portion of platforms. We have concluded the implementation of the data lake there and expect to activate the update on the missing platforms by mid-January 2023.

 

We recognize that, over the last several months, you have been experiencing delays in updating and extracting data for reporting purposes. Behind the scenes we have been working on a solution that involves building a new “data lake” (the database where the LMS data is stored for reporting purposes) to remedy the issues. Now, after the final testing stage, we are ready to roll it out to all of our customers' instances, but before we do, we want to ensure you have everything you need to know.

 

FAQ: Your Data Lake Questions Asked and Answered

In anticipation of this release, we thought we would start an FAQ with some of the questions we think you will find helpful. Please use the comments to ask any questions and we will be sure to get the answers! 

 

  1. Let’s level set, what is a ‘data lake’?
    A. Fair question; let's start at the beginning. A data lake is a central repository where all structured and unstructured data is stored. The Docebo Data Lake is the database where all LMS data is stored and used specifically for reporting purposes.

 

  1. How is this update going to improve the exporting of reports?
    A. Many customers have been experiencing delays in data updates. The Data Lake has been designed to improve response time, with increases in nightly data refreshes and a decrease in refresh errors. In short, you will not experience the same delays in updating and accessing your data.

 

Pro tip: By scheduling your reports, you can avoid the wait altogether, as the data will refresh before the report is sent.

 

  1. So, does this mean I am getting realtime reporting?
    A. Well, not real time, but much closer and much faster. Your data will be refreshed when triggered by the export of a report. If the data is no longer valid (i.e., it has been longer than 4 hours since the last refresh), a new refresh will occur. This refresh will take approximately 10 minutes to complete before you can export the report. Data refreshes can happen up to 12 times a day.
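
The rule described in this answer can be sketched in a few lines of Python. This is only an illustration of the 4-hour validity window and the roughly 10-minute refresh time stated above; the function and constant names are hypothetical and are not part of any Docebo API.

```python
from datetime import datetime, timedelta

# Values taken from the FAQ answer; the names themselves are hypothetical.
VALIDITY_WINDOW = timedelta(hours=4)      # data older than this triggers a refresh
REFRESH_DURATION = timedelta(minutes=10)  # approximate time a refresh takes


def needs_refresh(last_refresh: datetime, export_requested: datetime) -> bool:
    """Return True if the export request should trigger a new data refresh."""
    return export_requested - last_refresh > VALIDITY_WINDOW


def earliest_export_time(last_refresh: datetime, export_requested: datetime) -> datetime:
    """Estimate when the report can actually be exported."""
    if needs_refresh(last_refresh, export_requested):
        # The export has to wait for the refresh to finish.
        return export_requested + REFRESH_DURATION
    return export_requested


# Hypothetical example: the last refresh happened at 09:00.
last = datetime(2022, 10, 10, 9, 0)
print(needs_refresh(last, datetime(2022, 10, 10, 11, 0)))         # False: data is 2 hours old
print(earliest_export_time(last, datetime(2022, 10, 10, 14, 0)))  # 2022-10-10 14:10:00
```

An export requested 5 hours after the last refresh therefore waits about 10 minutes, while one requested 2 hours after it runs immediately on the existing data.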

 

  1. How can I take this update to the next level?
    A. Glad you asked! Here’s that pro tip again: schedule your reports to export at a specific time and day to avoid any delays and to get the most up-to-date view of your data.

 

  1. When can I expect my platform will be running at full steam?
    A. We will be rolling out this update gradually, starting with sandbox instances this week. All other updates will be by region, as follows:
  • APAC (ap-southeast-2) - The week of October 10
  • Canada (ca-central-1)  - The week of October 10
  • Europe (eu-west-2) - The week of October 17
  • Europe (eu-central-1) - The week of October 17
  • United States (us-east-1) - October 10 - Nov 18 (as per the above update, a small portion of these platforms are not yet updated; the update will be activated on them by mid-January 2023)
  • Europe (eu-west-1) - October 10 - Nov 18

 

Not sure where you fall? You can find the region you are hosted in under Advanced Settings > Platform Information.

 

  1. Will it interrupt other activities?
    A. Absolutely not! The automatic data refresh in the data lake or the platform will have no impact on the activities of the learners or the collection of data based on system activities. But you will have to wait until the update is complete (approximately 10 minutes) to preview your reports or export any others. 

 

  1. What will this look like on my platform?
    A. Here’s a quick look at what you’ll see with the new update.

 

In the two images below, you will see the manual refresh icon removed.
Before the update:
[Image: report view with the manual refresh icon]

 

After the update:

[Image: report view without the manual refresh icon]

 

The two images below represent before and after screenshots of your “New Reports” page with a new section showing the status of the data refresh. 

Before the update: 

[Image: “New Reports” page without the data refresh status section]

 

After the update:

[Image: “New Reports” page with the new data refresh status section]

 

  1. Where can I learn more about the update?
    A. New knowledge base alert! The article Creating & Managing Custom Reports explains how to work with custom reports and, in the context of this update, gives a detailed account of how the refresh takes place, illustrates scenarios on the validity of the data, and provides instructions on how to schedule custom reports. You can also find more details in the October 2022 Release Course in Docebo University.

 

 

If you have any more questions about the update, don’t hesitate to enter them below and we’d be happy to follow up!


51 replies

Hi there, I’m not sure if this new update is working correctly for me.

I updated some of our user data around 9:42am, and exported a report at 9:58am, and none of the data had been updated. The last time I ran the report was yesterday at 2pm.

There is a notice that says my data is up to date as of 9am. Is this what the “the refresh does not take real time data, but the more fresh data available, that usually are between 1 and 2 hours old” condition refers to? Does this mean that if we make updates to information, we can’t expect to see them in reporting for another 4 hours?

Userlevel 4
Badge

Hi @ayesha.ahmad 

The behaviour you report is the correct behaviour of the new data refresh.

When a data refresh is triggered by an export request, the system takes the freshest data available, which is not real time but, as mentioned, is usually 1 or 2 hours old.

That means that, to find the updates in a report:

  • you have to run the report at least 2 hours after making the updates
  • that report must trigger the data refresh
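
To make the two conditions concrete, here is a rough Python sketch that applies them to the times from the question above. It assumes the 4-hour validity window, takes the worst case of the "1 or 2 hours old" range, and ignores the roughly 10 minutes a refresh takes. The function name, the constants, and the calendar dates are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta

# Assumptions for this sketch; the names are hypothetical.
VALIDITY_WINDOW = timedelta(hours=4)  # a refresh is triggered only if data is older than this
SNAPSHOT_LAG = timedelta(hours=2)     # worst case of the "1 or 2 hours old" range


def change_visible(changed_at: datetime, last_refresh_at: datetime, exported_at: datetime) -> bool:
    """Conservatively estimate whether a change shows up in an exported report."""
    triggers_refresh = exported_at - last_refresh_at > VALIDITY_WINDOW
    # The report uses the data from the most recent refresh, which may be this export.
    refresh_time = exported_at if triggers_refresh else last_refresh_at
    data_as_of = refresh_time - SNAPSHOT_LAG
    return changed_at <= data_as_of


change = datetime(2022, 12, 14, 9, 42)        # user data updated at 9:42am
yesterday_2pm = datetime(2022, 12, 13, 14, 0)  # previous report run
first_export = datetime(2022, 12, 14, 9, 58)   # export at 9:58am

print(change_visible(change, yesterday_2pm, first_export))  # False
# The 9:58am export triggered a refresh, so it becomes the last refresh.
print(change_visible(change, first_export, datetime(2022, 12, 14, 12, 0)))  # False
print(change_visible(change, first_export, datetime(2022, 12, 14, 14, 0)))  # True
```

Under these assumptions, the 9:42am change is missing from the 9:58am export, is still missing at noon because no new refresh is triggered, and appears once an export more than 4 hours after the previous refresh triggers a new one.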

 

Userlevel 7
Badge +6

@nicolo malinverno - thank you for all of the detail - how does this impact My Team reports?

 

Userlevel 7
Badge +6

We have been experiencing intermittent failures of the scheduled reports more and more. Any advice along these lines?

Userlevel 7
Badge

@dklinger I’ll follow up with @nicolo malinverno first thing on Monday to see if he has any insight. Thanks for your patience!

Can anyone run reports today?  We have created several tickets and have reached out via chat.  The response has been to wait until 12/21 to run reports.  This isn’t a solution for us.  Have any of you found any workarounds to get data out to your stakeholders?

Userlevel 1

 

I can’t run reports today either. 

Userlevel 1

I cannot run any, either.

Userlevel 2
Badge

Haven’t been able to get our scheduled reports since Friday. We opened a ticket just now and then I saw this thread after looking for known issues.

 

Userlevel 4

Is this update complete? We have yet to see changes to our system (regarding the New Reports interface) yet our system (according to info above) was supposed to be done in November.

Any updates?

Thank you

Edit to share with others, I confirmed we can still run reports but it does not yet appear we are on the new solution.

Userlevel 7
Badge

@dwilburn we recently published an update to this post indicating that any missing platforms will be activated by mid-January.

Userlevel 7
Badge +6

I feel like I just stumbled into something.

Have we lost the capability for people to receive reports without logging in with this rollout?

A few moments later... please don’t mind me - I found what I was looking for.

Userlevel 4
Badge +1

Hello @nicolo malinverno,

I’m confused about the report data refresh.

You said in the posts above:

1 - “Data refreshes can happen up to 12 times a day.”

2 - “The data refresh is run automatically when you export a report (and the data are older than 4 hours)”

3 - “the refresh does not take real time data, but the more fresh data available, that usually are between 1 and 2 hours old.”

So, if the data refresh runs at 10 a.m. (item 1),

and I generate a report at 11 a.m. and the data is automatically refreshed (item 2),

the report will be generated with the 10 a.m. update. No change made between 10 a.m. and 11 a.m. will be reflected in the report.

Is my understanding correct?

Is it possible to reduce the report data refresh interval from 4 hours to 2 hours?

During the business day, we only have two opportunities to run a report data refresh. We need more.

Thank you.

Userlevel 3

Hi @erin.brisson  @nicolo malinverno 
I wanted to get more clarity on the 3 main rules you posted. I am a little confused about points 2 and 3. They read as if they contradict each other. Can you please explain them with an example and timings?

Point 3 refers to data between 1-2 hours old but point 2 says it must be older than 4 hours? It’s a little confusing.


Keep in mind 3 main rules:

  1. a report export request is needed to update the data
  2. the data is updated only when it is older than 4 hours
  3. the refresh does not take real time data, but the more fresh data available, that usually are between 1 and 2 hours old.
     

Thank you 

Userlevel 5
Badge

@nicolo malinverno Can you confirm that this was rolled out to all platforms in January 2023? I still see a note that US regions were delayed. Perhaps an update to the update, if all regions are now aligned? Thanks!

Userlevel 4
Badge +1

This process is very confusing.

Today I discovered through the Help Desk that there is a setting that controls how often, in hours, the reports are updated: Advanced Settings > Advanced Section. But I didn't find this option on my platform. See below.

 

 

Userlevel 3

I don’t see this option in my platform either!

Userlevel 4
Badge +1

I opened a ticket about this. As soon as I have an answer, I'll post it here.

Userlevel 5
Badge

I think that is the functionality they took away: the automatic refresh. It’s supposed to refresh data in the data lake now if the data is more than 4 hours old, but the refresh is apparently pulling data that could be up to 2 hours old. This “upgrade” doesn’t seem to be an upgrade. All the documentation says “real time” update, but then contradicts that with this caveat about the data being up to 2 hours old when the refresh is pushed to the data lake. It’s extremely confusing.

Userlevel 4
Badge +1

Hi @hwolfehall, I agree with you, it is very confusing. My understanding was that there are two processes: one is the data lake update and the other is the report data update.
Because of this, if a learner completes a course and we then update the report data, that completion is not reflected in the report.

Let's see if we can get an answer on how the process works.

Userlevel 4
Badge +1

Hello,

The Docebo Help Desk set my platform’s data refresh interval to 2 hours. I still do not have access to this setting, but it works.

I generated a report at 14h, and the data will be updated again at 16:10h. See below:
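
One possible reading of these times, which is an assumption and not something confirmed in this thread, is that the next update falls one 2-hour interval plus the roughly 10-minute refresh time after the report was generated. A minimal Python sketch of that arithmetic, with a hypothetical function name and a hypothetical calendar date:

```python
from datetime import datetime, timedelta


def next_refresh_time(
    report_generated: datetime,
    interval: timedelta,
    refresh_duration: timedelta = timedelta(minutes=10),  # approximate, per the FAQ
) -> datetime:
    """Assumed model: next update = generation time + interval + refresh duration."""
    return report_generated + interval + refresh_duration


generated = datetime(2023, 1, 1, 14, 0)  # the date is hypothetical; only 14:00 comes from the post
print(next_refresh_time(generated, timedelta(hours=2)).strftime("%H:%M"))  # 16:10
```

With the default 4-hour interval, the same calculation would give 18:10.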

 

 

 

 

 

Userlevel 5
Badge

Thank you!  I have asked for an interval change via Support Ticket.  Fingers crossed.  :)  

Userlevel 4
Badge

Hi @msantos

Your description

“The report will be generated with the 10 a.m. update. No change made between 10 a.m. and 11 a.m. will be reflected in the report.”

is correct.

We know that the current data refresh procedure is complex but, as you have seen, under particular conditions we can achieve the refresh within 2 hours. Please think of this as part of our commitment to improving the data refresh frequency, and as a step towards an even more frequent (and less complex) process.

For a detailed description of the behaviour, please check our knowledge base article and let me know if that description is clear enough.

 

 

 

Userlevel 4
Badge

Hi @hwolfehall

I confirm that the feature is in general availability, and so is available to everybody, unless a specific agreement states otherwise.

If you are having trouble, you can reach out to our support team.

Userlevel 4
Badge +1

Hi @nicolo malinverno , thank you for your explanation.

The article says “A snapshot of the live-platform data is updated in the Docebo Datalake on average every four hours”

 

And the figure below says “every 1-2 hours”

 

Which information is correct?

 

 

 

 

 

 

 

 
