Skip to main content

Pulling data from ALL surveys using the API

  • 15 October 2021
  • 34 replies
  • 2399 views

As you all know, fetching information about responses to surveys is a “bit” painful task, as there is no report or single UI allowing you to get it for multiple courses and/or surveys.

 

In our system we have course evaluation surveys added to each course (around 250 courses and counting), most of the courses use the same survey added from the central repository, however, some have their own surveys.

Of course tracking user responses, by navigating to each course and downloading the file is time consuming and is not a feasible solution, as we want to analyze feedback weekly and quickly react to any issues and comments.

 

Below I’d like to share with you my approach to solving that problem by using Docebo API (I can’t share the full code of the Python script that I wrote for that, but given this outline any skilled developer should be able to create a similar tool).

 

My goal was to create a solution, which would automatically pull data from ALL surveys across ALL courses and save the responses into CSV files.

 

1 Get the list of all courses on the platform

First fetch all courses using the API

GET /course/v1/courses

 

2 Iterate through all courses

Then, in a loop iterate through all courses, skipping those that are not published (coursePublished == False) or deleted (courseDeleted == True)

 

2.1 (optionally) get course star rating value

You may want to get and store the star rating for each course for that use the API

GET /learn/v1/courses/{courseId}/rating

 

2.2 Retrieve the list of learning objects

To identify surveys, you need to fetch the list of all learning objects in a course

GET /learn/v1/courses/{courseId}/los

 

2.3 Iterate through all learning objects

In another loop, go through all LOs, skipping those which are NOT polls ('object_type' != "poll")

 

2.3.1 Get the poll ID

From learning objects which are polls, get the poll identifier from the id_resource field

pollId = lo<‘id_resource’]

 

2.3.2 Fetch all answers for a poll

To get all answers use the API

GET /learn/v1/survey/{pollId}/answer?id_course={courseId}

in Python, you can now convert the received JSON to a flat table using Pandas DataFrame

df = pd.DataFrame(pd.json_normalize(courseSurveys))

Optionally, add a few columns to make the CSV file more user friendly and make it easier if you’d like to aggregate responses from several courses.

df 'COURSE-code'] = course["code"]

dfs'COURSE-id'] = courseId

df='COURSE-category_name'] = coursec'category_name']

dfo'COURSE-type'] = course['type']

dfs'COURSE-title'] = course'"title"]

and then save it to a CSV file:

df.to_csv(f'surveysFolder/course{courseId}-poll{pollId}.csv', index=False)

 

2.3.3 Repeat for each poll

If there are multiple surveys in a course, they all will be saved to separate files

 

2.4 Repeat for each course

In that way, you will end up with the surveysFolder containing separate files for each survey in each course :-)

 

NOTE, if you have your survey in the Central Repository, the poll ID (id_resource) value will be the same across all courses that are using that survey, so you will be able to easily aggregate responses from all courses which are using that survey.

 

I hope this is helpful.

Great work!!!  It’s just too bad this isn’t a built-in report since this chews up quite a few operations with a tool like Integromat.


This is very nice, agree with elamast, how are you mitigating the api usage s this seems to potentially burn more than the recommended hourly use in its own?


What frustrates me the most about this is the fact that the surveys are just data in a large database and we (Docebo team, really) cannot write a query to spit out the information we need. Or am I looking at this too simplistically?

  


This is very nice, agree with elamast, how are you mitigating the api usage s this seems to potentially burn more than the recommended hourly use in its own?

It really depends on how many courses you have.  However, I should probably point out the “old fashioned” way that’s good for a limited number of courses:

  1. Navigate through Course Management to the course.
  2. Go to the Reports tab
  3. Click on Training material statistics
  4. Click on the survey/poll link (left side)
  5. In the pop-up window, export the results.

What frustrates me the most about this is the fact that the surveys are just data in a large database and we (Docebo team, really) cannot write a query to spit out the information we need. Or am I looking at this too simplistically?

  

There is something listed in the API-browser under the “report” service that indicates there’s a Query Builder tool which appears to use SQL. However, I don’t know if it’s available to customers, and if so, what Docebo charges for it.  It seems like it would be very helpful for those of us who do a lot of “out of the box” work with the platform.


What frustrates me the most about this is the fact that the surveys are just data in a large database and we (Docebo team, really) cannot write a query to spit out the information we need. Or am I looking at this too simplistically?

  

There is something listed in the API-browser under the “report” service that indicates there’s a Query Builder tool which appears to use SQL. However, I don’t know if it’s available to customers, and if so, what Docebo charges for it.  It seems like it would be very helpful for those of us who do a lot of “out of the box” work with the platform.

Yeah unfortunately they don’t seem to make it available to the public, its mostly there since the company utilizes the api themselves. 

I agree on the surface this should be a fairly easy report type setup, but as with most things, without seeing the underlying architecture there could be all sorts of ‘gotchas’ we simply have no way of knowing about which make this not quite as straight forward an ask.


This is very nice, agree with elamast, how are you mitigating the api usage s this seems to potentially burn more than the recommended hourly use in its own?

It really depends on how many courses you have.  However, I should probably point out the “old fashioned” way that’s good for a limited number of courses:

  1. Navigate through Course Management to the course.
  2. Go to the Reports tab
  3. Click on Training material statistics
  4. Click on the survey/poll link (left side)
  5. In the pop-up window, export the results.

That’s only half of it, because the downloaded data does not contain the name of the course, unless it was asked in the form (but then users often get the wrong one). When compiling 300+ Excel file into one, we don’t know what course is getting which feedback. 


This is very nice, agree with elamast, how are you mitigating the api usage s this seems to potentially burn more than the recommended hourly use in its own?

It really depends on how many courses you have.  However, I should probably point out the “old fashioned” way that’s good for a limited number of courses:

  1. Navigate through Course Management to the course.
  2. Go to the Reports tab
  3. Click on Training material statistics
  4. Click on the survey/poll link (left side)
  5. In the pop-up window, export the results.

That’s only half of it, because the downloaded data does not contain the name of the course, unless it was asked in the form (but then users often get the wrong one). When compiling 300+ Excel file into one, we don’t know what course is getting which feedback. 

is the course name or something similar in the file name at least? That is terrible otherwise. If it was in the file name, would totally find someone who is comfortable with quick windows/mac scripts and have them just look at a folder of excel files, add a column based on file name, then copy all content into one master file and save with a  new name. Dirty, but effective, at least until this report is brought online in a better way.


is the course name or something similar in the file name at least? That is terrible otherwise. If it was in the file name, would totally find someone who is comfortable with quick windows/mac scripts and have them just look at a folder of excel files, add a column based on file name, then copy all content into one master file and save with a  new name. Dirty, but effective, at least until this report is brought online in a better way.

Unfortunately the files seem to have date/time-based names rather than course-based names.

There are RPA/web scraper type tools that would allow you to automate the download and renaming of the surveys.  The direct link to each follows this pattern:  https://YOURDOMAIN.com/lms/index.php?r=player/report/pollResult&objectId=32&organizationId=1234&course_id=54   A free tool is BrowserAutomationStudio.  Others are UI.Vision, UBot Studio, UiPath…   I personally use BrowserAutomationStudio for the painful task of creating course surveys (takes some time to set up, but was worth it for the number of these I needed to create).


Yeah, the trouble is a lot of those screen scrapping automation tools are a ‘no go’ in corporate environments for security reasons. But totally useful and then could have it re-name things for you and go through merging. I also get nervous on those kinds of setups for anything vital, as a UI change or system change throughs it all off, far more unstable than proper programmatically dealing, another reason surveys within the platform seem almost unusable.


Yeah, the trouble is a lot of those screen scrapping automation tools are a ‘no go’ in corporate environments for security reasons. But totally useful and then could have it re-name things for you and go through merging. I also get nervous on those kinds of setups for anything vital, as a UI change or system change throughs it all off, far more unstable than proper programmatically dealing, another reason surveys within the platform seem almost unusable.

Your point is well-taken about environments that restrict the type of software you can use, and the instability that can come up with UI changes that break a bot/scraper.  End user workarounds are nearly always the result of failing to provide a well-engineered solution for needed functionality.

The “right” solution here depends a bit on skill levels: 1) For everyone there should be much improved reporting capabilities where you can build your own report through a drag/drop function, 2) For skilled users there should be additional API endpoints that will pull the data, and 3) For expert users there should be an ability to do direct SQL queries (read-only would be fine).


Integromat workflow for survey’s course evaluations

I thought I’d share a way I found to get the surveys/course evals programmatically with Integromat.  Other automation tools will work as well with some tweaking.

The above workflow starts with a mailhook that is generated from a PowerApp that has a bunch of admin tools I’ve created to help manage Docebo.

The mailhook is checked to make sure the mailhook came from the right place.

An API call is made to gather all the courses and then iterate through each. (This one assumes there is only one page of results, but it can be modified for more pages.)

Another API call is made to gather all the Learning Objects.  These are iterated through and filtered to only send the Polls to the next step.

I cheated a bit on the next step and found a way to get the UI to help out. Make an OAuth 2.0 GET call to https://www.YOURSITEURL.com/lms/index.php?r=player/report/pollResultExport pasing in the query parameters course_id, organizationId, objectId, and the exportType (csv in my case).  The values of the query parameters are taken from the previous iteration step.

An HTTP “get a file” call is made to the same URL.  The auth from the previous step is used here.  For some reason in Integromat you can’t just make an OAuth file download directly, so we have this odd two-step.

Replace the previous file name with the course code and date/time stamp.

Upload each course’s survey results file to a SharePoint repository.


Nice @elamast ! Similar to the original posters process. My one massive concern with this route is always volume since the API is not that robust and some of these loops with a  lot of courses can hit the 1k calls an hour limit quickly, its such a low limit!


If I remember correctly the API limit is per IP address per hour, so one way of working through this would be to introduce a delay into the workflow.  You could couple that strategy with round-robin’ing requests through a pool of IP addresses (built into some automation tools, but not Integromat) if you have a really large number of LOs to drive up the cycle count.

In my environment the above scenario takes 5 minutes to run and uses 727 operations (every step, and every iteration is an operation).  I currently have 244 courses in my catalog, so for me this is roughly 3 operations per course.  Your mileage may vary.


It is per IP address per hour, so using 75% from one ip for that hour for one process is very limiting, and does not scale, since as soon as you hit roughly 350 courses you literally shouldn’t run it within one hour, and theres only so many hours in the day, and a lot of courses out there, was helping a group last week with thousands.

Running a couple hundred processes that leverage the API to some extentant for various purposes over here so API mitigation techniques are also at front of mind when designing and deciding on pathways constantly.

1,000 per hour is very low, for instance have several products that let me do 10,000 per minute. It’s a very real thing to keep in mind.

 


It is per IP address per hour, so using 75% from one ip for that hour for one process is very limiting, and does not scale, since as soon as you hit roughly 350 courses you literally shouldn’t run it within one hour, and theres only so many hours in the day, and a lot of courses out there, was helping a group last week with thousands.

Running a couple hundred processes that leverage the API to some extentant for various purposes over here so API mitigation techniques are also at front of mind when designing and deciding on pathways constantly.

1,000 per hour is very low, for instance have several products that let me do 10,000 per minute. It’s a very real thing to keep in mind.

 

I absolutely agree that 1000 per hour is too low.  I should be specific that my workflow had that many operations in Integromat, but the number of API calls was roughly half of that.

Here’s an idea submitted for raising the limit: 

 


Sweet, upvoting!


You may also want to upvote this idea to allow customer access to the Query Builder tool: 

 


This is a giant thread and since I’m not a developer (and have a bunch of role-related work to do) I’m really wishing Docebo had SOMEONE who could make this a functional thing. Until then, I guess I can’t get this LMS to report on a standard engagement survey across our curriculum and thus can’t convey a 1:1 ROI value to leadership. :(


This is a giant thread and since I’m not a developer (and have a bunch of role-related work to do) I’m really wishing Docebo had SOMEONE who could make this a functional thing. Until then, I guess I can’t get this LMS to report on a standard engagement survey across our curriculum and thus can’t convey a 1:1 ROI value to leadership. :(

It really is just not usable in current form.


You all (y’all out here in Texas) may find this interesting regarding the rate limit:  

 


You all (y’all out here in Texas) may find this interesting regarding the rate limit:  

 

Yup, they always say its not a hard limit, but since they can’t guarantee things, when designing production processes need to play nicely with the limit. Still exceptionally low for a cloud based api limit.


As you all know, fetching information about responses to surveys is a “bit” painful task, as there is no report or single UI allowing you to get it for multiple courses and/or surveys.

 

In our system we have course evaluation surveys added to each course (around 250 courses and counting), most of the courses use the same survey added from the central repository, however, some have their own surveys.

Of course tracking user responses, by navigating to each course and downloading the file is time consuming and is not a feasible solution, as we want to analyze feedback weekly and quickly react to any issues and comments.

 

Below I’d like to share with you my approach to solving that problem by using Docebo API (I can’t share the full code of the Python script that I wrote for that, but given this outline any skilled developer should be able to create a similar tool).

 

My goal was to create a solution, which would automatically pull data from ALL surveys across ALL courses and save the responses into CSV files.

 

1 Get the list of all courses on the platform

First fetch all courses using the API

GET /course/v1/courses

 

2 Iterate through all courses

Then, in a loop iterate through all courses, skipping those that are not published (coursePublished == False) or deleted (courseDeleted == True)

 

2.1 (optionally) get course star rating value

You may want to get and store the star rating for each course for that use the API

GET /learn/v1/courses/{courseId}/rating

 

2.2 Retrieve the list of learning objects

To identify surveys, you need to fetch the list of all learning objects in a course

GET /learn/v1/courses/{courseId}/los

 

2.3 Iterate through all learning objects

In another loop, go through all LOs, skipping those which are NOT polls ('object_type' != "poll")

 

2.3.1 Get the poll ID

From learning objects which are polls, get the poll identifier from the id_resource field

pollId = lo<‘id_resource’]

 

2.3.2 Fetch all answers for a poll

To get all answers use the API

GET /learn/v1/survey/{pollId}/answer?id_course={courseId}

in Python, you can now convert the received JSON to a flat table using Pandas DataFrame

df = pd.DataFrame(pd.json_normalize(courseSurveys))

Optionally, add a few columns to make the CSV file more user friendly and make it easier if you’d like to aggregate responses from several courses.

dfc'COURSE-code'] = course["code"] dfd'COURSE-id'] = courseId dfO'COURSE-category_name'] = coursec'category_name'] dfu'COURSE-type'] = course['type'] dfp'COURSE-title'] = course'"title"]

and then save it to a CSV file:

df.to_csv(f'surveysFolder/course{courseId}-poll{pollId}.csv', index=False)

 

2.3.3 Repeat for each poll

If there are multiple surveys in a course, they all will be saved to separate files

 

2.4 Repeat for each course

In that way, you will end up with the surveysFolder containing separate files for each survey in each course :-)

 

NOTE, if you have your survey in the Central Repository, the poll ID (id_resource) value will be the same across all courses that are using that survey, so you will be able to easily aggregate responses from all courses which are using that survey.

 

I hope this is helpful.

 

That’s super helpful! Thanks a lot! :D


If any of you are using Integromat/Make for automating things with the API, I’m happy to share my blueprint/solution for this built with the logic supplied by @alekwo .  


Does this work for things like tests as well and could it be iFramed onto a page?


Reply