Skip to main content

Pulling data from ALL surveys using the API

  • 15 October 2021
  • 34 replies
  • 2399 views

As you all know, fetching information about responses to surveys is a “bit” painful task, as there is no report or single UI allowing you to get it for multiple courses and/or surveys.

 

In our system we have course evaluation surveys added to each course (around 250 courses and counting), most of the courses use the same survey added from the central repository, however, some have their own surveys.

Of course tracking user responses, by navigating to each course and downloading the file is time consuming and is not a feasible solution, as we want to analyze feedback weekly and quickly react to any issues and comments.

 

Below I’d like to share with you my approach to solving that problem by using Docebo API (I can’t share the full code of the Python script that I wrote for that, but given this outline any skilled developer should be able to create a similar tool).

 

My goal was to create a solution, which would automatically pull data from ALL surveys across ALL courses and save the responses into CSV files.

 

1 Get the list of all courses on the platform

First fetch all courses using the API

GET /course/v1/courses

 

2 Iterate through all courses

Then, in a loop iterate through all courses, skipping those that are not published (coursePublished == False) or deleted (courseDeleted == True)

 

2.1 (optionally) get course star rating value

You may want to get and store the star rating for each course for that use the API

GET /learn/v1/courses/{courseId}/rating

 

2.2 Retrieve the list of learning objects

To identify surveys, you need to fetch the list of all learning objects in a course

GET /learn/v1/courses/{courseId}/los

 

2.3 Iterate through all learning objects

In another loop, go through all LOs, skipping those which are NOT polls ('object_type' != "poll")

 

2.3.1 Get the poll ID

From learning objects which are polls, get the poll identifier from the id_resource field

pollId = lo<‘id_resource’]

 

2.3.2 Fetch all answers for a poll

To get all answers use the API

GET /learn/v1/survey/{pollId}/answer?id_course={courseId}

in Python, you can now convert the received JSON to a flat table using Pandas DataFrame

df = pd.DataFrame(pd.json_normalize(courseSurveys))

Optionally, add a few columns to make the CSV file more user friendly and make it easier if you’d like to aggregate responses from several courses.

df 'COURSE-code'] = course["code"]

dfs'COURSE-id'] = courseId

df='COURSE-category_name'] = coursec'category_name']

dfo'COURSE-type'] = course['type']

dfs'COURSE-title'] = course'"title"]

and then save it to a CSV file:

df.to_csv(f'surveysFolder/course{courseId}-poll{pollId}.csv', index=False)

 

2.3.3 Repeat for each poll

If there are multiple surveys in a course, they all will be saved to separate files

 

2.4 Repeat for each course

In that way, you will end up with the surveysFolder containing separate files for each survey in each course :-)

 

NOTE, if you have your survey in the Central Repository, the poll ID (id_resource) value will be the same across all courses that are using that survey, so you will be able to easily aggregate responses from all courses which are using that survey.

 

I hope this is helpful.

If any of you are using Integromat/Make for automating things with the API, I’m happy to share my blueprint/solution for this built with the logic supplied by @alekwo .  

 

Hat tip to @alekwo for the underlying logic for this solution built mostly on Integromat/Make (make.com). You could build something similar with PowerAutomate or Docebo Connect.

 

This particular workflow is kicked off using a Mailhook generated from a PowerApp. That PowerApp is a Docebo Admin Panel that I’ve been working on for a while.  Eventually I’ll switch this to a webhook or just run the Make automation on a schedule.

Explanation of steps:

  1. The Mailhook kicks off everything.  There’s a filter between this step and the next to ensure the email came from the right place.
  2. An OAuth2 connection is made to the Docebo APi to find out how many pages are going to be retrieved.
  3. A repeater is set to the “total_page_count” returned by Docebo.
  4. an OAuth2 request is made to the Docebo API to loop through each page get all the Learning Objects (LO’s)/Training materials.
  5. In iterator looks at each LO, and a filter only sends on the “poll” type LOs for further processing.
  6. An request is made to the LMS itself to initiate a Poll Result Export.
  7. Another request is made to the LMS itself to download the Excel version of the file.
  8. Some text parsing will occur to rename the file to match a naming convention.
  9. The file is uploaded to SharePoint/OneDrive for further processing.

The downloaded files are then handled through other means to create instructor-specific versions and placed into access-controlled folders.  This part is handled through some VBA code currently,

Attached is a sanitized blueprint.json file that can be used with Make to re-create the workflow there. Remove the .txt extension before using.


The bigger issue, at least for me, is this endpoint only returns the first 100 results. There is no provisioning in the endpoint to change the page and page size like many of them have. As a result I can only get the first 100 results.

GET /learn/v1/survey/{pollId}/answer?id_course={courseId}

 


The bigger issue, at least for me, is this endpoint only returns the first 100 results. There is no provisioning in the endpoint to change the page and page size like many of them have. As a result I can only get the first 100 results.

GET /learn/v1/survey/{pollId}/answer?id_course={courseId}

 

Have you tried adding on the page and page size like in others? I have encountered this a few times and although not documented, they worked. I don’t have a survey to test on for you though.


@Bfarkas, thanks, no I have not. Here is what the API browser shows. I also opened a ticket with Docebo. They are going to investigate further and get back to me.

I will try it next week.

 


Yeah, I’d give it a shot, if you have a tool outside of the api browser, just take them onto the end of the url as variables like you normally would. Literallly just ran into this with the skills endpoint and it worked. 


Yeah, I’d give it a shot, if you have a tool outside of the api browser, just take them onto the end of the url as variables like you normally would. Literallly just ran into this with the skills endpoint and it worked. 

Thanks @Bfarkas, I confirmed this works.


Excellent, good to know! Thanks for circling back to confirm.


@Bfarkas or others, what happens when you exceed the 1000 API calls per hour? Do you get a notification or do things just quit working?

I have been working on this and instead of doing all of this via API calls, I dumped a course report (CSV) and imported that to get much of the data, reducing the needed calls.

We have only two surveys, one for ILT and one for e-Learning, so I can hard code the query to course ID and the associated ID of the poll.

At that point all I have to do is loop on the survey pull for each course ID based on how many pages of survey results there are. I will further reduce the calls by having one pull for the ILTs and one for the e-Learnings.

But I am unsure of how to estimate when I will get to 1000 calls. We have 374 e-Learnings and 46 ILTs. But I will probably be ok as long as I split the two up.

Any advice is appreciated.


@Bfarkas or others, what happens when you exceed the 1000 API calls per hour? Do you get a notification or do things just quit working?

I have been working on this and instead of doing all of this via API calls, I dumped a course report (CSV) and imported that to get much of the data, reducing the needed calls.

We have only two surveys, one for ILT and one for e-Learning, so I can hard code the query to course ID and the associated ID of the poll.

At that point all I have to do is loop on the survey pull for each course ID based on how many pages of survey results there are. I will further reduce the calls by having one pull for the ILTs and one for the e-Learnings.

But I am unsure of how to estimate when I will get to 1000 calls. We have 374 e-Learnings and 46 ILTs. But I will probably be ok as long as I split the two up.

Any advice is appreciated.

Yup, that is my biggest concern with this method. You did some solid mitigations that I would also recommend in your use case, nice job. In terms of the limit, it is a soft limit, so it won’t stop you if you go over, but just know support can’t help you if you are going significantly over and begin experiencing issues.


Reply