Incident Summary:
On the morning of January 15, around the start of school hours (CET), our support team received multiple reports that users on macOS devices were unable to start the Schoolyear application. Users reported that the application installed successfully, but that they could not transition from the onboarding screen to the Safe Exam Workspace. The issue prevented students from starting their digital exams, forcing some institutions to revert to paper backups.
The root cause was a version mismatch between the client (desktop application) and the onboarding web application. A fix was deployed to the onboarding application, and the client version was re-released so that all users were on the correct version, resolving the issue for all customers.
Leadup:
On the evening of January 14, a new version of the Schoolyear client was released to production. Prior to release, this version was tested extensively on a wide range of devices in our Beta environment, where it performed without issue and was therefore confirmed to be production-ready.
Fault:
The new client version depended on a specific update to the web-based onboarding application to function correctly. Such a dependency is uncommon enough that our standard operating procedures did not cover it. While the client was successfully deployed to production, the corresponding update to the onboarding application was not deployed in advance. As a result, the client received a malformed launch URL from the outdated onboarding page, which prevented startup.
This omission occurred because the onboarding application relied on a manual release process that was not integrated with the main client release pipeline.
Impact:
The incident specifically affected macOS users, who were shown a "Schoolyear installed successfully" message but were looped back to the browser without the exam starting. This significantly disrupted students attempting to take exams on MacBooks for the duration of the incident. Windows and (managed) Chromebook users did not run into this issue and were able to conduct their exams successfully.
Response:
Upon validating the issue on internal macOS devices, the engineering team initially advised a rollback. The rollback allowed students to start the exam again after re-installing the Schoolyear application.
Simultaneously, we verified that the Beta environment remained healthy, isolating the issue to the Production environment. By pinning our internal tenant to the new version in Production, we used our logging system to identify the "malformed launch URL" error. The missing update for the onboarding app was identified and immediately released. Following this, the client was re-released to all customers to ensure a seamless fix without requiring manual re-installation on the students' devices.
Timeline:
00:00 - New client version released to production.
08:50 - Support team receives initial report of macOS startup failures and escalates to the on-call engineer.
09:03 - Issue validated on internal devices; initial rollback advice issued.
09:15 - Engineering identifies the missing onboarding update via production logs.
09:27 - Onboarding application update released to production.
09:27 - Client version re-released to all customers; issue resolved.
09:32 - Status page updated and affected customers contacted directly.
Reflection:
This incident highlighted a gap between our Beta/testing environment and the production release process. Because our Beta environment is frequently updated manually, the dependency issue was masked during testing. To prevent this in the future, we are changing our release protocol to require final validation on the Production environment rather than relying solely on Beta. Furthermore, we are implementing Continuous Integration/Continuous Deployment for our onboarding application to automate releases and remove the risk of manual deployment errors.
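As an illustration of the kind of automated check such a release protocol could include, the sketch below blocks a client release when the onboarding application deployed in any environment is older than the version the client needs. It is a minimal sketch and not a description of our actual pipeline: the names `ClientRelease`, `min_onboarding_version`, and `release_blockers`, the dotted version format, and the version numbers in the example are all assumptions made for illustration.

```python
from dataclasses import dataclass


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '1.8.0' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


@dataclass(frozen=True)
class ClientRelease:
    version: str
    # Hypothetical field: the oldest onboarding version this client works with.
    min_onboarding_version: str


def release_blockers(
    release: ClientRelease, deployed_onboarding: dict[str, str]
) -> list[str]:
    """Return the environments whose onboarding app is too old for this client."""
    required = parse_version(release.min_onboarding_version)
    return [
        env
        for env, deployed in deployed_onboarding.items()
        if parse_version(deployed) < required
    ]


if __name__ == "__main__":
    # Illustrative numbers only: Beta already has the update, Production does not.
    release = ClientRelease(version="3.2.0", min_onboarding_version="1.8.0")
    deployed = {"beta": "1.8.0", "production": "1.7.4"}

    blockers = release_blockers(release, deployed)
    if blockers:
        print("Release blocked; onboarding update missing in:", ", ".join(blockers))
    else:
        print("Onboarding dependency satisfied in all environments.")
```

A check of this shape would have to read the deployed onboarding version from each environment itself rather than from a release checklist, so that a manually updated Beta environment cannot mask a missing Production deployment.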