Unable to activate or upload hosted creatives
Incident Report for Xandr
Postmortem

Incident Summary

Between 2019-03-05 19:20 UTC and 2019-03-06 11:30 UTC, all customers experienced a delay in creatives becoming active. The root cause was a software defect in the application responsible for placing hosted creative assets on the AppNexus CDN, which caused file uploads to fail. The immediate problem was resolved by restarting the application. Failed file uploads were reprocessed and the associated creatives were automatically transitioned to the correct "active" state.

Scope of Impact

All customers likely experienced a delay in hosted creatives becoming active during the incident window. Any hosted creative associated with a new file between 2019-03-05 19:20 UTC and 2019-03-06 09:30 UTC would have had its activation delayed until 2019-03-06 11:30 UTC.

Timeline (UTC)
2019-03-05 19:20 Incident Started: application began failing to upload a small percentage of files
2019-03-06 06:13 The issue was reported internally.
2019-03-06 06:43 Engineer started investigation
2019-03-06 08:10 Problem was identified
2019-03-06 09:30 Root cause was identified
2019-03-06 09:40 Manually restarted application, resolving file upload problems for all new files/creatives
2019-03-06 09:50 Verified uploads working as expected
2019-03-06 10:46 Deployed a hotfix of the application to allow us to reprocess the uploads correctly
2019-03-06 11:35 Incident Resolved: All failed uploads successfully reprocessed. All related inactive creatives now activated.

Cause Analysis
The incident was caused by a problem in the application responsible for placing hosted creative assets on the AppNexus CDN, which failed to upload a large number of assets. The upload failures were caused by a slow SFTP connection leak in the application, which eventually exhausted the maximum number of connections and caused additional upload requests to be rejected.
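
The failure mode described above can be illustrated with a minimal sketch. The report does not say how the connections leaked, so this assumes, purely for illustration, that a connection was not returned when a transfer raised an error. `ConnectionPool`, `transfer`, `leaky_upload`, and `safe_upload` are hypothetical names and are not part of the actual application; no real SFTP library is used.

```python
# Hypothetical stand-in for a bounded SFTP connection pool (not the real application).
class PoolExhausted(Exception):
    pass


class ConnectionPool:
    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.in_use = 0

    def acquire(self):
        if self.in_use >= self.max_connections:
            raise PoolExhausted("no connections available")
        self.in_use += 1

    def release(self):
        self.in_use -= 1


def transfer(name):
    # Simulated transfer: names starting with "bad" fail partway through.
    if name.startswith("bad"):
        raise IOError(f"transfer failed for {name}")


def leaky_upload(pool, name):
    pool.acquire()
    transfer(name)   # if this raises, the release below is never reached
    pool.release()


def safe_upload(pool, name):
    pool.acquire()
    try:
        transfer(name)
    finally:
        pool.release()  # the connection is returned even when the transfer fails


def run(upload, names, max_connections=3):
    """Attempt every upload; return (rejected_count, connections_still_held)."""
    pool = ConnectionPool(max_connections)
    rejected = 0
    for name in names:
        try:
            upload(pool, name)
        except PoolExhausted:
            rejected += 1  # the pool is full, so the request is refused outright
        except IOError:
            pass           # only this one transfer failed
    return rejected, pool.in_use


names = ["ok1", "bad1", "ok2", "bad2", "ok3", "bad3", "ok4", "ok5"]
print(run(leaky_upload, names))
print(run(safe_upload, names))
```

In the leaky variant each failed transfer permanently consumes one slot, so healthy uploads start being rejected once the limit is reached, and a restart (which resets the pool) clears the symptom without fixing the cause.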

Resolution Steps
Manually restarting the application resolved the issue for newly uploaded creatives at 2019-03-06 09:40 UTC. A hotfix that corrects reprocessing of the failed uploads and activation of the associated creatives was deployed at 2019-03-06 10:46 UTC.

Next Steps
- Fix the application to prevent recurrence of this issue.
- Add a new alert to monitor upload errors.
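
As a rough illustration of the alerting item above, the sketch below tracks the failure rate over the most recent uploads and flags when it crosses a threshold. The class name `UploadErrorAlert`, the window size, and the threshold are all assumptions chosen for the example; the report does not describe how the actual alert is implemented.

```python
from collections import deque


class UploadErrorAlert:
    """Hypothetical sliding-window failure-rate check for upload results."""

    def __init__(self, window=100, threshold=0.05):
        # Only the most recent `window` results are kept; older ones fall off.
        self.results = deque(maxlen=window)
        self.threshold = threshold

    def record(self, success):
        self.results.append(bool(success))

    def failure_rate(self):
        if not self.results:
            return 0.0
        return self.results.count(False) / len(self.results)

    def should_alert(self):
        return self.failure_rate() > self.threshold


alert = UploadErrorAlert(window=10, threshold=0.2)
for _ in range(9):
    alert.record(True)
alert.record(False)
early = alert.should_alert()   # 1 failure in the last 10 uploads

for _ in range(3):
    alert.record(False)
later = alert.should_alert()   # 4 failures in the last 10 uploads
print(early, later)
```

A rate over a recent window, rather than a raw error count, is one way to catch a failure that starts on a small percentage of files and grows slowly, as happened in this incident.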

Posted Mar 08, 2019 - 15:25 UTC

Resolved

The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.

Posted Mar 06, 2019 - 13:58 UTC
Monitoring

We have patched the issue and are monitoring our systems closely. We will provide an update as soon as the issue has been fully resolved.

Posted Mar 06, 2019 - 11:42 UTC
Update

We've solved the immediate problem, but are keeping an eye on it (reporting catching up, ensuring no repeat intermittent issues, etc.).

Posted Mar 06, 2019 - 11:40 UTC
Identified

We have identified the cause of the issue, and our engineers are actively working towards a resolution. We will provide an update as soon as possible. Thank you for your patience.

Posted Mar 06, 2019 - 08:29 UTC
Investigating

We are currently investigating the following issue of hosted creatives not uploading correctly or remaining inactive after saving:

  • Component(s): Creatives
  • Impact(s):
    • Unable to upload or preview creatives
    • Unable to activate creatives
  • Severity: Major Outage
  • Datacenter(s): Global

We will provide an update as soon as more information is available. Thank you for your patience.

Posted Mar 06, 2019 - 07:00 UTC