From approximately 06:30 UTC to 08:50 UTC on August 9, 2019, two of the three circuits between our U.S. and Europe data centers were down because of a manhole fire in NYC. The resulting congestion on the remaining circuit caused delays in reporting.
During the incident window, Console reporting was delayed by up to 9 hours, compared to the typical 2-4 hours.
2019-08-09 06:30: Incident started: two circuits simultaneously down.
2019-08-09 07:16: Engineering notified of data congestion.
2019-08-09 07:25: Source of congestion identified on remaining circuit.
2019-08-09 07:28: Incident ticket created.
2019-08-09 08:50: One downed circuit recovered.
2019-08-09 09:40: Impression bus traffic shifted from Amsterdam data center (AMS1) to New York data center (NYM2) to relieve data congestion.
2019-08-09 12:07: Impression bus traffic shifted back from NYM2 to AMS1.
2019-08-10 02:30: Reporting delay falls back within the 6-hour SLA.
2019-08-10 04:46: Incident resolved: Reporting back to normal.
Failure of two of the three circuits between Europe and the U.S. saturated the remaining circuit.
While the circuit provider worked to fix the downed circuit(s), our engineering team (1) shifted ad traffic from AMS1 to NYM2 to help relieve data congestion and (2) ensured business-critical data was prioritized over less important data.
The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.