
20th July Outage Postmortem

  • 23 July, 2015
  • Photon

On Monday, 20th July, the Photon Cloud servers in the US Cloud were not available.
Players were affected from 11:22 UTC – 13:40 UTC. Shortly after this period, our servers started to accept connections again.
During that time, our provider’s datacenter suffered a complete power outage.
Neither the Chat Cloud, our Clouds in the EU, nor any of the other Regions were affected.

At about the same time, we also had an issue with our stats service failing. This was unrelated to the power outage and did not affect any players.
The issue was caused by an Out-of-Memory condition that was handled very badly by the Docker version we were using to host our service, and it affected both the primary
and secondary systems.
The stats service failed from Sunday 19-Jul-2015 09:23 UTC – 20-Jul-2015 10:39 UTC.
Note: You may see data from some time before the end of that window, because the stats service was partially restored, but not yet stable, starting from 20-Jul-2015 08:20 UTC.

Planned Measures
1) We haven’t finished assessing how to handle a major disruption like the one caused by this power failure in the future.
In all the years since we started, we have never had such a case.

2) The stats service is being upgraded this week: both its memory and its Docker version are being upgraded.

Posted By Tobias
