hub.foundries.io Outage
Timeline of Events
- 17:30 UTC - DataDog monitoring reports issues with hub.foundries.io
- 18:00 UTC - We rule out every recent change on our side and pin the failure
down to this line of code.
- 20:00 UTC - After failing to find any operational issue with Google Storage, we
conclude it must be a client API change and start testing
different versions of the Docker Registry.
- 20:30 UTC - Version 2.7.2 is able to “stat” GCS objects
- 20:45 UTC - hub.foundries.io is operational again using 2.7.2
What Happened
The backstory to this outage is a bug in Docker Registry 2.6.
This issue originally prevented us from deploying our container registry on
2.6 and we instead deployed with 2.4. This has left us with an old version
of the registry, which worked until today. Luckily, Docker Registry 2.7
finally included the fix we needed and used an up-to-date GCS API, which
did not exhibit the “stat” bug we were seeing.