Changelog
This changelog lists all updates, improvements and new features our Engineering team develops for our Skyscrapers Reference Developer Platform. These are rolled out automatically to all DevOps-as-a-Service customers.
2020 Q2
- 2020-05-20
Maintenance
Resizable persistent volumes
We updated the defaults for our persistent volume storage classes to allow for volume expansion. This allows you to modify the volume size in your PersistentVolumeClaim without replacing the existing volume. This was already the default for our …
- 2020-05-19
Maintenance
Monitoring RDS instances through Prometheus
Up until now we’ve been monitoring our managed RDS instances through Icinga, which has been working great. But for Kubernetes customers, we’ve been relying on Prometheus more and more to monitor external services, as well as the cluster itself. …
- 2020-05-19
Maintenance
Automated OpenVPN deployments
In the past, OpenVPN was deployed manually to each cluster. As of today it is part of our Addons stack, so OpenVPN is deployed and updated automatically. Information about the OpenVPN setup and how you can use it is documented in the …
- 2020-05-13
Maintenance
Upgrades to monitoring & logging components
We’ve rolled out some minor updates to the following monitoring and logging components:
  prometheus: 2.15.2 -> 2.18.1. Several enhancements and bug fixes, including performance improvements.
  prometheus-operator: 0.37.0 -> 0.38.1 …
- 2020-05-13
Maintenance
Fix: Services without Endpoints time out instead of rejecting traffic
If a Kubernetes Service had no active Endpoints, for example when a deployment was scaled to 0, requests to that Service timed out. Instead, the Service is supposed to reject traffic with the appropriate ICMP response. The reason this was happening is …
- 2020-05-12
Maintenance
Option to configure custom routes and endpoints in Alertmanager
We now offer the option to configure custom endpoints and custom routes in Alertmanager. This is useful if you want to route your Prometheus alerts to custom Slack endpoints or use an escalation tool like PagerDuty or OpsGenie. Get in touch with your Lead …
- 2020-05-07
Maintenance
Option to add an internal-only Ingress controller
We now offer the option to enable and use an internal-only Nginx Ingress Controller, next to the public one we offer by default. This is useful if you want to expose services running in K8s only within the private AWS VPC network. With this change, …
- 2020-05-06
Maintenance
oauth2-proxy security fix following High Severity CVE-2020-11053
Today a notice for CVE-2020-11053 with a severity of High went out, impacting our oauth2-proxy that is used for authentication to our internal dashboards. As users can provide a redirect address for the proxy to send the authenticated user to at the end of …
- 2020-05-06
Maintenance
Loki - Option to set custom labels in Promtail
We now offer the option to parse custom log labels with Promtail so you can use them in Loki. Previously, you could only filter on the default log labels in Loki. It is now possible to adjust Promtail to parse extra labels from your …
- 2020-04-27
Maintenance
Update of cluster components
We have updated the following cluster components to their latest version:
  Dex v2.22.0 -> v2.23.0
  external-dns 0.6.0 -> 0.7.1
  Grafana 6.7.1 -> 6.7.3
  kube2iam 0.10.8 -> 0.10.9
  metrics-server 0.3.5 -> 0.3.6
  oauth2-proxy 5.0.0 -> 5.1.0
  …
- 2020-04-16
Maintenance
Upgrade EKS to 1.15
We have updated our EKS control planes and nodes to the latest supported version: 1.15. This brings EKS to K8s v1.15.11. In the process of upgrading EKS we updated:
  KubeProxy from 1.14.9 to 1.15.11
  Cluster Autoscaler from 1.14.11 to 1.15.5
Although …
- 2020-04-16
Maintenance
Improved Slack notifications for Prometheus alerts
We’re updating the format of our monitoring Slack messages. As you already know, all the alerts produced by your Kubernetes clusters show up in Slack. The goal is to provide visibility on what’s going on in your infrastructure and application, …
- 2020-04-16
Maintenance
Adjusting the NodeFilesystemSpaceFillingUp Prometheus alert
You might have noticed the NodeFilesystemSpaceFillingUp alert passing by on some occasions. That alert triggers when Prometheus predicts that a node’s disk will run out of space, based on the trend of the last few hours. Usually, that alert triggers …
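To illustrate what "predicts based on the trend" means, here is a minimal Python sketch of a trend-based prediction: fit a straight line through recent free-space samples and extrapolate it forward. This is not the actual alert rule, which is not shown in this entry. The function names, the sample data and the "below zero" threshold are assumptions made for illustration. PromQL offers a comparable built-in, predict_linear, which is also based on simple linear regression.

```python
from typing import Sequence, Tuple


def predict_linear(samples: Sequence[Tuple[float, float]], horizon_s: float) -> float:
    """Fit a least-squares line through (timestamp_s, value) samples and
    return its value horizon_s seconds after the last sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must not all be equal")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    return slope * (samples[-1][0] + horizon_s) + intercept


def will_fill_up(samples: Sequence[Tuple[float, float]], horizon_s: float) -> bool:
    """Hypothetical alert condition: predicted free space drops below zero."""
    return predict_linear(samples, horizon_s) < 0


if __name__ == "__main__":
    # Made-up data: free space in GiB, sampled hourly, shrinking by 1 GiB per hour.
    samples = [(h * 3600.0, 10.0 - h) for h in range(7)]
    print(will_fill_up(samples, 24 * 3600))  # look 24 hours ahead
    print(will_fill_up(samples, 2 * 3600))   # look 2 hours ahead
```

A short burst of writes can tilt the fitted line steeply, so a prediction like this may trigger even when the disk is in no real danger. Tuning the sample window and the look-ahead horizon changes how sensitive it is.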
- 2020-04-14
Maintenance
Simplifying our ECS monitoring
We’re deprecating the Prometheus setup of our ECS clusters. We released that setup a while ago as an alternative for both infrastructure and application monitoring for our ECS clusters, similar to what we have in place for Kubernetes. But we’ve …
- 2020-04-10
Maintenance
Migration to Helm 3
We have migrated all our managed cluster add-ons and our CI to Helm 3. Compared to Helm 2, version 3 comes with quite a lot of improvements, of which the most apparent is the removal of the Tiller server component:
  Removal of Tiller
  Release Names are now scoped …
- 2020-04-09
Maintenance
Auto configuration for AWS ElasticSearch multi-AZ deployment
Our AWS ElasticSearch Terraform module now supports auto-configuration for multi-AZ deployment. The criterion is to always enable multiple Availability Zones up to 3 zones and within the available resources, unless specified otherwise by the user. Your lead …
- 2020-04-01
Maintenance
Help fight COVID-19 with your Kubernetes cluster
In the context of the current global situation regarding the COVID-19 pandemic, we’re making it easy for us and our customers to commit part of our spare infrastructure resources to the Folding@Home project. In short, Folding@Home uses distributed …
2020 Q1
- 2020-03-31
Maintenance
Upgrades to monitoring components
We’ve rolled out some minor updates to the monitoring components. List of updated components:
  grafana: v6.6.2 -> v6.7.1. Several enhancements and bug fixes.
  prometheus: v2.14.0 -> v2.15.2. Several enhancements and bug fixes, including …
- 2020-03-27
Maintenance
Cluster addons upgrades
Over the past weeks we’ve rolled out a bunch of updates to our Kubernetes addons stack for all staging and production clusters. List of updated components:
  cert-manager: v0.9.0 -> v0.13.1. During the process we also made it through the major …
- 2020-03-27
Maintenance
Alert and documentation for NodeWithImpairedVolumes
Using EBS-backed Persistent Volumes on Kubernetes comes with some caveats. Among those is the (silent) limit on the maximum number of volume attachments per EC2 instance. For more information about this issue, you can check the documentation. We have also added an alert to …
- 2020-03-26
Maintenance
Use NetworkPolicies
We have deployed Calico to our EKS setups as a network policy engine. By default, Pods are non-isolated and thus accept traffic from any source. By specifying NetworkPolicies you can isolate Pods from each other and thus have more fine-grained K8s …
- 2020-03-26
Maintenance
Upgrade of core EKS components
We have upgraded the core cluster components, running in kube-system, to their latest recommended versions (for EKS 1.14):
  AWS VPC CNI from 1.5.3 to 1.5.5
  CoreDNS from 1.3.1 to 1.6.6
  KubeProxy from 1.14.7 to 1.14.9
These are minor updates, bringing some …
- 2020-03-25
Maintenance
Upgrade Concourse to version 5.8.1
We rolled out Concourse version 5.8.1 to all our setups. This is a security upgrade that patches an edge case of CVE-2018-15798. You can check out the full Concourse changelog here.
- 2020-03-23
Maintenance
Upgrade Caddy to version 1.0.4 with ACMEv2
We rolled out version 1.0.4 of the Caddy web server to all our setups that use on-demand “whitelabel” domains. All these certificates are now being requested and renewed against the ACMEv2 API. ACMEv1 has been deprecated for a while …
- 2020-03-17
Maintenance
Upgrade Concourse to version 5.8.0
We rolled out Concourse version 5.8.0 to all our setups. This is a minor version upgrade, coming from version 5.7.2, and it includes the following:
  The first step to spaces in Concourse
  A handful of fixes and smaller features
You can check out the full …