Changelog
This changelog lists all updates, improvements and new features our Engineering team develops for our Skyscrapers Reference Developer Platform. These are rolled out automatically to all DevOps-as-a-Service customers.
2019 Q2
- 2019-04-08
Maintenance
Increased monitoring alerts visibility
Over the coming days we’re going to roll out some changes to how Kubernetes monitoring notifications are delivered. From now on, all notifications coming from the production k8s monitoring system will be shown in our shared Slack channel, that …
2019 Q1
- 2019-03-29
Maintenance
Upgrade to Kubernetes 1.11.9 [CVE-2019-1002100, CVE-2019-9946, CVE-2019-3874, CVE-2019-1002101]
We are in the process of upgrading our managed Kubernetes clusters from v1.11.6 to v1.11.9. Besides some general bug fixes and improvements, which are detailed in full in the Kubernetes changelog, this rollout comes with several high and medium …
- 2019-03-19
Maintenance
Create simple AWS resources from K8s via the AWS Service Operator
We’ve made the AWS Service Operator available for deployment on our managed Kubernetes clusters. This Operator allows you to manage some AWS resources, like ECR repositories and S3 buckets, by using Kubernetes Custom Resource Definitions. For …
- 2019-03-18
Maintenance
Support for cronjob monitoring
Update (2019-03-18): We found that the existing default alerts already cover all cases of cronjob failures. The following alerts cover the different failure cases: KubeJobCompletion: Warning alert after 1 hour if any Job doesn’t …
- 2019-03-06
Maintenance
Upgrade Kubernetes components
We are in the process of upgrading the components of our staging Kubernetes clusters to the latest stable releases. Production clusters will follow in 1 to 2 weeks (to be announced) after we have confirmed there are no issues with our customers’ workloads. …
- 2019-03-06
Maintenance
Improved monitoring alerts on Slack
We have updated the format of the monitoring Slack notifications. You might have already noticed that the monitoring messages in your Slack channels now contain more useful information and are more structured. We’ve already started rolling out the …
- 2019-02-21
Maintenance
MongoDB monitoring and dashboards
We have updated the clusters to support MongoDB monitoring, alerts and dashboards. If you have a MongoDB cluster, you will see that there is now a MongoDB dashboard in Grafana and that we added specific alert rules for MongoDB in Prometheus.
- 2019-02-19
Maintenance
Improved etcd backups
We’ve upgraded all the k8s clusters with a new etcd backup implementation. The old backup solution relied on daily snapshots taken by a service running on the master nodes. We’ve decided to take a new approach by using AWS Data Lifecycle …
- 2019-02-18
Maintenance
CVE-2019-5736 - Rolling out patched runc
Update: Added other affected services next to Kubernetes. Last week a new vulnerability in Docker’s runc was announced: CVE-2019-5736. You can read more about this specific vulnerability and how it affects Kubernetes users in the Kubernetes blog: …
- 2019-01-21
Maintenance
Use encrypted EBS volumes for etcd storage and (optionally) encrypt k8s node root volumes
We’re rolling out a major update for our Kubernetes etcd clusters so that they use encrypted EBS volumes for storing all of the Kubernetes state. As an optional feature, it’s also possible to have the Kubernetes nodes’ root volumes encrypted. If this …
- 2019-01-15
Maintenance
Move to CoreDNS DNS server and add gp2-encrypted StorageClass
We’re updating our Kubernetes staging clusters with CoreDNS, the new DNS server that replaces KubeDNS. After in-depth analysis and testing, we’ve verified that the performance and stability of the two solutions are almost identical. …
- 2019-01-11
Maintenance
Upgrade Vault to 1.0.1
A Vault upgrade for our setups was long overdue. We’ve upgraded our Vault installation tools from version 0.9.3 to 1.0.1, which is the latest Vault version available at the moment. As Vault is set up as HA, the downtime of the upgrade will be …
- 2019-01-03
Maintenance
Upgrade to Kubernetes 1.11.6 [updated]
Update: Changed the Kubernetes update target from 1.10.12 to 1.11.6. We’ve upgraded our Kubernetes staging clusters to the latest stable version, bumping from version 1.10.10 to 1.11.6. There are some nice additions to the 1.11 release, like Pod priority …
2018 Q4
- 2018-12-03
Maintenance
Upgrade to Kubernetes 1.10.11 [updated]
Update 2 (2018-12-03): Since our last update, the people at Kubernetes updated their documentation to add an important fix in the 1.10.11 changelog: CVE-2018-1002105: Fix critical security issue in kube-apiserver upgrade request proxy handler (#71411, …
- 2018-11-27
Maintenance
Set resource reservations for kubelet and other system processes
Following our efforts to improve the overall stability of our Kubernetes clusters, we’ve now set resource reservations for kubelet and other system processes. This will ensure that these critical processes always have enough CPU and memory available …
- 2018-11-27
Maintenance
Adding Prometheus monitoring for ECS
We’ve deployed a Prometheus monitoring system on all our managed ECS staging clusters. This gives us better monitoring of our ECS nodes and adds the opportunity to create custom metrics to monitor your applications. Thanks to alert …
- 2018-11-21
Maintenance
Updated Prometheus & Grafana monitoring stack - update
As announced in our previous update, we have migrated our cluster-monitoring stack to use the new stable/prometheus-operator as its base chart. By now these updates have already been rolled out across staging clusters. Initially we planned to do a phased …
- 2018-11-19
Maintenance
Updated Prometheus & Grafana monitoring stack
Our cluster monitoring stack is based on the prometheus-operator developed by the people at CoreOS. More concretely, we used kube-prometheus as a starting point for a complete setup. This project has seen numerous changes and improvements, like the …
- 2018-11-13
Maintenance
Set resource requests and limits for all infrastructure pods
We’ve recently adjusted resource requests and limits for all Pods running in the infrastructure namespace. Previously, some of them had neither requests nor limits, and others had unnecessarily high values. We’ve reviewed the CPU and …
- 2018-11-13
Maintenance
Moving from kube-lego to cert-manager for automatic TLS certificates
We’re moving the Let’s Encrypt service on our Kubernetes clusters from the deprecated kube-lego to cert-manager. Cert-manager comes with a whole set of new features, mainly the ability to use the dns01 ACME challenge for certificate validation. This means you …
- 2018-11-05
Maintenance
Grafana Pods dashboard updated memory metrics
We’ve updated the Pods dashboard so it displays the actual container memory usage (container_memory_working_set_bytes) next to the previous metric, which includes caches (container_memory_usage_bytes). You can find this dashboard in your Grafana …
- 2018-10-30
Maintenance
Releasing our user-level documentation repository
Today we’re releasing a new user-level knowledge base of our products and services. It aims to help you be more confident and autonomous in managing your applications on our platforms. You can find it in the following GitHub repository: …
- 2018-10-23
Maintenance
K8S upgrade to stretch
We have successfully upgraded our test cluster to Debian stretch and tested it, now that all open issues are resolved. This change makes the K8S stack more future-proof because we are running on stretch, the current stable release of Debian. This also allows …
2018 Q3
- 2018-09-28
Maintenance
Vault data is now backed up
Our Vault setup is configured to store its data in a DynamoDB table, using Vault’s DynamoDB storage backend. DynamoDB already replicates all the data in a table across three availability zones, giving Vault high availability and data durability. From today, …
- 2018-09-28
Maintenance
Teleport upgrade to 2.7.5
Teleport has been upgraded to version 2.7.5 for all users. This upgrade includes various bugfixes and performance improvements, as well as additional functionality such as scp (secure copy) from the web interface. You can find the full changelog on the …