PSO-Analytics: Visibility into how Kubernetes Applications use Storage

Joshua Robinson
5 min read · Apr 21, 2020

Running stateful applications on Kubernetes is becoming easier (still not easy!) and more mainstream. The use of Container Storage Interface (CSI) plugins like the Pure Service Orchestrator (PSO) has made automating storage tasks more predictable and well-supported. But crucial day-2 administrative tasks for storage remain very difficult, including gaining visibility into how Kubernetes applications use storage.

PSO-analytics is a simple Kubernetes-native utility that provides insight into how different applications consume storage through PSO. The accompanying GitHub repository contains all source code and configs.

The Challenge: Storage Visibility

Stateful applications in Kubernetes manage storage through PersistentVolumeClaims and a CSI provisioner, but gain no visibility into storage-layer metrics. The only information visible through the Kubernetes API is the capacity allocation per PersistentVolume, which leaves almost all useful storage administration questions unanswered:

  • How much storage does this cluster use on each backend device?
  • How much storage does each statefulset use in total?
  • How much storage usage comes from each namespace?
  • What is the data reduction across any of the above dimensions?
  • What is the performance across any of the above dimensions?

The origin of this problem is that Kubernetes has only half the necessary info, e.g. labels and statefulset definitions. And the storage system has the other half: space usage, data reduction, and performance stats. To have full visibility into the storage usage of your Kubernetes applications, you need to somehow merge these two views.
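As a minimal sketch of that merge (the dictionary keys and field names here are illustrative, not PSO-analytics' actual schema), assuming both halves can be keyed by the PersistentVolume name:

```python
# Sketch: merge Kubernetes PV metadata with backend storage statistics.
# Field names are illustrative assumptions, not PSO-analytics' real schema.

def merge_views(k8s_pvs, backend_stats):
    """k8s_pvs: {pv_name: {"namespace": ..., "labels": {...}}}
    backend_stats: {volume_name: {"physical_bytes": ..., "logical_bytes": ...}}
    Returns one record per PV combining both halves."""
    merged = {}
    for pv_name, meta in k8s_pvs.items():
        stats = backend_stats.get(pv_name)
        if stats is None:
            continue  # no matching volume found on any backend
        merged[pv_name] = {**meta, **stats}
    return merged

k8s_pvs = {"pvc-123": {"namespace": "db", "labels": {"app": "postgresql"}}}
backend_stats = {"pvc-123": {"physical_bytes": 7_200_000, "logical_bytes": 40_500_000}}
print(merge_views(k8s_pvs, backend_stats))
```

With both halves in one record, any question about namespaces, labels, or space usage becomes a simple group-by over the merged data.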

Each PersistentVolume maps to a volume or filesystem on one of the storage backends under management, either a FlashArray or FlashBlade depending on the storageClass name. Therefore, answering the above questions requires looking up each PersistentVolume across all storage backends and then correlating the statistics. Manually collating this information is impractical for even a handful of volumes.

PSO-analytics instead automates collecting information from both the Kubernetes cluster and each of the FlashArrays and FlashBlades under management by PSO. It then merges the collected data and correlates storage information of each volume across multiple dimensions. The end result is answers to the questions above and true visibility into how applications are using storage.

An example of the output, aggregating storage usage and data reduction rates per statefulset in the cluster, is shown in the Usage section below.

Another feature in pso-analytics is the detection of orphaned volumes, i.e., volumes or filesystems whose names look PSO-created but no longer correspond to a PersistentVolume. The most common causes are removing PSO without proper cleanup or a volume reclaim policy of "Retain". The end result is a list of volumes that are no longer managed by PSO and therefore need to be cleaned up manually.
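Conceptually, orphan detection is a set difference between backend volume names and cluster PersistentVolume names. A minimal sketch, assuming backend volumes carry a configurable PSO naming prefix (the "k8s-" prefix here is an illustrative assumption):

```python
# Sketch of orphaned-volume detection: backend volumes that look
# PSO-managed (the "k8s-" prefix is an assumed, configurable example)
# but have no matching PersistentVolume in the cluster.

def find_orphans(backend_volumes, pv_names, prefix="k8s-"):
    pvs = set(pv_names)
    return sorted(
        name for name in backend_volumes
        if name.startswith(prefix) and name[len(prefix):] not in pvs
    )

backend = ["k8s-pvc-111", "k8s-pvc-222", "unrelated-vol"]
pvs = ["pvc-111"]
print(find_orphans(backend, pvs))  # ['k8s-pvc-222']
```

Volumes without the prefix (like "unrelated-vol" above) are ignored, since they were never PSO-managed in the first place.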

There are proposals (examples here and here) to standardize and provide more volume-level metrics through CSI, though they would still miss out on the deeper insight into the full storage topology managed by PSO. Another important direction in the future is to expose the storage statistics to more mature and widespread monitoring tools like Prometheus, which can already monitor FlashArray and FlashBlades.

PSO-analytics Installation and Usage

Installation of this utility consists of creating a single deployment and the necessary RBAC roles to grant the appropriate permissions to access internal Kubernetes APIs. The YAML below creates the necessary ServiceAccount, ClusterRole, and ClusterRoleBinding to run a simple deployment based on the public Docker image quay.io/purestorage/pso-analytics.

Quickstart install with this command-line:

> kubectl apply -f https://raw.githubusercontent.com/joshuarobinson/pso-analytics/master/pso-analytics.yaml

To customize the installation, such as changing the install namespace or the output format options, download the YAML and edit it locally before applying.

View the output tables computed by pso-analytics:

> kubectl logs -n=pso-analytics deployment.apps/pso-analytics

Example output:

==== statefulset =====
name              drr  phys_bytes  logical_bytes  prov_bytes  vol_count
-----------------------------------------------------------------------
elastic-internal  5.8  6.1G        35.5G          3.5T        3
pgs-postgresql    5.6  6.9M        38.7M          12.0T       1
==== label/app =====
name              drr  phys_bytes  logical_bytes  prov_bytes  vol_count
-----------------------------------------------------------------------
presto-worker     1.8  75.5M       138.0M         192.0T      24
postgresql        5.6  6.9M        38.7M          12.0T       1

These tables break down storage across multiple dimensions, here by statefulset and by label. Other dimensions not shown include per-backend usage for each FlashArray and FlashBlade. Each table includes space usage, volume counts, and data reduction rates (DRR). For example, the "elastic-internal" cluster above is a three-node Elasticsearch cluster, so the table reports total space usage for that whole cluster.

This utility is currently version 0.1 (please report bugs!) and is subject to breaking changes in the future. If this proves useful, or for any feature requests, please contact me or open an issue on GitHub.

How it works

PSO-analytics consists of a Python script that collects, correlates, and presents storage usage information across multiple dimensions, plus the Kubernetes RBAC ClusterRoles that grant the script the permissions it needs.

This utility needs to query multiple different APIs (Kubernetes, FlashArray, and FlashBlade) without requiring any manual configuration.

  • Access to the Kubernetes API is granted through Kubernetes RBAC by running under a ServiceAccount with sufficient privileges. Kubernetes conveniently mounts the necessary authentication token into the pso-analytics pod, so the simple function "load_incluster_config()" finds it automatically.
  • In order to login to the FlashArrays and FlashBlades configured for PSO, pso-analytics reads the same secret used by PSO which stores the Management IP and login token for each Pure backend. Sharing the login information with PSO keeps the installation configuration-free and ensures pso-analytics has the necessary visibility to query all PSO-managed volumes.
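As a sketch of that second step, once the secret's data map is in hand, decoding and parsing it might look like the following. The "pure.json" key and the FlashArrays/FlashBlades field layout mirror PSO's values file but are assumptions here, not a documented API:

```python
import base64
import json

# Sketch: decode PSO backend config out of a Kubernetes Secret payload.
# The "pure.json" key and the field names below are assumptions modeled
# on PSO's values file layout, not a stable documented interface.

def parse_pso_secret(secret_data):
    """secret_data: the .data map of the Secret (base64-encoded values).
    Returns (backend_type, management_endpoint, api_token) tuples."""
    config = json.loads(base64.b64decode(secret_data["pure.json"]))
    backends = []
    for fa in config.get("FlashArrays", []):
        backends.append(("FlashArray", fa["MgmtEndPoint"], fa["APIToken"]))
    for fb in config.get("FlashBlades", []):
        backends.append(("FlashBlade", fb["MgmtEndPoint"], fb["APIToken"]))
    return backends

raw = json.dumps({"FlashArrays": [{"MgmtEndPoint": "10.0.0.1", "APIToken": "t0ken"}]})
secret_data = {"pure.json": base64.b64encode(raw.encode()).decode()}
print(parse_pso_secret(secret_data))
```

The returned endpoint/token pairs are then enough to open a REST session against each backend.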

The script is written in Python because it is a common language with convenient SDKs for accessing the REST APIs of both Kubernetes and Pure storage (FlashArray and FlashBlade). Each polled volume is associated with a list of tags, which are then used to present aggregated views across multiple dimensions.
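The tag-and-aggregate step can be sketched as follows. The tag tuples and field names are illustrative, not PSO-analytics' actual data model, and the data reduction rate is computed here as logical bytes divided by physical bytes:

```python
from collections import defaultdict

# Sketch: aggregate per-volume stats across tags. Each volume carries
# tags like ("statefulset", "elastic-internal") or ("label/app", "presto-worker");
# summing per tag yields tables like the ones shown above.
# Field names are illustrative assumptions.

def aggregate(volumes):
    totals = defaultdict(lambda: {"phys": 0, "logical": 0, "count": 0})
    for vol in volumes:
        for tag in vol["tags"]:
            agg = totals[tag]
            agg["phys"] += vol["physical_bytes"]
            agg["logical"] += vol["logical_bytes"]
            agg["count"] += 1
    # data reduction rate (DRR) = logical / physical
    for agg in totals.values():
        agg["drr"] = round(agg["logical"] / agg["phys"], 1) if agg["phys"] else 0.0
    return dict(totals)

vols = [
    {"tags": [("statefulset", "es")], "physical_bytes": 100, "logical_bytes": 580},
    {"tags": [("statefulset", "es")], "physical_bytes": 100, "logical_bytes": 580},
]
print(aggregate(vols)[("statefulset", "es")])
```

Because each volume can carry multiple tags, the same volume naturally contributes to the statefulset view, the label view, and the per-backend view at once.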

Finally, the Dockerfile for this image is straightforward:

# Small Alpine-based Python image
FROM python:3.6-alpine
# SDKs for Kubernetes, FlashBlade (purity_fb), and FlashArray (purestorage),
# plus tabulate for formatting the output tables
RUN pip install kubernetes purity_fb purestorage tabulate
COPY *.py .

Feedback Encouraged

This utility is intended to be useful, but is not exhaustively tested so please report any problems and I will endeavor to fix quickly. And if this proves valuable to understanding your storage usage patterns but there are key missing pieces, I would love to know that too. In particular, let us know if this would make a valuable dashboard on Pure1. There is a lot more that could be done with this too, like including performance metrics!
