Cloud-native applications must often co-exist with legacy applications. Those legacy applications are hardened and just work, so rewriting them can hardly seem worth the trouble.

Legacy applications need bridges to take advantage of new technology, and FUSE clients for object storage are a bridge that allows most (but not all)…


Kafka and Elasticsearch Pipelines Made Easier with Kubernetes and Object Storage

Collecting and indexing logs from servers, applications, and devices enables crucial visibility into running systems. A log analytics pipeline allows teams to debug and troubleshoot issues, track historical trends, or investigate security incidents. The most commonly deployed pipeline combines…


A Quick and Easy Tool with No Dependencies

Spent a few hours trying to debug why Apache Spark on FlashBlade is slower than expected only to realize you have an underlying networking issue?

Flashblade-plumbing is a tool to validate NFS and S3 read/write performance from a single client to a…


RESTful interfaces make interacting with modern applications easier for other applications. “Easy” means you can use a variety of languages and tools to build scripts and automation around the application and that REST API, instead of relying only on manual clicks and a GUI.

You may already be familiar with…


Dremio is a rapidly growing scale-out analytics engine that is part of a new generation of data lake services, emphasizing the power and flexibility of disaggregated storage like FlashBlade. …


In this post, I look at what it takes to list all keys in a single bucket with 67 billion objects and build a simple list benchmark program in Golang. …


Comparing Text, JSON, Parquet, and Elasticsearch

Observability and diagnosability require collecting logs from a huge variety of sources, e.g. firewalls, routers, servers, and applications, into a central location for analysis. These logs help in troubleshooting, optimization, and predictive maintenance, all of which benefit from longer timeframes of log data. But…


Confluent recently announced the general availability of Tiered Storage in the Confluent Platform 6.0 release, a new feature for managing and simplifying storage in large Kafka deployments. As this feature has been in tech preview, I have been able to test the solution with an on-prem object store, FlashBlade.

Kafka…


One of the biggest challenges in making use of data is avoiding the complexity of making N copies of a dataset to be used with N different analytics systems. Dataset sprawl and application silos are quicksand for productivity. And the backups of your important tables are sitting idle, so perhaps…


Monitoring infrastructure is essential for keeping production workloads healthy and debugging issues when things go wrong. Observability is essential for troubleshooting.

The goal of this post is to learn how to quickly and easily deploy a minimal-configuration, open-source Prometheus and Grafana monitoring infrastructure in Kubernetes.

The full yaml for…

Joshua Robinson

Data science, software engineering, hacking
