A Guide to Elasticsearch Snapshots

  • How to configure a snapshot repository on an on-premises S3 object store
  • How to set up and use a shared filesystem repository on NFS using a dynamically provisioned ReadWriteMany PersistentVolume
  • Techniques to simplify setting up and using Elasticsearch snapshots in Kubernetes

A possible DR architecture for multiple Kubernetes clusters. Site 2 can also use FlashBlade A as a snapshot repository.

S3 Snapshot Repository

initContainers:
- name: install-plugins
  command:
  - sh
  - -c
  - |
    # Remove first so the install also succeeds when the plugin is already present
    bin/elasticsearch-plugin remove --batch repository-s3
    bin/elasticsearch-plugin install --batch repository-s3

initContainers:
- name: install-plugins
  command:
  - sh
  - -c
  - |
    bin/elasticsearch-plugin remove repository-s3
    bin/elasticsearch-plugin install --batch repository-s3
- name: add-access-keys
  env:
  # Both keys are read from a Kubernetes secret rather than written into the manifest
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: irp210-s3-keys
        key: access-key
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom:
      secretKeyRef:
        name: irp210-s3-keys
        key: secret-key
  command:
  - sh
  - -c
  - |
    echo "$AWS_ACCESS_KEY_ID" | bin/elasticsearch-keystore add --stdin --force s3.client.default.access_key
    echo "$AWS_SECRET_ACCESS_KEY" | bin/elasticsearch-keystore add --stdin --force s3.client.default.secret_key

> kubectl create secret generic irp210-s3-keys --from-literal=access-key='XXXXXXX' --from-literal=secret-key='YYYYYYY'

config:
  node.master: true
  node.data: true
  node.ingest: true
  thread_pool.snapshot.max: 8

{
  "type": "s3",
  "settings": {
    "bucket": "elastic-snapshots",
    "endpoint": "10.62.64.200",
    "protocol": "http",
    "max_restore_bytes_per_sec": "1gb",
    "max_snapshot_bytes_per_sec": "200mb"
  }
}

NFS Snapshot Repository

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: es-snap-repo-claim
spec:
  storageClassName: pure-file
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Ti

nodeSets:
- name: all-nodes
  count: 5
  podTemplate:
    spec:
      containers:
      - name: elasticsearch
        volumeMounts:
        - name: es-snap-repo-vol
          mountPath: /es-snaps

      volumes:
      - name: es-snap-repo-vol
        persistentVolumeClaim:
          claimName: es-snap-repo-claim

config:
  node.master: true
  node.data: true
  node.ingest: true
  thread_pool.snapshot.max: 8
  path.repo: ["/es-snaps"]

{
  "type": "fs",
  "settings": {
    "location": "/es-snaps",
    "compress": false,
    "max_restore_bytes_per_sec": "1gb",
    "max_snapshot_bytes_per_sec": "1gb"
  }
}
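
The later commands in this guide use a repository called my_nfs_repository, which has to be registered with the settings above first. Here is a minimal sketch of one way to do that. The helper name fs_repo_body is hypothetical, and the example call assumes the same $ESHOST and $PASSWORD variables used in the Snapshot Command Guide.

```shell
# Hypothetical helper (not from the original article): print the registration
# body for a shared-filesystem repository. $1 is the mount path, which must
# also be listed under path.repo in the Elasticsearch config.
fs_repo_body() {
  printf '{"type":"fs","settings":{"location":"%s","compress":false,"max_restore_bytes_per_sec":"1gb","max_snapshot_bytes_per_sec":"1gb"}}\n' "$1"
}

# Example call (needs a reachable cluster, so it is left commented out):
# fs_repo_body /es-snaps |
#   curl -u "elastic:$PASSWORD" -k -X PUT "$ESHOST/_snapshot/my_nfs_repository?pretty" \
#     -H 'Content-Type: application/json' -d @-
```

Passing the body on stdin with -d @- keeps the JSON in one place, so the same helper can be reused for a second mount path.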

Snapshot Command Guide

apiVersion: v1
kind: Pod
metadata:
  name: es-jump
spec:
  containers:
  - name: es-jump
    image: centos:7
    env:
    - name: PASSWORD
      valueFrom:
        secretKeyRef:
          name: quickstart-es-elastic-user
          key: elastic
    - name: ESHOST
      value: "https://quickstart-es-http:9200"
    command: ["tail", "-f", "/dev/null"]
    imagePullPolicy: IfNotPresent
  restartPolicy: Always

> kubectl exec -it pod/es-jump -- bash

# Cluster health, nodes, and indices
curl -u "elastic:$PASSWORD" -k "$ESHOST/_cat/health?v"
curl -u "elastic:$PASSWORD" -k "$ESHOST/_cat/nodes?v"
curl -u "elastic:$PASSWORD" -k "$ESHOST/_cat/indices?v"
# List registered snapshot repositories
curl -u "elastic:$PASSWORD" -k "$ESHOST/_snapshot?pretty"
# Register the S3 repository
curl -u "elastic:$PASSWORD" -k -X PUT "$ESHOST/_snapshot/my_s3_repository?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket": "elastic-snapshots",
    "endpoint": "10.62.64.200",
    "protocol": "http",
    "max_restore_bytes_per_sec": "1gb",
    "max_snapshot_bytes_per_sec": "200mb"
  }
}'
# Remove the repository registration
curl -X DELETE -u "elastic:$PASSWORD" -k "$ESHOST/_snapshot/my_s3_repository?pretty"
# List snapshots in the repository
curl -u "elastic:$PASSWORD" -k "$ESHOST/_cat/snapshots/my_s3_repository?v"
# Take a timestamped snapshot, first asynchronously, then waiting for completion
curl -u "elastic:$PASSWORD" -k -X PUT "$ESHOST/_snapshot/my_s3_repository/test-snapshot-$(date +"%Y-%m-%d-%H-%M")?pretty"
curl -u "elastic:$PASSWORD" -k -X PUT "$ESHOST/_snapshot/my_s3_repository/test-snapshot-$(date +"%Y-%m-%d-%H-%M")?wait_for_completion=true&pretty"
# Snapshot a single index
curl -u "elastic:$PASSWORD" -k -X PUT "$ESHOST/_snapshot/my_s3_repository/test-snapshot?wait_for_completion=true" -H 'Content-Type: application/json' -d'{ "indices": "nyc_taxis" }'
# Check snapshot status, then delete the snapshot
curl -u "elastic:$PASSWORD" -k "$ESHOST/_snapshot/my_s3_repository/test-snapshot/_status?pretty"
curl -X DELETE -u "elastic:$PASSWORD" -k "$ESHOST/_snapshot/my_s3_repository/test-snapshot?pretty"
# Restore one index under a new name, then delete the restored copy
curl -u "elastic:$PASSWORD" -k -X POST "$ESHOST/_snapshot/my_s3_repository/test-snapshot/_restore?pretty" -H 'Content-Type: application/json' -d'{ "indices": "nyc_taxis", "rename_pattern": "nyc_taxis", "rename_replacement": "restored_taxis" }'
curl -X DELETE -u "elastic:$PASSWORD" -k "$ESHOST/restored_taxis?pretty"
# Confirm the snapshot thread pool size
curl -u "elastic:$PASSWORD" -k "$ESHOST/_cat/thread_pool/snapshot?h=name,max&v"
# The same operations against the NFS repository
curl -u "elastic:$PASSWORD" -k -X PUT "$ESHOST/_snapshot/my_nfs_repository/test-snapshot?pretty"
curl -u "elastic:$PASSWORD" -k "$ESHOST/_cat/snapshots/my_nfs_repository?v"
curl -u "elastic:$PASSWORD" -k -X POST "$ESHOST/_snapshot/my_nfs_repository/test-snapshot/_restore?pretty" -H 'Content-Type: application/json' -d'{ "indices": "nyc_taxis", "rename_pattern": "nyc_taxis", "rename_replacement": "restored_taxis" }'
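
Every command above repeats the same credentials and TLS flag. As a convenience, a small wrapper function can cut down on the typing. This is only a sketch: the function name es is hypothetical, and it assumes $ESHOST and $PASSWORD are set as in the es-jump pod.

```shell
# Hypothetical wrapper around the curl flags repeated throughout this guide.
# Usage: es METHOD PATH [extra curl arguments...]
es() {
  method=$1
  path=$2
  shift 2
  # -k skips certificate verification, matching the commands in this guide
  curl -k -u "elastic:$PASSWORD" -X "$method" "$ESHOST$path" "$@"
}

# Example calls (need a reachable cluster, so they are left commented out):
# es GET '/_cat/snapshots/my_s3_repository?v'
# es PUT '/_snapshot/my_s3_repository/test-snapshot?wait_for_completion=true' \
#   -H 'Content-Type: application/json' -d '{ "indices": "nyc_taxis" }'
```

Quoting the path argument matters here for the same reason as in the raw commands: an unquoted ? or & would be interpreted by the shell.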

Snapshot Lifecycle Management

# Check that SLM is running and list existing policies
curl -u "elastic:$PASSWORD" -k "$ESHOST/_slm/status?pretty"
curl -u "elastic:$PASSWORD" -k "$ESHOST/_slm/policy?pretty"
# Create a policy that snapshots the listed indices daily at 01:30
curl -X PUT -u "elastic:$PASSWORD" -k "$ESHOST/_slm/policy/daily-snapshots?pretty" -H 'Content-Type: application/json' -d'
{
  "schedule": "0 30 1 * * ?",
  "name": "<daily-snap-{now/d}>",
  "repository": "my_s3_repository",
  "config": {
    "indices": ["plsidx-*", "nyc_taxis"],
    "ignore_unavailable": true,
    "include_global_state": false
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}'
# View SLM statistics, then delete the policy
curl -u "elastic:$PASSWORD" -k "$ESHOST/_slm/stats?pretty"
curl -X DELETE -u "elastic:$PASSWORD" -k "$ESHOST/_slm/policy/daily-snapshots?pretty"
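
The schedule string in the policy uses Elasticsearch's cron syntax, which starts with a seconds field: seconds, minutes, hours, day of month, month, day of week. The order is easy to get wrong, so here is a small sketch that builds a daily schedule from an hour and a minute. The function name slm_daily_schedule is hypothetical.

```shell
# Hypothetical helper: build a once-a-day SLM schedule.
# $1 is the hour (0-23) and $2 is the minute (0-59).
# Field order: seconds minutes hours day-of-month month day-of-week
slm_daily_schedule() {
  printf '0 %s %s * * ?\n' "$2" "$1"
}

# slm_daily_schedule 1 30 prints the schedule used in the policy above.
# To run the policy immediately instead of waiting for the schedule, the SLM
# execute endpoint can be called (needs a reachable cluster):
# curl -X POST -u "elastic:$PASSWORD" -k "$ESHOST/_slm/policy/daily-snapshots/_execute?pretty"
```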

Custom Kubernetes Cronjob

# Use batch/v1 on Kubernetes 1.21 and later; batch/v1beta1 was removed in 1.25
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: elasticsearch-snapshotter
spec:
  schedule: "@daily"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: snapshotter
            image: centos:7
            env:
            - name: ESPASSWORD
              valueFrom:
                secretKeyRef:
                  name: quickstart-es-elastic-user
                  key: elastic
            command:
            - /bin/bash
            args:
            - -c
            - |
              # The snapshot name is the URL-encoded date math expression <snapshot-{now/d}>;
              # grep makes the job fail unless the response status is 200 OK
              curl -s -i -k -u "elastic:$ESPASSWORD" -XPUT "https://quickstart-es-http:9200/_snapshot/my_s3_repository/%3Csnapshot-%7Bnow%2Fd%7D%3E" | tee /dev/stderr | grep "200 OK"
          restartPolicy: OnFailure
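
The snapshot name in the CronJob URL is a date math expression, <snapshot-{now/d}>, with its special characters percent-encoded. Encoding these by hand is error-prone, so here is a sketch of a helper that does it with sed. The function name urlencode_datemath is hypothetical, and it only handles the characters that date math names typically contain.

```shell
# Hypothetical helper: percent-encode the special characters used in
# Elasticsearch date math names. The % sign is encoded first so that the
# escapes added by the later substitutions are not encoded a second time.
urlencode_datemath() {
  printf '%s\n' "$1" | sed -e 's/%/%25/g' \
    -e 's/</%3C/g' -e 's/>/%3E/g' \
    -e 's/{/%7B/g' -e 's/}/%7D/g' \
    -e 's|/|%2F|g' -e 's/+/%2B/g' \
    -e 's/:/%3A/g' -e 's/,/%2C/g' -e 's/|/%7C/g'
}

# Example: build the snapshot URL used by the CronJob.
# name=$(urlencode_datemath '<snapshot-{now/d}>')
# echo "https://quickstart-es-http:9200/_snapshot/my_s3_repository/$name"
```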

Summary

Joshua Robinson
