Using Curator to prune Elasticsearch Indices

If you use Elasticsearch to store data from various sources, you will need a way to prune old indices before they fill your drive. Curator is one way to go about this task.

The following instructions are based on an Ubuntu 16.04 LTS install with Elasticsearch 5.6 running on the same machine.

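  • First, add the source to your apt repository list. The write has to happen with root privileges, so pipe through sudo tee (with sudo echo ... >> the redirect still runs as your own user and fails):
echo "deb [arch=amd64] http://packages.elastic.co/curator/5/debian9 stable main" | sudo tee /etc/apt/sources.list.d/curator.list
  • Update your package listing and install the curator package:
sudo apt-get update && sudo apt-get install elasticsearch-curator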
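
A quick way to confirm the install worked is to check that the curator command is on your PATH and ask it for its version. This is only a minimal sketch; check_curator is a hypothetical helper name, not something shipped with the package.

```shell
# Hypothetical helper: report the Curator version if the command is installed.
check_curator() {
  if command -v curator >/dev/null 2>&1; then
    curator --version
  else
    echo "curator not found on PATH" >&2
    return 1
  fi
}

# Run the check once; "|| true" keeps a missing install from aborting a larger script.
check_curator || true
```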

Curator uses the popular YAML format for its configuration files. We will need to create at least two files to get going.

  • First, create a blank file for the base configuration (the ~/.curator directory may not exist yet, so create it as well):
mkdir -p ~/.curator && touch ~/.curator/curator.yml

Paste in the following basic configuration (assuming your Elasticsearch server is on the same machine as Curator):

client:
  hosts:
    - 127.0.0.1
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
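
Before writing any delete action, it can help to confirm that Elasticsearch answers on the host and port used in the configuration, and to see which indices actually carry the prefix you plan to prune. The sketch below uses the _cat/indices endpoint through curl. It assumes curl is installed and that the cluster is reachable over plain HTTP without authentication. The cat_indices_url helper and the ES_HOST / ES_PORT variables are hypothetical names chosen for this example.

```shell
# Assumed to match the hosts: and port: values in curator.yml.
ES_HOST="${ES_HOST:-127.0.0.1}"
ES_PORT="${ES_PORT:-9200}"

# Hypothetical helper: build the _cat/indices URL for a given index prefix.
cat_indices_url() {
  printf 'http://%s:%s/_cat/indices/%s*?v' "$ES_HOST" "$ES_PORT" "$1"
}

# List the netflow- indices; print a hint on stderr if nothing answers.
curl -s --max-time 5 "$(cat_indices_url netflow-)" \
  || echo "No response from Elasticsearch at $ES_HOST:$ES_PORT" >&2
```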

Then create a second file, the “action file”, that holds the action we want to run. Here, I’m using a slightly modified version of an example action; it deletes any index older than 30 days whose name starts with “netflow-”. Modify this file to suit your needs. We will call it “delete_indices.yml”.

touch ~/delete_indices.yml

Contents of the file:

actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days (based on index name), for netflow-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      disable_action: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: netflow-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
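
To get a feel for what the age filter selects, here is a rough shell approximation of the decision it makes with source: name. It strips the prefix, reads the rest of the index name as a %Y.%m.%d date, and checks whether that date falls before "now minus 30 days". This is only a sketch of the idea, not Curator's own code. It assumes GNU date (as on Ubuntu), and is_older_than is a hypothetical helper name.

```shell
# Hypothetical helper, approximating an age filter with source: name.
# Usage: is_older_than INDEX_NAME PREFIX DAYS [NOW_EPOCH]
# Exit status: 0 = older than the cutoff, 1 = not older, 2 = no parsable date.
is_older_than() (
  index="$1"; prefix="$2"; days="$3"
  now="${4:-$(date -u +%s)}"

  stamp="${index#"$prefix"}"                 # netflow-2017.09.01 -> 2017.09.01
  iso=$(printf '%s' "$stamp" | tr '.' '-')   # 2017.09.01 -> 2017-09-01 for GNU date
  index_epoch=$(date -u -d "$iso" +%s 2>/dev/null) || exit 2

  cutoff=$(( now - days * 86400 ))           # unit: days, unit_count: DAYS
  [ "$index_epoch" -lt "$cutoff" ]
)

# Example call with the values from the action file above.
if is_older_than netflow-2017.09.01 netflow- 30; then
  echo "netflow-2017.09.01 would be selected for deletion"
fi
```

Because the filter reads the date from the index name, an index whose name does not end in a date matching the timestring has no age to compare.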

We can then run the action file with the --dry-run parameter to simulate the actions that would be taken. Remove --dry-run when you are ready to run it for real!

➜  ~ sudo curator ./delete_indices.yml --dry-run
2017-10-30 14:28:42,528 INFO      Preparing Action ID: 1, "delete_indices"
2017-10-30 14:28:42,536 INFO      Trying Action ID: 1, "delete_indices": Delete indices older than 30 days (based on index name), for logstash- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2017-10-30 14:28:42,582 INFO      DRY-RUN MODE.  No changes will be made.
2017-10-30 14:28:42,582 INFO      (CLOSED) indices may be shown that may not be acted on by action "delete_indices".
2017-10-30 14:28:42,582 INFO      Action ID: 1, "delete_indices" completed.
2017-10-30 14:28:42,582 INFO      Job completed.

To run this on a schedule, add it to your crontab. The entry below runs it daily at midnight. Open your crontab with

crontab -e

Add this line:

00 00 * * * /usr/bin/curator ~/delete_indices.yml
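
Output from a cron job is easy to lose, and cron runs with a minimal environment, so one option is to point the crontab entry at a small wrapper script. The wrapper can use absolute paths, name the client configuration explicitly with --config, and append everything Curator prints to a log file. The sketch below is one way to do that. The function name run_curator, the variable names and the log location are assumptions for this example, not Curator defaults.

```shell
#!/bin/sh
# Hypothetical wrapper for the cron job; adjust the paths to your setup.
CURATOR_BIN="${CURATOR_BIN:-/usr/bin/curator}"
CURATOR_CONFIG="${CURATOR_CONFIG:-$HOME/.curator/curator.yml}"
ACTION_FILE="${ACTION_FILE:-$HOME/delete_indices.yml}"
LOG_FILE="${LOG_FILE:-$HOME/curator-cron.log}"

run_curator() {
  # Fail loudly if the binary is missing instead of silently doing nothing.
  if [ ! -x "$CURATOR_BIN" ]; then
    echo "curator not found at $CURATOR_BIN" >&2
    return 127
  fi
  # Append both stdout and stderr to the log so each nightly run leaves a trace.
  "$CURATOR_BIN" --config "$CURATOR_CONFIG" "$ACTION_FILE" >> "$LOG_FILE" 2>&1
}
```

To use it, save the script somewhere such as ~/run_curator.sh, add a final line that calls run_curator, make it executable with chmod +x, and put the script's path in the crontab entry in place of the curator command.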