hass:high_availability
Next revision | Previous revision | ||
hass:high_availability [2020/03/06 16:42] – created a | hass:high_availability [2021/08/13 23:36] (current) – a | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== High Availability ====== | ====== High Availability ====== | ||
- | To aid in overall system availability, | + | To aid in overall system availability, |
+ | |||
+ | Beyond this, a couple of small tricks are required to get two Home Assistant instances to play nicely together. | ||
+ | |||
+ | ===== Variables ===== | ||
+ | A number of settings must be unique for each Home Assistant instance in a cluster. These include: | ||
+ | |||
+ | * name | ||
+ | * hostname | ||
+ | * internal_url (and possibly external_url, | ||
+ | |||
+ | Different methods exist for keeping these values unique within a common configuration, | ||
+ | |||
+ | secrets.yaml | ||
+ | < | ||
+ | # Templates for hosts | ||
+ | hostname: leonard | ||
+ | partner: johnson | ||
+ | name: Home | ||
+ | google_report_state: | ||
+ | </ | ||
+ | |||
+ | configuration.yaml | ||
+ | < | ||
+ | homeassistant: | ||
+ | ... | ||
+ | name: !secret name | ||
+ | ... | ||
+ | internal_url: | ||
+ | </ | ||
+ | |||
+ | ===== Active/ | ||
+ | Only one instance should be active at any given time. This requires a way of suppressing the automations on the standby instance, as well as detecting when the active instance has failed. | ||
+ | |||
+ | This starts with a heartbeat automation that publishes the instance's IP address every 60 seconds: | ||
+ | |||
+ | < | ||
+ | - id: ' | ||
+ | alias: HASS: | ||
+ | description: | ||
+ | trigger: | ||
+ | - platform: time_pattern | ||
+ | seconds: ' | ||
+ | condition: | ||
+ | - condition: state | ||
+ | entity_id: binary_sensor.active | ||
+ | state: ' | ||
+ | action: | ||
+ | - data: | ||
+ | payload: "{{ states(' | ||
+ | topic: home/ | ||
+ | service: mqtt.publish | ||
+ | </ | ||
+ | |||
+ | This works in concert with a binary_sensor that compares the published IP to the instance IP: | ||
+ | |||
+ | < | ||
+ | binary_sensor: | ||
+ | - platform: template | ||
+ | sensors: | ||
+ | active: | ||
+ | friendly_name: | ||
+ | value_template: | ||
+ | </ | ||
+ | |||
+ | Each automation then includes a condition to check the state of this sensor: | ||
+ | |||
+ | < | ||
+ | condition: | ||
+ | - condition: state | ||
+ | entity_id: binary_sensor.active | ||
+ | state: ' | ||
+ | </ | ||
+ | |||
+ | Control can be manually transferred between instances by triggering the heartbeat automation on the standby instance. This immediately publishes the standby's IP address, toggling the state of the binary sensor in each instance. | ||
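As a rough illustration, the manual transfer could be done with a service call along these lines from Developer Tools on the standby instance. The entity ID `automation.hass_heartbeat` is a hypothetical placeholder, since the real automation alias is truncated above. Because the heartbeat automation is conditional on `binary_sensor.active`, its conditions have to be bypassed; `skip_condition` is shown explicitly here, although it is also the default for `automation.trigger`.

```yaml
# Hypothetical sketch: run on the standby instance to take over control.
# The entity_id is a placeholder for the heartbeat automation's real ID.
service: automation.trigger
data:
  entity_id: automation.hass_heartbeat
  # Run the actions even though binary_sensor.active is not 'on' here.
  skip_condition: true
```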
+ | |||
+ | ===== Active/ | ||
+ | This is the method I previously used for transferring control between the two instances. It had the advantage of not needing a condition in each automation, but could come unstuck if the automations failed to enable or disable correctly. | ||
+ | |||
+ | The currently active instance was recorded in the following MQTT topic: | ||
+ | |||
+ | < | ||
+ | home/ | ||
+ | </ | ||
+ | |||
+ | This topic is used to populate a sensor that records the currently active instance: | ||
+ | |||
+ | < | ||
+ | sensor: | ||
+ | .. | ||
+ | platform: mqtt | ||
+ | state_topic: | ||
+ | name: partner | ||
+ | expire_after: | ||
+ | </ | ||
+ | |||
+ | The sensor will revert to '' | ||
+ | |||
+ | < | ||
+ | mqtt: | ||
+ | .. | ||
+ | birth_message: | ||
+ | topic: ' | ||
+ | payload: !secret hostname | ||
+ | retain: ' | ||
+ | will_message: | ||
+ | topic: ' | ||
+ | payload: !secret partner | ||
+ | retain: ' | ||
+ | </ | ||
+ | |||
+ | Additionally, | ||
+ | |||
+ | < | ||
+ | automation: | ||
+ | .. | ||
+ | - id: ' | ||
+ | alias: HASS: | ||
+ | description: | ||
+ | trigger: | ||
+ | - platform: time_pattern | ||
+ | seconds: ' | ||
+ | condition: [] | ||
+ | action: | ||
+ | - data: | ||
+ | payload: !secret hostname | ||
+ | topic: home/ | ||
+ | service: mqtt.publish | ||
+ | </ | ||
+ | |||
+ | This will tickle the heartbeat every 60 seconds, ensuring the '' | ||
+ | |||
+ | ===== Automations ===== | ||
+ | The main purpose of clustering Home Assistant is to allow either instance to take over the execution of automations. In general, however, each automation should only be executed by one instance at a time. There are two methods for achieving this: | ||
+ | |||
+ | * Automation Conditions | ||
+ | * Disabling/ | ||
+ | |||
+ | ==== Automation Conditions ==== | ||
+ | This is arguably the simplest method for controlling automations. Each automation should have a condition set which checks the status of the '' | ||
+ | |||
+ | < | ||
+ | .. | ||
+ | condition: | ||
+ | - condition: state | ||
+ | entity_id: sensor.active | ||
+ | state: !secret hostname | ||
+ | .. | ||
+ | </ | ||
+ | |||
+ | This arrangement also allows for automations which should run regardless of whether the instance is active, or automations which run specifically when the instance is inactive. It does, however, require each automation to be modified with this condition, which may be onerous for an established setup. | ||
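For example, an automation intended to run only on the standby instance might use the inverse of the condition shown above. This is a sketch that assumes `sensor.active` holds the hostname of the active instance and that the `partner` secret names the other host, as in the `secrets.yaml` example earlier on this page.

```yaml
# Sketch of a standby-only condition, assuming sensor.active reports the
# hostname of whichever instance is currently active.
condition:
  - condition: state
    entity_id: sensor.active
    # True on this instance only while the partner instance is the active one.
    state: !secret partner
```

An automation that should run on both instances simply omits the condition.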
+ | |||
+ | ==== Disabling/ | ||
+ | This method is a little more involved to implement, but does allow existing automations to be used unmodified. The first part of the puzzle is a script to create a group which contains all the current automations: | ||
+ | |||
+ | < | ||
+ | create_every_automation_group: | ||
+ | sequence: | ||
+ | - service: group.set | ||
+ | data_template: | ||
+ | object_id: every_automation | ||
+ | entities: '{{ states.automation | map(attribute='' | ||
+ | </ | ||
+ | |||
+ | With this group created we can now control the status of each automation with the '' | ||
+ | |||
+ | < | ||
+ | - id: ' | ||
+ | alias: HASS: | ||
+ | description: | ||
+ | trigger: | ||
+ | - entity_id: sensor.active | ||
+ | platform: state | ||
+ | to: !secret partner | ||
+ | condition: [] | ||
+ | action: | ||
+ | - data: {} | ||
+ | service: script.create_every_automation_group | ||
+ | - data: {} | ||
+ | entity_id: group.every_automation | ||
+ | service: automation.turn_off | ||
+ | - data: {} | ||
+ | entity_id: automation.hass_active | ||
+ | service: automation.turn_on | ||
+ | - id: ' | ||
+ | alias: HASS: | ||
+ | description: | ||
+ | trigger: | ||
+ | - entity_id: sensor.active | ||
+ | from: !secret partner | ||
+ | platform: state | ||
+ | condition: [] | ||
+ | action: | ||
+ | - data: {} | ||
+ | service: script.create_every_automation_group | ||
+ | - data: {} | ||
+ | entity_id: group.every_automation | ||
+ | service: automation.turn_on | ||
+ | </ | ||
+ | |||
+ | Note that after the instance is moved into standby, the last action of the HASS: |
hass/high_availability.1583484157.txt.gz · Last modified: 2020/03/06 16:42 by a