DevPath · Learn to code

Security and reliability

Reliability

Backups and disaster recovery

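A reliable system survives hardware failures, accidental deletions and attacks. That requires tested backups (a backup you don't know how to restore is not a backup) and a disaster recovery plan. Recovery is measured with two objectives: the recovery point objective (RPO), the maximum amount of data you can afford to lose, and the recovery time objective (RTO), the maximum time you can take to restore service.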
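One way to make "tested backups" concrete is to restore each backup somewhere disposable and check that it matches the original. The sketch below is a minimal illustration, not a real backup tool: a plain file copy stands in for the backup step, and the names `backup_file` and `restore_test` are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy the source into backup_dir (a stand-in for a real backup tool)."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)
    return target


def restore_test(source: Path, backup: Path) -> bool:
    """Restore the backup into a scratch directory and compare checksums."""
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / backup.name
        shutil.copy2(backup, restored)
        return sha256_of(restored) == sha256_of(source)
```

Running `restore_test` on a schedule, rather than only after an outage, is what turns an untested copy into a backup you know how to restore.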

SLA, SLO and error budget
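
As a rough sketch of the arithmetic behind an error budget, assume the usual definitions: an SLO is an internal availability target, an SLA is the commitment made to customers, and the error budget is the unavailability the SLO still allows. The function names and the 30-day window below are assumptions chosen for illustration.

```python
def error_budget_minutes(slo: float, period_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO over a period."""
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be a fraction in (0, 1]")
    total_minutes = period_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining(slo: float, downtime_minutes: float, period_days: int = 30) -> float:
    """Minutes of budget left; a negative value means the SLO was missed."""
    return error_budget_minutes(slo, period_days) - downtime_minutes
```

For example, a 99.9% SLO over 30 days leaves about 43.2 minutes of budget, so 50 minutes of downtime in that window would exceed it.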

Incident response and blameless postmortems

When something breaks, you follow an incident response process: detect, declare the incident, mitigate, communicate and resolve. Afterwards a postmortem is written: what happened, impact, root cause and actions to prevent it from recurring.
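
The process above can be sketched as a small state machine plus a postmortem record with the four parts listed. This is an illustrative model only: the class and field names are hypothetical, and the strictly linear order is a simplification (in practice communication usually continues throughout the incident).

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """Incident stages in the order the text lists them."""
    DETECTED = 1
    DECLARED = 2
    MITIGATED = 3
    COMMUNICATED = 4
    RESOLVED = 5


@dataclass
class Postmortem:
    what_happened: str
    impact: str
    root_cause: str
    actions: list[str] = field(default_factory=list)


@dataclass
class Incident:
    title: str
    stage: Stage = Stage.DETECTED

    def advance(self) -> Stage:
        """Move to the next stage; a resolved incident cannot advance."""
        if self.stage is Stage.RESOLVED:
            raise ValueError("incident is already resolved")
        self.stage = Stage(self.stage + 1)
        return self.stage

    def write_postmortem(self, what_happened, impact, root_cause, actions) -> Postmortem:
        """Only allow the postmortem once the incident is resolved."""
        if self.stage is not Stage.RESOLVED:
            raise ValueError("write the postmortem after the incident is resolved")
        return Postmortem(what_happened, impact, root_cause, list(actions))
```

Note that the record has no field for a person at fault; it describes the event, its impact, its cause and the follow-up actions.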

The postmortem is blameless: it analyzes systems and processes rather than looking for someone to blame. Only then will people say what really happened, and only then does the organization learn.

Safe deployments: blue-green, canary and rollback

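Deploying is one of the riskiest moments. Strategies to reduce that risk include blue-green deployments (two identical environments, with traffic switched from the old one to the new one), canary releases (the new version first receives a small share of traffic) and rollback (returning to the previous version when something goes wrong).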
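
As a rough illustration of a canary release with automatic rollback, the sketch below raises the new version's traffic share step by step and backs out as soon as the observed error rate crosses a threshold. The function name, the traffic steps and the 1% threshold are all assumptions; `error_rate_at` is a hypothetical callback standing in for real monitoring.

```python
from typing import Callable, Sequence


def canary_rollout(
    error_rate_at: Callable[[int], float],
    steps: Sequence[int] = (5, 25, 50, 100),
    max_error_rate: float = 0.01,
) -> tuple[str, int]:
    """Shift traffic to the new version one step at a time.

    Returns ("promoted", final_percent) if every step stays healthy, or
    ("rolled_back", 0) as soon as one step exceeds the error threshold.
    """
    for percent in steps:
        observed = error_rate_at(percent)  # error rate seen at this traffic share
        if observed > max_error_rate:
            # Rollback: send all traffic back to the previous version.
            return ("rolled_back", 0)
    return ("promoted", steps[-1])
```

A version that only fails under load, for instance one whose error rate jumps at 25% of traffic, is rolled back after having affected a fraction of users rather than all of them.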
