ADR 016: Backup Strategy

Status: Accepted
Type: Feature
Created: 2024-07-17
Related-ADRs: 010, 024, 036

Context and Goals

Ensuring the availability and integrity of data is critical for the Hop3 platform. A robust backup strategy is essential to protect against data loss and corruption and to ensure quick recovery after failures. The goal is to define a comprehensive backup strategy that covers the different types of data (configuration files, application data, and databases) and ensures that backups are performed regularly, stored securely, and can be restored efficiently.

This ADR defines the long-term vision for Hop3's backup capabilities. ADR 024 specifies the foundational backup and restore system on which the later phases build.

Decision

Hop3 implements a comprehensive backup strategy that includes regular backups of critical data, secure storage of backup files, and efficient restoration procedures. This strategy encompasses application data, configuration files, and databases.

The strategy is delivered in phases, so that a usable foundation exists before the more operationally demanding capabilities (scheduling, remote storage, encryption, incremental backups) are layered on:

Feature                        Phase      ADR
Manual full backups            Phase 1    ADR 024
Local storage                  Phase 1    ADR 024
Checksum verification          Phase 1    ADR 024
Service-specific backups       Phase 1    ADR 024
Automated scheduled backups    Phase 2    -
Retention policies             Phase 2    -
Remote storage (S3, B2)        Phase 3    -
Encryption                     Phase 3    -
Incremental backups            Phase 3    -
Transaction log backups        Phase 3    -

Key Components

Backup Types and Frequency

Phase 1 (specified in ADR 024):

  • Manual full backups on demand
  • All application components in one backup

Phase 2+:

  1. Configuration Files:
     • Frequency: Daily backups of configuration files such as hop3.toml and other relevant configurations.
     • Retention: Retain daily backups for 30 days and monthly backups for 12 months.

  2. Application Data:
     • Frequency: Incremental backups daily and full backups weekly for application data.
     • Retention: Retain daily incremental backups for 30 days and weekly full backups for 6 months.

  3. Databases:
     • Frequency: Daily backups of databases with transaction log backups every hour.
     • Retention: Retain daily backups for 30 days and monthly backups for 12 months.
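
The "30 days of dailies, 12 months of monthlies" rule can be sketched as a small selection function. This is only an illustration, not Hop3's implementation: the function name backups_to_keep, its parameters, the choice of the earliest backup in each calendar month as "the monthly backup", and the approximation of 12 months as 365 days are all assumptions made for the example.

```python
from datetime import date


def backups_to_keep(backup_dates, today, daily_days=30, monthly_days=365):
    """Return the set of backup dates that a daily + monthly policy retains.

    Hypothetical helper: the name, the parameters, and the definition of
    "monthly backup" are assumptions, not part of Hop3.
    """
    keep = set()
    first_of_month = {}
    for d in sorted(backup_dates):
        age = (today - d).days
        if age < 0:
            continue  # ignore backups dated in the future
        if age <= daily_days:
            keep.add(d)  # still inside the daily retention window
        # Assumption: the earliest backup in a calendar month is the monthly one.
        first_of_month.setdefault((d.year, d.month), d)
    for d in first_of_month.values():
        if (today - d).days <= monthly_days:
            keep.add(d)  # monthly backup still inside the long window
    return keep
```

Anything not in the returned set would be a candidate for deletion. The same shape would presumably work for the weekly full backups of application data by grouping on ISO week instead of calendar month.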

Backup Storage and Security

Phase 1 (specified in ADR 024):

  • Local file-based storage only
  • File permissions (600) for access control
  • SHA256 checksums for integrity
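
As a rough illustration of the Phase 1 storage guarantees, the sketch below restricts an archive to mode 600 and records its SHA256 checksum next to it. The function names sha256_of and seal_backup and the ".sha256" sidecar file layout are assumptions made for this example. ADR 024 is the authoritative description of how Hop3 stores checksums.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def seal_backup(archive_path):
    """Restrict access to the archive and write a checksum sidecar file.

    Hypothetical helper: the sidecar naming convention is an assumption.
    """
    archive = Path(archive_path)
    os.chmod(archive, 0o600)  # owner read/write only
    checksum = sha256_of(archive)
    sidecar = archive.with_name(archive.name + ".sha256")
    # Same "<hash>  <filename>" layout that the sha256sum tool emits.
    sidecar.write_text(f"{checksum}  {archive.name}\n")
    os.chmod(sidecar, 0o600)
    return checksum
```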

Phase 2+:

  1. Storage Locations:
     • Local Storage: Store backups locally on a dedicated backup server or storage device.
     • Remote Storage: Use remote storage solutions such as cloud storage providers (e.g., AWS S3, Google Cloud Storage, Backblaze B2) for redundancy and disaster recovery.

  2. Security Measures:
     • Encryption: Encrypt all backup files at rest and in transit to ensure data confidentiality (using Age or GPG).
     • Access Control: Implement strict access control measures to restrict access to backup files to authorized personnel only.

Restoration Procedures

Phase 1 (specified in ADR 024):

  • Manual restore via CLI (hop3 backup restore)
  • Checksum verification before restore
  • Service-specific restore (PostgreSQL via pg_restore, etc.)
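
The "verify, then restore" ordering can be sketched as a gate that refuses to hand a corrupted archive to the restore step. Everything named here (verify_then_restore, ChecksumMismatch, and the restore callable standing in for a service-specific step such as a pg_restore invocation) is hypothetical and only illustrates the control flow, not the actual hop3 backup restore code.

```python
import hashlib
import hmac


class ChecksumMismatch(Exception):
    """Raised when an archive does not match its recorded checksum."""


def verify_then_restore(archive_path, expected_sha256, restore):
    """Run restore(archive_path) only if the archive's SHA256 matches.

    `restore` is a caller-supplied callable (an assumption for this sketch).
    """
    digest = hashlib.sha256()
    with open(archive_path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    actual = digest.hexdigest()
    expected = expected_sha256.strip().lower()
    if not hmac.compare_digest(actual, expected):
        # Fail before touching any live data.
        raise ChecksumMismatch(f"{archive_path}: expected {expected}, got {actual}")
    return restore(archive_path)
```

Failing before the restore callable runs means a damaged archive never overwrites a working service, which is the point of checking first.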

Phase 2+:

  1. Regular Testing:
     • Test Restorations: Perform regular test restorations to ensure that backup files are not corrupted and can be restored successfully.
     • Documentation: Maintain detailed documentation of the restoration procedures and update it regularly.

  2. Automated Restoration:
     • Automation Tools: Use automated tools and scripts to facilitate quick and efficient restoration of backups.
     • Monitoring: Implement monitoring systems to detect and alert on backup failures or issues.
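
One simple form of the monitoring described above is a freshness check: flag any backup target whose last successful run is missing or too old. The sketch below assumes a mapping from target name to the time of its last successful backup. The function name stale_backups and that data shape are illustrative assumptions, not an existing Hop3 interface.

```python
from datetime import datetime, timedelta, timezone


def stale_backups(last_success, max_age, now=None):
    """Return the sorted names of targets whose backups are missing or too old.

    `last_success` maps a target name to a timezone-aware datetime, or to
    None if no backup has ever succeeded (hypothetical data shape).
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for name, finished_at in last_success.items():
        if finished_at is None or now - finished_at > max_age:
            stale.append(name)
    return sorted(stale)
```

For a daily schedule, a max_age slightly longer than 24 hours (for example timedelta(hours=25)) would leave room for normal variation in run time before an alert fires.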

Continuous Improvement

  1. Feedback Loop:
     • User Feedback: Establish a feedback loop with users and administrators to continuously improve the backup strategy based on real-world usage and feedback.
     • Performance Monitoring: Monitor the performance and reliability of the backup processes to identify and address any issues promptly.

  2. Community Engagement:
     • Hop3 Community: Encourage contributions from the Hop3 community to refine and enhance the backup strategy.

Consequences

Benefits

  • Data Protection: Ensures the availability and integrity of critical data.
  • Quick Recovery: Facilitates quick recovery in case of data loss or corruption.
  • Security: Enhances security through encryption and strict access control measures (Phase 3).

Drawbacks

  • Resource Intensive: Requires significant storage resources and network bandwidth for regular backups.
  • Management Complexity: Adds complexity to system management, requiring careful planning and monitoring.
  • Phased Delivery: The advanced capabilities (scheduling, remote storage, encryption, incremental backups) depend on operational machinery that the foundational phase does not provide, so they become available only as later phases are built.

Risks

  • Backup Failures: Potential risk of backup failures or corruption. Mitigation involves regular testing and monitoring.
  • Security Breaches: Risk of unauthorized access to backup files. Mitigation includes strong encryption (Phase 3) and access control measures.

References


Related ADRs:

  • ADR 010: Security and Resilience (Umbrella)
  • ADR 024: Backup and Restore System
  • ADR 036: CLI Ergonomics and Command Surface