If you like DNray Forum, you can support it by - BTC: bc1qppjcl3c2cyjazy6lepmrv3fh6ke9mxs7zpfky0 , TRC20 and more...

 

MySQL Backups for Large Volumes

Started by Posicoes, May 07, 2023, 12:16 AM

Previous topic - Next topic

Posicoes (topic starter)

Which program can I use to back up a large MySQL volume without locking the database? There is a MySQL + nginx web server used for statistics. mysqldump locks the server for 5-10 minutes every day now that the database volume exceeds 1 GB, so that option is not viable, and the presence of InnoDB tables rules out MyISAM-only tools.
Are there any other alternatives available?

Copying table by table is not an attractive option, since the main statistics table alone accounts for more than 50% of the database volume. A separate server for the database is not possible; there is only one machine available.


tevez

Maatkit (the predecessor of Percona Toolkit) contains mk-parallel-dump, a utility that dumps tables and databases in parallel; backups made with it can be loaded back with mk-parallel-restore. (The often-mentioned mysqlhotcopy is not part of Maatkit; it is a Perl script shipped with MySQL itself and handles only MyISAM and ARCHIVE tables, so it will not help with InnoDB.) One significant advantage of these tools over standard mysqldump is their parallel access to the database, which can dramatically increase operation speed.

A more complex and selective backup option is provided by mk-archiver (now pt-archiver in Percona Toolkit). This utility copies records selected by a given criterion from one MySQL table to another, on the same or a different server, or to a specified file. It is designed to be easily extensible, allowing pre-filters, postprocessors, and custom logic for saving data.

It is useful to be aware of these alternative options for backing up and restoring data in MySQL, as they may offer advantages over standard methods. The ability to perform operations in parallel can greatly improve performance, while the flexibility of mk-archiver allows for greater customization of the backup process.
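As an illustration, here is a hedged pt-archiver invocation that moves old rows out of a hypothetical `stats.hits` table into a flat file (the database, table, and column names are made up for this example):

```shell
# Archive rows older than 90 days, 1000 rows per transaction, and print
# counters at the end. Requires a running MySQL server and Percona Toolkit.
pt-archiver \
  --source h=localhost,D=stats,t=hits \
  --file '/backup/%Y-%m-%d-%D.%t.archive' \
  --where "created_at < NOW() - INTERVAL 90 DAY" \
  --limit 1000 --txn-size 1000 --statistics
```

Archiving in small transactions keeps the impact on concurrent queries low, which matters when everything runs on a single shared machine.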

vKchiliahusy

An effective solution is to take snapshots at the file-system level, as described at en.wikipedia.org/wiki/Snapshot_%28computer_storage%29#File_systems. Snapshots rely on the copy-on-write principle: the snapshot initially references the same blocks as the original data, and a block is copied only when it is subsequently modified.

Shadow copies (for example, the Volume Shadow Copy Service on Windows) apply the same idea: a snapshot of the data is generated at a particular point in time and can be restored whenever necessary. This also gives users access to previously saved versions of their files, effectively providing a backup in case of an unexpected event.
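For the snapshot to be consistent, the read lock must be held in the same client session while the snapshot is taken. A rough sketch, assuming the data directory lives on a hypothetical ZFS dataset `tank/mysql` (all names here are illustrative):

```shell
# Keep the lock and the snapshot inside one mysql session; the lock is
# released as soon as the client disconnects, so separate invocations
# would snapshot an unlocked server.
mysql <<'SQL'
FLUSH TABLES WITH READ LOCK;
system zfs snapshot tank/mysql@nightly
UNLOCK TABLES;
SQL

# Later, stream the frozen snapshot off at leisure, then remove it:
zfs send tank/mysql@nightly | gzip > /backup/mysql-nightly.zfs.gz
zfs destroy tank/mysql@nightly
```

The lock is held only for the second or two the snapshot takes, not for the duration of the copy.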

zMsliliGreaw

When using LVM on the server, take a read lock on the database so that in-flight transactions are flushed before the LV snapshot is created.
Once the data has been copied off (with compression running in parallel), release the lock, and remember to clean up the snapshots afterwards: lingering LVM snapshots slow down I/O.


If you switch to XtraBackup, you are effectively moving to the Percona stack; on the upside, that brings extra features such as incremental backups and faster backup/restore.
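A minimal XtraBackup backup/restore cycle looks roughly like this (the target directory is illustrative, and the restore step assumes a stopped server with an empty datadir):

```shell
# Take a hot backup without locking InnoDB tables.
xtrabackup --backup --target-dir=/backup/full
# Apply the InnoDB log so the copy is transactionally consistent.
xtrabackup --prepare --target-dir=/backup/full
# Restore: run only with MySQL stopped and the data directory empty.
xtrabackup --copy-back --target-dir=/backup/full
```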

ImagineWorks

In your situation, you might want to consider the following options:

1. **Percona XtraBackup**: It's open-source hot backup software for MySQL and MariaDB that lets you back up your data without locking the database. It works with InnoDB, XtraDB, and MyISAM tables, among others. Apart from saving the data, it also records the server's log position, so a database administrator can take backups without running into consistency issues.

2. **MariaDB Backup**: It's a fork of the Percona XtraBackup tool and is officially supported by MariaDB for performing hot backups. If you use MariaDB, then this could be a good choice.

3. **mysqldump with --single-transaction**: If you are only using InnoDB tables, you might still be able to use mysqldump, but with the `--single-transaction` flag. This should allow mysqldump to perform consistent backups without acquiring a lot of locks. But keep in mind that this only works reliably if all your tables are using the InnoDB storage engine. If you have MyISAM or other tables that do not support transactions, then conditions might not be met, and database locks can still occur.

4. **MySQL Enterprise Backup**: This is a commercial solution provided by Oracle which can take online "hot" backups of your databases without requiring you to lock them.
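For option 3 above, a hedged mysqldump invocation (the `stats` database name is hypothetical):

```shell
# Consistent dump of InnoDB tables without a global lock. --quick streams
# rows instead of buffering them in memory; --master-data=2 records the
# binlog position as a comment for later point-in-time recovery.
mysqldump --single-transaction --quick --routines \
  --master-data=2 stats | gzip > /backup/stats.sql.gz
```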

Always remember, after taking backups, ensure you also have reliable recovery processes in place. Regular testing of restores should be an essential part of your backup strategy.

Lastly, consider optimizing the database itself, indexes, and queries, which might help alleviate some pressure during backup times.


Here are some additional methods to consider:

Snapshots Using a Filesystem or Block-Storage Technology: If your MySQL server is running on a volume that supports snapshots (like AWS EBS, SAN storage systems, or a filesystem like ZFS or Btrfs), this can provide a fast and efficient way to create a consistent backup of a database. The process involves flushing all InnoDB pages to disk and locking for a short time as the snapshot is taken (this typically takes a few seconds), after which MySQL can continue to handle requests while you mount the snapshot elsewhere and back it up.

Replication and Backup: If you have the resources and permission to set up replication to a second server, this can be a good solution. While your production server continues to handle requests, the second server keeps a copy of the data. You can run your backups on the secondary server, thereby avoiding the load and locks on your production machine that backups can cause. While you mentioned you only have a single machine available, the second server in this setup could potentially be a much less powerful, and therefore less expensive machine.

Binary Log Backup: If the data modification rate is not very high, another approach is to do a full backup (by any method, including mysqldump), say, once a week, and in addition keep the binary log files. The binary log contains every command that modifies data.

When you restore, you apply the full backup and then feed the commands from the binary log to the server. This brings the database back to the state it had at the time of the crash. The major advantage is that this method lets you bring the database to any point in time between two full backups.
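A hedged sketch of that point-in-time restore (the file names and timestamp are illustrative):

```shell
# 1. Load the most recent full dump.
gunzip < /backup/stats.sql.gz | mysql stats
# 2. Replay binary logs up to just before the failure.
mysqlbinlog --stop-datetime="2023-05-07 12:00:00" \
  /var/log/mysql/binlog.000042 /var/log/mysql/binlog.000043 | mysql
```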

LVM Snapshot: If you have your MySQL data on an LVM partition, you could use an LVM snapshot to do the backup. You'd briefly lock the tables, create an LVM snapshot, unlock the tables, and then backup from the snapshot.
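A rough LVM version of that sequence, assuming the datadir sits on a hypothetical logical volume `/dev/vg0/mysql` (all names and sizes are illustrative):

```shell
# Hold the lock and create the snapshot inside one mysql session.
mysql <<'SQL'
FLUSH TABLES WITH READ LOCK;
system lvcreate --size 5G --snapshot --name mysql-snap /dev/vg0/mysql
UNLOCK TABLES;
SQL

# Back up from the frozen snapshot while the server keeps serving.
mount -o ro /dev/vg0/mysql-snap /mnt/snap
tar czf /backup/mysql-snap.tar.gz -C /mnt/snap .
umount /mnt/snap
lvremove -f /dev/vg0/mysql-snap   # leftover snapshots degrade I/O
```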

Incremental Backups: While not traditionally used with MySQL, many backup systems support incremental backups. This means only new or changed data since the last backup is saved. This can dramatically speed up the backup process and reduce the amount of storage needed, at the cost of a more complex restore process. The aforementioned XtraBackup supports this type of backup.
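With XtraBackup, an incremental scheme might look like this (directories are illustrative): a weekly full backup plus daily increments that record only pages changed since the base.

```shell
# Weekly full backup (the base).
xtrabackup --backup --target-dir=/backup/full
# Daily incremental against that base.
xtrabackup --backup --target-dir=/backup/inc1 \
  --incremental-basedir=/backup/full
# Restore: prepare the full with --apply-log-only, then merge each
# increment into it before the final prepare and copy-back.
xtrabackup --prepare --apply-log-only --target-dir=/backup/full
xtrabackup --prepare --target-dir=/backup/full \
  --incremental-dir=/backup/inc1
```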

Streaming Backups: Some backup tools, such as XtraBackup and mysqldump, can stream backup data to another machine or storage media, reducing the need for local storage during the backup process. This could be useful if you are experiencing I/O bottlenecks during the backup process.
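For example, XtraBackup can stream its output in xbstream format straight to another host, so no local staging space is needed (the host name is made up):

```shell
xtrabackup --backup --stream=xbstream --target-dir=/tmp \
  | ssh backup@archive-host "xbstream -x -C /backup/full"
```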

Backup Compression: Backup tools like XtraBackup can directly compress the backup data during the process. This will reduce the size of the backup and could help both with storage space and backup speed if you are bandwidth-constrained and CPU is not a bottleneck. Be careful as the compression process could increase CPU utilisation.
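A sketch of XtraBackup's built-in compression (the exact algorithm depends on your version: recent 8.0 releases use zstd, older ones used qpress, so check your version's documentation):

```shell
# Compress data files on the fly using four parallel threads.
xtrabackup --backup --compress --compress-threads=4 \
  --target-dir=/backup/full-compressed
```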

Selective Table Backup: While you noted that a single table makes up more than 50% of your database and so selective table backup may not be useful, if the rest of the tables are also large and could be backed up individually without affecting performance, this strategy might still help.

Optimising Your Database: This isn't so much another method, it's more a potential solution if other methods aren't effective. This can include database sharding or table partitioning, to distribute data across multiple tables or databases, making each smaller and faster to back up. It could also involve normalising or denormalising your database depending on the specific issues you might be having.

Cloud Service Providers: If your infrastructure is hosted on a cloud platform, you might consider using one of the cloud-managed backup solutions. These services like AWS RDS, Google Cloud SQL, and Azure Database for MySQL offer automated backups, replication, and failover support.

Here are some more advanced concepts and strategies for tackling database backup.

1. **Backup validation**: After taking database backups, it's critical to validate that the data can actually be recovered; the most reliable check is restoring into a test instance, after which tools like `mysqlcheck` and `myisamchk` can verify table integrity.

2. **Percona Toolkit**: Percona offers a toolkit for MySQL that has a variety of advanced tools for MySQL users. They can prove beneficial when you're dealing with database backups.

3. **Cloud-native Backup Solutions**: There are many cloud-native database backup solutions available now, like Google Cloud's `gcloud sql backups create` and AWS' automated backups. If your database is hosted on a cloud-server, doing some research into what solutions your cloud provider offers might be worthwhile.

4. **Database as a Service (DBaaS) solutions**: Cloud providers like AWS, Google Cloud, and Azure offer DBaaS solutions that handle backups, failover, and scaling for you. This might not be an option depending on your business requirements, but it can significantly simplify backup management.

5. **Partitioning**: Partitioning your data effectively can allow backups to be taken of smaller chunks at a time, speeding up the overall process and allowing for smaller 'change windows' in your data. MySQL supports range, list, and hash partitioning, which could be applicable depending on your specific use case.
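As a hedged sketch of range partitioning by month on a hypothetical `hits` table (note that MySQL requires the partitioning column to be part of every unique key, so this may need schema changes first):

```shell
mysql stats <<'SQL'
ALTER TABLE hits
  PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p2023_04 VALUES LESS THAN (TO_DAYS('2023-05-01')),
    PARTITION p2023_05 VALUES LESS THAN (TO_DAYS('2023-06-01')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
  );
SQL
```

Old partitions can then be exported or dropped individually instead of touching the whole table.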

6. **Delayed Replication**: Using a MySQL replica with replication delay can give you a "time-machine" that lets you recover data from a certain amount of time in the past. By setting up a replica to delay its replication by the length of your backup window, you can ensure there's always a copy of the database from before the last backup started.
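Delayed replication is configured on the replica itself (MySQL 5.6 and later); here a hypothetical one-hour delay:

```shell
mysql <<'SQL'
STOP SLAVE SQL_THREAD;
CHANGE MASTER TO MASTER_DELAY = 3600;  -- lag one hour behind the source
START SLAVE SQL_THREAD;
SQL
```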

7. **Distributed Backups**: Using a distributed system to run backups in parallel can speed up the process, especially for large databases. This requires splitting your database across multiple machines, but can significantly reduce backup times.

8. **Hot Backups**: A hot backup involves taking a backup of the database while it is still in use and requires no downtime. This is more complex, but may be necessary for businesses that require 24/7 uptime. There are commercial tools available, as well as open-source tools like Percona XtraBackup.

Keep in mind that the best solution for you is not necessarily the fastest or most space-efficient one; it's the one that meets your business needs, respects your resources, and gives you the best cost-benefit ratio. Backup strategies can be as varied as businesses themselves, so there's no one-size-fits-all solution.

