commit ef5695c36f9ccf554eedd4994b829b96de14a402 Author: Fausto Marzi Date: Tue Jul 22 22:46:31 2014 +0100 Freezer initial commit diff --git a/.gitignore b/.gitignore new file mode 100644 index 00000000..a295864e --- /dev/null +++ b/.gitignore @@ -0,0 +1,2 @@ +*.pyc +__pycache__ diff --git a/.gitreview b/.gitreview new file mode 100644 index 00000000..b40d672b --- /dev/null +++ b/.gitreview @@ -0,0 +1,4 @@ +[gerrit] +host=gerrit.hpcloud.net +port=29418 +project=automation/automation-backup.git diff --git a/CHANGES.rst b/CHANGES.rst new file mode 100644 index 00000000..bb9d4495 --- /dev/null +++ b/CHANGES.rst @@ -0,0 +1,5 @@ +v1.0.5, 2014-05-16 - Freezer initial release. v1.0.6, 2014-05-24: - +Fixed error restore date is not provided. - Changed the datetime format +for --restore-from-date to "yyyy-mm-ddThh:mm:ss" - Created a FAQ.txt +file - Extended use case example in README.txt and Hacking.txt v1.0.7, +2014-06-05: - added multiprocessing support for backup and restore diff --git a/CREDITS.rst b/CREDITS.rst new file mode 100644 index 00000000..1fd71021 --- /dev/null +++ b/CREDITS.rst @@ -0,0 +1,34 @@ +Authors +======= + +- Fausto Marzi +- Ryszard Chojnacki +- Emil Dimitrov + +Maintainers +=========== + +- Fausto Marzi +- Ryszard Chojnacki +- Emil Dimitrov + +Contributors +============ + +- Duncan Thomas +- Coleman Corrigan + +Credits +======= + +- Davide Guerri +- Jim West +- Lars Noldan +- Stephen Pearson +- Sneha Mini +- Chris Delaney +- James Bishop +- Matt Joyce +- Anupriya Ramraj +- HP OpsAuto Team + diff --git a/FAQ.rst b/FAQ.rst new file mode 100644 index 00000000..7b047c66 --- /dev/null +++ b/FAQ.rst @@ -0,0 +1,66 @@ +FAQ +=== + +1) What is freezer? Is a tool to automate data backup and restore + process using OpenStack Swift. + +2) Does freezer support incremental backup? Yes. Incremental backup are + done using GNU tar incremental features + +3) Does freezer check the file contents to establish if a file was + modified or not? No. Freezer check for changes at mtime and ctime in + every file inode to evaluate if a file changed or not. + +4) Why GNU tar rather then rsync? Both approaches are good. Rsync check + the file content, while tar check the file inode. In our + environment, we need to backup directories with size > 300GB and + tens of thousands of files. Rsync approach is effective but slow. + Tar is fast as it needs to check only the file inodes, rather then + the full file content. + +5) Does feezer support encrypted backup? Yes. Freezer encrypt data + using OpenSSL (AES-256-CFB). + +6) Does freezer execute point-in-time backup? Yes. For point in time + backup LVM snapshot feature used. + +7) Can I use freezer on OSX or other OS where GNU Tar is not installed + by default? Yes. For OSX and \*BSD, just install gtar and freezer + automatically will use gtar to execute backup. OS other then Linux, + OSX and \*BSD are currently not supported. + +8) What Application backup does freezer support currently? MongoDB and + File system. + +9) How does the MongoDB backup happens? Freezer required journal + enabled in mongo and lvm volume to execute backup. It checks if the + mongo instance is the master and create lvm snapshot to have + consistent data. + +10) Does freezer manage sparse files efficiently? Yes. Zeroed/null data + is not backed up. So less space and bandwidth will be used. + +11) Does freezer remove automatically backup after some time? Yes. From + command line the --remove-older-then option (days) can be used to + remove objects older then (days). 
12) Does freezer support MySQL backup? MySQL and MariaDB support will
    be included soon.

13) Is any other storage media supported? Not currently. There is a
    plan to add:

- Amazon S3
- Files on a remote host file system
- MongoDB object storage
- Another directory on the local host (e.g. NFS-mounted volumes)

14) Does freezer have any UI or API? Not currently. A UI for OpenStack
    Horizon is currently being developed, as is a REST API.

15) Tar is not capable of detecting files deleted between different
    backup levels. Is freezer capable of doing that? Not currently. We
    are writing a set of tar extensions in Python to overcome this and
    other tar limitations.

diff --git a/HACKING.rst b/HACKING.rst new file mode 100644 index 00000000..b8d074f3 --- /dev/null +++ b/HACKING.rst @@ -0,0 +1,344 @@

========
Freezer
========

Freezer is a Python tool that helps you automate the data backup and
restore process.

The following features are available:

- Back up your filesystem using a snapshot to Swift
- Strong encryption supported: AES-256-CFB
- Back up your file system tree directly (without volume snapshot)
- Back up your journaled MongoDB directory tree using an LVM snapshot to Swift
- Back up MySQL databases with an LVM snapshot
- Restore your data automatically from Swift to your file system
- Low storage consumption, as the backup is uploaded as a stream
- Flexible incremental backup policy
- Data is archived in GNU tar format
- Data compression with gzip
- Remove old backups automatically according to the provided parameters

Requirements
============

- OpenStack Swift account (auth v2 used)
- python >= 2.6 (2.7 advised)
- GNU tar >= 1.26
- gzip
- OpenSSL
- python-swiftclient >= 2.0.3
- python-keystoneclient >= 0.8.0
- pymongo >= 2.6.2 (if MongoDB backups will be executed)
- At least 128 MB of memory reserved for freezer

Installation & Env Setup
========================

Install required packages
-------------------------

Ubuntu / Debian
---------------

Swift client and Keystone client::

    $ sudo apt-get install -y python-swiftclient python-keystoneclient

MongoDB backup::

    $ sudo apt-get install -y python-pymongo

MySQL backup::

    $ sudo apt-get install -y python-mysqldb

Freezer installation from the Python package repo::

    $ sudo pip install freezer

OR::

    $ sudo easy_install freezer

A basic Swift account configuration is needed to use freezer. Make
sure python-swiftclient is installed.

The following environment variables are also needed; you can put them
in ~/.bashrc::

    export OS_REGION_NAME=region-a.geo-1
    export OS_TENANT_ID=
    export OS_PASSWORD=
    export OS_AUTH_URL=https://region-a.geo-1.identity.hpcloudsvc.com:35357/v2.0
    export OS_USERNAME=automationbackup
    export OS_TENANT_NAME=automationbackup

    $ source ~/.bashrc

Let's say you have a container called foobar-container-2; by executing
"swift list" you should see something like::

    $ swift list
    foobar-container-2
    $

These are just example use cases of Swift in the HP Cloud.

*It is strongly advised to execute backups using an LVM snapshot, so
freezer will back up point-in-time data. This avoids the risk of data
inconsistencies and corruption.*

Usage Example
=============

Backup
------

The simplest backup execution is a direct file system backup::

    $ sudo freezerc --file-to-backup /data/dir/to/backup --container new-data-backup \
      --backup-name my-backup-name

By default --mode fs is set.
The command generates a gzip-compressed tar archive of the directory
/data/dir/to/backup. The generated archive is segmented as a stream and
uploaded to the Swift container called new-data-backup, with the backup
name my-backup-name.

Now check that your backup is executing correctly by looking at
/var/log/freezer.log.

Execute a MongoDB backup using an LVM snapshot:

We first need to check which volume group and logical volume our mongo
data is on. This information can be obtained as follows::

    $ mount
    [...]

Once we know the volume our mongo data is mounted on, we can get the
volume group and logical volume info::

    $ sudo vgdisplay
    [...]
    $ sudo lvdisplay
    [...]

We assume our mongo volume is "/dev/mongo/mongolv" and the volume group
is "mongo"::

    $ sudo freezerc --lvm-srcvol /dev/mongo/mongolv --lvm-dirmount /var/lib/snapshot-backup \
      --lvm-volgroup mongo --file-to-backup /var/lib/snapshot-backup/mongod_ops2 \
      --container mongodb-backup-prod --exclude "*.lock" --mode mongo --backup-name mongod-ops2

freezerc now creates an LVM snapshot of the volume /dev/mongo/mongolv. If
no option is provided, the default snapshot name is freezer_backup_snap.
The snapshot volume is mounted automatically on /var/lib/snapshot-backup,
and the backup metadata and segments are uploaded to the container
mongodb-backup-prod with the name mongod-ops2.

Execute a file system backup using an LVM snapshot::

    $ sudo freezerc --lvm-srcvol /dev/jenkins/jenkins-home --lvm-dirmount /var/snapshot-backup \
      --lvm-volgroup jenkins --file-to-backup /var/snapshot-backup \
      --container jenkins-backup-prod --exclude "*.lock" --mode fs --backup-name jenkins-ops2

MySQL backup requires a basic configuration file. The following is an
example of the config::

    $ sudo cat /root/.freezer/db.conf
    host = your.mysql.host.ip
    user = backup
    password =

Every listed option is mandatory. There is no need to stop the mysql
service before the backup execution.

Execute a MySQL backup using an LVM snapshot::

    $ sudo freezerc --lvm-srcvol /dev/mysqlvg/mysqlvol --lvm-dirmount /var/snapshot-backup \
      --lvm-volgroup mysqlvg --file-to-backup /var/snapshot-backup \
      --mysql-conf /root/.freezer/freezer-mysql.conf --container mysql-backup-prod \
      --mode mysql --backup-name mysql-ops002

All freezerc activity is logged to /var/log/freezer.log.

Restore
-------

As a general rule, when you execute a restore, the application that
writes or reads the data should be stopped.

There are 3 main options that need to be set for a data restore.

File System Restore:

Execute a file system restore of the backup name adminui.git::

    $ sudo freezerc --container foobar-container-2 --backup-name adminui.git \
      --restore-from-host git-HP-DL380-host-001 --restore-abs-path /home/git/repositories/adminui.git/ \
      --restore-from-date "2014-05-23T23:23:23"

MySQL restore:

Execute a MySQL restore of the backup name holly-mysql.
Let's stop the mysql service first::

    $ sudo service mysql stop

Execute the restore::

    $ sudo freezerc --container foobar-container-2 --backup-name holly-mysql \
      --restore-from-host db-HP-DL380-host-001 --restore-abs-path /var/lib/mysql \
      --restore-from-date "2014-05-23T23:23:23"

And finally restart mysql::

    $ sudo service mysql start

Execute a MongoDB restore of the backup name mongobigdata::

    $ sudo freezerc --container foobar-container-2 --backup-name mongobigdata \
      --restore-from-host db-HP-DL380-host-001 --restore-abs-path /var/lib/mongo \
      --restore-from-date "2014-05-23T23:23:23"

Architecture
============

The Freezer architecture is simple. The components are:

- OpenStack Swift (the storage)
- the freezer client, running on the node where you want to execute the
  backups or restores

Freezer uses GNU tar under the hood to execute incremental backups and
restores. When a key is provided, it uses OpenSSL to encrypt the data
(AES-256-CFB).

Low resource requirements
-------------------------

Freezer is designed to minimize I/O, CPU and memory usage. This is
achieved by generating a data stream from tar (for archiving) and gzip
(for compressing). Freezer splits the stream into segments of a
configurable chunk size (with the option --max-seg-size). The default
segment size is 128MB, so it can be safely held in memory, encrypted if
a key is provided, and uploaded to Swift as a segment.

Multiple segments are uploaded sequentially using a Swift manifest. All
the segments are uploaded first, and only then is the manifest file
uploaded, so the data segments cannot be accessed before the upload is
complete. This ensures data consistency.

By keeping only small segments in memory, I/O usage is reduced. Also,
since there is no need to store the final compressed archive
(tar-gzipped) locally, no additional or dedicated storage is required
for the backup execution. The only additional storage needed is the LVM
snapshot size (5GB by default). The LVM snapshot size can be set with
the option --lvm-snapsize. It is important not to specify a snapshot
size that is too small: if enough data is written to the source volume
to fill up the LVM snapshot, the backup data will be corrupted.

If more memory is available for the backup process, the maximum segment
size can be increased; this will speed up the process. Please note that
segments must be smaller than 5GB, as that is the maximum object size
on the Swift server.

Conversely, if a server has little available memory, the --max-seg-size
option can be set to a lower value. The unit of this option is bytes.

How the incremental works
-------------------------

Incremental backup is one of the most crucial features. The following
basic logic happens when Freezer executes:

1) Freezer starts the execution and checks whether the provided backup
   name for the current node already exists in Swift.

2) If the backup exists, the Manifest file is retrieved. This is
   important, as the Manifest file contains the information about the
   previous Freezer execution.
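For illustration, this metadata can be read back with python-swiftclient's
head_object call; the following is only a minimal sketch (the container and
object names are hypothetical, and freezer's own Swift interaction is
implemented in freezer/swift.py)::

    import os

    from swiftclient import client as swift_client

    # Reuse the credentials exported in the environment (Keystone auth v2)
    conn = swift_client.Connection(
        authurl=os.environ['OS_AUTH_URL'],
        user=os.environ['OS_USERNAME'],
        key=os.environ['OS_PASSWORD'],
        tenant_name=os.environ['OS_TENANT_NAME'],
        auth_version='2')

    # head_object returns the object headers, i.e. the manifest metadata
    headers = conn.head_object('freezer-container',
                               'hostname_backupname_timestamp_level')
    print(headers.get('x-object-meta-backup-current-level'))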
+ +The following is what the Swift Manifest looks like:: + +:: + + { + 'X-Object-Meta-Encrypt-Data': 'Yes', + 'X-Object-Meta-Segments-Size-Bytes': '134217728', + 'X-Object-Meta-Backup-Created-Timestamp': '1395734461', + 'X-Object-Meta-Remove-Backup-Older-Than-Days': '', + 'X-Object-Meta-Src-File-To-Backup': '/var/lib/snapshot-backup/mongod_dev-mongo-s1', + 'X-Object-Meta-Maximum-Backup-level': '0', + 'X-Object-Meta-Always-Backup-Level': '', + 'X-Object-Manifest': u'socorro-backup-dev_segments/dev-mongo-s1-r1_mongod_dev-mongo-s1_1395734461_0', + 'X-Object-Meta-Providers-List': 'HP', + 'X-Object-Meta-Backup-Current-Level': '0', + 'X-Object-Meta-Abs-File-Path': '', + 'X-Object-Meta-Backup-Name': 'mongod_dev-mongo-s1', + 'X-Object-Meta-Tar-Meta-Obj-Name': 'tar_metadata_dev-mongo-s1-r1_mongod_dev-mongo-s1_1395734461_0', + 'X-Object-Meta-Hostname': 'dev-mongo-s1-r1', + 'X-Object-Meta-Container-Segments': 'socorro-backup-dev_segments' + } + +3) The most relevant data taken in consideration for incremental are: + +- 'X-Object-Meta-Maximum-Backup-level': '7' + +Value set by the option: --max-level int + +Assuming we are executing the backup daily, let's say managed from the +crontab, the first backup will start from Level 0, that is, a full +backup. At every daily execution, the current backup level will be +incremented by 1. Then current backup level is equal to the maximum +backup level, then the backup restart to level 0. That is, every week a +full backup will be executed. + +- 'X-Object-Meta-Always-Backup-Level': '' + +Value set by the option: --always-level int + +When current level is equal to 'Always-Backup-Level', every next backup +will be executed to the specified level. Let's say --always-level is set +to 1, the first backup will be a level 0 (complete backup) and every +next execution will backup the data exactly from the where the level 0 +ended. The main difference between Always-Backup-Level and +Maximum-Backup-level is that the counter level doesn't restart from +level 0 + +- 'X-Object-Manifest': + u'socorro-backup-dev/dev-mongo-s1-r1\_mongod\_dev-mongo-s1\_1395734461\_0' + +Through this meta data, we can identify the exact Manifest name of the +provided backup name. The syntax is: +container\_name/hostname\_backup\_name\_timestamp\_initiallevel + +- 'X-Object-Meta-Providers-List': 'HP' + +This option is NOT implemented yet The idea of Freezer is to support +every Cloud provider that provide Object Storage service using OpenStack +Swift. The meta data allows you to specify multiple provider and +therefore store your data in different Geographic location. + +- 'X-Object-Meta-Backup-Current-Level': '0' + +Record the current backup level. This is important as the value is +incremented by 1 in the next freezer execution. + +- 'X-Object-Meta-Backup-Name': 'mongod\_dev-mongo-s1' + +Value set by the option: -N BACKUP\_NAME, --backup-name BACKUP\_NAME The +option is used to identify the backup. It is a mandatory option and +fundamental to execute incremental backup. 'Meta-Backup-Name' and +'Meta-Hostname' are used to uniquely identify the current and next +incremental backups + +- 'X-Object-Meta-Tar-Meta-Obj-Name': + 'tar\_metadata\_dev-mongo-s1-r1\_mongod\_dev-mongo-s1\_1395734461\_0' + +Freezer use tar to execute incremental backup. What tar do is to store +in a meta data file the inode information of every file archived. Thus, +on the next Freezer execution, the tar meta data file is retrieved and +download from swift and it is used to generate the next backup level. 
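The mechanism is essentially GNU tar's --listed-incremental snapshot file.
A rough, illustrative sketch is shown below (freezer builds its own tar
command line in freezer/tar.py, so the flags and file names here are only
an example)::

    import subprocess

    # Level 0: a fresh snapshot (metadata) file produces a full archive
    subprocess.check_call(
        ['tar', '--create', '--gzip', '--file', 'backup_level0.tar.gz',
         '--listed-incremental', 'tar_metadata.snar', '/data/dir/to/backup'])

    # A later run reusing the same metadata file archives only the files
    # whose inode information changed, i.e. the next backup level
    subprocess.check_call(
        ['tar', '--create', '--gzip', '--file', 'backup_level1.tar.gz',
         '--listed-incremental', 'tar_metadata.snar', '/data/dir/to/backup'])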
+After the next level backup execution is terminated, the file update tar +meta data file will be uploaded and recorded in the Manifest file. The +naming convention used for this file is: +tar\_metadata\_backupname\_hostname\_timestamp\_backuplevel + +- 'X-Object-Meta-Hostname': 'dev-mongo-s1-r1' + +The hostname of the node where the Freezer perform the backup. This meta +data is important to identify a backup with a specific node, thus avoid +possible confusion and associate backup to the wrong node. diff --git a/INSTALL.rst b/INSTALL.rst new file mode 100644 index 00000000..5cd4a4ae --- /dev/null +++ b/INSTALL.rst @@ -0,0 +1,16 @@ +Install instructions +==================== + +Install from sources:: +---------------------- + +You have now the freezerc tool installed in /usr/local/bin/freezerc + +Please execute the following command to all the available options: + +$ freezerc --help [...] + +Please read README.txt or HACKING.txt to see the requirement and more +technical details about how to run freezer + +Thanks, The Freezer Team. diff --git a/LICENSE.rst b/LICENSE.rst new file mode 100644 index 00000000..1d7b6740 --- /dev/null +++ b/LICENSE.rst @@ -0,0 +1,19 @@ +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); you may +not use this file except in compliance with the License. You may obtain +a copy of the License at + +:: + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). This product includes software written by Tim +Hudson (tjh@cryptsoft.com). diff --git a/MANIFEST b/MANIFEST new file mode 100644 index 00000000..d47a61b8 --- /dev/null +++ b/MANIFEST @@ -0,0 +1,18 @@ +# file GENERATED by distutils, do NOT edit +CHANGES.txt +CREDITS.txt +HACKING.txt +LICENSE.txt +README.txt +TODO.txt +setup.py +bin/freezerc +freezer/__init__.py +freezer/arguments.py +freezer/backup.py +freezer/lvm.py +freezer/main.py +freezer/restore.py +freezer/swift.py +freezer/tar.py +freezer/utils.py diff --git a/MANIFEST.in b/MANIFEST.in new file mode 100644 index 00000000..5885706c --- /dev/null +++ b/MANIFEST.in @@ -0,0 +1,5 @@ +include *.rst +include *.txt +include bin/freezerc +recursive-include docs *.rst +recursive-include freezer *.py diff --git a/README.rst b/README.rst new file mode 100644 index 00000000..f5647cb9 --- /dev/null +++ b/README.rst @@ -0,0 +1,357 @@ +======= +Freezer +======= + +Freezer is a Python tool that helps you to automate the data backup and +restore process. 
+ +The following features are avaialble: + +- Backup your filesystem using snapshot to swift +- Strong encryption supported: AES-256-CFB +- Backup your file system tree directly (without volume snapshot) +- Backup your journaled MongoDB directory tree using lvm snap to swift +- Backup MySQL DB with lvm snapshot +- Restore your data automatically from Swift to your file system +- Low storage consumption as the backup are uploaded as a stream +- Flexible Incremental backup policy +- Data is archived in GNU Tar format +- Data compression with gzip +- Remove old backup automatically according the provided parameters + +Requirements +============ + +- OpenStack Swift Account (Auth V2 used) +- python >= 2.6 (2.7 advised) +- GNU Tar >= 1.26 +- gzip +- OpenSSL +- python-swiftclient >= 2.0.3 +- python-keystoneclient >= 0.8.0 +- pymongo >= 2.6.2 (if MongoDB backups will be executed) +- At least 128 MB of memory reserved for freezer + +Installation & Env Setup +======================== + +Install required packages +------------------------- + +Ubuntu / Debian +--------------- + +Swift client and Keystone client:: + + $ sudo apt-get install -y python-swiftclient python-keystoneclient + +MongoDB backup:: + + $ sudo apt-get install -y python-pymongo + +MySQL backup:: + + $ sudo apt-get install -y python-mysqldb + +Freezer installation from Python package repo:: + + $ sudo pip install freezer + +OR:: + + $ sudo easy\_install freezer + +The basic Swift account configuration is needed to use freezer. Make +sure python-swiftclient is installed. + +Also the following ENV var are needed you can put them in ~/.bashrc:: + + export OS_REGION_NAME=region-a.geo-1 + export OS_TENANT_ID= + export OS_PASSWORD= + export OS_AUTH_URL=https://region-a.geo-1.identity.hpcloudsvc.com:35357/v2.0 + export OS_USERNAME=automationbackup + export OS_TENANT_NAME=automationbackup + + $ source ~/.barshrc + +Let's say you have a container called foobar-contaienr, by executing +"swift list" you should see something like:: + + $ swift list + foobar-container-2 + $ + +These are just use case example using Swift in the HP Cloud. + +*Is strongly advised to use execute a backup using LVM snapshot, so +freezer will execute a backup on point-in-time data. This avoid risks of +data inconsistencies and corruption.* + +Usage Example +============= + +Backup +------ + +The most simple backup execution is a direct file system backup:: + + $ sudo freezerc --file-to-backup /data/dir/to/backup + --container new-data-backup --backup-name my-backup-name + +By default --mode fs is set. The command would generate a compressed tar +gzip file of the directory /data/dir/to/backup. The generated file will +be segmented in stream and uploaded in the swift container called +new-data-backup, with backup name my-backup-name + +Now check if your backup is executing correctly looking at +/var/log/freezer.log + +Execute a MongoDB backup using lvm snapshot: + +We need to check before on which volume group and logical volume our +mongo data is. These information can be obtained as per following:: + + $ mount + [...] + +Once we know the volume where our mongo data is mounted on, we can get +the volume group and logical volume info:: + + $ sudo vgdisplay + [...] + $ sudo lvdisplay + [...] 
+ +We assume our mongo volume is "/dev/mongo/mongolv" and the volume group +is "mongo":: + + $ sudo freezerc --lvm-srcvol /dev/mongo/mongolv --lvm-dirmount /var/lib/snapshot-backup + --lvm-volgroup mongo --file-to-backup /var/lib/snapshot-backup/mongod_ops2 + --container mongodb-backup-prod --exclude "*.lock" --mode mongo --backup-name mongod-ops2 + +Now freezerc create a lvm snapshot of the volume /dev/mongo/mongolv. If +no options are provided, default snapshot name is freezer\_backup\_snap. +The snap vol will be mounted automatically on /var/lib/snapshot-backup +and the backup meta and segments will be upload in the container +mongodb-backup-prod with the namemongod-ops2. + +Execute a file system backup using lvm snapshot:: + + $ sudo freezerc --lvm-srcvol /dev/jenkins/jenkins-home --lvm-dirmount + /var/snapshot-backup --lvm-volgroup jenkins + --file-to-backup /var/snapshot-backup --container jenkins-backup-prod + --exclude "\*.lock" --mode fs --backup-name jenkins-ops2 + +MySQL backup require a basic configuration file. The following is an +example of the config:: + + $ sudo cat /root/.freezer/db.conf + host = your.mysql.host.ip + user = backup + password = userpassword + +Every listed option is mandatory. There's no need to stop the mysql +service before the backup execution. + +Execute a MySQL backup using lvm snapshot:: + + $ sudo freezerc --lvm-srcvol /dev/mysqlvg/mysqlvol + --lvm-dirmount /var/snapshot-backup + --lvm-volgroup mysqlvg --file-to-backup /var/snapshot-backup + --mysql-conf /root/.freezer/freezer-mysql.conf--container + mysql-backup-prod --mode mysql --backup-name mysql-ops002 + +All the freezerc activities are logged into /var/log/freezer.log. + +Restore +------- + +As a general rule, when you execute a restore, the application that +write or read data should be stopped. + +There are 3 main options that need to be set for data restore + +File System Restore: + +Execute a file system restore of the backup name +adminui.git:: + + $ sudo freezerc --container foobar-container-2 + --backup-name adminui.git + --restore-from-host git-HP-DL380-host-001 --restore-abs-path + /home/git/repositories/adminui.git/ + --restore-from-date "23-05-2014T23:23:23" + +MySQL restore: + +Execute a MySQL restore of the backup name holly-mysql. +Let's stop mysql service first:: + + $ sudo service mysql stop + +Execute Restore:: + + $ sudo freezerc --container foobar-container-2 + --backup-name mysq-prod --restore-from-host db-HP-DL380-host-001 + --restore-abs-path /var/lib/mysql --restore-from-date "23-05-2014T23:23:23" + +And finally restart mysql:: + + $ sudo service mysql start + +Execute a MongoDB restore of the backup name mongobigdata:: + + $ sudo freezerc --container foobar-container-2 --backup-name mongobigdata + --restore-from-host db-HP-DL380-host-001 --restore-abs-path + /var/lib/mongo --restore-from-date "23-05-2014T23:23:23" + +Architecture +============ + +Freezer architecture is simple. The components are: + +- OpenStack Swift (the storage) +- freezer client running on the node you want to execute the backups or + restore + +Frezeer use GNU Tar under the hood to execute incremental backup and +restore. When a key is provided, it uses OpenSSL to encrypt data +(AES-256-CFB) + +Low resources requirement +------------------------- + +Freezer is designed to reduce at the minimum I/O, CPU and Memory Usage. +This is achieved by generating a data stream from tar (for archiving) +and gzip (for compressing). 
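A minimal sketch of this streaming approach is shown below; it is only
illustrative (the upload helper is hypothetical, and freezer's real stream
handling lives in freezer/tar.py and freezer/swift.py), but it shows how the
archive can be consumed chunk by chunk without touching local disk::

    import subprocess

    CHUNK_SIZE = 128 * 1024 * 1024  # same default as --max-seg-size

    def upload_segment(data, index):
        # Placeholder for the Swift upload of one segment (put_object on
        # the segments container); hypothetical helper for illustration.
        pass

    # tar writes a gzip-compressed archive of the directory to stdout
    tar_proc = subprocess.Popen(
        ['tar', '--create', '--gzip', '--file', '-', '/data/dir/to/backup'],
        stdout=subprocess.PIPE)

    segment_index = 0
    while True:
        chunk = tar_proc.stdout.read(CHUNK_SIZE)
        if not chunk:
            break
        # Each chunk can be encrypted (if a key is provided) and uploaded
        # to Swift as one segment object.
        upload_segment(chunk, segment_index)
        segment_index += 1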
Freezer segment the stream in a configurable +chunk size (with the option --max-seg-size). The default segment size is +128MB, so it can be safely stored in memory, encrypted if the key is +provided, and uploaded to Swift as segment. + +Multiple segments are sequentially uploaded using the Swift Manifest. +All the segments are uploaded first, and then the Manifest file is +uploaded too, so the data segments cannot be accessed directly. This +ensue data consistency. + +By keeping small segments in memory, I/O usage is reduced. Also as +there's no need to store locally the final compressed archive +(tar-gziped), no additional or dedicated storage is required for the +backup execution. The only additional storage needed is the LVM snapshot +size (set by default at 5GB). The lvm snapshot size can be set with the +option --lvm-snapsize. It is important to not specify a too small snap +size, because in case a quantity of data is being wrote to the source +volume and consequently the lvm snapshot is filled up, then the data is +corrupted. + +If the more memory is available for the backup process, the maximum +segment size can be increased, this will speed up the process. Please +note, the segments must be smaller then 5GB, is that is the maximum +object size in the Swift server. + +Au contraire, if a server have small memory availability, the +--max-seg-size option can be set to lower values. The unit of this +option is in bytes. + +How the incremental works +------------------------- + +The incremental backups is one of the most crucial feature. The +following basic logic happens when Freezer execute: + +1) Freezer start the execution and check if the provided backup name for + the current node already exist in Swift + +2) If the backup exists, the Manifest file is retrieved. This is + important as the Manifest file contains the information of the + previous Freezer execution. + +The following is what the Swift Manifest looks like:: + + { + 'X-Object-Meta-Encrypt-Data': 'Yes', + 'X-Object-Meta-Segments-Size-Bytes': '134217728', + 'X-Object-Meta-Backup-Created-Timestamp': '1395734461', + 'X-Object-Meta-Remove-Backup-Older-Than-Days': '', + 'X-Object-Meta-Src-File-To-Backup': '/var/lib/snapshot-backup/mongod_dev-mongo-s1', + 'X-Object-Meta-Maximum-Backup-level': '0', + 'X-Object-Meta-Always-Backup-Level': '', + 'X-Object-Manifest': u'socorro-backup-dev_segments/dev-mongo-s1-r1_mongod_dev-mongo-s1_1395734461_0', + 'X-Object-Meta-Providers-List': 'HP', + 'X-Object-Meta-Backup-Current-Level': '0', + 'X-Object-Meta-Abs-File-Path': '', + 'X-Object-Meta-Backup-Name': 'mongod_dev-mongo-s1', + 'X-Object-Meta-Tar-Meta-Obj-Name': 'tar_metadata_dev-mongo-s1-r1_mongod_dev-mongo-s1_1395734461_0', + 'X-Object-Meta-Hostname': 'dev-mongo-s1-r1', + 'X-Object-Meta-Container-Segments': 'socorro-backup-dev_segments' + } + +3) The most relevant data taken in consideration for incremental are: + +- 'X-Object-Meta-Maximum-Backup-level': '7' + +Value set by the option: --max-level int + +Assuming we are executing the backup daily, let's say managed from the +crontab, the first backup will start from Level 0, that is, a full +backup. At every daily execution, the current backup level will be +incremented by 1. Then current backup level is equal to the maximum +backup level, then the backup restart to level 0. That is, every week a +full backup will be executed. 
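Expressed as code, the rotation implied by --max-level works roughly as
follows (an illustrative sketch only, not freezer's actual implementation)::

    def next_backup_level(current_level, max_level):
        # Restart from a full (level 0) backup once the maximum level has
        # been reached; otherwise increase the incremental level by one.
        if current_level >= max_level:
            return 0
        return current_level + 1

    # With --max-level 7 and one run per day: 0, 1, 2, ... 7, 0, 1, ...
    assert next_backup_level(3, 7) == 4
    assert next_backup_level(7, 7) == 0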
+ +- 'X-Object-Meta-Always-Backup-Level': '' + +Value set by the option: --always-level int + +When current level is equal to 'Always-Backup-Level', every next backup +will be executed to the specified level. Let's say --always-level is set +to 1, the first backup will be a level 0 (complete backup) and every +next execution will backup the data exactly from the where the level 0 +ended. The main difference between Always-Backup-Level and +Maximum-Backup-level is that the counter level doesn't restart from +level 0 + +- 'X-Object-Manifest': + u'socorro-backup-dev/dev-mongo-s1-r1\_mongod\_dev-mongo-s1\_1395734461\_0' + +Through this meta data, we can identify the exact Manifest name of the +provided backup name. The syntax is: +container\_name/hostname\_backup\_name\_timestamp\_initiallevel + +- 'X-Object-Meta-Providers-List': 'HP' + +This option is NOT implemented yet The idea of Freezer is to support +every Cloud provider that provide Object Storage service using OpenStack +Swift. The meta data allows you to specify multiple provider and +therefore store your data in different Geographic location. + +- 'X-Object-Meta-Backup-Current-Level': '0' + +Record the current backup level. This is important as the value is +incremented by 1 in the next freezer execution. + +- 'X-Object-Meta-Backup-Name': 'mongod\_dev-mongo-s1' + +Value set by the option: -N BACKUP\_NAME, --backup-name BACKUP\_NAME The +option is used to identify the backup. It is a mandatory option and +fundamental to execute incremental backup. 'Meta-Backup-Name' and +'Meta-Hostname' are used to uniquely identify the current and next +incremental backups + +- 'X-Object-Meta-Tar-Meta-Obj-Name': + 'tar\_metadata\_dev-mongo-s1-r1\_mongod\_dev-mongo-s1\_1395734461\_0' + +Freezer use tar to execute incremental backup. What tar do is to store +in a meta data file the inode information of every file archived. Thus, +on the next Freezer execution, the tar meta data file is retrieved and +download from swift and it is used to generate the next backup level. +After the next level backup execution is terminated, the file update tar +meta data file will be uploaded and recorded in the Manifest file. The +naming convention used for this file is: +tar\_metadata\_backupname\_hostname\_timestamp\_backuplevel + +- 'X-Object-Meta-Hostname': 'dev-mongo-s1-r1' + +The hostname of the node where the Freezer perform the backup. This meta +data is important to identify a backup with a specific node, thus avoid +possible confusion and associate backup to the wrong node. diff --git a/TODO.rst b/TODO.rst new file mode 100644 index 00000000..8b137891 --- /dev/null +++ b/TODO.rst @@ -0,0 +1 @@ + diff --git a/bin/freezerc b/bin/freezerc new file mode 100755 index 00000000..ac495313 --- /dev/null +++ b/bin/freezerc @@ -0,0 +1,82 @@ +#!/usr/bin/env python +''' +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). 
This product includes software written by Tim +Hudson (tjh@cryptsoft.com). +======================================================================== + +Freezer offer the following features: +[*] Backup your filesystem using lvm snapshot to swift +[*] Data Encryption (AES-256-CFB) +[*] Backup your file system tree directly (without volume snapshot) +[*] Backup your journaled mongodb directory tree using lvm snap to swift +[*] Backup MySQL DB with lvm snapshot +[*] Restore automatically your data from swift to your filesystems +[*] Low storage consumption as the backup are uploaded as a stream +[*] Flexible Incremental backup policy +''' + +from freezer.main import freezer_main +from freezer.arguments import backup_arguments + +import os +import subprocess +import logging +import sys + +# Initialize backup options +(backup_args, arg_parse) = backup_arguments() + +# Configure logging +logging.basicConfig( + filename=backup_args.log_file, + level=logging.INFO, + format='%(asctime)s %(name)s %(levelname)s %(message)s') + +# Try to execute freezer with highest priority execution if +# backup_args.max_priority is True. New priority will be set for child +# processes too, as niceness is inherited from father +if backup_args.max_priority: + try: + logging.warning( + '[*] Setting freezer execution with high CPU and I/O priority') + PID = os.getpid() + # Set cpu priority + os.nice(-19) + # Set I/O Priority to Real Time class with level 0 + subprocess.call( + [u'{0}'.format(backup_args.ionice), + u'-c', u'1', u'-n', u'0', u'-t', u'-p', u'{0}'.format(PID)]) + except Exception as priority_error: + logging.warning('[*] Priority: {0}'.format(priority_error)) + +if backup_args.version: + print "freezer version {0}".format(backup_args.__version__) + sys.exit(1) + +if len(sys.argv) < 2: + arg_parse.print_help() + sys.exit(1) + +if __name__ == '__main__': + + try: + freezer_main(backup_args) + except ValueError as err: + logging.critical('[*] ValueError: {0}'.format(err)) + except Exception as err: + logging.critical('[*] Error: {0}'.format(err)) diff --git a/freezer/__init__.py b/freezer/__init__.py new file mode 100644 index 00000000..c338da34 --- /dev/null +++ b/freezer/__init__.py @@ -0,0 +1,20 @@ +''' +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). This product includes software written by Tim +Hudson (tjh@cryptsoft.com). +======================================================================== +''' diff --git a/freezer/arguments.py b/freezer/arguments.py new file mode 100644 index 00000000..58a28e91 --- /dev/null +++ b/freezer/arguments.py @@ -0,0 +1,244 @@ +''' +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. 
+You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). This product includes software written by Tim +Hudson (tjh@cryptsoft.com). +======================================================================== + +Arguments and general parameters definitions +''' + +import sys +import argparse +import distutils.spawn as distspawn +import os +import logging + + +def backup_arguments(): + ''' + Default arguments and command line options interface. The function return + a name space called backup_args. + ''' + arg_parser = argparse.ArgumentParser(prog='freezerc') + arg_parser.add_argument( + '-F', '--file-to-backup', action='store', + help="The file or directory you want to back up to Swift", + dest='src_file', default=False) + arg_parser.add_argument( + '-N', '--backup-name', action='store', + help="The backup name you want to use to identify your backup \ + on Swift", dest='backup_name', default=False) + arg_parser.add_argument( + '-m', '--mode', action='store', + help="Set the technology to back from. Options are, fs (filesystem),\ + mongo (MongoDB), mysql (MySQL). Default set to fs", dest='mode', + default='fs') + arg_parser.add_argument( + '-C', '--container', action='store', + help="The Swift container used to upload files to", + dest='container', default=False) + arg_parser.add_argument( + '-L', '--list-containers', action='store_true', + help='''List the Swift containers on remote Object Storage Server''', + dest='list_container', default=False) + arg_parser.add_argument( + '-l', '--list-objects', action='store_true', + help='''List the Swift objects stored in a container on remote Object\ + Storage Server.''', dest='list_objects', default=False) + arg_parser.add_argument( + '-o', '--get-object', action='store', + help="The Object Name you want to download. This options is mandatory \ + when --restore is used", dest='object', default=False) + arg_parser.add_argument( + '-d', '--dst-file', action='store', + help="The file name used to save the object on your local disk and\ + upload file in swift", dest='dst_file', default=False) + arg_parser.add_argument( + '--lvm-srcvol', action='store', + help="Set the lvm volume you want to take a snaphost from. Default\ + no volume", dest='lvm_srcvol', default=False) + arg_parser.add_argument( + '--lvm-snapname', action='store', + help="Set the lvm snapshot name to use. If the snapshot name already\ + exists, the old one will be used a no new one will be created. Default\ + freezer_backup_snap.", dest='lvm_snapname', default=False) + arg_parser.add_argument( + '--lvm-snapsize', action='store', + help="Set the lvm snapshot size when creating a new snapshot.\ + Please add G for Gigabytes or M for Megabytes, i.e. 500M or 8G.\ + Default 5G.", dest='lvm_snapsize', default=False) + arg_parser.add_argument( + '--lvm-dirmount', action='store', + help="Set the directory you want to mount the lvm snapshot to.\ + Default not set", dest='lvm_dirmount', default=False) + arg_parser.add_argument( + '--lvm-volgroup', action='store', + help="Specify the volume group of your logical volume.\ + This is important to mount your snapshot volume. 
Default not set", + dest='lvm_volgroup', default=False) + arg_parser.add_argument( + '--max-level', action='store', + help="Set the backup level used with tar to implement incremental \ + backup. If a level 1 is specified but no level 0 is already \ + available, a level 0 will be done and subesequently backs to level 1.\ + Default 0 (No Incremental)", dest='max_backup_level', + type=int, default=False) + arg_parser.add_argument( + '--always-level', action='store', help="Set backup\ + maximum level used with tar to implement incremental backup. If a \ + level 3 is specified, the backup will be executed from level 0 to \ + level 3 and to that point always a backup level 3 will be executed. \ + It will not restart from level 0. This option has precedence over \ + --max-backup-level. Default False (Disabled)", + dest='always_backup_level', type=int, default=False) + arg_parser.add_argument( + '--restart-always-level', action='store', help="Restart the backup \ + from level 0 after n days. Valid only if --always-level option \ + if set. If --always-level is used together with --remove-older-then, \ + there might be the chance where the initial level 0 will be removed \ + Default False (Disabled)", + dest='restart_always_backup', type=float, default=False) + arg_parser.add_argument( + '-R', '--remove-older-then', action='store', + help="Checks in the specified container for object older then the \ + specified days. If i.e. 30 is specified, it will remove the remote \ + object older than 30 days. Default False (Disabled)", + dest='remove_older_than', type=float, default=False) + arg_parser.add_argument( + '--no-incremental', action='store_true', + help='''Disable incremantal feature. By default freezer build the + meta data even for level 0 backup. By setting this option incremental + meta data is not created at all. Default disabled''', + dest='no_incremental', default=False) + arg_parser.add_argument( + '--hostname', action='store', + help='''Set hostname to execute actions. If you are executing freezer + from one host but you want to delete objects belonging to another + host then you can set this option that hostname and execute appropriate + actions. Default current node hostname.''', + dest='hostname', default=False) + arg_parser.add_argument( + '--mysql-conf', action='store', + help='''Set the MySQL configuration file where freezer retrieve + important information as db_name, user, password, host. + Following is an example of config file: + # cat ~/.freezer/backup_mysql_conf + host = + user = + password = ''', + dest='mysql_conf_file', default=False) + arg_parser.add_argument( + '--log-file', action='store', + help='Set log file. By default logs to /var/log/freezer.log', + dest='log_file', default='/var/log/freezer.log') + arg_parser.add_argument( + '--exclude', action='store', help="Exclude files,\ + given as a PATTERN.Ex: --exclude '*.log' will exclude any file with \ + name ending with .log. Default no exclude", dest='exclude', + default=False) + arg_parser.add_argument( + '-U', '--upload', action='store_true', + help="Upload to Swift the destination file passed to the -d option.\ + Default upload the data", dest='upload', default=True) + arg_parser.add_argument( + '--encrypt-pass-file', action='store', + help="Passing a private key to this option, allow you to encrypt the \ + files before to be uploaded in Swift. 
Default do not encrypt.", + dest='encrypt_pass_file', default=False) + arg_parser.add_argument( + '-M', '--max-segment-size', action='store', + help="Set the maximum file chunk size in bytes to upload to swift\ + Default 134217728 bytes (128MB)", + dest='max_seg_size', type=int, default=134217728) + arg_parser.add_argument( + '--restore-abs-path', action='store', + help='Set the absolute path where you want your data restored. \ + option --restore need to be set along with --get-object in order \ + to execute data restore. Default False.', + dest='restore_abs_path', default=False) + arg_parser.add_argument( + '--restore-from-host', action='store', + help='''Set the hostname used to identify the data you want to restore + from. If you want to restore data in the same host where the backup + was executed just type from your shell: "$ hostname" and the output is + the value that needs to be passed to this option. Mandatory with + Restore Default False.''', dest='restore_from_host', default=False) + arg_parser.add_argument( + '--restore-from-date', action='store', + help='''Set the absolute path where you want your data restored. + Please provide datime in forma "YYYY-MM-DDThh:mm:ss" + i.e. "1979-10-03T23:23:23". Make sure the "T" is between date and time + Default False.''', dest='restore_from_date', default=False) + arg_parser.add_argument( + '--max-priority', action='store_true', + help='''Set the cpu process to the highest priority (i.e. -20 on Linux) + and real-time for I/O. The process priority will be set only if nice + and ionice are installed Default disabled. Use with caution.''', + dest='max_priority', default=False) + arg_parser.add_argument( + '-V', '--version', action='store_true', + help='''Print the release version and exit''', + dest='version', default=False) + + backup_args = arg_parser.parse_args() + # Set additional namespace attributes + backup_args.__dict__['remote_match_backup'] = [] + backup_args.__dict__['remote_objects'] = [] + backup_args.__dict__['remote_obj_list'] = [] + backup_args.__dict__['remote_newest_backup'] = u'' + # Set default workdir to ~/.freezer + backup_args.__dict__['workdir'] = os.path.expanduser(u'~/.freezer') + # Create a new namespace attribute for container_segments + backup_args.__dict__['container_segments'] = u'{0}_segments'.format( + backup_args.container) + # If hostname is not set, hostname of the current node will be used + if not backup_args.hostname: + backup_args.__dict__['hostname'] = os.uname()[1] + backup_args.__dict__['manifest_meta_dict'] = {} + backup_args.__dict__['curr_backup_level'] = '' + backup_args.__dict__['manifest_meta_dict'] = '' + backup_args.__dict__['tar_path'] = distspawn.find_executable('tar') + # If freezer is being used under OSX, please install gnutar and + # rename the executable as gnutar + if 'darwin' in sys.platform or 'bsd' in sys.platform: + if distspawn.find_executable('gtar'): + backup_args.__dict__['tar_path'] = \ + distspawn.find_executable('gtar') + else: + logging.critical('[*] Please install gnu tar (gtar) as it is a \ + mandatory requirement to use freezer.') + raise Exception + + # Get absolute path of other commands used by freezer + backup_args.__dict__['lvcreate_path'] = distspawn.find_executable( + 'lvcreate') + backup_args.__dict__['lvremove_path'] = distspawn.find_executable( + 'lvremove') + backup_args.__dict__['bash_path'] = distspawn.find_executable('bash') + backup_args.__dict__['openssl_path'] = distspawn.find_executable('openssl') + backup_args.__dict__['file_path'] = 
distspawn.find_executable('file') + backup_args.__dict__['mount_path'] = distspawn.find_executable('mount') + backup_args.__dict__['umount_path'] = distspawn.find_executable('umount') + backup_args.__dict__['ionice'] = distspawn.find_executable('ionice') + + # MySQLdb object + backup_args.__dict__['mysql_db_inst'] = '' + + # Freezer version + backup_args.__dict__['__version__'] = '1.0.8' + + return backup_args, arg_parser diff --git a/freezer/backup.py b/freezer/backup.py new file mode 100644 index 00000000..681a0ac3 --- /dev/null +++ b/freezer/backup.py @@ -0,0 +1,179 @@ +''' +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). This product includes software written by Tim +Hudson (tjh@cryptsoft.com). +======================================================================== + +Freezer Backup modes related functions +''' + +from freezer.lvm import lvm_snap, lvm_snap_remove +from freezer.tar import tar_backup, gen_tar_command +from freezer.swift import add_object, manifest_upload +from freezer.utils import gen_manifest_meta, add_host_name_ts_level + +from multiprocessing import Process, Queue +import logging +import os + + +def backup_mode_mysql(backup_opt_dict, time_stamp, manifest_meta_dict): + ''' + Execute a MySQL DB backup. currently only backup with lvm snapshots + are supported. This mean, just before the lvm snap vol is created, + the db tables will be flushed and locked for read, then the lvm create + command will be executed and after that, the table will be unlocked and + the backup will be executed. It is important to have the available in + backup_args.mysql_conf_file the file where the database host, name, user, + and password are set. + ''' + try: + import MySQLdb + except ImportError as error: + logging.critical('[*] Error: please install MySQLdb module') + raise ImportError('[*] Error: please install MySQLdb module') + + if not backup_opt_dict.mysql_conf_file: + logging.critical( + '[*] MySQL Error: please provide a valid config file') + raise ValueError + # Open the file provided in backup_args.mysql_conf_file and extract the + # db host, name, user and password. 
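+    # Expected format of that file is one "key = value" per line, for
+    # example (values are illustrative, see the MySQL section in README):
+    #   host = your.mysql.host.ip
+    #   user = backup
+    #   password = userpassword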
+ db_user = db_host = db_pass = False + with open(backup_opt_dict.mysql_conf_file, 'r') as mysql_file_fd: + for line in mysql_file_fd: + if 'host' in line: + db_host = line.split('=')[1].strip() + continue + elif 'user' in line: + db_user = line.split('=')[1].strip() + continue + elif 'password' in line: + db_pass = line.split('=')[1].strip() + continue + + # Initialize the DB object and connect to the db according to + # the db mysql backup file config + try: + backup_opt_dict.mysql_db_inst = MySQLdb.connect( + host=db_host, user=db_user, passwd=db_pass) + except Exception as error: + logging.critical('[*] MySQL Error: {0}'.format(error)) + raise Exception + + # Execute LVM backup + backup_mode_fs(backup_opt_dict, time_stamp, manifest_meta_dict) + + +def backup_mode_mongo(backup_opt_dict, time_stamp, manifest_meta_dict): + ''' + Execute the necessary tasks for file system backup mode + ''' + try: + from pymongo import MongoClient + except ImportError: + logging.critical('[*] Error: please install pymongo module') + raise ImportError('[*] Error: please install pymongo module') + + logging.info('[*] MongoDB backup is being executed...') + logging.info('[*] Checking is the localhost is Master/Primary...') + mongodb_port = '27017' + local_hostname = backup_opt_dict.hostname + db_host_port = '{0}:{1}'.format(local_hostname, mongodb_port) + mongo_client = MongoClient(db_host_port) + master_dict = dict(mongo_client.admin.command("isMaster")) + mongo_me = master_dict['me'] + mongo_primary = master_dict['primary'] + + if mongo_me == mongo_primary: + backup_mode_fs(backup_opt_dict, time_stamp, manifest_meta_dict) + else: + logging.warning('[*] localhost {0} is not Master/Primary,\ + exiting...'.format(local_hostname)) + return True + + +def backup_mode_fs(backup_opt_dict, time_stamp, manifest_meta_dict): + ''' + Execute the necessary tasks for file system backup mode + ''' + + logging.info('[*] File System backup is being executed...') + lvm_snap(backup_opt_dict) + # Extract some values from arguments that will be used later on + # Initialize swift client object, generate container segments name + # and extract backup name + sw_connector = backup_opt_dict.sw_connector + + # Execute a tar gzip of the specified directory and return + # small chunks (default 128MB), timestamp, backup, filename, + # file chunk index and the tar meta-data file + + # Generate a string hostname, backup name, timestamp and backup level + file_name = add_host_name_ts_level(backup_opt_dict, time_stamp) + meta_data_backup_file = u'tar_metadata_{0}'.format(file_name) + + (backup_opt_dict, tar_command, manifest_meta_dict) = gen_tar_command( + opt_dict=backup_opt_dict, time_stamp=time_stamp, + remote_manifest_meta=manifest_meta_dict) + # Initialize a Queue for a maximum of 2 items + tar_backup_queue = Queue(maxsize=2) + tar_backup_stream = Process( + target=tar_backup, args=( + backup_opt_dict, tar_command, tar_backup_queue,)) + tar_backup_stream.daemon = True + tar_backup_stream.start() + + add_object_stream = Process( + target=add_object, args=( + backup_opt_dict, tar_backup_queue, file_name, time_stamp)) + add_object_stream.daemon = True + add_object_stream.start() + + tar_backup_stream.join() + tar_backup_queue.put( + ({False : False})) + tar_backup_queue.close() + add_object_stream.join() + + (backup_opt_dict, manifest_meta_dict, tar_meta_to_upload, + tar_meta_prev) = gen_manifest_meta( + backup_opt_dict, manifest_meta_dict, meta_data_backup_file) + + manifest_file = u'' + meta_data_abs_path = '{0}/{1}'.format( + 
backup_opt_dict.workdir, tar_meta_prev) + # Upload swift manifest for segments + if backup_opt_dict.upload: + if not backup_opt_dict.no_incremental: + # Upload tar incremental meta data file and remove it + logging.info('[*] Uploading tar meta data file: {0}'.format( + tar_meta_to_upload)) + with open(meta_data_abs_path, 'r') as meta_fd: + sw_connector.put_object( + backup_opt_dict.container, tar_meta_to_upload, meta_fd) + # Removing tar meta data file, so we have only one authoritative + # version on swift + logging.info('[*] Removing tar meta data file: {0}'.format( + meta_data_abs_path)) + os.remove(meta_data_abs_path) + # Upload manifest to swift + manifest_upload( + manifest_file, backup_opt_dict, file_name, manifest_meta_dict) + + # Unmount and remove lvm snapshot volume + lvm_snap_remove(backup_opt_dict) diff --git a/freezer/lvm.py b/freezer/lvm.py new file mode 100644 index 00000000..b5cee3c0 --- /dev/null +++ b/freezer/lvm.py @@ -0,0 +1,206 @@ +''' +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). This product includes software written by Tim +Hudson (tjh@cryptsoft.com). +======================================================================== + +Freezer LVM related functions +''' + +from freezer.utils import ( + create_dir, get_vol_fs_type, validate_all_args) + +import re +import os +import subprocess +import logging + + +def lvm_eval(backup_opt_dict): + ''' + Evaluate if the backup must be executed using lvm snapshot + or just directly on the plain filesystem. If no lvm options are specified + the backup will be executed directly on the file system and without + use lvm snapshot. If one of the lvm options are set, then the lvm snap + will be used to execute backup. This mean all the required options + must be set accordingly + ''' + + required_list = [ + backup_opt_dict.lvm_volgroup, + backup_opt_dict.lvm_srcvol, + backup_opt_dict.lvm_dirmount] + + if not validate_all_args(required_list): + logging.warning('[*] Required lvm options not set. The backup will \ + execute without lvm snapshot.') + return False + + # Create lvm_dirmount dir if it doesn't exists and write action in logs + create_dir(backup_opt_dict.lvm_dirmount) + + return True + + +def lvm_snap_remove(backup_opt_dict): + ''' + Remove the specified lvm_snapshot. 
If the volume is mounted + it will unmount it and then removed + ''' + + if not lvm_eval(backup_opt_dict): + return True + + vol_group = backup_opt_dict.lvm_volgroup.replace('-', '--') + snap_name = backup_opt_dict.lvm_snapname.replace('-', '--') + mapper_snap_vol = '/dev/mapper/{0}-{1}'.format(vol_group, snap_name) + with open('/proc/mounts', 'r') as proc_mount_fd: + for mount_line in proc_mount_fd: + if mapper_snap_vol.lower() in mount_line.lower(): + mount_list = mount_line.split(' ') + (dev_vol, mount_point) = mount_list[0], mount_list[1] + logging.warning('[*] Found lvm snapshot {0} mounted on {1}\ + '.format(dev_vol, mount_point)) + umount_proc = subprocess.Popen('{0} -l -f {1}'.format( + backup_opt_dict.umount_path, mount_point), + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, stderr=subprocess.PIPE, + shell=True, executable=backup_opt_dict.bash_path) + (umount_out, mount_err) = umount_proc.communicate() + if re.search(r'\S+', umount_out): + logging.critical('[*] Error: impossible to umount {0} {1}\ + '.format(mount_point, mount_err)) + raise Exception + else: + # Change working directory to be able to unmount + os.chdir(backup_opt_dict.workdir) + logging.info('[*] Volume {0} unmounted'.format( + mapper_snap_vol)) + snap_rm_proc = subprocess.Popen( + '{0} -f {1}'.format( + backup_opt_dict.lvremove_path, mapper_snap_vol), + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, stderr=subprocess.PIPE, + shell=True, executable=backup_opt_dict.bash_path) + (lvm_rm_out, lvm_rm_err) = snap_rm_proc.communicate() + if 'successfully removed' in lvm_rm_out: + logging.info('[*] {0}'.format(lvm_rm_out)) + return True + else: + logging.critical( + '[*] Error: lvm_snap_rm {0}'.format(lvm_rm_err)) + raise Exception + raise Exception + + +def lvm_snap(backup_opt_dict): + ''' + Implement checks on lvm volumes availability. According to these checks + we might create an lvm snapshot and mount it or use an existing one + ''' + + if lvm_eval(backup_opt_dict) is not True: + return True + # Setting lvm snapsize to 5G is not set + if backup_opt_dict.lvm_snapsize is False: + backup_opt_dict.lvm_snapsize = '5G' + logging.warning('[*] lvm_snapsize not configured. Setting the \ + lvm snapshot size to 5G') + + # Setting lvm snapshot name to freezer_backup_snap it not set + if backup_opt_dict.lvm_snapname is False: + backup_opt_dict.lvm_snapname = 'freezer_backup_snap' + logging.warning('[*] lvm_snapname not configured. 
Setting default \ + name "freezer_backup_snap" for the lvm backup snap session') + + logging.info('[*] Source LVM Volume: {0}'.format( + backup_opt_dict.lvm_srcvol)) + logging.info('[*] LVM Volume Group: {0}'.format( + backup_opt_dict.lvm_volgroup)) + logging.info('[*] Snapshot name: {0}'.format( + backup_opt_dict.lvm_snapname)) + logging.info('[*] Snapshot size: {0}'.format( + backup_opt_dict.lvm_snapsize)) + logging.info('[*] Directory where the lvm snaphost will be mounted on:\ + {0}'.format(backup_opt_dict.lvm_dirmount.strip())) + + # Create the snapshot according the values passed from command line + lvm_create_snap = '{0} --size {1} --snapshot --name {2} {3}\ + '.format( + backup_opt_dict.lvcreate_path, + backup_opt_dict.lvm_snapsize, + backup_opt_dict.lvm_snapname, + backup_opt_dict.lvm_srcvol) + + # If backup mode is mysql, then the db will be flushed and read locked + # before the creation of the lvm snap + if backup_opt_dict.mode == 'mysql': + cursor = backup_opt_dict.mysql_db_inst.cursor() + cursor.execute('FLUSH TABLES WITH READ LOCK') + backup_opt_dict.mysql_db_inst.commit() + + lvm_process = subprocess.Popen( + lvm_create_snap, stdin=subprocess.PIPE, stdout=subprocess.PIPE, + stderr=subprocess.PIPE, shell=True, + executable=backup_opt_dict.bash_path) + (lvm_out, lvm_err) = lvm_process.communicate() + if lvm_err is False: + logging.critical('[*] lvm snapshot creation error: {0}\ + '.format(lvm_err)) + raise Exception + else: + logging.warning('[*] {0}'.format(lvm_out)) + + # Unlock MySQL Tables if backup is == mysql + if backup_opt_dict.mode == 'mysql': + cursor.execute('UNLOCK TABLES') + backup_opt_dict.mysql_db_inst.commit() + cursor.close() + backup_opt_dict.mysql_db_inst.close() + + # Guess the file system of the provided source volume and st mount + # options accordingly + filesys_type = get_vol_fs_type(backup_opt_dict) + mount_options = ' ' + if 'xfs' == filesys_type: + mount_options = ' -onouuid ' + # Mount the newly created snapshot to dir_mount + abs_snap_name = '/dev/{0}/{1}'.format( + backup_opt_dict.lvm_volgroup, + backup_opt_dict.lvm_snapname) + mount_snap = '{0} {1} {2} {3}'.format( + backup_opt_dict.mount_path, + mount_options, + abs_snap_name, + backup_opt_dict.lvm_dirmount) + mount_process = subprocess.Popen( + mount_snap, stdin=subprocess.PIPE, stdout=subprocess.PIPE, + stderr=subprocess.PIPE, shell=True, + executable=backup_opt_dict.bash_path) + mount_err = mount_process.communicate()[1] + if 'already mounted' in mount_err: + logging.warning('[*] Volume {0} already mounted on {1}\ + '.format(abs_snap_name, backup_opt_dict.lvm_dirmount)) + return True + if mount_err: + logging.critical('[*] lvm snapshot mounting error: {0}'.format( + mount_err)) + raise Exception + else: + logging.warning('[*] Volume {0} succesfully mounted on {1}\ + '.format(abs_snap_name, backup_opt_dict.lvm_dirmount)) + return True diff --git a/freezer/main.py b/freezer/main.py new file mode 100644 index 00000000..10253024 --- /dev/null +++ b/freezer/main.py @@ -0,0 +1,107 @@ +''' +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). This product includes software written by Tim +Hudson (tjh@cryptsoft.com). +======================================================================== + +Freezer main execution function +''' + +from freezer.utils import ( + start_time, elapsed_time, set_backup_level, validate_any_args, + check_backup_existance) +from freezer.swift import ( + get_client, get_containers_list, show_containers, + check_container_existance, get_container_content, remove_obj_older_than, + show_objects) +from freezer.backup import ( + backup_mode_fs, backup_mode_mongo, backup_mode_mysql) +from freezer.restore import restore_fs + +import logging + + +def freezer_main(backup_args): + ''' + Program Main Execution. This main function is a wrapper for most + of the other functions. By calling main() the program execution start + and the respective actions are taken. If you want only use the single + function is probably better to not import main() + ''' + + # Computing execution start datetime and Timestamp + (time_stamp, today_start) = start_time() + # Add timestamp to the arguments namespace + backup_args.__dict__['time_stamp'] = time_stamp + + # Initialize the swift connector and store it in the same dict passed + # as argument under the dict.sw_connector namespace. This is helpful + # so the swift client object doesn't need to be initialized every time + backup_args = get_client(backup_args) + + # Get the list of the containers + backup_args = get_containers_list(backup_args) + + if show_containers(backup_args): + elapsed_time(today_start) + return True + + # Check if the provided container already exists in swift. + # If it doesn't exist a new one will be created along with the segments + # container as container_segments + backup_args = check_container_existance(backup_args) + + # Get the object list of the remote containers and store id in the + # same dict passes as argument under the dict.remote_obj_list namespace + backup_args = get_container_content(backup_args) + + if show_objects(backup_args): + elapsed_time(today_start) + return True + + # Check if a backup exist in swift with same name. 
If not, set + # backup level to 0 + manifest_meta_dict = check_backup_existance(backup_args) + + # Set the right backup level for incremental backup + (backup_args, manifest_meta_dict) = set_backup_level( + backup_args, manifest_meta_dict) + + backup_args.manifest_meta_dict = manifest_meta_dict + # File system backup mode selected + if backup_args.mode == 'fs': + # If any of the restore options was specified, then a data restore + # will be executed + if validate_any_args([ + backup_args.restore_from_date, backup_args.restore_from_host, + backup_args.restore_abs_path]): + logging.info('[*] Executing FS restore...') + restore_fs(backup_args) + else: + backup_mode_fs(backup_args, time_stamp, manifest_meta_dict) + elif backup_args.mode == 'mongo': + backup_mode_mongo(backup_args, time_stamp, manifest_meta_dict) + elif backup_args.mode == 'mysql': + backup_mode_mysql(backup_args, time_stamp, manifest_meta_dict) + else: + logging.critical('[*] Error: Please provide a valid backup mode') + raise ValueError + + remove_obj_older_than(backup_args) + + # Elapsed time: + elapsed_time(today_start) diff --git a/freezer/restore.py b/freezer/restore.py new file mode 100644 index 00000000..60653587 --- /dev/null +++ b/freezer/restore.py @@ -0,0 +1,164 @@ +''' +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). This product includes software written by Tim +Hudson (tjh@cryptsoft.com). +======================================================================== + +Freezer restore modes related functions +''' + +from freezer.tar import tar_restore +from freezer.swift import object_to_stream +from freezer.utils import ( + validate_all_args, get_match_backup, sort_backup_list) + +from multiprocessing import Process, Pipe +import os +import logging +import re +import datetime +import time + + +def restore_fs(backup_opt_dict): + ''' + Restore data from swift server to your local node. Data will be restored + in the directory specified in backup_opt_dict.restore_abs_path. The + object specified with the --get-object option will be downloaded from + the Swift server and will be downloaded inside the parent directory of + backup_opt_dict.restore_abs_path. If the object was compressed during + backup time, then it is decrypted, decompressed and de-archived to + backup_opt_dict.restore_abs_path. Before download the file, the size of + the local volume/disk/partition will be computed. If there is enough space + the full restore will be executed. Please remember to stop any service + that require access to the data before to start the restore execution + and to start the service at the end of the restore execution + ''' + + # List of mandatory values + required_list = [ + os.path.exists(backup_opt_dict.restore_abs_path), + backup_opt_dict.remote_obj_list, + backup_opt_dict.container, + backup_opt_dict.backup_name + ] + + # Arugment validation. 
Raise ValueError is all the arguments are not True + if not validate_all_args(required_list): + logging.critical("[*] Error: please provide ALL the following \ + arguments: {0}".format(' '.join(required_list))) + raise ValueError + + if not backup_opt_dict.restore_from_date: + logging.warning('[*] Restore date time not available. Setting to \ + current datetime') + backup_opt_dict.restore_from_date = \ + re.sub( + r'^(\S+?) (.+?:\d{,2})\.\d+?$', r'\1T\2', + str(datetime.datetime.now())) + + # If restore_from_host is set to local hostname is not set in + # backup_opt_dict.restore_from_host + if backup_opt_dict.restore_from_host: + backup_opt_dict.hostname = backup_opt_dict.restore_from_host + + # Check if there's a backup matching. If not raise Exception + backup_opt_dict = get_match_backup(backup_opt_dict) + if not backup_opt_dict.remote_match_backup: + logging.critical( + '[*] Not backup found matching with name: {0},\ + hostname: {1}'.format( + backup_opt_dict.backup_name, backup_opt_dict.hostname)) + raise ValueError + + restore_fs_sort_obj(backup_opt_dict) + + +def restore_fs_sort_obj(backup_opt_dict): + ''' + Take options dict as argument and sort/remove duplicate elements from + backup_opt_dict.remote_match_backup and find the closes backup to the + provided from backup_opt_dict.restore_from_date. Once the objects are + looped backwards and the level 0 backup is found, along with the other + level 1,2,n, is download the object from swift and untar them locally + starting from level 0 to level N. + ''' + + # Convert backup_opt_dict.restore_from_date to timestamp + fmt = '%Y-%m-%dT%H:%M:%S' + opt_backup_date = datetime.datetime.strptime( + backup_opt_dict.restore_from_date, fmt) + opt_backup_timestamp = int(time.mktime(opt_backup_date .timetuple())) + + # Sort remote backup list using timestamp in reverse order, + # that is from the newest to the oldest executed backup + sorted_backups_list = sort_backup_list(backup_opt_dict) + # Get the closest earlier backup to date set in + # backup_opt_dict.restore_from_date + closest_backup_list = [] + for backup_obj in sorted_backups_list: + if backup_obj.startswith('tar_metadata'): + continue + obj_name_match = re.search( + r'\S+?_{0}_(\d+)_(\d+?)$'.format(backup_opt_dict.backup_name), + backup_obj, re.I) + if not obj_name_match: + continue + # Ensure provided timestamp is bigger then object timestamp + if opt_backup_timestamp >= int(obj_name_match.group(1)): + closest_backup_list.append(backup_obj) + # If level 0 is reached, break the loop as level 0 is the first + # backup we want to restore + if int(obj_name_match.group(2)) == 0: + break + + if not closest_backup_list: + logging.info('[*] No matching backup name {0} found in \ + container {1} for hostname {2}'.format( + backup_opt_dict.backup_name, backup_opt_dict.container, + backup_opt_dict.hostname)) + raise ValueError + + # Backups are looped from the last element of the list going + # backwards, as we want to restore starting from the oldest object + for backup in closest_backup_list[::-1]: + write_pipe, read_pipe = Pipe() + process_stream = Process( + target=object_to_stream, args=( + backup_opt_dict, write_pipe, backup,)) + process_stream.daemon = True + process_stream.start() + + tar_stream = Process( + target=tar_restore, args=(backup_opt_dict, read_pipe,)) + tar_stream.daemon = True + tar_stream.start() + + process_stream.join() + tar_stream.join() + + logging.info( + '[*] Restore execution successfully finished for backup name {0}, \ + from container {1}, into directory 
{2}'.format( + backup_opt_dict.backup_name, backup_opt_dict.container, + backup_opt_dict.restore_abs_path)) + + logging.info( + '[*] Restore execution successfully executed for backup name {0}, \ + from container {1}, into directory {2}'.format( + backup_opt_dict.backup_name, backup_opt_dict.container, + backup_opt_dict.restore_abs_path)) diff --git a/freezer/swift.py b/freezer/swift.py new file mode 100644 index 00000000..c51b6e32 --- /dev/null +++ b/freezer/swift.py @@ -0,0 +1,398 @@ +''' +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). This product includes software written by Tim +Hudson (tjh@cryptsoft.com). +======================================================================== + +Freezer functions to interact with OpenStack Swift client and server +''' + +from freezer.utils import ( + validate_all_args, get_match_backup, + sort_backup_list) + +import os +import swiftclient +import json +import re +from copy import deepcopy +import time +import logging + + +def show_containers(backup_opt_dict): + ''' + Print remote containers in sorted order + ''' + + if not backup_opt_dict.list_container: + return False + + ordered_container = {} + for container in backup_opt_dict.containers_list: + ordered_container['container_name'] = container['name'] + size = '{0}'.format((int(container['bytes']) / 1024) / 1024) + if size == '0': + size = '1' + ordered_container['size'] = '{0}MB'.format(size) + ordered_container['objects_count'] = container['count'] + print json.dumps( + ordered_container, indent=4, + separators=(',', ': '), sort_keys=True) + return True + + +def show_objects(backup_opt_dict): + ''' + Retreive the list of backups from backup_opt_dict for the specified \ + container and print them nicely to std out. 
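+ One JSON document per matching object is printed. The output looks
+ roughly like this (object name and date below are purely illustrative)::
+
+ {"object_name": "node1_mybackup_1406120722_0", "upload_date": "2014-07-22T10:25:31"}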
+ ''' + + if not backup_opt_dict.list_objects: + return False + + required_list = [ + backup_opt_dict.remote_obj_list] + + if not validate_all_args(required_list): + logging.critical('[*] Error: Remote Object list not avaiblale') + raise Exception + + ordered_objects = {} + remote_obj = backup_opt_dict.remote_obj_list + + for obj in remote_obj: + ordered_objects['object_name'] = obj['name'] + ordered_objects['upload_date'] = obj['last_modified'] + print json.dumps( + ordered_objects, indent=4, + separators=(',', ': '), sort_keys=True) + + return True + + +def remove_obj_older_than(backup_opt_dict): + ''' + Remove object in remote swift server older more tqhen days + ''' + + if not backup_opt_dict.remote_obj_list \ + or backup_opt_dict.remove_older_than is False: + logging.warning('[*] No remote objects will be removed') + return False + + backup_opt_dict.remove_older_than = int( + float(backup_opt_dict.remove_older_than)) + logging.info('[*] Removing object older {0} day(s)'.format( + backup_opt_dict.remove_older_than)) + # Compute the amount of seconds from days to compare with + # the remote backup timestamp + max_time = backup_opt_dict.remove_older_than * 86400 + current_timestamp = backup_opt_dict.time_stamp + backup_name = backup_opt_dict.backup_name + hostname = backup_opt_dict.hostname + backup_opt_dict = get_match_backup(backup_opt_dict) + sorted_remote_list = sort_backup_list(backup_opt_dict) + sw_connector = backup_opt_dict.sw_connector + for match_object in sorted_remote_list: + obj_name_match = re.search(r'{0}_({1})_(\d+)_\d+?$'.format( + hostname, backup_name), match_object, re.I) + if not obj_name_match: + continue + remote_obj_timestamp = int(obj_name_match.group(2)) + time_delta = current_timestamp - remote_obj_timestamp + if time_delta > max_time: + logging.info('[*] Removing backup object: {0}'.format( + match_object)) + sw_connector.delete_object( + backup_opt_dict.container, match_object) + # Try to remove also the corresponding tar_meta + # NEED TO BE IMPROVED! + try: + tar_match_object = 'tar_metadata_{0}'.format(match_object) + sw_connector.delete_object( + backup_opt_dict.container, tar_match_object) + logging.info( + '[*] Object tar meta data removed: {0}'.format( + tar_match_object)) + except Exception: + pass + + +def get_container_content(backup_opt_dict): + ''' + Download the list of object of the provided container + and print them out as container meta-data and container object list + ''' + + if not backup_opt_dict.container: + print '[*] Error: please provide a valid container name' + logging.critical( + '[*] Error: please provide a valid container name') + raise Exception + + sw_connector = backup_opt_dict.sw_connector + try: + backup_opt_dict.remote_obj_list = \ + sw_connector.get_container(backup_opt_dict.container)[1] + return backup_opt_dict + except Exception as error: + logging.critical('[*] Error: get_object_list: {0}'.format(error)) + raise Exception + + +def check_container_existance(backup_opt_dict): + ''' + Check if the provided container is already available on Swift. + The verification is done by exact matching between the provided container + name and the whole list of container available for the swift account. 
+ If the container is not found, it will be automatically create and used + to execute the backup + ''' + + required_list = [ + backup_opt_dict.container_segments, + backup_opt_dict.container] + + if not validate_all_args(required_list): + logging.critical("[*] Error: please provide ALL the following args \ + {0}".format(','.join(required_list))) + raise Exception + logging.info( + "[*] Retrieving container {0}".format(backup_opt_dict.container)) + sw_connector = backup_opt_dict.sw_connector + containers_list = sw_connector.get_account()[1] + match_container = None + match_container_seg = None + + match_container = [ + container_object['name'] for container_object in containers_list + if container_object['name'] == backup_opt_dict.container] + match_container_seg = [ + container_object['name'] for container_object in containers_list + if container_object['name'] == backup_opt_dict.container_segments] + + # If no container is available, create it and write to logs + if not match_container: + logging.warning("[*] No such container {0} available... ".format( + backup_opt_dict.container)) + logging.warning( + "[*] Creating container {0}".format(backup_opt_dict.container)) + sw_connector.put_container(backup_opt_dict.container) + else: + logging.info( + "[*] Container {0} found!".format(backup_opt_dict.container)) + + if not match_container_seg: + logging.warning("[*] Creating segments container {0}".format( + backup_opt_dict.container_segments)) + sw_connector.put_container(backup_opt_dict.container_segments) + else: + logging.info("[*] Container Segments {0} found!".format( + backup_opt_dict.container_segments)) + + return backup_opt_dict + + +# This function is useless? Remove is and use the single env accordingly +def get_swift_os_env(): + ''' + Get the swift related environment variable + ''' + + environ_dict = os.environ + return environ_dict['OS_REGION_NAME'], environ_dict['OS_TENANT_ID'], \ + environ_dict['OS_PASSWORD'], environ_dict['OS_AUTH_URL'], \ + environ_dict['OS_USERNAME'], environ_dict['OS_TENANT_NAME'] + + +def get_client(backup_opt_dict): + ''' + Initialize a swift client object and return it in + backup_opt_dict + ''' + + sw_client = swiftclient.client + options = {} + (options['region_name'], options['tenant_id'], options['password'], + options['auth_url'], options['username'], + options['tenant_name']) = get_swift_os_env() + + backup_opt_dict.sw_connector = sw_client.Connection( + authurl=options['auth_url'], + user=options['username'], key=options['password'], os_options=options, + tenant_name=options['tenant_name'], auth_version='2', retries=6) + return backup_opt_dict + + +def manifest_upload( + manifest_file, backup_opt_dict, file_prefix, manifest_meta_dict): + ''' + Upload Manifest to manage segments in Swift + ''' + + if not manifest_meta_dict: + logging.critical('[*] Error Manifest Meta dictionary not available') + raise Exception + + sw_connector = backup_opt_dict.sw_connector + tmp_manifest_meta = dict() + for key, value in manifest_meta_dict.items(): + if key.startswith('x-object-meta'): + tmp_manifest_meta[key] = value + manifest_meta_dict = deepcopy(tmp_manifest_meta) + header = manifest_meta_dict + manifest_meta_dict['x-object-manifest'] = u'{0}/{1}'.format( + backup_opt_dict.container_segments.strip(), file_prefix.strip()) + logging.info('[*] Uploading Swift Manifest: {0}'.format(header)) + sw_connector.put_object( + backup_opt_dict.container, file_prefix, manifest_file, headers=header) + logging.info('[*] Manifest successfully uploaded!') + + +def 
add_object( + backup_opt_dict, backup_queue, absolute_file_path=None, + time_stamp=None): + ''' + Upload object on the remote swift server + ''' + + if not backup_opt_dict.container: + logging.critical('[*] Error: Please specify the container \ + name with -C option') + raise Exception + + if absolute_file_path is None and backup_queue is None: + logging.critical('[*] Error: Please specify the file you want to \ + upload on swift with -d option') + raise Exception + + max_segment_size = backup_opt_dict.max_seg_size + if not backup_opt_dict.max_seg_size: + max_segment_size = 134217728 + + sw_connector = backup_opt_dict.sw_connector + while True: + package_name = absolute_file_path.split('/')[-1] + file_chunk_index, file_chunk = backup_queue.get().popitem() + if file_chunk_index == False and file_chunk == False: + break + package_name = u'{0}/{1}/{2}/{3}'.format( + package_name, time_stamp, max_segment_size, file_chunk_index) + # If for some reason the swift client object is not available anymore + # an exception is generated and a new client object is initialized/ + # If the exception happens for 10 consecutive times for a total of + # 1 hour, then the program will exit with an Exception. + count = 0 + while True: + try: + logging.info( + '[*] Uploading file chunk index: {0}'.format( + package_name)) + sw_connector.put_object( + backup_opt_dict.container_segments, + package_name, file_chunk) + logging.info('[*] Data successfully uploaded!') + break + except Exception as error: + time.sleep(60) + logging.info( + '[*] Retrying to upload file chunk index: {0}'.format( + package_name)) + backup_opt_dict = get_client(backup_opt_dict) + count += 1 + if count == 10: + logging.info( + '[*] Error: add_object: {0}'.format(error)) + raise Exception + + +def get_containers_list(backup_opt_dict): + ''' + Get a list and information of all the available containers + ''' + + try: + sw_connector = backup_opt_dict.sw_connector + backup_opt_dict.containers_list = sw_connector.get_account()[1] + return backup_opt_dict + except Exception as error: + logging.error('[*] Get containers list error: {0}').format(error) + raise Exception + + +def object_to_file(backup_opt_dict, file_name_abs_path): + ''' + Take a payload downloaded from Swift + and save it to the disk as file_name + ''' + + required_list = [ + backup_opt_dict.container, + file_name_abs_path] + + if not validate_all_args(required_list): + logging.critical('[*] Error: Please provide ALL the following \ + arguments: {0}'.format(','.join(required_list))) + raise ValueError + + sw_connector = backup_opt_dict.sw_connector + file_name = file_name_abs_path.split('/')[-1] + logging.info('[*] Downloading object {0} on {1}'.format( + file_name, file_name_abs_path)) + + # As the file is download by chunks and each chunk will be appened + # to file_name_abs_path, we make sure file_name_abs_path does not + # exists by removing it before + if os.path.exists(file_name_abs_path): + os.remove(file_name_abs_path) + + with open(file_name_abs_path, 'ab') as obj_fd: + for obj_chunk in sw_connector.get_object( + backup_opt_dict.container, file_name, + resp_chunk_size=16000000)[1]: + obj_fd.write(obj_chunk) + + return True + + +def object_to_stream(backup_opt_dict, write_pipe, obj_name): + ''' + Take a payload downloaded from Swift + and generate a stream to be consumed from other processes + ''' + + required_list = [ + backup_opt_dict.container] + + if not validate_all_args(required_list): + logging.critical('[*] Error: Please provide ALL the following \ + arguments: 
{0}'.format(','.join(required_list))) + raise ValueError + + sw_connector = backup_opt_dict.sw_connector + logging.info('[*] Downloading data stream...') + + # As the file is download by chunks and each chunk will be appened + # to file_name_abs_path, we make sure file_name_abs_path does not + # exists by removing it before + #stream_write_pipe = os.fdopen(stream_write_pipe, 'w', 0) + for obj_chunk in sw_connector.get_object( + backup_opt_dict.container, obj_name, + resp_chunk_size=backup_opt_dict.max_seg_size)[1]: + + write_pipe.send(obj_chunk) diff --git a/freezer/tar.py b/freezer/tar.py new file mode 100644 index 00000000..f839dbee --- /dev/null +++ b/freezer/tar.py @@ -0,0 +1,225 @@ +''' +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). This product includes software written by Tim +Hudson (tjh@cryptsoft.com). +======================================================================== + +Freezer Tar related functions +''' + +from freezer.utils import ( + validate_all_args, add_host_name_ts_level, create_dir) +from freezer.swift import object_to_file + +import os +import logging +import subprocess +import time + + +def tar_restore(backup_opt_dict, read_pipe): + ''' + Restore the provided file into backup_opt_dict.restore_abs_path + Descrypt the file if backup_opt_dict.encrypt_pass_file key is provided + ''' + + # Validate mandatory arguments + required_list = [ + os.path.exists(backup_opt_dict.restore_abs_path)] + + if not validate_all_args(required_list): + logging.critical("[*] Error: please provide ALL of the following \ + arguments: {0}".format(' '.join(required_list))) + raise ValueError + + # Set the default values for tar restore + tar_cmd = ' {0} -z --incremental --extract \ + --unlink-first --ignore-zeros --warning=none --overwrite \ + --directory {1} '.format( + backup_opt_dict.tar_path, backup_opt_dict.restore_abs_path) + + # Check if encryption file is provided and set the openssl decrypt + # command accordingly + if backup_opt_dict.encrypt_pass_file: + openssl_cmd = " {0} enc -d -aes-256-cfb -pass file:{1}".format( + backup_opt_dict.openssl_path, + backup_opt_dict.encrypt_pass_file) + tar_cmd = ' {0} | {1} '.format(openssl_cmd, tar_cmd) + + tar_cmd_proc = subprocess.Popen( + tar_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, + stderr=subprocess.PIPE, shell=True, + executable=backup_opt_dict.bash_path) + + # Start loop reading the pipe and pass the data to the tar std input + while True: + data_stream = read_pipe.recv() + tar_cmd_proc.stdin.write(data_stream) + if len(data_stream) < int(backup_opt_dict.max_seg_size): + break + + tar_err = tar_cmd_proc.communicate()[1] + + if 'error' in tar_err.lower(): + logging.critical('[*] Restore error: {0}'.format(tar_err)) + raise Exception + + +def tar_incremental( + tar_cmd, backup_opt_dict, curr_tar_meta, remote_manifest_meta=None): + ''' + Check if the backup already exist in swift. 
If the backup already + exists, the related meta data and the tar incremental meta file will be + downloaded. According to the meta data content, new options will be + provided for the next meta data upload to swift and the existing tar meta + file will be used in the current incremental backup. Also the level + options will be checked and updated respectively + ''' + + if not tar_cmd or not backup_opt_dict: + logging.error('[*] Error: tar_incremental, please provide tar_cmd \ + and backup options') + raise ValueError + + if not remote_manifest_meta: + remote_manifest_meta = dict() + # If returned object from check_backup is not a dict, the backup + # is considered at first run, so a backup level 0 will be executed + curr_backup_level = remote_manifest_meta.get( + 'x-object-meta-backup-current-level', '0') + tar_meta = remote_manifest_meta.get( + 'x-object-meta-tar-meta-obj-name') + tar_cmd_level = '--level={0} '.format(curr_backup_level) + # Write the tar meta data file in ~/.freezer. It will be + # removed later on. If ~/.freezer does not exists it will be created'. + create_dir(backup_opt_dict.workdir) + + curr_tar_meta = '{0}/{1}'.format( + backup_opt_dict.workdir, curr_tar_meta) + tar_cmd_incr = ' --listed-incremental={0} '.format(curr_tar_meta) + if tar_meta: + # If tar meta data file is available, download it and use it + # as for tar incremental backup. Afte this, the + # remote_manifest_meta['x-object-meta-tar-meta-obj-name'] will be + # update with the current tar meta data name and uploaded again + tar_cmd_incr = ' --listed-incremental={0}/{1} '.format( + backup_opt_dict.workdir, tar_meta) + tar_meta_abs = "{0}/{1}".format(backup_opt_dict.workdir, tar_meta) + try: + object_to_file( + backup_opt_dict, tar_meta_abs) + except Exception: + logging.warning( + '[*] Tar metadata {0} not found. Executing level 0 backup\ + '.format(tar_meta)) + + tar_cmd = ' {0} {1} {2} '.format(tar_cmd, tar_cmd_level, tar_cmd_incr) + return tar_cmd, backup_opt_dict, remote_manifest_meta + + +def gen_tar_command( + opt_dict, meta_data_backup_file=False, time_stamp=int(time.time()), + remote_manifest_meta=False): + ''' + Generate tar command options. + ''' + + required_list = [ + opt_dict.backup_name, + opt_dict.src_file, + os.path.exists(opt_dict.src_file)] + + if not validate_all_args(required_list): + logging.critical( + 'Error: Please ALL the following options: {0}'.format( + ','.join(required_list))) + raise Exception + + # Change che current working directory to op_dict.src_file + os.chdir(os.path.normpath(opt_dict.src_file.strip())) + + logging.info('[*] Changing current working directory to: {0} \ + '.format(opt_dict.src_file)) + logging.info('[*] Backup started for: {0} \ + '.format(opt_dict.src_file)) + + # Tar option for default behavoir. 
Please refer to man tar to have + # a better options explanation + tar_command = ' {0} --create -z --warning=none \ + --dereference --hard-dereference --no-check-device --one-file-system \ + --preserve-permissions --same-owner --seek \ + --ignore-failed-read '.format(opt_dict.tar_path) + + file_name = add_host_name_ts_level(opt_dict, time_stamp) + meta_data_backup_file = u'tar_metadata_{0}'.format(file_name) + # Incremental backup section + if not opt_dict.no_incremental: + (tar_command, opt_dict, remote_manifest_meta) = tar_incremental( + tar_command, opt_dict, meta_data_backup_file, + remote_manifest_meta) + + # End incremental backup section + if opt_dict.exclude: + tar_command = ' {0} --exclude="{1}" '.format( + tar_command, + opt_dict.exclude) + + tar_command = ' {0} . '.format(tar_command) + # Encrypt data if passfile is provided + if opt_dict.encrypt_pass_file: + openssl_cmd = "{0} enc -aes-256-cfb -pass file:{1}".format( + opt_dict.openssl_path, opt_dict.encrypt_pass_file) + tar_command = '{0} | {1} '.format(tar_command, openssl_cmd) + + return opt_dict, tar_command, remote_manifest_meta + + +def tar_backup(opt_dict, tar_command, backup_queue): + ''' + Execute an incremental backup using tar options, specified as + function arguments + ''' + + # Set counters, index, limits and bufsize for subprocess + buf_size = 65535 + file_read_limit = 0 + file_chunk_index = 00000000 + file_block = b'' + tar_chunk = b'' + logging.info( + '[*] Archiving and compressing files from {0}'.format( + opt_dict.src_file)) + + tar_process = subprocess.Popen( + tar_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, + bufsize=buf_size, shell=True, executable=opt_dict.bash_path) + + # Iterate over tar process stdout + for file_block in tar_process.stdout: + tar_chunk += file_block + file_read_limit += len(file_block) + if file_read_limit >= opt_dict.max_seg_size: + backup_queue.put( + ({file_chunk_index : tar_chunk})) + file_chunk_index += 1 + tar_chunk = b'' + file_read_limit = 0 + # Upload segments smaller then opt_dict.max_seg_size + if len(tar_chunk) < opt_dict.max_seg_size: + backup_queue.put( + ({file_chunk_index : tar_chunk})) + file_chunk_index += 1 diff --git a/freezer/test/__init__.py b/freezer/test/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/freezer/test/test_arguments.py b/freezer/test/test_arguments.py new file mode 100644 index 00000000..fa4b33ad --- /dev/null +++ b/freezer/test/test_arguments.py @@ -0,0 +1,11 @@ +#!/usr/bin/env python + +from freezer.arguments import backup_arguments +import pytest +import argparse + +def test_backup_arguments(): + + backup_args, arg_parser = backup_arguments() + assert backup_args.tar_path is not False + assert backup_args.mode is ('fs' or 'mysql' or 'mongo') diff --git a/freezer/test/test_backup.py b/freezer/test/test_backup.py new file mode 100644 index 00000000..91569821 --- /dev/null +++ b/freezer/test/test_backup.py @@ -0,0 +1,16 @@ +#!/usr/bin/env python + +from freezer.backup import ( + backup_mode_mysql, backup_mode_fs, backup_mode_mongo) +from freezer.arguments import backup_arguments + +import time +import os + + +def test_backup_mode_mysql(): + + # THE WHOLE TEST NEED TO BE CHANGED USING MOCK!! 
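+ # One possible shape for the mocked version (sketch only, assuming the
+ # third-party ``mock`` package is available): patch the lvm snapshot and
+ # swift upload helpers that backup_mode_mysql() relies on, pass in a fake
+ # argparse namespace with mode='mysql', and assert the call completes
+ # without raising.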
+ # Return backup options and arguments + backup_args = backup_arguments() + diff --git a/freezer/test/test_freezer.py b/freezer/test/test_freezer.py new file mode 100644 index 00000000..e69de29b diff --git a/freezer/utils.py b/freezer/utils.py new file mode 100644 index 00000000..06c15ac9 --- /dev/null +++ b/freezer/utils.py @@ -0,0 +1,524 @@ +''' +Copyright 2014 Hewlett-Packard + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +This product includes cryptographic software written by Eric Young +(eay@cryptsoft.com). This product includes software written by Tim +Hudson (tjh@cryptsoft.com). +======================================================================== + +Freezer general utils functions +''' + +import logging +import os +import time +import datetime +import re +import subprocess + + +def gen_manifest_meta( + backup_opt_dict, manifest_meta_dict, meta_data_backup_file): + ''' This function is used to load backup metadata information on Swift. + this is used to keep information between consecutive backup + executions. + If the manifest_meta_dict is available, most probably this is not + the first backup run for the provided backup name and host. + In this case we remove all the conflictive keys -> values from + the dictionary. + ''' + + if manifest_meta_dict.get('x-object-meta-tar-prev-meta-obj-name'): + tar_meta_prev = \ + manifest_meta_dict['x-object-meta-tar-prev-meta-obj-name'] + tar_meta_to_upload = \ + manifest_meta_dict['x-object-meta-tar-meta-obj-name'] = \ + manifest_meta_dict['x-object-meta-tar-prev-meta-obj-name'] = \ + meta_data_backup_file + else: + manifest_meta_dict['x-object-meta-tar-prev-meta-obj-name'] = \ + meta_data_backup_file + manifest_meta_dict['x-object-meta-backup-name'] = \ + backup_opt_dict.backup_name + manifest_meta_dict['x-object-meta-src-file-to-backup'] = \ + backup_opt_dict.src_file + manifest_meta_dict['x-object-meta-abs-file-path'] = '' + + # Set manifest meta if encrypt_pass_file is provided + # The file will contain a plain password that will be used + # to encrypt and decrypt tasks + manifest_meta_dict['x-object-meta-encrypt-data'] = 'Yes' + if backup_opt_dict.encrypt_pass_file is False: + manifest_meta_dict['x-object-meta-encrypt-data'] = '' + manifest_meta_dict['x-object-meta-always-backup-level'] = '' + if backup_opt_dict.always_backup_level: + manifest_meta_dict['x-object-meta-always-backup-level'] = \ + backup_opt_dict.always_backup_level + + # Set manifest meta if max_backup_level argument is provided + # Once the incremental backup arrive to max_backup_level, it will + # restart from level 0 + manifest_meta_dict['x-object-meta-maximum-backup-level'] = '' + if backup_opt_dict.max_backup_level is not False: + manifest_meta_dict['x-object-meta-maximum-backup-level'] = \ + str(backup_opt_dict.max_backup_level) + + # At the end of the execution, checks the objects ages for the + # specified swift container. If there are object older then the + # specified days they'll be removed. + # Unit is int and every int and 5 == five days. 
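+ # (e.g. a value of 5 means objects older than five days are deleted by
+ # remove_obj_older_than())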
+ manifest_meta_dict['x-object-meta-remove-backup-older-than-days'] = '' + if backup_opt_dict.remove_older_than is not False: + manifest_meta_dict['x-object-meta-remove-backup-older-than-days']\ + = backup_opt_dict.remove_older_than + manifest_meta_dict['x-object-meta-hostname'] = backup_opt_dict.hostname + manifest_meta_dict['x-object-meta-segments-size-bytes'] = \ + str(backup_opt_dict.max_seg_size) + manifest_meta_dict['x-object-meta-backup-created-timestamp'] = \ + str(backup_opt_dict.time_stamp) + manifest_meta_dict['x-object-meta-providers-list'] = 'HP' + manifest_meta_dict['x-object-meta-tar-meta-obj-name'] = \ + meta_data_backup_file + tar_meta_to_upload = tar_meta_prev = \ + manifest_meta_dict['x-object-meta-tar-meta-obj-name'] = \ + manifest_meta_dict['x-object-meta-tar-prev-meta-obj-name'] + + # Need to be processed from the last existing backup file found + # in Swift, matching with hostname and backup name + # the last existing file can be extracted from the timestamp + manifest_meta_dict['x-object-meta-container-segments'] = \ + backup_opt_dict.container_segments + + # Set the restart_always_backup value to n days. According + # to the following option, when the always_backup_level is set + # the backup will be reset to level 0 if the current backup + # times tamp is older then the days in x-object-meta-container-segments + manifest_meta_dict['x-object-meta-restart-always-backup'] = '' + if backup_opt_dict.restart_always_backup is not False: + manifest_meta_dict['x-object-meta-restart-always-backup'] = \ + backup_opt_dict.restart_always_backup + + return ( + backup_opt_dict, manifest_meta_dict, + tar_meta_to_upload, tar_meta_prev) + + +def validate_all_args(required_list): + ''' + Ensure ALL the elements of required_list are True. raise ValueError + Exception otherwise + ''' + + try: + for element in required_list: + if element is False or not element: + return False + except Exception as error: + logging.critical("[*] Error: {0} please provide ALL of the following \ + arguments: {1}".format(error, ' '.join(required_list))) + raise Exception + + return True + + +def validate_any_args(required_list): + ''' + Ensure ANY of the elements of required_list are True. raise ValueError + Exception otherwise + ''' + + try: + for element in required_list: + if element: + return True + except Exception: + logging.critical("[*] Error: please provide ANY of the following \ + arguments: {0}".format(' '.join(required_list))) + raise Exception + + return False + + +def sort_backup_list(backup_opt_dict): + ''' + Sort the backups by timestamp. 
The provided list contains strings in the + format hostname_backupname_timestamp_level + ''' + + # Remove duplicates objects + sorted_backups_list = list(set(backup_opt_dict.remote_match_backup)) + sorted_backups_list.sort(key=lambda x: x.split('_')[2], reverse=True) + return sorted_backups_list + + +def create_dir(directory): + ''' + Creates a directory if it doesn't exists and write the execution + in the logs + ''' + + try: + if not os.path.isdir(os.path.expanduser(directory)): + logging.warning('[*] Directory {0} does not exists,\ + creating...'.format(os.path.expanduser(directory))) + os.makedirs(os.path.expanduser(directory)) + else: + logging.warning('[*] Directory {0} found!'.format( + os.path.expanduser(directory))) + except Exception as error: + logging.warning('[*] Error while creating directory {0}: {1}\ + '.format(os.path.expanduser(directory, error))) + raise Exception + + +def get_match_backup(backup_opt_dict): + ''' + Return a dictionary containing a list of remote matching backups from + backup_opt_dict.remote_obj_list. + Backup have to exactly match against backup name and hostname of the + node where freezer is executed. The matching objects are stored and + available in backup_opt_dict.remote_match_backup + ''' + + if not backup_opt_dict.backup_name or not backup_opt_dict.container \ + or not backup_opt_dict.remote_obj_list: + logging.critical("[*] Error: please provide a valid Swift container,\ + backup name and the container contents") + raise Exception + + backup_name = backup_opt_dict.backup_name.lower() + if backup_opt_dict.remote_obj_list: + hostname = backup_opt_dict.hostname + for container_object in backup_opt_dict.remote_obj_list: + object_name = container_object.get('name', None) + if object_name: + obj_name_match = re.search(r'{0}_({1})_\d+?_\d+?$'.format( + hostname, backup_name), object_name.lower(), re.I) + if obj_name_match: + backup_opt_dict.remote_match_backup.append( + object_name) + backup_opt_dict.remote_objects.append(container_object) + + return backup_opt_dict + + +def get_newest_backup(backup_opt_dict): + ''' + Return from backup_opt_dict.remote_match_backup, the newest backup + matching the provided backup name and hostname of the node where + freezer is executed. It correspond to the previous backup executed. + ''' + + if not backup_opt_dict.remote_match_backup: + return backup_opt_dict + + backup_timestamp = 0 + hostname = backup_opt_dict.hostname + # Sort remote backup list using timestamp in reverse order, + # that is from the newest to the oldest executed backup + sorted_backups_list = sort_backup_list(backup_opt_dict) + for remote_obj in sorted_backups_list: + obj_name_match = re.search(r'^{0}_({1})_(\d+)_\d+?$'.format( + hostname, backup_opt_dict.backup_name), remote_obj, re.I) + if not obj_name_match: + continue + remote_obj_timestamp = int(obj_name_match.group(2)) + if remote_obj_timestamp > backup_timestamp: + backup_timestamp = remote_obj_timestamp + backup_opt_dict.remote_newest_backup = remote_obj + break + + return backup_opt_dict + + +def get_rel_oldest_backup(backup_opt_dict): + ''' + Return from swift, the relative oldest backup matching the provided + backup name and hostname of the node where freezer is executed. + The relative oldest backup correspond the oldest backup from the + last level 0 backup. 
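+ For example (illustrative object names), given host_foo_100_0,
+ host_foo_200_1, host_foo_300_0 and host_foo_400_1, the relative oldest
+ backup is host_foo_300_0: the level 0 object that starts the most recent
+ backup chain.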
+ ''' + + if not backup_opt_dict.backup_name: + logging.critical("[*] Error: please provide a valid backup name in \ + backup_opt_dict.backup_name") + raise Exception + + backup_opt_dict.remote_rel_oldest = u'' + backup_name = backup_opt_dict.backup_name + hostname = backup_opt_dict.hostname + first_backup_name = False + first_backup_ts = 0 + for container_object in backup_opt_dict.remote_obj_list: + object_name = container_object.get('name', None) + if not object_name: + continue + obj_name_match = re.search(r'{0}_({1})_(\d+)_(\d+?)$'.format( + hostname, backup_name), object_name, re.I) + if not obj_name_match: + continue + remote_obj_timestamp = int(obj_name_match.group(2)) + remote_obj_level = int(obj_name_match.group(3)) + if remote_obj_level == 0 and (remote_obj_timestamp > first_backup_ts): + first_backup_name = object_name + first_backup_ts = remote_obj_timestamp + + backup_opt_dict.remote_rel_oldest = first_backup_name + return backup_opt_dict + + +def get_abs_oldest_backup(backup_opt_dict): + ''' + Return from swift, the absolute oldest backup matching the provided + backup name and hostname of the node where freezer is executed. + The absolute oldest backup correspond the oldest available level 0 backup. + ''' + if not backup_opt_dict.backup_name: + + logging.critical("[*] Error: please provide a valid backup name in \ + backup_opt_dict.backup_name") + raise Exception + + backup_opt_dict.remote_abs_oldest = u'' + if len(backup_opt_dict.remote_match_backup) == 0: + return backup_opt_dict + + backup_timestamp = 0 + hostname = backup_opt_dict.hostname + for remote_obj in backup_opt_dict.remote_match_backup: + object_name = remote_obj.get('name', None) + obj_name_match = re.search(r'{0}_({1})_(\d+)_(\d+?)$'.format( + hostname, backup_opt_dict.backup_name), remote_obj, re.I) + if not obj_name_match: + continue + remote_obj_timestamp = int(obj_name_match.group(2)) + if backup_timestamp == 0: + backup_timestamp = remote_obj_timestamp + + if remote_obj_timestamp <= backup_timestamp: + backup_timestamp = remote_obj_timestamp + backup_opt_dict.remote_abs_oldest = object_name + + return backup_opt_dict + + +def eval_restart_backup(backup_opt_dict): + ''' + Restart backup level if the first backup execute with always_backup_level + is older then restart_always_backup + ''' + + if not backup_opt_dict.restart_always_backup: + logging.info('[*] No need to set Backup {0} to level 0.'.format( + backup_opt_dict.backup_name)) + return False + + logging.info('[*] Checking always backup level timestamp...') + # Compute the amount of seconds to be compared with + # the remote backup timestamp + max_time = int(float(backup_opt_dict.restart_always_backup) * 86400) + current_timestamp = backup_opt_dict.time_stamp + backup_name = backup_opt_dict.backup_name + hostname = backup_opt_dict.hostname + first_backup_ts = 0 + # Get relative oldest backup by calling get_rel_oldes_backup() + backup_opt_dict = get_rel_oldest_backup(backup_opt_dict) + if not backup_opt_dict.remote_rel_oldest: + logging.info('[*] Relative oldest backup for backup name {0} on \ + host {1} not available. 
The backup level is NOT restarted'.format( + backup_name, hostname)) + return False + + obj_name_match = re.search(r'{0}_({1})_(\d+)_(\d+?)$'.format( + hostname, backup_name), backup_opt_dict.remote_rel_oldest, re.I) + if not obj_name_match: + logging.info('[*] No backup match available for backup {0} \ + and host {1}'.format(backup_name, hostname)) + return Exception + + first_backup_ts = int(obj_name_match.group(2)) + if (current_timestamp - first_backup_ts) > max_time: + logging.info( + '[*] Backup {0} older then {1} days. Backup level set to 0'.format( + backup_name, backup_opt_dict.restart_always_backup)) + + return True + else: + logging.info('[*] No need to set level 0 for Backup {0}.'.format( + backup_name)) + + return False + + +def start_time(): + ''' + Compute start execution time, write it in the logs and return timestamp + ''' + + fmt = '%Y-%m-%d %H:%M:%S' + today_start = datetime.datetime.now() + time_stamp = int(time.mktime(today_start.timetuple())) + fmt_date_start = today_start.strftime(fmt) + logging.info('[*] Execution Started at: {0}'.format(fmt_date_start)) + return time_stamp, today_start + + +def elapsed_time(today_start): + ''' + Compute elapsed time from today_start and write basic stats + in the log file + ''' + + fmt = '%Y-%m-%d %H:%M:%S' + today_finish = datetime.datetime.now() + fmt_date_finish = today_finish.strftime(fmt) + time_elapsed = today_finish - today_start + # Logging end execution information + logging.info('[*] Execution Finished, at: {0}'.format(fmt_date_finish)) + logging.info('[*] Time Elapsed: {0}'.format(time_elapsed)) + + +def set_backup_level(backup_opt_dict, manifest_meta_dict): + ''' + Set the backup level params in backup_opt_dict and the swift + manifest. This is a fundamental part of the incremental backup + ''' + + if manifest_meta_dict.get('x-object-meta-backup-name'): + backup_opt_dict.curr_backup_level = int( + manifest_meta_dict.get('x-object-meta-backup-current-level')) + max_backup_level = manifest_meta_dict.get( + 'x-object-meta-maximum-backup-level') + always_backup_level = manifest_meta_dict.get( + 'x-object-meta-always-backup-level') + restart_always_backup = manifest_meta_dict.get( + 'x-object-meta-restart-always-backup') + if max_backup_level: + max_backup_level = int(max_backup_level) + if backup_opt_dict.curr_backup_level < max_backup_level: + backup_opt_dict.curr_backup_level += 1 + manifest_meta_dict['x-object-meta-backup-current-level'] = \ + str(backup_opt_dict.curr_backup_level) + else: + manifest_meta_dict['x-object-meta-backup-current-level'] = \ + backup_opt_dict.curr_backup_level = '0' + elif always_backup_level: + always_backup_level = int(always_backup_level) + if backup_opt_dict.curr_backup_level < always_backup_level: + backup_opt_dict.curr_backup_level += 1 + manifest_meta_dict['x-object-meta-backup-current-level'] = \ + str(backup_opt_dict.curr_backup_level) + # If restart_always_backup is set, the backup_age will be computed + # and if the backup age in days is >= restart_always_backup, then + # backup-current-level will be set to 0 + if restart_always_backup: + backup_opt_dict.restart_always_backup = restart_always_backup + if eval_restart_backup(backup_opt_dict): + backup_opt_dict.curr_backup_level = '0' + manifest_meta_dict['x-object-meta-backup-current-level'] \ + = '0' + else: + backup_opt_dict.curr_backup_level = \ + manifest_meta_dict['x-object-meta-backup-current-level'] = '0' + + return backup_opt_dict, manifest_meta_dict + + +def get_vol_fs_type(backup_opt_dict): + ''' + The argument need to be a 
full path lvm name i.e. /dev/vg0/var + or a disk partition like /dev/sda1. The returnet value is the + file system type + ''' + + vol_name = backup_opt_dict.lvm_srcvol + if os.path.exists(vol_name) is False: + logging.critical('[*] Provided volume name not found: {0} \ + '.format(vol_name)) + raise Exception + + file_cmd = '{0} -0 -bLs --no-pad --no-buffer --preserve-date \ + {1}'.format(backup_opt_dict.file_path, vol_name) + file_process = subprocess.Popen( + file_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, + stderr=subprocess.PIPE, shell=True, + executable=backup_opt_dict.bash_path) + (file_out, file_err) = file_process.communicate() + file_match = re.search(r'(\S+?) filesystem data', file_out, re.I) + if file_match is None: + logging.critical('[*] File system type not guessable: {0}\ + '.format(file_err)) + raise Exception + else: + filesys_type = file_match.group(1) + logging.info('[*] File system {0} found for volume {1}'.format( + filesys_type, vol_name)) + return filesys_type.lower().strip() + + raise Exception + + +def check_backup_existance(backup_opt_dict): + ''' + Check if any backup is already available on Swift. + The verification is done by backup_name, which needs to be unique + in Swift. This function will return an empty dict if no backup are + found or the Manifest metadata if the backup_name is available + ''' + + if not backup_opt_dict.backup_name or not backup_opt_dict.container or \ + not backup_opt_dict.remote_obj_list: + logging.warning("[*] A valid Swift container,\ + or backup name or container content not available. \ + Level 0 backup is being executed ") + return dict() + + logging.info("[*] Retreiving backup name {0} on container \ + {1}".format( + backup_opt_dict.backup_name.lower(), backup_opt_dict.container)) + backup_opt_dict = get_match_backup(backup_opt_dict) + backup_opt_dict = get_newest_backup(backup_opt_dict) + + if backup_opt_dict.remote_newest_backup: + sw_connector = backup_opt_dict.sw_connector + logging.info("[*] Backup {0} found!".format( + backup_opt_dict.backup_name)) + backup_match = sw_connector.head_object( + backup_opt_dict.container, backup_opt_dict.remote_newest_backup) + + return backup_match + else: + logging.warning("[*] No such backup {0} available... Executing \ + level 0 backup".format(backup_opt_dict.backup_name)) + return dict() + + +def add_host_name_ts_level(backup_opt_dict, time_stamp=int(time.time())): + ''' + Create the object name as: + hostname_backupname_timestamp_backup_level + ''' + + if backup_opt_dict.backup_name is False: + logging.critical('[*] Error: Please specify the backup name with\ + --backup-name option') + raise Exception + + backup_name = u'{0}_{1}_{2}_{3}'.format( + backup_opt_dict.hostname, + backup_opt_dict.backup_name, + time_stamp, backup_opt_dict.curr_backup_level) + + return backup_name diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 00000000..790c2738 --- /dev/null +++ b/requirements.txt @@ -0,0 +1,7 @@ +python-swiftclient>=2.0.3 +python-keystoneclient>=0.8.0 +argparse>=1.2.1 +docutils>=0.8.1 + +[testing] +pytest \ No newline at end of file diff --git a/setup.py b/setup.py new file mode 100644 index 00000000..2b0f6b22 --- /dev/null +++ b/setup.py @@ -0,0 +1,79 @@ +''' +Freezer Setup Python file. 
+''' +from setuptools import setup, find_packages +from setuptools.command.test import test as TestCommand +import freezer +import os + +here = os.path.abspath(os.path.dirname(__file__)) + +class PyTest(TestCommand): + def finalize_options(self): + TestCommand.finalize_options(self) + self.test_args = [] + self.test_suite = True + + def run_tests(self): + import pytest + errcode = pytest.main(self.test_args) + sys.exit(errcode) + + +def read(*filenames, **kwargs): + encoding = kwargs.get('encoding', 'utf-8') + sep = kwargs.get('sep', '\n') + buf = [] + for filename in filenames: + with io.open(filename, encoding=encoding) as f: + buf.append(f.read()) + return sep.join(buf) + + +setup( + name='freezer', + version='1.0.8', + url='http://sourceforge.net/projects/openstack-freezer/', + license='Apache Software License', + author='Fausto Marzi, Ryszard Chojnacki, Emil Dimitrov', + author_email='fausto.marzi@hp.com, ryszard@hp.com, edimitrov@hp.com', + maintainer='Fausto Marzi, Ryszard Chojnacki, Emil Dimitrov', + maintainer_email='fausto.marzi@hp.com, ryszard@hp.com, edimitrov@hp.com', + tests_require=['pytest'], + description='''OpenStack incremental backup and restore automation tool for file system, MongoDB, MySQL. LVM snapshot and encryption support''', + long_description=open('README.rst').read(), + keywords="OpenStack Swift backup restore mongodb mysql lvm snapshot", + packages=find_packages(), + platforms='Linux, *BSD, OSX', + test_suite='freezer.test.test_freezer', + cmdclass={'test': PyTest}, + scripts=['bin/freezerc'], + classifiers=[ + 'Programming Language :: Python', + 'Development Status :: 5 - Production/Stable', + 'Natural Language :: English', + 'Environment :: OpenStack', + 'Intended Audience :: Developers', + 'Intended Audience :: Financial and Insurance Industry', + 'Intended Audience :: Information Technology', + 'Intended Audience :: System Administrators', + 'Intended Audience :: Telecommunications Industry', + 'License :: OSI Approved :: Apache Software License', + 'Operating System :: MacOS', + 'Operating System :: POSIX :: BSD :: FreeBSD', + 'Operating System :: POSIX :: BSD :: NetBSD', + 'Operating System :: POSIX :: BSD :: OpenBSD', + 'Operating System :: POSIX :: Linux', + 'Operating System :: Unix', + 'Topic :: System :: Archiving :: Backup', + 'Topic :: System :: Archiving :: Compression', + 'Topic :: System :: Archiving', + ], + install_requires=[ + 'python-swiftclient>=2.0.3', + 'python-keystoneclient>=0.8.0', + 'argparse>=1.2.1'], + extras_require={ + 'testing': ['pytest'], + } +)
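Throughout swift.py, restore.py and utils.py backups are located by parsing
the object names generated by add_host_name_ts_level(), which follow the
layout hostname_backupname_timestamp_level. A minimal sketch of that
convention (the object name and the simplified pattern below are
illustrative; the source interpolates the real hostname and backup name)::

    import re

    # Name layout produced by add_host_name_ts_level():
    #   <hostname>_<backup_name>_<timestamp>_<level>
    object_name = 'node1_mybackup_1406120722_0'

    # Simplified version of the match performed by the restore/cleanup code
    match = re.search(r'(\S+?)_(\S+?)_(\d+)_(\d+)$', object_name)
    if match:
        hostname, backup_name, timestamp, level = match.groups()
        print hostname, backup_name, timestamp, level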