Paxata Backup Basics
Overview
Three components need to be backed up to guard against data loss on the running servers:
- Metadata Storage (MongoDB)
- Data Library Storage (HDFS)
- Properties Files (particularly pes.properties)
Basic Tools
Many tools can back up each component. The ones recommended here are the most basic tools that can complete the backup task on their own. More advanced tools may offer better reliability and manageability.
Metadata Storage (MongoDB)
mongodump --out /tmp/mongobackup_`date +"%m-%d-%y"`
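The command above can be wrapped in a small script for repeated use. The following is a minimal sketch, not a Paxata-provided tool: the names BACKUP_ROOT, DRY_RUN, mongo_backup_dir and run_mongo_backup are hypothetical, and it uses a year-first date instead of the %m-%d-%y format above so that directory names sort chronologically.

```shell
#!/bin/sh
# Sketch: dated MongoDB dump. BACKUP_ROOT and DRY_RUN are hypothetical
# settings introduced for this example, not Paxata or MongoDB options.
BACKUP_ROOT="${BACKUP_ROOT:-/tmp}"

# Print the dump directory for a given date (defaults to today).
mongo_backup_dir() {
    stamp="${1:-$(date +%Y-%m-%d)}"
    printf '%s/mongobackup_%s\n' "$BACKUP_ROOT" "$stamp"
}

# Run mongodump into the dated directory, or only print the command
# when DRY_RUN=1.
run_mongo_backup() {
    out_dir="$(mongo_backup_dir "${1:-}")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "mongodump --out $out_dir"
    else
        mongodump --out "$out_dir"
    fi
}
```

Calling run_mongo_backup with no argument dumps into a directory named for the current date. Setting DRY_RUN=1 first prints the command without running it, which is useful for checking the target path.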
Data Library Storage (HDFS)
DistCp copies a directory from HDFS to another cluster or to an S3 bucket.
hadoop distcp hdfs://CDH5-nameservice/user/paxata/library s3a://bucket/librarybackup
Cloudera BDR is an enterprise solution based on DistCp.
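For repeated backups, DistCp's -update option copies only files that are missing or different at the destination. The following is a minimal sketch under assumptions: the source and destination defaults are the placeholder values from the example above, and SRC, DEST, DRY_RUN and run_library_backup are hypothetical names.

```shell
#!/bin/sh
# Sketch: incremental copy of the Data Library with distcp.
# SRC and DEST default to the placeholder paths from the example above;
# DRY_RUN is a hypothetical switch for this script only.
SRC="${SRC:-hdfs://CDH5-nameservice/user/paxata/library}"
DEST="${DEST:-s3a://bucket/librarybackup}"

run_library_backup() {
    # With -update, the contents of SRC are copied into DEST, and files
    # that already match at the destination are skipped.
    set -- hadoop distcp -update "$SRC" "$DEST"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Note that -update changes the layout at the destination compared with a plain copy into an existing directory, so it is worth checking the first run with DRY_RUN=1 and a test bucket.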
Properties Files (particularly pes.properties)
Upload the files from the server's local file system to an S3 bucket:
cd /usr/local/paxata/server/config
aws s3 sync . s3://bucket/propertybackup
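The same upload can be done without changing directory by passing the config directory as the sync source. The following is a minimal sketch: CONFIG_DIR, S3_DEST, DRY_RUN and run_properties_backup are hypothetical names, and the bucket path is the placeholder from the example above.

```shell
#!/bin/sh
# Sketch: sync the Paxata config directory to S3.
# CONFIG_DIR, S3_DEST and DRY_RUN are hypothetical settings for this script.
CONFIG_DIR="${CONFIG_DIR:-/usr/local/paxata/server/config}"
S3_DEST="${S3_DEST:-s3://bucket/propertybackup}"

run_properties_backup() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "aws s3 sync $CONFIG_DIR $S3_DEST"
        return 0
    fi
    # Stop early if the config directory is missing, before calling aws.
    if [ ! -d "$CONFIG_DIR" ]; then
        echo "config directory not found: $CONFIG_DIR" >&2
        return 1
    fi
    aws s3 sync "$CONFIG_DIR" "$S3_DEST"
}
```

The directory check means a mistyped path fails with a clear message instead of an error from the AWS CLI.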