Tuesday, November 12, 2013

Getting Your Backup and Recovery Process On Foils - Part 2

In the first part of this series of blog posts, I described how I think snapshots are misunderstood and, quite frankly, abused. This abuse can ultimately make you look like a fool given the right disaster scenario. In this second part, I want to dive a little deeper into the bits and bytes with an example of how snaps can actually be implemented to make you a backup hero!

Let's take a look at the data flow of what I am alluding to here. This is a visual example of how you can leverage a Backup and Recovery architecture (in this case using EMC NetWorker Snapshot Management together with the NetWorker Module for SAP) to truly offload the Backup and Recovery burden from your SAP with Oracle landscape. The best part is that, aside from automating the snap orchestration, the entire end-to-end process is designed to take advantage of Protection Storage with Deduplication, which is more cost-efficient than using production capacity for versioning and longer-term data retention.

Let's look at this process flow step-by-step:
  1. Starting at the production host, with the NetWorker Module for SAP and NetWorker Snapshot Management installed, interfacing with the application is done seamlessly through SAP BR*Tools' brbackup mechanism. It is up to brbackup (or any application's native backup utility) to put the data into a consistent state so that an application-consistent copy can be created. In this same step, brbackup makes calls directly into NetWorker's backint process, providing NetWorker with a detailed list of all the data structures for backup processing.
  2. NetWorker Snapshot Management interfaces directly with the array (and, depending on the use case, with certain types of disaster recovery solutions such as EMC RecoverPoint) to orchestrate the actual snap creation of the production volumes that correspond to the list of data structures received from SAP itself. The snaps can be local snaps on the same array subsystem, or replicated clones on a remote array, as I have highlighted here.
  3. The interaction with the production host (servicing our application to our end users) is now done, and the host is left undisturbed. The processing shifts to a designated proxy/mount host, which automatically mounts the replica or snap of the production data structures and is tasked with processing the full, deduplicated backup transfer to protection storage.
    In this particular case, we are leveraging advanced NetWorker integration with EMC Protection Storage based on Data Domain via the DD Boost transport over an IP or Fibre Channel network. Keep in mind, however, that you could leverage any storage that NetWorker supports as a backend storage device; you just won't get all the DD Boost goodies. :)
  4. With the backup process complete on our proxy host, the NetWorker Snapshot Management module communicates with its counterpart over on the production host. This essentially starts the "success" reporting chain back up the stack through NMSAP's backint and finally back to brbackup, completing the backup cycle with the application.
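
To make the division of labor in the four steps above a bit more tangible, here is a small Python sketch that simply records which host does what during one backup cycle. It is a toy model for illustration only: the names `BackupRun`, `run_backup` and `production_events` are hypothetical, and nothing here calls brbackup, backint, NetWorker or an array. It only mirrors the order of the steps as I described them.

```python
from dataclasses import dataclass, field


@dataclass
class BackupRun:
    """Hypothetical record of one snapshot-offloaded backup cycle."""
    data_files: list
    events: list = field(default_factory=list)

    def log(self, host, action):
        # Each event is a (host, action) pair, in the order it happened.
        self.events.append((host, action))


def run_backup(data_files):
    run = BackupRun(data_files=list(data_files))

    # Step 1: the application's backup utility prepares a consistent state
    # and hands the list of data structures to the backup software.
    run.log("production", "application prepares consistent state")
    run.log("production", f"file list handed over ({len(run.data_files)} files)")

    # Step 2: a snap of the production volumes is created on the array.
    run.log("array", "snapshot created")

    # Step 3: the proxy/mount host mounts the snap and moves the data.
    run.log("proxy", "snapshot mounted")
    for name in run.data_files:
        run.log("proxy", f"backed up {name}")

    # Step 4: success travels back up the stack to the application.
    run.log("proxy", "backup complete")
    run.log("production", "success reported to application")
    return run


def production_events(run):
    """Count how many events touched the production host."""
    return sum(1 for host, _ in run.events if host == "production")


if __name__ == "__main__":
    run = run_backup(["sapdata1.dbf", "sapdata2.dbf", "sapdata3.dbf"])
    for host, action in run.events:
        print(f"{host:<10} {action}")
    print("events on production host:", production_events(run))
```

Notice that in this model the number of production-side events stays the same no matter how many files are backed up; the per-file work lands on the proxy host, which is the whole point of the design.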

Given the integrated and coordinated nature of NetWorker managing this entire process with the application, NetWorker can also manage snapshot retention, so you can truly gain the flexibility in RPO and RTO I mentioned in the third bullet item in my previous post.
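
As a rough illustration of what a snapshot retention rule boils down to, the Python sketch below picks out the snaps that have aged past a retention window while always keeping the newest one. This is my own simplified assumption of a time-based policy, not NetWorker's actual retention logic, and the function name `expired_snapshots` is hypothetical.

```python
from datetime import datetime, timedelta


def expired_snapshots(snapshot_times, now, keep_for):
    """Return snapshot timestamps older than the retention window.

    The newest snapshot is always kept, even if it is past the window,
    so there is always at least one local recovery point.
    """
    if not snapshot_times:
        return []
    newest = max(snapshot_times)
    cutoff = now - keep_for
    return sorted(t for t in snapshot_times if t < cutoff and t != newest)


if __name__ == "__main__":
    now = datetime(2013, 11, 12, 12, 0)
    snaps = [now - timedelta(hours=h) for h in (1, 5, 30)]
    for t in expired_snapshots(snaps, now, keep_for=timedelta(hours=24)):
        print("expire:", t.isoformat())
```

The longer the window, the more local recovery points you keep on the array for fast restores, at the cost of more production capacity tied up in snaps.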

The key with this approach is that we minimized the amount of processing and "tasks" added to the production host, and leveraged the full potential of our storage investment to move backups to the next level. While this was going on, your application end users didn't even notice! Trust me, when there is a disaster, they will notice just how quickly you can get them back up and running again. It's only a matter of time.

Going back to the sailing analogy in my first post, the main goal here is to design a backup and recovery solution that, like a regatta crew, does everything possible to make the boat go faster, not slow it down. A backup solution that not only fully protects your mission-critical application data, but does so in a manner that allows you to enhance your application SLAs (gaining boat speed, if you will), is certainly advantageous.