Friday, August 16, 2013

NeverFail: Disaster Recovery shouldn’t run on late nights and coffee alone (Webinar)

Summary of Webinar presented by Neverfail: Speaker Josh Mazgelis

This webinar covered the following three topics, plus a preview of a new Neverfail product:

  1. Dependency Mapping
  2. DR planning
  3. Flavors of protection
  4. Neverfail product

  1. Mapping is important
    1. Business view
      1. Managers only see the Business needs (Services)
      2. There is little to no connection between Business needs and Infrastructure Support
      3. Don't understand that Infrastructure Support is like insurance; only see money vanishing
    2. How IT sees it
      1. Inventory of servers, hosts, storage
      2. Don't understand the Business needs, only the need to keep everything running
      3. May or may not understand which services are important/critical
      4. Very difficult to know if Business Continuity / Disaster Recovery (BC/DR) is sufficient
      5. Difficult to justify money for better BC/DR
    3. Results
      1. IT staff spends more time determining how things work
        1. late nights figuring out how things work
      2. IT staff spends more time reacting/repairing when services break
      3. Business wants to know why it took so long to restore/recover
      4. Business wants to know why data is missing from restore
  2. DR Planning
    1. Planning originates from Business needs
      1. Business owners need to identify key services
      2. Business owners need to define target SLA
        1. Not IT
    2. How to do this 
      1. Start with what services are important/critical
      2. Identify components that keep these services running
      3. Identify the dependencies that support components/services
    3. Build DR to support Business services
      1. Easier to justify spending for application dependencies
      2. Reduce spending on extraneous infrastructure
        • By determining the SLA levels needed (typically 2-3 for a company), you can break services down by level and adjust resources to provide the needed protection
  3. Understanding challenges
    1. Off-target Recovery Plans
      1. With virtualization it became easy to spread basic protection across the Virtual Infrastructure
        1. May not meet Recovery Point Objective (RPO) / Recovery Time Objective (RTO)
        2. May overprotect smaller systems, thus wasting resources
        3. Hard to balance funds against protection
      2. Spot solutions
        1. Don't provide complete protection
          • May protect the Oracle database extremely well but not the SharePoint application that allows retrieval and input
            • Database availability means nothing if there is no application to use it
    2. Keeping up with changes
      1. Virtualization accelerated the pace of change
        1. New servers come online within minutes to hours versus days to weeks
        2. Old servers are abandoned and forgotten more easily because they are not physical; without monitoring the virtual environment, it is easy to keep feeding vital resources to "zombie" systems
        3. VMs easily move between hosts and protection schemes
          • One host may be connected to a SAN where another is not
      2. Rogue IT ushers in undocumented changes
        1. Business units create "Ghost IT" infrastructure
          • Mac mini with server applications
        2. Cloud services complement or replace internal resources
          1. Dropbox
      3. There's just a whole lot going on
        1. It is difficult enough keeping up with projects and issues without worrying about BC/DR updates
    3. Knowing where you stand
      1. DR plan is the last thing considered
        1. Doing more with less truly means there is more that doesn't get done
        2. BC/DR plans are rarely updated as changes are made
      2. Even if you have a plan, it is hard to know its status
        1. BC/DR consultants could be used, but they are not always thorough enough to catch everything
        2. Test plans on a schedule (not during an emergency)
          • Monthly/Quarterly/Semi-Annually/something
    4. Recovery Tool Taste Testing
      1. Basic backup & recovery
      2. Replication of VM images/stores
      3. Traditional server cluster
      4. Replication with stand-by & failover
    5. Every blend has its own characteristics
      1. Different RTO/RPO
      2. Protection from different kinds of failures
      3. Widely varying cost-to-protection ratio
    6. Delicate Balance
      1. SLA 
        1. Business wants to meet certain SLAs
        2. Increasing # of threats to business continuity
        3. Everyone's stuff is important, so it is hard to prioritize
      2. Budgetary constraints
        1. Face It: Protection costs money, time, resources
        2. Hard to justify expense
        3. Not everything is going to get unlimited protection
    7. Building a better Coffee Maker
      1. Understand business needs
        1. Reference actual business needs and requirements
        2. Estimate application cost of downtime
          • Easier to justify funding when you know the cost
        3. Fully map out service dependencies
          • Ensure that a small server/service running on another server is not missed; losing it could break the entire service
      2. Apply protection appropriately
        1. Good, Fast, Cheap (Pick any two)
          • Never going to find a perfect solution
      3. Monitor results
        1. Regular testing
        2. Develop automation
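
The "understand business needs, then apply protection appropriately" steps above can be sketched in a few lines of Python. This is only an illustration and not something shown in the webinar: the class names, the three protection options, and every number (RTO/RPO hours, costs) are made-up assumptions. The idea is to let the business define RTO/RPO targets and a cost of downtime per service, then pick the cheapest protection option that still meets the targets.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    rto_hours: float               # target: longest tolerable time to restore
    rpo_hours: float               # target: largest tolerable window of lost data
    downtime_cost_per_hour: float  # business estimate (illustrative figure)


@dataclass
class Protection:
    name: str
    rto_hours: float    # restore time this method is assumed to achieve
    rpo_hours: float    # data-loss window this method is assumed to leave
    annual_cost: float  # illustrative figure


def meets_targets(service, protection):
    """True if the method restores fast enough and loses little enough data."""
    return (protection.rto_hours <= service.rto_hours
            and protection.rpo_hours <= service.rpo_hours)


def cheapest_fit(service, options):
    """Cheapest option that meets the service's targets, or None if none does."""
    fits = [p for p in options if meets_targets(service, p)]
    return min(fits, key=lambda p: p.annual_cost) if fits else None


def outage_cost(service, protection):
    """Rough cost of one outage: hours to restore times cost per hour."""
    return protection.rto_hours * service.downtime_cost_per_hour


# Hypothetical protection "flavors" with invented capabilities and costs.
OPTIONS = [
    Protection("backup", rto_hours=24, rpo_hours=24, annual_cost=1_000),
    Protection("vm_replication", rto_hours=4, rpo_hours=1, annual_cost=5_000),
    Protection("failover", rto_hours=0.1, rpo_hours=0, annual_cost=20_000),
]

# Hypothetical services with business-defined targets.
order_entry = Service("order_entry", rto_hours=1, rpo_hours=0.25,
                      downtime_cost_per_hour=10_000)
intranet_wiki = Service("intranet_wiki", rto_hours=48, rpo_hours=24,
                        downtime_cost_per_hour=50)

if __name__ == "__main__":
    for svc in (order_entry, intranet_wiki):
        choice = cheapest_fit(svc, OPTIONS)
        print(svc.name, "->", choice.name if choice else "no option meets targets")
```

With these invented numbers the critical service only fits the most expensive option, while the low-priority one is covered by basic backup. That is the "not everything is going to get unlimited protection" point: spending follows the targets rather than being spread evenly.
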
Neverfail IT Continuity Architect
  1. Dashboard
    1. Progress and Summary reports
      1. Automatically inventories and analyzes IT infrastructure
      2. Summarizes availability and likelihood to meet SLAs
    2. Multiple Heatmap views of Inventory
      1. Color-coded by analysis state, protection, or tier ranking
      2. Bigger boxes indicate more dependent entities
    3. Create and define Biz services (SLA)
      1. combine dependent entities
    4. Dependency Graphs
      1. view inbound and outbound dependencies for any entity
  2. Learn More
    1. IT Continuity Architect tech preview
    2. IT Continuity Architect introduction video
    3. IT Continuity Architect - Discovery and Dependencies
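
The dependency-mapping idea that runs through these notes (services, the components behind them, and inbound/outbound dependencies) can be sketched as a small graph walk in Python. This is a minimal sketch and is not how the Neverfail product works. The SharePoint-on-Oracle pairing comes from the notes above; the other entity names and the `DEPENDS_ON` table are hypothetical.

```python
from collections import deque

# Hypothetical inventory: each entity maps to the entities it depends on directly.
DEPENDS_ON = {
    "sharepoint_portal": ["sharepoint_app", "dns"],
    "sharepoint_app": ["oracle_db", "auth_server"],
    "oracle_db": ["san_storage"],
    "auth_server": ["dns"],
    "dns": [],
    "san_storage": [],
}


def outbound(entity, graph):
    """Everything `entity` needs, directly or indirectly (breadth-first walk)."""
    seen = set()
    queue = deque(graph.get(entity, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited; also keeps cycles from looping forever
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen


def inbound(entity, graph):
    """Everything that would be affected if `entity` failed."""
    return {other for other in graph if entity in outbound(other, graph)}


if __name__ == "__main__":
    print("portal needs:", sorted(outbound("sharepoint_portal", DEPENDS_ON)))
    print("dns failure affects:", sorted(inbound("dns", DEPENDS_ON)))
```

The outbound view answers "what must be protected for this business service to survive", and the inbound view answers "what breaks if this small server goes down", which is how an easily overlooked dependency such as DNS shows up.
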
