The flashcards below were created by user Anonymous on FreezingBlue Flashcards.

  1. What is Data?
    • Data is a collection of raw facts from which conclusions can be drawn.
    • Letters, photographs, movies, word documents, etc. are all examples of data.
  2. What are the two categories of data?
    Structured and Unstructured
  3. Describe Structured Data
    Structured data is organized in rows and columns in a rigidly defined format so that applications can retrieve and process it efficiently.
  4. Describe Unstructured Data
    • Data is unstructured if its elements cannot be stored in rows and columns, and is therefore difficult to query and retrieve by business applications.
    • Examples of unstructured data are images, PDFs, documents, audio / video, email attachments, x-rays, etc.
  5. Define Information
    • Information is the intelligence and knowledge derived from data.
    • Examples of intelligence could be the buying habits of customers and the health histories of patients.
  6. What is the value of information to a business?
    • Identifying new business opportunities.
    • Identifying patterns that lead to changes in existing business.
    • Creating a competitive advantage.
  7. How is the type of storage to be used determined?
    The type of storage used is based on the type of data and the rate at which it is created and used.
  8. Describe RAID
    Redundant Array of Independent Disks. RAID is used in all storage architectures such as DAS, SAN and so on.
  9. Describe DAS?
    Direct Attached Storage. Connects directly to the server (host) or a group of servers in a cluster. Storage can be either internal or external to the server. External DAS alleviates the challenges of limited internal storage capacity.
  10. Describe SAN?
    Storage Area Network. This is a dedicated, high performance Fibre Channel (FC) network to facilitate Block Level communication between servers and storage. Storage is partitioned and assigned to a server for accessing its data.
  12. What are the benefits of SAN?
    SAN offers scalability, availability, performance and cost benefits compared to DAS.
  13. Describe NAS
    Network Attached Storage. Dedicated storage for File Serving applications. Connects to an existing communication network (LAN) and provides file access to heterogeneous clients.
  15. What are the benefits of NAS?
    NAS offers higher availability, scalability, performance and cost benefits compared to general purpose file servers.
  16. What is IP SAN?
    Internet Protocol Storage Area Network. One of the latest evolutions in storage architecture. IP SAN is a convergence of technologies used in SAN and NAS. It provides Block Level communication across a LAN or WAN resulting in greater consolidation and availability of data.
  18. What are the five core elements of Data Center Infrastructure?
    • Application / User Interface
    • Database (More commonly referred to as a Database Management System)
    • Server and Operating System
    • Network
    • Storage Array
  19. What are the seven key requirements for data center elements?
    • Performance
    • Availability
    • Scalability
    • Security
    • Data Integrity
    • Capacity
    • Manageability
  20. What are the four activities within the Information Life Cycle Management Process?
    • Classifying data
    • Implementing Policies
    • Managing the Environment
    • Organizing Storage Resources
    • **Classifying Data is the most difficult activity in the process**
  21. What are the benefits of implementing Information Life Cycle Management?
    • Improved Utilization
    • Simplified Management
    • Simplified Backup and Recovery
    • Maintaining Compliance
    • Lower Total Cost of Ownership (TCO)
  22. What are the three most basic components of a storage system environment?
    • Host
    • Connectivity (Network)
    • Storage Array
  23. What are the physical components of a host?
    • CPU
    • Storage
    • Input / Output (I/O) Device
  24. What are the three methods of communication between I/O devices and the host?
    • User to Host (Keyboard, Mouse, etc.)
    • Host to Host (via Network Interface Card)
    • Host to Storage Device (via Host Bus Adapter)
  25. What are the logical components of a host?
    • Applications
    • Operating System
    • File System
    • Volume Manager
    • Device Drivers
    • **Note: Host Bus Adaptors interface on the back end**
  27. What are the two application data access classifications?
    • Block Level (Data stored and retrieved in Blocks specifying the LBA)
    • File Level (Data stored and retrieved by specifying the name and path of the files)
  28. Define Protocol
    A defined Format for communication between sending and receiving devices.
  29. What are the three major communication protocols for system components?
    • Tightly Connected Entities
    • Directly Attached Entities
    • Network Connected Entities
  30. Give three storage media options
    • Magnetic Tape
    • Optical Disks
    • Disk Drives
  31. What are the key components of a disk drive?
    • Platter
    • Spindle
    • Read / Write Head
    • Actuator Arm Assembly
    • Controller
    • **All of these items are housed in the Head Disk Assembly**
  32. What are the two ways of accessing data on a platter?
    • Cylinder, Head, Sector (CHS)
    • Logical Block Addressing (LBA)
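
The two addressing schemes are related by a standard conversion. The following is a minimal sketch, assuming a hypothetical drive geometry (16 heads per cylinder, 63 sectors per track) chosen only for illustration; the function names are not from the flashcards.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Convert a CHS address to an LBA. CHS sectors are numbered from 1."""
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Inverse conversion: recover (cylinder, head, sector) from an LBA."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


# Hypothetical geometry, for illustration only.
HEADS, SECTORS = 16, 63
lba = chs_to_lba(2, 3, 4, HEADS, SECTORS)
print(lba, lba_to_chs(lba, HEADS, SECTORS))
```

With LBA, the host only needs the block number; the drive controller performs the mapping to a physical location.
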
  33. What are the things that affect disk drive performance?
    • Electromechanical Device
    • Disk Service Time
  34. What are the components that comprise service time?
    • Seek Time
    • Rotational Latency
    • Data Transfer Rate
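
The three components above add up to the service time of a single I/O. A small sketch of the arithmetic, assuming average rotational latency is half a rotation and that 1 MB = 1024 KB; the sample numbers (5 ms seek, 15,000 rpm, 32 KB block, 40 MB/s) are illustrative assumptions, not values from the flashcards.

```python
def average_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a rotation, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def transfer_time_ms(block_kb, rate_mb_per_s):
    """Time to transfer one block at the given rate (assumes 1 MB = 1024 KB)."""
    return block_kb / (rate_mb_per_s * 1024.0) * 1000.0


def disk_service_time_ms(seek_ms, rpm, block_kb, rate_mb_per_s):
    """Service time = seek time + rotational latency + data transfer time."""
    return (seek_ms
            + average_rotational_latency_ms(rpm)
            + transfer_time_ms(block_kb, rate_mb_per_s))


# Illustrative drive: 5 ms average seek, 15,000 rpm, 32 KB I/O, 40 MB/s transfer rate.
print(disk_service_time_ms(5.0, 15_000, 32, 40))
```
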
  35. What are the three seek time specifications?
    • Full Stroke
    • Average
    • Track to Track
  36. Define 'Little's Law'
    • It is the relationship between the number of requests in a queue and the response time.
    • N = a × R
    • N = Total number of requests in the system
    • a = The arrival rate
    • R = Average response time
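
A quick worked example of the relationship, using the same symbols (N, a, R). The arrival rate and response time below are made-up numbers for illustration.

```python
def requests_in_system(arrival_rate, avg_response_time):
    """Little's Law: N = a x R."""
    return arrival_rate * avg_response_time


def average_response_time(requests, arrival_rate):
    """Little's Law rearranged: R = N / a."""
    return requests / arrival_rate


# Illustrative values: 100 I/Os arrive per second, each takes 0.02 s on average.
n = requests_in_system(100, 0.02)
print(n)                              # average number of requests in the system
print(average_response_time(n, 100))  # recovers the response time
```
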
  37. What does RAID provide?
    • Increased Capacity
    • Higher Availability
    • Increased Performance
  38. What are the components of a RAID Array?
    • Host
    • RAID Controller
    • RAID Array
    • Physical Array
    • Logical Array
    • Hard Disks
  39. What are the common RAID Levels?
    • 0
    • 1
    • Nested RAID
    • 3
    • 4
    • 5
    • 6
  40. Describe RAID 0
    A striped array with no fault tolerance.
  41. Describe RAID 1
    Disk Mirroring
  42. Describe Nested RAID
    • Combines the benefits of multiple RAID configurations.
    • 0+1: Striping & Mirroring. Commonly Called a Mirrored Stripe. The process of striping across HDDs is performed then the entire stripe is mirrored.
    • 1+0: Mirroring & Striping. Referred to as a Striped Mirror. The incoming data is first mirrored and then both copies of data are striped across multiple HDDs.
  43. Describe RAID Parity
    • Parity is a method of protecting striped data from HDD failure without the cost of mirroring.
    • An additional HDD is added to the stripe width to hold parity.
    • Parity is a mathematical construct that allows re-creation of the missing data.
    • It is a redundancy check that ensures full protection of data without maintaining a full set of duplicate data.
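
The idea of re-creating missing data from parity can be sketched as follows. This assumes bitwise XOR as the parity function, which the flashcards do not name explicitly; the strip contents and function names are illustrative only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_strips):
    """Parity strip for one stripe (assumption: XOR parity)."""
    return xor_blocks(data_strips)


def rebuild_missing(surviving_strips, parity):
    """Re-create the single missing strip from the survivors plus parity."""
    return xor_blocks(list(surviving_strips) + [parity])


# One stripe spread over three data disks, plus one parity strip.
strips = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = compute_parity(strips)

# Simulate losing the second disk and rebuilding its strip.
rebuilt = rebuild_missing([strips[0], strips[2]], parity)
print(rebuilt == strips[1])
```

Only one extra strip per stripe is stored, which is why parity costs less capacity than mirroring.
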
  44. Describe RAID 3
    • Stripes data for high performance and uses parity for improved fault tolerance.
    • Parity information is stored on a dedicated disk drive so that data can be re-constructed if a drive fails.
    • ALWAYS reads and writes complete stripes of data across all disks.
    • Provides good bandwidth for the transfer of large volumes of data.
    • Used in applications that involve large amounts of sequential data such as video streaming.
  45. Describe RAID 4
    • Stripes data for high performance.
    • Uses parity for improved fault tolerance.
    • Unlike RAID 3, disks in RAID 4 can be accessed independently so that specific data elements can be read or written on a single disk without read or write of the entire stripe.
  46. Describe RAID 5
    • Drives (strips) are independently accessible
    • Parity is distributed across all disks
    • Preferred for messaging, data mining, medium performance media serving and Relational Database Management System (RDBMS) implementations in which Database administrators (DBAs) optimize data access.
  47. Describe RAID 6
    • Dual Parity
    • Distributes parity across all disks
    • Can survive two disk failures
    • Rebuild operation may take longer due to the presence of two parity sets.
  48. What is a 'Hot Spare'?
    Refers to a spare HDD in a RAID array that temporarily replaces a failed HDD of a RAID set.
  49. What is EMC^2's Best practice concerning Hot Spares?
    For every two Disk Array Enclosures (DAE) one Hot Spare will be used.
  50. What is an intelligent Storage System?
    • RAID Arrays that are:
    • Highly optimized for I/O processing
    • Have large amounts of cache for improving I/O performance
    • Have operating environments that provide:
    • Intelligence for managing cache
    • Array resource allocation
    • Connectivity for heterogeneous hosts
    • Advanced array based local and remote replication options
  51. What are the benefits of an intelligent storage system?
    • Increased capacity
    • Improved performance
    • Easier data management
    • Improved data availability & protection
    • Enhanced business continuity & support
    • Improved security and access control
  52. What are the components of an intelligent storage system?
    • Front end
    • Cache
    • Back end
    • Physical disks
  53. What is the function of the 'Front End' in an intelligent storage system?
    • The front end provides the interface between the storage system and the host. It consists of two components:
    • Front End Ports
    • Front End Controllers
  54. What is the function of a front end port?
    • The front end ports enable hosts to connect to the intelligent storage system.
    • Each front end port has processing logic that executes the appropriate transport protocol, such as SCSI, FC or iSCSI for storage connections.
  55. What is the function of a front end controller?
    The front end controllers route data to and from cache via the internal data bus. When cache receives write data, the controller sends an acknowledgement message back to the host. Controllers optimize I/O processing by using command queuing algorithms.
  56. Describe command queuing
    Command queuing is a technique implemented on front end controllers. It determines the execution order of received commands and can reduce unnecessary drive head movements and improve disk performance.
  57. What are the most commonly used command queuing algorithms?
    • First in First Out (FIFO): Default algorithm where commands are executed in the order in which they are received.
    • Seek Time Optimization: Commands are executed based on optimizing read/write head movements, which may result in reordering of commands.
    • Access Time Optimization: Commands are executed based on the combination of seek time optimization and an analysis of rotational latency for optimal performance.
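
The difference between FIFO and seek time optimization can be sketched by comparing total head movement for the same queue. The greedy "nearest cylinder first" rule below is only one possible way to optimize seek time and is an assumption, as are the cylinder numbers; rotational latency (needed for access time optimization) is not modelled.

```python
def fifo_order(requests):
    """FIFO: execute commands in the order they were received."""
    return list(requests)


def seek_optimized_order(requests, head_position):
    """Greedy reordering: always service the pending cylinder nearest the head."""
    pending = list(requests)
    ordered = []
    position = head_position
    while pending:
        nearest = min(pending, key=lambda cylinder: abs(cylinder - position))
        pending.remove(nearest)
        ordered.append(nearest)
        position = nearest
    return ordered


def total_head_movement(order, head_position):
    """Sum of cylinder-to-cylinder distances travelled for a given order."""
    movement = 0
    position = head_position
    for cylinder in order:
        movement += abs(cylinder - position)
        position = cylinder
    return movement


# Illustrative queue of target cylinders; the head starts at cylinder 53.
queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(total_head_movement(fifo_order(queue), 53))
print(total_head_movement(seek_optimized_order(queue, 53), 53))
```
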
  58. Describe Cache
    Cache is semiconductor memory where data is placed temporarily to reduce the time required to service I/O requests from the host.
  59. Describe the ways that cache is implemented in write operations
    • Write Through: Data is placed into the cache and immediately written to disk, and an acknowledgement is sent to the host.
    • Write Back: Data is placed in the cache and an acknowledgement is sent to the host immediately.
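
The two write policies differ in when the data reaches disk relative to the acknowledgement. A toy model, assuming dictionaries stand in for cache and disk; the class and method names are hypothetical.

```python
class CachedDisk:
    """Toy model of a write cache in front of a disk."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.cache = {}
        self.disk = {}
        self.dirty = set()  # blocks held in cache but not yet on disk

    def write(self, lba, data):
        self.cache[lba] = data
        if self.write_back:
            # Write back: acknowledge now, commit to disk later.
            self.dirty.add(lba)
        else:
            # Write through: commit to disk before acknowledging.
            self.disk[lba] = data
        return "ack"

    def flush(self):
        """Commit all dirty blocks from cache to disk."""
        for lba in sorted(self.dirty):
            self.disk[lba] = self.cache[lba]
        self.dirty.clear()


write_through = CachedDisk(write_back=False)
write_through.write(7, b"data")
print(7 in write_through.disk)   # data is on disk when the ack is returned

write_back = CachedDisk(write_back=True)
write_back.write(7, b"data")
print(7 in write_back.disk)      # not on disk until a flush happens
write_back.flush()
```

Write back gives faster acknowledgements but leaves uncommitted data in cache, which is why cache data protection matters.
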
  60. What is a Read Cache Hit?
    If the requested data is found in the cache it is called a read cache hit or a read hit and the data is sent to the host without any disk operation.
  61. What is a Cache Miss?
    If the requested data is not found in the cache, it is called a cache miss and the data must be read from disk.
  62. Describe two cache management algorithms implemented by intelligent storage systems to proactively maintain a free set of pages.
    • Least Recently Used (LRU): An algorithm that continuously monitors data access in cache, identifies the cache pages that have not been accessed for a long time, and frees them up or marks them for reuse.
    • Most Recently Used (MRU): An algorithm that is the converse of LRU. In MRU the pages that have been accessed most recently are freed up or marked for reuse.
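
The difference between the two policies comes down to which end of the access history is freed. A minimal sketch, assuming pages are freed on demand when the cache is full (a real array frees pages proactively); the class name is hypothetical.

```python
from collections import OrderedDict


class PageCache:
    """Fixed-size page cache with an LRU or MRU replacement policy."""

    def __init__(self, capacity, policy="LRU"):
        self.capacity = capacity
        self.policy = policy
        self.pages = OrderedDict()  # ordered from least to most recently used

    def access(self, page):
        """Touch a page. Returns True on a cache hit, False on a miss."""
        if page in self.pages:
            self.pages.move_to_end(page)  # now the most recently used page
            return True
        if len(self.pages) >= self.capacity:
            # LRU frees the oldest entry; MRU frees the newest one.
            self.pages.popitem(last=(self.policy == "MRU"))
        self.pages[page] = True
        return False


lru = PageCache(2, "LRU")
mru = PageCache(2, "MRU")
for page in [1, 2, 1, 3]:
    lru.access(page)
    mru.access(page)
print(list(lru.pages), list(mru.pages))
```
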
  63. Describe 'Watermarking' in cache management
    • Flushing is the process of committing data from the cache to the disk. On the basis of the I/O access rate and pattern, high and low levels called Watermarks are set in cache to manage the flushing process. This process provides headroom in the write cache for improved performance. There are three watermarks:
    • 100%
    • High Watermark
    • Low Watermark
  64. Describe 'Idle Flushing'
    Idle Flushing occurs continuously, at a modest rate, when the cache utilization level is between the high and low watermark.
  65. Describe 'High Watermark Flushing'
    Activated when cache utilization hits the high watermark. The system dedicates some additional resources to flushing. This type of flushing has minimal impact on host I/O processing.
  66. Describe 'Forced Flushing'
    Forced Flushing occurs in the event of a large I/O burst when the cache reaches 100% of its capacity, which significantly affects the I/O response time. In Forced Flushing, dirty pages are forcibly flushed to disk.
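
The three flushing modes can be summarized as a decision on cache utilization. In this sketch the watermark values (40% and 80%) are hypothetical defaults, and returning "none" below the low watermark is an assumption, since the flashcards do not describe that case.

```python
def flushing_mode(utilization, low_watermark=0.4, high_watermark=0.8):
    """Pick a flushing mode from cache utilization (a fraction from 0.0 to 1.0)."""
    if utilization >= 1.0:
        return "forced"          # cache is full; dirty pages are forcibly flushed
    if utilization >= high_watermark:
        return "high watermark"  # extra resources are dedicated to flushing
    if utilization >= low_watermark:
        return "idle"            # continuous flushing at a modest rate
    return "none"                # assumption: no flushing below the low watermark


for level in (0.2, 0.5, 0.85, 1.0):
    print(level, flushing_mode(level))
```
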
  67. Describe two methods of Cache Data Protection
    The risk of losing data held in the cache can be mitigated by:
    • Cache mirroring: Each write to cache is held in two different memory locations on two independent memory cards.
    • Cache vaulting: A set of physical disks called vault drives are used to dump the contents of the cache in the event of a power failure.
  69. In an intelligent storage system, what is the 'back end'?
    The back end provides the interface between the cache and physical disks. From the cache, data is sent to the back end and then routed to the destination disk. The back end consists of two components:
    • Back End Ports
    • Back End Controllers: Communicate with the disks when performing reads and writes and also provide additional, but limited, temporary data storage.
  71. What is a LUN?
    Physical drives or groups of RAID Protected drives can be logically split into volumes known as Logical Unit Numbers (LUN). The use of LUNs improves disk utilization by only allocating the portion of disk space needed by the host thereby leaving the remainder of disk space to be allocated to other hosts.
  72. What is LUN Masking?
    • LUN Masking is an access control mechanism that provides data access control by defining which LUNs a host can access.
    • LUN masking is typically implemented at the front end controller.
    • LUN Masking ensures that volume access by servers is controlled appropriately, preventing unauthorized or accidental use in a distributed environment.
    • Usually implemented on storage arrays.
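
Conceptually, LUN masking is a lookup table on the array that maps each host to the LUNs it is allowed to see. A minimal sketch, assuming hosts are identified by the WWN of their HBA; the class name and WWN strings are made up for illustration.

```python
class LunMaskingTable:
    """Toy access-control table: which LUNs each host may access."""

    def __init__(self):
        self._visible = {}  # host identifier -> set of LUN numbers

    def grant(self, host_wwn, lun):
        self._visible.setdefault(host_wwn, set()).add(lun)

    def revoke(self, host_wwn, lun):
        self._visible.get(host_wwn, set()).discard(lun)

    def can_access(self, host_wwn, lun):
        """A host sees a LUN only if it was explicitly granted."""
        return lun in self._visible.get(host_wwn, set())


# Hypothetical host identifiers.
HOST_A = "10:00:00:00:c9:aa:aa:aa"
HOST_B = "10:00:00:00:c9:bb:bb:bb"

masking = LunMaskingTable()
masking.grant(HOST_A, 0)
masking.grant(HOST_B, 1)
print(masking.can_access(HOST_A, 0), masking.can_access(HOST_A, 1))
```
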
  73. Describe the capabilities of a high end storage array?
    **Also referred to as Active - Active Arrays**
    • Large storage capacity
    • Huge cache to service host I/Os
    • Fault tolerance architecture
    • Multiple front end ports and support for interface protocols
    • High scalability
    • Ability to handle large amounts of concurrent I/Os
    • Designed for large enterprises
    • **Symmetrix is an example of a high end storage system**
  74. Describe the capabilities of a Midrange storage array
    **Also referred to as Active - Passive Arrays**
    • Host can perform I/Os to LUNs only through active paths
    • Other paths remain passive until the active path fails
    • Have two controllers, each with cache, RAID controllers and disk drive interfaces
    • Designed for small and medium enterprises
    • Less scalable than a high end array
    • **CLARiiON is an example**
  75. Describe the characteristics of the CLARiiON CX-4
    • Support for Ultraflex technology
    • Scalable up to 960 disks
    • Supports flash drives
    • Supports RAID 0,1, 1+0, 3, 5, 6
    • Supports up to 16GB of cache per controller (2 controllers = 32GB total)
    • Supports storage based local and remote data replication via SnapView (Local) and MirrorView(Remote)
    • CLARiiON Messaging Interface (CMI)
    • Standby power supply
    • FLARE Storage Operating Environment
  76. Describe the characteristics of the Symmetrix DMX-4
    • Incrementally scalable to 2,400 disks
    • Dynamic global cache memory (16GB - 512GB)
    • Advanced processing power
    • High data processing bandwidth (up to 128 GB/s)
    • Supports RAID 1, 1+0 (AKA 10 for mainframe), 5, 6
    • Storage based local and remote replication through TimeFinder (Local) and SRDF (Remote)
    • Utilizes Direct Matrix Architecture
    • Each memory director connects to each front end director
    • Uses the Enginuity OS
  77. Describe the characteristics of the Symmetrix VMAX Series
    • 96 to 2,400 drives up to 2 PB (3x more usable capacity)
    • One to eight VMAX engines
    • Up to 1 TB global memory
    • Twice the host ports (FC, iSCSI, Gb Ethernet, FICON) up to 128 ports
    • 8Gb/s FC, FICON and FC SRDF
    • Twice the back end connections for flash
    • Quad core 2.3GHz processors to provide more than twice the IOPS
  78. What is DAS?
    • Direct Attached Storage is an architecture where storage connects directly to servers. Uses Block Level protocol for access. Internal HDDs and tape libraries are examples of DAS.
    • ***Can be internal or external***
  79. Describe Internal DAS
    • Internal DAS is internally connected to the host by a serial or parallel bus.
    • The physical bus has distance limitations and can only be sustained over short distances for high speed connectivity. Most internal buses can only support a limited number of devices.
  80. Describe External DAS
    In External DAS Architectures, the server connects directly to the external storage device. In most cases, communication between the host and the storage device takes place over SCSI or FC protocol. External DAS overcomes the distance and device count limitations of Internal DAS.
  81. What are the benefits of DAS?
    • Ideal for data provisioning
    • Quick deployment for small environments
    • Simple to deploy
    • Reliable
    • Low capital expense
    • Low complexity
  82. What are the four DAS connectivity options?
    • ATA and SATA
    • SCSI
    • FC
    • Bus and Tag (primarily for external mainframe)
  83. What are the two types of DAS Management?
    • Internal: Host provides disk partitioning and file system layout.
    • External: Array based management, lower TCO for managing data and storage infrastructure.
  84. What are some of the challenges of DAS?
    • Scalability is limited: number of connectivity ports to hosts, number of addressable disks, distance limitations
    • Downtime is required for maintenance with internal DAS
    • Limited ability to share resources (array front end ports, storage space), resulting in islands of over- and underutilized storage pools
  85. What is the definition of SCSI?
    Small Computer System Interface. SCSI is all about an initiator sending a command to a target.
  86. What does SCSI communication involve?
    • SCSI Initiator Device: Issues commands to SCSI target devices.
    • SCSI Target Device: Executes commands issued by initiators.
  87. What are the versions of SCSI?
    • SCSI-1: Defined cable length, signaling characteristics, commands, and transfer modes. Uses an 8-bit narrow bus (supports 8 devices).
    • SCSI-2: Defined the Common Command Set (CCS); 16-bit bus, improved performance and reliability.
    • SCSI-3: Latest version; comprises different but related standards rather than one large document.
    • *Can support between 8 and 16 devices
  88. What is SCSI Addressing?
    Used to uniquely identify hosts and devices with a number (0-15). The UNIX naming convention identifies a disk using three identifiers: initiator ID, target ID, and LUN.
  89. Structure and Organization of FC Data
    • Exchange Operation (conversation): enables two N_ports to identify and manage a set of information units.
    • Sequence (Sentence): refers to a contiguous set of frames that are sent from one port to another.
    • Frame (word): the fundamental unit of data transfer at Layer 2. *Each frame can contain up to 2,112 bytes of payload
  90. What SCSI ID has the highest priority?
  91. What is a SCSI Port?
    • SCSI ports are physical connectors that the SCSI cable plugs into for communication with a SCSI device.
    • A SCSI device may contain an initiator port, a target port, or a combined target / initiator port.
    • To cater to service requests from multiple devices, a SCSI device may also have multiple ports.
  92. WWN
    World Wide Name: a unique 64-bit identifier that is static to the port and is used to physically identify ports, like a NIC's MAC address. Every HBA has one, and one is burned into each array port.
  93. What is SAN?
    • Storage Area Network: a dedicated high speed network for block level access. Carries data between servers (AKA hosts) and storage devices through FC switches.
    • Provides Block Level data access.
    • Consolidates resources centralizing storage and management
    • Scalability (theoretical limit 15 million nodes)
    • Secure access
    • Fibre Channel Addressing: the address is dynamically assigned during fabric login and is used to communicate between nodes within the SAN, like an IP address on a NIC. Address format: 24 bit.
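
The 24-bit Fibre Channel address is conventionally read as three 8-bit fields: Domain ID, Area ID and Port ID. That breakdown comes from the Fibre Channel standards rather than from these flashcards, and the sample address is made up. A small sketch:

```python
def split_fc_address(address):
    """Split a 24-bit FC address into (domain_id, area_id, port_id)."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("FC addresses are 24 bits wide")
    domain_id = (address >> 16) & 0xFF  # identifies the switch
    area_id = (address >> 8) & 0xFF     # identifies a group of ports on the switch
    port_id = address & 0xFF            # identifies the port within the area
    return domain_id, area_id, port_id


def format_fc_address(address):
    """Render the address as six hex digits, the way it is usually written."""
    return f"{address:06X}"


# Illustrative address assigned at fabric login.
print(format_fc_address(0x010200), split_fc_address(0x010200))
```
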
  94. What are the components of SAN?
    • A SAN consists of three basic components:
    • Servers
    • Network infrastructure
    • Storage
    • These components can be further broken down into the following key elements:
    • Node Ports
    • Cabling
    • Interconnecting Devices (such as FC switches or hubs)
    • Storage Arrays
    • SAN Management Software
  95. Fibre Channel Protocol Stack (5)
    • FC-4: Upper Layer protocol
    • FC-2: Transport Layer
    • FC-1: Transmission Layer
    • FC-0: Physical Interface
    • FC-3 has not been implemented
  96. Fibre Channel Architecture Overview:
    • Uses channel technology
    • High performance with low protocol overheads
    • FCP is SCSI-3 over FC network
    • Has five layers
  97. What is Fibre Channel SAN and its components?
    • Moves blocks of data over fibre optic cables using SCSI commands between initiator and target.
    • Components: director/switch, host (node), storage (node), cables, management software to control ports/switches.
  98. FLOGI
    Fabric Login: takes place between an N_Port and an F_Port, that is, between a node (an initiator or a target such as an array) and the switch. First step in the login process.
  99. What are the two types of optical cables?
    • Single Mode: Can carry a single beam of light over a distance of up to 10 km.
    • Multi Mode: Can carry multiple beams of light simultaneously over a distance of up to 500 m.
    • (Note: multi mode cable can suffer from modal dispersion)
  100. PLOGI
    Port Login: takes place between two N_Ports (initial contact from initiator to target). Second step in the login process.
  101. PRLI
    Process Login: the two ports agree on a common language (SCSI) to talk to each other. Third step in the login process.
  102. What are the different types of SAN connectors?
    • Node Connectors:
    • Standard Connector (SC) Duplex Connectors
    • Lucent Connector (LC) Duplex Connectors
    • Patch Panel Connectors:
    • Straight Tip (ST) Simplex Connectors
  104. ISL
    • Inter Switch Links (ISLs) connect two or more FC switches to each other using E_Ports.
    • Used to transfer host to storage data as well as fabric management traffic from one switch to another. Also one of the scaling mechanisms in SAN connectivity.
  105. What are the different port types on SAN?
    • N_Port (node port): an end point in the fabric; connects a node to the switch.
    • NL_Port (node loop port): supports arbitrated loop topology. Goes into a HUB.
    • E_Port (expansion port): FC port that forms the connection between two FC Switches.
    • F_Port (fabric port): a port on a switch that connects an
    • FL_Port (public loop): a fabric port that participates in FC-AL. Connected to the NL_Ports on an FC-AL loop.
    • G_Port (generic port): can operate as an E_Port or an F_Port and determines its functionality automatically during initialization.
  106. What are the three commonly used SAN Interconnecting Devices?
    • Hubs: Physically connect nodes in a logical loop or a physical star topology.
    • Switches: More intelligent than hubs and directly route data from one physical port to another.
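    • Directors: Larger than switches, with a higher port count and fault tolerance; deployed for data center (enterprise) implementations, whereas switches are departmental.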
  107. Describe the SAN Interconnectivity Option called FC-SW?
    Fibre Channel switched fabric (FC-SW) - provides interconnected devices, dedicated bandwidth, and scalability. Also known as fabric connect.
  108. Describe the SAN Interconnectivity Option called FC-AL?
    Fibre Channel Arbitrated Loop (FC-AL): devices are attached to a shared loop. Devices on the loop must arbitrate to gain control of the loop. At any given time, only ONE device can perform I/O operations on the loop.
  109. What is the simplest form of SAN Interconnectivity?
    Point to Point - two devices are connected directly to each other (like DAS).
  110. Describe SAN Management Software?
    • A suite of tools used in a SAN to manage the interface between host and storage arrays.
    • Provides integrated management of SAN environment.
    • Web based GUI or CLI
  111. What is Core-Edge Fabric? & What are the two types?
    • Two types of switch tiers - the edge tier (comprised of switches) and the core tier (enterprise directors)
    • Single Core: all hosts are connected to the edge tier and all storage is connected to the core tier.
    • Dual Core: can be expanded to include more core switches - enables load balancing.
  112. Describe the Fabric Topology Mesh and name the different types
    • Each switch is connected directly to the other switches by using ISLs. Promotes enhanced SAN connectivity.
    • Full Mesh: every switch is connected to every other switch in the topology - appropriate for a smaller # of switches (4).
    • Partial Mesh: several hops or ISLs may be required for traffic to reach its destination. Can cause latency issues.
  113. Describe the term Zoning in Fabric Management
    • Zoning is an FC switch function that enables nodes within the fabric to be logically segmented into groups that can communicate with each other.
    • Access control done on the switch or fabric vs. LUN masking which is done on the array
    • Setting up a relationship between initiator and target.
  114. What are the Storage Over IP protocol Options?
    • iSCSI:
    • Is SCSI over IP.
    • Has IP encapsulation
    • Hardware-based gateway to Fibre Channel Storage
    • Used to connect servers

    • FCIP:
    • Fibre Channel-to-IP bridge / tunnel
    • Point to point
    • Fibre Channel end points

    **Used in DR Implementations**
  115. What is iSCSI?
    An IP-based protocol that establishes and manages connections between storage, hosts and bridge devices over IP.

    • Carries block level data over IP based networks, including Ethernet networks and the Internet
    • Is built on the SCSI protocol by encapsulating SCSI commands and data in order to allow these encapsulated commands and data blocks to be transported using TCP/IP
  116. Describe the components of a Zone
    • Members: nodes within the SAN that can be included in a zone
    • Zones: comprise a set of members
    • Zone Set: comprises a group of zones that can be activated or deactivated as a single entity in a fabric
    • *Only one zone set per fabric can be active at a time
  121. Describe the Types of Zoning
    Zoning sets up the relationship that determines which initiator can see which target.
    • Port Zoning (hard zoning): uses the FC addresses of the physical ports to define the zones (most secure - EMC general practice).
    • WWN Zoning (soft zoning): uses World Wide Names to define zones.
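
The member / zone / zone set hierarchy can be modelled as nested collections: two members may talk only if some zone in the active zone set contains both. A minimal sketch; the zone and member names are hypothetical, and members could equally be switch ports (port zoning) or WWNs (WWN zoning).

```python
def can_communicate(active_zone_set, member_a, member_b):
    """True if both members belong to at least one zone in the active zone set."""
    return any(member_a in members and member_b in members
               for members in active_zone_set.values())


# A zone set maps zone names to their members. Only one is active per fabric.
active_zone_set = {
    "zone_db": {"host_a_hba", "array_port_1"},
    "zone_mail": {"host_b_hba", "array_port_2"},
}

print(can_communicate(active_zone_set, "host_a_hba", "array_port_1"))
print(can_communicate(active_zone_set, "host_a_hba", "array_port_2"))
```

Zoning controls visibility at the switch or fabric level; LUN masking then narrows access on the array itself.
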
  126. What are the components of iSCSI?
    • iSCSI host initiators:
    • Host computer using a NIC or iSCSI HBA to connect to storage
    • iSCSI initiator software may need to be installed
    • iSCSI Targets:
    • Storage array with embedded iSCSI capable network port
    • FC-iSCSI bridge
    • LAN for IP Storage Network:
    • Interconnected Ethernet switches and / or routers
  128. What is NAS and what are the benefits?
    Network attached storage. It is an IP based file-sharing device attached to a local area network. Benefits:
    • Efficiency
    • Flexibility
    • Centralized storage
    • Simplifies management
    • Scalable
    • High Availability
    • Secure
  131. What are the iSCSI host connectivity options
    • Software Initiators:
    • Code that can be loaded onto a host to provide the translation between the storage I/O calls and the network interface
    • TCP Offload Engine (TOE):
    • Moves the TCP processing load off the host CPU onto the NIC Card to free up processing cycles for application execution.
    • iSCSI HBA:
    • A network interface adapter with an integrated SCSI ASIC (Application Specific Integrated Circuit)
    • Simplest option for boot from SAN
  133. What are the components of NAS?
    • NAS Head (CPU and Memory)
    • NIC Card(s)
    • Operating System to manage NAS functions
    • NFS (unix) and CIFS (microsoft)
    • Industry-standard storage protocols
    • Storage Array
  135. Describe the NAS File Sharing Protocols
    • CIFS - Common Internet File System Protocol: Microsoft environment file sharing protocol based on the Server Message Block (SMB) protocol
    • NFS - Network File System Protocol: UNIX environment file sharing protocol
  140. Describe the NAS I/O Process
    • The requester packages the I/O request into TCP/IP and sends it to a remote file system, which is handled by the NAS
    • The NAS converts the I/O into an appropriate physical storage request (block level I/O)
    • When the data is returned from the physical storage pool, the NAS processes and repackages it into a file protocol response.
    • The NAS packages this response into TCP/IP again and forwards it to the client through the network.
  142. What are the three iSCSI Topologies?
    • Native Connectivity: Does not have any FC components; performs all communication over IP.
    • Bridged Connectivity: Enables the co-existence of FC with IP by providing iSCSI to FC bridging functionality.
    • Combining FCP and Native Connectivity
  144. Describe the types of NAS Implementations
    • Integrated NAS: has all components of NAS in a single enclosure. Connects to the IP network to provide connectivity to the clients and service the file I/O requests.
    • Gateway NAS: has an independent NAS head and one or more storage arrays (2 protocols)
  148. What are the two ways in which iSCSI discovery takes place?
    • Send Targets Discovery:
    • Initiator is manually configured with the target
    • Internet Storage Name Service (iSNS):
    • Initiators and targets automatically register themselves with iSNS server
    • iSNS is a client / server model
  150. Describe how Managing an Integrated System (NAS Connectivity) works
    Both the NAS component and the storage array are managed via NAS management software
  152. Describe managing a Gateway System (NAS)
    The NAS component is managed via NAS management software and the storage array is managed via array management software
  154. What are the two types of iSCSI names?
    • IQN: iSCSI Qualified Name
    • IQN (ex:
    • EUI: Extended Unique Identifier
    • (ex: eui.020234k2034j03D34)
  156. EMC Celerra
    • Celerra is a dedicated high-performance infrastructure for FILE LEVEL I/Os
    • Celerra NS40G (Gateway NAS) - Celerra NS-960 (Integrated NAS)
    • Consists of:
    • Data movers (file servers in cabinet)
    • Control Station (sets up data movers and initially configures them)
    • Specialized OS - DART - Linux Red Hat
  163. Describe how to join the building blocks in Integrated NAS and Gateway NAS?
  164. Integrated: the system is assigned to dedicated NAS storage. No other SAN hosts connected to the storage - whole array is dedicated solely to NAS provisioning.
  165. Gateway: the NAS system is assigned separately apportioned storage within the array. Two separate sections for SAN and NAS.
  166. What is an FCIP Frame?
  167. Encapsulates FC frames in IP packets
    • FCIP router is used for encapsulation
    • FC Router at other end removes IP wrapper and sends FC data to other fabric
    • Includes security, data integrity, congestion and performance specifications
  168. What is Fibre Channel over Ethernet?
  169. A new protocol that maps Fibre Channel protocol natively over Ethernet.
    • Based on two new standards that are currently in active development:
    • FCoE standard, being developed by T11 Fibre Channel Interfaces Technical Committee
    • Enhanced Ethernet standard, being developed by the Ethernet IEEE Data Center Bridging Task Group
    • Enables the consolidation of SAN traffic and Ethernet traffic onto a common 10 Gigabit network infrastructure
    • FCoE requires jumbo frames (2180 byte) support to prevent a Fibre Channel frame from being split into two Ethernet frames
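
The jumbo-frame requirement above comes down to simple arithmetic. The sketch below is illustrative only: the 2148-byte maximum Fibre Channel frame size and the 1500-byte standard Ethernet payload are assumptions that do not appear in the card, FCoE encapsulation overhead is ignored, and `ethernet_frames_needed` is a hypothetical helper.

```python
import math

def ethernet_frames_needed(fc_frame_bytes: int, ethernet_payload_bytes: int) -> int:
    """Return how many Ethernet frames are needed to carry one FC frame."""
    return math.ceil(fc_frame_bytes / ethernet_payload_bytes)

MAX_FC_FRAME = 2148       # assumed maximum FC frame size, in bytes
STANDARD_PAYLOAD = 1500   # assumed standard Ethernet payload, in bytes
JUMBO_FRAME = 2180        # jumbo frame size quoted in the card

print(ethernet_frames_needed(MAX_FC_FRAME, STANDARD_PAYLOAD))  # 2
print(ethernet_frames_needed(MAX_FC_FRAME, JUMBO_FRAME))       # 1
```

With a standard payload the FC frame would have to be split in two; with the larger frame it travels whole.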
  170. Describe Lossless Ethernet
  171. To support Fibre Channel frames over Ethernet, no frames can be dropped throughout the entire transmission.
  172. No frame drop due to congestion or buffer overflow.
    PAUSE capability of Ethernet is used to achieve the lossless fabric.
  173. Describe the FCoE Physical Elements
  174. Host Interface: CNA (converged network adapter) - ex: PCIe card on host consolidates NICs and HBAs
  175. 10 Gbps connectivity options: either copper or standard optical
  176. Describe the benefits of FCoE
  177. Lower capital expenditure
    • Reduced power and cooling requirements
    • Enabler for consolidated network infrastructure
    • Lower TCO
  178. What is Virtualization?
  179. The technique of abstracting physical resources into a logical view.
  180. Increases utilization
    • Simplifies resource management
    • Reduces downtime (planned and unplanned)
    • Improved performance of IT resources
  181. What are the challenges of storing fixed content?
  182. Fixed content is growing at more than 90% annually.
    • New regulations require retention and data protection
    • Often, long term preservation is required
    • Simultaneous multi-user online access is preferable to offline storage
    • Need faster access to fixed content
    • Traditional storage methods are inadequate
  183. What is a swap file (used in memory virtualization)?
  184. is a portion of the hard disk that functions like physical memory (RAM) to the operating system.
  185. Describe Network Virtualization
  186. creates virtual networks whereby each application sees its own logical network independent of the physical networks.
  187. EX: Virtual LAN (VLAN) - centralized configuration of devices
  188. What are the traditional storage solutions for archive?
  189. Three categories of archival solutions are:
  190. Online
    • Nearline
    • Offline
  191. Based on the means of access
    Traditional archival solutions were offline
  192. What is Server Virtualization?
  193. enables multiple operating systems and applications to run simultaneously on different virtual machines created on the same physical server (or group of servers).
  194. Provides a layer of abstraction between the OS and the underlying hardware.
  195. Any # of virtualized servers can be established.
  196. What is storage virtualization?
  197. Process of presenting a logical view of physical storage resources to hosts
  198. Logical storage appears and behaves as physical storage directly connected to host
  199. Examples: Host Based, LUN Creation (thin LUN), Tape
    • Benefits
    • Increased storage utilization
    • Adding or deleting storage without affecting apps
    • Non-disruptive data migration
  200. What are the shortcomings of traditional archival solutions?
  201. Tape is slow
    • Optical is expensive and requires vast amounts of media
    • Recovering files from tape and optical is often time consuming
    • Data on tape and optical is subject to media degradation
    • Both solutions require sophisticated media management
  202. What does SNIA Storage Virtualization Taxonomy provide?
  203. The Storage Networking Industry Association (SNIA) storage virtualization taxonomy provides a systematic classification of storage virtualization, with three levels:
  204. WHAT, WHERE, and HOW
  205. Specifies the types of virtualization:
  206. block
    • File
    • Disk
    • Tape
    • Any
  207. What is Content Addressed Storage?
  208. Object oriented, location-independent approach to data storage
    • Repository for the "objects"
    • Access mechanism to interface with repository
    • Globally unique identifiers provide access to objects
  209. What are the benefits of CAS?
  210. Content authenticity
    • Content integrity
    • Location independence
    • Single instance storage
    • Retention enforcement
    • Record level protection and disposition
    • Technology independence
    • Fast record retrieval
  211. Describe block level storage virtualization
  212. Ties together multiple independent storage arrays.
  213. Presented to host as a single storage device
    • Mapping used to redirect I/O on this device to underlying physical arrays.
    • Deployed in a SAN environment
  214. Non-disruptive data mobility and data migration
  215. Enables significant cost and resource optimization
  216. What are the Physical Elements of CAS?
  217. Storage devices (CAS based)
    • Storage Node
    • Access node
    • Servers (to which storage devices get connected)
    • Client
  218. Describe the Application Programming Interface (API)?
  219. A set of function calls that enables communication between applications or between an application and an operating system.
  220. Describe file-level virtualization
  221. addresses the NAS challenges by eliminating the dependencies between the data accessed at the file level and the location where the files are physically stored.
  222. Before virtualization, each NAS device or file server is physically and logically independent.
  223. Describe EMC's Invista
  224. provides block-level storage virtualization in heterogeneous storage environments. Supports dynamic volume mobility for volume extension and data migration between
  225. different storage tiers without any downtime.
  226. What is a BLOB?
  227. The distinct bit sequence (DBS) of user data represents the actual content of a file and is independent of the file name and physical location.
  228. Describe the difference between CPC and DPC and how they see targets and initiators
  229. Control Path Cluster - storage device running Invista that is located OUTSIDE of the data path (handles any requests which are NOT I/Os)
  230. Data Path Controller - special purpose SAN switch/blade which operates inside the data path and handles the I/O requests. If it's not an I/O then it routes the request
  231. to the CPC.
    • What are the key Functions of a RAID Controller?
    • - Management and control of disk aggregations
    • - Translation of I/O requests between logical disks and physical disks.
    • - Data regeneration in the event of disk failures.
  232. Describe EMC Centera Architecture
  233. deals with the storage and retrieval of fixed content
  234. Based on RAIN (Redundant Array of Independent Nodes - access and storage)
  235. Linux OS, CentraStar software to implement CAS functions
  236. 1 TB of usable capacity in each node
  237. two 24-port 2 gigabit internal switches
  238. Is self healing
  239. What is a C-Clip?
  240. A package containing the user's data and associated metadata
    C-Clip ID is the CA that the system returns to the client application
  241. What data protection does the Centera Use?
  242. CPP - Content Protection Parity
  243. CPM - Content Protection Mirroring
  244. Describe Content Address
  245. An identifier that uniquely addresses the content of a file and not its location.
    Unlike location based addresses, content addresses are inherently stable and, once calculated, they never change and always refer to the same content
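
A minimal sketch of the idea, assuming a SHA-256 digest stands in for the content address (the card does not say which algorithm a real CAS product uses, and `content_address` is a hypothetical helper): the address depends only on the bytes, never on a file name or location.

```python
import hashlib

def content_address(data: bytes) -> str:
    """Derive an address from the bytes alone (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()

scan_a = content_address(b"patient scan 0001")
scan_b = content_address(b"patient scan 0001")  # same content held elsewhere
scan_c = content_address(b"patient scan 0002")  # different content

print(scan_a == scan_b)  # True: same content, same address
print(scan_a == scan_c)  # False: different content, different address
```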
  246. Describe the C-Clip Descriptor File (CDF)
  247. The additional XML file that the system creates when making a C-Clip. This file includes the content addresses for all referenced BLOBs and associated metadata.
  248. What are the features of CAS?
  249. Integrity Checking
    • Data protection (local and remote)
    • Load balancing
    • scalability
    • Self - diagnosis and repair
    • Report generation and event notification
    • Fault tolerance
    • Audit trails
  250. How does CAS store a data object?
  251. End users present the data to be archived to the CAS API via an application
    • The API separates the actual data (BLOB) from the metadata and the CA is calculated from the object's binary representation.
    • The content address and metadata of the object are then inserted into the C-Clip Descriptor File (CDF)
    • The CAS system recalculates the object's CA as a validation step and stores the object.
    • An acknowledgement is sent to the API after a mirrored copy of the CDF and protected copy of the BLOB have been safely stored in the CAS system
    • Using the C-Clip ID, the application can read the data back from the CAS system.
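
The store sequence above can be sketched as a toy in-memory model. Everything here is hypothetical: `address`, `api_package`, and `ToyCAS` are illustration names rather than the Centera API, SHA-256 is an assumed hashing choice, and the mirrored and protected copies mentioned in the steps are not modelled.

```python
import hashlib
import json

def address(data: bytes) -> str:
    """Content address: a digest of the bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()

def api_package(blob: bytes, metadata: dict):
    """API side: separate the BLOB from its metadata and build the CDF."""
    cdf = {"content_address": address(blob), "metadata": dict(metadata)}
    return blob, cdf

class ToyCAS:
    """Hypothetical in-memory stand-in for a CAS system."""

    def __init__(self):
        self.blobs = {}  # content address -> BLOB
        self.cdfs = {}   # C-Clip ID -> CDF

    def store(self, blob: bytes, cdf: dict) -> str:
        # The system recalculates the CA as a validation step.
        if address(blob) != cdf["content_address"]:
            raise ValueError("content address does not match the BLOB")
        # Identical content is kept only once (single instance storage).
        self.blobs.setdefault(cdf["content_address"], blob)
        clip_id = address(json.dumps(cdf, sort_keys=True).encode())
        self.cdfs[clip_id] = cdf
        return clip_id   # acknowledgement carrying the C-Clip ID

    def read(self, clip_id: str) -> bytes:
        return self.blobs[self.cdfs[clip_id]["content_address"]]

cas = ToyCAS()
blob, cdf = api_package(b"scanned invoice", {"name": "invoice.pdf"})
clip_id = cas.store(blob, cdf)
print(cas.read(clip_id) == b"scanned invoice")  # True
```

Storing the same bytes under different metadata yields a new C-Clip ID but no second copy of the BLOB.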
  252. How does CAS retrieve a Data Object?
  253. The end user or an application requests an object
    • The application queries the local table of C-Clip IDs stored in the local storage and locates the C-Clip ID for the requested object
    • Using the API, a retrieval request is sent along with the C-Clip ID to the CAS System
    • The CAS system delivers the requested information to the application, which in turn delivers it to the end user
  254. What is virtualization?
  255. It's a technique of abstracting physical resources into a logical view.
  256. Benefits: increases utilization, simplifies resource management, reduces downtime, improved performance of IT resources
  257. What are the four forms of virtualization?
    Memory, networks, servers, and storage
  258. How does virtual memory work?
  259. makes an application appear as if it has its own contiguous logical memory independent of the existing physical memory resource.
  260. Done by virtual memory managers (VMM)
  261. Space used by VMMs on the disk is known as a swap file
  262. What is a SWAP file?
  263. the portion of the hard disk that functions like physical memory (RAM) to the operating system.
  264. - gives the illusion of additional physical memory
  265. How does Network Virtualization Work?
  266. creates virtual networks whereby each application sees its own logical network independent of the physical network.
  267. EX: Virtual LAN (VLAN) - enables centralized configuration of devices located in physically diverse locations.
  268. What are the benefits of Virtual Memory?
  269. Removes physical-memory limits
    Run multiple applications at once
  270. What are the benefits of Virtual Networks?
  271. Common network links with access-control properties of separate links
    • Manage logical networks instead of physical networks
    • Virtual SANs provide similar benefits for SANs
  272. How does Server Virtualization work?
  273. Enables multiple operating systems and applications to run simultaneously on different virtual machines created on the same physical server (or group of servers).
  274. Provide a layer of abstraction between the OS and the underlying hardware
  275. VMWare
  276. What is Business Continuity
  277. Preparing for, responding to and recovering from an application outage that adversely affects business operations.
    • Addresses unavailability and degraded application performance
    • An integrated and enterprise wide process and set of activities to ensure "information availability"
  278. What are the benefits of server virtualization?
  279. break dependencies between operating system and hardware
    • Manage OS and application as a single unit
    • Strong fault tolerance
    • Hardware - independent
  280. How does storage virtualization work?
  281. the process of presenting a logical view of the physical storage resources to a host. Appears and behaves as physical storage
  282. Examples:
  283. 1. Host-based volume management
  284. 2. LUN Creation (thin LUN)
  285. 3. Tape virtualization
  286. What are the benefits of storage virtualization?
  287. Increased storage utilization
    • Adding or deleting storage without affecting application availability
    • Non-disruptive data migration - KEY
  288. Describe SNIA Storage Virtualization Taxonomy?
  289. Storage Networking Industry Association: provides a systematic classification of storage virtualization, with three levels - what, where, and how
  290. It specifies the types of virtualization: block, file, disk, tape, or any other devices.
  291. What is Information Availability (IA)?
  292. Refers to the ability of an infrastructure to function according to business expectations during its specified time of operation.
    • Can be defined in terms of three parameters:
    • Accessibility: Information should be accessible in the right place and to the right user
    • Reliability: Information should be reliable and correct
    • Timeliness: Information must be available whenever required
  293. Describe the Multi-Level Approach to Storage Virtualization
  294. Server: path management, volume management, replication
  295. Storage Network: path redirection, load balancing - ISL trunking, access control - zoning (ex - PowerPath)
  296. Storage: volume management - LUNs, access control (LUN Masking), replication, RAID
  297. What are the two types of storage virtualization configs?
  298. Out of Band - the virtualization environment configuration is stored externally to the data path - minimal latency
  299. In Band - implementation places the virtualization function inside the data path - additional latency
  300. What are some causes of information unavailability?
  301. Planned Outages (80%)
    • Unplanned Outages (20%)
    • Disaster (<1%)
  302. What is block-level storage virtualization?
  303. Ties together multiple independent storage arrays and presents them to the host as a single storage device. Mapping is used to direct the I/O on this device to
  304. underlying physical arrays
  305. Deployed in a SAN environment
  306. *Non-disruptive data mobility and data migration
  307. Cost reduction
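
The mapping step can be illustrated with a small sketch. It assumes the simplest possible layout, where arrays are concatenated into one address space; `VirtualVolume` and the array names are hypothetical, and real products use far richer mapping tables.

```python
from typing import List, Tuple

class VirtualVolume:
    """Concatenates several arrays into one logical block address space."""

    def __init__(self, arrays: List[Tuple[str, int]]):
        # arrays: (array name, size in blocks), in concatenation order
        self.arrays = arrays
        self.size = sum(blocks for _, blocks in arrays)

    def map_block(self, virtual_lba: int) -> Tuple[str, int]:
        """Translate a virtual block address to (array name, physical block)."""
        if not 0 <= virtual_lba < self.size:
            raise IndexError("block address outside the virtual volume")
        offset = virtual_lba
        for name, blocks in self.arrays:
            if offset < blocks:
                return name, offset
            offset -= blocks
        raise AssertionError("unreachable: address was range-checked above")

volume = VirtualVolume([("array_a", 100), ("array_b", 50)])
print(volume.size)            # 150
print(volume.map_block(10))   # ('array_a', 10)
print(volume.map_block(120))  # ('array_b', 20)
```

The host only ever sees one 150-block device; which array serves a given block is decided by the mapping.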
  308. What are some of the impacts of Downtime?
  309. Lost productivity
    • Damaged Reputation
    • Lost Revenue
    • Financial Performance
    • Other expenses
  310. What is file-level virtualization?
  311. addresses the NAS challenges by eliminating the dependencies between the data accessed at the file level and the location where the files are physically stored.
  312. EX: EMC Rainfinity
  313. How is Information Availability Measured?
  314. IA=Uptime / (Uptime + Downtime)
    • Uptime = Mean Time Between Failure (MTBF)
    • Downtime = Mean Time to Repair (MTTR)
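
As a worked example of the formula above, here is a short sketch; `information_availability` is a hypothetical helper and the MTBF and MTTR figures are made up for illustration.

```python
def information_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """IA = MTBF / (MTBF + MTTR), i.e. uptime over total time."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Illustrative figures: 999 hours between failures, 1 hour to repair.
ia = information_availability(mtbf_hours=999.0, mttr_hours=1.0)
print(f"{ia:.1%}")  # 99.9%
```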
  315. EMC Invista
  316. Enables NON-DISRUPTIVE data migration. Provides block-level storage virtualization in heterogeneous storage environments.
  317. What are main hardware components of Invista?
  318. Control Path Cluster (CPC): stores configuration parameters OUTSIDE of the data path.
  319. Data Path Controller (DPC): special purpose SAN switch blade which routes I/Os INSIDE the data path. If it's not an I/O then it sends it to the CPC
  320. What is Disaster Recovery?
  321. Coordinated process of restoring systems, data and infrastructure required to support ongoing business operations in the event of a disaster
    • Restoring previous copies of data and applying logs to that copy to bring it to a known point of consistency
    • Generally implies the use of backup technology
  322. What are the benefits of virtual provisioning?
  323. Reduce administrative costs (people)
    • Reduce storage costs by deploying assets as needed
    • Reduce operating costs (fewer disks)
    • Reduce downtime
  324. What is Disaster Restart?
  325. The process of restarting from disaster using mirrored consistent copies of data and applications
    Generally implies the use of replication technologies
  326. In virtual provisioning - what is thin pool expansion?
  327. Adding drives to a thin pool non-disruptively increases available shared capacity for all the Thin LUNs in the pool
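
A minimal sketch of this behaviour, assuming capacity is drawn from the pool only when a thin LUN is written to. `ThinPool` and `ThinLUN` are hypothetical classes and sizes are in whole gigabytes for simplicity.

```python
class ThinPool:
    """Shared physical capacity behind several thin LUNs."""

    def __init__(self, capacity_gb: int):
        self.capacity_gb = capacity_gb
        self.allocated_gb = 0

    @property
    def available_gb(self) -> int:
        return self.capacity_gb - self.allocated_gb

    def add_drive(self, drive_gb: int) -> None:
        # Expansion: existing LUNs keep running and simply see more free space.
        self.capacity_gb += drive_gb

class ThinLUN:
    """Reports a large size to the host but consumes pool space only on write."""

    def __init__(self, pool: ThinPool, reported_gb: int):
        self.pool = pool
        self.reported_gb = reported_gb
        self.used_gb = 0

    def write(self, gb: int) -> None:
        if self.used_gb + gb > self.reported_gb:
            raise ValueError("write exceeds the size reported to the host")
        if gb > self.pool.available_gb:
            raise RuntimeError("thin pool exhausted")
        self.pool.allocated_gb += gb
        self.used_gb += gb

pool = ThinPool(capacity_gb=100)
lun_a = ThinLUN(pool, reported_gb=500)
lun_b = ThinLUN(pool, reported_gb=500)
lun_a.write(60)
lun_b.write(30)
print(pool.available_gb)  # 10
pool.add_drive(100)
print(pool.available_gb)  # 110
```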
  328. Describe the "Cloud" Approach to Storage
  329. A cost effective approach to handling internet era data growth.
  330. Five requirements: infinite scale, no boundaries, operationally efficient, self-managing, self-healing
  331. What is Recovery Point Objective?
  332. A point in time to which systems and data must be recovered after an outage
    The amount of data loss that a business can endure
  333. Define cloud computing
  334. is an emerging IT development, deployment, and delivery model, enabling real time delivery of products, services and solutions over the Internet
  335. Services Include: SaaS, PaaS, IaaS
  336. Ex: Google Apps
  337. What are the key attributes of Cloud Services?
  338. Offsite third party provided
  339. Accessed via Internet
  340. Minimal to no IT skills required to implement
  341. Provisioning
  342. Pricing
  343. User interface
  344. system interface
  345. Shared resources
  346. What is Recovery Time Objective?
  347. The time within which systems, applications or functions must be recovered after an outage.
    The amount of downtime that a business can endure and survive.
  348. What is EMC's Cloud Infrastructure?
    Atmos - offers scalability, is policy based, and increases operational efficiency
  349. What is Backup?
  350. is an additional copy of data that can be used for restore and recovery purposes. Used when the primary copy is lost or corrupted.
  351. Can be created by:
  352. -Simply copying the data
  353. -Mirroring the data
    What are the elements of the Business Continuity Planning Process?
  354. Identify the critical business functions
    • Collecting data on various business processes within those functions
    • Business Impact Analysis (BIA)
    • Risk Analysis
    • Assessing, prioritizing, mitigating and managing risk
    • Designing and developing contingency plans and disaster recovery (DR) plan
    • Testing, training and maintenance
  355. Why do organizations perform backups?
  356. 1. Disaster recovery
  357. 2. Operational - restore in the event of data loss or corruption during routine process
  358. 3. Archival - preserve transactions for business and/or regulatory compliance
  359. What needs to be considered before a Backup/Restore Solution is Implemented?
  360. Recovery Point Objective (RPO)
  361. Recovery Time Objective (RTO)
  362. Media type to be used
  363. Where and when the restore operations occur
  364. When to perform the backup
  365. The granularity of the backup (Full, Incr., Cum)
  366. How long to keep the backup
  367. Do you copy the backup
  368. Data - size and location of it
  369. What are the solutions and supporting technologies that enable business continuity and uninterrupted data availability?
  370. Eliminating single points of failure
    • Multi-pathing software
    • Backup and replication
    • Backup and recovery
    • Local replication
    • remote replication
  371. What are the three types of backup granularity?
  372. Full Backup - all data once per week
  373. Incremental - copies the data that has changed since the last full or incremental backup, whichever is more recent - faster to back up but slower to restore
  374. Cumulative - copies the data that has changed since the last full backup. takes longer - easier/faster to recover data
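
The difference between the three granularities can be sketched as a file-selection rule. This assumes an incremental backup copies what changed since the most recent backup of any kind, while a cumulative backup copies what changed since the last full backup; `files_to_back_up` is a hypothetical helper and the timestamps are plain day numbers.

```python
def files_to_back_up(mtimes: dict, last_full: int, last_backup: int, kind: str) -> list:
    """Select files by modification time for a full, incremental, or cumulative run."""
    if kind == "full":
        return sorted(mtimes)
    if kind == "incremental":
        cutoff = last_backup   # most recent backup of any kind
    elif kind == "cumulative":
        cutoff = last_full     # most recent full backup
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return sorted(name for name, mtime in mtimes.items() if mtime > cutoff)

# Full backup on day 0, an incremental on day 2, and it is now day 4.
mtimes = {"a.txt": -1, "b.txt": 1, "c.txt": 3}
print(files_to_back_up(mtimes, last_full=0, last_backup=2, kind="incremental"))  # ['c.txt']
print(files_to_back_up(mtimes, last_full=0, last_backup=2, kind="cumulative"))   # ['b.txt', 'c.txt']
print(files_to_back_up(mtimes, last_full=0, last_backup=2, kind="full"))         # ['a.txt', 'b.txt', 'c.txt']
```

The cumulative run copies more data, but a restore then needs only the last full backup plus the latest cumulative backup.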
  375. Define Single Point of Failure
  376. The failure of a component that can terminate the availability of the entire system or IT service.
  377. What are the different types of Backup Methods?
  378. Cold - offline
    • Hot - online
    • Open File (either have to retry or have a SW agent)
    • Point in Time (PIT) Replica
    • Backup file metadata for consistency
    • Bare metal recovery
  379. What are some advantages of Multi-pathing Software?
  380. Configures multiple paths to increase data availability
    • Helps to recognize and utilize alternate I/O paths to data
    • Provides load balancing to improve data path utilization
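
A rough sketch of the two behaviours listed above, round-robin load balancing and failover to an alternate path. `MultipathDevice` and the path names are hypothetical and do not reflect how any particular multipathing product is implemented.

```python
from itertools import cycle

class MultipathDevice:
    """Rotates I/O across healthy paths and skips failed ones."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.failed = set()
        self._order = cycle(self.paths)

    def fail(self, path: str) -> None:
        self.failed.add(path)

    def restore(self, path: str) -> None:
        self.failed.discard(path)

    def next_path(self) -> str:
        # Try each configured path at most once per request.
        for _ in range(len(self.paths)):
            path = next(self._order)
            if path not in self.failed:
                return path
        raise RuntimeError("no path to the device is available")

device = MultipathDevice(["hba0", "hba1"])
print([device.next_path() for _ in range(4)])  # ['hba0', 'hba1', 'hba0', 'hba1']
device.fail("hba0")
print([device.next_path() for _ in range(2)])  # ['hba1', 'hba1']
```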
  381. Backup Architecture and Process
  382. Backup Client - sends backup data to backup server or storage node
    • Backup Server - manages backup operations and maintains backup catalog
    • Storage Node - Responsible for writing data to backup device
  383. What are the steps in the backup operation?
  384. 1.Start of scheduled backup
  385. 2.Backup server retrieves backup related info from catalog
  386. 3a. Backup server instructs storage node to load backup media in backup device
  387. 3b. Backup server instructs backup client to send its metadata to the backup server and data to be backed up to storage nodes
  388. 4. Backup clients send data to storage node
  389. 5. Storage node sends media information to backup server
  390. 7. Backup server updates catalog and records the status
  391. What is local replication?
  392. Data from the production devices (LUN) is copied to replica devices within the same array
    The replicas can then be used for restore operations in the event of data corruption or other events
  393. Describe Remote Replication
  394. Data from the production devices is copied to replica devices on a remote array
    In the event of a failure, applications can continue to run from the target device
  395. What are the steps in the restore operation?
  396. 1. Backup server scans backup catalog to identify data to be restored and the client that will receive the data
  397. 2. Backup server instructs storage node to load backup media in backup device
  398. 3. Storage node then reads the data and sends to backup client
  399. 4. Storage node sends restore metadata to backup server
  400. 5. Backup server updates catalog
  401. Describe Backup / Restore
  402. Backup to tape has been a predominant method to ensure business continuity
    The frequency of backup is dependent on RPO / RTO requirements
  403. What are Direct Attached Backups?
  404. a backup device is attached directly to the client. Only the metadata is sent to the backup server through the LAN.
  405. - Frees LAN from backup traffic
  406. What are LAN Based Backups?
  407. all servers are connected to the LAN and all storage devices are directly attached to the storage node. The data to be backed up is transferred from the backup client
  408. (source), to the backup device (destination), over the LAN, which may affect network performance.
  409. - impact can be minimized by configuring separate networks for backup
  410. Describe some attributes of EMC PowerPath
  411. Host based software
    • Resides between the application and SCSI Device Driver
    • Provides intelligent I/O path management
    • Is transparent to the application
    • Automatic detection and recovery from host to array path failures
  412. What are SAN based backups?
  413. backup devices and clients attached to the SAN
  415. Mixed Backup (2 Clients)
    uses both LAN and SAN - the data goes through both the LAN and the FC SAN
  416. Describe an application server based backup
  417. the NAS head retrieves the data from storage over the network and transfers it to the backup client running on the application server. The backup client sends this
  418. data to a storage node, which in turn writes the data to the backup device.
  419. -overloads the network
  420. Describe a serverless backup in NAS
  421. the network share is mounted directly on the storage node. Avoids overloading the network during backup. Storage node acting as backup client - reads the data from the
  422. NAS head and writes it to the backup device without involving the application server.
  423. NAS Backup - NDMP- 2 -way
  424. backup is sent directly from the NAS head to the backup device, while metadata is sent to the backup server.
  425. Network traffic is minimized by isolating data from the NAS head to the locally attached tape library. Only metadata is transported on the network.
  426. -uses special protocol
  427. NAS Backup - NDMP - 3 Way
  428. data is not transferred over the public network. A separate private backup network must be established between all NAS heads and the "backup" NAS head to prevent any
  429. data transfer on the public network in order to avoid congestion.
  430. - uses two NAS heads
  431. - Private network
  432. - used in a multibuilding env (college campus)
    What are the benefits of backing up to tape?
  433. Traditional backup destination
    • Low cost
    • Portable
    • Sequential/linear access
    • Multiple streaming
  434. What are the limitations of backing up to tape?
  435. Reliability (restore performance)
    • Sequential access
    • Cannot be accessed by multiple hosts simultaneously
    • Needs a controlled environment
    • Wear and tear
    • Shipping/handling charges
    • Tape management challenges
    • Need to encrypt data
  436. What are the benefits of backing up to disk?
  437. Ease of implementation
    • Fast access
    • More reliable
    • Random access
    • Multiple hosts can access
    • Enhanced overall backup and recovery
  438. What is the recovery time in minutes?
  439. The time from point of failure to return of service to e-mail users
  440. What is a virtual library and its components?
  441. It's an array with special software - tape emulation engine. Emulation SW has a database with a list of virtual tapes, and each virtual tape is assigned a portion of a
  442. LUN on the disk.
  443. How does EMC's NetWorker Work?
  444. enables simultaneous access operations to a volume, for both reads and writes, as opposed to a single operation with tapes, by making a copy of the production LUN and
  445. doing a backup from the copy
  446. - works within existing frameworks
  447. - Accelerates and centralizes the backup process
  448. The client generates tracking info and sends it to the server to facilitate point-in-time recoveries
  449. What is local replication and its uses?
  450. replicating data within the same array or the same data center.
  451. -alternate source for backup
  452. -Fast recovery
  453. -Decision support
  454. -Testing platform
  455. -Data migration
  456. What do you consider when you're going to replicate data?
  457. Types
  458. -Point-in-time (PIT): non-zero RPO (how much data you can lose)
  459. -Continuous: near zero RPO
  460. What makes a replica good?
  461. Recoverability/restartability and consistency
  462. What is consistency in terms of backup?
  463. is the primary requirement to ensure the usability of replica device.
  464. Can be achieved in various ways:
  465. For File System:
  466. -offline - un-mount file system
  467. -Online - flush host buffers (space in memory)
  468. For Database:
  469. -Offline- shutdown database
  470. -Online - database in hot backup mode
  471. Describe flushing host buffer
    Flush memory (buffer) on the host before you make the copy. Done by the sync daemon (unix)
  472. What is the dependent write I/O Principle?
  473. Dependent Write: a write I/O that will not be issued by an application until a prior related write I/O has completed - LOGICAL dependency - NOT a time dependency
  474. -is inherent in all DBMS and is necessary for protection against local outages
  475. What is the process of holding an I/O in database consistency?
  476. the process of quiescing the database.
  477. Steps:
  478. 1. hold I/O to all the devices at the same instant
  479. 2.Create the replica
  480. 3. Release the I/O
  481. What are the two local replication technologies?
  482. Host based
  483. -logical volume manager (LVM) based mirroring
  484. -File system Snapshot
  485. Storage Array Based
  486. -Full volume mirroring
  487. -Pointer based full volume replication
  488. -Pointer based virtual replication
  489. What is LVM Based Mirroring?
  490. the LVM is responsible for creating and controlling the host-level logical volume. Components: physical volumes (disk), volume groups, and logical volumes.
  491. Each logical partition in a logical volume is mapped to two physical partitions on two different physical volumes.
  492. What is File System Snapshot?
  493. Is a pointer based replica that requires a fraction of the space used by the original file system.
  494. -Uses copy on first write (COFW) principle
  495. -Uses bitmap (to track the blocks that have changed on the production/source FS after creation of snap) - initially all zero
  496. -Block Map: used to indicate block address from which data is to be read when the data is accessed from the Snap FS - initially points to production/source FS
  497. -Requires a fraction of the space
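
The copy on first write mechanism with its bitmap and block map can be sketched as follows. `Snapshot` is a hypothetical class, blocks are modelled as list items, and all production writes are assumed to go through `write_production`.

```python
class Snapshot:
    """Pointer-based snapshot of a block list using copy on first write."""

    def __init__(self, production: list):
        self.production = production         # live file system blocks
        self.bitmap = [0] * len(production)  # 1 = block changed since the snap
        self.save_area = {}                  # original data of changed blocks

    def write_production(self, block: int, data) -> None:
        if self.bitmap[block] == 0:
            # First write after the snap: preserve the original data once.
            self.save_area[block] = self.production[block]
            self.bitmap[block] = 1
        self.production[block] = data

    def read_snapshot(self, block: int):
        # Block map: changed blocks are read from the save area,
        # unchanged blocks are still read from production.
        if self.bitmap[block]:
            return self.save_area[block]
        return self.production[block]

fs = ["a", "b", "c"]
snap = Snapshot(fs)
snap.write_production(1, "B")
snap.write_production(1, "B2")   # second write to the same block: no extra copy
print(fs)                                          # ['a', 'B2', 'c']
print([snap.read_snapshot(i) for i in range(3)])   # ['a', 'b', 'c']
print(snap.bitmap, len(snap.save_area))            # [0, 1, 0] 1
```

Only one block was copied, which is why the snapshot needs just a fraction of the space of the source.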
  498. What are the limitations of host based replications?
  499. -LVM based replicas add overhead on host CPUs
  500. -If host volumes are already storage array LUNs then the added redundancy provided by LVM mirroring is unnecessary
  501. -Host based replicas can usually be presented back only to the same server
  502. -Keeping track of changes is a challenge after the replica has been created
  503. Describe how a storage array based local replication works?
  504. -Replication is performed by the Array Operating Environment
  505. -Replicas are on the same array
  506. Types:
  507. -Full-volume mirroring
  508. -Pointer-based full volume replication - Clone
  509. -Pointer -based virtual replication - snap
    What is pointer based full volume replication?
  510. A clone
  511. -Provides a full copy of the source data on the target
  512. -Target device is made accessible for business operation as soon as the replication session is started
  513. -Point-in- time is determined by the time of session activation
  514. Two modes: Copy On First Access (COFA) and Full Copy Mode
  515. -Clone will be the same size or larger
  516. Describe detached full volume mirroring
  517. After synchronization is complete, the target can be detached from the source and made available for BC operations.
  518. -PIT is determined by the time of detachment
  519. -After detachment, re-synchronization can be incremental
    What is COFA?
  520. Copy On First Access - Deferred Mode - not a full clone
  521. Primarily used for testing and development
  522. Data is copied from the source to the target only when:
  523. -A write is issued for the first time after the PIT to a specific address on the source
  524. -A read or write is issued for the first time after the PIT to a specific address on the target.
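
The two copy triggers above can be sketched like this. `CofaClone` is a hypothetical class that models devices as Python lists; it is meant only to show when a block moves from source to target.

```python
class CofaClone:
    """Clone in Copy On First Access mode: blocks are copied lazily."""

    def __init__(self, source: list):
        self.source = source
        self.target = [None] * len(source)
        self.copied = [False] * len(source)

    def _copy_if_needed(self, block: int) -> None:
        if not self.copied[block]:
            self.target[block] = self.source[block]
            self.copied[block] = True

    def write_source(self, block: int, data) -> None:
        self._copy_if_needed(block)   # preserve point-in-time data first
        self.source[block] = data

    def read_target(self, block: int):
        self._copy_if_needed(block)
        return self.target[block]

    def write_target(self, block: int, data) -> None:
        self._copy_if_needed(block)
        self.target[block] = data

source = ["a", "b", "c"]
clone = CofaClone(source)
clone.write_source(0, "A")     # first write to source block 0 triggers a copy
print(clone.read_target(0))    # a
print(clone.read_target(1))    # b
print(clone.copied)            # [True, True, False]
```

Block 2 was never touched on either side, so it was never copied, which is why this mode does not produce a full clone.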
  525. What is full copy mode?
  526. On session start, the entire contents of the source device is copied to the Target device in the background.
  527. -most vendors also provide SW to track changes made to the source or target
  528. What is pointer based virtual replication?
  529. SNAPS
  530. -Targets do not hold actual data, but hold pointers to where the data is located.
  531. -A replication session is setup between source and target device. Target devices are accessible immediately when session is started
  532. -Moves data into a resource LUN Pool (RLP)
  533. How are changes tracked in a database after PIT has been created?
    Done using bitmaps. The bits in the source and target bitmaps are all set to 0 when the replica is created. Any changes to the source or target are then flagged by
  534. setting the appropriate bits to 1 in the bitmap.
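
A short sketch of the bitmap idea, assuming one bit per block and assuming that a later resynchronization needs every block flagged on either side (a logical OR of the two bitmaps, which the card itself does not spell out). `blocks_to_resync` is a hypothetical helper.

```python
def blocks_to_resync(source_bitmap: list, target_bitmap: list) -> list:
    """Return blocks changed on either side since the point in time."""
    return [
        index
        for index, (src, tgt) in enumerate(zip(source_bitmap, target_bitmap))
        if src or tgt
    ]

# All bits start at 0 when the replica is created.
source_bitmap = [0] * 8
target_bitmap = [0] * 8

source_bitmap[2] = 1   # production wrote to block 2
target_bitmap[5] = 1   # replica was modified at block 5
source_bitmap[6] = 1   # production wrote to block 6

print(blocks_to_resync(source_bitmap, target_bitmap))  # [2, 5, 6]
```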
  535. What are the two methods of Restore/Restart Operations?
  536. Restore the data from the target to the source (done incrementally; applications can be restarted before synchronization is complete)
  537. or
  538. Start production on the target (must copy target before you start production)
  539. What are some considerations of Restore/Restart?
  540. Before a restore: stop all access to the source and target. Based on RPO and data consistency identify target for restore, then perform restore.
  541. Before starting production from Target: stop all access, identify target based on RPO, create a "gold" copy of target, start production on target
  542. Restore/Restart Considerations for Pointer Based Full Volume (clone) and Virtual (snap) Replications
  543. Clone - restores can be performed to either the original source device or to any other device of like size
  544. Snap - can be performed to the original source or to any other device of like size as long as the original source is healthy
  545. Describe Local Replication Management on the Array
  546. - Replication management software resides on the storage array
  547. -Provides an interface for easy and reliable replication management
  548. -Two types of interfaces: command line (CLI) and GUI
  549. What are EMC's local replication solutions?
  550. Symmetrix Arrays
  551. -TimeFinder/Clone (full), TimeFinder/Mirror (full), TimeFinder/Snap (pointer)
  552. CLARiiON Arrays
  553. SnapView Clone (full) and Snapshot (pointer)
  554. What is Remote Replication?
  555. The process of creating replicas of information assets at remote sites.
  556. What is synchronous Replication?
  557. Data is committed at both the source site and the target site before the write is acknowledged to the host. Any write to the source must be transmitted to and acknowledged by the target before signaling a write complete to the host.
  559. -Provides zero RPO and low RTO
  560. What are the challenges of synchronous replication?
  561. -Response time extension for applications (data must be transmitted to the target site before the write can be acknowledged)
  562. -Bandwidth - needs high bandwidth
  563. -Rarely deployed beyond 200 Km (125 miles)
  564. What is asynchronous replication?
  565. - a write is committed to the source and immediately acknowledged to the host
  566. -Data is buffered at the source and transmitted to the remote site later
  567. - Finite RPO (replica will be behind by a little)
  568. -Writes are time stamped and applied to the target in the order in which they were received
  569. -Needs average bandwidth
  570. -can be deployed over the long distance
  571. What are the two remote replication technologies?
  572. Host Based: Logical Volume Manager (LVM) based replication, which supports both synchronous and asynchronous modes, and log shipping
  573. Storage Array Based: supports both synchronous and asynchronous modes. Disk buffered - consistent PITs - combines local and remote replication
  574. LVM Based Replication
  575. Is performed and managed at the volume group level. Writes to the source volume are transmitted to the remote host by LVM. The LVM on the remote host receives the writes and commits them to the remote volume group.
  577. - created at the source
  578. What are the advantages and disadvantages of LVM?
  579. Adv - different storage arrays and RAID protection can be used at the source and target sites.
  580. -Response time issues can be eliminated with asynchronous mode, with extended RPO.
  581. Disadvantage - Extended network outages require large log files and result in higher CPU overhead on the host.
  582. What is host based log shipping?
  583. Transactions to the source database are captured in logs, which are periodically transmitted by the source host to the remote host. The remote host receives the logs and applies them to the remote database.
  585. - Advantages: minimal CPU, low bandwidth, standby database consistent to last applied log
  586. What is storage array based remote replication?
  587. -Replication is performed by the array operating environment so that host CPU resources can be devoted to production, arrays communicate over dedicated channels
  588. -Replicas are on different arrays. Most used for disaster recovery.
  589. How does array based synchronous replication work?
  590. 1. Write is received by the source array from the host
  591. 2. Write is transmitted by the source array to the target array
  592. 3. Target array sends acknowledgement to the source array
  593. 4. Source array signals write complete to host
  594. How does array based asynchronous replication work?
  595. 1. Write is received by the source array from the host
  596. 2. Source array signals write complete to host
  597. 3. Write is transmitted by source array to the target array
  598. 4. Target array sends acknowledgement to the source array
  599. - no impact on response time, extended distances between arrays, lower bandwidth
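The difference between the synchronous and asynchronous write sequences is only where "write complete" falls relative to the transfer to the target. The sketch below models both with plain Python lists standing in for arrays; the function names and the `buffer` queue are assumptions for illustration, not an array vendor's API.

```python
from collections import deque


def synchronous_write(source, target, data):
    """Host gets 'write complete' only after the target has the data."""
    events = ["source receives write"]
    source.append(data)
    target.append(data)
    events.append("source transmits write to target")
    events.append("target acknowledges source")
    events.append("source signals write complete to host")
    return events


def asynchronous_write(source, buffer, data):
    """Host gets 'write complete' immediately; transfer is deferred."""
    source.append(data)
    buffer.append(data)   # buffered at the source for later transmission
    return ["source receives write", "source signals write complete to host"]


def drain(buffer, target):
    """Later, buffered writes are sent to the target in arrival order."""
    while buffer:
        target.append(buffer.popleft())


sync_src, sync_tgt = [], []
sync_events = synchronous_write(sync_src, sync_tgt, "w1")

async_src, async_tgt, pending = [], [], deque()
async_events = asynchronous_write(async_src, pending, "w1")
lag_before_drain = len(async_src) - len(async_tgt)   # replica is behind (finite RPO)
drain(pending, async_tgt)
```

In the asynchronous case the host is acknowledged while the target is still one write behind, which is the finite RPO traded for a shorter response time.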
  600. How do you ensure consistency in asynchronous replication?
  601. You can maintain a write order - attach a time stamp
  602. or
  603. Dependent write consistency (buffer the writes in the cache of the source array for a period of time)
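Maintaining write order can be sketched as follows: each write carries a stamp, and the target applies a write only when every earlier write has already been applied. The example uses a sequence number in place of a real time stamp, and the class `OrderedApplier` is a hypothetical name; both are assumptions for illustration.

```python
import heapq


class OrderedApplier:
    """Applies writes at the target strictly in stamp order."""

    def __init__(self):
        self.next_seq = 0    # stamp of the next write allowed to be applied
        self.pending = []    # min-heap of (seq, addr, data) received early
        self.volume = {}     # target volume contents

    def receive(self, seq, addr, data):
        heapq.heappush(self.pending, (seq, addr, data))
        # Apply every write whose predecessors have all been applied.
        while self.pending and self.pending[0][0] == self.next_seq:
            _, a, d = heapq.heappop(self.pending)
            self.volume[a] = d
            self.next_seq += 1


applier = OrderedApplier()
applier.receive(1, "blk7", "second")        # arrives early, must wait
held_back = dict(applier.volume)            # nothing applied yet
applier.receive(0, "blk7", "first")         # gap filled, both are applied in order
```

Because the later write is held until the earlier one arrives, the target never shows a state the source did not pass through.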
  604. What is Array based Disk Buffered Replication?
  605. local and remote replication technologies can be combined to create consistent PIT copies of data on target arrays.
  606. -RPO is in HOURS
  607. -Lower bandwidth is required
  608. -Extended distance
  609. What is three site replication and what are the two types?
  611. -There is a bunker site between replication sites.
  612. 1. Cascade/Multihop
  613. 2. Triangle/Multi-target - SRDF/Star on Symmetrix - concurrent replication of source to two different arrays.
  614. Describe SAN Based Remote Replication
  615. -Replicate from one storage array to any other storage array over SAN/WAN
  616. - can implement tiered storage, do data migrations, and remote vault
  617. -heterogeneous array support
  618. -No impact on LAN or servers
  619. What are the terminologies in SAN Based Replication?
  620. Control Array: responsible for replication operations
  621. Remote Array: to/from which data is being replicated
  622. Operation (2):
  623. -Push: data is pushed from control array to remote array
  624. -Pull: data is pulled to the control array from remote array
  625. * The names control/remote DO NOT indicate the direction of data flow; they only indicate which array is performing the replication operation
  626. What are the network options for Remote Replication?
  627. a dedicated or a shared network must be in place for remote replication
  628. -Uses an optical network for extended distances: DWDM and SONET
  629. What is DWDM?
  630. Dense wavelength division multiplexing (DWDM) - puts data from different sources together on an optical fiber with each signal carried on its own separate light wavelength.
  632. - up to 32 protected and 64 unprotected separate wavelengths of data can be multiplexed into a light stream transmitted on a single optical fiber.
  633. What is SONET?
  634. -Synchronous Optical Network is Time Division Multiplexing (TDM) technology
  635. -Implemented over long distances
  636. What are the types of EMC remote replication?
  637. Symmetrix Arrays: SRDF/Synchronous, SRDF/Asynchronous and SRDF/Automated Replication
  638. CLARiiON Arrays: MirrorView (synchronous/asynchronous)
  639. SAN Copy: SAN based remote replication solution for EMC CLARiiON
  640. What is Storage Security?
  641. The application of security principles and practices to storage networking (data storage + networking) technologies.
    • The focus is secured access to information
    • Begins with building a framework
  642. Describe a Storage Security Framework
  643. A systematic way of defining security requirements
    • The framework should incorporate:
    • Anticipated security attacks
    • Actions that compromise the security of information
    • Security measures
    • Control designed to protect from these security attacks
  644. What are the attributes of a Storage Security Framework?
  645. Confidentiality
    • Provides the required secrecy of information
    • Ensures only authorized users have access to data
    • Integrity of data
    • Ensures that the information is unaltered
    • Availability of data
    • Ensures that authorized users have reliable and timely access to data
    • Accountability
    • Accounting for all events and operations that take place in the data center infrastructure that can be audited or traced later
    • Help to uniquely identify the actor that performed the action
  646. Define the Risk Triad
  647. Refers to risk in terms of threats, assets and vulnerabilities.
  648. In terms of security, what are the most important assets for any organization?
  649. Information is one of the most important assets for any organization
  650. Other assets include hardware, software and network infrastructure
  651. What are some considerations for security mechanisms?
  652. It must provide easy access to information assets for authorized users
    • Make it difficult for potential attackers to access and compromise the system
    • It should only cost a fraction of the value of the protected asset
    • It should cost a potential attacker more, in terms of money and time
  653. What are the two types of attacks that can be carried out on an IT infrastructure?
  654. Active Attacks:
    • Data Modification, Denial of Service (DoS), Repudiation attacks
    • Passive Attacks:
    • Attempts to gain unauthorized access to the system
    • Threats to the confidentiality of information
  655. Where can vulnerabilities occur in an information system?
  656. Vulnerabilities can occur anywhere in a system.
  657. An attacker can bypass controls implemented at a single point in the system
    • Failure anywhere in the system can jeopardize the security of information assets
    • Loss of authentication may jeopardize confidentiality
    • Loss of a device jeopardizes availability
    • Requires Defense in Depth
  658. What is Defense in Depth?
  659. The practice of protecting all access points within an environment.
  660. Reduces vulnerability to an attacker who can gain access to storage resources by bypassing inadequate security controls implemented at the vulnerable single point of access
  662. What are three factors to consider when assessing the extent to which an environment is vulnerable to security threats?
  663. Attack Surface: The various entry points that an attacker can use to launch an attack
    • Attack Vector: A step in a series of steps necessary to complete an attack
    • Work Factor: The amount of time and effort required to exploit an attack vector.
  664. In terms of security vulnerabilities, what are some of the solutions to protect critical assets?
  665. Minimize the attack surface
    • Maximize the work factor
    • Manage vulnerabilities
    • Detect and remove vulnerabilities
    • Install countermeasures to lessen impact
  666. What are some technical countermeasures to network vulnerabilities?
  667. Implementations in computer hardware, software and firmware
  668. What are some non-technical countermeasures to network vulnerability?
  669. Administrative Policies and Standards
    • Physical Standards
    • Guards
    • Gates
  670. What are the three Security Domains?
  671. Application: Involves access to stored data through the storage network
    • Management: Involves access to storage and interconnect devices and to the data residing on those devices
    • Backup & Data Storage: BURA access
  672. What does BURA stand for?
    Backup, Recovery and Archive
  673. What are some of the threats in the Application Access Domain?
  674. Spoofing user / host identity
    Elevation of user / host privileges
  675. What are some of the threats in the Management Access Domain?
  676. Spoofing user / administrator identity
    • Elevation of user / administrator privileges
    • Tampering with Data
    • Denial of Service
    • Network snooping
  677. What are some of the threats in the BURA Domain?
  678. Spoofing of User / Administrator Identity
    • Elevation of User / Administrator privilege
    • Tampering with Data
    • Denial of Service
    • Network snooping
  679. What are some of the Security Controls used in Storage Infrastructure?
  680. User Authentication
    • User Authorization
    • Host and Storage authentication
    • Access Control to Storage Objects
    • Storage Access Monitoring
    • Infrastructure Integrity
    • Storage Network Encryption
    • Management Network Encryption
    • Management Access Control
    • Primary to secondary Access Control
    • Backup Encryption
    • Replication Network Encryption
  681. What are some of the security implementations in SAN?
  682. Traditional FC SANs are configured as an isolated private network making them inherently more secure
    • However, storage consolidation has led to larger SAN designs that span multiple sites across enterprises
    • This has led to the creation of Authenticating FC Entities
    • Setting up Session Keys
  683. What are some of the basic SAN Security Mechanisms?
  684. Array Based Volume Access Controls
    • Security on FC Switch Ports
    • Switch-Wide and Fabric-Wide Access Control
    • Logical Partitioning of a fabric: Virtual SAN (VSAN)
  685. What is Array Based Volume Access Control?
  686. LUN Masking: Filters the list of LUNs that an HBA can access
    • S_ID Lockdown (EMC Symmetrix Arrays): Stronger variant of masking
    • Port Zoning: Zone Member is of the form (Switch_Domain_ID_Port_Number).
    • Mitigates against WWPN spoofing attacks and route based attacks
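LUN masking, as described above, amounts to a per-HBA filter applied by the array. A minimal sketch, assuming a masking table keyed by the HBA's WWPN; the WWPN values, the table, and the function name are made up for the example.

```python
# Hypothetical masking table: HBA WWPN -> set of LUN numbers it may access.
MASKING_TABLE = {
    "10:00:00:00:c9:aa:bb:01": {0, 1},
    "10:00:00:00:c9:aa:bb:02": {2},
}


def visible_luns(hba_wwpn, all_luns):
    """Return only the LUNs the array exposes to this HBA."""
    allowed = MASKING_TABLE.get(hba_wwpn, set())   # unknown HBAs see nothing
    return sorted(lun for lun in all_luns if lun in allowed)


array_luns = range(4)
host1_view = visible_luns("10:00:00:00:c9:aa:bb:01", array_luns)
unknown_view = visible_luns("10:00:00:00:c9:ff:ff:ff", array_luns)
```

Because the filter trusts the WWPN the HBA presents, a spoofed WWPN would pass it, which is the gap that stronger variants such as S_ID lockdown and port zoning aim to close.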
  687. What are some security measures on FC Switch Ports?
  688. Port Binding: Limits devices that can attach to a particular switch port
    • A node must be connected to its corresponding switch port for fabric access
    • Mitigates but does not eliminate WWPN spoofing
    • Port Lockdown / Port Lockout: Restricts the type of initialization of a switch port
    • Typical variants include:
    • Port cannot function as an E-Port, cannot be used for ISL
    • Port role is restricted to just FL-Port, F-Port, E-Port or some combination
    • Persistent Port Disable: Prevents a switch port from being enabled, even after a port disable
  689. What are some of the components of Switch-Wide and Fabric-Wide Access Control?
  690. Access Control Lists (ACL):
    • Device Connection Control- prevents unauthorized devices (identified by WWPN) from accessing the fabric
    • Switch Connection Control - Prevents unauthorized switches (identified by WWPN) from joining the fabric
    • Fabric Binding: Prevents an unauthorized switch from joining any existing switch in the fabric
    • Role Based Access Control (RBAC): Specifies which user can have access to which device in a fabric
  691. What is Logical partitioning of a fabric?
  692. Dividing a physical topology into separate logical fabrics.
  693. Administrator allocates switch ports to different VSANs
    • A switch port (and the HBA or storage port assigned to it) can be in only one VSAN at a time
    • Each VSAN has its own distinct active zone set and zones
    • Fabric events in one VSAN are not propagated to the others
    • Role based Management
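The rule that a switch port belongs to only one VSAN at a time can be modeled with a simple mapping. The sketch below is illustrative only; `VsanFabric`, the port names, and the VSAN IDs are assumptions.

```python
class VsanFabric:
    """Toy model of ports allocated to VSANs on one physical fabric."""

    def __init__(self):
        self.port_to_vsan = {}

    def assign(self, port, vsan_id):
        # A port is in one VSAN at a time, so a new assignment replaces the old one.
        self.port_to_vsan[port] = vsan_id

    def same_logical_fabric(self, port_a, port_b):
        vsan_a = self.port_to_vsan.get(port_a)
        return vsan_a is not None and vsan_a == self.port_to_vsan.get(port_b)


fabric = VsanFabric()
fabric.assign("fc1/1", 10)
fabric.assign("fc1/2", 10)
fabric.assign("fc1/3", 20)
together = fabric.same_logical_fabric("fc1/1", "fc1/2")
isolated = fabric.same_logical_fabric("fc1/1", "fc1/3")
fabric.assign("fc1/2", 20)    # moving the port removes it from VSAN 10
after_move = fabric.same_logical_fabric("fc1/1", "fc1/2")
```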
  694. What are some authentication and authorization mechanisms used in NAS?
  695. NAS is open to multiple exploits, including viruses, worms, unauthorized access, snooping and data tampering.
  696. Kerberos and Directory Services
    • Identity verification
    • Firewalls
    • Protection from unauthorized access and malicious attacks
  697. What are the types of Windows Access Control Lists used in NAS?
  698. Discretionary ACLs
    • Commonly referred to as ACL
    • Used to determine access control
    • System ACL (SACLs)
    • Determines what accesses need to be audited if auditing is enabled
  699. Describe the UNIX Permissions used in NAS file sharing
  700. User
    • Permissions tell UNIX what can be done with that file and by whom
    • Every file and directory (folder) has three access permissions
    • Rights for the file owner
    • Rights for the group you belong to
    • Rights for all others in the facility
    • File directory permission looks
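The three permission sets (owner, group, others) can be decoded with Python's standard `stat` module. The mode `0o754` below is an arbitrary example value, and the helper `describe` is a hypothetical name.

```python
import stat


def describe(mode):
    """Break a file mode into read/write/execute rights per permission set."""
    def rights(r, w, x):
        return {"read": bool(mode & r), "write": bool(mode & w), "execute": bool(mode & x)}

    return {
        "owner": rights(stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        "group": rights(stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        "others": rights(stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    }


mode = stat.S_IFREG | 0o754        # regular file: owner rwx, group r-x, others r--
listing = stat.filemode(mode)      # the form shown by "ls -l"
perms = describe(mode)
```

Here `listing` is `-rwxr-xr--`: the first character is the file type, followed by three characters each for the owner, the group, and all others.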
  701. What is CHAP?
  702. Challenge Handshake Authentication Protocol.
  703. A basic authentication mechanism that has been widely adopted by network devices and hosts
    • Implemented as:
    • One Way - Authentication password configured on only one side of the connection
    • Two Way - Authentication password is configured on both sides of the connection requiring both nodes to validate the connection
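The challenge/response exchange can be sketched as below. It follows the response calculation defined for CHAP in RFC 1994 (an MD5 hash over the identifier, the shared secret, and the challenge). The function names and the secret value are assumptions for illustration, and this is not a complete protocol implementation.

```python
import hashlib
import hmac
import os


def chap_response(identifier, secret, challenge):
    """Response = MD5(identifier || secret || challenge), per RFC 1994."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def verify(identifier, secret, challenge, response):
    """Authenticator recomputes the hash; the secret never crosses the wire."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


shared_secret = b"example-secret"     # hypothetical value configured on both nodes
challenge = os.urandom(16)            # fresh random challenge from the authenticator
good = verify(1, shared_secret, challenge, chap_response(1, shared_secret, challenge))
bad = verify(1, shared_secret, challenge, chap_response(1, b"wrong-secret", challenge))
```

One-way CHAP runs this exchange in one direction only; two-way CHAP runs it a second time with the roles reversed so that each node validates the other.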
  704. What is iSNS Discovery Domain?
  705. Internet Storage Name Service
  706. Functions the same way as FC Zones
    • Provides functional groupings of devices in an IP-SAN
    • In order for devices to communicate with one another, they must be configured in the same discovery domain
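The discovery domain rule can be expressed as a membership check: two devices can discover each other only if at least one discovery domain contains both. The domain names and iSCSI qualified names below are made up for the example.

```python
# Hypothetical discovery domains: name -> set of member iSCSI node names.
DISCOVERY_DOMAINS = {
    "dd-production": {"iqn.2012-04.example:host1", "iqn.2012-04.example:array1"},
    "dd-test": {"iqn.2012-04.example:host2", "iqn.2012-04.example:array1"},
}


def can_communicate(node_a, node_b):
    """True if some discovery domain contains both nodes."""
    return any(node_a in members and node_b in members
               for members in DISCOVERY_DOMAINS.values())


host1_to_array = can_communicate("iqn.2012-04.example:host1", "iqn.2012-04.example:array1")
host1_to_host2 = can_communicate("iqn.2012-04.example:host1", "iqn.2012-04.example:host2")
```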
  707. What are the major storage infrastructure components that should be monitored?
  708. Servers, Databases and Applications
    • Network (SAN & IP)
    • Storage Arrays
  709. Of the major storage components of an information storage infrastructure that should be monitored, what should they be monitored for?
  710. Capacity
    • Accessibility
    • Performance
    • Security
  711. What are some of the consequences of an Array Port Failure?
  712. If one of the Storage Array Ports fails, all of the storage volumes that are accessed through the switch connected to that port may become unavailable, depending on the type of array.
  714. What are the levels of alerts used in monitoring?
  715. Information Alert - Provides useful information and may not require administrator intervention
    • Warning Alert - Requires administrator attention
    • Fatal Alert - Requires immediate administrator attention
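As an illustration, a monitoring tool might map a capacity reading to one of the three alert levels. The thresholds of 80% and 95% and all names below are assumptions chosen for the example, not values from any product.

```python
from enum import Enum


class AlertLevel(Enum):
    INFORMATION = 1   # useful to know, may need no action
    WARNING = 2       # needs administrator attention
    FATAL = 3         # needs immediate attention


def classify_capacity(used_percent, warning_at=80.0, fatal_at=95.0):
    """Map file system or pool utilization to an alert level."""
    if used_percent >= fatal_at:
        return AlertLevel.FATAL
    if used_percent >= warning_at:
        return AlertLevel.WARNING
    return AlertLevel.INFORMATION


levels = [classify_capacity(p) for p in (42.0, 85.5, 97.0)]
```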
  716. What are some ways to ensure high availability in storage infrastructures?
  717. **Eliminate Single Points of Failure**
  718. Two or more:
    • HBAs
    • Switches
    • Fabrics
    • Multipathing software with failover capability
    • RAID Protection
    • Redundant Fabrics
    • Configuring data backup and replication
    • Deploying a virtualized environment
  719. What is Performance Management?
  720. Ensures the optimal operational efficiency of all components.
  721. Performance analysis is an important activity that helps identify the performance of storage infrastructure components
  722. Describe the architecture of the EMC Control Center
  723. EMC Control Center Storage Management Suite provides an end-to-end integrated approach for dealing with multi-vendor storage reporting, monitoring, configuration and control tasks. It is made up of three tiers:
  725. User interface Tier - Displays Data
    • Infrastructure Tier - Processes Data
    • Agent Tier - Collects Data
Card Set:
2012-04-18 01:12:25
