Ch 15 & 16 Big Data - Exam IV

The flashcards below were created by user mjweston on FreezingBlue Flashcards.

  1. exploratory stage
    codifying stage
    integration stage
    stages of big data analysis
  2. exploratory stage
    stage of BD analysis in which patterns are searched for - requires integration of big data software - uncovers relationships - used to pare down data to only that which is applicable
  3. codifying stage
    stage of BD analysis in which data is integrated into the business process of the org
  4. integration & incorporation stage
    stage of BD analysis that is all about ETL - basic questions are answered about the results of the analysis & its connection to the business process of the org (who, what, where, etc.)
  5. high quality
    trust is essential
    data consistent across org
    basic fundamentals of data in BD integration
  6. data transformation
    the process of changing the format of data so that it can be used by different applications - also includes mapping instructions so apps are told how to get the data they need to process
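    A minimal sketch of the transformation idea above: source records are reshaped so a target application can consume them, with a mapping table standing in for the "mapping instructions". The field names and converters are invented for illustration, not taken from the text.

    ```python
    # Mapping instructions: source field -> (target field, converter).
    # All names here are hypothetical examples.
    FIELD_MAP = {
        "cust_nm": ("customer_name", str.strip),
        "ord_amt": ("order_amount", float),
        "ord_dt":  ("order_date", lambda s: s.replace("/", "-")),
    }

    def transform(record: dict) -> dict:
        """Apply the mapping so the target app gets data in the format it expects."""
        return {tgt: conv(record[src]) for src, (tgt, conv) in FIELD_MAP.items()}

    source_record = {"cust_nm": "  Acme Co ", "ord_amt": "19.99", "ord_dt": "2013/11/08"}
    print(transform(source_record))
    # {'customer_name': 'Acme Co', 'order_amount': 19.99, 'order_date': '2013-11-08'}
    ```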
  7. ELT - Extract, Load, and Transform
    tools that can transform the data in the source or target database - faster and more scalable
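    To make the ELT ordering concrete, here is a sketch using an in-memory SQLite database as the "target": raw rows are loaded first, then transformed inside the database with SQL, so the database engine does the heavy lifting. Table and column names are illustrative.

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_orders (cust TEXT, amount TEXT)")

    # Extract + Load: raw, untransformed strings go straight into the target.
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)",
                     [(" acme ", "19.99"), (" globex ", "5.00")])

    # Transform: done in the target database itself (trim, uppercase, cast).
    conn.execute("""
        CREATE TABLE orders AS
        SELECT UPPER(TRIM(cust)) AS customer, CAST(amount AS REAL) AS amount
        FROM raw_orders
    """)
    print(conn.execute("SELECT * FROM orders").fetchall())
    # [('ACME', 19.99), ('GLOBEX', 5.0)]
    ```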
  8. data quality
    is a moving target - less important when doing exploratory data analysis; critical when using data for decision support
  9. siloing (siloing can't be the status quo - ALL data throughout the org must be integrated)
    where data throughout the org is not integrated, but separate (accounting doesn't matter to marketing, etc.)
  10. data in motion
    the ability to be alerted to, and react to, an event as it is occurring - is one of the greatest challenges
  11. streaming
    complex event processing
    available technologies that allow data in motion
  12. streaming data
    a continuous flow of unstructured data
  13. data at rest
    data that is in storage & not currently being used
  14. data in motion
    data that is either in transition (moving from one place to another, such as the Twitter service into the org) or has been copied to memory for processing in real time
  15. speed
    the foundation for streaming data
  16. streaming technology
    the data in motion technology that is closely tied to the volume of the data
  17. complex event processing
    the data in motion technology where the volume of data is secondary to the capability to match data to rules
  18. complex event processing (CEP)
    typically deals with a few variables that need to be correlated with a specific business process - dependent on data streams, but is not required for streaming data
  19. streaming data
    useful when analytics need to be done in real time while data is in motion - where the value of the analysis decreases with time
  20. a single-pass analysis - (an important factor about streaming data analysis)
    the analyst cannot reanalyze the data after it is streamed
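    Because streamed data cannot be re-read, any statistic must be computed in a single pass as values arrive. A sketch of that constraint, with a plain generator standing in for a live feed (the values are invented):

    ```python
    def readings():
        # stand-in for a live data stream; once consumed, it is gone
        yield from [12.0, 15.5, 9.0, 14.5]

    count, total = 0, 0.0
    for value in readings():      # each value is seen exactly once
        count += 1
        total += value            # update the running aggregate immediately

    print(count, total / count)   # 4 12.75
    ```

    Any question not answered by the running aggregates (e.g. a median) cannot be recovered afterward, which is why the analysis must be designed before the stream arrives.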
  21. XML - eXtensible Markup Language
    a technique for presenting unstructured text files with meaningful tags
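    A small sketch of the idea: XML wraps otherwise free-form text in meaningful tags so applications can pull out the pieces they need. The element names here are invented for illustration, using Python's standard-library parser.

    ```python
    import xml.etree.ElementTree as ET

    # Free-form text wrapped in meaningful (hypothetical) tags
    doc = """<message>
      <sender>plant-sensor-7</sender>
      <body>pressure reading high</body>
    </message>"""

    root = ET.fromstring(doc)
    print(root.find("sender").text)   # plant-sensor-7
    print(root.find("body").text)     # pressure reading high
    ```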
  22. 1. IBM InfoSphere Streams - ability to analyze wide variety of data
    2. Twitter's Storm - open source & can be used with any programming language
    3. Apache S4 (simple, scalable, streaming system) - interacts w/ any programming language
    streaming data technologies
  23. complex event processing
    a technique for tracking, analyzing, & processing events as they happen - the results of analysis are aligned with appropriate business processes or actions (ex: high pressure in plant generates a shutdown order)
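    The CEP card above can be sketched as rule matching: each incoming event is checked against a small set of rules, and a match triggers a business action, like the plant-shutdown example. The thresholds and event fields are invented.

    ```python
    RULES = [
        # (condition on event, action to trigger) - hypothetical rules
        (lambda e: e["type"] == "pressure" and e["value"] > 100, "shutdown_order"),
        (lambda e: e["type"] == "temperature" and e["value"] > 80, "raise_alert"),
    ]

    def process(event: dict) -> list:
        """Return the actions triggered by one event as it happens."""
        return [action for cond, action in RULES if cond(event)]

    print(process({"type": "pressure", "value": 117}))   # ['shutdown_order']
    print(process({"type": "pressure", "value": 42}))    # []
    ```

    Note how the volume of data is secondary here: the work is matching each event to the rules, which is what distinguishes CEP from raw streaming.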
  24. streaming data - (has no real focus)
    CEP - Complex Event Processing - (leads to an action & an org can have multiple CEPs focusing on different things)
    two processes that allow an org to align strategy with big data processing to support rapid decision & action deployment
Card Set:
Ch 15 & 16 Big Data - Exam IV
2013-11-08 11:54:13
Integrating Data Sources, Streaming & CEP