Introduction to Big Data

Overview

  1. What started the big data era
  2. Three main big data sources
  3. How to get value from big data
  4. Big data's characteristics
  5. 5-step process to gain value from big data
  6. The main elements of the Hadoop stack

What started the big data era

Data torrent + computing (anytime and anywhere)

Three main big data sources

  • Machines
  • People
  • Organizations

How to get value from big data

Value comes from integrating different types of data sources

Data integration

  1. Reduce data complexity
  2. Increase data availability
  3. Unify your data system
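
A minimal sketch of what integrating two sources can look like, assuming a machine-generated log and a people-related record set that share a key. The data, the field names (`customer_id`, `page_views`, `name`) and the helper `integrate` are all hypothetical, chosen only for illustration:

```python
# Hypothetical records from two separate sources, keyed by a shared customer id.
machine_logs = [
    {"customer_id": 1, "page_views": 12},
    {"customer_id": 2, "page_views": 3},
    {"customer_id": 3, "page_views": 7},  # no matching customer record
]
customer_records = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
]


def integrate(left, right, key):
    """Join two lists of dicts on `key`, keeping only rows present in both."""
    # Index the right-hand source by key so each lookup is a dict access.
    index = {row[key]: row for row in right}
    # Merge the fields of matching rows into one unified record.
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


unified = integrate(machine_logs, customer_records, "customer_id")
for row in unified:
    print(row)
```

After the join, each unified record can be queried in one place instead of two, which is the sense in which integration reduces complexity and increases availability.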

Big data's characteristics

  1. Volume (Size)
  2. Variety (Complexity)
  3. Valence (Connectedness)
  4. Veracity (Quality)
  5. Velocity (Speed)

5-step process to gain value from big data

  1. Acquire
     • Identify data sets
     • Retrieve data
     • Query data
  2. Prepare
     • Explore data
       • Understand the nature of data
       • Preliminary analysis
     • Pre-process data
       • Clean
       • Integrate
       • Package
  3. Analyze
     • Select analytical techniques
     • Build models
  4. Report
     • Communicate results
  5. Act
     • Apply results
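
The five steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the readings are made up, the "model" is a plain mean, and the function names and the `threshold` parameter are hypothetical rather than part of any framework:

```python
from statistics import mean


def acquire():
    # Step 1 (Acquire): retrieve a data set. Hard-coded, made-up readings here.
    return [" 21.5", "22.0", "", "bad", "23.5 "]


def prepare(raw):
    # Step 2 (Prepare): clean the data by dropping values that are not numbers.
    cleaned = []
    for item in raw:
        try:
            cleaned.append(float(item.strip()))
        except ValueError:
            continue  # skip blanks and unparseable entries
    return cleaned


def analyze(values):
    # Step 3 (Analyze): a deliberately trivial "model", the count and mean.
    return {"count": len(values), "mean": mean(values)}


def report(summary):
    # Step 4 (Report): communicate the result in a readable form.
    return f"{summary['count']} valid readings, mean {summary['mean']:.2f}"


def act(summary, threshold=22.0):
    # Step 5 (Act): apply the result to a decision (hypothetical rule).
    return "cool down" if summary["mean"] > threshold else "no action"


summary = analyze(prepare(acquire()))
print(report(summary))
print(act(summary))
```

Each function maps to one step, so a real project could swap in a database query for `acquire` or a trained model for `analyze` without changing the overall flow.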

The main elements of the Hadoop stack

  1. Enable Scalability
  2. Handle Fault Tolerance
  3. Optimize for a Variety of Data Types
  4. Facilitate a Shared Environment
  5. Provide Value