Big Data Batch Processing & Data Warehouse over Hadoop

...
This is now a virtual classroom course. You can find more information about our virtual classroom here

Is your work heavily related to data? Are you looking for a career shift into Big Data, one of the hottest areas in the tech industry? If so, this is the course you’ve been searching for!

You will have the opportunity to expand your knowledge of Big Data Engineering on a Hadoop installation with real data. During the course we will guide you through Big Data architectures and provide hands-on experience with Big Data integration, management and manipulation techniques and best practices, to help you build your next-generation data management platform.

Total price: € 490.00 (final price)

Who should attend

IT Professionals interested in crossing over into the challenging development territory in the Big Data domain.

Prerequisites

  • SQL fluency
  • Basics of Python or another programming language
  • Data Modelling for Transactional Databases (Normal Forms) or Data Warehouse schema design
  • Familiarity with an ETL tool (e.g. Microsoft SSIS or similar) and ETL terminology

What will you learn

During the course you will cover the following topics:

    • General
      • Big Data Architecture Overview
        • Describe the Big Data landscape including examples of real world big data problems
        • Explain the architectural components and programming models used for scalable batch big data analysis
        • Summarize the features and value of core Hadoop stack components, including the YARN resource and job management system, the HDFS file system, the MapReduce programming model (briefly) and advanced SQL engines
    • Big Data Tools & Practices
      • Data Lake design to host the new Data Warehouse
      • Batch (re)processing
        • Retrieve data from example database and big data management systems
        • Describe the connections between data management operations and the big data processing patterns needed to utilize them in large-scale analytical applications
        • Identify when a big data problem needs data integration
        • Execute real world big data integration and processing using advanced SQL query engines and Spark
        • Select a data model to suit the characteristics of your data
        • Differentiate between a traditional Database Management System and a Big Data Management System
        • Implement a solution to a data problem with a big data algorithm, comparing Spark with SQL
      • Present and work with unified tools and user interfaces such as Cloudera’s Hue, Data Science Workbench or Hortonworks’ Hive View
      • Integration with Other tools
        • Recognize different data elements in your own work and in everyday problems
        • Learn how to integrate traditional database management systems with Big Data platforms
        • Explain why your team needs to design a Big Data Infrastructure Plan and Information System Design
        • Identify the frequent data operations required for various types of data
        • Build your own reports with Excel and PowerBI on real data
      • Data Engineers’ daily Workflows

    All course material will be taught on a reference Hadoop installation, and all participants will be required to run and/or develop examples using the tools that we will make available.

Schedule

The next 2-day course is coming soon.