Is your work heavily related to data? Are you looking for a career shift into Big Data, one of the hottest areas in the tech industry? If so, this is the course you’ve been searching for!
You will have the opportunity to expand your knowledge of Big Data Engineering on a Hadoop installation with real data. During the course we will guide you through Big Data architectures and provide hands-on experience with Big Data integration, management and manipulation techniques and best practices, to help you build your next-generation data management platform.
Who should attend
IT Professionals interested in crossing over into the challenging development territory in the Big Data domain.
Prerequisites
- SQL fluency
- Basics of Python or another programming language
- Data Modelling for Transactional Databases (Normal Forms) or Data Warehouse schema design
- Be comfortable with an ETL tool (e.g. Microsoft SSIS or similar) and ETL terminology
What you will learn
During the course you will cover the following topics:
- General
- Big Data Architecture Overview
- Describe the Big Data landscape including examples of real world big data problems
- Provide an explanation of the architectural components and programming models used for scalable batch big data analysis
- Summarize the features and value of core Hadoop stack components, including the YARN resource and job management system, the HDFS file system, the MapReduce programming model (briefly) and advanced SQL engines
- Big Data Tools & Practices
- Data Lake design to host the new Data Warehouse
- Batch (re)processing
- Retrieve data from example database and big data management systems
- Describe the connections between data management operations and the big data processing patterns needed to utilize them in large-scale analytical applications
- Identify when a big data problem needs data integration
- Execute real world big data integration and processing using advanced SQL query engines and Spark
- Select a data model to suit the characteristics of your data
- Differentiate between a traditional Database Management System and a Big Data Management System
- Implement a solution to a data problem with a big data algorithm, using Spark versus SQL
- Present and work with Unified Tools & User Interfaces like Cloudera’s Hue, Data Science Workbench or Hortonworks’ Hive View
- Integration with Other tools
- Recognize different data elements in your own work and in everyday life problems
- Learn how to integrate traditional database management systems with Big Data platforms
- Explain why your team needs to design a Big Data Infrastructure Plan and Information System Design
- Identify the frequent data operations required for various types of data
- Build your own reports with Excel and PowerBI on real data
- Data Engineers’ daily Workflows
All course material will be taught on a reference Hadoop installation, and all participants will be required to run and/or develop examples using the tools that we will make available.
Schedule
The next 2-day course is coming soon.