Data Warehousing Course
This course will take your skill in data warehouse modeling to the next level, allowing you to handle very large datasets and to build data models that provide high query performance in both SMP and MPP environments.
I will demystify some of the “hand waving” that is often thrown around about data warehouse modeling. We will get into the modeling tricks that drive performance, agility, maintainability and optimal compression.
The method I teach is specifically designed to make the best use of modern hardware and to keep you close to the actual implementation details: the stuff that truly matters when you get your hands dirty with data modeling. I will cover the following topics:
- Introducing Data Models and Agile Development
- Logical data models: What are they good for?
- Comparing Inmon and Kimball
- Why normalisation is bad for data warehouses, and what you can do about it
- Demystifying metadata: Meta models and auto-generated ETL
- Understanding modern hardware: Co-located data in SMP and MPP systems
- Choosing between column and row stores
- Key handling, master data and mapping layers
- Tracking history in the warehouse
- Handling changing user requirements
- High-speed data loading – and reloading
- Big Data and Data Warehouses
You are expected to be familiar with the basics of dimensional modeling as described in Ralph Kimball’s book “The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling”. Some basic familiarity with third normal form data modeling and/or UML is an advantage.
Even if you subscribe to a different modeling technique from the one I describe in the course, these days will help you understand why some models work and others fail. You will also learn what modifications you need to make to the dimensional model to counter the typical criticisms of this modeling approach in large environments.
The course covers data modeling for all database engines. During the course, I will use some examples from SQL Server, but what you learn can be used no matter which database you work with.
The course lasts two days and costs 750 GBP per person. It will run only if at least 10 people attend.