Join us for Module 2: DML and Schema - Tuesday, May 31
-Create, Insert, Update, Delete, Merge
-Schema Enforcement and Evolution
This three-part workshop teaches what Delta Lake is and how to use Apache Spark™ and Delta Lake in your data architectures to build reliable, large-scale distributed data pipelines. It covers the Delta Lake features that, alongside Spark SQL and Spark Structured Streaming, bring ACID transactions and time travel (data versioning) to your batch and streaming ETL workloads. Slides, demos, exercises, and Q&A sessions together should help you understand the concepts behind the modern data lakehouse architecture.
-Sign up for Databricks Community Edition
-Experience with Apache Spark SQL and Python (PySpark) is recommended
Module 3: Tuesday, June 14: SQL and the Transaction Log
IT Freelancer for Apache Spark, Delta Lake, Apache Kafka & Kafka Streams
Jacek is an IT freelancer specializing in Apache Spark, Delta Lake, and Apache Kafka, with brief forays into the wider data engineering space (e.g. Trino and ksqlDB), mostly during Warsaw Data Engineering meetups.
Jacek offers software development and consultancy services, along with hands-on, in-depth workshops and mentoring. He is best known for "The Internals Of" online books, available free of charge at https://books.japila.pl/.
Denny Lee is a Developer Advocate at Databricks. He is a hands-on distributed systems and data sciences engineer with extensive experience developing internet-scale infrastructure, data platforms, and predictive analytics systems for both on-premises and cloud environments. He also has a Master's in Biomedical Informatics from Oregon Health & Science University and has architected and implemented powerful data solutions for enterprise healthcare customers. His current technical focuses include distributed systems, Apache Spark, deep learning, machine learning, and genomics.