52 minutes | Feb 11th 2021

Episode 55: Making Apache Spark Developer-Friendly and Cost-Effective with Jean-Yves Stephan

Timestamps

(2:07) JY discussed his college years studying Computer Science and Applied Math at Ecole Polytechnique, a leading French institute in science and technology.
(3:04) JY reflected on his time at Stanford getting a Master’s in Management Science and Engineering, where he served as a Teaching Assistant for CS 229 (Machine Learning) and CS 246 (Mining Massive Datasets).
(6:14) JY walked through his ML engineering internship at LiveRamp, a data connectivity platform for the safe and effective use of data.
(7:54) JY reflected on his next three years at Databricks, first as a software engineer and then as a tech lead for the Spark Infrastructure team.
(10:00) JY unpacked the challenges of packaging, managing, and monitoring Spark clusters and automating the launch of hundreds of thousands of nodes in the cloud every day.
(14:48) JY shared the founding story behind Data Mechanics, whose mission is to give superpowers to the world's data engineers so they can make sense of their data and build applications at scale on top of it.
(18:09) JY explained the three tenets of Data Mechanics: (1) managed and serverless, (2) integrated into clients’ workflows, and (3) built on top of open-source software (see the launch blog post).
(22:06) JY unpacked the core concepts of Spark-On-Kubernetes and evaluated the benefits and drawbacks of this new deployment mode, as presented in “Pros and Cons of Running Apache Spark on Kubernetes.”
(26:00) JY discussed Data Mechanics’ main improvements on the open-source version of Spark-On-Kubernetes, including an intuitive user interface, dynamic optimizations, integrations, and security, as explained in “Spark on Kubernetes Made Easy.”
(28:35) JY went over Data Mechanics Delight, a customized Spark UI that was recently open-sourced.
(35:40) JY shared the key ideas in his thought-leadership piece on how to be successful with Apache Spark in 2021.
(38:42) JY went over his experience going through the Y Combinator program in summer 2019.
(40:56) JY reflected on the key decisions to get the first cohort of customers for Data Mechanics.
(42:26) JY shared valuable hiring lessons for early-stage startup founders.
(44:34) JY described the data and tech community in France.
(47:19) Closing segment.

His Contact Info

Twitter
LinkedIn
Data Mechanics

His Recommended Resources

Jure Leskovec (Associate Professor of Computer Science at Stanford / Chief Scientist at Pinterest)
Jeff Bezos (Founder of Amazon)
Matei Zaharia (CTO of Databricks and creator of Apache Spark)
“Designing Data-Intensive Applications” (by Martin Kleppmann)