Site Reliability Engineering - Intro

by Maciej Jarosz
SRE, or Site Reliability Engineering, is an approach to solving operational and infrastructure problems. It originated at Google, which has promoted it ever since. The main goal of the SRE approach is to produce reliable and highly scalable software systems. During the lecture, Maciej Jarosz will cover the principles and practices of SRE, including the concepts of "Service Level Objective" and "Error Budget", and the concept of "Toil", i.e. repetitive manual work, along with the drive to reduce it. We will also learn what "Monitoring", "Observability", and "Service Level Indicators" are, and explore the concept of "Antifragility". Maciej will also discuss models for adopting the SRE approach in an organization.

Additional Required Software

  • Zoom