SRE Practitioner℠ Exam Study Guide


DevOps Institute is dedicated to advancing the human elements of DevOps success. We fulfill our mission through our SKIL framework of Skills, Knowledge, Ideas and Learning.

Certification is one means of showcasing your skills. While we strongly support formal training as the best learning experience and method for certification preparation, DevOps Institute also recognizes that humans learn in different ways from different resources and experiences. As the de facto certification body for DevOps, DevOps Institute has now removed the barrier to certification by removing formal training prerequisites and opening our testing program to anyone who believes that they have the topical knowledge and experience to pass one or more of our certification exams.

This examination study guide will help test-takers prepare by defining the scope of the exam and includes the following:

● Course Description

● Examination Requirements

● DevOps Glossary of Terms

● Value Added Resources

● Sample Exam(s) with Answer Key

These assets provide a guideline for the topics, concepts, vocabulary and definitions that the exam candidate is expected to know and understand in order to pass the exam. The knowledge itself will need to be gained independently or through training by one of our Global Education Partners.

Test-takers who successfully pass the exam will also receive a certificate and digital badge from DevOps Institute acknowledging their achievement, which can be shared with their professional online networks.

If you have any questions, please contact our DevOps Institute Customer Service team at CustomerService@DevOpsInstitute.com.

Site Reliability Engineering (SRE) Practitioner

Course Description

DURATION - 24 Hours

Introduces a range of practices for advancing service reliability engineering through a mixture of automation, organizational ways of working and business alignment. Tailored for those focused on large-scale service scalability and reliability.

OVERVIEW

The SRE (Site Reliability Engineering) Practitioner course introduces ways to economically and reliably scale services in an organization. It explores strategies to improve agility, cross-functional collaboration and transparency into service health in order to build resiliency by design, automation and closed-loop remediation.

The course aims to equip participants with the practices, methods, and tools to engage people across the organization involved in reliability through the use of real-life scenarios and case stories. Upon completion of the course, participants will have tangible takeaways to leverage when back in the office such as implementing SRE models that fit their organizational context, building advanced observability in distributed systems, building resiliency by design and effective incident responses using SRE practices.

The course is developed by leveraging key SRE sources, engaging with thought-leaders in the SRE space and working with organizations embracing SRE to extract real-life best practices and has been designed to teach the key principles & practices necessary for starting SRE adoption. This course positions learners to successfully complete the SRE Practitioner certification exam.

COURSE OBJECTIVES

At the end of the course, the following learning objectives are expected to be achieved:

1. Practical view of how to successfully implement a flourishing SRE culture in your organization.
2. The underlying principles of SRE and an understanding of what it is not in terms of anti-patterns, and how to become aware of them in order to avoid them.
3. The organizational impact of introducing SRE.
4. Acing the art of SLIs and SLOs in a distributed ecosystem, and extending the usage of Error Budgets beyond the normal to innovate and avoid risks.
5. Building security and resilience by design in a distributed, zero-trust environment.
6. How do you implement full stack observability, distributed tracing and bring about an Observability-driven development culture?

©DevOps Institute SREP v1.0 Course Description July 2021

7. Curating data using AI to move from reactive to proactive and predictive incident management. Also, how do you use DataOps to build clean data lineage?
8. Why is Platform Engineering so important in building consistency and predictability of SRE culture?
9. Implementing practical Chaos Engineering.
10. Major incident response responsibilities for an SRE based on the incident command framework, and examples of the anatomy of unmanaged incidents.
11. Perspective of why SRE can be considered as the purest implementation of DevOps.
12. SRE Execution model
13. Understanding the SRE role and understanding why reliability is everyone’s problem.
14. SRE success story learnings

COURSE OUTLINE

● Course Introduction

● Module 1: SRE Anti-patterns

● Module 2: SLO is a Proxy for Customer Happiness

● Module 3: Building Secure and Reliable Systems

● Module 4: Full-Stack Observability

● Module 5: Platform Engineering and AIOps

● Module 6: SRE & Incident Response Management

● Module 7: Chaos Engineering

● Module 8: SRE is the Purest form of DevOps

● Post-class assignments


Site Reliability Engineering (SRE) Practitioner

Examination Requirements

DevOps Institute SREP v1.0 Examination Requirements July 2021

Site Reliability Engineering (SRE) Practitioner

Certificate

Site Reliability Engineering (SRE) Practitioner is a freestanding certification from DevOps Institute. The purpose of this certification and its associated course is to impart, test and validate knowledge, comprehension and application of advanced SRE practices, methods, and tools. The SRE Practitioner certification is tailored for anyone focused on large-scale service scalability and reliability with an interest in modern IT leadership and organizational change approaches.

Eligibility for Examination

The following prerequisite must be met before sitting for the SRE Practitioner certification exam:

● Candidates must have successfully completed and earned the SRE Foundation certification from DevOps Institute.

● Although there are no formal training prerequisites for the exam, DevOps Institute highly recommends that candidates complete at least 24 contact hours of formal, approved training delivered by an accredited Education Partner of DevOps Institute in order to prepare for the exam.

Examination Administration

The SRE Practitioner certification is accredited, managed and administered under the strict protocols and standards of DevOps Institute.

Level of Difficulty

The SRE Practitioner certification uses Bloom’s Taxonomy of Educational Objectives in the construction of both the content and the examination.

• The SRE Practitioner exam contains Bloom 1 questions that test learners’ knowledge of advanced SRE terms and concepts.

• The SRE Practitioner exam contains Bloom 2 questions that test learners’ comprehension of advanced SRE terms and concepts.

• The exam also contains Bloom 3 questions that test learners’ application of advanced SRE concepts in various contexts.


Format of the Examination

Candidates must achieve a passing score to gain the SRE Practitioner Certificate.

Exam Type: 40 multiple choice questions

Duration: 90 minutes

Prerequisites: The SRE Foundation certification from DevOps Institute is a mandatory prerequisite to sit the SRE Practitioner exam. It is highly recommended that candidates complete the Site Reliability Engineering (SRE) Practitioner course from an accredited DevOps Institute Education Partner.

Supervised

Open Book: Yes

Passing Score: 65%

Delivery: Web-based

Badge: SRE Practitioner Certified
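As a rough illustration of the format above, the passing score can be translated into a number of questions. The sketch below assumes that all 40 questions carry equal weight and that the 65% threshold is applied to the raw count of correct answers. Neither assumption is stated in this guide, and the function name is hypothetical.

```python
def min_correct_answers(total_questions: int, passing_percent: int) -> int:
    """Smallest number of correct answers that meets the passing percentage.

    Assumes equally weighted questions (an assumption, not stated in the guide).
    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    # Ceiling division: round up so the result never falls below the threshold.
    return (total_questions * passing_percent + 99) // 100


if __name__ == "__main__":
    # Figures from the exam format above: 40 questions, 65% passing score.
    print(min_correct_answers(40, 65))
```

Under those assumptions, a candidate would need 26 of the 40 questions correct to reach 65%.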

Exam Topic Areas and Question Weighting

The SRE Practitioner exam requires knowledge and understanding of the topic areas described below.

Topic Area

Description

Max

Questions

SREP – 1: SRE Anti-Patterns

SREP – 2: SLO is the Proxy for Customer Happiness

SREP – 3: Building Secure and Reliable Systems

SREP – 4: Full Stack Observability


SREP – 5: Using Platform Engineering & AIOps

SREP – 6: SRE & Incident Response Management

SREP – 7: Chaos Engineering

SREP – 8: SRE is the Purest Form of DevOps


Concept and Terminology List

The candidate is expected to understand, comprehend and apply the following SRE concepts and terms at Bloom’s 1 (Knowledge), 2 (Comprehension), and 3 (Application) levels.

AI/ML (Artificial Intelligence/Machine Learning)

Disaster Recovery

AIOps

Distributed Ecosystems

Anti-Patterns

Distributed Tracing

Application Lifecycle Development

Domain Name System (DNS)

Auto Remediation

DREAD

Blameless Post-mortems

Error Budget

Break Glass Mechanism

Game Days

Business Context

Google's Golden Signals

Canary Deployment

Immutable Framework

Capacity Planning

Incident Command Framework

Change Advisory Board (CAB)

Incident Response Management

Chaos Engineering

Instrumentation

Circuit Breaker

ITOps

Closed Loop Response (CLR)

Key Principles of Incident Response

Containers

Kubernetes

Custom Resource Definition (CRD)

Lifecycle Management

DataOps

Major Incident Response

Development Lifecycle

Mean Time to Detect (MTTD)

DiRT

Mean Time to Repair (MTTR)

Microservices


MITRE ATT&CK

Service Mesh

MLOps

Shift-Left

Monitoring

Site Reliability Engineering Center of Excellence

Multiservice Architecture

STRIDE

Network Operations Center (NOC)

Swarming

Non-Abstract Large Scale Design (NALSD)

System Boundaries

"North Star"

Telemetry

Observability

The Three Ways

OODA Loop

Three Pillars of Observability

Platform SRE

USE

Quality of Service (QOS)

Wheel of Misfortune

Rapid Elasticity

Reactive Manifesto

Real User Monitoring (RUM)

Reliability

Resiliency

Scale-up

Service Level Agreement (SLA)

Service Level Indicator (SLI)

Service Level Objective (SLO)


DEVOPS GLOSSARY

OF TERMS

This glossary is provided for reference only as it contains key terms that may or may not be examinable.

DevOps Glossary of Terms

Term

Definition

Course Appearances

12-Factor App Design

A methodology for building modern,

scalable, maintainable

software-as-a-service applications.

Continuous Delivery

Ecosystem Foundation

2-Factor or 2-Step

Authentication

Two-Factor Authentication, also known as

2FA or TFA or Two-Step Authentication is

when a user provides two authentication

factors; usually firstly a password and then

a second layer of verification such as a

code texted to their device, shared secret,

physical token or biometrics.

DevSecOps Foundation

A/B Testing

Deploy different versions of an EUT to

different customers and let the customer

feedback determine which is best.

Continuous Delivery

Ecosystem Foundation

A3 Problem Solving

A structured problem-solving approach

that uses a lean tool called the A3

Problem-Solving Report. The term "A3"

represents the paper size historically used

for the report (a size roughly equivalent to

11" x 17").

DevOps Foundation

Access Management

Granting an authenticated identity access

to an authorized resource (e.g., data,

service, environment) based on defined

criteria (e.g., a mapped role), while

preventing an unauthorized identity access

to a resource.

DevSecOps Foundation

Access Provisioning

Access provisioning is the process of

coordinating the creation of user accounts,

e-mail authorizations in the form of rules

and roles, and other tasks such as

provisioning of physical resources

associated with enabling new users to

systems or environments.

DevSecOps Foundation

Administration Testing

The purpose of the test is to determine if an

Entity Under Test (EUT) is able to process

administration tasks as expected.

Continuous Delivery

Ecosystem Foundation


Advice Process

Any person making a decision must seek

advice from everyone meaningfully

affected by the decision and people with

expertise in the matter. Advice received

must be taken into consideration, though it

does not have to be accepted or

followed. The objective of the advice

process is not to form consensus, but to

inform the decision-maker so that they can

make the best decision possible. Failure to

follow the advice process undermines trust

and unnecessarily introduces risk to the

business.

DevSecOps Foundation

Agile

A project management method for

complex projects that divides tasks into

small "sprints" of work with frequent

reassessment and adaptation of plans.

Certified Agile Service

Manager, Site Reliability

Engineering

Agile (adjective)

Able to move quickly and easily;

well-coordinated. Able to think and understand

quickly; able to solve problems and have

new ideas.

Certified Agile Service

Manager, DevOps

Foundation, DevSecOps

Foundation

Agile Coach

Helps teams master Agile development and

DevOps practices; enables productive

ways of working and collaboration.

DevOps Leader

Agile Enterprise

Fast moving, flexible and robust company

capable of rapid response to unexpected

challenges, events, and opportunities.

DevOps

Foundation, DevSecOps

Foundation

Agile Manifesto

A formal proclamation of values and

principles to guide an iterative and

people-centric approach to software

development. http://agilemanifesto.org

Certified Agile Service

Manager, DevOps

Foundation

Agile Portfolio

Management

Involves evaluating in-flight projects and

proposed future initiatives to shape and

govern the ongoing investment in projects

and discretionary work. CA’s Agile Central

and VersionOne are examples.

Site Reliability Engineering

Agile Practice Owner

Role accountable for the overall quality of

a service management practice and

owner of the Practice Backlog.

Certified Agile Service

Manager

Agile Principles

The twelve principles that underpin the

Agile Manifesto.

Certified Agile Service

Manager


Agile Process

Delivers "just enough" structure and control

to enable the organization to achieve its

service outcomes in the most expeditious,

effective and efficient way possible. It is

easy to understand, easy to follow and

prizes its collaboration and outcomes more

than its artifacts.

Certified Agile Service

Manager

Agile Process

Engineering

An iterative and incremental approach to

designing a process with short, iterative

designs of potentially shippable process

increments or microprocesses.

Certified Agile Service

Manager

Agile Process

Improvement

Ensures that IT Service Management agility

introduced through Agile Process

Engineering is continually reviewed and

adjusted as part of IT Service

Management’s commitment to continual

improvement.

Certified Agile Service

Manager

Agile Service

Management

Framework that ensures that ITSM processes

reflect Agile values and are designed with

"just enough" control and structure in order

to effectively and efficiently deliver services

that facilitate customer outcomes when

and how they are needed.

Certified Agile Service

Manager

Agile Service

Management

Artifacts

Practice Backlog, Sprint Backlog,

Increment

Certified Agile Service

Manager

Agile Service

Management Events

Practice/Microprocess Planning, The Sprint,

Sprint Planning, Process Standup, Sprint

Review, Sprint Retrospective

Certified Agile Service

Manager

Agile Service

Management Roles

Agile Practice Owner, Agile Service

Management Team, Agile Service

Manager

Certified Agile Service

Manager

Agile Service

Management Team

A team of at least 3 people (including a

customer or practitioner) that is

accountable for a single microprocess or a

complete service management practice.

Certified Agile Service

Manager

Agile Service

Manager

An Agile Service Management subject

matter expert who is the coach and

protector of the Agile Service

Management Team.

Certified Agile Service

Manager

Agile Software

Development

Group of software development methods

in which requirements and solutions evolve

through collaboration between

self-organizing, cross-functional teams. Usually

applied using the Scrum or Scaled Agile

Framework approach.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

DevSecOps Foundation


Amazon Web Services

(AWS)

Amazon Web Services (AWS) is a secure

cloud services platform, offering compute

power, database storage, content delivery

and other functionality to help businesses

scale and grow.

DevSecOps Foundation,

Site Reliability Engineering

Analytics

Test results processed and presented in an

organized manner in accordance with

analysis methods and criteria.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Andon

A system that gives an assembly line worker the

ability, and moreover the empowerment,

to stop production when a defect is found,

and immediately call for assistance.

Continuous Delivery

Ecosystem Foundation

Anti-pattern

A commonly reinvented but poor solution

to a problem.

DevOps Foundation

Anti-fragility

Antifragility is a property of systems in which

they increase in capability to thrive as a result

of stressors, shocks, volatility, noise,

mistakes, faults, attacks, or failures.

DevOps Foundation, Site

Reliability Engineering

API Testing

The purpose of the test is to determine if an

API for an EUT functions as expected.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Application

Performance

Management (APM)

APM is the monitoring and management of

performance and availability of software

applications. APM strives to detect and

diagnose complex application

performance problems to maintain an

expected level of service.

Site Reliability Engineering

Application

Programming

Interface (API)

A set of protocols used to create

applications for a specific OS or as an

interface between modules or

applications.

DevOps Foundation,

DevSecOps Foundation

Application

Programming

Interface (API) Testing

The purpose of the test is to determine if an

API for an EUT functions as

expected.

Continuous Delivery

Ecosystem Foundation

Application Release

Controlled continuous delivery pipeline

capabilities including automation (release

upon code commit).

Continuous Delivery

Ecosystem Foundation


Application Release

Automation (ARA) or

Orchestration (ARO)

Controlled continuous delivery pipeline

capabilities including automation (release

upon code commit), environment

modeling (end-to-end pipeline stages, and

deploy application binaries, packages or

other artifacts to target environments) and

release coordination (project, calendar

and scheduling management, integrate

with change control and/or IT service

support management).

Continuous Delivery

Ecosystem Foundation

Acceptance Test

Driven Development

(ATDD)

Acceptance Test Driven Development

(ATDD) is a practice in which the whole

team collaboratively discusses

acceptance criteria, with examples, and

then distills them into a set of concrete

acceptance tests before development

begins.

Continuous Delivery

Ecosystem Foundation

Application Testing

The purpose of the test is to determine if an

application is performing according to its

requirements and expected behaviors.

Continuous Delivery

Ecosystem Foundation

Application Under Test

(AUT)

The EUT is a software application, e.g.,

a business application being tested.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Architecture

The fundamental underlying design of

computer hardware, software or both in

combination.

DevSecOps Foundation

Artifact

Any element in a software development

project including documentation, test

plans, images, data files and executable

modules.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

DevSecOps Foundation

Artifact Repository

Store for binaries, reports and metadata.

Example tools include: JFrog Artifactory,

Sonatype Nexus.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation

Attack path

The chain of weaknesses a threat may

exploit to achieve the attacker's objective.

For example, an attack path may start by

compromising a user's credentials, which

are then used in a vulnerable system to

escalate privileges, which in turn is used to

access a protected database of

information, which is copied out to an

attacker's own server(s).

DevSecOps Foundation


Audit Management

The use of automated tools to ensure

products and services are auditable,

including keeping audit logs of build, test

and deploy activities, auditing

configurations and users, as well as log files

from production operations.

Site Reliability Engineering

Authentication

The process of verifying an asserted

identity. Authentication can be based on

what you know (e.g., password or PIN),

what you have (token or one-time code),

what you are (biometrics) or contextual

information.

DevSecOps Foundation

Authorization

The process of granting roles to users to

have access to resources.

DevSecOps Foundation

Auto-DevOps

Auto DevOps brings DevOps best practices

to your project by automatically

configuring software development

lifecycles. It automatically detects, builds,

tests, deploys, and monitors applications.

Site Reliability Engineering

Auto-scaling

The ability to automatically and elastically

scale and de-scale infrastructure

depending on traffic and capacity

variations while maintaining control of

costs.

Continuous Delivery

Ecosystem Foundation

Automated rollback

If a failure is detected during a

deployment, an operator (or an

automated process) will verify the failure

and roll back the failing release to the

previous known working state.

Site Reliability Engineering

Availability

Availability is the proportion of time a

system is in a functioning condition and

therefore available (to users) to be used.

Site Reliability Engineering

Backdoor

A backdoor bypasses the usual

authentication used to access a system. Its

purpose is to grant the cybercriminals

future access to the system even if the

organization has remediated the

vulnerability initially used to attack the

system.

DevSecOps Foundation

Backlog

Requirements for a system, expressed as a

prioritized list of product backlog items

usually in the form of 'User Stories'. The

product backlog is prioritized by the

Product Owner and should include

functional, non‐functional and technical

team‐generated requirements.

Continuous Delivery

Ecosystem

Foundation, DevOps

Foundation


Basic Security Hygiene

A common set of minimum-security

practices that must be applied to all

environments without exception. Practices

include basic network security (firewalls

and monitoring), hardening, vulnerability

and patch management, logging and

monitoring, basic policies and enforcement

(may be implemented under a "policies as

code" approach), and identity and access

management.

DevSecOps Foundation

Batch Sizes

Refers to the volume of features involved in

a single code release.

DevOps Leader

Bateson Stakeholder

Map

A tool for mapping stakeholders'

engagement with the initiative in progress.

DevOps Leader

Behavior Driven

Development (BDD)

Test cases are created by simulating an

EUT's externally observable inputs, and

outputs. Example tool: Cucumber.

Continuous Delivery

Ecosystem Foundation

Beyond Budgeting

A management model that looks beyond

command-and-control towards a more

empowered and adaptive state.

DevOps Leader

Black‐Box

Test case only uses knowledge of externally

observable behaviors of an EUT.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Blameless post

mortems

A process through which engineers whose

actions have contributed to a service

incident can give a detailed account of

what they did without fear of punishment

or retribution.

Site Reliability Engineering

Blast Radius

The users, customers and other dependent

services that are affected when a

particular IT service fails. Used for impact

analysis of service incidents.

Site Reliability Engineering

Blue/Green Testing or

Deployments

Taking software from the final stage of

testing to live production using two

environments labelled Blue and Green.

Once the software is working in the green

environment, switch the router so that all

incoming requests go to the green

environment - the blue one is now idle.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Bug

An error or defect in software that results in

an unexpected or system-degrading

condition.

DevSecOps Foundation


Bureaucratic Culture

Bureaucratic organizations are likely to use

standard channels or procedures which

may be insufficient in a crisis (Westrum).

DevOps Leader

Bursting

Public cloud resources are added as

needed to temporarily increase the total

computing capacity of a private cloud.

Continuous Delivery

Ecosystem Foundation

Business Case

Justification for a proposed project or

undertaking on the basis of its expected

commercial benefit.

DevOps Leader

Business Continuity

Business continuity is an organization's

ability to ensure operations and core

business functions are not severely

impacted by a disaster or unplanned

incident that takes critical services offline.

Site Reliability Engineering

Business

Transformation

Changing how the business functions.

Making this a reality means changing

culture, processes, and technologies in

order to better align everyone around

delivering on the organization's mission.

DevSecOps Foundation

Business Value

The benefit of an approach to key business

KPIs.

DevOps Leader

Cadence

Flow or rhythm of events.

DevOps Foundation,

DevOps Leader,

DevSecOps Foundation

CALMS Model

Considered the pillars or values of DevOps:

Culture, Automation, Lean, Measurement,

Sharing (as put forth by John Willis, Damon

Edwards and Jez Humble).

DevOps Foundation

Canary Testing

A canary (also called a canary test) is a

push of code changes to a small number

of end users who have not volunteered to

test anything. Similar to incremental rollout,

it is where a small portion of the user base is

updated to a new version first. This subset,

the canaries, then serve as the proverbial

“canary in the coal mine”. If something

goes wrong then a release is rolled back

and only a small subset of the users are

impacted.

Continuous Delivery

Ecosystem Foundation,

Site Reliability Engineering

Capacity

An estimate of the total amount of

engineering time available for a given

Sprint.

Certified Agile Service

Manager


Capacity Test

The purpose of the test is to determine if

the EUT can handle expected loads such

as number of users, number of sessions,

aggregate bandwidth.

Continuous Delivery

Ecosystem Foundation

Capture‐Replay

Test cases are created by capturing live

interactions with the EUT, in a format that

can be replayed by a tool. E.g. Selenium

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Carrots

Positive incentives, for encouraging and

rewarding desired behaviors.

DevSecOps Foundation

Chain of Goals

A method designed by Roman Pichler of

ensuring that goals are linked and shared

at all levels through the product

development process.

DevOps Leader

Change

Addition, modification or removal of

anything that could have an effect on IT

services. (ITIL definition)

DevOps Foundation,

DevSecOps Foundation

Change Failure Rate

A measure of the percentage of

failed/rolled back changes.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation

Change Fatigue

A general sense of apathy or passive

resignation towards organizational

changes by individuals or teams.

DevSecOps Foundation

Change Lead Time

A measure of the time from a request for

change to delivery of the change.

DevOps Foundation

Change Leader

Development Model

Jim Canterucci's model for five levels of

change leader capability.

DevOps Leader

Change

Management

Process that controls all changes

throughout their lifecycle. (ITIL definition)

DevOps Foundation,

DevOps Leader,

DevSecOps Foundation

Change

Management

(Organizational)

An approach to shifting or

transitioning individuals, teams &

organizations from a current state to a

desired future state. Includes the process,

tools & techniques to manage the

people-side of change to achieve the required

business outcome(s).

DevOps Leader

Change-based Test

Selection Method

Tests are selected according to a criterion

that matches attributes of tests to attributes

of the code that is changed in a build.

Continuous Delivery

Ecosystem

Foundation, Continuous

Testing Foundation


Chaos Engineering

The discipline of experimenting on a

software system in production in order to

build confidence in the system's capability

to withstand turbulent and unexpected

conditions.

Site Reliability Engineering

Chapter Lead

A squad line manager in the Spotify model

who is responsible for traditional people

management duties, is involved in day to

day work and grows individual and

chapter competence.

DevOps Leader

Chapters

A small family of people having similar skills

and who work within the same general

competency area within the same tribe.

Chapters meet regularly to discuss

challenges and areas of expertise in order

to promote sharing, skill development,

re-use and problem solving.

DevOps Leader

ChatOps

An approach to managing technical and

business operations (coined by GitHub)

that involves a combination of group chat

and integration with DevOps tools.

Example tools include: Atlassian

HipChat/Stride, Microsoft Teams, Slack.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

Continuous Testing

Foundation, Site Reliability

Engineering

Check‐in

Action of submitting a software change

into a version management system.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

CI Regression Test

A subset of regression tests that are run

immediately after a software component is

built. Same as Smoke Test.

Continuous Delivery

Ecosystem Foundation

Clear‐Box

Same as Glass‐Box Testing and White‐Box

Testing.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Cloud Computing

The practice of using remote servers hosted

on the internet to host applications rather

than local servers in a private datacenter.

DevSecOps Foundation,

Site Reliability Engineering

Cloud-Native

Native cloud applications (NCA) are

designed for cloud computing.

Continuous Delivery

Ecosystem Foundation

Cloudbees

Cloudbees is a commercially supported

proprietary automation framework tool

which works with and enhances Jenkins by

providing enterprise-level support and

add-on functionality.

Continuous Testing

Foundation


Cluster Cost

Optimization

Tools like Kubecost, Replex, Cloudability use

monitoring to analyze container clusters

and optimize the resource deployment

model.

Site Reliability Engineering

Cluster Monitoring

Tools that let you know the health of your

deployment environments running in

clusters such as Kubernetes.

Site Reliability Engineering

Clustering

A group of computers (called nodes or

members) work together as a cluster

connected through a fast network acting

as a single system.

Continuous Delivery

Ecosystem Foundation

Code Coverage

A measure of white box test coverage by

counting code units that are executed by

a test. The code unit may be a code

statement, a code branch, or control path

or data path through a code module.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation
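
As a rough illustration of the Code Coverage entry above, the sketch below computes coverage as the share of code units that a test run executed. The function name `coverage_percent` and the use of line numbers as the "code units" are assumptions made for this example; real coverage tools collect the executed units automatically.

```python
from typing import Set


def coverage_percent(executable_units: Set[int], executed_units: Set[int]) -> float:
    """Return the percentage of executable code units hit by the tests.

    A "unit" here is a line number (an assumption for this sketch); it could
    equally be a branch or a path identifier.
    """
    if not executable_units:
        # Nothing to cover, so treat the module as fully covered.
        return 100.0
    covered = executable_units & executed_units
    return 100.0 * len(covered) / len(executable_units)


# Four executable lines; the tests ran lines 1 and 2 (line 5 is not executable).
print(coverage_percent({1, 2, 3, 4}, {1, 2, 5}))
```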

Code Quality

See also static code analysis. Sonar and

Checkmarx are examples of tools that

automatically check the seven main

dimensions of code quality – comments,

architecture, duplication, unit test

coverage, complexity, potential defects,

language rules.

Site Reliability Engineering

Code Repository

A repository where developers can commit

and collaborate on their code. It also

tracks historical versions and potentially

identifies conflicting versions of the same

code. Also referred to as "repository" or

"repo."

DevSecOps Foundation

Code Review

Software engineers inspect each other's

source code to detect coding or code

formatting errors.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Cognitive Bias

Cognitive bias is a limitation in objective

thinking that is caused by the tendency for

the human brain to perceive information

through a filter of personal experience and

preferences: a systematic pattern of

deviation from norm or rationality in

judgment.

DevOps Leader

Collaboration

People jointly working with others towards a

common goal.

DevOps Foundation,

DevSecOps Foundation

Collaborative Culture

A culture that applies to everyone which

incorporates an expected set of behaviors,

language and accepted ways of working

with each other, reinforced by

leadership.

Continuous Delivery

Ecosystem Foundation

Compatibility Test

Test with the purpose of determining if an

EUT interoperates with another EUT such as

peer‐to‐peer applications or protocols.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Configuration

Management

Configuration management (CM) is a

systems engineering process for

establishing and maintaining consistency of

a product's performance, functional, and

physical attributes with its requirements,

design, and operational information

throughout its life.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

DevSecOps Foundation

Conformance Test

The purpose of the test is to determine if an

EUT complies to a standard.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Constraint

Limitation or restriction; something that

constrains. See also bottleneck.

DevOps Foundation,

DevSecOps Foundation

Container

A way of packaging software into

lightweight, stand-alone, executable

packages including everything needed to

run it (code, runtime, system tools, system

libraries, settings) for development,

shipment and deployment.

DevOps Foundation,

DevSecOps Foundation,

Site Reliability Engineering

Container Network

Security

Used to prove that an app can run on a

container cluster alongside any other app

with confidence that there is no

unintended use of the other app or any

unintended network traffic between them.

Site Reliability Engineering

Container Registry

Secure and private registry for Container

images. Typically allowing for easy upload

and download of images from the build

tools. Docker Hub, Artifactory, Nexus are

examples.

Site Reliability Engineering

Container Scanning

When building a Container image for your

application, tools can run a security scan

to ensure it does not have any known

vulnerability in the environment where your

code is shipped. Black Duck, Synopsys, Snyk,

Clair and klar are examples.

Site Reliability Engineering

Continual Service

Improvement (CSI)

One of the ITIL Core publications and a

stage of the service lifecycle.

DevOps Foundation

Continuous Delivery

(CD)

A methodology that focuses on making

sure software is always in a releasable state

throughout its lifecycle.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

DevSecOps

Foundation, Continuous

Testing Foundation

Continuous Delivery

(CD) Architect

A person who is responsible to guide the

implementation and best practices for a

continuous delivery pipeline.

Continuous Delivery

Ecosystem Foundation

Continuous Delivery

Pipeline

A continuous delivery pipeline refers to the

series of processes which are performed on

product changes in stages. A change is

injected at the beginning of the pipeline. A

change may be new versions of code,

data or images for applications. Each

stage processes the artifacts resulting from

the prior stage. The last stage results in

deployment to production.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation

Course, DevOps Leader
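
The Continuous Delivery Pipeline entry above can be sketched as a chain of stage functions, each consuming the artifact produced by the prior stage. The stage names (`build`, `run_tests`, `deploy`) and the dictionary used as the artifact are hypothetical choices for this example, since the guide itself notes that pipeline stages are not standardized.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[dict], dict]]


def build(change: dict) -> dict:
    # Hypothetical first stage: turn a change into a built artifact.
    return {**change, "built": True}


def run_tests(artifact: dict) -> dict:
    # Each stage only accepts what the prior stage produced.
    if not artifact.get("built"):
        raise RuntimeError("nothing to test: build stage did not run")
    return {**artifact, "tested": True}


def deploy(artifact: dict) -> dict:
    # The last stage results in deployment to production.
    if not artifact.get("tested"):
        raise RuntimeError("refusing to deploy an untested artifact")
    return {**artifact, "deployed_to": "production"}


def run_pipeline(change: dict, stages: List[Stage]) -> dict:
    """Inject a change at the start and pass the artifact through every stage."""
    artifact = change
    for _name, stage in stages:
        artifact = stage(artifact)
    return artifact


PIPELINE: List[Stage] = [("build", build), ("test", run_tests), ("deploy", deploy)]
result = run_pipeline({"version": "1.2.0"}, PIPELINE)
print(result)
```

Raising an error inside a stage stops the run, so a change that fails an early stage never reaches production.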

Continuous Delivery

Pipeline Stage

Each process in a continuous delivery

pipeline. These are not standard. Examples

are Design: determine implementation

changes; Creation: implement an

unintegrated version of design changes;

Integration: merge

Continuous Delivery

Ecosystem Foundation

Continuous

Deployment

A set of practices that enable every

change that passes automated tests to be

automatically deployed to production.

DevOps Foundation,

DevSecOps Foundation

Continuous Flow

Smoothly moving people or products from

the first step of a process to the last with

minimal (or no) buffers between steps.

DevOps Foundation,

DevOps Leader,

DevSecOps Foundation

Continuous

Improvement

Based on Deming's Plan-Do-Check-Act, a

model for ensuring ongoing efforts to

improve products, processes and services.

DevOps Foundation,

DevOps Leader

Continuous

Integration (CI)

A development practice that requires

developers to merge their code into trunk

or master ideally at least daily and perform

tests (i.e. unit, integration and

acceptance) at every code commit.

Continuous Delivery

Ecosystem Foundation,

DevOps

Foundation, Continuous

Testing

Foundation, DevSecOps

Foundation

Continuous

Integration Tools

Tools that provide an immediate feedback

loop by regularly merging, building and

testing code. Example tools include:

Atlassian Bamboo, Jenkins, Microsoft

VSTS/Azure DevOps, TeamCity.

DevOps Foundation,

DevOps Leader

Continuous Monitoring

(CM)

This is a class of terms relevant to logging,

notifications, alerts, displays and analysis of

test results information.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Continuous Testing

(CT)

This is a class of terms relevant to testing

and verification of an EUT in a DevOps

environment.

DevOps

Foundation, Continuous

Delivery Ecosystem

Foundation, Continuous

Testing Foundation

Conversation Café

Conversation Cafés are open, hosted

conversations in cafés as well as

conferences and classrooms—anywhere

people gather to make sense of our world.

DevOps Leader

Conway's Law

Organizations which design systems are

constrained to produce designs which are

copies of the communication structures of

these organizations.

Continuous Delivery

Ecosystem Foundation,

DevOps Leader

Cooperation vs.

Competition

The key cultural value shift toward being

highly collaborative and cooperative, and

away from internal competitiveness and

divisiveness.

DevSecOps Foundation

COTS

Commercial‐off‐the‐shelf solution

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Critical Success Factor

(CSF)

Something that must happen for an IT

service, process, plan, project or other

activity to succeed.

DevSecOps Foundation

Cultural Iceberg

A metaphor that visualizes the difference

between observable (above the water)

and non-observable (below the waterline)

elements of culture.

DevOps Leader

Culture

(Organizational

Culture)

The values and behaviors that contribute to

the unique psychosocial environment of an

organization.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

DevSecOps Foundation

Cumulative Flow

Diagram

A cumulative flow diagram is a tool used in

agile software development and lean

product development. It is an

area graph that depicts the quantity of

work in a given state, showing arrivals, time

in queue, quantity in queue, and

departure.

DevOps Leader

Current State Map

A form of value stream map that helps you

identify how the current process works and

where the disconnects are.

DevOps Leader

Customer Reliability

Engineer (CRE)

CRE is what you get when you take the

principles and lessons of SRE and apply

them towards customers.

Site Reliability Engineering

Cycle Time

A measure of the time from start of work to

ready for delivery.

DevOps Foundation,

DevOps Leader,

DevSecOps Foundation

Daily Scrum

Daily timeboxed event of 15 minutes or less

for the Team to replan the next day of work

during a Sprint.

DevOps Foundation

Dashboard

Graphical display of summarized test

results.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Data Loss Protection

(DLP)

Tools that prevent files and content from

being removed from within a service

environment or organization.

Site Reliability Engineering

Database Reliability

Engineer (DBRE)

A person responsible for keeping database

systems that support all user facing services

in production running smoothly.

Site Reliability Engineering

Defect Density

The number of faults found in a unit. E.g. #

defects per KLOC, # defects per change.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation
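
The "defects per KLOC" form of Defect Density above is simple arithmetic; a minimal sketch follows. The function name and the sample figures are made up for illustration.

```python
def defect_density_per_kloc(defects: int, lines_of_code: int) -> float:
    """Defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


# 12 defects found in an 8,000-line component.
print(defect_density_per_kloc(12, 8000))
```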

Definition of Done

A shared understanding of expectations

that the Increment must live up to in order

to be releasable into production.

(Scrum.org)

Certified Agile Service

Manager, DevOps Leader

Delivery Cadence

The frequency of deliveries. E.g. # deliveries

per day, per week, etc.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Delivery Package

Set of release items (files, images, etc.) that

are packaged for deployment.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Deming Cycle

A four-stage cycle for process

management, attributed to W. Edwards

Deming. Also called Plan-Do-Check-Act

(PDCA).

DevOps Foundation,

DevSecOps Foundation

Dependency Firewall

Many projects depend on packages that

may come from unknown or unverified

providers, introducing potential security

vulnerabilities. There are tools to scan

dependencies but that is after they are

downloaded. These tools prevent those

vulnerabilities from being downloaded to

begin with.

Site Reliability Engineering

Dependency Proxy

For many organizations, it is desirable to

have a local proxy for frequently used

upstream images/packages. In the case of

CI/CD, the proxy is responsible for receiving

a request and returning the upstream

image from a registry, acting as a pull-

through cache.

Site Reliability Engineering
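
The pull-through cache behavior described in the Dependency Proxy entry above can be sketched as follows. The class name `PullThroughCache` and the `fetch_upstream` callable are assumptions for this example; a real proxy would fetch images or packages from an upstream registry over the network.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Serve items locally, going upstream only on the first request."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._store: Dict[str, bytes] = {}
        self.upstream_calls = 0

    def get(self, name: str) -> bytes:
        if name not in self._store:
            # Cache miss: pull from upstream once and keep a local copy.
            self.upstream_calls += 1
            self._store[name] = self._fetch_upstream(name)
        return self._store[name]


# Stand-in for a real registry call.
cache = PullThroughCache(lambda name: f"contents of {name}".encode())
cache.get("base-image:1.0")
cache.get("base-image:1.0")
print(cache.upstream_calls)
```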

Dependency

Scanning

Used to automatically find security

vulnerabilities in your dependencies while

you are developing and testing your

applications. Synopsys, Gemnasium,

Retire.js and bundler-audit are popular

tools in this area.

Site Reliability Engineering

Deployment

The installation of a specified version of

software to a given environment (e.g.,

promoting a new build into production).

DevOps Foundation,

DevSecOps Foundation

Design for Testability

An EUT is designed with features which

enable it to be tested.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Design Principles

Principles for designing, organizing, and

managing a DevOps delivery operating

model.

DevOps Leader

Dev

Individuals involved in software

development activities such as application

and software engineers.

DevOps Foundation,

DevSecOps Foundation

Developer (Dev)

Individual who has responsibility to develop

changes for an EUT. Alternate: Individuals

involved in software development activities

such as application and software

engineers.

Continuous Delivery

Ecosystem

Foundation, Continuous

Testing Foundation

Development Test

Ensuring that the developer's test

environment is a good representation of

the production test environment.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Device Under Test

(DUT)

The DUT is a device (e.g. router or switch)

being tested.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

DevOps

A cultural and professional movement that

stresses communication, collaboration and

integration between software developers

and IT operations professionals while

automating the process of software

delivery and infrastructure changes. It aims

at establishing a culture and environment

where building, testing, and releasing

software can happen rapidly, frequently,

and more reliably. (Source: Wikipedia)

Certified Agile Service

Manager, DevOps

Foundation, DevSecOps

Foundation

DevOps Coach

Helps teams master Agile development and

DevOps practices; enables productive

ways of working and collaboration.

DevOps Leader

DevOps Infrastructure

The entire set of tools and facilities that

make up the DevOps system. Includes CI,

CT, CM and CD tools.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

DevOps Kaizen

Kaizen is a Japanese word that closely

translates to "change for better," the idea

of continuous improvement—large or

small—involving all employees and crossing

organisational boundaries. Damon

Edwards' DevOps Kaizen shows how

making small, incremental improvements

(little J's) has an improved impact on

productivity long term.

DevOps Leader

DevOps Pipeline

The entire set of interconnected processes

that make up a DevOps Infrastructure.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

DevOps Score

A metric showing DevOps adoption across

an organization and the corresponding

impact on delivery velocity.

Site Reliability Engineering

DevOps Toolchain

The tools needed to support a DevOps

continuous development and delivery

cycle from idea to value realisation.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

DevSecOps

Foundation, Continuous

Testing Foundation

DevSecOps

A mindset that "everyone is responsible for

security" with the goal of safely distributing

security decisions at speed and scale to

those who hold the highest level of context

without sacrificing the safety required.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

DevSecOps Foundation

Distributed Version

Control System

(DVCS)

The software revisions are stored in a

distributed revision control system (DRCS),

also known as a distributed version control

system (DVCS).

Continuous Delivery

Ecosystem Foundation

DMZ (De-Militarized

Zone)

A DMZ in network security parlance is a

network zone in between the public

internet and internal protected resources.

Any application, server, or service

(including APIs) that need to be exposed

externally are typically placed in a DMZ. It

is not uncommon to have multiple DMZs in

parallel.

DevSecOps Foundation

Dynamic Analysis

Dynamic analysis is the testing of an

application by executing data in real-time

with the objective of detecting defects

while it is in operation, rather than by

repeatedly examining the code offline.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Dynamic Application

Security Testing (DAST)

A type of testing that runs against built

code to test exposed interfaces.

DevSecOps Foundation

EggPlant

Automated function and regression testing

of enterprise applications. Licensed by

TestPlant.

Continuous Testing

Foundation

Elastic Infrastructure

Elasticity is a term typically used in cloud

computing, to describe the ability of an

IT infrastructure to quickly expand or cut

back capacity and services without

hindering or jeopardizing

the infrastructure's stability, performance,

security, governance or compliance

protocols.

Continuous Delivery

Ecosystem Foundation

eNPS

Employee Net Promoter Score (eNPS) is a

way for organizations to measure

employee loyalty. The Net Promoter Score,

originally a customer service tool, was later

used internally on employees instead of

customers.

DevOps Foundation,

DevOps Leader

Entity Under Test (EUT)

This is a class of terms which refers to names

of types of entities that are being tested.

These terms are often abbreviated to the

form xUT where "x" represents a type of

entity under test.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Epic

A collection of related user stories that may

need to be worked on across multiple

Sprints.

Certified Agile Service

Manager, DevOps

Foundation

Erikson (Stages of

Psychosocial

Development)

Erik Erikson (1950, 1963) proposed a

psychoanalytic theory of psychosocial

development comprising eight stages from

infancy to adulthood. During each stage,

the person experiences a psychosocial

crisis which could have a positive or

negative outcome for personality

development.

DevSecOps Foundation

Error Budget

The error budget provides a clear,

objective metric that determines how

unreliable a service is allowed to be within

a specific time period.

Site Reliability Engineering

Error Budget Policies

An error budget policy enumerates the

activity a team takes when they've

exhausted their error budget for a

particular service in a particular time

period.

Site Reliability Engineering
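
The Error Budget and Error Budget Policies entries above can be illustrated with a short calculation. This is a minimal sketch assuming an availability-style target measured in minutes of downtime; the function names and the "freeze releases" action are hypothetical, since each team defines its own policy.

```python
def error_budget_minutes(slo_target: float, window_days: int) -> float:
    """Minutes of unreliability allowed by an availability target over a window."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def release_decision(slo_target: float, window_days: int, downtime_minutes: float) -> str:
    """Hypothetical policy: stop feature releases once the budget is spent."""
    remaining = error_budget_minutes(slo_target, window_days) - downtime_minutes
    return "continue releases" if remaining > 0 else "freeze releases"


# A 99.9% target over 30 days allows roughly 43 minutes of downtime.
print(round(error_budget_minutes(0.999, 30), 1))
print(release_decision(0.999, 30, downtime_minutes=50.0))
```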

Error Tracking

Tools to easily discover and show the errors

that an application may be generating, along

with the associated data.

Site Reliability Engineering

External Automation

Scripts and automation outside of a service

that are intended to reduce toil.

Site Reliability Engineering

Fail Early

A DevOps tenet referring to the preference

to find critical problems as early as possible

in a development and delivery pipeline.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Fail Often

A DevOps tenet which emphasizes a

preference to find critical problems as fast

as possible and therefore frequently.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Failure Rate

Fail verdicts per unit of time.

DevOps Foundation,

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

False Negative

A test incorrectly reports a verdict of "fail"

when the EUT actually passed the purpose

of the test.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

False Positive

A test incorrectly reports a verdict of "pass"

when the EUT actually failed the purpose of

the test.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Feature Toggle

The practice of using software switches to

hide or activate features. This enables

continuous integration and testing a

feature with selected stakeholders.

DevOps Foundation,

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation
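
A minimal sketch of the Feature Toggle entry above: a software switch that hides a feature by default and can activate it for selected stakeholders or for everyone. The class `FeatureToggles`, the feature name, and the user names are all hypothetical.

```python
from typing import Dict, Optional, Set


class FeatureToggles:
    def __init__(self) -> None:
        # Feature name -> set of allowed users, or None meaning "everyone".
        self._rules: Dict[str, Optional[Set[str]]] = {}

    def enable(self, feature: str, users: Optional[Set[str]] = None) -> None:
        """Activate a feature for the given users, or for all users if omitted."""
        self._rules[feature] = set(users) if users is not None else None

    def disable(self, feature: str) -> None:
        """Hide the feature again for everyone."""
        self._rules.pop(feature, None)

    def is_enabled(self, feature: str, user: str) -> bool:
        if feature not in self._rules:
            return False  # Unknown or disabled features stay hidden.
        allowed = self._rules[feature]
        return allowed is None or user in allowed


toggles = FeatureToggles()
toggles.enable("new-checkout", users={"alice"})
print(toggles.is_enabled("new-checkout", "alice"))
print(toggles.is_enabled("new-checkout", "bob"))
```

Because the code for the hidden feature is already merged, it can be integrated and tested continuously while most users never see it.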

Federated Identity

A central identity used for access to a wide

range of applications, systems, and

services, but with a particular skew toward

web-based applications. Also, often

referenced as Identity-as-a-Service (IDaaS).

Any identity that can be reused across

multiple sites, particularly via SAML or

OAuth authentication mechanisms.

DevSecOps Foundation

Fire Drills

A planned failure testing process focussed

on the operation of live services including

service failure testing as well as

communication, documentation, and

other human factor testing.

Site Reliability Engineering

Flow

How people, products or information move

through a process. Flow is the first way of

The Three Ways.

DevOps Foundation,

DevOps Leader,

DevSecOps Foundation

Flow of Value

A form of map that shows the end-to-end

value stream. This view is usually not

available within the enterprise.

DevOps Leader

Framework

Backbone for plugging in tools. Launches

automated tasks, collects results from

automated tasks.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Freedom and

Responsibility

A core cultural value that with the freedom

of self-management (such as afforded by

DevOps) comes the responsibility to be

diligent, to follow the advice process and

to take ownership of both successes and

failures.

DevSecOps Foundation

Frequency

How often an application is released.

DevOps Leader

Functional Testing

Tests to determine if the functional

operation of the service is as expected.

Site Reliability Engineering

Future State Map

A form of value stream map that helps you

develop and communicate what the

target end state should look like and how

to tackle the necessary changes.

DevOps Leader

Fuzzing

Fuzzing or fuzz testing is an automated

software testing practice that inputs invalid,

unexpected, or random data into

applications.

DevSecOps Foundation
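
The Fuzzing entry above can be sketched as a loop that feeds random strings to a target function and records any failure the target did not anticipate. The helper names and the choice to treat `ValueError` as an expected rejection are assumptions for this example; dedicated fuzzing tools are considerably more sophisticated.

```python
import random
import string
from typing import Callable, List, Tuple


def random_input(rng: random.Random, max_len: int = 20) -> str:
    """Build a random string of printable characters, possibly empty."""
    length = rng.randint(0, max_len)
    return "".join(rng.choice(string.printable) for _ in range(length))


def fuzz(
    target: Callable[[str], object],
    iterations: int = 1000,
    seed: int = 0,
    expected: Tuple[type, ...] = (ValueError,),
) -> List[Tuple[str, Exception]]:
    """Return the inputs that made the target raise an unexpected exception."""
    rng = random.Random(seed)  # Seeded so that a failing run can be replayed.
    crashes = []
    for _ in range(iterations):
        data = random_input(rng)
        try:
            target(data)
        except expected:
            pass  # The target rejected bad input cleanly.
        except Exception as exc:
            crashes.append((data, exc))
    return crashes


# int() rejects malformed text with ValueError, so no crashes are recorded.
print(len(fuzz(int, iterations=200)))
```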

Gated Commits

Define and obtain consensus on criteria for

changes promoted between all CD

pipeline stages such as: Dev to CI stage /

CI to packaging / delivery stage / Delivery

to Deployment/Production stage.

Continuous Delivery

Ecosystem Foundation

Generative (DevOps)

Culture

In a generative organization alignment

takes place through identification with the

mission. The individual "buys into" what he

or she is supposed to do and its effect on

the outcome. Generative organizations

tend to be proactive in getting the

information to the right people by any

means necessary. (Westrum)

DevOps Leader

Generativity

A cultural view wherein long-term

outcomes are of primary focus, which in

turn drives investments and cooperation

that enable an organization to achieve

those outcomes.

DevSecOps Foundation

Glass‐Box

Same as Clear‐Box Testing and White‐Box

Testing.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Goal‐seeking tests

The purpose of the test is to determine an

EUT's performance boundaries, using

incremental stresses until the EUT reaches

a peak performance. E.g. Determine the

maximum throughput that can be handled

without errors.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation
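
A goal-seeking test as described above can be sketched as a loop that raises the load step by step until the first errors appear. The function `find_max_throughput` and the `run_at_load` callable (which returns an error count for a given load) are hypothetical; in practice the callable would drive a real load generator against the EUT.

```python
from typing import Callable, Optional


def find_max_throughput(
    run_at_load: Callable[[int], int],
    start: int,
    step: int,
    limit: int,
) -> Optional[int]:
    """Return the highest tested load handled without errors, or None."""
    best = None
    load = start
    while load <= limit:
        if run_at_load(load) > 0:
            break  # First errors seen: the previous load was the peak.
        best = load
        load += step
    return best


# Stand-in EUT that starts failing above 350 requests per second.
def fake_eut(load: int) -> int:
    return 0 if load <= 350 else 1


print(find_max_throughput(fake_eut, start=100, step=100, limit=1000))
```

The result is only as precise as the step size; a smaller step around the failure point narrows the boundary.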

Golden Circle

A model by Simon Sinek that emphasizes

an understanding of the business' "why"

before focusing on the "what" and "how".

DevOps Foundation

Golden Image

A template for a virtual machine (VM),

virtual desktop, server or hard disk drive.

(TechTarget)

DevSecOps Foundation

Goleman's Six Styles of

Leadership

Daniel Goleman (2002) created the Six

Leadership Styles and found, in his

research, that leaders used one of these

styles at any one time.

DevOps Leader

Governance, Risk

Management and

Compliance (GRC)

A software platform intended for

concentrating governance, compliance

and risk management data, including

policies, compliance requirements,

vulnerability data, and sometimes asset

inventory, business continuity plans, etc. In

essence, a specialized document and

data repository for security governance. Or

a team of people who specialize in

IT/security governance, risk management

and compliance activities. Most often non-

technical business analyst resources.

DevSecOps Foundation

Gray‐Box

Test cases use a limited knowledge of the

internal design structure of the EUT.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

GUI testing

The purpose of the test is to determine if

the graphical user interface operates as

expected.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Guilds

A "community of interest" group that

welcomes anyone and usually cuts across

an entire organization. Similar to a

Community of Practice.

DevOps Foundation,

DevOps Leader

Hand Offs

The procedure for transferring the

responsibility of a particular task from one

individual or team to another.

DevOps Foundation,

DevOps Leader

Hardening

Securing a server or infrastructure

environment by removing or disabling

unnecessary software, updating to known

good versions of the operating system,

restricting network-level access to only that

which is needed, configuring logging in

order to capture alerts, configuring

appropriate access management and

installing appropriate security tools.

DevSecOps Foundation

Helm Chart Registry

Helm charts are what describe related

Kubernetes resources. Artifactory and

Codefresh support a registry for

maintaining master records of Helm Charts.

Site Reliability Engineering

Heritage Reliability

Engineer (HRE)

Applying the principles and practices of

SRE to legacy applications and

environments.

Site Reliability Engineering

High-Trust Culture

Organizations with a high-trust culture

encourage good information flow, cross-

functional collaboration, shared

responsibilities, learning from failures and

new ideas.

DevOps Foundation

Horizontal Scaling

Computing resources are scaled wider to

increase the volume of processing. E.g.

Add more computers and run more tasks in

parallel.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Idempotent

CM tools (e.g., Puppet, Chef, Ansible, and

Salt) claim that they are 'idempotent' by

allowing the desired state of a server to be

defined as code or declarations and

automate steps necessary to consistently

achieve the defined state time‐after‐time.

Continuous Delivery

Ecosystem Foundation
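
The idea behind the Idempotent entry above is that applying the same desired state repeatedly leaves the system unchanged after the first application. The sketch below models a server's configuration as a plain dictionary; the function `ensure_setting` and the setting names are hypothetical and do not reflect how any particular CM tool is implemented.

```python
def ensure_setting(state: dict, key: str, value: str) -> bool:
    """Bring one setting to its desired value; return True only if a change was made."""
    if state.get(key) == value:
        return False  # Already in the desired state: do nothing.
    state[key] = value
    return True


server = {"ntp_server": "old.example.org"}
print(ensure_setting(server, "ntp_server", "time.example.org"))  # first run changes state
print(ensure_setting(server, "ntp_server", "time.example.org"))  # second run is a no-op
```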

Identity

The unique name of a person, device, or

the combination of both that is recognized

by a digital system. Also referred to as an

"account" or "user."

DevSecOps Foundation

Identity and Access

Management (IAM)

Policies, procedures and tools for ensuring

the right people have the right access to

technology resources.

DevSecOps Foundation

Identity as a Service

(IDaaS)

Identity and access management services

that are offered through the cloud or on a

subscription basis.

DevSecOps Foundation

Image‐based test

selection method

Build images are pre‐assigned test cases.

Test cases are selected for a build by

matching the image changes resulting

from a build.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Immersive learning

A learning approach that guides teams

with coaching and practice to help them

learn to work in a new way.

DevOps Leader

Immutable

An immutable object is an object whose

state cannot be modified after it is

created. The antonym is a mutable object,

which can be modified after it is created.

Continuous Delivery

Ecosystem Foundation
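
To make the Immutable entry above concrete, the sketch below uses a frozen dataclass from Python's standard library. The `ServerImage` class and its fields are hypothetical. Attempting to modify the object raises an error, so a change is expressed by creating a new object instead.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ServerImage:
    name: str
    version: str


image_v1 = ServerImage(name="web", version="1.0")

try:
    image_v1.version = "1.1"  # Not allowed: the object is immutable.
    mutated = True
except FrozenInstanceError:
    mutated = False

# A change produces a new object; the original is left untouched.
image_v2 = replace(image_v1, version="1.1")
print(mutated, image_v1.version, image_v2.version)
```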

Immutable

Infrastructures

Instead of instantiating an instance (server,

container, etc.), with error‐prone, time‐

consuming patches and upgrades (i.e.

mutations), replace it with another instance

to introduce changes or ensure proper

behavior.

Continuous Delivery

Ecosystem Foundation,

Site Reliability Engineering

Implementation Under

Test

The EUT is a software implementation. E.g.

Embedded program is being tested.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Improvement Kata

A structured way to create a culture of

continuous learning and improvement. (In

Japanese business, Kata is the idea of

doing things the "correct" way. An

organization's culture can be

characterized as its Kata through its

consistent role modeling, teaching and

coaching.)

DevOps Foundation

Incentive model

A system designed to motivate people to

complete tasks toward achieving

objectives. The system may employ either

positive or negative consequences for

motivation.

DevSecOps Foundation

Incident

Any unplanned interruption to an IT service

or reduction in the quality of an IT service.

Includes events that disrupt or could disrupt

the service. (ITIL definition)

DevSecOps Foundation

Incident

Management

Process that restores normal service

operation as quickly as possible to minimize

business impact and ensure that agreed

levels of service quality are maintained. (ITIL

definition). Involves capturing the who,

what, when of service incidents and the

onward use of this data in ensuring service

level objectives are being met.

DevSecOps Foundation,

Site Reliability Engineering

Incident Response

An organized approach to addressing and

managing the aftermath of a security

breach or attack (also known as an

incident). The goal is to handle the situation

in a way that limits damage and reduces

recovery time and costs.

DevSecOps Foundation,

Site Reliability Engineering

Increment

Potentially shippable completed work that

is the outcome of a Sprint.

Certified Agile Service

Manager, DevOps

Foundation

Incremental Rollout

Incremental rollout means deploying many

small, gradual changes to a service instead

of a few large changes. Users are

incrementally moved across to the new

version of the service until eventually all

users are moved across. Sometimes

referred to by colored environments e.g.

Blue/green deployment.

Site Reliability Engineering
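
One way to move users across incrementally, as the Incremental Rollout entry above describes, is to hash each user into a stable bucket and raise a rollout percentage over time. This is a sketch under that assumption; the function names and the 100-bucket scheme are illustrative choices, not something the guide prescribes.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def serves_new_version(user_id: str, rollout_percent: int) -> bool:
    """True if this user should be routed to the new version of the service."""
    return bucket(user_id) < rollout_percent


users = [f"user-{i}" for i in range(200)]
for percent in (0, 10, 50, 100):
    moved = sum(serves_new_version(u, percent) for u in users)
    print(percent, moved)
```

Because the bucket depends only on the user identifier, a user who has been moved to the new version stays there as the percentage increases.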

Infrastructure

All of the hardware, software, networks,

facilities, etc., required to develop, test,

deliver, monitor and control or support IT

services. The term IT infrastructure includes

all of the information technology but not

the associated people, processes and

documentation. (ITIL definition)

DevOps Foundation,

DevSecOps Foundation

Infrastructure as Code

The practice of using code (scripts) to

configure and manage infrastructure.

DevOps Foundation,

DevSecOps Foundation

Infrastructure Test

The purpose of the test is to verify the

framework for EUT operating. E.g. verify

specific operating system utilities function

as expected in the target environment.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Infrastructure‐as‐a‐

Service (IaaS)

On‐demand access to a shared pool of

configurable computing resources.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Integrated

development

environment (IDE)

An integrated development environment

(IDE) is a software suite that consolidates

the basic tools developers need to write

and test software. Typically, an IDE contains

a code editor, a compiler or interpreter

and a debugger that the developer

accesses through a single graphical user

interface (GUI). An IDE may be a

standalone application, or it may be

included as part of one or more existing

and compatible applications. (TechTarget)

DevSecOps Foundation

Integrated

development

environment (IDE) 'lint'

checks

Linting is the process of running a program

that will analyze code for potential errors

(e.g., formatting discrepancies, non-

adherence to coding standards and

conventions, logical errors).

DevSecOps Foundation

Internet of Things

A network of physical devices that connect

to the internet and potentially to each

other through web-based wireless services.

DevOps Foundation,

DevSecOps Foundation

Internal Automation

Scripts and automation delivered as part of

the service that are intended to reduce toil.

Site Reliability Engineering

INVEST

A mnemonic created by Bill Wake as a

reminder of the characteristics of a quality

user story.

Certified Agile Service

Manager

ISO 31000

A family of standards that provide

principles and generic guidelines on risk

management.

DevSecOps Foundation

Issue Management

A process for capturing, tracking, and

resolving bugs and issues throughout the

software development lifecycle.

DevSecOps Foundation

IT Service

Management (ITSM)

Implementation and management of

quality IT services that meet the needs of

the business. (ITIL definition)

DevOps Foundation, Site

Reliability Engineering

iTest

Tool licensed by Spirent Communications

for creating automated test cases.

Continuous Testing

Foundation

ITIL

Provides a best practices framework that

organizations can adapt to deliver and

maintain IT services to provide optimal

value for all stakeholders, including the

customer.

Certified Agile Service

Manager, DevOps

Foundation, Site Reliability

Engineering

Jenkins

Jenkins is an open-source tool. It is the most

popular master automation framework

tool, especially for continuous integration

task automation. Jenkins task automation

centers around timed processes. Many test

tools and other tools offer plugins to simplify

integration with Jenkins.

Continuous Delivery

Ecosystem

Foundation, Continuous

Testing Foundation

Kaizen

The practice of continuous improvement.

DevOps Foundation

Kanban

Method of work that pulls the flow of work

through a process at a manageable pace.

Certified Agile Service

Manager, DevOps

Foundation

Kanban Board

Tool that helps teams organize, visualize

and manage work.

DevOps Foundation

Karpman Drama

Triangle

The drama triangle is a social model of

human interaction. The triangle maps a

type of destructive interaction that can

occur between people in conflict.

DevOps Leader

Key Metrics

Something that is measured and reported

upon to help manage a process, IT service

or activity.

DevOps Foundation,

DevOps Leader

Keywords‐Based

Test cases are created using pre‐defined

names that reference programs useful for

testing.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Knowledge

Management

Process that ensures the right information is

delivered to the right place or person at

the right time to enable an informed

decision.

DevOps

Foundation, DevSecOps

Foundation

Known Error

Problem with a documented root cause

and a workaround. (ITIL definition)

DevSecOps Foundation

Kolb's Learning Styles

David Kolb published his learning styles

model in 1984; his experiential learning

theory works on two levels: a four-stage

cycle of learning and four separate

learning styles.

DevOps Leader

Kotter's Dual

Operating System

John Kotter describes the need for a dual

operating system that combines the

entrepreneurial capability of a network

with the organisational efficiency of

traditional hierarchy.

DevOps Leader

Kubernetes

Kubernetes is an open-source container-

orchestration system for automating

application deployment, scaling, and

management. It was originally designed by

Google, and is now maintained by the

Cloud Native Computing Foundation.

Site Reliability Engineering

Kubler-Ross Change

Curve

Describes and predicts the stages of

personal and organizational reaction to

major changes.

DevOps Foundation

Lab‐as‐a‐Service

(LaaS)

Category of cloud computing services that

provides a laboratory allowing customers

to test applications without the complexity

of building and maintaining the lab

infrastructure.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Laloux (Culture

Models)

Frederic Laloux created a model for

understanding organizational culture.

DevSecOps Foundation

Latency

Latency is the delay incurred in

communicating a message: the time a

message spends “on the wire” between

the initial request being received (e.g., by a

server) and the response being received

(e.g., by a client).

Site Reliability Engineering

Laws of Systems

Thinking

In his book 'The Fifth Discipline', Peter Senge

outlines eleven laws that aid the

understanding of business systems and help

identify behaviors for addressing complex

business problems.

DevOps Leader

Lean

Production philosophy that focuses on

reducing waste and improving the flow of

processes to improve overall customer

value.

Certified Agile Service

Manager, DevOps

Foundation, DevOps

Leader, DevSecOps

Foundation

Lean (adjective)

Spare, economical. Lacking richness or

abundance.

DevOps Foundation,

DevSecOps Foundation

Lean Canvas

Lean Canvas is a 1-page business plan

template.

DevOps Leader

Lean Enterprise

Organization that strategically applies the

key ideas behind lean production across

the enterprise.

DevOps Foundation,

DevSecOps Foundation

Lean IT

Applying the key ideas behind lean

production to the development and

management of IT products and services.

DevOps Foundation,

DevSecOps Foundation

Lean Manufacturing

Lean production philosophy derived mostly

from the Toyota Production System.

DevOps Foundation,

DevSecOps Foundation

Lean Product

Development

Lean Product Development, or LPD, utilizes

Lean principles to meet the challenges of

Product Development.

DevOps Leader

Lean Startup

A system for developing a business or

product in the most efficient way possible

to reduce the risk of failure.

DevOps Leader

License Scanning

Tools, such as Black Duck and Synopsys, that

check that licenses of your dependencies

are compatible with your application, and

approve or blacklist them.

Site Reliability Engineering

Little's Law

A theorem by John Little which states that

the long-term average number L of

customers in a stationary system is equal to

the long-term average effective arrival

rate λ multiplied by the average

time W that a customer spends in the

system.

DevOps Leader
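
As a worked illustration of L = λW, here is a minimal sketch. The function names and the ticket-queue numbers are hypothetical, chosen only to show the arithmetic.

```python
def average_items_in_system(arrival_rate: float, avg_time_in_system: float) -> float:
    """Little's Law: L = lambda * W."""
    return arrival_rate * avg_time_in_system


def average_time_in_system(avg_items: float, arrival_rate: float) -> float:
    """Little's Law rearranged: W = L / lambda."""
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    return avg_items / arrival_rate


# Hypothetical queue: 5 tickets arrive per day and each spends 3 days in the system.
print(average_items_in_system(5, 3))   # 15 tickets in the system on average

# Rearranged: 12 items in progress with 4 arriving per day means 3 days each.
print(average_time_in_system(12, 4))   # 3.0
```

The rearranged form is the one most often used in practice: with a steady arrival rate, reducing work in progress is the lever that shortens the time each item spends in the system.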

LoadRunner

Tool used to test applications, measuring

system behavior and performance under

load. Licensed by HP.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Log

Serialized report of details such as test

activities and EUT console logs.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Log Management

The collective processes and policies used

to administer and facilitate the generation,

transmission, analysis, storage, archiving

and ultimate disposal of the large volumes

of log data created within an information

system.

DevSecOps Foundation

Logging

The capture, aggregation and storage of

all logs associated with system

performance including, but not limited to,

process calls, events, user data, responses,

error and status codes. Logstash and

Nagios are popular examples.

Site Reliability Engineering

Logic Bomb (Slag

Code)

A string of malicious code used to cause

harm to a system when the programmed

conditions are met.

DevSecOps Foundation

Longevity Test

The purpose of the test is to determine if a

complete system performs as expected

over an extended period of time.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Machine Learning

Data analysis that uses algorithms that

learn from data.

DevOps Foundation

Malware

A program designed to gain access to

computer systems, normally for the benefit

of some third party, without the user’s

permission.

DevSecOps Foundation

Many-factor

Authentication

The practice of using at least 2 factors for

authentication. The two factors can be of

the same class.

DevSecOps Foundation

Mean Time Between

Deploys

Used to measure deployment frequency.

DevOps Foundation,

DevSecOps Foundation

Mean Time Between

Failures (MTBF)

Average time that a CI or IT service can

perform its agreed function without

interruption. Often used to measure

reliability. Measured from when the CI or

service starts working, until the time it fails

(uptime). (ITIL definition)

DevOps Foundation,

DevSecOps Foundation

Mean Time to Detect

Defects (MTTD)

Average time required to detect a failed

component or device.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

DevSecOps Foundation,

Site Reliability Engineering

Mean Time to

Discovery

How long a vulnerability or software

bug/defect exists before it's identified.

DevSecOps Foundation

Mean Time to Patch

How long it takes to apply patches to

environments once a vulnerability has

been identified.

DevSecOps Foundation

Mean Time to

Repair/Recover

(MTTR)

Average time required to repair/recover a

failed component or device. MTTR does

not include the time required to recover or

restore service.

DevOps Foundation,

DevSecOps Foundation,

Site Reliability Engineering

Mean Time to Restore

Service (MTRS)

Used to measure time from when the CI or

IT service fails until it is fully restored and

delivering its normal functionality

(downtime). Often used to measure

maintainability. (ITIL definition).

DevOps Foundation,

DevSecOps Foundation,

Site Reliability Engineering
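
The uptime (MTBF) and downtime (MTRS) measures defined above can be computed from an outage log. The sketch below assumes a fixed observation window and a list of hypothetical (start, end) outage times in hours. The availability formula MTBF / (MTBF + MTRS) is a common convention rather than something this glossary defines.

```python
def reliability_summary(window_hours: float,
                        outages: list[tuple[float, float]]) -> dict[str, float]:
    """Summarize MTBF, MTRS and availability over one observation window."""
    if not outages:
        raise ValueError("at least one outage is needed to compute a mean")
    downtime = sum(end - start for start, end in outages)
    uptime = window_hours - downtime
    failures = len(outages)
    mtbf = uptime / failures    # mean uptime between failures
    mtrs = downtime / failures  # mean downtime until service is restored
    return {
        "mtbf_hours": mtbf,
        "mtrs_hours": mtrs,
        "availability": mtbf / (mtbf + mtrs),
    }


# Hypothetical 30-day window (720 hours) with three outages totalling 6 hours.
summary = reliability_summary(720, [(100, 102), (400, 401), (650, 653)])
print(summary["mtbf_hours"])              # 238.0
print(summary["mtrs_hours"])              # 2.0
print(round(summary["availability"], 4))  # 0.9917
```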

Mental Models

A mental model is an explanation of

someone's thought process about how

something works in the real world.

DevOps Leader

Merge

Action of integrating software changes

together into a software version

management system.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Metric

Something that is measured and reported

upon to help manage a process, IT service

or activity.

DevOps Foundation,

DevSecOps Foundation

Metrics

This is a class of terms relevant to

measurements used to monitor the health

of a product or infrastructure.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Microprocess

A distinct activity that can be defined,

designed, implemented and managed

independently and is generally associated

with a primary service management

practice. A microprocess may be

integrated with other service management

practices.

Certified Agile Service

Manager

Microprocess

Architecture

A collection of integrated microprocesses

that collectively perform all of the activities

necessary for an end-to-end service

management practice to be successful.

Certified Agile Service

Manager

Microservices

A software architecture that is composed

of smaller modules that interact through

APIs and can be updated without

affecting the entire system.

DevOps Foundation

Mindset

A person's usual attitude or mental state is

their mindset.

DevOps Leader

Minimum Viable

Process

The least amount needed in order for this

process or microprocess to meet its

Definition of Done.

Certified Agile Service

Manager

Minimum Viable

Product

Most minimal version of a product that can

be released and still provide enough value

that people are willing to use it.

DevOps Leader

Mock Object

Mock is a method/object that simulates the

behavior of a real method/object in

controlled ways. Mock objects are used in

unit testing. Often a method under a test

calls other external services or methods

within it. These are called dependencies.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation
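
A minimal sketch of a mock standing in for a dependency, using `Mock` from Python's standard `unittest.mock` module. The function `order_total` and the `price_service` dependency with its `get_prices` method are hypothetical names invented for this example.

```python
from unittest.mock import Mock


def order_total(order_id: str, price_service) -> float:
    """Function under test: relies on an external pricing service (the dependency)."""
    prices = price_service.get_prices(order_id)
    return sum(prices)


def test_order_total():
    # Replace the real dependency with a mock that returns a controlled value.
    price_service = Mock()
    price_service.get_prices.return_value = [10.0, 5.5]

    assert order_total("A-1", price_service) == 15.5
    # The mock also records how it was called, so the interaction can be verified.
    price_service.get_prices.assert_called_once_with("A-1")


test_order_total()
```

Because the mock returns a fixed value, the unit test exercises only the logic in `order_total` and does not need the real service to be reachable.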

Model

Representation of a system, process, IT

service, CI, etc. that is used to help

understand or predict future behavior. In

the context of processes, models represent

pre-defined steps for handling specific

types of transactions.

DevSecOps Foundation

Model‐Based

Test cases are automatically derived from

a model of the entity under test. Example

tool: Tricentis.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Monitoring

The use of a hardware or software

component to monitor the system

resources and performance of a computer

service.

Site Reliability Engineering

Monitoring Tools

Tools that allow IT organizations to identify

specific issues of specific releases and to

understand the impact on end-users.

DevOps Leader

Monolithic

A software system is called "monolithic" if it

has a monolithic architecture, in which

functionally distinguishable aspects (for

example data input and output, data

processing, error handling, and the user

interface) are all interwoven, rather than

containing architecturally separate

components.

Continuous Delivery

Ecosystem Foundation

Multi-factor

Authentication

The practice of using 2 or more factors for

authentication. Often used synonymously

with 2-factor Authentication.

DevSecOps Foundation

Multi‐cloud

Multi‐cloud DevOps solutions provide on‐

demand multi‐tenant access to

development and test environments.

Continuous Delivery

Ecosystem Foundation

Network Reliability

Engineer (NRE)

Someone who applies a reliability

engineering approach to measure and

automate the reliability of networks.

Site Reliability Engineering

Neuroplasticity

Describes the ability of the brain to form

and reorganize synaptic connections,

especially in response to learning or

experience or following injury.

DevOps Leader

Neuroscience

The study of the brain and nervous system.

DevOps Leader

Non-functional

requirements

Requirements that specify criteria that can

be used to judge the operation of a

system, rather than specific behaviors or

functions (e.g., availability, reliability,

maintainability, supportability); qualities of

a system.

DevOps Foundation

Non-functional tests

Defined as a type of service testing

intending to check non-functional aspects

such as performance, usability and

reliability of a software service.

Site Reliability Engineering

Object Under Test

(OUT)

The EUT is a software object or class of

objects.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Observability

Observability is focused on externalizing as

much data as you can about the whole

service allowing us to infer what the current

state of that service is.

Site Reliability Engineering

On-call

Being on-call means someone being

available during a set period of time, and

being ready to respond to production

incidents during that time with appropriate

urgency.

Site Reliability Engineering

Open Source

Software that is distributed with its source

code so that end user organizations and

vendors can modify it for their own

purposes.

DevOps Foundation,

DevSecOps Foundation

Operations (Ops)

Individuals involved in the daily operational

activities needed to deploy and manage

systems and services such as quality

assurance analysts, release managers,

system and network administrators,

information security officers, IT operations

specialists and service desk analysts.

Continuous Delivery

Ecosystem Foundation

Operations

Management

Function that performs the daily activities

needed to deliver and support IT services

and the supporting IT infrastructure at the

agreed levels. (ITIL)

DevSecOps Foundation

Ops

Individuals involved in the daily operational

activities needed to deploy and manage

systems and services such as quality

assurance analysts, release managers,

system and network administrators,

information security officers, IT operations

specialists and service desk analysts.

DevOps Foundation,

DevSecOps Foundation

Orchestration

An approach to building automation that

interfaces or "orchestrates" multiple tools

together to form a toolchain.

DevOps Foundation,

DevSecOps Foundation

Organization Culture

A system of shared values, assumptions,

beliefs, and norms that unite the members

of an organization.

DevOps Leader

Organization Model

For DevOps, an approach that models

Spotify's Squad approach for organizing IT.

DevOps Leader

Organizational

Change

Efforts to adapt the behavior of humans

within an organization to meet new

structures, processes or requirements.

DevOps Foundation,

DevSecOps Foundation

OS Virtualization

A method for splitting a server into multiple

partitions called "containers" or "virtual

environments" in order to prevent

applications from interfering with each

other.

DevOps Foundation

Outcome

Intended or actual results.

DevOps Foundation,

DevSecOps Foundation

Package Registry

A repository for software packages,

artifacts and their corresponding

metadata. Can store files produced by an

organization itself or for third party binaries.

Artifactory and Nexus are amongst the

most popular.

Site Reliability Engineering

Pages

Something for creating supporting web

pages automatically as part of a CI/CD

pipeline.

Site Reliability Engineering

Patch

A software update designed to address

(mitigate/remediate) a bug or weakness.

DevSecOps Foundation

Patch management

The process of identifying and

implementing patches.

DevSecOps Foundation

Pathological Culture

Pathological cultures tend to view

information as a personal resource, to be

used in political power struggles (Westrum).

DevOps Leader, Site

Reliability Engineering

Penetration Testing

An authorized simulated attack on a

computer system that looks for security

weaknesses, potentially gaining access to

the system's features and data.

DevSecOps Foundation

People Changes

Focuses on changing attitudes, behaviors,

skills, or performance of employees.

DevOps Leader

Performance Test

The purpose of the test is to determine whether

an EUT meets its system performance criteria

or to determine what a system's

performance capabilities are.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Plan-Do-Check-Act

A four-stage cycle for process

management and improvement attributed

to W. Edwards Deming. Sometimes called

the Deming Cycle or PDCA.

Certified Agile Service

Manager, DevOps

Foundation, DevSecOps

Foundation

Platform‐as‐a‐Service

(PaaS)

Category of cloud computing services that

provides a platform allowing customers to

develop, run, and manage applications

without the complexity of building and

maintaining the infrastructure.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Plugin

A pre‐programmed integration between

an Orchestration tool and other tools. For

example, many tools offer plugins to

integrate with Jenkins.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Policies

Formal documents that define boundaries

in terms of what the organization may or

may not do as part of its operations.

DevOps Foundation,

DevSecOps Foundation

Policy as Code

The notion that security principles and

concepts can be articulated in code (e.g.,

software, configuration management,

automation) to a sufficient degree that the

need for an extensive traditional policy

framework is greatly reduced. Standards

and guidelines should be implemented in

code and configuration, automatically

enforced and automatically reported-on in

terms of compliance, variance or

suspected violations.

DevSecOps Foundation
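
One way to picture policy as code is a set of rules that are evaluated automatically against a resource's configuration and reported on. The sketch below is an assumption-laden illustration: the rule set, the configuration keys (`encrypted`, `public_access`, `tls_version`) and the function `evaluate` are hypothetical and do not follow any specific policy engine.

```python
# Each rule pairs a description with a predicate that is True when the resource complies.
RULES = [
    ("storage must be encrypted at rest", lambda r: r.get("encrypted") is True),
    ("public access must be disabled", lambda r: r.get("public_access") is False),
    ("TLS version must be 1.2 or higher", lambda r: r.get("tls_version", (0, 0)) >= (1, 2)),
]


def evaluate(resource: dict) -> list[str]:
    """Return the description of every rule the resource violates."""
    return [description for description, complies in RULES if not complies(resource)]


compliant = {"encrypted": True, "public_access": False, "tls_version": (1, 3)}
risky = {"encrypted": True, "public_access": True, "tls_version": (1, 0)}

print(evaluate(compliant))  # []
print(evaluate(risky))
```

A missing setting counts as a violation in this sketch, so a resource has to state its compliance explicitly rather than pass by omission.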

Practice

A complete end to end capability for

managing a specific aspect of service

delivery (e.g. changes, incidents, service

levels).

Certified Agile Service

Manager

Practice Backlog

A prioritized list of everything that needs to

be designed or improved for a practice

including current and future requirements.

Certified Agile Service

Manager

Practice/Microprocess

Planning

A high-level event to define the goals,

objectives, inputs, outcomes, activities,

stakeholders, tools and other aspects of a

practice or microprocess. This meeting is

not timeboxed.

Certified Agile Service

Manager

Pre‐Flight

This is a class of terms that refers to the names of

activities and processes that are

conducted on an EUT prior to integration

into the trunk branch.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Priority

The relative importance of an incident,

problem or change; based on impact and

urgency. (ITIL definition)

DevSecOps Foundation
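
Priority is often derived from impact and urgency with a simple lookup. The sketch below assumes three levels for each input and a five-point priority scale where 1 is the most important. Both the scale and the function name `priority` are illustrative assumptions; organizations define their own matrix.

```python
# Assumed level encoding for this sketch: 1 is the most severe.
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact: str, urgency: str) -> int:
    """Combine impact and urgency into a priority from 1 (critical) to 5 (planning)."""
    return LEVELS[impact] + LEVELS[urgency] - 1


print(priority("high", "high"))    # 1
print(priority("high", "low"))     # 3
print(priority("low", "low"))      # 5
```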

Privileged Access

Management (PAM)

Technologies that help organizations

provide secured privileged access to

critical assets and meet compliance

requirements by securing, managing and

monitoring privileged accounts and

access. (Gartner)

DevSecOps Foundation

Problem

The underlying cause of one or more

incidents. (ITIL definition)

DevOps Foundation,

DevSecOps Foundation

Process

Structured set of activities designed to

accomplish a specific objective. A process

takes inputs and turns them into defined

outputs. Related work activities that take

specific inputs and produce specific

outputs that are of value to a customer.

Certified Agile Service

Manager, DevOps

Foundation, DevSecOps

Foundation

Process Changes

Focuses on changes to standard IT process,

such as software development practices,

ITIL processes, change management,

approvals etc.

DevOps Leader

Process Owner

Role accountable for the overall quality of

a process. May be assigned to the same

person who carries out the Process

Manager role, but the two roles may be

separate in larger organizations. (ITIL

definition)

DevSecOps Foundation

Process Standup

A timeboxed event of 15 minutes to inspect

progress towards the Sprint Goal and

identify impediments as quickly as possible.

Certified Agile Service

Manager

Processing Time

The period during which one or more inputs

are transformed into a finished product by

a manufacturing or development

procedure. (Business Dictionary)

DevOps Leader

Product Backlog

Prioritized list of functional and non-

functional requirements for a system usually

expressed as user stories.

DevOps Foundation

Product Owner

An individual responsible for maximizing the

value of a product and for managing the

product backlog. Prioritizes, grooms, and

owns the backlog. Gives the squad

purpose.

DevOps Leader

Programming‐Based

Test cases are created by writing code in a

programming language. E.g. JavaScript,

Python, TCL, Ruby

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Provision Platforms

Tools that provide platforms for provisioning

infrastructure (e.g., Puppet, Chef, Salt).

DevOps Leader

Psychological Safety

Psychological safety is a shared belief that

the team is safe for interpersonal risk taking.

DevOps Leader

QTP

Quick Test Professional is a functional and

regression test automation tool for software

applications. Licensed by HP.

Continuous Testing

Foundation

Quality Management

Tools that handle test case planning, test

execution, defect tracking (often into

backlogs), severity and priority analysis.

Example: CA’s Agile Central.

Site Reliability Engineering

Ranorex

GUI test automation framework for testing

of desktop, web‐based and mobile

applications. Licensed by Ranorex.

Continuous Testing

Foundation

Ransomware

Encrypts the files on a user’s device or a

network’s storage devices. To restore

access to the encrypted files, the user must

pay a “ransom” to the cybercriminals,

typically through a tough-to-trace

electronic payment method such as

Bitcoin.

DevSecOps Foundation

Regression testing

The purpose of the test is to determine if a

new version of an EUT has broken

something that worked previously.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Regulatory

compliance testing

The purpose of the test is to determine if an

EUT conforms to specific regulatory

requirements. E.g. verify an EUT satisfies

government regulations for consumer

credit card processing.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Release

Software that is built, tested and deployed

into the production environment.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

DevSecOps Foundation

Release Acceptance

Criteria

Measurable attributes for a release

package which determine whether a

release candidate is acceptable for

deployment to customers.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Release Candidate

A release package that has been

prepared for deployment, may or may not

have passed the Release Acceptance Criteria.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Release Governance

Release Governance is all about the

controls and automation (security,

compliance, or otherwise) that ensure your

releases are managed in an auditable and

trackable way, in order to meet the need

of the business to understand what is

changing.

Site Reliability Engineering

Release Management

Process that manages releases and

underpins Continuous Delivery and the

Deployment Pipeline.

DevOps Foundation,

DevSecOps Foundation

Release Orchestration

Typically a deployment pipeline, used to

detect any changes that will lead to

problems in production. Orchestrating

other tools will identify performance,

security, or usability issues. Tools like Jenkins

and GitLab CI can “orchestrate” releases.

Site Reliability Engineering

Relevance

A Continuous Testing tenet which

emphasizes a preference to focus on the

most important tests and test results.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Reliability

Measure of how long a service,

component or CI can perform its agreed

function without interruption. Usually

measured as MTBF or MTBSI. (ITIL definition)

DevOps Foundation,

DevSecOps Foundation,

Site Reliability Engineering

Reliability Test

The purpose of the test is to determine if a

complete system performs as expected

under stressful and loaded conditions over

an extended period of time.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Remediation

Action to resolve a problem found during

DevOps processes. E.g. Roll‐back changes

for an EUT change that resulted in a CT

test case fail verdict.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Remediation Plan

Plan that determines the actions to take

after a failed change or release. (ITIL

definition)

DevOps Foundation,

DevSecOps Foundation

Request for Change

(RFC)

Formal proposal to make a change. The

term RFC is often misused to mean a

change record, or the change itself. (ITIL

definition)

DevOps Foundation

Requirements

Management

Tools that handle requirements definition,

traceability, hierarchies & dependency.

Often also handle code requirements and

test cases for requirements.

Site Reliability Engineering

Resilience

Building an environment or organization

that is tolerant to change and incidents.

DevSecOps Foundation,

Site Reliability Engineering

Response Time

Response time is the total time it takes from

when a user makes a request until they

receive a response.

Site Reliability Engineering

REST

Representational State Transfer. Software

architecture style of the world‐wide web.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

RESTful API

Representational state transfer (REST) or

RESTful services on a network, such as HTTP,

provide scalable interoperability for

requesting systems to quickly and reliably

access and manipulate textual

representations (XML, HTML, JSON) of

resources using stateless operations (GET,

POST, PUT, DELETE, etc.).

Continuous Delivery

Ecosystem Foundation

RESTful interface

testing

The purpose of the test is to determine if an

API satisfies its design criterion and the

expectations of the REST architecture.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Return on Investment

(ROI)

Difference between the benefit achieved

and the cost to achieve that benefit,

expressed as a percentage of that cost.

DevOps Foundation,

DevSecOps Foundation
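
As a worked example of the ROI definition above, the sketch below uses the conventional formula ROI = (benefit − cost) / cost × 100. The function name and the figures are hypothetical.

```python
def roi_percent(benefit: float, cost: float) -> float:
    """Return on investment as a percentage of the cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost * 100


# Hypothetical automation project: costs 100,000 and yields 150,000 in benefit.
print(roi_percent(150_000, 100_000))  # 50.0
```

A negative result means the benefit did not cover the cost.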

Review Apps

Allow code to be committed and

launched in real time – environments are

spun up to allow developers to review their

application.

Site Reliability Engineering

Rework

The time and effort required to correct

defects (waste).

DevOps Leader

Risk

Possible event that could cause harm or

loss or affect an organization's ability to

achieve its objectives. The management of

risk consists of three activities: identifying

risks, analyzing risks and managing risks. The

probable frequency and probable

magnitude of future loss. Pertains to a

possible event that could cause harm or

loss or affect an organization's ability to

execute or achieve its objectives.

DevOps Foundation,

DevSecOps Foundation

Risk Event

Possible event that could cause harm or

loss or affect an organization's ability to

achieve its objectives. The management of

risk consists of three activities: identifying

risks, analyzing risks and managing risks.

DevOps Leader

Risk Management

Process

The process by which "risk" is

contextualized, assessed, and treated.

From ISO 31000: 1) Establish context, 2)

Assess risk, 3) Treat risk (remediate, reduce

or accept).

DevSecOps Foundation

Robot Framework

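Open-source, keyword-driven test

automation framework, originally developed

at Nokia Networks and now supported by

the Robot Framework Foundation.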

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Role

Set of responsibilities, activities and

authorities granted to a person or team. A

role is defined by a process. One person or

team may have multiple roles. A set of

permissions assigned to a user or group of

users to allow a user to perform actions

within a system or application.

DevOps Foundation,

DevSecOps Foundation

Role-based Access

Control (RBAC)

An approach to restricting system access

to authorized users.

DevSecOps Foundation
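
In practice, RBAC usually means that permissions are attached to roles and users are assigned roles, so an access check never refers to an individual user's permissions directly. The sketch below illustrates that indirection; the roles, users, actions and the function `is_allowed` are all hypothetical.

```python
# Permissions are granted to roles, never directly to users.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "operator": {"read", "restart"},
    "admin": {"read", "restart", "deploy"},
}

# Users are assigned one or more roles.
USER_ROLES = {
    "alice": {"admin"},
    "bob": {"viewer", "operator"},
}


def is_allowed(user: str, action: str) -> bool:
    """Allow the action if any of the user's roles grants it; deny by default."""
    return any(
        action in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("alice", "deploy"))    # True
print(is_allowed("bob", "deploy"))      # False
print(is_allowed("mallory", "read"))    # False (unknown user has no roles)
```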

Roll‐back

Software changes which have been

integrated are removed from the

integration.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Root Cause Analysis

(RCA)

Actions taken to identify the underlying

cause of a problem or incident.

DevOps Foundation,

DevSecOps Foundation

Rugged Development

(DevOps)

Rugged Development (DevOps) is a

method that includes security practices as

early in the continuous delivery pipeline as

possible to increase cybersecurity, speed,

and quality of releases beyond what

DevOps practices can yield alone.

DevOps Foundation

Rugged DevOps

Rugged DevOps is a method that includes

security practices as early in the continuous

delivery pipeline as possible to increase

cybersecurity, speed, and quality of

releases beyond what DevOps practices

can yield alone.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Runbooks

A collection of procedures necessary for

the smooth operation of a service.

Previously manual in nature they are now

usually automated with tools like Ansible.

Site Reliability Engineering

Runtime Application

Self Protection (RASP)

Tools that actively monitor and block

threats in the production environment

before they can exploit vulnerabilities.

Site Reliability Engineering

Sanity Test

A very basic set of tests that determine if

software is functional at all.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Scalability

Scalability is a characteristic of a service

that describes its capability to cope and

perform under an increased or expanding

load.

Site Reliability Engineering

Scaled Agile

Framework (SAFe)

A proven, publicly available, framework for

applying Lean-Agile principles and

practices at an enterprise scale.

DevOps Foundation

SCARF Model

A summary of important discoveries from

neuroscience about the way people

interact socially.

DevOps Leader

Scheduling

The process of planning to

release changes into production.

DevOps Leader

Scrum

A simple framework for effective team

collaboration on complex projects. Scrum

provides a small set of rules that create "just

enough" structure for teams to be able to

focus their innovation on solving what

might otherwise be an insurmountable

challenge. (Scrum.org)

Certified Agile Service

Manager, DevOps

Foundation

Scrum Pillars

Pillars that uphold the Scrum framework

that include: Transparency, Inspection and

Adaptation.

Certified Agile Service

Manager

Scrum Team

A self-organizing, cross-functional team

that uses the Scrum framework to deliver

products iteratively and incrementally. The

Scrum Team consists of a Product Owner,

Developers, and a Scrum Master.

DevOps Foundation

Scrum Values

A set of fundamental values and qualities

underpinning the Scrum framework:

commitment, focus, openness, respect and

courage.

Certified Agile Service

Manager

Scrum Master

An individual who provides process

leadership for Scrum (i.e., ensures Scrum

practices are understood and followed)

and who supports the Scrum Team by

removing impediments.

Certified Agile Service

Manager, DevOps

Foundation

Secret Detection

Secret Detection aims to prevent

sensitive information, such as passwords,

authentication tokens, and private keys, from

being unintentionally leaked as part of the

repository content.

Site Reliability Engineering
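
As a rough illustration of the Secret Detection entry above, the sketch below scans text line by line for a few secret-like patterns. The rule names, the three regular expressions, and the `find_secrets` helper are all assumptions made for this example; real scanners use far larger and more careful rule sets.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded_password": re.compile(r"(?i)\bpassword\s*=\s*['\"][^'\"]+['\"]"),
}


def find_secrets(text):
    """Return (line_number, rule_name) pairs for every line that matches a rule."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


# Hypothetical file content with two planted secrets.
sample = 'user = "alice"\npassword = "hunter2"\nkey = "AKIAABCDEFGHIJKLMNOP"\n'
findings = find_secrets(sample)
```

A check like this would typically run before a commit is accepted or early in the pipeline, so that a finding blocks the change instead of being discovered after the repository content is shared.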

Secrets Management

Secrets management refers to the tools

and methods for managing digital

authentication credentials (secrets),

including passwords, keys, APIs, and tokens

for use in applications, services, privileged

accounts and other sensitive parts of the IT

ecosystem.

Site Reliability Engineering,

DevSecOps Foundation

Secure Automation

Secure automation removes the chance of

human error (and wilful sabotage) by

securing the tooling used across the

delivery pipeline.

Site Reliability Engineering

Security (Information

Security)

Practices intended to protect the

confidentiality, integrity and availability of

computer system data from those with

malicious intentions.

DevOps Foundation,

DevSecOps Foundation

Security as Code

Automating and building security into

DevOps tools and practices, making it an

essential part of tool chains and workflows.

DevOps Foundation,

DevSecOps Foundation

Security tests

The purpose of the test is to determine if an

EUT meets its security requirements. An

example is a test that determines if an EUT

processes login credentials properly.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Selenium

Popular open‐source tool for software

testing GUI and web applications.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Self-healing

Self-healing means the ability of services

and underlying environments to detect

and resolve problems automatically. It

eliminates the need for manual human

intervention.

Site Reliability Engineering

Serverless

A code execution paradigm in which the

user does not provision or manage the underlying

infrastructure or dependencies. Instead, a piece of code is

executed by a service provider (typically a

cloud provider), which takes over the creation of the

execution environment. AWS Lambda functions

and Azure Functions are examples.

Site Reliability Engineering

Service

Means of delivering value to customers by

facilitating outcomes customers want to

achieve without the ownership of specific

costs and risks.

DevOps Foundation,

DevSecOps Foundation

Service Desk

Single point of contact between the

service provider and the users. Tools like

ServiceNow are used for managing the

lifecycle of services as well as internal and

external stakeholder engagement.

DevOps Foundation

Service Level

Agreement (SLA)

Written agreement between an IT service

provider and its customer(s) that defines

key service targets and responsibilities of

both parties. An SLA may cover multiple

services or customers. (ITIL definition)

Site Reliability Engineering

Service Level Indicator

(SLI)

SLIs are used to communicate quantitative

data about services, typically to measure

how the service is performing against an

SLO.

Site Reliability Engineering

Service Level

Objective (SLO)

An SLO is a goal for how well a product or

service should operate. SLOs are set based

on what an organization is expecting from

a service.

Site Reliability Engineering
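
The relationship between an SLI and an SLO can be sketched numerically. The example below assumes a request-based SLI (good events divided by total events) and adds an error-budget calculation; the function names, the 99.9% target, and the traffic figures are hypothetical values chosen for illustration, not part of the definitions above.

```python
def sli(good_events, total_events):
    """SLI as the fraction of events that were good (1.0 when there is no traffic)."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_remaining(good_events, total_events, slo_target):
    """Fraction of the error budget still unspent; a negative value means the SLO is breached."""
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        return 0.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad


# Hypothetical month: 100,000 requests, 50 of them failed, SLO target of 99.9%.
measured = sli(99_950, 100_000)
meets_slo = measured >= 0.999
budget_left = error_budget_remaining(99_950, 100_000, 0.999)
```

Under these assumed numbers the service meets its SLO and has spent roughly half of its error budget.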

Seven Pillars of

DevOps

Seven distinct "pillars" provide a foundation

for DevOps systems which include

Collaborative Culture, Design for DevOps,

Continuous Integration, Continuous Testing,

Continuous Delivery and Deployment,

Continuous Monitoring and Elastic

Infrastructures and Tools.

Continuous Delivery

Ecosystem Foundation

Shift Left

An approach that strives to build quality

into the software development process by

incorporating testing early and often. This

notion extends to security architecture,

hardening images, application security

testing, and beyond.

DevOps Foundation,

DevSecOps Foundation

SilkTest

Automated function and regression testing

of enterprise applications. Licensed by

Borland.

Continuous Testing

Foundation

Simian Army

The Simian Army is a suite of failure-

inducing tools designed by Netflix. The most

famous example is Chaos Monkey which

randomly terminates services in production

as part of a Chaos Engineering approach.

Site Reliability Engineering

Single Point of Failure

(SPOF)

A single point of failure (SPOF) is a part of a

system that, if it fails, will stop the entire

system from working.

DevOps Foundation

Site Reliability

Engineering (SRE)

The discipline that incorporates aspects of

software engineering and applies them to

infrastructure and operations problems. The

main goals are to create scalable and

highly reliable software systems.

Site Reliability Engineering

Smoke Test

A basic set of functional tests that are run

immediately after a software component is

built. Same as CI Regression Test.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Snapshot

Report of pass/fail results for a specific

build.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Snippets

Stored and shared code snippets to allow

collaboration around specific pieces of

code. Also allows code snippets to be used

in other code-bases. BitBucket and GitLab

allow this.

Site Reliability Engineering

SOAP

Simple Object Access Protocol (SOAP) is an

XML-based messaging protocol for

exchanging information among

computers.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Software Composition

Analysis

A tool that checks for libraries or functions

in source code that have known

vulnerabilities.

DevSecOps Foundation

Software Defined

Networking (SDN)

Software-Defined Networking (SDN) is a

network architecture approach that

enables the network to be intelligently and

centrally controlled, or 'programmed,' using

software applications.

Site Reliability Engineering

Software Delivery

Lifecycle (SDLC)

The process used to design, develop and

test high quality software.

DevOps Leader, Site

Reliability Engineering

Software Version

Management System

A repository tool which is used to manage

software changes. Examples are: Azure

DevOps, BitBucket, Git, GitHub, GitLab,

VSTS.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Software‐as‐a‐Service

(SaaS)

Category of cloud computing services in

which software is licensed on a subscription

basis.

DevOps Foundation,

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Source Code Tools

Repositories for controlling source code for

key assets (application and infrastructure)

as a single source of truth.

DevOps Foundation,

DevOps Leader

Spotify Squad Model

An organizational model that helps teams

in large organizations behave like startups

and be nimble.

DevOps Foundation,

DevOps Leader

Sprint

A period of 2‐4 weeks during which an

increment of product work is completed.

Continuous Delivery

Ecosystem Foundation

Sprint (Scrum)

A time-boxed iteration of work during

which an increment of product

functionality is implemented.

DevOps Foundation

Sprint Backlog

Subset of the backlog that represents the

work that must be completed to realize the

Sprint Goal.

Certified Agile Service

Manager, DevOps

Foundation

Sprint Goal

Purpose and objective of a Sprint, often

expressed as a business problem that is

going to be solved.

Certified Agile Service

Manager

Sprint Planning

A 4 to 8-hour time-boxed event that

defines the Sprint Goal, the increment of

the Product Backlog that will be

completed during the Sprint and how it will

be completed.

Certified Agile Service

Manager

Sprint Retrospective

A 1.5 to 3-hour time-boxed event during

which the Team reviews the last Sprint and

identifies and prioritizes improvements for

the next Sprint.

Certified Agile Service

Manager

Sprint Review

A time-boxed event of 4 hours or less where

the Team and stakeholders inspect the

work resulting from the Sprint and update

the Product Backlog.

Certified Agile Service

Manager

Spyware

Software that is installed in a computer

without the user's knowledge and transmits

information about the user's computer

activities back to the threat agent.

DevSecOps Foundation

Squads

A cross-functional, co-located,

autonomous, self-directed team.

DevOps Leader

Stakeholder

Person who has an interest in an

organization, project or IT service.

Stakeholders may include customers, users

and suppliers. (ITIL definition).

DevOps

Foundation, DevSecOps

Foundation

Stability

How sensitive a service is to accepting

changes, and the negative impact that

may be caused by system changes.

A service may have reliability, in that it

functions over a long period of time, but

may not be easy to change and so does

not have stability.

Site Reliability Engineering

Standard Change

Pre-approved, low risk change that follows

a procedure or work instruction. (ITIL

definition)

DevOps

Foundation, DevSecOps

Foundation

Static Application

Security Testing (SAST)

A type of testing that checks source code

for bugs and weaknesses.

DevSecOps Foundation

Static Code Analysis

The purpose of the test is to detect source

code logic errors and omissions such as

memory leaks, unused variables and

unused pointers.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Status Page

Service pages that easily communicate the

status of services to customers and users.

Site Reliability Engineering

Sticks

Negative incentives, for discouraging or

punishing undesired behaviors.

DevSecOps Foundation

Storage Security

A specialty area of security that is

concerned with securing data storage

systems and ecosystems and the data that

resides on these systems.

Site Reliability Engineering

Stormstack

A commercial orchestration tool based on

event triggers instead of time based.

Continuous Testing

Foundation

StoStaKee

This stands for stop, start, and keep: this is

an interactive time-boxed exercise focused

on past events.

DevOps Leader

Strategic Sprint

A <4 week timeboxed Sprint during which

strategic elements that were defined

during Practice Planning are completed so

that the Team can move on to designing

the activities of the process.

Certified Agile Service

Manager

Structural Changes

Changes in the hierarchy of authority,

goals, structural characteristics,

administrative procedures and

management systems.

DevOps Leader

Supplier

External (third party) supplier, manufacturer

or vendor responsible for supplying goods

or services that are required to deliver IT

services.

DevOps Foundation

Synthetic Monitoring

Synthetic monitoring (also known as active

monitoring, or semantic monitoring) runs a

subset of an application's automated tests

against the system on a regular basis. The

results are pushed into the monitoring

service, which triggers alerts in case of

failures.

Continuous Delivery

Ecosystem Foundation
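
The loop described in the Synthetic Monitoring entry (run checks, push results, alert on failures) can be sketched as follows. The check functions and the `run_synthetic_checks` and `alerts_for` helpers are hypothetical stand-ins; a real setup would run on a schedule and send results to an actual monitoring service.

```python
def run_synthetic_checks(checks):
    """Run each named check and record True (pass) or False (fail or exception)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises is treated the same as a failed check.
            results[name] = False
    return results


def alerts_for(results):
    """Return the sorted names of failed checks, standing in for the alerting step."""
    return sorted(name for name, passed in results.items() if not passed)


# Hypothetical checks: one passes, one simulates a timeout.
def login_ok():
    return True


def checkout_broken():
    raise RuntimeError("timeout")


results = run_synthetic_checks({"login": login_ok, "checkout": checkout_broken})
alerts = alerts_for(results)
```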

System of Record

A system of record is the authoritative data

source for a data element or data entity.

DevOps

Foundation, DevSecOps

Foundation

System Test

The purpose of the test is to determine if a

complete system performs as expected in

its intended configurations.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

System Under Test

(SUT)

The EUT is an entire system, e.g. a bank teller

machine being tested.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Tag‐Based Test

Selection Method

Tests and Code modules are pre‐assigned

tags. Tests are selected for a build

matching pre‐assigned tags.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation
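
A minimal sketch of tag-based selection, assuming tags are plain strings and that a test is selected when it shares at least one tag with a changed code module. The test names, module names, and the `select_tests` helper are hypothetical.

```python
def select_tests(test_tags, changed_module_tags):
    """Return sorted test names whose tags overlap the tags of the changed modules."""
    wanted = set()
    for tags in changed_module_tags.values():
        wanted |= set(tags)
    return sorted(name for name, tags in test_tags.items() if wanted & set(tags))


# Hypothetical pre-assigned tags for tests and for the modules changed in a build.
test_tags = {
    "test_login": {"auth"},
    "test_checkout": {"payments", "cart"},
    "test_search": {"search"},
}
changed = {"auth_service.py": {"auth"}, "cart.py": {"cart"}}
selected = select_tests(test_tags, changed)
```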

Target Operating

Model

A description of the desired state of

the operating model of an organisation.

DevOps Leader

Teal Organization

An emerging organizational paradigm that

advocates a level of consciousness

including all previous world views within the

operations of an organisation.

DevOps Leader

Team Dynamics

A measurement of how a team works

together. Includes team culture,

communication styles, decision making

ability, trust between members, and the

willingness of the team to change.

DevOps Leader

Techno-Economic

Paradigm Shifts

Techno-economic paradigm shifts are at

the core of general, innovation-based

theory of economic and societal

development as conceived by Carlota

Perez.

DevOps Leader

Telemetry

Telemetry is the collection of

measurements or other data at remote or

inaccessible points and their automatic

transmission to receiving equipment for

monitoring.

Site Reliability Engineering

Test Architect

Person who has responsibility for defining

the overall end‐to‐end test strategy for an

EUT.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Artifact

Repository

Database of files used for testing.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Campaign

A test campaign may include one or more

test sessions.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Case

Set of test steps together with data and

configuration information. A test case has a

specific purpose to test at least one

attribute of the EUT.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Creation Methods

This is a class of test terms which refers to

the methodology used to create test

cases.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Driven

Development (TDD)

Test-driven development (TDD) is a

software development process in which

the developer writes a test before

composing code. They then follow this

process:

1. Write the test

2. Run the test and any others that are

relevant and see them fail

3. Write the code

4. Run test(s)

5. Refactor code if needed

6. Repeat

Unit level tests and/or application tests are

created ahead of the code that is to be

tested.

Continuous Delivery

Ecosystem Foundation,

DevOps Foundation,

Continuous Testing

Foundation
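
The test-first cycle listed in the Test Driven Development entry might look like this in miniature. The `slugify` function and its expected behaviour are invented for the example; only the ordering (test before code) comes from the definition.

```python
# Steps 1-2: the test is written first. If it were run at this point it would
# fail, because slugify() does not exist yet.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  SRE  Practitioner ") == "sre-practitioner"


# Step 3: write the minimal code that makes the test pass.
def slugify(title):
    """Lower-case the title and join its whitespace-separated words with hyphens."""
    return "-".join(title.lower().split())


# Step 4: run the test again; it now passes silently.
test_slugify()
```

Step 5 (refactoring) would then change the body of `slugify` while rerunning `test_slugify()` to confirm that behaviour is unchanged.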

Test Duration

The time it takes to run a test. E.g. # hours

per test

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Environment

The test environment refers to the

operating system (e.g. Linux, Windows

version etc.), configuration of software

(e.g. parameter options), dynamic

conditions (e.g. CPU and memory

utilization) and physical environment (e.g.

power, cooling) in which the tests are

performed.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Fast

A CT tenet referring to accelerated testing.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Framework

A set of processes, procedures, abstract

concepts and environment in which

automated tests are designed and

implemented.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Harness

A tool which enables the automation of

tests. It refers to the system test drivers and

other supporting tools that are required to

execute tests. It provides stubs and drivers

which are small programs that interact with

the software under test.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Hierarchy

This class of terms describes the

organization of tests into groups.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Methodology

This class of terms identifies the general

methodology used by a test. Examples are

White Box, Black Box

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test result repository

Database of test results.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Results Trend‐

based

A matrix of correlation factors correlates

test cases and code modules according to

test result (verdict).

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Roles

This class of terms identifies general roles

and responsibilities for people relevant to

testing.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Script

Automated test case. A single test script

may implement one or more test

cases depending on the data.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Selection Method

This class of terms refers to the method

used to select tests to be executed on a

version of an EUT.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Session

Set of one or more test suites that are run

together on a single build at a specific

time.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Suite

Set of test cases that are run together on a

single build at a specific time.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Trend

History of verdicts.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Type

Class that indicates what the purpose of

the test is.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Test Version

The version of files used to test a specific

build.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Tester

Individual who has responsibility to test a

system or service.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Testing Tools

Tools that verify code quality before

passing the build.

DevOps Leader

The Advice Process

Any person deciding must seek advice

from everyone meaningfully affected by

the decision and people with expertise in

the matter. Advice received must be taken

into consideration, though it does not have

to be accepted or followed. The objective

of the advice process is not to form

consensus, but to inform the decision-

maker so that they can make the best

decision possible. Failure to follow the

advice process undermines trust and

unnecessarily introduces risk to the business.

DevSecOps Foundation

The Checkbox Trap

The situation wherein an audit-centric

perspective focuses exclusively on

"checking the box" on compliance

requirements without consideration for

overall security objectives.

DevSecOps Foundation

The Power of TED

The Power of TED* offers an alternative to

the Karpman Drama Triangle with its roles

of Victim, Persecutor, and Rescuer. The

Empowerment Dynamic (TED) provides the

antidote roles of Creator, Challenger and

Coach and a more positive approach to

life's challenges.

DevOps Leader

The Sprint

A period of <4 weeks during which an

increment of work is completed.

Certified Agile Service

Manager

The Three Ways

Key principles of DevOps – Flow, Feedback,

Continuous experimentation and learning.

DevOps Foundation,

DevSecOps Foundation,

Site Reliability Engineering

Theory of Constraints

Methodology for identifying the most

important limiting factor (i.e., constraint)

that stands in the way of achieving a goal

and then systematically improving that

constraint until it is no longer the limiting

factor.

DevOps

Foundation, DevSecOps

Foundation

Thomas Kilmann

Inventory (TKI)

Measures a person's behavioral choices

under certain conflict situations.

DevOps Foundation

Threat Agent

An actor, human or automated, that acts

against a system with intent to harm or

compromise that system. Sometimes also

called a "Threat Actor."

DevSecOps Foundation

Threat Detection

Refers to the ability to detect, report, and

support the ability to respond to attacks.

Intrusion detection systems and denial-of-

service systems allow for some level of

threat detection and prevention.

Threat Intelligence

Information pertaining to the nature of a

threat or the actions a threat may be

known to be perpetrating. May also

include "indicators of compromise" related

to a given threat's actions, as well as a

"course of action" describing how to

remediate the given threat action.

DevSecOps Foundation

Threat Modeling

A method that ranks and models potential

threats so that the risk can be understood

and mitigated in the context of the value

of the application(s) to which they pertain.

DevSecOps Foundation

Time to Market

The period of time between when an idea

is conceived and when it is available to

customers.

DevOps Leader

Time to Value

Measure of the time it takes for the business

to realize value from a feature or service.

DevOps

Foundation, DevSecOps

Foundation

Time Tracking

Tools that allow for time to be tracked,

either against individual issues or other work

or project types.

Site Reliability Engineering

Timebox

Maximum duration of a Scrum event.

Certified Agile Service

Manager

Toil

A kind of work tied to running a production

service that tends to be manual, repetitive,

automatable, tactical, devoid of enduring

value.

Site Reliability Engineering

Tool

This class describes tools that orchestrate,

automate, simulate and monitor EUT's and

infrastructures.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Toolchain

A philosophy that involves using an

integrated set of complementary task-specific

tools to automate an end-to-end

process (vs. a single-vendor solution).

DevOps Foundation

Touch Time

In a Lean Production system the touch

time is the time that the product is actually

being worked on, and value is being

added.

DevOps Leader

Tracing

Tracing provides insight into the

performance and health of a deployed

application, tracking each function or

microservice which handles a given

request.

Site Reliability Engineering

Traffic Volume

The amount of data sent and received by

visitors to a service (e.g. a website or API).

Site Reliability Engineering

Training From the Back

of the Room

An accelerated learning model in line with

agile values and principles using the 4Cs

instructional design “map” (Connection,

Concept, Concrete Practice, Conclusion).

Transformational

Leadership

A leadership model in which leaders inspire

and motivate followers to achieve higher

performance by appealing to their values

and sense of purpose, facilitating wide-

scale organizational change (State of

DevOps Report, 2017).

DevOps Leader

Tribe Lead

A senior technical leader that has broad

and deep technical expertise across all the

squads' technical areas. A group of squads

working together on a common feature

set, product or service is a tribe in Spotify's

definitions.

DevOps Leader

Tribes

A collection of squads with a long-term

mission that work on/in a related business

capability.

DevOps Leader

Trojan (horses)

Malware that carries out malicious

operations under the appearance of a

desired operation such as playing an online

game. A Trojan horse differs from a virus

because the Trojan binds itself to non-

executable files, such as image files, audio

files whereas a virus requires an executable

file to operate.

DevSecOps Foundation

Trunk

The primary source code integration

repository for a software product.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Unit Test

The purpose of the test is to verify code

logic.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Usability Test

The purpose of the test is to determine if

humans have a satisfactory experience

when using an EUT.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

User

Consumer of IT services. Or, the identity

asserted during authentication (aka

username).

DevOps

Foundation, DevSecOps

Foundation

User and Entity

Behavior Analytics

(UEBA)

A machine learning technique to analyze

normal and “abnormal” user behaviour

with the aim of preventing the latter.

Site Reliability Engineering

User Story

A brief statement used to describe a

requirement from a user’s perspective. User

stories are used to facilitate

communication, planning, and negotiation

activities between the stakeholders and

the Agile Service Management Team.

Certified Agile Service

Manager

Value Added Time

The amount of time spent on an activity

that creates value (e.g., development,

testing).

DevOps Leader

Value Efficiency

Being able to produce value with the

minimum amount of time and resources.

DevOps Leader

Value Stream

All of the activities to go from a customer

request to a delivered product or service.

DevOps Foundation

Value Stream Map

Visually depicts the end-to-end flow of

activities from the initial request to value

creation for the customer.

Certified Agile Service

Manager

Value Stream

Mapping

Lean tool that depicts the flow of

information, materials and work across

functional silos with an emphasis on

quantifying waste, including time and

quality.

DevOps Foundation

Value Stream

Management

The ability to visualize the flow of value

delivery through the DevOps lifecycle.

GitLab CI and the Jenkins extension (from

CloudBees) DevOptics can provide this

visualization.

Certified Agile Service

Manager, Site Reliability

Engineering

Variable Speed IT

An approach where traditional and digital

processes co-exist within an organization

while moving at their own speed.

DevOps Foundation

Velocity

Measure of the quantity of work done in a

pre-defined interval. The amount of work

an individual or team can complete in a

given amount of time.

Certified Agile Service

Manager, DevOps

Foundation, DevSecOps

Foundation, Site Reliability

Engineering

Verdict

Test result classified as Fail, Pass or

Inconclusive.

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Version control tools

Ensure a 'single source of truth' and enable

change control and tracking for all

production artifacts.

DevOps Foundation

Vertical Scaling

Computing resources are scaled higher to

increase processing speed e.g. using faster

computers to run more tasks faster.

Continuous Testing

Foundation

Virus (Computer)

Malicious executable code attached to a

file that spreads when an infected file is

passed from system to system. It may

be harmless (but annoying), or it may

modify or delete data.

DevSecOps Foundation

Voice of the

Customer (VOC)

A process that captures and analyzes

customer requirements and feedback to

understand what the customer wants.

DevOps Foundation

Vulnerability

A weakness in a design, system, or

application that can be exploited by an

attacker.

DevSecOps Foundation

Vulnerability

Intelligence

Information describing a known

vulnerability, including affected software

by version, relative severity of the

vulnerability (for example, does it result in

escalation of privileges for user role, or does

it cause a denial of service), exploitability

of the vulnerability (how easy/hard it is to

exploit), and sometimes current rate of

exploitation in the wild (is it being actively

exploited or is it just theoretical). This

information will also often include

guidance on what software versions are

known to have remediated the described

vulnerability.

DevSecOps Foundation

Vulnerability

management

The process of identifying and remediating

vulnerabilities.

DevSecOps Foundation

Wait Time

The amount of time wasted on waiting for

work (e.g., waiting for development and

test infrastructure, waiting for resources,

waiting for management approval).

DevOps Leader

Waste (Lean

Manufacturing)

Any activity that does not add value to a

process, product or service.

Certified Agile Service

Manager, DevOps

Foundation, DevOps

Leader

Water‐scrum‐fall

A hybrid approach to application lifecycle

management that combines waterfall and

Scrum development.

Continuous Delivery

Ecosystem Foundation

Waterfall (Project

Management)

Linear and sequential approach to

managing software design and

development projects in which progress is

seen as flowing steadily (and sequentially)

downwards (like a waterfall).

Certified Agile Service

Manager, Continuous

Delivery Ecosystem

Foundation, DevOps

Foundation

Weakness

An error in software that can be exploited

by an attacker to compromise the

application, system, or the data contained

therein. Also called a vulnerability.

DevSecOps Foundation

Web Application

Firewall (WAF)

Tools that examine traffic being sent to an

application and can block anything that

looks malicious.

Site Reliability Engineering

Web IDE

Tools that have a web client integrated

development environment. Enables

developer productivity without having to

use a local development tool.

Site Reliability Engineering

Westrum

(Organization

Types)

Ron Westrum developed a typology of

organizational cultures that includes

three types of organizations:

Pathological (power-oriented),

Bureaucratic (rule-oriented) and

Generative (performance-oriented).

DevSecOps Foundation,

Site Reliability

Engineering

White‐Box Testing

(or Clear-, Glass-,

Transparent-Box

Testing or Structural

Testing)

Test cases use extensive knowledge of

the internal design structure or workings

of an application, as opposed to its

functionality (i.e. Black-Box Testing).

Continuous Delivery

Ecosystem Foundation,

Continuous Testing

Foundation

Whitelisting

Application whitelisting is the practice

of specifying an index of approved

software applications that are

permitted to be present and active on

a computer system.

Continuous Delivery

Ecosystem Foundation

Wicked Questions

Wicked questions are used to expose

the assumptions which shape our

actions and choices. They

are questions that articulate the

embedded, and often contradictory

assumptions, we hold about an issue, a

problem or a context.

DevOps Leader

Wiki

Knowledge sharing can be enabled by

using tools like Confluence which

create a rich Wiki of content.

Site Reliability

Engineering

Wilber's Quadrants

A model that recognises four modes of

general approach for human beings.

Two axes are used: on one axis people

tend towards individuality OR

collectivity.

DevOps Leader

Work in Progress

(WIP)

Any work that has been started but has

not been completed.

DevOps Foundation

Workaround

Temporary way to reduce or eliminate

the impact of incidents or problems.

May be logged as a known error in the

Known Error Database. (ITIL definition).

DevOps

Foundation, DevSecOps

Foundation

World Café

A structured conversational process

for knowledge sharing in which groups

of people discuss a topic at several

tables, with individuals switching tables

periodically and getting introduced to

the previous discussion at their new

table by a "table host".

DevOps Leader

Worms (Computer)

Worms replicate themselves on a

system by attaching themselves to

different files and looking for pathways

between computers. They usually slow

down networks and can run by

themselves (where viruses need a host

program to run).

DevSecOps Foundation

SRE Practitioner Course: Value Added Resources

Videos Featured in the Course

Module

Title & Description

Link

Module 1

SRE Anti-patterns

Persistent SRE Antipatterns with

Blake Bisset and Jonah Horowitz

https://www.youtube.com/watch?v=7Y06GIHlZl8

Module 2

SLO is the Proxy for Customer

Happiness

SLI/SLOs Deep Dive with

David Blank Edelman

https://www.youtube.com/watch?v=dplGoewF4DA

Module 3

Building Secure and Reliable Systems

Building Secure & Reliable Systems

with Heather Adkins

https://youtu.be/0LlBmPW3F1c?t=690

Assignment

Non-Abstract Large Scale Design

Assignment

Google SRE classroom

https://www.youtube.com/watch?v=bOXkgMuVuYY

Module 4

Full Stack Observability

OpenTelemetry with

Constance Caramanolis

https://youtu.be/S0-t-Mgbhsc?t=119

Module 5

Using Platform Engineering & AIOps

AI in Ops with Stylianos Kampakis

https://youtu.be/GSS_rTXkpFU?t=203

This document provides links to articles and videos related to the Site Reliability

Engineering (SRE) Practitioner course from DevOps Institute. This information is

provided to enhance your understanding of SRE-related concepts and terms

and is not examinable. Of course, there is a wealth of other videos, blogs and

case studies on the web. We welcome suggestions for additions.

Module 6

SRE & Incident Response

Management

Runbook Automation with

Damon Edwards

https://youtu.be/uyJ-FJXD5co?t=140

Assignment

Instrumenting Gremlin Assignment

https://youtu.be/w_Y6C0QgmL0?t=2912

Module 7

Chaos Engineering

Practical Chaos Engineering

Adrian Hornsby

https://www.youtube.com/watch?t=733&v=w_Y6C0QgmL0&feature=youtu.be

Module 8

SRE is the Purest Form of DevOps

SRE and Digital Business

https://youtu.be/T01ge8byOoU?t=25

Case Studies Featured in the Course

Module

Title & Description

Link

Module 1

SRE Anti-patterns

Defense in Depth works for Reliability

– Monzo Bank

https://youtu.be/OUYTNywPk-s?t=148

Module 2

SLO is the Proxy for Customer

Happiness

Home Depot

https://sre.google/workbook/slo-engineering-case-studies/

Module 2

SLO is the Proxy for Customer

Happiness

Kudos Engineering

https://youtu.be/KmVDkBmnb4U?t=6

Module 2

SLO is the Proxy for Customer

Happiness

Cloud SLAs

AWS summary of SLAs

SLAs for Microsoft Azure

Google Cloud Platform SLAs

Module 3

Building Secure and Reliable Systems

Google Chrome Security Team

https://youtu.be/fNyT7HNKQfk?t=332

https://learning.oreilly.com/library/view/building-secure-and/9781492083115/ch19.html#onenine_case_study_chrome_security_tea

Module 4

Full Stack Observability

Planet Case Study

https://youtu.be/5aNeNhKNlUM?t=12

Module 5

Using Platform Engineering & AIOps

How FedEx uses AIOps to improve

Operational Efficiencies

https://opusresearch.net/wordpress/2017/10/02/case-study-how-fedex-is-leveraging-intelligent-assistants-ai-and-natural-language-understanding/

Module 5

Using Platform Engineering & AIOps

How 3M Modernized IT Event

Management and Alerting

Using AIOPs

https://www.splunk.com/en_us/form/how-3m-modernized-it-event-management-and-alerting-with-splunk.html

Module 6

SRE & Incident Response

Management

HCL helps its customers better

manage and monitor their modern IT

environments

https://www.moogsoft.com/resources/aiops/case-study/moogsoft-hcl-technologies-case-study/

Module 8

SRE is the Purest Form of DevOps

AirBnB’s adoption of practical SRE

https://youtu.be/T01ge8byOoU?t=25

References to Articles

Module

Title & Description

Link

1. SRE Anti-patterns

Pitfalls on the Road to

Creating a Successful

SRE Program Like Netflix

and Google

https://www.usenix.org/conference/lisa17/conference-program/presentation/bisset

1. SRE Anti-patterns

Google Explains Why

Others Are Doing SRE

Wrong

https://www.infoq.com/news/2018/07/google-explains-sre/

1. SRE Anti-patterns

Pets, Cattle, Chickens,

and Snowflakes

https://subscription.packtpub.com/book/virtualization_and_cloud/9781785882753/1/ch01lvl1sec08/pets-cattle-chickens-and-snowflakes

1. SRE Anti-patterns

TechBiz Do you know

what SRE is and what it

can do for your

business?

https://en.paradigmadigital.com/techbiz/do-you-know-what-sre-is-and-what-it-can-do-for-your-business/

1. SRE Anti-patterns

SRE Anti-patterns in

everyday life and what

do they teach us

https://content.sonatype.com/2020addo-sre/addo2020-sre-petoff?__hstc=160429922.baf9e6a9a6b98c7f5180e92ae5a71875.161348730006.11615115690600.1615123090615.4&__hssc=160429922.3.1615123090615&__hsfp=1318888879

1. SRE Anti-patterns

How to "SRE" a Travel

Emergency

https://www.sidewalksafari.com/2018/12/sre-in-a-travel-emergency.html

1. SRE Anti-patterns

Site Reliability

Engineering; that’s

music to my ears!

SRE@bol.com

https://techlab.bol.com/site-reliability-engineering-thats-music-to-my-ears/

1. SRE Anti-patterns

SRE Anti-Pattern: “Do it.

Do it again. Then do it

again.”

https://www.rundeck.com/blog/sre-anti-pattern-do-it-then-do-it-again

1. SRE Anti-patterns

4 DevOps Anti-patterns

That Lead to Disaster

https://techbeacon.com/devops/4-devops-anti-patterns-lead-disaster

1. SRE Anti-patterns

Postmortem Culture:

Learning from Failure

https://sre.google/workbook/postmortem-culture/

2. SLO is the proxy for

Customer Happiness

10 Steps to Implement

SLO in Your Organisation

https://content.sonatype.com/2020addo-sre/addo2020-sre-barteneva?__hstc=160429922.baf9e6a9a6b98c7f5180e92ae5a71875.1613487300061.1615115690600.1615123090615.4&__hssc=160429922.3.1615123090615&__hsfp=1318888879

2. SLO is the proxy for

Customer Happiness

I want all the 9s….. in my

SLO

https://www.youtube.com/watch?v=KhJbbrKy1pw&t=2268s

2. SLO is the proxy for

Customer Happiness

Evernote’s SLO Story

(Case Study)

https://sre.google/workbook/slo-engineering-case-studies/

2. SLO is the proxy for

Customer Happiness

The Home Depot Case

Story VALET

https://sre.google/workbook/slo-engineering-case-studies/

2. SLO is the proxy for

Customer Happiness

ERROR BUDGET Practical

application when 3rd

party software is

involved

https://youtu.be/uBbE8HTXbaw?t=882

3. Building Secure

and Reliable Systems

Non-Abstract Large-

Scale Design NALSD

https://docs.google.com/presentation/d/1jW2S9yYZf5DYmri0KlOu1DFSZMTeOswl6ce5V9xMIOQ/edit?resourcekey=0-

3. Building Secure

and Reliable Systems

The intersection of

Security and Reliability

https://learning.oreilly.com/library/view/building-secure-and/9781492083115/ch01.html#reliability_versus_security_design_cons

3. Building Secure

and Reliable Systems

How do you design for a

changing landscape?

https://learning.oreilly.com/library/view/building-secure-and/9781492083115/ch07.html#design_for_a_changing_landscape

3. Building Secure

and Reliable Systems

Building successful SRE in

large enterprises

https://www.oreilly.com/library/view/velocity-conference/9781492025870/video323188.html

3. Building Secure

and Reliable Systems

Data Privacy and

Security

https://www.varonis.com/blog/data-privacy/

3. Building Secure

and Reliable Systems

"Building Reliable

Systems Masterclass"

course

http://www.russmiles.com/building-reliable-systems.html

3. Building Secure

and Reliable Systems

Clarifying Containers,

Microservices, and

Kubernetes

https://devopsinstitute.com/clarifying-containers-microservices-and-kubernetes-with-tracy-ragan-of-deployhub-e10/

3. Building Secure

and Reliable Systems

Cloud Operations and

Analytics

https://www.slideshare.net/JorgeCardoso4/cloud-operations-and-analytics-improving-distributed-systems-reliability-using-fault-injection

3. Building Secure

and Reliable Systems

Kubernetes Up &

Running

https://clouddamcdnprodep.azureedge.net/gdc/gdckTlBtc/original

4. Full Stack

Observability

A Collection of Best

Practices for Production

Services

https://sre.google/sre-book/service-best-practices/

4. Full Stack

Observability

Differences Between

Synthetic Monitoring

and Real User

Monitoring

https://stackify.com/rum-vs-synthetic-monitoring/

4. Full Stack

Observability

Good article on

Observability and

Monitoring

https://youtu.be/pY44UX8j4Pc?t=26

4. Full Stack

Observability

Observability — A 3-

Year Retrospective

https://thenewstack.io/observability-a-3-year-retrospective/

4. Full Stack

Observability

Monitoring and

Observability — What’s

the Difference and Why

Does It Matter?

https://thenewstack.io/monitoring-and-observability-whats-the-difference-and-why-does-it-matter/

4. Full Stack

Observability

Observability at Google

https://www.oreilly.com/library/view/observability-at-google/0636920424239/video329911.html

5. Platform

Engineering and

AIOPs

Building Self-Healing

with AIOps

https://www.dynatrace.com/news/blog/shift-left-sre-building-self-healing-into-your-cloud-delivery-pipeline/

5. Platform

Engineering and

AIOPs

The Seven Steps to

Implement #DataOps

https://www.youtube.com/watch?v=muhs8zJnETM

5. Platform

Engineering and

AIOPs

Proactively Detect

Unusual Behavior

https://newrelic.com/lp/aiops?utm_campaign=AIOps-Emerging&utm_medium=cpc&utm_source=google&utm_content=AIO_LP&fiscal_year=FY21&quarter=Q4&gtm=OPS&program=aiops&ad_type=None&geo=EMERGING&utm_term=aiops&utm_device=c&_bt=504169602696&_bm=e&_bn=g&gclid=Cj0KCQjw5PGFBhC2ARIsAIFIMNeggGr9EMZ_f9XTeybtxJldKxxZEFp1g4oMXpPyaOq5NUBHT5FyWFUaAn6uEALw_wcB

5. Platform

Engineering and

AIOPs

Gartner Market Guide

for AIOps Platforms,

2021

https://digitate.com/market-guide-for-aiops-platforms/?utm_source=google&utm_medium=search&utm_campaign=corporate-aiops&utm_content=aiops-gartner-Aiops

5. Platform

Engineering and

AIOPs

A successful digital

transformation requires

AIOps

https://sciencelogic.com/solutions/aiops

5. Platform

Engineering and

AIOPs

The Rise of Platform

Engineering

https://softwareengineeringdaily.com/2020/02/13/setting-the-stage-for-platform-engineering/

5. Platform

Engineering and

AIOPs

Top 5 things to know

about Platform

Engineering

https://www.youtube.com/watch?v=htQfjklTNrM

6. SRE & Incident

Response

Management

High Velocity IT

https://www.axelos.com/news/blogs/january-2020/itil-4-high-velocity-it-the-digital-enterprise

6. SRE & Incident

Response

Management

Valuable Investments,

Fast Development,

Resilient Operations

https://www.itsmacademy.com/content/What_is_ITIL_4_HVIT.pdf

6. SRE & Incident

Response

Management

Recovery Point

Objective (RPO)

https://whatis.techtarget.com/definition/recovery-point-objective-RPO

6. SRE & Incident

Response

Management

DevOps vs. ITIL 4 vs. SRE:

Stop the arguments

https://enterprisersproject.com/article/2019/11/devops-vs-itil4-vs-SRE

6. SRE & Incident

Response

Management

Welcome to the future

of ITIL 4

https://www.axelos.com/welcome-to-itil-4

6. SRE & Incident

Response

Management

Unmanaged Incidents

https://sre.google/sre-book/managing-incidents/

6. SRE & Incident

Response

Management

Being on call?

https://response.pagerduty.com/about/

6. SRE & Incident

Response

Management

Why a 3-tier support

should be replaced with

SWARMING

https://jonstevenshall.medium.com/itsm-devops-and-why-the-three-tier-structure-must-be-replaced-with-swarming-91e76ba22304

6. SRE & Incident

Response

Management

This is How to Use ITIL,

DevOps, and SRE Best

Practices

https://www.blameless.com/blog/itil-devops-sre-work-together

6. SRE & Incident

Response

Management

Incident Response

Training,

PagerDuty Academy

https://response.pagerduty.com/training/courses/incident_response/

6. SRE & Incident

Response

Management

Tracking Every Release

https://codeascraft.com/2010/12/08/track-every-release/

6. SRE & Incident

Response

Management

Accelerating SREs to

On-Call and Beyond

https://sre.google/sre-book/accelerating-sre-on-call/

7. Chaos Engineering

Chaos engineering:

Stress Testing the Cloud

https://www2.deloitte.com/us/en/pages/consulting/articles/chaos-engineering-stress-testing-the-cloud-sre-devops-cloud-value-devops-reliability-risk-management-test-management.html

7. Chaos Engineering

Disaster Recovery

Testing (DiRT)

Test Template

https://docs.google.com/document/d/1nxYuX62SvKst9YuozJCsBEWU9AltvtV9mgnd-HbzBrA/edit#heading=h.hpgc9ckwivdb

7. Chaos Engineering

Security Chaos

Engineering

https://www.oreilly.com/library/view/security-chaos-engineering/9781492080350/

7. Chaos Engineering

Principles of Chaos

Engineering

https://principlesofchaos.org/

7. Chaos Engineering

Chaos Engineering: the

history, principles, and

practice

https://www.gremlin.com/community/tutorials/chaos-engineering-the-history-principles-and-practice/

7. Chaos Engineering

How to Use Chaos

Engineering to Break

Things Productively

https://www.infoq.com/articles/chaos-engineering-security-networking/

7. Chaos Engineering

GameDay Case Study

https://queue.acm.org/detail.cfm?id=2371297

7. Chaos Engineering

Security and chaos

engineering

https://www.rochestersecurity.org/wp-content/uploads/2018/10/RSS2018-B1.pdf

8. SRE is the Purest

Form of DevOps

SRE Essentials

https://learning.oreilly.com/playlists/7b526ba0-0ba2-4d89-baac-25e9f3877d7f/

8. SRE is the Purest

Form of DevOps

97 Things Every SRE

Should Know

https://learning.oreilly.com/library/view/97-things-every/9781492081487/

8. SRE is the Purest

Form of DevOps

Building an SRE

Organisation

https://www.slideshare.net/FranklinAngulo1/building-an-sre-organization-squarespace

8. SRE is the Purest

Form of DevOps

A Day in the Life of a

New SRE

https://blog.newrelic.com/engineering/what-does-an-sre-do/

8. SRE is the Purest

Form of DevOps

The Evolving SRE

Engagement Model

https://sre.google/sre-book/evolving-sre-engagement-model/

8. SRE is the Purest

Form of DevOps

I’m SRE and You Can

Too! —A Fine Manual.

https://youtu.be/Cg877bv_xig?t=1027

8. SRE is the Purest

Form of DevOps

I’m an SRE Lead! Now

What? How to Bootstrap

and Organize Your SRE

Team

https://www.youtube.com/watch?v=KbKfAwPbQgk

8. SRE is the Purest

Form of DevOps

Psychological Safety for

SRE

https://www.oreilly.com/library/view/seeking-sre/9781491978856/ch27.html#psychological_safety_in_sre

8. SRE is the Purest

Form of DevOps

Liz and Dave on how to

implement SRE

https://learning.oreilly.com/videos/velocityconference/9781492025870/9781492025870-video323188

SRE Book References, Articles and Reports

Site Reliability Engineering

https://landing.google.com/sre/sre-book/toc/index.html

The Site Reliability

Workbook

https://landing.google.com/sre/workbook/toc/

https://learning.oreilly.com/library/view/site-reliability-engineering/9781491929117/

SRE Essentials

https://learning.oreilly.com/playlists/7b526ba0-0ba2-4d89-baac-25e9f3877d7f/

Seeking SRE

https://learning.oreilly.com/library/view/seeking-sre/9781491978856/

Building Secure and

Reliable Systems

https://learning.oreilly.com/library/view/building-secure-and/9781492083115/

Database Reliability

Engineering

https://learning.oreilly.com/library/view/database-reliability-engineering/9781491925935/

Practical Site Reliability

Engineering

https://learning.oreilly.com/library/view/practical-site-reliability/9781788839563/

Chaos Engineering

https://learning.oreilly.com/library/view/chaos-engineering/9781492043850/

Security Chaos

Engineering

https://www.oreilly.com/library/view/security-chaos-engineering/9781492080350/

Real-World SRE

https://www.amazon.com/Real-World-SRE-Survival-Responding-Maximizing/dp/1788628888?asin=1788628888&revisionId=&format=4&depth=1

2021 SRE Report by

Catchpoint

https://pages.catchpoint.com/2021-sre-report

SRE Practitioner v1.0

Sample Examination

with Answer Key

1. Why are Containers important for modern/distributed architecture?

A. Containers lead to consistent storage of code that keeps applications fresh

B. Containers are the only processes that run on Cloud

C. Containers lead to consistent development and deployment methodologies that

can be iterated easily

D. Containers are the processes that support infrastructure as code

2. How can you secure your Docker containers?

A. Choose third-party containers carefully and consider 3rd-party tools to secure them

B. Enable Docker content trust

C. Set resource limit for your containers

D. All of the above

3. Prompted by customer complaints, an organization is looking for ways to improve its

major incident management process. Customer complaints include the frequency of

system outages and the lack of communication during outages. A team has identified

several improvement opportunities. Which would address the customers’ concerns?

A. Invest in detection and alerting systems

B. Establish an incident command framework

C. Implement a new incident management system

D. Use swarming to engage multiple people in incident resolution

4. What is the best definition for DNS?

A. The domain name system (DNS) is a decentralized naming system for resources

connected to the internet or a private network

B. Humans cannot feasibly remember IP addresses, so DNS allows the assigning of a

human-readable name, such as google.com, to use in place of the IP address

C. A relational database that has metadata of IPs

D. All of the above

5. What are the 3 most important things to consider when considering Data

architecture?

A. Security, Performance and Network

B. Security, Compliance and Performance

C. Confidentiality, Integrity and Availability

D. Compliance, Reliability and Security

6. Which type of cloud deployment model reduces the complexity of building, testing,

and deploying applications by keeping the developers inside a well-defined

environment, which limits the ability for the developers to make mistakes?

A. Software as a Service

B. Integration as a Service

C. Infra as a Service

D. Platform as a Service

7. A company is working with partners to develop a new cloud service to be used in a

heavily regulated industry. It wants to ensure the components that are consumed by

development teams have the necessary governance, controls and standards built-in.

Which approach would achieve this?

A. Adopt a Platform SRE approach

B. Embed SREs with development teams

C. Establish an SRE Center of Excellence

D. Design for security

8. System boundaries help in defining meaningful SLAs. The key points to consider are:

A. Be logical to define a clear capability

B. Base it on domain-based design principles

C. Customer-facing capabilities only

D. Tracking on a system boundary that is larger than individual system

components

9. What is telemetry?

A. Telemetry is the process of recording the behavior of your systems

B. Telemetry is a widely known Software as a Service (SaaS) tool to plan and execute

DevOps projects

C. Telemetry is a communication tool used by DevOps teams at geographically

distributed locations

D. None of these

10. What is the error budget for an SLO of 95% of pages served in less than 200ms over the

past 6 hours?

A. Allow 5% failure of page requests served in < 200ms over past 24 hours

B. Allow 5% failure of page requests served in < 200ms over past 6 hours

C. Allow 5% failure of 95th percentile latency over past 6 hours

D. Allow 5% failure of service availability for the past 6 hours
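
As background for this question, an error budget is conventionally the complement of the SLO target, measured over the same SLI and the same window. The sketch below turns a target into a count of tolerated bad events. The function names and the traffic figures are illustrative assumptions and do not come from the course material.

```python
def allowed_bad_events(slo_target: float, total_events: int) -> int:
    """Number of events that may miss the objective before the budget is spent."""
    return round((1.0 - slo_target) * total_events)


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> int:
    """Bad events still tolerated in the window (negative means overspent)."""
    return allowed_bad_events(slo_target, total_events) - bad_events


# Hypothetical traffic for one 6-hour window (assumed numbers).
requests_in_window = 120_000
slow_requests = 4_500  # requests that took 200 ms or longer

budget = allowed_bad_events(0.95, requests_in_window)
remaining = budget_remaining(0.95, requests_in_window, slow_requests)
print(f"budget={budget} remaining={remaining}")
```

With a 95% target, 5% of the requests in the window may miss the latency threshold; the remaining budget shows how much room is left before the SLO is breached.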

11. Which of the following practices supports SRE The First Way?

A. Shared On-call Rotation

B. Value Stream Mapping

C. Agile Process

D. Improvement Kata model

12. When a 3rd party downstream system provides an error rate of 1%, and your backend

has an error rate of 0.1%, what is the error rate that will be inherited by your

middle-tier?

A. 1.1%

B. 0.1%

C. 1%

D. 1.2%
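
One way to reason about inherited error rates is to multiply the success probabilities of the dependencies. The sketch below does that under two assumptions that the question does not state: every middle-tier request calls both dependencies, and their failures are independent. The function name is hypothetical.

```python
def combined_error_rate(*error_rates: float) -> float:
    """Error rate seen by a caller that needs every dependency to succeed."""
    success = 1.0
    for rate in error_rates:
        success *= 1.0 - rate
    return 1.0 - success


third_party = 0.01   # 1% error rate from the downstream system
backend = 0.001      # 0.1% error rate from the backend

exact = combined_error_rate(third_party, backend)
approximate = third_party + backend  # simple sum, close when rates are small

print(f"exact={exact:.4%} approximate={approximate:.4%}")
```

For small rates the simple sum is a close upper bound on the exact figure, which is why back-of-the-envelope estimates usually just add the error rates.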

13. An SRE wants to establish SLOs for the customer-facing capabilities of a system. This is a

complex system that relies on several third-party services. What should the SRE identify

FIRST?

A. System boundaries

B. SLIs for system components

C. Error rates of third-party services

D. SLIs for each capability

14. What are the 2 major tenets of a major incident?

A. Communication and collaboration

B. Resolution and lesson learned

C. Rolling out mitigation and monitoring

D. Identifying cause and resolution

15. For implementing Service Level Objectives (SLOs), Jane used request-driven services

such as HTTP servers and Application Programming Interface (API) endpoints. She and

the development team decided the SLOs should be against a 2-week sprint cycle and

that the Service Level Indicator (SLI) on both availability and latency should be based

on remote probes sent every minute. This helped drive product resilience because:

A. SLOs and Error Budgets help drive reliability if built correctly

B. Error budgets help with system technical debt

C. SLOs and Error Budgets helped place request-based service with dashboards

D. All of the above

16. If your error budget allows 1% failure of 95th percentile home page latency over 5 minutes <

200ms for the past month, what is the Service Level Indicator (SLI) for this home page?

A. 95th percentile of home page latency over 5 mins < 200ms

B. 99% of 95th percentile of home page latency over 5 mins < 200ms for the past

month

C. 99% of home page should be available for 99% of time over 30 days

D. None of the above
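
To keep the three terms in this question apart, it can help to see them as separate steps: the SLI is the measurement, the SLO is the target on that measurement, and the error budget is what the target leaves over. The sketch below assumes a nearest-rank percentile and made-up latency samples; all names and numbers are illustrative only.

```python
def p95(samples_ms):
    """Nearest-rank 95th percentile of a list of latencies."""
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceiling of 0.95 * n, in integers
    return ordered[rank - 1]


def window_is_good(samples_ms, threshold_ms=200):
    """SLI for one 5-minute window: is the p95 latency under the threshold?"""
    return p95(samples_ms) < threshold_ms


def compliance(windows, threshold_ms=200):
    """Fraction of windows in the period whose SLI was good."""
    good = sum(1 for w in windows if window_is_good(w, threshold_ms))
    return good / len(windows)


# Hypothetical data: 99 healthy windows and 1 degraded window.
fast_window = [120] * 19 + [450]        # p95 is 120 ms
slow_window = [120] * 18 + [450, 480]   # p95 is 450 ms
windows = [fast_window] * 99 + [slow_window]

slo_target = 0.99                       # objective over the period
achieved = compliance(windows)
slo_met = achieved >= slo_target
error_budget = 1.0 - slo_target         # share of windows allowed to be bad
print(f"achieved={achieved:.2%} slo_met={slo_met}")
```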

17. SREs are deployed at application level, system level, and enterprise level based on the

organizational context. When working at a system level, they participate in the

following activities:

A. Application system design

B. System architecture readiness

C. Engineering center for excellence

D. None of the above

18. Provide a rough estimate of the bandwidth requirement at a peak of 1.25 times the average

load. Facts: 200 TB of data per day. (Calculate the per-second requirement.)

A. 31 Gbps

B. 60 Gbps

C. 45 Gbps

D. 10 Gbps

19. Joe is trying to estimate (on the higher side) the bandwidth requirements at average

load on a metadata size of 8 KB. The assumptions are there are 1 million users who

search 50 times a day and 10 results are yielded per search. What is the average load

in terms of bandwidth per second?

A. 20 MB/sec

B. 50 MB/sec

C. 40 MB/sec

D. 10 MB/sec
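
The estimate in this question is plain multiplication followed by a division by the number of seconds in a day. The sketch below walks through it, assuming decimal units (1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes) and traffic spread evenly across the day. The function name is hypothetical.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def average_bandwidth_mb_per_s(users, searches_per_user, results_per_search, result_size_kb):
    """Average outbound bandwidth in MB/s, using decimal units."""
    bytes_per_day = users * searches_per_user * results_per_search * result_size_kb * 1000
    return bytes_per_day / SECONDS_PER_DAY / 1_000_000


avg = average_bandwidth_mb_per_s(
    users=1_000_000,
    searches_per_user=50,
    results_per_search=10,
    result_size_kb=8,
)
# Estimating "on the higher side": round up to the next 10 MB/s.
rounded_up = math.ceil(avg / 10) * 10
print(f"average={avg:.1f} MB/s, rounded up={rounded_up} MB/s")
```

Using binary units (1 KB = 1,024 bytes) shifts the average by only a few percent, so the rounded-up figure stays the same.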

20. Non-Abstract Large-Scale Design (NALSD) is a critical skill for SRE because:

A. It provides a way to design upfront, real-world requirements

B. Component isolation, graceful degradation and capacity planning are thought

through well to avoid wastage

C. It is a whiteboard exercise that will translate into concrete estimates that can be

used to build the real systems

D. All of the above

21. The Role of Security in the modern world of connected ecosystems is changing

primarily due to the following:

A. ID based authentication compared to network-based authentication

B. DevSecOps principles shift security left in the development lifecycle

C. Project based engagement to continuous engagement

D. All of the above

22. What is a Microservice?

A. A design used primarily in functional programming and object-oriented

programming

B. A small program that represents discrete logic that executes within a well defined

boundary on dedicated hardware

C. A style of design for enterprise systems based on a loosely coupled component

architecture

D. A very small piece of code that never gets any bigger than 10 lines

23. An IT organization using a microservices architecture wants to improve the resiliency of

its services. What can this organization do to prevent a service failure from cascading

to other services?

A. Implement a circuit breaker

B. Leverage the MITRE ATT&CK Framework

C. Implement a supervisor agent

D. Use canary deployments
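
For readers unfamiliar with the circuit breaker pattern mentioned in this question, the sketch below shows the general idea: after repeated failures the caller stops invoking the dependency and fails fast, then allows a trial call once a cool-down has passed. The class, its parameters and the fake clock are illustrative assumptions, not the API of any particular library.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


# Demonstration with a fake clock so that no real waiting is needed.
fake_now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_timeout_s=10.0,
                         clock=lambda: fake_now[0])


def flaky_dependency():
    raise ConnectionError("downstream service unavailable")


outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky_dependency)
    except CircuitOpenError:
        outcomes.append("rejected")
    except ConnectionError:
        outcomes.append("failed")

fake_now[0] = 11.0  # cool-down has passed
outcomes.append(breaker.call(lambda: "ok"))
print(outcomes)
```

Rejecting calls while the circuit is open keeps callers from piling up requests against a failing dependency, which is how the pattern limits cascading failures.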

24. SRE is expected NOT to “_____” systems, but rather to create environments,

infrastructure and automation that will ________ systems to run meeting SLAs.

A. Run, Self-enable

B. Run, Automate

C. Develop, Self-enable

D. Develop, Automate

25. To start with SRE, “North Star” cannot be achieved immediately. We need to take

small steps. In the “Walking” stage, what probably is a measure?

A. End-to-end build and packaging from CI Server

B. Supports Autonomous Delivery

C. Retrospectives are effective

D. Static Code Analysis and automated unit tests

26. In the Build Stage Participation, SREs get involved in various activities. What activity do

SREs perform in the Continuous Deployment Stage related to Deployment on

Production?

A. Automated Documentation

B. Security Check

C. On-call Support

D. B/G Deployment

27. Which is the correct definition of Chaos Engineering?

A. Chaos Engineering is the discipline of experimenting on a distributed system in order

to build confidence in the system’s ability to withstand turbulent conditions

B. Chaos Engineering is the practice of breaking things in production during the

business hours

C. Chaos Engineering is the discipline of experimenting on an individual system in order

to build confidence in the system

D. Chaos Engineering is the discipline of experimenting on a distributed system in order

to build confidence in the system’s ability to deliver the expected functionality

28. What is NOT a myth about Chaos Engineering?

A. Chaos Engineering is about breaking things

B. Chaos Engineering is about injecting random chaos experiments into the system

and seeing what happens

C. Chaos Engineering is not about injecting random chaos experiments into the system

and seeing what happens

D. Chaos Engineering is only for Cloud

29. True SRE is about remediating issues all across the value chain. Which one below is a

reflection of an SRE’s activity?

A. An architecturally complex system primarily transformed and moved business data

across multiple cloud providers.

B. Poorly implemented Virtual Private Cloud for Production workloads.

C. A single active VPN tunnel was present to connect both the Cloud infrastructures

D. All of the above

30. An organization has moved from a traditional siloed culture to cross-functional

product teams. A newly-formed SRE team is exploring needed changes to the

organization's incident management procedures. What is the FIRST change this team

should make?

A. Establish an incident command framework

B. Redefine the responsibilities of each support tier

C. Use AI/ML to automate as much as possible

D. Develop a communication strategy

31. What scalability challenges does the Platform SRE approach solve?

A. Fragmentation

B. Unpredictability

C. Cost

D. Options A and B

32. What is the sequence of phases in implementing AIOps?

A. Organize, Collect, Analyze, Infuse

B. Collect, Organize, Analyze, Infuse

C. Gather, Organize, Analyze, Infuse

D. Collect, Correlate, Analyze, Introduce

33. An organization’s product teams have the ability to determine whether users are able

to access applications and if those applications are performing within appropriate

limits. They lack, however, the ability to better understand the inner workings of these

systems. Which would provide the teams this capability?

A. Monitoring

B. Observability

C. AIOps

D. Chaos engineering

34. According to Google's Golden Signals, what is the highest level in the Pyramid?

A. Errors

B. Latency

C. Saturation

D. Traffic

35. In a possible SRE Implementation, where does the SRE role fit in?

A. Part of the Development Team

B. Outside the Value Stream

C. SRE Product and Platform Team

D. Operations

36. A newly-formed team of SREs has had some quick wins by working with engineers and

product owners to drive automation and improve incident handling. The team is

working to get buy-in for SLOs and error budgets in an effort to affect how work is

prioritized and mature the organization. Which behavioral skills are MOST needed in

this situation?

A. Collaboration skills

B. Decision making skills

C. Negotiation and influencing skills

D. Conflict management skills

37. Game Days are done to discover the robustness around degradation of services.

Which one of the below is an example?

A. Stop your docker service

B. DDoS yourself

C. Add delay to your Network and check for Resiliency

D. All of the above

38. In the Game Day Experiment Lifecycle, what is the sequence of activities?

A. Automate, Perform, Verification, Remediate

B. Automate, Perform, Validate, Remediate

C. Perform, Validate, Remediate, Automate

D. Perform, Validate, Automate, Remediate

39. What is the main goal for successful SRE?

A. SLO

B. SLI

C. Error Budget

D. Customer Satisfaction

40. To introduce chaos engineering, a team is thinking of quietly running a series of

experiments that confirm known weaknesses in a system. The team can then use the

results of these experiments to justify needed improvements. What is the likely outcome

of this approach?

A. Evidence of the value of chaos engineering

B. Resistance to future chaos engineering efforts

C. Buy-in from the people who support the system

D. Confidence in the ability to detect system weaknesses

SRE Practitioner v1.0 - Sample Exam Answer Key

Question

Correct Answer

Module
