Glossary

This glossary provides definitions of key concepts used throughout the book.

Actual State
The current, real-time configuration and operational state of the network infrastructure as it exists, which may differ from the desired state.
Artificial Intelligence (AI)
The simulation of human intelligence processes by machines, especially computer systems, enabling them to perform tasks such as learning, reasoning, and problem-solving.
Artificial Intelligence for IT Operations (AIOps)
The application of artificial intelligence and machine learning to IT operations for automated insight generation, anomaly detection, and operational decision-making.
Ansible
An open-source automation platform that uses declarative YAML playbooks for configuration management, application deployment, and task automation without requiring agents.
Application Programming Interface (API)
A set of definitions, protocols, and tools that allow software components to communicate with each other programmatically.
Address Resolution Protocol (ARP)
A protocol used to map IPv4 addresses to MAC (link-layer) addresses on a local network.
Atomic Operation
An operation that completes entirely or not at all, with no partial or intermediate states visible to the system. Essential for maintaining consistency in distributed systems.
Border Gateway Protocol (BGP)
The standardized exterior gateway protocol used to exchange routing information between autonomous systems on the Internet.
BGP Monitoring Protocol (BMP)
A protocol that provides real-time monitoring of BGP routing information by streaming BGP updates and events from routers to monitoring stations.
Bill of Materials (BOM)
A comprehensive list of components, parts, and materials required to build, deploy, or maintain a network system or device.
Brownfield
An existing environment with legacy systems, manual processes, and established infrastructure. Automation is introduced gradually into these systems.
CI/CD
Continuous Integration/Continuous Delivery (or Deployment). Automated processes that test, validate, and deploy code changes whenever they are committed to version control.
Circuit Breaker
A distributed systems pattern that stops requests to a failing service to prevent cascading failures. Similar to electrical circuit breakers, it 'trips' to protect the system.
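As an illustration only, the pattern can be sketched in a few lines of Python. The class name, the thresholds, and the injectable clock are all assumptions made for this example; they do not come from any particular library.

```python
import time


class CircuitBreaker:
    """Hypothetical minimal circuit breaker, for illustration only."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before tripping
        self.reset_after = reset_after    # cool-down in seconds while open
        self.clock = clock                # injectable so the sketch is testable
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast without contacting the failing service.
                raise RuntimeError("circuit open: request not sent")
            # Cool-down elapsed: allow one trial request (half-open).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # trip the breaker
            raise
        self.failures = 0  # any success resets the failure count
        return result
```

A caller would wrap each request, for example `breaker.call(fetch_device_state, "edge-1")`, where `fetch_device_state` is a placeholder for whatever function talks to the remote service.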
Command Line Interface (CLI)
A text-based interface for interacting with systems by typing commands, commonly used for network device configuration and troubleshooting.
Collector
An architectural component that retrieves and reads the actual state of the network, gathering telemetry, configuration, and operational data.
Compensation Logic
An error-handling mechanism that undoes or recovers from failed operations by applying compensating transactions or reverting to a previous state.
Configuration Drift
The divergence between the intended configuration state and the actual configuration on network devices, often caused by manual changes, errors, or failures in automation.
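A minimal sketch of drift detection, assuming both the desired and actual state can be reduced to flat dictionaries. The function name `find_drift` and the sample keys are hypothetical and chosen only for this example.

```python
def find_drift(desired: dict, actual: dict) -> dict:
    """Return every key whose value differs between desired and actual state."""
    drift = {}
    # Check the union of keys so missing and unexpected settings are both reported.
    for key in desired.keys() | actual.keys():
        want = desired.get(key)
        have = actual.get(key)
        if want != have:
            drift[key] = {"desired": want, "actual": have}
    return drift


# Illustrative data: the MTU was changed by hand on the device.
desired = {"hostname": "edge-1", "mtu": 9000, "ntp": "10.0.0.1"}
actual = {"hostname": "edge-1", "mtu": 1500, "ntp": "10.0.0.1"}
drift = find_drift(desired, actual)
```

Real configurations are usually nested, so a production comparison would need a recursive or model-aware diff; the flat comparison here only shows the idea.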
Central Processing Unit (CPU)
The primary component of a computer that performs most of the processing, executing instructions from programs and managing system operations.
Declarative
An automation approach where the desired end state is defined, and the system determines the steps needed to achieve it. Contrast with imperative.
Desired State
The intended configuration and operational state of a system as defined in the Source of Truth, representing what the network should look like.
Dry Run
An execution mode that validates and previews what changes would be made without actually applying them.
Domain-Specific Language (DSL)
A specialized programming language designed for a specific application domain, such as querying or configuration management.
Extended Berkeley Packet Filter (eBPF)
A technology that allows custom programs to run in the Linux kernel for high-performance networking, observability, and security use cases.
End-to-End Test
A test that validates complete automation workflows in near-real scenarios using lab environments or virtualized network infrastructure.
End of Life (EOL)
The stage in a product's lifecycle when it is no longer supported or sold by the manufacturer, often requiring migration or replacement planning.
Extract, Transform, Load (ETL)
A data pipeline pattern that extracts data from sources, transforms it into a desired format, and loads it into a destination system.
Executor
An architectural component that applies changes to the network infrastructure, updating configurations and driving changes as guided by the intended state.
GitOps
An operational framework where Git repositories serve as the single source of truth, and controllers continuously compare desired state in Git with actual state, automatically correcting drift.
gRPC Network Management Interface (gNMI)
A modern network management protocol based on gRPC that provides streaming telemetry and configuration capabilities using YANG data models.
Graphics Processing Unit (GPU)
A specialized processor designed to accelerate graphics rendering and parallel processing tasks, widely used in AI, ML, and high-performance computing.
Graceful Degradation
A system design principle where components continue operating with reduced functionality when failures occur, rather than failing completely. Enables resilience in distributed systems.
Greenfield
A new environment or project built from scratch with automation-first design principles from day one.
gRPC Remote Procedure Call (gRPC)
A modern, high-performance RPC framework that uses HTTP/2 for transport and Protocol Buffers as the interface definition language.
Hypertext Transfer Protocol (HTTP)
An application-layer protocol for transmitting hypermedia documents and serving as the foundation of data communication for the World Wide Web.
Infrastructure as Code (IaC)
A practice of managing and provisioning infrastructure through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. Enables version control, testing, and automation of infrastructure changes.
Idempotency
A property of automation where repeated operations yield the same result. Running the same operation multiple times produces the same final state with no unintended side effects.
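One way to picture this is an "ensure" style function that checks the current state before changing anything. The sketch below is an assumption-laden example: `ensure_vlan` and the dictionary-based configuration are hypothetical stand-ins for a real device or API.

```python
def ensure_vlan(config: dict, vlan_id: int, name: str) -> bool:
    """Make sure the VLAN exists with the given name. Return True if changed."""
    vlans = config.setdefault("vlans", {})
    if vlans.get(vlan_id) == name:
        return False  # already in the desired state, so nothing to do
    vlans[vlan_id] = name
    return True


config = {}
first = ensure_vlan(config, 100, "users")   # applies the change
second = ensure_vlan(config, 100, "users")  # no-op: state already matches
```

Running the function a second time leaves `config` unchanged and reports that no change was made, which is the behavior the definition describes.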
Imperative
An automation approach where specific steps and commands are defined in exact order. Contrast with declarative.
Infrahub
An open-source network source of truth platform built on a graph database architecture. Features Git-like branching, schema-driven data models, proposed-state workflows, and complex relationship modeling for network infrastructure.
Integration Test
A test that validates how multiple components work together in a simulated environment, introducing third-party systems and interdependencies.
Intent (Architectural Block)
An architectural component that defines the logic to handle and persist the desired state of the network, including both configuration and operational expectations. Serves as the Source of Truth for automation.
Intent-Driven
An automation approach where engineers define the desired end state of the system, and the automation determines how to achieve it (declarative approach).
Interface Contracts
Formal definitions of how systems communicate, including schemas, validation rules, and backward-compatibility policies. Enable independent evolution of components and clear integration boundaries.
Internet of Things (IoT)
A network of physical devices embedded with sensors, software, and connectivity that enables them to collect and exchange data.
Internet Protocol (IP)
A set of rules governing the format of data sent over the Internet or other networks, enabling devices to communicate and route packets across interconnected networks.
IP Flow Information Export (IPFIX)
An IETF standard protocol for exporting IP flow information from routers and switches, used for network traffic analysis and monitoring.
Intermediate System to Intermediate System (IS-IS)
A link-state routing protocol used in large service provider networks to route IP and other network layer protocols.
JavaScript Object Notation (JSON)
A lightweight, text-based data interchange format that is easy for humans to read and write and easy for machines to parse and generate.
Log Query Language (LogQL)
Grafana Loki's query language, inspired by PromQL, designed for querying and filtering log data.
Management Information Base (MIB)
A database used for managing entities in a network using SNMP, defining the structure and meaning of management data.
Machine Learning (ML)
A subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed, using algorithms and statistical models.
Mean Time To Resolution (MTTR)
The average time required to resolve an incident or problem, measured from the time the issue is detected until it is fully resolved.
NAF Framework
Network Automation Forum Framework. A community-driven, vendor-agnostic reference architecture that defines seven building blocks for organizing network automation solutions: Intent, Executor, Collector, Observability, Orchestrator, Presentation, and Network Infrastructure.
Nautobot
An open-source network source of truth and network automation platform, originally forked from NetBox. Provides enhanced extensibility, a job framework for automation workflows, Git data source integration, and professional support options.
NetBox
An open-source infrastructure resource modeling (IRM) application designed to serve as a network source of truth, tracking IP addresses, devices, cables, and other network infrastructure data.
NETCONF
Network Configuration Protocol. An IETF standard protocol that provides network device configuration and data retrieval capabilities with support for atomic operations and rollback.
Network Operations Center (NOC)
A centralized location where IT professionals monitor, manage, and maintain an organization's network infrastructure.
Network Operating System (NOS)
The software that runs on network devices, providing the functionality to configure, manage, and control network operations.
Observability
The ability to measure and understand a system's internal state through its external outputs. Includes logging, metrics, and tracing to detect failures and optimize automation behavior.
OpenConfig
A collaborative initiative developing vendor-neutral data models and programmatic interfaces for managing network devices using YANG.
OpenTelemetry
A vendor-neutral observability framework providing a unified set of APIs, libraries, and tools for collecting, processing, and exporting telemetry data (metrics, logs, and traces).
Orchestrator
An architectural component that coordinates and executes automation tasks in response to events, managing workflows and task sequencing across the automation system.
Open Shortest Path First (OSPF)
A link-state routing protocol used within an autonomous system to determine the best path for routing IP packets.
OpenTelemetry Protocol (OTLP)
The native protocol of OpenTelemetry for transmitting telemetry data (metrics, logs, traces) between applications and observability backends.
Packet Capture (PCAP)
A widely used file format and API for capturing and storing network traffic data, commonly used by tools like tcpdump and Wireshark for packet analysis.
Predictable
A quality of trustworthy automation where operations produce consistent, deterministic outcomes that engineers can anticipate.
Presentation (Layer)
An architectural component that provides the interfaces through which users interact with the automation system, including dashboards, GUIs, APIs, ITSM integrations, and CLI tools.
Prometheus Query Language (PromQL)
A functional query language for Prometheus that enables selection and aggregation of time series data in real time.
Packet Sampling (PSAMP)
An IETF standard framework for packet sampling and filtering for network monitoring and measurement.
Root Cause Analysis (RCA)
A systematic process for identifying the underlying causes of problems or incidents to prevent their recurrence.
Reliable
A quality where automation handles errors gracefully, recovers from failures, and completes operations safely even under unexpected conditions.
Representational State Transfer (REST)
An architectural style for designing networked applications using HTTP methods (GET, POST, PUT, DELETE) to interact with resources, commonly used in web APIs.
RESTCONF
A protocol that provides a programmatic interface for accessing YANG-defined data using HTTP-based RESTful APIs, designed for web-based network management.
RDMA over Converged Ethernet (RoCE)
A network protocol that enables Remote Direct Memory Access (RDMA) over Ethernet networks, providing high-throughput, low-latency networking for data centers and high-performance computing.
Rollback
The process of reverting a system to a previous known-good state after a failed or problematic change, often using snapshots, commits, or version control.
Software-Defined Networking (SDN)
A network architecture approach that enables the network to be intelligently and centrally controlled, or 'programmed,' using software applications. This helps operators manage the entire network consistently and holistically, regardless of the underlying network technology.
sFlow
A network monitoring technology that uses packet sampling to provide visibility into network traffic patterns and performance.
Shadow Mode
A deployment pattern where automation runs alongside production systems without making actual changes, allowing validation and confidence-building before full deployment.
Service Level Agreement (SLA)
A commitment between a service provider and a client defining the expected level of service, including performance metrics and guarantees.
Service Level Indicator (SLI)
A quantitative measure of some aspect of the level of service being provided, such as response time, error rate, or availability.
Simple Network Management Protocol (SNMP)
An Internet Standard protocol for collecting and organizing information about managed devices on IP networks, widely used for network monitoring and management.
Source of Truth (SoT)
An authoritative, centralized system holding the intended state of the network, used as the single reference point for automation decisions.
Switched Port Analyzer (SPAN)
A network switch feature that mirrors traffic from one or more ports to a monitoring port for analysis and troubleshooting.
Secure Shell (SSH)
A cryptographic network protocol for secure remote login and command execution over unsecured networks, widely used for network device management.
System Logging Protocol (Syslog)
A standard protocol for message logging that allows separation of the software generating messages from the system storing and analyzing them.
Test Access Point (TAP)
A hardware device that provides access to network traffic by creating a physical copy of data flowing through a network link.
Terraform
An infrastructure-as-code tool that uses declarative configuration files to provision and manage cloud and on-premises resources across multiple providers.
Telegraf-Prometheus-Grafana (TPG)
A popular open-source observability stack combining Telegraf for data collection, Prometheus for storage, and Grafana for visualization.
Transactional
A property where multiple changes are grouped so they either complete fully or fail safely with no partial, inconsistent, or half-applied states.
Time Series Database (TSDB)
A database optimized for storing and querying time-stamped or time series data, commonly used in monitoring, IoT, and real-time analytics applications.
Understandable
A quality where automation systems expose intent, steps, results, and decisions transparently, building human confidence.
Unit Test
A test that validates a single component or function in isolation, typically using mocks or fake systems to focus on specific behavior.
Usable
A quality where automation provides interfaces that allow engineers to validate, reason about, and control behavior without excessive complexity.
Version Control System (VCS)
A tool that tracks and manages changes to code and data over time, enabling collaboration, rollback, and audit trails. Examples: Git, Mercurial.
Versioning
The practice of tracking and managing different versions of code, data, or configurations using version control systems. Enables rollback, audit trails, and collaboration.
YAML
YAML Ain't Markup Language. A human-readable data serialization format commonly used for configuration files, data exchange, and infrastructure-as-code definitions.
Yet Another Next Generation (YANG)
A data modeling language used to model configuration and state data for network devices, providing a standardized way to describe network device capabilities.
Zero Touch Provisioning (ZTP)
An automated deployment process that allows network devices to be configured and brought online with minimal manual intervention, typically using pre-defined scripts or configuration files.
