Glossary D
38 terms starting with D
A directed acyclic graph (DAG) is a graph whose edges have direction and in which no path leads back to its starting node, so it can represent dependencies without circular references. DAGs model build dependency graphs (Bazel, Make), CI/CD pipeline stages, data transformation workflows (Airflow, dbt), and package dependency trees. A topological sort of a DAG yields a valid execution order for its tasks.
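As a rough illustration of topological sorting, the sketch below uses Kahn's algorithm on a small, hypothetical pipeline. The task names and the `topological_order` helper are assumptions made for this example, not part of any particular build tool.

```python
from collections import deque


def topological_order(dependencies):
    """Return tasks so that every task appears after the tasks it depends on.

    `dependencies` maps each task to the tasks it depends on.
    Raises ValueError if the graph has a cycle (and so is not a DAG).
    """
    # Collect every node, including ones that appear only as dependencies.
    nodes = set(dependencies)
    for deps in dependencies.values():
        nodes.update(deps)

    # Count unmet dependencies and record who depends on each node.
    remaining = {node: len(set(dependencies.get(node, ()))) for node in nodes}
    dependents = {node: [] for node in nodes}
    for node, deps in dependencies.items():
        for dep in set(deps):
            dependents[dep].append(node)

    # Start with tasks that have no dependencies (sorted for a stable result).
    ready = deque(sorted(n for n, count in remaining.items() if count == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in sorted(dependents[node]):
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(order) != len(nodes):
        raise ValueError("cycle detected: not a DAG")
    return order


# Hypothetical CI pipeline: each stage lists the stages it depends on.
pipeline = {
    "deploy": ["test"],
    "test": ["build"],
    "build": ["checkout"],
    "lint": ["checkout"],
}
order = topological_order(pipeline)
```

Python's standard library also ships `graphlib.TopologicalSorter`, which may be preferable to a hand-written version in real code.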
DALL-E is OpenAI's series of text-to-image generation models capable of creating photorealistic images, illustrations, and art from natural language descriptions. DALL-E 3 uses a diffusion-based architecture conditioned on rich captions produced by a language model, improving prompt adherence. It is integrated into ChatGPT and available via API for application development.
Dynamic application security testing (DAST) tests running applications for vulnerabilities by simulating external attacks. Unlike static application security testing (SAST), it does not require access to source code; it interacts with the application through its exposed interfaces. DAST excels at finding runtime issues like authentication flaws, server misconfigurations, and injection vulnerabilities in deployed environments.
A data lake is a centralized repository that stores raw, unprocessed data in its native format (structured, semi-structured, or unstructured) at any scale. Unlike a data warehouse, a data lake imposes no schema at write time; structure is applied when the data is read (schema-on-read). Common building blocks include object storage such as Amazon S3 queried with Athena, and open table formats such as Delta Lake and Apache Iceberg. Data lakes enable flexible analytics and ML on heterogeneous data.
Data mesh is an architectural paradigm that decentralizes data ownership to domain teams, treating data as a product with defined APIs and SLAs. Instead of a central data engineering team managing a monolithic lake or warehouse, domain teams own, publish, and maintain their data products. A federated governance layer enforces interoperability standards across domains.
Data residency requirements mandate that certain categories of data must be stored and processed within specified geographic boundaries, typically driven by national laws or regulations. Cloud customers manage data residency through service region selection, AWS data residency controls (data perimeter policies), and encryption that ensures data is only decryptable within approved regions. Data residency requirements influence cloud architecture decisions around multi-region deployment, backup locations, and support access patterns.
Data transfer costs are fees cloud providers charge for moving data between services, between regions, or out to the internet (egress). Transfers within the same availability zone are typically free, cross-AZ transfers incur a small per-GB fee, and internet egress is charged per GB transferred. Egress costs are a significant and often underestimated component of cloud bills for data-intensive workloads.
A data warehouse is a centralized repository optimized for analytical queries, storing historical data from multiple operational sources in a structured, integrated form. Warehouses use dimensional modeling (star or snowflake schemas) to organize data for business intelligence and reporting. Cloud warehouses like Snowflake, BigQuery, and Redshift separate storage from compute, enabling independent scaling.
Domain-driven design (DDD) is a software design approach that models complex domains by aligning code structure with business concepts, using a ubiquitous language shared between developers and domain experts. Key patterns include Bounded Contexts (explicit boundaries around domain models), Aggregates (consistency boundaries), and Domain Events. DDD provides the conceptual framework behind microservice boundaries and event-driven architectures.
DDoS (Distributed Denial of Service) protection services detect and mitigate volumetric, protocol, and application-layer attacks that attempt to overwhelm infrastructure by flooding it with traffic from many sources simultaneously. Cloud providers offer tiered protection: basic network-layer protection is included free (AWS Shield Standard), while advanced protection with automatic detection and response costs additional fees. Effective DDoS protection requires both network-layer mitigation and application-layer WAF rules.
A dead letter queue (DLQ) captures messages that cannot be processed successfully after a maximum number of delivery attempts. DLQs prevent message loss and allow engineers to inspect failed messages, diagnose processing errors, and reprocess them once the underlying issue is resolved. Amazon SQS, Azure Service Bus, and RabbitMQ all support DLQ configuration.
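The retry-then-park behavior of a dead letter queue can be sketched with an in-memory queue. This is a simplified model, not the API of SQS, Service Bus, or RabbitMQ; `process_with_dlq`, `parse_amount`, and the sample messages are hypothetical names chosen for the example.

```python
from collections import deque


def process_with_dlq(messages, handler, max_attempts=3):
    """Deliver each message to `handler`; after `max_attempts` failures, park it in the DLQ."""
    # Each queue entry carries the message and how many times delivery has failed.
    queue = deque((message, 0) for message in messages)
    processed, dead_letters = [], []
    while queue:
        message, attempts = queue.popleft()
        try:
            handler(message)
            processed.append(message)
        except Exception as error:
            attempts += 1
            if attempts >= max_attempts:
                # Keep the message and failure context for later inspection.
                dead_letters.append(
                    {"message": message, "attempts": attempts, "error": str(error)}
                )
            else:
                # Requeue for another delivery attempt.
                queue.append((message, attempts))
    return processed, dead_letters


def parse_amount(message):
    """Hypothetical consumer that fails on non-numeric payloads."""
    return int(message)


processed, dead_letters = process_with_dlq(["10", "oops", "25"], parse_amount)
```

Here the malformed message is retried until it reaches the attempt limit and then lands in `dead_letters`, while the valid messages are processed normally.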
A declarative pipeline defines the desired end state of a CI/CD workflow using structured configuration (typically YAML, HCL, JSON, or a constrained DSL) rather than imperative scripts. Declarative pipelines are easier to read, lint, and version-control, and many tools can validate them statically before execution. Jenkins Declarative Pipeline syntax (a structured, Groovy-based DSL) and GitHub Actions workflows (YAML) are prominent examples.
The decoder is the component of a transformer that generates output sequences token-by-token using causal (masked) self-attention to prevent attending to future tokens, plus cross-attention over encoder outputs in encoder-decoder architectures. GPT-style models are decoder-only, using causal self-attention without cross-attention. Decoders are responsible for the autoregressive generation behavior of modern LLMs.
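To make the causal mask concrete, here is a minimal pure-Python sketch. It assumes a single attention head and a precomputed score matrix; `causal_mask` and `masked_softmax` are illustrative helpers, not functions from any deep learning framework.

```python
import math


def causal_mask(length):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(length)] for i in range(length)]


def masked_softmax(scores, mask):
    """Row-wise softmax that assigns zero weight to masked-out positions."""
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        visible = [s for s, keep in zip(row_scores, row_mask) if keep]
        peak = max(visible)  # subtract the max for numerical stability
        exps = [
            math.exp(s - peak) if keep else 0.0
            for s, keep in zip(row_scores, row_mask)
        ]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Uniform scores make the effect of the mask easy to see.
scores = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
attention = masked_softmax(scores, causal_mask(3))
```

With uniform scores, the first token attends only to itself, the second splits its attention over the first two positions, and only the last token sees the whole sequence.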
Defense in depth is a security strategy that employs multiple independent layers of controls so that if one layer fails, others continue to provide protection. It applies the principle that no single security mechanism is foolproof, and combines preventive, detective, and corrective controls at the network, host, application, and data layers. This layered approach significantly increases the cost and complexity of successful attacks.
Dependabot is GitHub's automated dependency update service that monitors repositories for outdated or vulnerable dependencies and automatically opens pull requests to update them. It supports security updates (patching known CVEs immediately) and version updates (keeping dependencies current). Dependabot integrates with GitHub Security Advisories and can be configured with merge policies, grouping rules, and update schedules.
Dependency confusion is a supply chain attack where an attacker publishes a malicious package to a public registry with the same name as a private internal package. Package managers that check public registries first, or that pick the highest version number across registries, may download the attacker's package instead of the legitimate private one. Mitigations include scoping private packages, pinning versions, using artifact proxies, and configuring registry precedence.
A dependency graph maps the relationships between software components, packages, services, or infrastructure resources and their dependencies. In monorepo CI, dependency graphs drive intelligent build systems to rebuild and test only affected packages. In IaC, dependency graphs determine the correct order for resource creation and deletion. Visualizing dependency graphs helps identify critical paths and circular dependencies.
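One way to picture "rebuild only affected packages" is a traversal of reverse dependencies. The sketch below assumes a hypothetical monorepo layout; the package names and the `affected_packages` helper are invented for illustration and do not reflect any specific build system.

```python
def affected_packages(dependencies, changed):
    """Return every package that depends, directly or transitively, on a changed package.

    `dependencies` maps each package to the packages it depends on.
    The changed packages themselves are included in the result.
    """
    # Invert the graph: for each package, who depends on it?
    dependents = {}
    for package, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(package)

    # Walk upward from the changed packages through their dependents.
    affected = set(changed)
    stack = list(changed)
    while stack:
        current = stack.pop()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                stack.append(parent)
    return affected


# Hypothetical monorepo: package -> packages it depends on.
monorepo = {
    "web": ["ui", "api-client"],
    "cli": ["api-client"],
    "api-client": ["core"],
    "ui": ["core"],
    "docs": [],
}
to_rebuild = affected_packages(monorepo, {"api-client"})
```

A change to `api-client` triggers rebuilds of `web` and `cli` but leaves `ui`, `core`, and `docs` alone.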
Deployment frequency measures how often an organization successfully deploys code to production. It is one of the four DORA metrics and a leading indicator of DevOps maturity. Elite performers deploy on demand (multiple times per day), enabled by automated testing, trunk-based development, and progressive delivery practices that reduce deployment risk.
Deps.dev is Google's Open Source Insights service, which provides dependency graph analysis, vulnerability information, and license data for open-source packages across npm, PyPI, Go, Cargo, Maven, and NuGet. It surfaces transitive dependencies, known security advisories, and OpenSSF Scorecard ratings, helping developers understand the full security and compliance profile of their software supply chain.
DevOps is a set of practices that combines software development and IT operations to shorten the development lifecycle and deliver high-quality software continuously. It emphasizes automation, monitoring, collaboration, and infrastructure as code. DevOps culture breaks down silos between teams that build software and teams that run it.
DevSecOps integrates security practices throughout the software development and delivery pipeline rather than treating security as a gate at the end. It automates security testing in CI/CD, embeds security tooling in developer workflows, and fosters shared ownership of security outcomes across development, security, and operations teams. The goal is to deliver secure software at DevOps velocity.
Digital forensics and incident response (DFIR) combines the investigative discipline of digital forensics with the operational practice of incident response. Forensics practitioners collect and preserve evidence from compromised systems in a forensically sound manner, while IR practitioners contain threats and restore operations. The two disciplines are tightly coupled: response actions must preserve evidence integrity for potential legal proceedings.
A diffusion model is a generative model that learns to reverse a gradual noising process, starting from pure noise and iteratively denoising to produce data samples. Diffusion models have achieved state-of-the-art image and audio generation quality, surpassing GANs on diversity and mode coverage. Stable Diffusion, DALL-E 3, and Sora use diffusion-based architectures.
Directory synchronization replicates user identities, group memberships, and attributes from a source directory (e.g., Active Directory) to downstream systems and IdPs. Tools like Microsoft Entra Connect sync on-premises AD to Microsoft Entra ID (formerly Azure AD). Directory sync ensures that identity data is consistent across an organization's systems and that access changes propagate automatically when employees change roles.
Disaster recovery (DR) is the set of policies, tools, and procedures that enable an organization to restore IT systems and data after a catastrophic failure such as a data center outage, data corruption, or ransomware. DR plans define recovery time objective (RTO) and recovery point objective (RPO) targets and choose among cold standby, warm standby, and active-active configurations to meet them within budget constraints.
Knowledge distillation is a model compression technique where a smaller student model is trained to mimic the output distribution of a larger teacher model. The student learns from soft probability distributions (computed from the teacher's logits, usually with a softmax temperature) rather than hard labels alone, which transfers knowledge more efficiently. Distillation is used to produce smaller, faster models that retain much of the teacher's accuracy.
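The soft-target idea can be sketched without a deep learning framework. The example below assumes one common formulation: KL divergence between temperature-softened teacher and student distributions, scaled by the squared temperature. The logit values and function names are made up for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; higher values soften the distribution."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s)) for t, s in zip(teacher, student))
    return kl * temperature ** 2


# Hypothetical logits for a three-class problem.
teacher_logits = [4.0, 2.0, 0.5]
close_student = [3.5, 2.2, 0.4]   # roughly agrees with the teacher
far_student = [0.5, 2.0, 4.0]     # ranks the classes in the opposite order

loss_close = distillation_loss(teacher_logits, close_student)
loss_far = distillation_loss(teacher_logits, far_student)
```

In practice this term is usually combined with an ordinary cross-entropy loss on the hard labels; the weighting between the two is a tuning choice.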
Distributed training splits the work of training large neural networks across multiple GPUs or machines. It encompasses data parallelism, model parallelism, and pipeline parallelism strategies. Frameworks like PyTorch FSDP, DeepSpeed, and Megatron-LM implement distributed training with gradient synchronization, mixed precision, and memory optimization to enable training models with hundreds of billions of parameters.
Distroless container images contain only the application and its runtime dependencies, with no package manager, shell, or OS utilities. Pioneered by Google, they reduce image size and dramatically shrink the attack surface by removing tools that attackers could use to escalate privileges or move laterally. Distroless images are commonly used as the final stage in multi-stage Docker builds.
Data loss prevention (DLP) solutions identify, monitor, and control the movement of sensitive data (PII, PHI, intellectual property, financial records) to prevent unauthorized disclosure. DLP can inspect data at rest in storage, in transit over the network, or in use on endpoints. Policies define what constitutes sensitive data and what actions to take (block, quarantine, alert) when violations occur.
The Domain Name System (DNS) translates human-readable domain names (like crashoverride.com) into IP addresses that computers use to locate servers. It is a hierarchical, distributed system that handles billions of queries daily. DNS configuration affects website availability, email delivery, and service discovery. Misconfigured DNS is one of the most common causes of outages.
Docker is a platform for building, shipping, and running applications in containers. Containers package an application with all its dependencies into a standardized unit, ensuring consistent behavior across development, testing, and production environments. Docker popularized container technology and its image format became the basis for the OCI (Open Container Initiative) standard.
A Dockerfile is a text file containing sequential instructions for building a Docker container image. Instructions that change the filesystem (RUN, COPY, ADD) create new image layers, while instructions such as CMD or ENV only add image metadata. Best practices include using specific base image tags, minimizing layers, leveraging multi-stage builds, and running processes as non-root users to produce secure, efficient images.
DORA (DevOps Research and Assessment) metrics are four key performance indicators that measure software delivery performance: deployment frequency, lead time for changes, change failure rate, and mean time to recover (MTTR). Developed by the DORA research program (now part of Google Cloud), these metrics correlate strongly with organizational performance and are widely used to benchmark DevOps maturity.
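As a back-of-the-envelope illustration, the four metrics can be computed from a list of deployment records. The record fields (`committed_at`, `deployed_at`, `failed`, `restored_at`), the sample data, and the `dora_summary` helper are all assumptions for this sketch; real tooling defines and collects these values in its own way.

```python
from datetime import datetime, timedelta
from statistics import median


def dora_summary(deployments, period_days):
    """Summarize the four DORA metrics from deployment records over a period."""
    total = len(deployments)
    failures = [d for d in deployments if d["failed"]]
    # Lead time for changes: commit to production deploy.
    lead_times = [d["deployed_at"] - d["committed_at"] for d in deployments]
    # Recovery time: failed deploy to restored service.
    recovery_times = [d["restored_at"] - d["deployed_at"] for d in failures]
    mean_recovery = (
        sum(recovery_times, timedelta()) / len(failures) if failures else None
    )
    return {
        "deployments_per_day": total / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / total,
        "mean_time_to_recover": mean_recovery,
    }


# Hypothetical records for a four-day window.
start = datetime(2024, 1, 1, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
deployments = [
    {"committed_at": start, "deployed_at": start + 2 * hour, "failed": False},
    {"committed_at": start + day, "deployed_at": start + day + 4 * hour,
     "failed": True, "restored_at": start + day + 5 * hour},
    {"committed_at": start + 2 * day, "deployed_at": start + 2 * day + 6 * hour,
     "failed": False},
    {"committed_at": start + 3 * day, "deployed_at": start + 3 * day + 8 * hour,
     "failed": True, "restored_at": start + 3 * day + 11 * hour},
]
summary = dora_summary(deployments, period_days=4)
```

Using the median for lead time is a design choice in this sketch, made to keep a single slow change from dominating the figure.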
A data protection impact assessment (DPIA) is a process required by GDPR before undertaking processing activities that are likely to result in high risk to individuals' rights and freedoms, such as large-scale profiling, biometric processing, or systematic monitoring. The assessment identifies risks, evaluates mitigations, and documents whether residual risk is acceptable. DPIAs must be completed before processing begins, not after.
A data protection officer (DPO) is a role that GDPR makes mandatory for public authorities and for organizations whose core activities involve large-scale, regular and systematic monitoring of individuals or large-scale processing of special categories of data. The DPO monitors compliance with data protection law and serves as the contact point for supervisory authorities. The DPO must have expert knowledge of data protection law and must be able to act independently, without receiving instructions from management on how to perform the role. Many companies appoint external DPOs to satisfy the requirement.
Drone is an open-source CI/CD platform built on Docker that executes each pipeline step in an isolated container. Its plugin ecosystem uses container images for integrations, making it highly portable and vendor-neutral. Drone's simple YAML configuration and self-hosted deployment model make it popular in organizations that want full control over their CI infrastructure.
Data security posture management (DSPM) tools discover, classify, and secure sensitive data across cloud storage, databases, and SaaS applications by continuously mapping where data lives and assessing the security controls protecting it. DSPM identifies data exposure risks like publicly accessible S3 buckets containing PII, overly permissive database access, or sensitive data stored in unexpected locations. It brings data governance together with cloud security posture management to reduce data breach risk.
DynamoDB is AWS's fully managed, serverless NoSQL key-value and document database designed for applications requiring single-digit millisecond performance at any scale. DynamoDB Global Tables provide multi-region, multi-master replication. Its provisioned and on-demand capacity modes, point-in-time recovery, and DynamoDB Streams make it a foundational AWS service for high-throughput transactional applications.