Datadog DASH 2025: AI Observability and Security

Organizations are undergoing a profound transformation. The rapid evolution of artificial intelligence is pushing them into more complex environments with new operational risks. For executives, the challenge is not only to navigate this change but to use it to build more resilient, agile, and innovative organizations.

 

At Datadog DASH, Datadog unveiled a new era of business technology: autonomous AI agents, unified observability, and robust security solutions—key elements for any organization looking to stay ahead of risk and accelerate digital transformation. Learn why Datadog is the ideal solution to manage complexity, achieve operational excellence, and gain a real strategic advantage.

 

 

Top priority: managing complexity in the AI Era

 

Technology is advancing at an unprecedented pace: progress in AI, the expansion of the cloud, and the emergence of Web3 are redefining the business landscape. For organizations, resilience, compliance, cost control, and constant innovation now go hand in hand. Achieving these goals requires comprehensive visibility into dynamic, distributed environments.

Datadog offers a unified platform that centralizes observability across infrastructure, applications, and services, in cloud and hybrid environments alike, enabling leaders to anticipate issues, reduce resolution time, and eliminate operational “blindness.” By integrating data from multiple sources, it supports faster decision-making and proactive incident response.

Customer testimonial: “With Datadog, I have total visibility of our infrastructure… I can anticipate problems and quickly get to the root cause.”

 

Unified observability and security: the Datadog stack

 

Modern organizations demand operational clarity and actionable data—all from a single platform that encompasses cloud, applications, and infrastructure. Datadog responds with a unified solution, where AI and security are seamlessly integrated:

 

  • Unified platform: Monitors, protects, and remediates with consistent context.
  • Security + Observability: Security risks and performance issues are detected and prioritized together.
  • Elimination of silos: Better collaboration, fewer context switches, and greater business impact.

 

 

Autonomous Agents – The New Workforce for IT and Development

 

Bits AI for Operations: Datadog unveiled enhancements to Bits AI, a virtual engineer available 24/7 that detects incidents, investigates telemetry in dashboards, deployments, and logs, and determines root causes in minutes. “Using Bits is like instantly adding an engineer to your team, always available and always alert.”

 

Bits AI Security Analyst: Helps security teams tackle alert overload by automatically analyzing SIEM signals, investigating incidents, and recommending actions, reducing incident management time from minutes to seconds.

 

Bits AI Dev Agent: Boosts developer productivity by identifying and diagnosing critical failures, generating pull requests for fixes, and automating remediation through integration with your source code repository, freeing up thousands of engineering hours each month.

 

Reimagining the Developer Experience

 

  • Internal Developer Portal (IDP): Organizes and accelerates delivery through service catalogs, best practice metrics, and self-service flows, all connected with real-time observability.

  • Live MCP Integration: Datadog announced its MCP server, which links Datadog’s telemetry and remediation to popular development tools (like Cursor and OpenAI’s Codex), allowing teams to debug and fix directly from the IDE.

 

Result: faster and safer deployments, and an empowered engineering culture.

 

 

Data as a Pillar: Observability, Governance, and Compliance

 

Business resilience relies on the quality, governance, and compliance of data. Datadog enables you to:

 

  • Real-time quality control: Detect anomalies or inconsistencies before they impact the customer.
  • Long-term retention: Flex Frozen and Archive Search offer log retention for up to 7 years, with instant and audit-friendly searches.
  • Unified analytics: Notebooks and Sheets allow for in-depth investigations and efficient collaboration, facilitating the transition from legacy tools.
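
The real-time quality control item above comes down to flagging values that deviate from recent behavior. The following is a minimal sketch of that idea using a rolling z-score over a fixed window. It is an illustrative technique only, not Datadog's detection algorithm, and `flag_anomalies`, `window`, and `threshold` are hypothetical names chosen for this example.

```python
from collections import deque
from statistics import mean, pstdev


def flag_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` standard deviations (hypothetical helper)."""
    recent = deque(maxlen=window)
    anomalies = []
    for i, v in enumerate(values):
        # Only score a point once a full window of history is available.
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # A perfectly flat window has sigma == 0; skip it to avoid
            # dividing by zero (a known limitation of this simple sketch).
            if sigma > 0 and abs(v - mu) / sigma > threshold:
                anomalies.append(i)
        recent.append(v)
    return anomalies


# Made-up latency samples in milliseconds; the spike sits at index 6.
latencies = [100, 102, 98, 101, 99, 100, 450, 101]
print(flag_anomalies(latencies))  # prints [6]
```

A production system would typically account for seasonality and keep flagged points out of the baseline; this sketch does neither.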

 

AI Monitoring: End-to-End Control and Trust

 

As AI workloads and LLMs scale, so does their operational impact. Datadog addresses this by offering:

 

  • GPU visibility: Monitors GPU utilization and spending in cloud and hybrid environments.
  • Full-stack AI observability: Traces LLM executions and agents, optimizing workflows at every level.
  • AI agent console: Tracks proprietary or third-party agents, analyzing behaviors and controlling costs.

 

These capabilities provide organizations with the transparency and trust needed to scale AI securely and efficiently.

 

 

Security for the AI Era

 

AI adoption introduces new attack vectors and risks in the value chain. Data, models, and applications can be exposed if not managed properly. Datadog offers:

 

  • Comprehensive coverage: Protects the entire stack (data, models, and applications) with advanced security controls.
  • Identity and observability: Prioritizes authentication and continuous monitoring, which are fundamental for AI agent-based architectures.
  • Robust ecosystem: Extensive integrations with leading providers and defenses tailored for emerging threats.

 

“Identity and observability are not optional, they are essential in the era of AI agents.” – Bhawna Singh, CTO, Okta

 

 

Customer Stories: From Availability to AI-Driven Business Success

 

  • Toyota Connected: Maintains 99.99% availability and operational agility across 12.5M vehicles, powered by Datadog.
  • Okta: Leverages Flex Logs and AI security to ensure 99.99% availability and a solid customer experience, even amidst changing identity and AI risks.
  • Cursor: Has transformed the productivity of its development team and the reliability of its AI workflows by integrating Datadog with its AI and DevOps tools.
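
To put the 99.99% availability figure in perspective, the arithmetic below converts an availability target into an allowed-downtime budget. This is a minimal sketch: `downtime_budget_minutes` is a hypothetical helper, and a 365-day year is assumed.

```python
def downtime_budget_minutes(availability_pct, period_days=365):
    """Allowed downtime, in minutes, for an availability target over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


print(round(downtime_budget_minutes(99.99), 1))      # yearly budget: 52.6
print(round(downtime_budget_minutes(99.99, 30), 1))  # 30-day budget: 4.3
```

In other words, a 99.99% target leaves roughly 52.6 minutes of downtime per year, or about 4.3 minutes in any 30-day period.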

 

Conclusion

 

The innovations Datadog presented at DASH are redefining the role of technology—from risk to catalyst. From autonomous agents to unified observability and AI security, the Datadog platform enables organizations to turn complexity into strategic advantage.

 

Ready to empower your teams with the next generation of secure AI and total observability? Schedule a meeting or request a demo with our experts. Discover how your organization can turn today’s complexity into tomorrow’s competitive advantage.


What are ITSM processes?

ITIL version 4 recently went from recommending ITSM “processes” to introducing 34 ITSM “practices”. Their reasoning for this updated terminology is that “elements such as culture, technology, information and data management can be considered to get a holistic view of ways of working”. This more comprehensive approach better reflects the realities of modern organizations.

 

Here, we will not concern ourselves with nuanced differences in the use of practice or process terminology. What’s important and true, no matter what framework your team follows, is that modern IT service teams use organizational resources and follow repeatable procedures to deliver consistent and efficient service. In fact, leveraging practice or process is what distinguishes ITSM from IT.

Change management ensures standard procedures are used for efficient and prompt handling of all changes to IT infrastructure, whether it’s rolling out new services, managing existing ones, or resolving problems in the code. Effective change management provides context and transparency to avoid bottlenecks, while minimizing risk. Don’t feel overwhelmed by these and the even longer list of ITIL practices.

Problem management is the process of identifying and managing the causes of incidents on an IT service. Problem management isn’t just about finding and fixing incidents, but identifying and understanding the underlying causes of an incident as well as identifying the best method to eliminate the root causes.

Incident management is the process to respond to an unplanned event or service interruption and restore the service to its operational state. Considering all the software services organizations rely on today, there are more potential failure points than ever, so this process must be ready to quickly respond to and resolve issues.

IT asset management (also known as ITAM) is the process of ensuring an organization’s assets are accounted for, deployed, maintained, upgraded, and disposed of when the time comes. Put simply, it’s making sure that the valuable items, tangible and intangible, in your organization are tracked and being used.

Knowledge management is the process of creating, sharing, using, and managing the knowledge and information of an organization. It refers to a multidisciplinary approach to achieving organizational objectives by making the best use of knowledge.

Service request management is a repeatable procedure for handling the wide variety of customer service requests, like requests for access to applications, software enhancements, and hardware updates. The service request workstream often involves recurring requests, and benefits greatly from enabling customers with knowledge and automating certain tasks.

It’s simply not enough to have an ITSM solution – you need one that actually accelerates how your teams work.

Atlassian’s ITSM solution unlocks IT at high velocity by streamlining workflows across development and operations at scale. What were once many siloed teams with different ways of working are now integrated and more collaborative than ever before.

ITSM benefits your IT team, and service management principles can improve your entire organization. ITSM leads to efficiency and productivity gains. A structured approach to service management also brings IT into alignment with business goals, standardizing the delivery of services based on budgets, resources, and results. It reduces costs and risks, and ultimately improves the customer experience.