María Vaxeras
María Vaxeras, Cloud DevOps Engineer and Observability Expert, discusses the importance of observability, how it has evolved with the growing complexity of digital solutions, and the benefits it brings to companies and end users.
OBSERVABILITY AS A COMPETITIVE ADVANTAGE
Observability has become central to guaranteeing application operations and uninterrupted availability. It provides a real-time, wide-angle view of an application, making it possible to respond quickly in the event of an issue.
What is observability? How has it changed over time?
Observability refers to the ability to understand and monitor the internal state or condition of a system based on its external outputs. That is, its purpose is to provide clear and deep insight into what is happening inside an application or infrastructure in real time, allowing different technical teams to understand how it behaves, identify problems and then execute measures to improve its performance.
As application architectures have evolved into complex systems with numerous interconnected elements, the approach to monitoring them has had to change as well. The traditional model of reacting to an issue after it appears has given way to observability, which collects data from all the different components of an app and correlates it in real time. Should a problem occur, teams can respond before it affects the service provided to end users, minimizing both response time and the impact on the service.
One sign that the term "observability" has been gaining ground in recent years is the Gartner Magic Quadrant for Application Performance Monitoring and Observability, which evaluates providers that offer observability-based products and solutions.
End-to-end observability
Today technology plays a key role: applications are built on microservices and deployed through CI/CD pipelines, and observability has become central to guaranteeing application operations and uninterrupted availability. It provides a real-time, wide-angle view of an application, making it possible to respond quickly to an issue in any part of its ecosystem.
To achieve the best results, at Holcim we apply end-to-end observability through the following elements:
- APM (application performance monitoring) collects information about application execution in real time. It lets users visualize how the different elements are interconnected, identify which one is the bottleneck and see which errors or exceptions are occurring.
- Synthetic monitoring makes it possible to simulate end-user workflows in an application interface so that a potential issue can be detected before the user executes the workflow, thus minimizing the MTTD (mean time to detect).
- RUM (real user monitoring) serves to analyze application performance as perceived by individual users.
- Machine learning makes it possible to detect anomalous trends in the way a given application performance metric behaves and to react before the problem occurs.
- Using dashboards makes it possible to evaluate the evolution of the different metrics recorded along the end-to-end path in real time.
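As a rough illustration of the anomaly-detection element above, the following minimal sketch flags metric samples that deviate sharply from their recent history. It uses a simple trailing-window z-score as a stand-in for a machine learning model; it is not the method used at Holcim. The function name `find_anomalies`, the window size, the threshold and the sample latencies are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indices of samples that deviate from the mean of the
    preceding `window` samples by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as an anomaly.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical response times in milliseconds, with one spike at index 7.
latencies_ms = [120, 118, 125, 122, 119, 121, 123, 310, 120, 122]
print(find_anomalies(latencies_ms))  # [7]
```

A production system would typically account for seasonality and trends as well, which is where trained models tend to outperform a fixed threshold like this one.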
One example of the benefit of applying observability is the set of applications involved in the EMEA Holcim Cash Cycle (order taking, credit, pricing, offer, logistics, dispatching, O2C, etc.), where it has made it possible to achieve 99.99% availability over the past year. We continue working along these lines with the ambition of guaranteeing 100% availability of our applications to our clients.
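To put the availability figure in perspective, the short sketch below converts an availability percentage into the downtime it allows over a period. The function name `downtime_budget_minutes` and the 365-day year are assumptions made for the illustration.

```python
def downtime_budget_minutes(availability_pct, period_days=365):
    """Return the maximum downtime, in minutes, that still meets the
    given availability percentage over `period_days`."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# 99.99% availability over one year leaves roughly 52.56 minutes of downtime.
print(round(downtime_budget_minutes(99.99), 2))  # 52.56
```

Under these assumptions, 99.99% availability corresponds to less than one hour of accumulated downtime per year across the applications measured.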