
KUBICAST 176 - AI + DevOps & Machine Learning

Let's put these machines to work for us and learn what they can do to help us in our daily lives

mansplainer

João Brito

In this episode of Kubicast, we welcome Daniel Romeiro (Infoslack) for an intense conversation about how Artificial Intelligence and DevOps are colliding and complementing each other in the software engineering world. Instead of rehashing the AI hype, we took a practical approach: we talked about the reverse filter, tested concepts in real deployments, and discussed the metrics that truly matter in Machine Learning pipelines.

The first part of the episode addresses the reverse filter in AI: how to separate signal from noise, avoid ready-made solutions that deliver no value, and define your own criteria for evaluating models. Then we reflect on career paths. Daniel shares how he built up his DevOps skills, from CI/CD configuration to Kubernetes cluster management, without losing focus on code quality.

In the second half, we dive into production engineering for ML models. We debate deployment strategies: containers, orchestration, and pipeline automation. Afterwards, we delve into advanced observability: structured logs, custom metrics, and dashboards that let you visualize the performance and health of models in production in real time.

Highlighted Chapters

  • Model Deployment: we cover packaging options, integration tests, and safe rollback.

  • AI Observability: we discuss how to create key indicators, latency probes, and alerts that do not generate noise.

  • ML Architectures: we analyze microservice patterns versus monolithic architectures for AI workloads.

Practical Insights

  1. Pipeline Automation: how to set up a CI/CD flow for models, from training to deployment, including real-time A/B testing.

  2. Artifact Management: versioning of datasets and models, dependency control, and retention policies.

  3. Resilience and Latency: fallback strategies when external services fail and optimizations to reduce jitter in inference.
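
As a rough illustration of the fallback idea in item 3, here is a minimal Python sketch. It is not taken from the episode: the function names `predict_with_fallback`, `flaky_remote_model`, and `local_baseline` are hypothetical, and the "external service" is simulated by a function that always raises. The wrapper tries the primary model, switches to a local baseline on failure, and records the call latency so it can be exported as a metric.

```python
import time
from typing import Callable, Tuple


def predict_with_fallback(
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    text: str,
) -> Tuple[str, str, float]:
    """Try the primary model; if it raises, answer with the fallback instead.

    Returns (prediction, source, latency_ms), where source is either
    "primary" or "fallback".
    """
    start = time.perf_counter()
    try:
        result, source = primary(text), "primary"
    except Exception:
        # Any failure of the external service routes to the fallback.
        result, source = fallback(text), "fallback"
    latency_ms = (time.perf_counter() - start) * 1000.0
    return result, source, latency_ms


def flaky_remote_model(text: str) -> str:
    # Hypothetical stand-in for an external inference service that is down.
    raise ConnectionError("external inference service unavailable")


def local_baseline(text: str) -> str:
    # Hypothetical cheap local answer used while the service is unavailable.
    return "unknown"


result, source, latency_ms = predict_with_fallback(
    flaky_remote_model, local_baseline, "hello"
)
print(result, source)
```

In a real pipeline the `source` label and `latency_ms` value would typically feed the custom metrics and alerts discussed under observability, so that a rising share of fallback answers is visible on a dashboard.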

Future Perspectives

We conclude by discussing the role of SREs in a world dominated by generative AI. Daniel bets on an increase in demand for professionals capable of designing scalable and secure architectures, with special attention to DevSecOps in data pipelines.

If you are curious about how to unite AI, DevOps, and Machine Learning in your daily routine, this episode brings a complete roadmap: from model conception to ensuring availability in production.



Join our early access program and get a safer environment in moments! https://getup.io/zerocve


🎧 Listen to Kubicast on Spotify too, and share it with the whole DevOps crowd that has no CREA registration but does engineering anyway!
