Introduction

The practical handbook for infrastructure engineers

“You don’t need to be a data scientist to work with AI, but you do need to understand how it runs, scales, and is observed.”

AI for Infra Pros

About this project

As someone who spent years in infrastructure before transitioning into AI, I experienced firsthand how challenging it can be to bridge the gap between systems, compute, and machine learning. This repository, AI for Infra Pros, was born from that journey. It documents the exact steps, labs, and concepts I wish I had when starting out, helping other infrastructure engineers confidently navigate their AI learning path. Each chapter connects real infrastructure knowledge to AI foundations, tools, and workloads on Azure and beyond.

Overview

AI for Infra Pros is a practical and technical handbook designed to help infrastructure, cloud, and DevOps professionals understand and apply Artificial Intelligence in a secure, efficient, and scalable way.

This repository combines foundational knowledge, best practices, and real-world examples to turn infrastructure expertise into a competitive advantage in the AI era.

What you’ll find

  • AI fundamentals explained from an infrastructure perspective

  • Architecture and automation models using Bicep, Terraform, and YAML

  • Hands-on mini-labs with AKS, GPU VMs, and Azure Machine Learning

  • Strategies for monitoring, security, and resilience in AI workloads

  • Visual glossary comparing key Infrastructure vs. AI concepts

  • An AI Adoption Framework tailored for technical teams

Chapter list

Below is a quick list of all chapters included in this handbook. Each chapter is self-contained and can be read independently:

  • Chapter 1 – AI fundamentals

  • Chapter 2 – Data: The fuel of AI

  • Chapter 3 – Infrastructure and compute for AI

  • Chapter 4 – IaC and automation

  • Chapter 5 – Monitoring and observability

  • Chapter 6 – Security in AI environments

  • Chapter 7 – AI use cases for infrastructure engineers

  • Chapter 8 – AI adoption framework

  • Chapter 9 – Azure OpenAI: TPM, RPM, and PTU

  • Chapter 10 – Visual glossary

How to navigate

Each chapter is designed to be independent and complementary. If you want to start quickly:

Goal
Where to start

Understand how AI connects to Infrastructure

Build an AI environment

Measure and observe AI workloads

Ensure security and resilience

Get hands-on experience

Translate terms and concepts


Repository structure

ai-for-infra-pros/
├── docs/
│   ├── chapters/
│   │   ├── 01-introduction.md
│   │   ├── 02-data.md
│   │   ├── 03-compute.md
│   │   ├── 04-iac.md
│   │   ├── 05-monitoring.md
│   │   ├── 06-security.md
│   │   ├── 07-use-cases.md
│   │   ├── 08-adoption-framework.md
│   │   ├── 09-azure-openai-tpm-ptu.md
│   │   └── 10-visual-glossary.md
│   ├── extras/
│   │   ├── labs/
│   │   │   ├── bicep-vm-gpu/
│   │   │   │   └── README.md
│   │   │   ├── terraform-aks-gpu/
│   │   │   │   └── README.md
│   │   │   └── yaml-inference-api/
│   │   │       └── README.md
│   │   ├── case-studies.md
│   │   ├── cheatsheets.md
│   │   └── technical-faq.md
│   ├── images/
│   │   ├── infrastructure-flow.png
│   │   ├── model-life-cycle.png
│   │   └── relationship-tpm-qps-cost.png
│   └── README.md
├── README.md
└── SUMMARY.md

Target audience

Professionals in:

  • Infrastructure and Cloud (Azure, AWS, GCP)

  • DevOps and SRE

  • Solutions Architecture

  • Security and Governance

  • Data Engineering professionals who want to understand the infrastructure side of AI

Mission

To turn infrastructure knowledge into an advantage in the era of Artificial Intelligence. To show that you don’t need to be a data scientist to work with AI, but you do need to understand how it runs, scales, and is observed.

Extra resources

Credits

Created by Ricardo Martins 📍 Principal Solutions Engineer @ Microsoft 📖 Author of Azure Governance Made Simple and Linux Hackathon 🌐 rmmartins.com

“AI needs infrastructure. And infrastructure needs to understand AI.”

Last updated