# Introduction

### The practical handbook for infrastructure engineers

> “You don’t need to be a data scientist to work with AI — but you do need to understand how it runs, scales, and is observed.”

![AI for Infra Pros](https://907582225-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYKsakaXMV3jPqzN0DHCB%2Fuploads%2Fgit-blob-73ce1243badc5e71350cdc3b5c9d6a153288783a%2Fai4infrapros.png?alt=media)

### About this project

As someone who spent years in infrastructure before transitioning into AI, I experienced firsthand how challenging it can be to bridge the gap between systems, compute, and machine learning. This repository, **AI for Infra Pros**, was born from that journey. It documents the steps, labs, and concepts I wish I had when starting out, to help other infrastructure engineers navigate their AI learning path with confidence. Each chapter connects real infrastructure knowledge to AI foundations, tools, and workloads on Azure and beyond.

## Overview

**AI for Infra Pros** is a practical and technical handbook designed to help infrastructure, cloud, and DevOps professionals understand and apply **Artificial Intelligence in a secure, efficient, and scalable way**.

This repository combines foundational knowledge, best practices, and real-world examples to turn infrastructure expertise into a competitive advantage in the AI era.

## What you’ll find

* AI fundamentals explained from an infrastructure perspective
* Architecture and automation models using **Bicep**, **Terraform**, and **YAML**
* Hands-on mini-labs with **AKS**, **GPU VMs**, and **Azure Machine Learning**
* Strategies for **monitoring, security, and resilience** in AI workloads
* Visual glossary comparing key **Infrastructure vs. AI** concepts
* An **AI Adoption Framework** tailored for technical teams

## Chapter list

Below is a quick list of all chapters included in this handbook. Each chapter is self-contained and can be read independently:

* Chapter 1 – AI fundamentals: [link](https://www.ai4infra.com/main-chapters/01-introduction)
* Chapter 2 – Data: The fuel of AI: [link](https://www.ai4infra.com/main-chapters/02-data)
* Chapter 3 – Infrastructure and compute for AI: [link](https://www.ai4infra.com/main-chapters/03-compute)
* Chapter 4 – IaC and automation: [link](https://www.ai4infra.com/main-chapters/04-iac)
* Chapter 5 – Monitoring and observability: [link](https://www.ai4infra.com/main-chapters/05-monitoring)
* Chapter 6 – Security in AI environments: [link](https://www.ai4infra.com/main-chapters/06-security)
* Chapter 7 – AI use cases for infrastructure engineers: [link](https://www.ai4infra.com/main-chapters/07-use-cases)
* Chapter 8 – AI adoption framework: [link](https://www.ai4infra.com/main-chapters/08-adoption-framework)
* Chapter 9 – Azure OpenAI: TPM, RPM, and PTU: [link](https://www.ai4infra.com/main-chapters/09-azure-openai-tpm-ptu)
* Chapter 10 – Visual glossary: [link](https://www.ai4infra.com/main-chapters/10-visual-glossary)

## How to navigate

Each chapter is designed to be **independent and complementary**.\
If you want to start quickly:

| Goal                                         | Where to start                                                                                   |
| -------------------------------------------- | ------------------------------------------------------------------------------------------------ |
| Understand how AI connects to Infrastructure | [Chapter 1 – AI fundamentals](https://www.ai4infra.com/main-chapters/01-introduction)            |
| Build an AI environment                      | [Chapter 4 – IaC and automation](https://www.ai4infra.com/main-chapters/04-iac)                  |
| Measure and observe AI workloads             | [Chapter 5 – Monitoring and observability](https://www.ai4infra.com/main-chapters/05-monitoring) |
| Ensure security and resilience               | [Chapter 6 – Security in AI environments](https://www.ai4infra.com/main-chapters/06-security)    |
| Get hands-on experience                      | [Mini-labs](https://www.ai4infra.com/hands-on-labs/labs)                                         |
| Translate terms and concepts                 | [Visual glossary](https://www.ai4infra.com/main-chapters/10-visual-glossary)                     |

***

## Repository structure

```
ai-for-infra-pros/
├── docs/
│   ├── chapters/
│   │   ├── 01-introduction.md
│   │   ├── 02-data.md
│   │   ├── 03-compute.md
│   │   ├── 04-iac.md
│   │   ├── 05-monitoring.md
│   │   ├── 06-security.md
│   │   ├── 07-use-cases.md
│   │   ├── 08-adoption-framework.md
│   │   ├── 09-azure-openai-tpm-ptu.md
│   │   └── 10-visual-glossary.md
│   ├── extras/
│   │   ├── labs/
│   │   │   ├── bicep-vm-gpu/
│   │   │   │   └── README.md
│   │   │   ├── terraform-aks-gpu/
│   │   │   │   └── README.md
│   │   │   └── yaml-inference-api/
│   │   │       └── README.md
│   │   ├── case-studies.md
│   │   ├── cheatsheets.md
│   │   └── technical-faq.md
│   └── images/
│       ├── ai4infrapros.png
│       ├── aiforinfracon.png
│       ├── designing-ha.png
│       ├── example-architecture.png
│       ├── framework-structure.png
│       ├── full-training-pipeline.png
│       ├── infrastructure-flow.png
│       ├── model-life-cycle.png
│       ├── predictive-analysis.png
│       ├── relationship-tpm-qps-cost.png
│       ├── simple-training-pipeline.png
│       ├── typical-architecture-ptu.png
│       └── where-to-run.png
├── README.md
└── SUMMARY.md
```

## Target audience

Professionals in:

* Infrastructure and Cloud (Azure, AWS, GCP)
* DevOps and SRE
* Solutions Architecture
* Security and Governance
* Data Engineering, for those who want to understand the infrastructure side of AI

## Mission

To turn infrastructure knowledge into an advantage in the era of Artificial Intelligence.\
To show that **you don’t need to be a data scientist to work with AI — but you do need to understand how it runs, scales, and is observed.**

## Extra resources

* [Infrastructure mini-labs for AI](https://www.ai4infra.com/hands-on-labs/labs)
* [Technical Case Studies](https://www.ai4infra.com/extras-and-reference-material/case-studies)
* [Cheatsheets](https://www.ai4infra.com/extras-and-reference-material/cheatsheets)
* [Technical FAQ](https://www.ai4infra.com/extras-and-reference-material/technical-faq)

## Credits

Created by **Ricardo Martins**\
📍 Principal Solutions Engineer @ Microsoft\
📖 Author of [*Azure Governance Made Simple*](https://book.azgovernance.com/) and [*Linux Hackathon*](https://linuxhackathon.com/)\
🌐 [rmmartins.com](https://rmmartins.com)

> *“AI needs infrastructure. And infrastructure needs to understand AI.”*
