Blog

🔧 How I Used Claude Code to Deploy a Security Scan Across Many Azure VMs

Sometimes the best way to learn a new Azure feature is to have an AI agent explain it to you while you're under pressure to deliver.

I'd been asked to deploy a third-party security scanning agent across our Azure VM estate. Should have been straightforward — except the usual deployment routes, GPO and Intune, both fell flat for different reasons. I was left without an obvious path forward. Rather than spend hours trawling through documentation for something I might not even find, I opened Claude Code and described the problem. What came back was an Azure feature I'd barely touched before, and within half a day the whole thing was done.

🔐 Ditching Storage Account Keys: OAuth and Managed Identity for Azure Files REST API

TL;DR

  • Managed identities can authenticate to Azure Files via REST API using OAuth tokens — no storage account keys required
  • ⚠️ The x-ms-file-request-intent: backup header is mandatory — without it, all OAuth requests return HTTP 400
  • 🎯 For OAuth-based access over the Azure Files REST API, assign the Storage File Data Privileged Reader or Storage File Data Privileged Contributor role, scoped appropriately (for example, at the file share level). For SMB access, use the dedicated Storage File Data SMB Share roles instead.
  • 🕐 OAuth tokens expire after ~1 hour — implement caching and proactive refresh
  • 📦 No additional SMB OAuth configuration is required on the storage account when using OAuth authentication over the REST API.

OAuth-based REST access can be introduced alongside existing Shared Key or SAS usage during migration.
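As a rough illustration of the TL;DR points above, here is a minimal Python sketch of token caching with proactive refresh, plus the request headers an OAuth call to the Azure Files REST API would carry. The names `CachedToken`, `files_headers`, and the `fetch` callable are hypothetical and not from the original post. The 5-minute refresh margin and the `2022-11-02` API version are assumptions, so check them against the current Azure documentation before relying on them.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Assumption: a token provider returns (access_token, expires_at_epoch_seconds).
TokenProvider = Callable[[], Tuple[str, float]]


class CachedToken:
    """Caches an OAuth token and refreshes it shortly before it expires."""

    def __init__(
        self,
        fetch: TokenProvider,
        refresh_margin: float = 300.0,  # assumed margin: refresh 5 minutes early
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._margin = refresh_margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Fetch on first use, or when the token is inside the refresh margin.
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, self._expires_at = self._fetch()
        return self._token


def files_headers(token: str, api_version: str = "2022-11-02") -> Dict[str, str]:
    """Builds headers for an OAuth request to the Azure Files REST API."""
    return {
        "Authorization": f"Bearer {token}",
        "x-ms-version": api_version,  # assumed version; verify for your account
        "x-ms-file-request-intent": "backup",  # mandatory for OAuth requests
    }
```

In practice, the `fetch` callable could wrap something like `ManagedIdentityCredential().get_token("https://storage.azure.com/.default")` from the `azure-identity` package, returning the token string and its `expires_on` value. Injecting the provider and the clock keeps the cache easy to test without touching Azure.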

📚 From PDF Overload to AI Clarity: Building an AI RAG Assistant

Introduction

If you’ve ever tried to dig a single obscure fact out of a massive technical manual, you’ll know the frustration 😩: you know it’s in there somewhere, you just can’t remember the exact wording, cmdlet name, or property that will get you there.

For me, this pain point came from Office 365 for IT Pros — a constantly updated, encyclopaedic PDF covering Microsoft cloud administration. It’s a superb resource… but not exactly quick to search when you can’t remember the magic keyword.
Often I know exactly what I want to achieve — say, add copies of sent emails to the sender’s mailbox when using a shared mailbox — but I can’t quite recall the right cmdlet or property to Ctrl+F my way to the answer.

That’s when I thought: what if I could take this PDF (and others in my archive), drop them into a centralised app, and use AI as the conductor and translator 🎼🤖 to retrieve the exact piece of information I need — just by asking naturally in plain English.

This project also doubled as a test bed for Claude Code, which I’d been using since recently completing a GenAI Bootcamp 🚀.
I wanted to see how it fared when building something from scratch in an IDE, rather than in a chat window.

👉 In this post, I’ll give a very high-level overview of the four iterations (v1–v4): what worked, what failed, and what I learned along the way.

🔥 Vibe Coding My Way to AI Connected Infra: Claude, Terraform & Cloud-Native Monitoring

📖 TL;DR – What This Post Covers

  • How I used AI tools to build an Azure-based monitoring solution from scratch
  • Lessons learned from developing two full versions (manual vs. Terraform)
  • The good, bad, and wandering of GenAI for infrastructure engineers
  • A working, cost-effective, and fully redeployable AI monitoring stack

Introduction

This project began, as many of mine do, with a career planning conversation. During a discussion with ChatGPT about professional development and emerging skill areas for 2025, one suggestion stuck with me:

"You should become an Infrastructure AI Integration Engineer."

It’s a role that doesn’t really exist yet — but probably should.

What followed was a journey to explore whether such a role could be real. I set out to build an AI-powered infrastructure monitoring solution in Azure, without any formal development background and using nothing but conversations with Claude. This wasn’t just about building something cool — it was about testing whether a seasoned infra engineer could:

  • Use GenAI to design and deploy a full solution
  • Embrace the unknown and lean into the chaos of LLM-based workflows

🍓 Building AI-Powered Infrastructure Monitoring: From Home Lab to Cloud Production

After successfully diving into AI automation with n8n (and surviving the OAuth battles), I decided to tackle a more ambitious learning project: exploring how to integrate AI into infrastructure monitoring systems. The goal was to understand how AI can transform traditional monitoring from simple threshold alerts into intelligent analysis that provides actionable insights—all while experimenting in a safe home lab environment before applying these concepts to production cloud infrastructure.

What you'll discover in this post:

  • Complete monitoring stack deployment using Docker Compose
  • Prometheus and Grafana setup for metrics collection
  • n8n workflow automation for data processing and AI analysis
  • Azure OpenAI integration for intelligent infrastructure insights
  • Professional email reporting with HTML templates
  • Lessons learned for transitioning to production cloud environments
  • Practical skills for integrating AI into traditional monitoring workflows

Here's how I built a home lab monitoring system to explore AI integration patterns that can be applied to production cloud infrastructure.

🤖 First Steps into AI Automation: My Journey from Trial to Self-Hosted Chaos

What started as 'let me just automate some emails' somehow turned into a comprehensive exploration of every AI automation platform and deployment method known to mankind...

After months of reading about AI automation tools and watching everyone else's productivity skyrocket with clever workflows, I finally decided to stop being a spectator and dive in myself. What started as a simple "let's automate job alert emails" experiment quickly became a week-long journey through cloud trials, self-hosted deployments, OAuth authentication battles, and enough Docker containers to power a small data centre.

In this post, you'll discover:

  • Real costs of AI automation experimentation ($10-50 range)
  • Why self-hosted OAuth2 is significantly harder than cloud versions
  • Performance differences: Pi 5 vs. desktop hardware for local AI
  • When to choose local vs. cloud AI models
  • Time investment reality: ~10 hours over 1 week for this project

Here's how my first real foray into AI automation unfolded — spoiler alert: it involved more container migrations than I initially planned.

🔄 Bringing Patch Management In-House: Migrating from MSP to Azure Update Manager

It's all fun and games until the MSP contract expires and you realise 90 VMs still need their patching schedules sorted…

With our MSP contract winding down, the time had come to bring VM patching back in-house. Our third-party provider had been handling it with their own tooling, which would no longer be available once the service contract expired.

Enter Azure Update Manager — the modern, agentless way to manage patching schedules across your Azure VMs. Add a bit of PowerShell, sprinkle in some Azure Policy, and you've got yourself a scalable, policy-driven solution that's more visible, auditable, and way more maintainable.

Here's how I made the switch — and managed to avoid a patching panic.

⚙️ Azure BCDR Review – Turning Inherited Cloud Infrastructure into a Resilient Recovery Strategy

When we inherited our Azure estate from a previous MSP, some of the key technical components were already in place — ASR was configured for a number of workloads, and backups had been partially implemented across the environment.

What we didn’t inherit was a documented or validated BCDR strategy.

There were no formal recovery plans defined in ASR, no clear failover sequences, and no evidence that a regional outage scenario had ever been modelled or tested. The building blocks were there — but there was no framework tying them together into a usable or supportable recovery posture.

This post shares how I approached the challenge of assessing and strengthening our Azure BCDR readiness. It's not about starting from scratch — it's about applying structure, logic, and realism to an environment that had the right intentions but lacked operational clarity.

Whether you're stepping into a similar setup or planning your first formal DR review, I hope this provides a practical and relatable blueprint.

🧾 Azure BCDR – How I Turned a DR Review into a Strategic Recovery Plan

In Part 1 of this series, I shared how we reviewed our Azure BCDR posture after inheriting a partially implemented cloud estate. The findings were clear: while the right tools were in place, the operational side of disaster recovery hadn’t been addressed.

There were no test failovers, no documented Recovery Plans, no automation, and several blind spots in DNS, storage, and private access.

This post outlines how I took that review and turned it into a practical recovery strategy — one that we could share internally, align with our CTO, and use as a foundation for further work with our support partner.

To provide context, our estate is deployed primarily in the UK South Azure region, with UK West serving as the designated DR target region.

It’s not a template — it’s a repeatable, real-world approach to structuring a BCDR plan when you’re starting from inherited infrastructure, not a clean slate.

💰 Saving Azure Costs with Scheduled VM Start/Stop using Custom Azure Automation Runbooks

As part of my ongoing commitment to FinOps practices, I've implemented several strategies to embed cost-efficiency into the way we manage cloud infrastructure. One proven tactic is scheduling virtual machines to shut down during idle periods, avoiding unnecessary spend.

In this post, I’ll share how I’ve built out custom Azure Automation jobs to schedule VM start and stop operations. Rather than relying on Microsoft’s pre-packaged solution, I’ve developed a streamlined, purpose-built PowerShell implementation that provides maximum flexibility, transparency, and control.