Microsoft Foundry Explained: Hub, Project & Agent Service Architecture

If there's one thing you take from the AI-103 course, let it be this: Foundry isn't just where you deploy models — it's the entire platform where agents live, breathe, get their identity, and get monitored. Understanding Foundry at the structural level answers a huge percentage of exam questions before you even read them carefully.

What Foundry Replaced

Not long ago, building a production AI agent on Azure meant stitching together five or six separate services yourself. You'd provision Azure OpenAI for the model, App Service or Container Apps to host the agent code, Cosmos DB for memory, Azure Monitor for observability, and Azure Active Directory (now Microsoft Entra ID) for identity management. Then you'd write all the connection logic, error handling, and retry code to make them talk to each other.

Foundry eliminates that integration work. It's a single Azure service that brings model deployment, agent hosting, identity, safety, and observability under one roof. You focus on what the agent should do — Foundry handles how it runs.

The Three-Layer Hierarchy

Everything in Foundry sits in a three-level structure. Getting this clear in your head is essential, because exam questions will probe exactly which layer does what.

Layer	Managed by	What lives here	Common exam question
Hub	You	Multiple projects, shared models, security policies, cost boundaries	"How do you isolate two departments' agents?" → separate Hubs
Project	You	Agent code, system messages, tool definitions, memory config, trace logs	"Where is agent code stored?" → Project
Agent Service	Foundry (managed)	Deployed running agents, endpoints, auto-scaling, health monitoring	"Which layer serves live users?" → Agent Service

Memorise this: Agent code is stored in the Project. The Agent Service is what runs that code and serves users. The Hub is the governance container above it all. These three questions appear constantly in the exam.

A Hub is how you create organisational separation. One Azure subscription can have multiple Hubs — one per department, one per region, one per compliance boundary. Everything inside a Hub shares its network configuration and access policies. A Hub can hold multiple Projects.

A Project is your workspace. It's where you write system messages, configure tools, set memory parameters, and review trace logs from previous runs. Think of it as the developer workbench where you build and test before deploying. Once your agent is ready, you deploy it from the Project to the Agent Service, which is the managed runtime that your actual end users connect to.

The building analogy: A Hub is a building — it has one address (network), one security desk (access policy), and shared utilities (billing). A Project is a floor in that building — separate workspace, separate config. The Agent Service is the publicly accessible lobby where visitors actually walk in.
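To make the containment relationships concrete, here is a minimal sketch of the hierarchy as plain Python data classes. It is a mental model only, not the Foundry SDK: the class names, the `deploy` method, and the `example.invalid` endpoint format are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Workspace that stores agent definitions (code, system messages, tool config)."""
    name: str
    agent_definitions: list[str] = field(default_factory=list)


@dataclass
class Hub:
    """Governance container: one network and access policy shared by its Projects."""
    name: str
    network: str
    projects: dict[str, Project] = field(default_factory=dict)

    def add_project(self, name: str) -> Project:
        project = Project(name)
        self.projects[name] = project
        return project


@dataclass
class AgentService:
    """Managed runtime: maps deployed agents to the endpoints users call."""
    deployed: dict[str, str] = field(default_factory=dict)

    def deploy(self, hub: Hub, project: Project, agent: str) -> str:
        # Only agents defined in the Project can be deployed.
        if agent not in project.agent_definitions:
            raise ValueError(f"{agent!r} is not defined in project {project.name!r}")
        # Hypothetical endpoint format, for illustration only.
        endpoint = f"https://{hub.name}.example.invalid/{project.name}/{agent}"
        self.deployed[agent] = endpoint
        return endpoint
```

The point of the sketch is the direction of ownership: the definition lives in the Project, the Hub scopes it, and only the Agent Service hands out a live endpoint.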

The Model Catalog

Foundry ships with a model catalog: a library of pre-trained models you can browse and deploy. It includes OpenAI models (GPT-4o, GPT-5), Microsoft's own Phi series for smaller, faster inference, and various third-party options. When you deploy a model from the catalog, you're not sharing capacity with anyone else. You reserve a dedicated copy with its own endpoint URL and API key. That isolation is what makes latency predictable at scale.

One model can serve multiple agents within the same Hub, and a single agent can use multiple deployed models — routing to a cheaper, faster model for simple tasks and escalating to a more capable one for complex reasoning.
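The routing idea can be sketched as a small function that picks a deployment per request. This is a deliberately naive heuristic under stated assumptions: the deployment names, the `example.invalid` URLs, the keyword list, and the word-count threshold are all invented for illustration, and a real router would likely use a classifier or the cheaper model itself to decide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    name: str
    endpoint: str


# Hypothetical deployments; neither name nor URL comes from Foundry.
FAST = Deployment("small-fast", "https://example.invalid/deployments/small-fast")
CAPABLE = Deployment("large-capable", "https://example.invalid/deployments/large-capable")

# Crude substring signals that a prompt needs multi-step reasoning.
REASONING_HINTS = ("why", "compare", "plan", "analyse", "analyze", "step by step")


def choose_deployment(prompt: str, max_simple_words: int = 40) -> Deployment:
    """Send short, simple prompts to the cheap model and escalate the rest."""
    text = prompt.lower()
    too_long = len(text.split()) > max_simple_words
    has_hint = any(hint in text for hint in REASONING_HINTS)
    return CAPABLE if (too_long or has_hint) else FAST
```

An agent would call `choose_deployment(user_message)` before each model call and send the request to the returned endpoint.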

Endpoints, API Keys, and How to Store Them

When you deploy a model or an agent, Foundry gives you an HTTPS endpoint URL. That's the address your code uses to reach it. Every endpoint also has an associated API key — a secret string that must be included in every request header. Without it, calls get rejected with a 401.
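As a rough sketch of what "included in every request header" means, the function below builds an authenticated request with the standard library without sending it. The header name `api-key` is the one Azure OpenAI endpoints use and is assumed here; the payload shape and the `example.invalid` URL are placeholders, so check the documentation for the specific endpoint you deploy.

```python
import json
import urllib.request


def build_request(endpoint: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request."""
    if not api_key:
        # Without a key the service would reject the call with a 401.
        raise ValueError("api_key is required")
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "api-key": api_key,  # assumed header name
        },
        method="POST",
    )
```

The `api_key` argument should come from an environment variable or Key Vault at runtime, never from a literal in the source file.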

The golden rule: never put API keys in source code. Not in a string literal, not in a comment, not anywhere a version control system might capture them. Use environment variables or Azure Key Vault.

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# In production: retrieve keys from Key Vault at runtime
credential = DefaultAzureCredential()  # Uses managed identity on Azure
kv = SecretClient(vault_url="https://my-vault.vault.azure.net/", credential=credential)

# Secret names are examples; use whatever names you stored them under
endpoint = kv.get_secret("foundry-endpoint").value
api_key  = kv.get_secret("foundry-api-key").value

# DefaultAzureCredential tries these in order (simplified; the full
# chain has more steps and varies by SDK language and version):
# 1. AZURE_TENANT_ID / AZURE_CLIENT_ID / AZURE_CLIENT_SECRET env vars
# 2. Managed Identity (if running on Azure)
# 3. Visual Studio / VS Code login (where the SDK supports it)
# 4. Azure CLI login

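DefaultAzureCredential authentication order matters for the exam. In simplified form: environment variables first, then Managed Identity, then Visual Studio or VS Code login, then Azure CLI. The full chain has additional steps and differs slightly between SDK languages and versions. In production on Azure, step 2 (Managed Identity) kicks in automatically, so there are no secrets in code at all.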
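For local development, the environment-variable option from the golden rule might look like the sketch below. The variable names `FOUNDRY_ENDPOINT` and `FOUNDRY_API_KEY` are assumptions chosen for this example, not names Foundry defines.

```python
import os
from typing import Mapping


def load_foundry_settings(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Read the endpoint and key from the environment and fail fast if absent."""
    required = ("FOUNDRY_ENDPOINT", "FOUNDRY_API_KEY")  # hypothetical names
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["FOUNDRY_ENDPOINT"], env["FOUNDRY_API_KEY"]
```

Failing at startup with a clear message is usually kinder than discovering a missing key on the first 401.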

Agent Identity: Blueprints, Sponsors, and Conditional Access

Before Foundry, agents ran under the identity of whatever Azure service hosted them, or even the developer's own credentials. That made auditing messy and access control hard. Foundry introduces Entra Agent ID, giving every agent its own unique identity in Microsoft Entra ID (formerly Azure AD), at the same tier as a human user.

This matters for three reasons: you can audit agent actions separately from human actions, you can grant agents exactly the permissions they need (and no more), and you can assign a human sponsor who is accountable for the agent's behaviour.

A Blueprint is a reusable permissions template. Instead of manually granting the same RBAC roles to every agent you deploy, you define them once in a Blueprint and apply it to many agents. The real power: update the Blueprint and all agents using it automatically receive the updated permissions — no redeployment needed.
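One way to picture why a Blueprint update reaches every agent without redeployment is that agents hold a reference to the Blueprint rather than a copy of its roles. The classes below are a conceptual sketch only; they are not Foundry or Entra APIs, and the role names are just examples.

```python
from dataclasses import dataclass, field


@dataclass
class Blueprint:
    """Reusable permissions template (conceptual model)."""
    name: str
    roles: set[str] = field(default_factory=set)


@dataclass
class Agent:
    name: str
    blueprint: Blueprint  # shared reference, not a copy of the roles

    def can(self, role: str) -> bool:
        # Permissions are looked up on the Blueprint at check time.
        return role in self.blueprint.roles


support = Blueprint("support", {"Storage Blob Data Reader"})
agents = [Agent("faq-bot", support), Agent("refund-bot", support)]

# Updating the Blueprint once changes what every attached agent may do.
support.roles.add("Key Vault Secrets User")
```

Because each check reads the current Blueprint, both agents gain the new role at the moment it is added.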

A Sponsor is the human responsible for an agent. Every agent must have one. If a sponsor's Entra account is deleted (they leave the company, for example), the agent enters a 24-hour grace window before its permissions are suspended. After that grace window, the agent throws an error — "agent suspended, no sponsor assigned" — until a new sponsor is assigned.
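The sponsor lifecycle described above amounts to a three-state check. The sketch below models it with the 24-hour window taken from the text; the state names, the `AgentRecord` class, and its fields are hypothetical, and the platform's actual implementation is not public in this form.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

GRACE_WINDOW = timedelta(hours=24)  # figure taken from the text above


@dataclass
class AgentRecord:
    name: str
    sponsor: Optional[str] = None
    sponsor_removed_at: Optional[datetime] = None


def agent_status(agent: AgentRecord, now: datetime) -> str:
    """Return 'active', 'grace', or 'suspended' for the given moment."""
    if agent.sponsor is not None:
        return "active"
    removed = agent.sponsor_removed_at
    if removed is not None and now - removed < GRACE_WINDOW:
        return "grace"
    return "suspended"
```

Assigning a new sponsor simply sets `sponsor` again, which returns the agent to the active state.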

Conditional Access extends the same policy engine used for human employees to agents. You can write rules like "this agent may only access customer data between 8am and 6pm" or "this agent may only run from approved IP ranges." These are enforced at the platform level, outside your agent code.
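To illustrate the kind of rule being evaluated, here is a small sketch that combines the two example conditions: a time window and approved IP ranges. In Foundry these checks run at the platform level, so you would not write this in agent code; the function, the constants, and the network ranges are invented purely to show the logic.

```python
from datetime import time
from ipaddress import ip_address, ip_network

# Hypothetical policy values for illustration.
ALLOWED_HOURS = (time(8, 0), time(18, 0))  # 8am inclusive to 6pm exclusive
APPROVED_NETWORKS = [ip_network("10.20.0.0/16"), ip_network("203.0.113.0/24")]


def access_allowed(local_time: time, source_ip: str) -> bool:
    """Allow access only inside the time window and from an approved range."""
    start, end = ALLOWED_HOURS
    in_hours = start <= local_time < end
    address = ip_address(source_ip)
    in_network = any(address in network for network in APPROVED_NETWORKS)
    return in_hours and in_network
```

Both conditions must hold, which mirrors how a valid identity alone is not enough once a policy applies.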

Put it together: The Entra Agent ID is the employee badge. The Blueprint is the employment contract (defines what they're allowed to do). The Sponsor is their manager (accountable if something goes wrong). Conditional Access is the building's after-hours security system — even with a valid badge, some doors don't open after 6pm.
