If there's one thing you take from the AI-103 course, let it be this: Foundry isn't just where you deploy models — it's the entire platform where agents live, breathe, get their identity, and get monitored. Understanding Foundry at the structural level answers a huge percentage of exam questions before you even read them carefully.
What Foundry Replaced
Not long ago, building a production AI agent on Azure meant stitching together five or six separate services yourself. You'd provision Azure OpenAI for the model, App Service or Container Apps to host the agent code, Cosmos DB for memory, Azure Monitor for observability, and Azure Active Directory (now Microsoft Entra ID) for identity management. Then you'd write all the connection logic, error handling, and retry code to make them talk to each other.
Foundry eliminates that integration work. It's a single Azure service that brings model deployment, agent hosting, identity, safety, and observability under one roof. You focus on what the agent should do — Foundry handles how it runs.
The Three-Layer Hierarchy
Everything in Foundry sits in a three-level structure. Getting this clear in your head is essential, because exam questions will probe exactly which layer does what.
| Layer | You manage it? | What lives here | Common exam question |
|---|---|---|---|
| Hub | Yes | Multiple projects, shared models, security policies, cost boundaries | "How do you isolate two departments' agents?" → separate Hubs |
| Project | Yes | Agent code, system messages, tool definitions, memory config, Trace logs | "Where is agent code stored?" → Project |
| Agent Service | Foundry (managed) | Deployed running agents, endpoints, auto-scaling, health monitoring | "Which layer serves live users?" → Agent Service |
A Hub is how you create organisational separation. One Azure subscription can have multiple Hubs — one per department, one per region, one per compliance boundary. Everything inside a Hub shares its network configuration and access policies. A Hub can hold multiple Projects.
A Project is your workspace. It's where you write system messages, configure tools, set memory parameters, and review trace logs from previous runs. Think of it as the developer workbench where you build and test before deploying. Once your agent is ready, you deploy it from the Project to the Agent Service, which is the managed runtime that your actual end users connect to.
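To make the three layers concrete, here is a minimal Python sketch of how they relate. It is a mental model only, not the Foundry SDK: the class names, the `deploy` method, and the endpoint path format are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Developer workspace: system message and tool definitions live here."""
    name: str
    system_message: str = ""
    tools: list[str] = field(default_factory=list)


@dataclass
class Hub:
    """Organisational boundary: projects inside one hub share its policies."""
    name: str
    projects: dict[str, Project] = field(default_factory=dict)

    def add_project(self, project: Project) -> None:
        self.projects[project.name] = project


@dataclass
class AgentService:
    """Managed runtime: deployed agents, keyed by a hypothetical endpoint path."""
    deployed: dict[str, Project] = field(default_factory=dict)

    def deploy(self, hub: Hub, project_name: str) -> str:
        project = hub.projects[project_name]  # KeyError if the hub lacks it
        endpoint = f"/{hub.name}/{project.name}"
        self.deployed[endpoint] = project
        return endpoint


# Two departments isolated in separate hubs
finance = Hub("finance")
finance.add_project(Project("invoice-agent", system_message="You process invoices."))
hr = Hub("hr")
hr.add_project(Project("leave-agent", system_message="You answer leave questions."))

# Build in the Project, then deploy to the managed runtime
runtime = AgentService()
invoice_endpoint = runtime.deploy(finance, "invoice-agent")
```

Notice that the `hr` hub has no way to reach `invoice-agent`: isolation comes from the Hub boundary, while the Agent Service only ever sees what a Project deploys to it.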
The Model Catalog
Foundry ships with a model catalog: a library of pre-trained models you can browse and deploy. It includes OpenAI models (GPT-4o, GPT-5), Microsoft's own Phi series for smaller, faster inference, and various third-party options. When you deploy a model from the catalog, you get a named deployment with its own endpoint URL and API key. Note that capacity depends on the deployment type: a standard (pay-as-you-go) deployment draws on shared capacity, while a provisioned deployment reserves dedicated throughput, and it is that reservation that makes latency predictable at scale.
One model can serve multiple agents within the same Hub, and a single agent can use multiple deployed models — routing to a cheaper, faster model for simple tasks and escalating to a more capable one for complex reasoning.
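The routing idea can be sketched in a few lines of Python. This is a toy heuristic under stated assumptions, not something Foundry provides: the deployment names, the keyword hints, and the word-count threshold are all hypothetical, and a real router would more likely use a classifier or the agent's own judgement.

```python
# Hypothetical deployment names; substitute whatever you deployed in your Hub.
CHEAP_DEPLOYMENT = "fast-small-model"
CAPABLE_DEPLOYMENT = "capable-large-model"

# Assumed signals that a prompt needs heavier reasoning.
COMPLEX_HINTS = ("explain why", "compare", "step by step", "plan")
MAX_SIMPLE_WORDS = 40


def choose_deployment(prompt: str) -> str:
    """Return the deployment name a prompt should be routed to."""
    text = prompt.lower()
    too_long = len(text.split()) > MAX_SIMPLE_WORDS
    has_hint = any(hint in text for hint in COMPLEX_HINTS)
    return CAPABLE_DEPLOYMENT if (too_long or has_hint) else CHEAP_DEPLOYMENT


simple = choose_deployment("What is the office address?")
hard = choose_deployment("Compare these two contracts and explain why one is riskier.")
```

Here `simple` routes to the cheaper deployment and `hard` escalates, because the second prompt contains the "compare" hint.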
Endpoints, API Keys, and How to Store Them
When you deploy a model or an agent, Foundry gives you an HTTPS endpoint URL. That's the address your code uses to reach it. Every endpoint also has an associated API key — a secret string that must be included in every request header. Without it, calls get rejected with a 401.
The golden rule: never put API keys in source code. Not in a string literal, not in a comment, not anywhere a version control system might capture them. Use environment variables or Azure Key Vault.
```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# In production: retrieve keys from Key Vault at runtime
credential = DefaultAzureCredential()  # Uses managed identity when running on Azure
kv = SecretClient(vault_url="https://my-vault.vault.azure.net/", credential=credential)
endpoint = kv.get_secret("foundry-endpoint").value
api_key = kv.get_secret("foundry-api-key").value

# DefaultAzureCredential tries several sources in order (simplified; the
# exact chain depends on the azure-identity version):
# 1. Environment variables (AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET)
# 2. Managed identity (if running on Azure)
# 3. Visual Studio / VS Code login
# 4. Azure CLI login
```
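For local development, the environment-variable route mentioned above might look like the sketch below. The variable names `FOUNDRY_ENDPOINT` and `FOUNDRY_API_KEY` are assumptions chosen for this example, not names that Foundry defines; the point is simply to fail fast when a setting is missing instead of falling back to a hard-coded value.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names; pick your own and keep them out of source control.
REQUIRED_VARS = ("FOUNDRY_ENDPOINT", "FOUNDRY_API_KEY")


def load_foundry_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the endpoint and key from the environment, failing fast if absent."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "endpoint": env["FOUNDRY_ENDPOINT"].rstrip("/"),
        "api_key": env["FOUNDRY_API_KEY"],
    }


# Example with an explicit mapping, so nothing secret appears in the code path
settings = load_foundry_settings(
    {"FOUNDRY_ENDPOINT": "https://example.invalid/", "FOUNDRY_API_KEY": "placeholder"}
)
```

Called with no argument, the function reads from `os.environ`, so the same code works whether the values come from your shell, a CI secret store, or app settings populated from Key Vault.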
Agent Identity: Blueprints, Sponsors, and Conditional Access
Before Foundry, agents ran under the identity of whatever Azure service hosted them, or even under the developer's own credentials. That made auditing messy and access control hard. Foundry introduces Entra Agent ID, giving every agent its own unique identity in Microsoft Entra ID (formerly Azure AD), at the same tier as a human user.
This matters for three reasons: you can audit agent actions separately from human actions, you can grant agents exactly the permissions they need (and no more), and you can assign a human sponsor who is accountable for the agent's behaviour.
A Blueprint is a reusable permissions template. Instead of manually granting the same RBAC roles to every agent you deploy, you define them once in a Blueprint and apply it to many agents. The real power: update the Blueprint and all agents using it automatically receive the updated permissions — no redeployment needed.
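Why does an update reach every agent without redeployment? One way to picture it: agents hold a reference to the Blueprint rather than a copy of its roles. The sketch below models that idea in plain Python. The `Blueprint` and `Agent` classes are illustrative stand-ins, not Foundry or Entra APIs; the role names are ordinary Azure built-in roles used as sample data.

```python
from dataclasses import dataclass, field


@dataclass
class Blueprint:
    """Reusable permissions template (illustrative model only)."""
    name: str
    roles: set[str] = field(default_factory=set)


@dataclass
class Agent:
    name: str
    blueprint: Blueprint  # a reference to the template, not a copy of its roles

    def effective_roles(self) -> frozenset[str]:
        # Resolved at call time, so Blueprint edits are visible immediately
        return frozenset(self.blueprint.roles)


reader = Blueprint("support-reader", {"Storage Blob Data Reader"})
agents = [Agent("triage-agent", reader), Agent("faq-agent", reader)]

# Update the template once; neither agent is touched or redeployed
reader.roles.add("Key Vault Secrets User")
roles_after_update = [agent.effective_roles() for agent in agents]
```

Both entries in `roles_after_update` now include the new role, which mirrors the behaviour described above.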
A Sponsor is the human responsible for an agent. Every agent must have one. If a sponsor's Entra account is deleted (they leave the company, for example), the agent enters a 24-hour grace window before its permissions are suspended. After that grace window, the agent throws an error — "agent suspended, no sponsor assigned" — until a new sponsor is assigned.
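The sponsor lifecycle boils down to a small state check. The sketch below encodes the 24-hour grace window described above; the function name and the three status strings are made up for this example, and the platform's real state names may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

GRACE_PERIOD = timedelta(hours=24)  # grace window as described in the text


def agent_status(sponsor_removed_at: Optional[datetime], now: datetime) -> str:
    """Return 'active', 'grace', or 'suspended' (hypothetical status names)."""
    if sponsor_removed_at is None:
        return "active"  # the sponsor still exists
    if now - sponsor_removed_at < GRACE_PERIOD:
        return "grace"  # sponsor gone, permissions not yet suspended
    return "suspended"  # stays suspended until a new sponsor is assigned


removed = datetime(2025, 1, 10, 9, 0, tzinfo=timezone.utc)
during_grace = agent_status(removed, removed + timedelta(hours=23))
after_grace = agent_status(removed, removed + timedelta(hours=25))
```

In practice, the useful takeaway is operational: reassign sponsors as part of offboarding, so agents never reach the suspended state in the first place.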
Conditional Access extends the same policy engine used for human employees to agents. You can write rules like "this agent may only access customer data between 8am and 6pm" or "this agent may only run from approved IP ranges." These are enforced at the platform level, outside your agent code.
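To see what such rules evaluate, here is a rough Python model of the two example policies, combined into a single check. This is not how you author Conditional Access (policies are configured in Entra, not written in code); the function, the network range, and the decision to require both conditions at once are assumptions made for illustration.

```python
from datetime import time
from ipaddress import ip_address, ip_network

ALLOWED_HOURS = (time(8, 0), time(18, 0))        # 8am to 6pm, as in the example rule
ALLOWED_NETWORKS = [ip_network("10.20.0.0/16")]  # hypothetical approved range


def is_allowed(request_time: time, source_ip: str) -> bool:
    """Allow a request only inside the time window and from an approved network."""
    start, end = ALLOWED_HOURS
    in_hours = start <= request_time < end
    address = ip_address(source_ip)
    in_network = any(address in network for network in ALLOWED_NETWORKS)
    return in_hours and in_network


morning_internal = is_allowed(time(9, 30), "10.20.5.7")
evening_internal = is_allowed(time(19, 0), "10.20.5.7")
morning_external = is_allowed(time(9, 30), "203.0.113.9")
```

Only the first request passes. The key design point is where the check runs: because the platform enforces it, a bug or prompt injection in your agent code cannot skip it.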