Sanganak Authority: December 2025

In the previous video, we walked through the step-by-step process of integrating Gemini AI with Azure API Management. In this video, we will see how to rate limit Gemini AI by token consumption using API Management, which also helps limit AI cost spikes. We cover two approaches:

1. Common limit per client

Example: 200 tokens per minute (TPM) per client, applied to all clients.

2. Separate limit per client

Example: Client1 gets 200 TPM, Client2 gets 500 TPM, and so on.
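To make the difference between the two approaches concrete, here is a minimal sketch of the counting logic in plain Python. It is only an illustration and not the API Management policy used in the video or the repository: the class name `TokenRateLimiter`, its parameters, and the fixed one-minute window are all assumptions made for this example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TokenRateLimiter:
    """Hypothetical fixed-window tokens-per-minute limiter keyed by client ID."""

    def __init__(
        self,
        default_tpm: int,
        per_client_tpm: Optional[Dict[str, int]] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.default_tpm = default_tpm            # common limit for every client
        self.per_client_tpm = per_client_tpm or {}  # optional per-client overrides
        self.clock = clock
        # client_id -> (window index, tokens consumed in that window)
        self._usage: Dict[str, Tuple[int, int]] = {}

    def limit_for(self, client_id: str) -> int:
        """Return the client's own limit if one is configured, else the common limit."""
        return self.per_client_tpm.get(client_id, self.default_tpm)

    def try_consume(self, client_id: str, tokens: int) -> bool:
        """Record the tokens and return True if the client stays within its limit."""
        window = int(self.clock() // 60)  # index of the current one-minute window
        last_window, used = self._usage.get(client_id, (window, 0))
        if last_window != window:
            used = 0  # a new minute has started, so the counter resets
        if used + tokens > self.limit_for(client_id):
            self._usage[client_id] = (window, used)
            return False  # a gateway would typically answer HTTP 429 here
        self._usage[client_id] = (window, used + tokens)
        return True


# Approach 1: common limit, every client gets 200 TPM.
common = TokenRateLimiter(default_tpm=200)

# Approach 2: separate limits, Client1 gets 200 TPM and Client2 gets 500 TPM.
separate = TokenRateLimiter(
    default_tpm=200,
    per_client_tpm={"Client1": 200, "Client2": 500},
)
```

In both cases each client has its own counter. The only difference is whether the limit itself is shared (approach 1) or looked up per client (approach 2).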

Part 1: Integrate Gemini AI in API Management

https://youtu.be/HNuOF09vq_I

Code base for rate limiting Gemini AI using API Management

https://github.com/kunalchandratre1/azure-gen-ai-gateway

#AzureBeyondDemos #AzureAPIManagement #GeminiAI #GenAIGateway #AzureIntegration #GoogleGemini #APIM #GenerativeAI #AzureTutorial #CloudArchitecture #AIIntegration #AzureForAI #GeminiOnAzure #APIMGateway #AzureAI #Microsoft #MicrosoftAzure #Msftadvocate


Monday, December 15, 2025

Avoid Gemini AI Cost Spikes! Apply Rate Limits to Gemini AI via Azure API Management