Every time I wrote about agent skills, someone asked: “What about Gemini?”
The Claude post covers tool_use. The OpenAI post covers function calling. This is the post that closes the set.
Gemini’s function calling works on the same conceptual model — define tools, model decides to call one, you execute it, you return the result. The syntax is different enough that it’s worth its own walkthrough. By the end you’ll have a working example and a three-way comparison table.
Setup
npm install @google/generative-ai dotenv
Get a free API key from Google AI Studio — no credit card required for the free tier.
export GEMINI_API_KEY="your-key-here"
Step 1 — Define a function declaration
Gemini calls tool definitions “function declarations.” The schema format is OpenAPI-style — closer to OpenAI’s parameters than Claude’s input_schema.
// tool-definitions.js
const weatherFunctionDeclaration = {
name: "get_weather",
description:
"Get current weather for a city. Use when the user asks about " +
"weather, temperature, rain, or what to wear outdoors.",
parameters: {
type: "OBJECT",
properties: {
city: {
type: "STRING",
description: "The city name, e.g. 'Mumbai' or 'London'"
}
},
required: ["city"]
}
};
Two things to notice:
- Type names are uppercase strings: "OBJECT", "STRING", "NUMBER", "BOOLEAN", "ARRAY", unlike OpenAI and Claude, which use lowercase
- The structure is otherwise very similar to OpenAI's parameters field
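To see the uppercase type names beyond the basic case, here is a hedged sketch of a richer declaration. The get_forecast tool and its fields are invented for illustration; the schema shapes follow the OpenAPI-style format shown above.

```javascript
// Hypothetical declaration for illustration: the tool name and fields
// are invented, but the shapes follow Gemini's OpenAPI-style schema.
const forecastFunctionDeclaration = {
  name: "get_forecast",
  description: "Get a multi-day forecast for a city.",
  parameters: {
    type: "OBJECT",
    properties: {
      city: { type: "STRING", description: "The city name" },
      days: { type: "NUMBER", description: "How many days ahead, 1-7" },
      units: {
        type: "STRING",
        // An enum constrains the model to a fixed set of values
        enum: ["celsius", "fahrenheit"],
        description: "Temperature units"
      },
      fields: {
        type: "ARRAY",
        items: { type: "STRING" },
        description: "Which fields to include, e.g. ['temperature', 'rain']"
      }
    },
    required: ["city"]
  }
};
```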
Step 2 — Start a chat with tools
Gemini uses a chat session model rather than a stateless messages array. You create a model instance, start a chat with tool declarations, then send messages to that chat.
import { GoogleGenerativeAI } from "@google/generative-ai";
import "dotenv/config";
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({
model: "gemini-2.0-flash",
tools: [
{
functionDeclarations: [weatherFunctionDeclaration]
}
]
});
// Start a chat session
const chat = model.startChat();
// Send the first message
const response = await chat.sendMessage("What's the weather in Mumbai?");
Step 3 — Handle the functionCall response
When Gemini wants to call a tool, the response looks different from Claude and OpenAI:
const result = response.response;
// Check if Gemini wants to call a function
const parts = result.candidates[0].content.parts;
const functionCallPart = parts.find(p => p.functionCall);
if (functionCallPart) {
console.log("Function to call:", functionCallPart.functionCall.name);
console.log("Arguments:", functionCallPart.functionCall.args);
// { name: "get_weather", args: { city: "Mumbai" } }
}
Unlike OpenAI where arguments is a JSON string you must parse, Gemini’s args is already a parsed object — no JSON.parse() needed.
Also, unlike Claude’s stop_reason: "tool_use" or OpenAI’s finish_reason: "tool_calls", Gemini has no dedicated stop field: you inspect result.candidates[0].content.parts and check for a functionCall part directly.
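Since there is no status field, a small helper keeps the check in one place. This is a minimal sketch assuming the response shape shown above; the mock object below is hand-built, not a real API response. It returns an array because a response may contain more than one functionCall part if the model requests parallel calls.

```javascript
// Minimal sketch: pull every functionCall part out of a Gemini result.
// Returns an array, since parts can hold multiple functionCall entries.
function extractFunctionCalls(result) {
  const parts = result.candidates?.[0]?.content?.parts ?? [];
  return parts.filter(p => p.functionCall).map(p => p.functionCall);
}

// Hand-built mock of the response shape, for illustration only
const mockResult = {
  candidates: [{
    content: {
      parts: [
        { functionCall: { name: "get_weather", args: { city: "Mumbai" } } }
      ]
    }
  }]
};

const calls = extractFunctionCalls(mockResult);
// calls[0].name === "get_weather", calls[0].args.city === "Mumbai"
```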
Step 4 — Execute the function
Same as every other platform — just your JavaScript:
async function get_weather({ city }) {
try {
const geo = await fetch(
`https://geocoding-api.open-meteo.com/v1/search?name=${encodeURIComponent(city)}&count=1`
).then(r => r.json());
if (!geo.results?.length) return { error: `City not found: ${city}` };
const { latitude, longitude, name, country } = geo.results[0];
const weather = await fetch(
`https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current_weather=true`
).then(r => r.json());
const codes = {
0: "Clear sky", 1: "Mainly clear", 2: "Partly cloudy", 3: "Overcast",
61: "Light rain", 63: "Moderate rain", 65: "Heavy rain", 95: "Thunderstorm"
};
return {
city: `${name}, ${country}`,
temperature: `${weather.current_weather.temperature}°C`,
condition: codes[weather.current_weather.weathercode] ?? "Unknown"
};
} catch (err) {
return { error: err.message };
}
}
Step 5 — Send the functionResponse back
This is where Gemini’s format diverges most from Claude and OpenAI. You send back a functionResponse part via chat.sendMessage() with a specific structure:
// Execute the function
const toolResult = await get_weather(functionCallPart.functionCall.args);
// Send the result back to Gemini
const functionResponseMessage = await chat.sendMessage([
{
functionResponse: {
name: functionCallPart.functionCall.name, // must match the function name
response: toolResult // your result object (not stringified — Gemini accepts objects)
}
}
]);
// Now get the final text response
const finalText = functionResponseMessage.response.text();
console.log(finalText);
// "Mumbai is currently experiencing light rain at 27°C with partly cloudy skies."
Key difference from OpenAI: the result goes in response as a plain object — not a JSON string. Gemini handles the serialization internally.
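If you wrap several tools, a tiny helper that builds the part keeps the shape consistent. A sketch of the format shown above; buildFunctionResponsePart is a made-up name, not an SDK function.

```javascript
// Made-up helper: wraps a tool result in the part shape Gemini expects
// when you send results back via chat.sendMessage().
function buildFunctionResponsePart(name, result) {
  return {
    functionResponse: {
      name,              // must match the declared function name
      response: result   // plain object; Gemini serializes it internally
    }
  };
}

const part = buildFunctionResponsePart("get_weather", { temperature: "27°C" });
// part.functionResponse.name === "get_weather"
```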
Full working example
// weather-agent-gemini.js
import { GoogleGenerativeAI } from "@google/generative-ai";
import "dotenv/config";
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const weatherFunctionDeclaration = {
name: "get_weather",
description:
"Get current weather for a city. Use when the user asks about weather, temperature, rain, or what to wear.",
parameters: {
type: "OBJECT",
properties: {
city: { type: "STRING", description: "The city name" }
},
required: ["city"]
}
};
async function get_weather({ city }) {
try {
const geo = await fetch(
`https://geocoding-api.open-meteo.com/v1/search?name=${encodeURIComponent(city)}&count=1`
).then(r => r.json());
if (!geo.results?.length) return { error: `City not found: ${city}` };
const { latitude, longitude, name, country } = geo.results[0];
const w = await fetch(
`https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current_weather=true`
).then(r => r.json());
const codes = { 0: "Clear sky", 2: "Partly cloudy", 3: "Overcast", 61: "Light rain", 63: "Moderate rain", 95: "Thunderstorm" };
return { city: `${name}, ${country}`, temperature: `${w.current_weather.temperature}°C`, condition: codes[w.current_weather.weathercode] ?? "Unknown" };
} catch (err) {
return { error: err.message };
}
}
const toolFunctions = { get_weather };
async function chat(userMessage) {
const model = genAI.getGenerativeModel({
model: "gemini-2.0-flash",
tools: [{ functionDeclarations: [weatherFunctionDeclaration] }]
});
const chatSession = model.startChat();
let response = await chatSession.sendMessage(userMessage);
// Loop until no more function calls
while (true) {
const parts = response.response.candidates[0].content.parts;
const functionCallPart = parts.find(p => p.functionCall);
if (!functionCallPart) break;
const { name, args } = functionCallPart.functionCall;
console.log(`[Tool: ${name}]`, args);
const fn = toolFunctions[name];
const result = fn ? await fn(args) : { error: `Unknown function: ${name}` };
response = await chatSession.sendMessage([
{ functionResponse: { name, response: result } }
]);
}
return response.response.text();
}
const answer = await chat("Is it raining in Chennai right now?");
console.log(answer);
Run it:
node weather-agent-gemini.js
Three-way comparison table
Here’s how Gemini, Claude, and OpenAI differ at every step of the tool call cycle:
| | Claude API | OpenAI API | Gemini API |
|---|---|---|---|
| Tool definition key | tools: [{ name, description, input_schema }] | tools: [{ type: "function", function: { name, description, parameters } }] | tools: [{ functionDeclarations: [{ name, description, parameters }] }] |
| Schema field | input_schema | parameters | parameters |
| Type names | "object", "string" (lowercase) | "object", "string" (lowercase) | "OBJECT", "STRING" (uppercase) |
| Stop signal | stop_reason: "tool_use" | finish_reason: "tool_calls" | Check parts.find(p => p.functionCall) |
| Call location | response.content[].type === "tool_use" | message.tool_calls[].function | candidates[0].content.parts[].functionCall |
| Arguments format | Object (parsed) | JSON string (must JSON.parse) | Object (parsed) |
| Session model | Stateless messages array | Stateless messages array | Chat session object (startChat()) |
| Send result back | Add type: "tool_result" to user message | Add role: "tool" message | chat.sendMessage([{ functionResponse: { name, response } }]) |
| Result format | JSON string in content | JSON string in content | Plain object in response |
The most important differences day-to-day:
- Gemini uses a chat session: you call chat.sendMessage() each round, not client.messages.create() with a growing messages array
- Arguments and results are objects: no JSON.parse() or JSON.stringify() needed
- No explicit stop signal: check for the presence of functionCall parts rather than a status field
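The table rows on call location and argument format can be made concrete with a small normalizer. This is a sketch over hand-built mock responses, not real SDK objects; the shapes follow the comparison table above.

```javascript
// Sketch: normalize "which tool does the model want?" across providers.
// The response objects passed in are assumed to follow each provider's
// documented shape; no SDK types are used here.
function extractToolCall(provider, response) {
  if (provider === "claude") {
    const block = response.content.find(b => b.type === "tool_use");
    return block ? { name: block.name, args: block.input } : null;
  }
  if (provider === "openai") {
    const call = response.choices[0].message.tool_calls?.[0];
    // OpenAI arguments arrive as a JSON string and must be parsed
    return call
      ? { name: call.function.name, args: JSON.parse(call.function.arguments) }
      : null;
  }
  if (provider === "gemini") {
    const part = response.candidates[0].content.parts.find(p => p.functionCall);
    // Gemini args are already a parsed object
    return part
      ? { name: part.functionCall.name, args: part.functionCall.args }
      : null;
  }
  return null;
}
```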
Gemini-specific features
Grounding vs. function calling
Gemini has a built-in “grounding” feature that connects the model to Google Search directly — without you writing a web_search tool. If you’re building with Gemini and need web search, grounding may be simpler than a custom tool.
const model = genAI.getGenerativeModel({
model: "gemini-2.0-flash",
tools: [{ googleSearch: {} }] // built-in search grounding
});
This is Gemini-specific — Claude and OpenAI don’t have an equivalent built-in.
Multimodal tool inputs
Gemini 2.0 can pass image data as tool arguments. If you define a tool that accepts an imageData parameter, the model can pass a screenshot or photo it received as input. This enables skills like “analyze this screenshot and file a bug report” in a single step.
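One way such a declaration might look is sketched below. The file_bug_report name and the imageData parameter are hypothetical; check the Gemini docs for the exact multimodal argument format before relying on this.

```javascript
// Hypothetical declaration: the tool name and imageData parameter are
// invented to illustrate the idea, not taken from the Gemini docs.
const bugReportDeclaration = {
  name: "file_bug_report",
  description: "File a bug report from a screenshot the user provided.",
  parameters: {
    type: "OBJECT",
    properties: {
      title: { type: "STRING", description: "Short bug title" },
      imageData: { type: "STRING", description: "Base64-encoded screenshot" }
    },
    required: ["title", "imageData"]
  }
};
```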
Using Gemini via the Vercel AI SDK
If you’ve read the Vercel AI SDK post, you already know the unified approach. For Gemini:
npm install @ai-sdk/google
import { generateText } from "ai";
import { google } from "@ai-sdk/google";
const result = await generateText({
model: google("gemini-2.0-flash"), // one line change from Claude or OpenAI
tools: { get_weather: getWeatherTool },
maxSteps: 5,
messages: [{ role: "user", content: "Weather in Delhi?" }]
});
The Vercel AI SDK normalizes the session model, argument formats, and result handling — you never deal with the differences outlined in the comparison table above.
Use the raw @google/generative-ai SDK when you need Gemini-specific features (grounding, multimodal tool inputs). Use the Vercel AI SDK for everything else.
Gemini and MCP
As of early 2026, Gemini supports the Model Context Protocol (MCP) — the same protocol Claude and other models use for connecting to external tools and data sources.
Learn how MCP connects AI to any tool: MCP Explained: How Claude Connects to Any Tool or Data Source
If you’re building a production integration that needs to work across multiple AI providers, MCP + the Vercel AI SDK is the most future-proof combination.
What’s next
Unify all three providers with one tool API: Vercel AI SDK Tools: One API for Claude and OpenAI Skills
Compare all implementations: Agent Skills with the Claude API and Agent Skills with the OpenAI API
Back to fundamentals: What Are Agent Skills? AI Tools Explained Simply
Related Reading
Vercel AI SDK Tools: One API for Claude and OpenAI Skills
Vercel AI SDK's unified tool interface works with Claude, OpenAI, and Gemini. Write your skill once and switch AI providers without rewriting the agent loop.
Agent Skills with the OpenAI API: Function Calling Explained
How to use OpenAI function calling with gpt-4o — define functions, handle tool_calls in responses, execute your code, and return results. Full Node.js working example.
Build a GitHub Issue Creator Skill for Your AI Agent
Create a production-ready agent skill that creates GitHub issues from natural language, with label assignment, duplicate detection, and dry-run mode.