Post By: Raj Gupta
Hey there! Today we’re going to build something exciting — an AI assistant that can understand natural language requests and convert them into specific tool calls. Think of it like teaching an AI to press the right buttons when someone asks for help.
What We’re Going to Build
Imagine someone asks: “What’s the weather in Paris?” Our AI will respond with: {"tool": "get_weather", "location": "Paris"}
Or if someone says: “Find me red shoes under $100” It’ll know to use: {"tool": "search_products", "query": "red shoes", "max_price": 100}
Let’s build this together! 🚀
What You’ll Need
Before we start, let’s make sure you have:
A Google Colab account (Pro recommended for faster training)
Basic Python knowledge
About 2–3 hours
Some example data (don’t worry, we’ll create this together!)
Step 1: Preparing Our Training Data
Let’s start with the most important part — our training data. Think of this as the textbook we’ll use to teach our AI.
Creating Our CSV File
First, let’s create a CSV file that looks like this:
user_message,tool_name,parameters
"What's the weather in Paris?","get_weather","{""location"": ""Paris"", ""units"": ""celsius""}"
"Find red shoes under $100","search_products","{""query"": ""red shoes"", ""max_price"": 100}"
Let me explain each column:
user_message: What a person might ask (like a text message)
tool_name: Which tool should handle this request
parameters: What information the tool needs (in JSON format)
💡 Why JSON format?: We use JSON because it’s a standard way to structure data. It’s like giving the AI a consistent format to follow.
Let’s Create Some Examples Together
Think about common requests people might make. Here are some patterns we can follow (right after them, there's a sketch that turns the patterns into an actual CSV file):
1. Weather requests:
{
  "user_message": "What's the weather like in [CITY]?",
  "tool_name": "get_weather",
  "parameters": {"location": "[CITY]", "units": "celsius"}
}
2. Shopping requests:
{
  "user_message": "Find [ITEM] under $[PRICE]",
  "tool_name": "search_products",
  "parameters": {"query": "[ITEM]", "max_price": [PRICE]}
}
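To turn patterns like these into an actual training_data.csv, you can let Python's csv module handle the tricky double-quote escaping for you. Here's a minimal sketch; the three rows are just illustrations, and you'd want to expand them to cover every tool you plan to support:

import csv
import json

# A few concrete rows following the patterns above
examples = [
    ("What's the weather in Paris?", "get_weather",
     {"location": "Paris", "units": "celsius"}),
    ("Find red shoes under $100", "search_products",
     {"query": "red shoes", "max_price": 100}),
    ("Is it cold in Oslo right now?", "get_weather",
     {"location": "Oslo", "units": "celsius"}),
]

with open("training_data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["user_message", "tool_name", "parameters"])
    for message, tool, params in examples:
        # json.dumps produces valid JSON; csv.writer doubles the inner
        # quotes automatically when it writes each field
        writer.writerow([message, tool, json.dumps(params)])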
🔍 Quick Check: Before moving on, make sure your CSV file:
Opens correctly in Excel or a text editor
Has properly formatted JSON in the parameters column
Contains at least 10–15 diverse examples (the snippet below runs these checks automatically)
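If you'd rather run this checklist in code, here's a quick sketch (it assumes the file name and column layout shown above):

import json
import pandas as pd

df = pd.read_csv("training_data.csv")

bad_rows = []
for idx, raw in enumerate(df["parameters"]):
    try:
        json.loads(raw)  # every parameters cell must parse as JSON
    except json.JSONDecodeError:
        bad_rows.append(idx + 1)

print(f"{len(df)} examples found (aim for at least 10-15)")
print(f"Rows with invalid JSON: {bad_rows or 'none'}")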
Step 2: Setting Up Our Environment
Now let’s set up everything we need to train our model. I’ll explain each part as we go:
!pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install "bitsandbytes>=0.41.1" pandas
from unsloth import FastLanguageModel
import torch
import json
import pandas as pd
from datasets import Dataset
from transformers import BitsAndBytesConfig
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True
)
(Unsloth builds an equivalent 4-bit config internally when we pass load_in_4bit=True while loading the model in Step 5; defining it here shows exactly which quantisation settings are in play.)
💡 Why 4-bit quantisation?: Think of this like compressing a file. We’re making the model smaller without losing too much quality. It’s like turning a high-res photo into a smaller file that still looks good!
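As a rough, back-of-the-envelope illustration of the savings (real memory use also includes activations, optimiser state, and overhead):

params = 1_000_000_000  # Llama 3.2 1B has ~1 billion parameters

# Approximate memory needed just to hold the weights
for name, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    gib = params * bits / 8 / 1024**3
    print(f"{name}: ~{gib:.1f} GiB")

That prints roughly 1.9 GiB for fp16 versus about 0.5 GiB in 4-bit, a big difference on a free Colab GPU.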
✅ Success Check: You should see no error messages after running these commands.
Step 3: Loading Our Training Data
Let’s create a helper class to load and validate our CSV data. I’ll explain each part:
class DataLoader:
    def __init__(self, csv_path):
        self.csv_path = csv_path
        print("Loading data from:", csv_path)

    def load_data(self):
        """Loads and checks our training data"""
        try:
            df = pd.read_csv(self.csv_path)
            print(f"Found {len(df)} examples")
            processed_examples = []
            for idx, row in df.iterrows():
                try:
                    parameters = json.loads(row['parameters'])
                    example = {
                        "input": row['user_message'],
                        "output": {
                            "function_call": {
                                "name": row['tool_name'],
                                "arguments": parameters
                            }
                        }
                    }
                    # Only keep examples that pass our sanity checks
                    if self.validate_example(example):
                        processed_examples.append(example)
                    else:
                        print(f"⚠️ Row {idx + 1} has empty fields - skipping it")
                except json.JSONDecodeError:
                    print(f"⚠️ Row {idx + 1} has invalid JSON - skipping it")
                    continue
            print(f"Successfully processed {len(processed_examples)} examples")
            return processed_examples
        except Exception as e:
            print(f"Error loading CSV: {str(e)}")
            return []

    def validate_example(self, example):
        """Checks if an example looks good"""
        if not example['input'].strip():
            return False
        if not example['output']['function_call']['name'].strip():
            return False
        if not example['output']['function_call']['arguments']:
            return False
        return True
Let’s load our data:
loader = DataLoader('training_data.csv')
training_examples = loader.load_data()
print(f"\nData Summary:")
print(f"- Total examples: {len(training_examples)}")
print(f"- Unique tools: {len(set(ex['output']['function_call']['name'] for ex in training_examples))}")
🔍 What to Look For:
You should see the number of examples loaded
No error messages about invalid JSON
At least a few different tools being used
Step 4: Preparing for Training
Now let’s format our data for training. We’ll create a class to handle this:
class TrainingPreparator:
    def __init__(self, examples):
        self.examples = examples
        self.tools = self._gather_tools()

    def _gather_tools(self):
        """Collects info about all our tools"""
        tools = {}
        for ex in self.examples:
            tool_call = ex['output']['function_call']
            tool_name = tool_call['name']
            if tool_name not in tools:
                tools[tool_name] = {
                    "name": tool_name,
                    "parameters": set()
                }
            tools[tool_name]["parameters"].update(tool_call['arguments'].keys())
        return tools

    def format_for_training(self):
        """Gets our data ready for the model"""
        formatted_data = []
        for example in self.examples:
            instruction = self._create_instruction(example)
            output = json.dumps(example['output'], indent=2)
            # We use a simple [INST] ... [/INST] prompt format; what matters
            # is that training and inference use the exact same format
            formatted = {
                "text": f"<s>[INST] {instruction} [/INST]\n{output}</s>"
            }
            formatted_data.append(formatted)
        return formatted_data

    def _create_instruction(self, example):
        """Creates clear instructions for the model"""
        instruction = f"User Request: {example['input']}\n\n"
        instruction += "Available tools:\n"
        for tool_name, tool_info in self.tools.items():
            params = ", ".join(tool_info["parameters"])
            instruction += f"- {tool_name} (Parameters: {params})\n"
        return instruction
Let’s use our preparator:
preparator = TrainingPreparator(training_examples)
prepared_data = preparator.format_for_training()
print("\nTraining Data Preview:")
print(prepared_data[0]["text"])
✅ Success Check: You should see:
A formatted example with instructions
List of available tools
Properly formatted JSON output
Step 5: Training Our Model
Now for the exciting part — let’s train our model! I’ll explain each step:
def setup_training():
    """Sets up everything for training"""
    print("Loading base model...")
    # load_in_4bit tells Unsloth to build an equivalent 4-bit NF4 config
    # internally, matching the bnb_config we defined in Step 2
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "unsloth/Llama-3.2-1B-Instruct",
        max_seq_length = 2048,
        load_in_4bit = True,
        device_map = "auto"
    )
    print("Preparing for fine-tuning...")
    model = FastLanguageModel.get_peft_model(
        model,
        r = 16,
        target_modules = [
            "q_proj", "k_proj", "v_proj", "o_proj",
            "gate_proj", "up_proj", "down_proj"
        ],
        lora_alpha = 16,
        lora_dropout = 0,
        use_gradient_checkpointing = True
    )
    return model, tokenizer
from transformers import TrainingArguments
def get_training_settings():
    """Gets our training configuration"""
    return TrainingArguments(
        output_dir = "my_tool_calling_model",
        per_device_train_batch_size = 4,
        gradient_accumulation_steps = 4,  # effective batch size: 4 x 4 = 16
        learning_rate = 2e-4,
        max_steps = 100,
        logging_steps = 1,
        fp16 = True,
        gradient_checkpointing = True
    )
from trl import SFTTrainer
def train_model(model, tokenizer, prepared_data, training_args):
    """Trains our model"""
    print("Setting up training...")
    trainer = SFTTrainer(
        model = model,
        tokenizer = tokenizer,
        train_dataset = Dataset.from_dict({"text": [ex["text"] for ex in prepared_data]}),
        dataset_text_field = "text",
        max_seq_length = 2048,
        args = training_args
    )
    print("Starting training...")
    trainer.train()
    return trainer
model, tokenizer = setup_training()
training_args = get_training_settings()
trainer = train_model(model, tokenizer, prepared_data, training_args)
💡 What’s Happening?:
We load the base model (Llama 3.2 1B)
Set it up for memory-efficient training (4-bit weights plus LoRA adapters)
Configure how it should learn (learning rate 2e-4, effective batch size 16, 100 steps)
Start the training process!
You should see progress updates as it trains:
Training Step: 1/100
Loss: 2.345
Training Step: 2/100
Loss: 2.123
...
The loss number should go down — that means it’s learning! 📉
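Before the Colab runtime recycles, it's worth saving your work. A minimal sketch using the standard save_pretrained API (this saves the LoRA adapter weights and tokenizer, not the full base model):

# Save the fine-tuned LoRA adapter and tokenizer for later reuse
model.save_pretrained("my_tool_calling_model")
tokenizer.save_pretrained("my_tool_calling_model")
print("Saved to my_tool_calling_model/")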
Step 6: Testing Our Model
Let’s see what our model can do:
def test_our_model(model, tokenizer, user_input):
    """Tests what our model has learned"""
    # Build the same instruction format we used during training, including
    # the list of available tools (using the preparator from Step 4)
    instruction = preparator._create_instruction({"input": user_input})
    input_text = f"<s>[INST] {instruction} [/INST]\n"
    inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
    outputs = model.generate(
        **inputs,
        max_new_tokens = 100,
        do_sample = True,  # needed for temperature to take effect
        temperature = 0.7
    )
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response
# Switch Unsloth into inference mode for faster generation
FastLanguageModel.for_inference(model)

test_questions = [
    "What's the weather like in London?",
    "Find me a gaming laptop under $1000",
    "Is it raining in Tokyo?"
]

print("\nTesting our model:")
for question in test_questions:
    print(f"\nQuestion: {question}")
    print("Response:", test_our_model(model, tokenizer, question))
✅ Success Looks Like:
Clean JSON responses (the sketch after this list verifies this)
Correct tool selection
Proper parameter formatting
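Since the model only ever returns text, it's a good habit to check that the tool call actually parses as JSON before routing it anywhere. Here's a minimal sketch; it assumes the first { in the decoded response belongs to the tool call, which matches our training format:

import json

def extract_tool_call(response):
    """Pulls the first JSON object out of the response, or returns None."""
    start = response.find("{")
    if start == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores any trailing text
        obj, _ = json.JSONDecoder().raw_decode(response[start:])
        return obj
    except json.JSONDecodeError:
        return None

call = extract_tool_call(test_our_model(model, tokenizer, "What's the weather in Berlin?"))
print(call["function_call"]["name"] if call else "No valid tool call found")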
Common Problems and Solutions
“Help! My CSV isn’t loading!”
Check the file encoding (use UTF-8)
Make sure JSON is properly formatted
Look for missing quotes or commas
“The training is so slow!”
Reduce max_steps for testing
Use a smaller dataset first
Make sure 4-bit quantisation is enabled
“I’m getting memory errors!”
Reduce batch size (see the sketch after this list)
Enable all memory optimisations
Use shorter sequences
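For example, a lower-memory variant of our training settings might look like this (illustrative numbers, tune them for your GPU):

low_memory_args = TrainingArguments(
    output_dir = "my_tool_calling_model",
    per_device_train_batch_size = 1,   # smallest possible batch per step
    gradient_accumulation_steps = 16,  # keeps the effective batch size at 16
    learning_rate = 2e-4,
    max_steps = 100,
    logging_steps = 1,
    fp16 = True,
    gradient_checkpointing = True      # trades compute for memory
)

You can also drop max_seq_length in the SFTTrainer from 2048 to 512 or 1024, since our tool-calling examples are short.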
Next Steps
Now that your model is working, try:
Adding more training examples
Creating new tools
Testing with different questions
Fine-tuning the parameters
Remember:
Start small and test often
Add complexity gradually
Keep your training data clean
Monitor the training loss
You’ve built your own tool-calling AI! 🎉
Happy training! 🚀