Fine-Tuning LLMs on Your Own Data

Session overview

A practical session on fine-tuning open-source language models on proprietary data, including the dataset preparation, training, and evaluation workflow we use in client engagements.

What we cover

When fine-tuning is the right answer. Decision framework for choosing fine-tuning vs RAG vs prompting.
Base model selection. Llama, Qwen, Mistral, DeepSeek — practical considerations for picking a starting point.
Dataset preparation. Curation, formatting, edge cases, the 80% of the work that's least glamorous.
LoRA and QLoRA training. Hyperparameter choices, training duration, hardware requirements.
Evaluation harnesses. Building the test set, defining rubrics, automated and LLM-as-judge scoring.
Common failure modes. Catastrophic forgetting, memorisation, reward hacking, distribution mismatch.

Live demonstration

The session includes a live walkthrough of fine-tuning a 7B model on a small proprietary dataset — from raw data through to evaluation. Total elapsed wall-clock time visible to the audience.

Reference materials

The workflow demonstrated is documented in our fine-tuning guide. Background context on choosing between fine-tuning and RAG is in our decision tree blog post.

Q&A topics

Sizing the dataset for a target capability.
Continuous fine-tuning vs episodic retraining.
Comparing fine-tuned open-source models with frontier APIs.
Cost optimisation strategies.

Web Design

Web Development

Software Development

Featured

Free Discovery Call

Case Studies

On-Premise AI Setup

Training & Fine-Tuning

AI Agents & Automation

AI-Powered Development

Learn

Tools

Featured

Project Scoping Guide

About

Our Vision

Team

Careers