Customization & Fine-Tuning

45 min · 4 sections
Step 1 of 4

Why This Matters

Out-of-the-box AI models are generalists. They know a little about everything but may not know your business—your terminology, your products, your processes, your voice.

Customization makes AI speak your language. But it comes at a cost.

The question isn't whether to customize—it's when the investment pays off.

The Business Operator's decision:

  • When do I need custom AI behavior?
  • What's the most cost-effective way to achieve it?
  • How do I avoid over-engineering a simple problem?

Step 2 of 4

What You Need to Know

The Customization Spectrum

RAG: Retrieval-Augmented Generation

RAG Economics

Cost Component           Typical Range       Notes
Embedding generation     $0.0001/1K tokens   One-time per document
Vector database          $0-100/month        Pinecone, Weaviate, Supabase
Retrieval per query      ~$0.0002            Minimal compute
Augmented prompt tokens  +500-2000 tokens    Added context increases cost

Example RAG cost calculation:

  • 10,000 support documents → $10 to embed (one-time)
  • Vector DB hosting → $25/month
  • 50,000 queries/month → Extra $50 in prompt tokens
  • Total: ~$75/month + $10 setup

Compare that to the loaded cost of hiring someone to answer those questions manually.

Fine-Tuning: Teaching New Behaviors

Fine-Tuning Economics

Cost Component    OpenAI GPT-4o-mini     OpenAI GPT-4o
Training cost     $3/million tokens      $25/million tokens
Inference cost    2x base model          2x base model
Minimum examples  10 (50+ recommended)   10 (50+ recommended)
Training time     Minutes to hours       Hours

Example fine-tuning cost calculation:

  • 500 training examples × 500 tokens each = 250K tokens
  • Training cost: ~$0.75 (GPT-4o-mini)
  • Inference: 2x normal costs ongoing
  • Data preparation: 5-20 hours human time
  • Total: ~$1 compute + significant human investment

The hidden cost: Creating high-quality training data requires expert time. If you need 500 examples of perfect customer support responses, someone has to write them.

The Decision Framework

The full framework appears under Key Concepts below. In short: exhaust prompt engineering first, use RAG when the gap is knowledge, and reserve fine-tuning for stable, high-volume behavior problems.

Hybrid Approaches

Most production systems use multiple approaches:

User query
     ↓
[RAG] Retrieve relevant knowledge from your docs
     ↓
[Fine-tuned model] Generate response in your brand voice
     ↓
[Prompt guard] Ensure output meets formatting requirements
     ↓
Response to user

Example: Customer Support Bot

  • RAG: Product specs, policies, FAQs
  • Fine-tuning: Brand voice and escalation behavior
  • Prompts: Output formatting and safety guardrails

Build vs. Buy for Customization

Approach     Build In-House              Use Managed Service
RAG          Pinecone + custom pipeline  LangChain, Anthropic Claude Projects
Fine-tuning  OpenAI API + data prep      Jasper, Copy.ai (domain-specific)
Full custom  Azure OpenAI + enterprise   Specialized vendors

When to buy:

  • Speed to market critical
  • No in-house ML expertise
  • Vendor has domain knowledge you lack

When to build:

  • Competitive advantage from customization
  • Data sensitivity requires control
  • Long-term cost optimization at scale

Key Concepts

Key Concept: The Customization Spectrum

AI customization exists on a spectrum from cheap-and-quick to expensive-and-permanent:

Approach            Cost    Time     Flexibility              Best For
Prompt Engineering  $0      Minutes  High (change anytime)    Most use cases
Few-Shot Examples   $0      Hours    High                     Pattern matching, formatting
RAG (Retrieval)     $-$$    Days     Medium                   Knowledge bases, docs
Fine-Tuning         $$-$$$  Weeks    Low (retraining needed)  Style, specialized domains
Custom Training     $$$$    Months   Very Low                 Unique capabilities

Key insight: Most projects should start at the top of this spectrum and move down only when necessary.

Key Concept: RAG Architecture

RAG = Retrieve relevant context → Augment the prompt → Generate response

Instead of training knowledge into the model, you look it up at query time.

How it works:

User asks: "What's our refund policy for premium members?"
                    ↓
Step 1: RETRIEVE — Search your knowledge base
        → Finds: "Premium members get 30-day refunds..."
                    ↓
Step 2: AUGMENT — Add context to prompt
        → Prompt: "Using this policy [context], answer: ..."
                    ↓
Step 3: GENERATE — Model produces answer
        → "Premium members are entitled to full refunds within 30 days..."

RAG advantages:

  • Knowledge updates instantly (just update the source)
  • No retraining required
  • Transparent sources (can show citations)
  • Cheaper than fine-tuning for knowledge

RAG limitations:

  • Retrieval quality depends on your data structure
  • Adds latency (search step)
  • Can't change how the model behaves, only what it knows

Key Concept: Fine-Tuning

Fine-tuning = Training a model on your examples to change how it behaves.

Not about what the model knows—about how it responds.

Good fine-tuning use cases:

  • Consistent voice/tone (match your brand exactly)
  • Specialized formatting (always output in specific JSON structure)
  • Domain-specific reasoning (medical, legal, financial patterns)
  • Behavior modification (be more/less formal, technical, concise)

Poor fine-tuning use cases:

  • Adding factual knowledge (use RAG instead)
  • One-off tasks (just use prompts)
  • Rapidly changing requirements (too slow to iterate)

Key Concept: The Customization Decision

Ask these questions in order:

1. Can prompt engineering solve this?

  • Have you tried detailed system prompts?
  • Have you tested few-shot examples?
  • Have you iterated on prompt structure?

If yes to all three, you're done. Don't over-engineer.

2. Is the problem knowledge or behavior?

  • Knowledge problem = RAG
    • "It doesn't know our products"
    • "It can't access our policies"
    • "It gives outdated information"
  • Behavior problem = Fine-tuning
    • "It doesn't sound like our brand"
    • "It won't output in our format consistently"
    • "It reasons incorrectly in our domain"

3. What's the volume?

  • Low volume (<1000 queries/month): Prompt engineering + some manual review
  • Medium volume (1K-100K/month): RAG for knowledge, prompts for behavior
  • High volume (>100K/month): Fine-tuning ROI becomes compelling

4. How fast do requirements change?

  • Changing weekly: Prompts only
  • Changing monthly: RAG acceptable
  • Stable for 6+ months: Fine-tuning viable

Key Concept: When Not to Customize

Don't customize when:

  • Prompt engineering gets you to 80%+ accuracy
  • Volume doesn't justify the investment
  • Requirements are still changing
  • You lack quality training data
  • The use case is exploratory

Red flags that you're over-engineering:

  • "We might need this capability later"
  • "It would be cool if it could..."
  • "Other companies are doing fine-tuning"
  • No clear ROI calculation

The 80/20 rule applies: 80% of AI value comes from basic prompting + good workflow design. 20% comes from advanced customization.

Don't pursue the 20% until you've captured the 80%.

Step 3 of 4

How to Apply This

Exercise: Customization Decision Matrix

CUSTOMIZATION ROI WORKSHEET

1. CURRENT STATE
   Manual handling time per task: ___ minutes
   Tasks per month: ___
   Loaded labor cost per hour: $___
   Monthly labor cost: $___

2. WITH AI (BASIC PROMPTING)
   Accuracy rate (usable without edits): ___%
   Time saved per usable task: ___ minutes
   Monthly time saved: ___ hours
   Monthly value of time saved: $___

3. WITH CUSTOMIZATION
   Expected accuracy improvement: +___%
   Additional monthly value: $___

4. CUSTOMIZATION COSTS
   One-time setup:
   - RAG infrastructure: $___
   - Fine-tuning training: $___
   - Human data prep time: $___
   Ongoing monthly:
   - Compute increase: $___
   - Maintenance: $___

5. PAYBACK CALCULATION
   Incremental monthly value: $___
   Incremental monthly cost: $___
   Net monthly benefit: $___
   One-time investment: $___
   Payback period: ___ months

Practice Exercises

You're the operations lead at a mid-size law firm. Attorneys waste hours drafting routine documents that follow standard templates. You want to deploy AI assistance.

Analyze these three use cases:

Use Case A: Contract summarization

  • Input: Client contracts (confidential)
  • Output: Plain-English summaries of key terms
  • Volume: ~200 contracts/month
  • Requirements: Must cite specific clauses

Use Case B: Standard letter drafting

  • Input: Matter type + key details
  • Output: Client correspondence
  • Volume: ~500 letters/month
  • Requirements: Must match firm's formal voice exactly

Use Case C: Legal research assistance

  • Input: Case questions
  • Output: Relevant precedents and analysis
  • Volume: ~100 queries/month
  • Requirements: Must use current case law

For each use case, determine:

  1. Primary need: Knowledge, behavior, or both?

  2. Recommended approach: Prompt engineering, RAG, fine-tuning, or hybrid?

  3. Cost-benefit analysis:

    • Estimated setup cost
    • Ongoing costs
    • Value generated (hours saved × hourly rate)

  4. Build vs. buy decision: In-house or vendor?

  5. Risk assessment: What could go wrong?

Step 4 of 4

Phase 3 Complete!

You've mastered Agentic Orchestration. You can now:

  • Evaluate and select AI tools
  • Work with APIs and automation platforms
  • Test and deploy AI applications responsibly
  • Make informed customization decisions

Before moving to Phase 4, complete:

Lab 5: Build an AI Assistant — Create a functional AI assistant using no-code tools

Lab 5b: Multi-Agent Orchestration — Design a system where AI agents collaborate

Phase 3 Deliverable: Multi-Agent System — Build and deploy a working multi-agent system that demonstrates autonomous reasoning and collaboration

Module Complete!

You've reached the end of this module. Review the key concepts and practice exercise above to make sure you've understood the material.
