Customization & Fine-Tuning

45 min · 4 sections
Step 1 of 4

Why This Matters

Out-of-the-box AI models are generalists. They know a little about everything but may not know your business—your terminology, your products, your processes, your voice.

Customization makes AI speak your language. But it comes at a cost.

The question isn't whether to customize—it's when the investment pays off.

The Business Operator's decision:

  • When do I need custom AI behavior?
  • What's the most cost-effective way to achieve it?
  • How do I avoid over-engineering a simple problem?

Step 2 of 4

What You Need to Know

The Customization Spectrum

RAG: Retrieval-Augmented Generation

RAG Economics

Cost Component           Typical Range       Notes
Embedding generation     $0.0001/1K tokens   One-time per document
Vector database          $0-100/month        Pinecone, Weaviate, Supabase
Retrieval per query      ~$0.0002            Minimal compute
Augmented prompt tokens  +500-2000 tokens    Added context increases cost

Example RAG cost calculation:

  • 10,000 support documents → $10 to embed (one-time)
  • Vector DB hosting → $25/month
  • 50,000 queries/month → Extra $50 in prompt tokens
  • Total: ~$75/month + $10 setup

Compare that to the loaded cost of hiring someone to answer those questions manually.

Fine-Tuning: Teaching New Behaviors

Fine-Tuning Economics

Cost Component    OpenAI GPT-4o-mini     OpenAI GPT-4o
Training cost     $3/million tokens      $25/million tokens
Inference cost    2x base model          2x base model
Minimum examples  10 (50+ recommended)   10 (50+ recommended)
Training time     Minutes to hours       Hours

Example fine-tuning cost calculation:

  • 500 training examples × 500 tokens each = 250K tokens
  • Training cost: ~$0.75 (GPT-4o-mini)
  • Inference: 2x normal costs ongoing
  • Data preparation: 5-20 hours human time
  • Total: ~$1 compute + significant human investment

The hidden cost: Creating high-quality training data requires expert time. If you need 500 examples of perfect customer support responses, someone has to write them.

The Decision Framework

The full framework appears under Key Concepts below. In short: exhaust prompt engineering first, use RAG when the gap is knowledge, and reserve fine-tuning for stable, high-volume behavior problems.

Hybrid Approaches

Most production systems use multiple approaches:

User query
     ↓
[RAG] Retrieve relevant knowledge from your docs
     ↓
[Fine-tuned model] Generate response in your brand voice
     ↓
[Prompt guard] Ensure output meets formatting requirements
     ↓
Response to user

Example: Customer Support Bot

  • RAG: Product specs, policies, FAQs
  • Fine-tuning: Brand voice and escalation behavior
  • Prompts: Output formatting and safety guardrails

Build vs. Buy for Customization

Approach     Build In-House              Use Managed Service
RAG          Pinecone + custom pipeline  LangChain, Anthropic Claude Projects
Fine-tuning  OpenAI API + data prep      Jasper, Copy.ai (domain-specific)
Full custom  Azure OpenAI + enterprise   Specialized vendors

When to buy:

  • Speed to market critical
  • No in-house ML expertise
  • Vendor has domain knowledge you lack

When to build:

  • Competitive advantage from customization
  • Data sensitivity requires control
  • Long-term cost optimization at scale

Key Concepts

Key Concept: The Customization Spectrum

AI customization exists on a spectrum from cheap-and-quick to expensive-and-permanent:

Approach            Cost    Time     Flexibility              Best For
Prompt Engineering  $0      Minutes  High (change anytime)    Most use cases
Few-Shot Examples   $0      Hours    High                     Pattern matching, formatting
RAG (Retrieval)     $-$$    Days     Medium                   Knowledge bases, docs
Fine-Tuning         $$-$$$  Weeks    Low (retraining needed)  Style, specialized domains
Custom Training     $$$$    Months   Very Low                 Unique capabilities

Key insight: Most projects should start at the top of this spectrum and move down only when necessary.

Key Concept: RAG Architecture

RAG = Retrieve relevant context → Augment the prompt → Generate response

Instead of training knowledge into the model, you look it up at query time.

How it works:

User asks: "What's our refund policy for premium members?"
                    ↓
Step 1: RETRIEVE — Search your knowledge base
        → Finds: "Premium members get 30-day refunds..."
                    ↓
Step 2: AUGMENT — Add context to prompt
        → Prompt: "Using this policy [context], answer: ..."
                    ↓
Step 3: GENERATE — Model produces answer
        → "Premium members are entitled to full refunds within 30 days..."

RAG advantages:

  • Knowledge updates instantly (just update the source)
  • No retraining required
  • Transparent sources (can show citations)
  • Cheaper than fine-tuning for knowledge

RAG limitations:

  • Retrieval quality depends on your data structure
  • Adds latency (search step)
  • Can't change how the model behaves, only what it knows

Key Concept: Fine-Tuning

Fine-tuning = Training a model on your examples to change how it behaves.

Not about what the model knows—about how it responds.

Good fine-tuning use cases:

  • Consistent voice/tone (match your brand exactly)
  • Specialized formatting (always output in specific JSON structure)
  • Domain-specific reasoning (medical, legal, financial patterns)
  • Behavior modification (be more/less formal, technical, concise)

Poor fine-tuning use cases:

  • Adding factual knowledge (use RAG instead)
  • One-off tasks (just use prompts)
  • Rapidly changing requirements (too slow to iterate)

Key Concept: The Customization Decision

Ask these questions in order:

1. Can prompt engineering solve this?

  • Have you tried detailed system prompts?
  • Have you tested few-shot examples?
  • Have you iterated on prompt structure?

If yes to all three, you're done. Don't over-engineer.

2. Is the problem knowledge or behavior?

  • Knowledge problem = RAG
    • "It doesn't know our products"
    • "It can't access our policies"
    • "It gives outdated information"
  • Behavior problem = Fine-tuning
    • "It doesn't sound like our brand"
    • "It won't output in our format consistently"
    • "It reasons incorrectly in our domain"

3. What's the volume?

  • Low volume (<1000 queries/month): Prompt engineering + some manual review
  • Medium volume (1K-100K/month): RAG for knowledge, prompts for behavior
  • High volume (>100K/month): Fine-tuning ROI becomes compelling

4. How fast do requirements change?

  • Changing weekly: Prompts only
  • Changing monthly: RAG acceptable
  • Stable for 6+ months: Fine-tuning viable

Key Concept: When Not to Customize

Don't customize when:

  • Prompt engineering gets you to 80%+ accuracy
  • Volume doesn't justify the investment
  • Requirements are still changing
  • You lack quality training data
  • The use case is exploratory

Red flags that you're over-engineering:

  • "We might need this capability later"
  • "It would be cool if it could..."
  • "Other companies are doing fine-tuning"
  • No clear ROI calculation

The 80/20 rule applies: 80% of AI value comes from basic prompting + good workflow design. 20% comes from advanced customization.

Don't pursue the 20% until you've captured the 80%.

Step 3 of 4

How to Apply This

Exercise: Customization Decision Matrix

CUSTOMIZATION ROI WORKSHEET

1. CURRENT STATE
   Manual handling time per task: ___ minutes
   Tasks per month: ___
   Loaded labor cost per hour: $___
   Monthly labor cost: $___

2. WITH AI (BASIC PROMPTING)
   Accuracy rate (usable without edits): ___%
   Time saved per usable task: ___ minutes
   Monthly time saved: ___ hours
   Monthly value of time saved: $___

3. WITH CUSTOMIZATION
   Expected accuracy improvement: +___%
   Additional monthly value: $___

4. CUSTOMIZATION COSTS
   One-time setup:
   - RAG infrastructure: $___
   - Fine-tuning training: $___
   - Human data prep time: $___
   Ongoing monthly:
   - Compute increase: $___
   - Maintenance: $___

5. PAYBACK CALCULATION
   Incremental monthly value: $___
   Incremental monthly cost: $___
   Net monthly benefit: $___
   One-time investment: $___
   Payback period: ___ months

Practice Exercises

You're the operations lead at a mid-size law firm. Attorneys waste hours drafting routine documents that follow standard templates. You want to deploy AI assistance.

Analyze these three use cases:

Use Case A: Contract summarization

  • Input: Client contracts (confidential)
  • Output: Plain-English summaries of key terms
  • Volume: ~200 contracts/month
  • Requirements: Must cite specific clauses

Use Case B: Standard letter drafting

  • Input: Matter type + key details
  • Output: Client correspondence
  • Volume: ~500 letters/month
  • Requirements: Must match firm's formal voice exactly

Use Case C: Legal research assistance

  • Input: Case questions
  • Output: Relevant precedents and analysis
  • Volume: ~100 queries/month
  • Requirements: Must use current case law

For each use case, determine:

  1. Primary need: Knowledge, behavior, or both?

  2. Recommended approach: Prompt engineering, RAG, fine-tuning, or hybrid?

  3. Cost-benefit analysis:

    • Estimated setup cost
    • Ongoing costs
    • Value generated (hours saved × hourly rate)

  4. Build vs. buy decision: In-house or vendor?

  5. Risk assessment: What could go wrong?

Step 4 of 4

Phase 3 Complete!

You've mastered Agentic Orchestration. You can now:

  • Evaluate and select AI tools
  • Work with APIs and automation platforms
  • Test and deploy AI applications responsibly
  • Make informed customization decisions

Before moving to Phase 4, complete:

Lab 5: Build an AI Assistant — Create a functional AI assistant using no-code tools

Lab 5b: Multi-Agent Orchestration — Design a system where AI agents collaborate

Phase 3 Deliverable: Multi-Agent System — Build and deploy a working multi-agent system that demonstrates autonomous reasoning and collaboration

Module Complete!

You've reached the end of this module. Review the key concepts and practice exercise above to make sure you've understood the material.
