Raoul Harris
  • Introduction
  • Technical books
    • Data engineering with Alteryx
    • Deep learning in Python
    • Generative AI in action
    • Generative deep learning
    • Outlier analysis
    • Understanding deep learning
    • Understanding machine learning: from theory to algorithms (in progress)
    • Review: Deep learning: foundations and concepts
  • Technical courses
    • Advanced SQL Server masterclass for data analytics
    • Building full-stack apps with AI
    • Complete Cursor
    • DataOps methodology
    • DeepLearning.AI short courses
    • Generative AI for software development
      • Introduction to generative AI for software development
      • Team software engineering with AI
      • AI-powered software and system design
    • Generative AI with large language models
    • Generative pre-trained transformers
    • IBM DevOps and software engineering
      • Introduction to agile development and scrum
      • Introduction to cloud computing
      • Introduction to DevOps
    • Machine learning in production
    • Reinforcement learning specialization
      • Fundamentals of reinforcement learning
      • Sample-based learning methods
      • Prediction and control with function approximation
  • Non-technical books
    • Management skills for everyday life (in progress)
  • Non-technical courses
    • Business communication and effective communication specializations
      • Business writing
      • Graphic design
      • Successful presentation
      • Giving helpful feedback (not started)
      • Communicating effectively in groups (not started)
    • Illinois Tech MBA courses
      • Competitive strategy (in progress)
    • Leading people and teams specialization
      • Inspiring and motivating individuals
      • Managing talent
      • Influencing people
      • Leading teams
Powered by GitBook
On this page
  • Part 1: Foundations of generative AI
  • Part 2: Advanced techniques and applications
  • Chapter 6: Guide to prompt engineering
  • Chapter 7: Retrieval-augmented generation
  • Chapter 8: Chatting with your data
  • Chapter 9: Tailoring models with model adaptation and fine-tuning
  • Part 3: Deployment and ethical considerations
  1. Technical books

Generative AI in action

Part 1: Foundations of generative AI

This was all pretty introductory. A lot of it would be useful to someone new to the topic, but I don't feel like I personally learned anything.

Part 2: Advanced techniques and applications

Chapter 6: Guide to prompt engineering

Self-consistency sampling improves the reliability of language model outputs by generating multiple independent responses and analyzing their consistency.

Indirect prompt injection embeds commands in seemingly normal content, for example by having the model read files containing new instructions.

Chapter 7: Retrieval-augmented generation

Data grounding involves connecting the model to external sources like databases, APIs, or knowledge graphs to improve accuracy and traceability.

Maximum Inner Product Search (MIPS) finds the vectors in a dataset that have the highest dot product with a query vector.

Marginalization involves considering multiple possible retrieved documents/passages when generating a response, rather than relying on just the top result, by aggregating or "marginalizing" over different pieces of retrieved evidence.

Sparse retrievers use traditional keyword-based methods like TF-IDF or BM25 that rely on exact word matches, while dense retrievers look at vector similarity in an embedding space.

Dense Passage Retrieval (DPR) is a neural information retrieval system that uses two BERT-based encoders to map both queries and passages into dense vector representations.

A vector index is a specialized data structure that organizes high-dimensional vectors for efficient similarity search using techniques like approximate nearest neighbour algorithms (e.g., HNSW, IVF).

Similarity measures:

  • Cosine similarity is ideal for text and document comparisons where vector orientation matters more than magnitude. It is commonly used in NLP tasks.

  • Euclidean (L2) distance measures the actual geometric distance between points in space, making it suitable for spatial and clustering applications.

  • The dot product is computationally efficient for high-dimensional sparse vectors and useful in recommendation systems where vector magnitudes are meaningful.

  • Hamming distance measures the number of positions at which two sequences differ, making it suitable for comparing binary strings or for error detection in data transmission.

  • Manhattan (L1) distance calculates the sum of absolute differences between coordinates, making it useful for grid-based pathfinding and when individual dimension differences need to be emphasized.

Chunking breaks documents into smaller segments for retrieval systems. Approaches should consider factors like semantic coherence, size consistency, and overlap. Common ones include fixed-length splits, sentence/paragraph/section boundaries, sliding windows, and semantic chunking based on NLP.

Adaptive chunking dynamically adjusts segment sizes based on content characteristics, semantic boundaries, and context requirements rather than using fixed-length splits.

A dynamic retrieval window adjusts the amount of retrieved context based on the complexity of the query.

Fallback strategies should be considered for cases where none of the retrieved chunks are relevant.

Chapter 8: Chatting with your data

This chapter was based around an example of applying RAG. There wasn't much in the way of new concepts.

Chapter 9: Tailoring models with model adaptation and fine-tuning

Part 3: Deployment and ethical considerations

This section is a mixture of big-picture stuff and lists of libraries, metrics, etc. that you might want to use. While a lot of it is valuable, there's nothing in particular that it seems to make sense to summarize here. I'll just check the book itself when applicable (or do some searching online for the areas like the list of libraries that will be hopelessly out of date within a few months).

Last updated 7 months ago

Plenty of good stuff and sensible advice in this chapter, but nothing that I've not seen before in places such as Generative AI with large language models, Machine learning in production, and .

Improving the accuracy of LLM applications