Deep learning with Python

Chollet (2021)

This book gives a clear and practical high-level overview of a range of topics, with particularly good coverage of how you actually build models in practice.

Topics covered include:

  • Basics of machine learning
  • Simple neural networks
  • Keras
  • Convolutional networks
  • Recurrent networks
  • Transformers
  • Generative deep learning

The last section is pretty shallow and dated in terms of the topics covered, so Generative deep learning was a good complement.

I've left the more extensive notes for Understanding deep learning, as that book goes into much greater depth on why things work.

Selected notes

Deep learning primarily took off because it outperformed classical approaches, but reducing the need for feature engineering is a big advantage too.

Optimizers that make use of momentum tend to converge faster and are less likely to get stuck in local minima.
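
A minimal Keras sketch of switching momentum on; the model and hyperparameter values here are illustrative:

```python
import tensorflow as tf

# SGD with momentum accumulates a velocity term, so updates keep moving
# through flat regions and small local dips in the loss surface.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    loss="mse",
)
```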

Backpropagation is just the chain rule (though some optimizations are applied in practice).
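
A framework-free sketch of that idea for y = (wx + b)^2, computing dy/dw by multiplying local derivatives backwards:

```python
# Forward pass: compute and store intermediate values.
w, b, x = 3.0, 1.0, 2.0
u = w * x + b   # inner function
y = u ** 2      # outer function

# Backward pass: chain rule, dy/dw = dy/du * du/dw.
dy_du = 2 * u   # derivative of the outer function
du_dw = x       # derivative of the inner function w.r.t. w
dy_dw = dy_du * du_dw

print(dy_dw)  # 28.0 = 2 * (3*2 + 1) * 2
```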

You should generally apply feature-wise normalization to your data.
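
A minimal NumPy sketch on made-up data; the key point is that the test set reuses the training-set statistics:

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.random((100, 5)).astype("float32")  # stand-in training features
test = rng.random((20, 5)).astype("float32")    # stand-in test features

# Feature-wise normalization: zero mean and unit variance per column,
# with statistics computed on the training data only (avoids leakage).
mean = train.mean(axis=0)
std = train.std(axis=0)
train = (train - mean) / std
test = (test - mean) / std
```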

The manifold hypothesis posits that all natural data lies on a low-dimensional manifold within the high-dimensional space where it is encoded.

It often makes sense to train your model until it overfits in order to determine the best number of epochs to train for. If you can't get your model to overfit, it probably needs more capacity.
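
One way to automate the epoch search in Keras is to train with a generous budget and let an early-stopping callback roll back to the best epoch; the data and model below are synthetic stand-ins:

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(200, 10).astype("float32")
y = np.random.rand(200, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Stop once validation loss stops improving, then restore the best weights.
stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)
history = model.fit(x, y, validation_split=0.2, epochs=200,
                    callbacks=[stop], verbose=0)
```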

In the context of neural networks, weight decay is another term for L2 regularization.

Dropout rates are usually set between 0.2 and 0.5.
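
A sketch of both techniques in Keras; the layer sizes and the 1e-4 regularization strength are arbitrary choices:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    # L2 regularization (weight decay) penalizes large weights.
    tf.keras.layers.Dense(
        64, activation="relu",
        kernel_regularizer=tf.keras.regularizers.l2(1e-4)),
    # Dropout rate of 0.5, at the top of the usual 0.2-0.5 range.
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1),
])
```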

You can reduce the size of your model by pruning weights that don't have much impact on the output. Quantization is another option.
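
A sketch of the quantization route using TensorFlow Lite's post-training quantization (pruning would instead use the separate tensorflow-model-optimization package); the model here is a trivial stand-in:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(1),
])

# Post-training quantization: store weights in 8-bit, shrinking the
# model roughly 4x compared to float32.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```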

Convolutional layers are useful for learning local patterns, and stacking them allows the model to learn spatial hierarchies of patterns.
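
A minimal stacked convnet; the filter counts and input size are arbitrary:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),
    # Early layers respond to small local patterns (edges, textures).
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(2),
    # Later layers see patterns of patterns: a spatial hierarchy.
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(128, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```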

You can perform feature extraction on an image model by taking the convolutional base of the model and adding a new classifier/regressor on top.
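
A sketch of this pattern with VGG16 as the convolutional base; the head layers and input shape are illustrative choices:

```python
import tensorflow as tf

# Pretrained convolutional base with its original classifier removed.
conv_base = tf.keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(180, 180, 3))
conv_base.trainable = False  # freeze the pretrained features

# New classifier head, trained from scratch on top of the frozen base.
model = tf.keras.Sequential([
    conv_base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # e.g. a binary task
])
```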

Depthwise separable convolutions allow you to train smaller models that often have better performance.
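
A quick comparison sketch: Keras's SeparableConv2D is a drop-in replacement for Conv2D with the same output shape but far fewer parameters:

```python
import tensorflow as tf

regular = tf.keras.layers.Conv2D(64, 3)
separable = tf.keras.layers.SeparableConv2D(64, 3)

x = tf.random.normal((1, 32, 32, 32))  # dummy batch to build the layers
print(regular(x).shape, separable(x).shape)    # identical output shapes
print(regular.count_params(), separable.count_params())  # ~18.5k vs ~2.4k
```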

The sparsity of activations in a convolutional model increases with the depth of the layer.

To see the pattern that a filter responds to, you can apply gradient ascent in input space.
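
A gradient-ascent sketch along these lines; the VGG16 layer and filter index are arbitrary examples:

```python
import tensorflow as tf

base = tf.keras.applications.VGG16(weights="imagenet", include_top=False)
extractor = tf.keras.Model(
    inputs=base.input, outputs=base.get_layer("block3_conv1").output)

# Start from a random image and repeatedly step it in the direction that
# increases the chosen filter's mean activation (gradient ascent).
image = tf.Variable(tf.random.uniform((1, 128, 128, 3)))
filter_index = 0

for _ in range(30):
    with tf.GradientTape() as tape:
        activation = extractor(image)
        loss = tf.reduce_mean(activation[:, :, :, filter_index])
    grads = tape.gradient(loss, image)
    image.assign_add(10.0 * tf.math.l2_normalize(grads))  # normalized step
```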

Class Activation Maps highlight the discriminative regions in an image that influence the CNN's classification decision. They are generated by mapping the predicted class score back to the feature maps from the last convolutional layer of the network.
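
A Grad-CAM-style sketch of the idea; the random tensor stands in for a real preprocessed photo:

```python
import tensorflow as tf

model = tf.keras.applications.VGG16(weights="imagenet")
grad_model = tf.keras.Model(
    model.input,
    [model.get_layer("block5_conv3").output, model.output])

image = tf.random.uniform((1, 224, 224, 3))  # stand-in input

with tf.GradientTape() as tape:
    conv_maps, preds = grad_model(image)
    top_class = tf.argmax(preds[0])
    score = preds[:, top_class]  # predicted class score

# Weight each feature map by the pooled gradient of the class score,
# then sum into a coarse heatmap of discriminative regions.
grads = tape.gradient(score, conv_maps)
weights = tf.reduce_mean(grads, axis=(0, 1, 2))
heatmap = tf.nn.relu(tf.reduce_sum(conv_maps[0] * weights, axis=-1))
heatmap /= tf.reduce_max(heatmap) + 1e-8  # normalize to [0, 1]
```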

Mixed precision training trains deep neural networks more efficiently by running most computation in a lower-precision format such as float16 while keeping numerically sensitive parts, like the weights and the loss, in float32.
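
A minimal sketch of enabling it in Keras; note that the policy applies globally once set, and the speedup requires supporting GPU/TPU hardware:

```python
import tensorflow as tf

# Compute mostly in float16 while variables stay in float32 for stability.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    # Keep the output layer in float32 so the loss stays numerically stable.
    tf.keras.layers.Dense(10, activation="softmax", dtype="float32"),
])
```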
