# Deep learning in Python

This book gives a clear and practical high-level overview of a range of deep learning topics. It has particularly good coverage of how you actually build the models in practice.

Topics covered include:

* Basics of machine learning
* Simple neural networks
* Keras
* Convolutional networks
* Recurrent networks
* Transformers
* Generative deep learning

The last section is pretty shallow and dated in terms of the topics covered, so [Generative deep learning](/technical-books/generative-deep-learning.md) was a good complement.

I've left the more extensive notes for [Understanding deep learning](/technical-books/understanding-deep-learning.md) as that goes into much greater depth about why things work.

## Selected notes

Deep learning primarily took off due to better performance than classical approaches, but reducing the need for feature engineering is a big advantage too.

Optimizers that make use of momentum tend to converge faster and are less likely to get stuck in local minima.
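As a rough illustration of the convergence claim, here is a minimal sketch of SGD with classical momentum on a toy quadratic. The function names, learning rate and step count are my own choices for the example, and it only demonstrates the faster convergence, not the local minima behaviour.

```python
def sgd_momentum(grad_fn, x0, lr=0.01, momentum=0.9, steps=50):
    """Minimise a 1-D function with SGD plus classical momentum."""
    x, velocity = x0, 0.0
    for _ in range(steps):
        # The velocity is a decaying sum of past gradient steps.
        velocity = momentum * velocity - lr * grad_fn(x)
        x += velocity
    return x


def grad(x):
    # Gradient of the toy loss (x - 3)^2, which has its minimum at x = 3.
    return 2.0 * (x - 3.0)


x_plain = sgd_momentum(grad, x0=0.0, momentum=0.0)  # plain SGD
x_mom = sgd_momentum(grad, x0=0.0, momentum=0.9)    # SGD with momentum
```

With the same learning rate and number of steps, `x_mom` ends up noticeably closer to the minimum than `x_plain`.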

Backpropagation is just the chain rule (though some optimizations are applied in practice).
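To make the chain rule concrete, here is a small sketch for a single sigmoid neuron with a squared-error loss. The model and names are hypothetical, chosen only to keep the derivative short enough to write by hand.

```python
import math


def forward(w, b, x, target):
    """Single sigmoid neuron followed by a squared-error loss."""
    z = w * x + b
    y = 1.0 / (1.0 + math.exp(-z))
    loss = (y - target) ** 2
    return y, loss


def backward(w, b, x, target):
    """Gradients of the loss w.r.t. w and b via the chain rule."""
    y, _ = forward(w, b, x, target)
    dloss_dy = 2.0 * (y - target)   # d(loss)/dy
    dy_dz = y * (1.0 - y)           # derivative of the sigmoid
    dloss_dz = dloss_dy * dy_dz     # chain rule: dL/dz = dL/dy * dy/dz
    return dloss_dz * x, dloss_dz   # dz/dw = x, dz/db = 1


grad_w, grad_b = backward(w=0.5, b=-0.2, x=1.5, target=1.0)
```

Comparing `grad_w` against a finite-difference estimate of the loss is a quick way to check a hand-written backward pass.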

You should generally apply feature-wise normalization to your data.
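A minimal sketch of feature-wise normalization, assuming the data is a list of rows and that each column is standardized to zero mean and unit standard deviation. The helper names are mine. The statistics are computed on the training data only and then reused for any other split.

```python
from statistics import mean, pstdev


def fit_normalizer(train_rows):
    """Compute per-feature mean and standard deviation from training data."""
    columns = list(zip(*train_rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant features to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def normalize(rows, means, stds):
    """Standardize each feature using precomputed statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


train = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
means, stds = fit_normalizer(train)
train_norm = normalize(train, means, stds)
```

After this, both columns of `train_norm` are on the same scale even though the raw features differ by two orders of magnitude.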

The manifold hypothesis posits that all natural data lies on a low-dimensional manifold within the high-dimensional space where it is encoded.

It often makes sense to train your model until it overfits in order to determine the best number of epochs to train for. If you can't get your model to overfit, it probably needs more capacity.
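In code, picking the epoch count comes down to finding where the validation loss bottoms out. A small sketch, assuming you have recorded one validation loss per epoch (the numbers below are made up for illustration):

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


# Illustrative values only: the loss falls, then rises once overfitting starts.
val_losses = [0.90, 0.60, 0.50, 0.55, 0.70]
epochs_to_train = best_epoch(val_losses)
```

You would then retrain from scratch for `epochs_to_train` epochs.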

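In the context of neural networks, weight decay is often used as another term for L2 regularization. The two are equivalent under plain SGD, but they differ for adaptive optimizers such as Adam, which is why decoupled variants like AdamW exist.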
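The link between the two terms is easiest to see for plain SGD on a single weight. This is a sketch under that assumption: the function names are hypothetical and the penalty is taken to be `l2 * w**2`.

```python
def sgd_step_l2(w, grad, lr, l2):
    """SGD step where the loss includes an l2 * w**2 penalty."""
    # The penalty contributes 2 * l2 * w to the gradient.
    return w - lr * (grad + 2.0 * l2 * w)


def sgd_step_weight_decay(w, grad, lr, decay):
    """SGD step that shrinks the weight, then applies the data gradient."""
    return w * (1.0 - decay) - lr * grad


lr, l2 = 0.1, 0.01
w_l2 = sgd_step_l2(0.8, grad=0.3, lr=lr, l2=l2)
# The two updates match when decay = 2 * lr * l2.
w_decay = sgd_step_weight_decay(0.8, grad=0.3, lr=lr, decay=2.0 * lr * l2)
```

The match relies on the gradient being applied unscaled, which is the case for plain SGD.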

Dropout rates are usually set between 0.2 and 0.5.
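For reference, a minimal sketch of what a dropout layer does to a list of activations, using the common "inverted" formulation where the surviving values are scaled up during training. The function signature is my own, not a library API.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate` during training."""
    if not training or rate == 0.0:
        # At inference time dropout is a no-op.
        return list(activations)
    keep = 1.0 - rate
    # Scale the survivors by 1 / keep so the expected value is unchanged.
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)
dropped = dropout([1.0] * 10000, rate=0.5, rng=rng)
```

With a rate of 0.5, roughly half the entries of `dropped` are zero and the rest are 2.0, so the mean stays close to 1.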

You can reduce the size of your model by pruning weights that don't have much impact on the output. Quantization is another option.
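Both ideas can be sketched on a flat list of weights. This assumes simple magnitude-based pruning and symmetric 8-bit quantization with a single scale factor. Real toolkits are more sophisticated, and the helper names here are hypothetical.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out roughly the smallest `fraction` of weights by magnitude."""
    k = int(len(weights) * fraction)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    # Ties at the threshold are pruned too, so slightly more than k may go.
    return [0.0 if abs(w) <= threshold else w for w in weights]


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with one shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate float weights from their integer form."""
    return [q * scale for q in quantized]


weights = [0.5, -0.02, 0.3, 0.01, -0.8, 0.05]
pruned = prune_by_magnitude(weights, fraction=0.5)
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
```

Pruning keeps the three largest weights here, and the quantization round trip changes each weight by at most half a quantization step.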

Convolutional layers are useful to learn local patterns. Stacking them can allow the model to learn spatial hierarchies.
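To show what a local pattern means here, this is a plain-Python sketch of a single "valid" 2D convolution (strictly a cross-correlation, as in most deep learning libraries) with a hand-written vertical edge kernel. In a real network the kernel values are learned.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` with no padding and a stride of 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A 4x4 image that is dark on the left and bright on the right.
image = [[0, 0, 1, 1] for _ in range(4)]
# Responds where intensity increases from left to right.
vertical_edge = [[-1, 1], [-1, 1]]
response = conv2d(image, vertical_edge)
```

The response is non-zero only in the middle column, where the edge sits, regardless of which row the window is on.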

You can perform feature extraction on an image model by taking the convolutional base of the model and adding a new classifier/regressor on top.

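Depthwise separable convolutions split a standard convolution into a per-channel spatial convolution followed by a 1x1 pointwise convolution. This lets you train smaller models that often perform better.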
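The size reduction is easy to check with a parameter count. This sketch ignores bias terms and assumes square kernels. The layer sizes are just an example.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a regular 2D convolution (no bias)."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise separable convolution (no bias)."""
    depthwise = kernel_size * kernel_size * in_channels  # one filter per channel
    pointwise = in_channels * out_channels               # 1x1 channel mixing
    return depthwise + pointwise


standard = standard_conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
```

For a 3x3 layer going from 64 to 128 channels, that is 73,728 weights against 8,768.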

The sparsity of activations in a convolutional model increases with the depth of the layer.

In order to see the pattern that a filter corresponds to, you can apply gradient ascent in input space.
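A toy sketch of the idea, assuming a single linear filter whose response is the dot product of its weights with the input. For that special case the gradient with respect to the input is simply the weights. A real model would get the gradient from automatic differentiation instead, and all names here are hypothetical.

```python
import math


def normalize(vec):
    """Rescale a vector to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def filter_response_grad(filter_weights, x):
    # For a linear response sum(w_i * x_i), the gradient w.r.t. x is w.
    return list(filter_weights)


def visualize_filter(filter_weights, x0, lr=0.5, steps=50):
    """Gradient ascent on the input to maximise the filter response."""
    x = normalize(x0)
    for _ in range(steps):
        grad = filter_response_grad(filter_weights, x)
        # Step uphill, then renormalize so the input cannot grow unboundedly.
        x = normalize([xi + lr * gi for xi, gi in zip(x, grad)])
    return x


edge_filter = [1.0, -1.0, 1.0, -1.0]
pattern = visualize_filter(edge_filter, x0=[0.5, 0.5, 0.5, 0.5])
```

Starting from a flat input, `pattern` converges to the alternating pattern that the filter responds to most strongly.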

Class Activation Maps highlight the discriminative regions in an image that influence the CNN's classification decision. They are generated by mapping the predicted class score back to the feature maps from the last convolutional layer of the network.
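The core of the computation is a weighted sum of the last convolutional layer's feature maps. In this sketch the per-channel weights are assumed to be given. How they are obtained depends on the variant (for example, from the final classifier weights or from gradients of the class score), and the function name is my own.

```python
def class_activation_map(feature_maps, class_weights):
    """Combine feature maps into a heatmap scaled to [0, 1]."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    heatmap = [
        [
            # Keep only regions that contribute positively to the class.
            max(0.0, sum(w * fmap[i][j] for w, fmap in zip(class_weights, feature_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    peak = max(max(row) for row in heatmap) or 1.0
    return [[value / peak for value in row] for row in heatmap]


# Two tiny 2x2 feature maps and illustrative per-channel weights.
feature_maps = [[[1, 0], [0, 0]], [[0, 0], [0, 2]]]
heatmap = class_activation_map(feature_maps, class_weights=[0.5, 1.0])
```

The resulting heatmap is then usually upsampled to the input resolution and overlaid on the image.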

Mixed precision training makes training deep neural networks more efficient by using different numerical formats for different parts of the training process, typically 16-bit floats for most of the computation and 32-bit floats for the stored weights.
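One reason the formats are mixed rather than using 16-bit everywhere is that small weight updates can vanish in half precision. This sketch simulates that with the standard library's `struct` module, rounding a value to float16 or float32 after every update. The update size and step count are arbitrary choices.

```python
import struct


def to_float16(value):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_float32(value):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


update = 1e-4
w16 = to_float16(1.0)
w32 = to_float32(1.0)
for _ in range(1000):
    # Each update is smaller than half the float16 spacing near 1.0,
    # so the float16 weight rounds back to where it started.
    w16 = to_float16(w16 + update)
    w32 = to_float32(w32 + update)
```

After 1,000 updates the float16 weight has not moved, while the float32 copy has accumulated them as expected.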

