In recent years, neural networks have become a basic tool in building the most exciting AI projects: from powering virtual assistants like Siri and Alexa, to guiding self-driving cars, predicting house prices, and even writing new literary works. Neural networks are becoming an integral part of our everyday lives, and their relevance is ever increasing.
In this project you will learn the basic ideas behind neural networks and the main types of network architectures. You will learn what the building blocks of different types of networks are, how to put them together, and how to train a network and optimise its performance. You will also learn about a recent and powerful type of network, the transformer, and in particular the idea of "attention", which is at its core.
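To give a flavour of the "attention" idea mentioned above, here is a minimal NumPy sketch of scaled dot-product attention, the operation at the heart of the transformer. All names and dimensions are illustrative, and this omits the learned projection matrices and multiple heads of a full transformer:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # softmax over the keys (subtracting the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output is a weighted average of the values

# Toy example: 3 queries attending over 4 key/value pairs of dimension 2
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 2))
K = rng.normal(size=(4, 2))
V = rng.normal(size=(4, 2))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # one 2-dimensional output per query: (3, 2)
```

Each row of the output mixes the value vectors, with mixing weights determined by how well the corresponding query matches each key; this is the sense in which the network "attends" to different inputs.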
While this project will require you to gain significant theoretical knowledge, it will be equally important to show how this knowledge works in practice by implementing what you have learnt in code. It is therefore very important that you have a strong knowledge of Python and that you enjoy programming!
Strong knowledge of Python. Familiarity with basic probability theory.
The literature on this subject is vast, and most of it is available online. The following are just good starting points:
If you like things the 'old-fashioned way', here is a site where you can find many free books: freecomputerbooks.com.