Deep Learning Tutorial
Master Deep Learning from fundamentals of neural networks to advanced architectures like CNNs, RNNs, Transformers, and GANs with practical implementations in TensorFlow and PyTorch.
Neural Networks
From perceptrons to deep nets
Computer Vision
CNN, YOLO, ResNet
NLP
RNN, LSTM, Transformers
Generative AI
GANs, VAEs
Introduction to Deep Learning
Deep Learning is a subset of machine learning that uses neural networks with many layers (deep neural networks) to model complex patterns in data. Inspired by the structure and function of the human brain, deep learning has revolutionized fields like computer vision, natural language processing, and generative AI.
Evolution of Deep Learning
- 1943: First neural network model (McCulloch-Pitts)
- 1958: Perceptron (Rosenblatt)
- 1986: Backpropagation popularized (Rumelhart, Hinton & Williams)
- 2012: AlexNet wins ImageNet (Modern DL era)
- 2017: Transformer architecture (Vaswani et al.)
- 2020+: GPT, DALL-E, Generative AI boom
Why Deep Learning?
- Automatic feature extraction - no manual feature engineering
- Scales with data - more data generally means better performance
- State-of-the-art results in vision, language, and speech
- Transfer learning - leverage pre-trained models
- Versatile architectures for many data types
- Backed by industry and research momentum (Google, Meta, OpenAI)
First Neural Network: Perceptron
A perceptron is the simplest form of a neural network: a single neuron that makes a binary decision by computing a weighted sum of its inputs.
import numpy as np

class Perceptron:
    def __init__(self, learning_rate=0.01, epochs=100):
        self.lr = learning_rate
        self.epochs = epochs
        self.weights = None
        self.bias = None

    def activation(self, x):
        # Step function
        return 1 if x >= 0 else 0

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0
        for _ in range(self.epochs):
            for idx, x_i in enumerate(X):
                linear_output = np.dot(x_i, self.weights) + self.bias
                y_predicted = self.activation(linear_output)
                # Update weights and bias
                update = self.lr * (y[idx] - y_predicted)
                self.weights += update * x_i
                self.bias += update

    def predict(self, X):
        linear_output = np.dot(X, self.weights) + self.bias
        return np.array([self.activation(x) for x in linear_output])
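As a quick sanity check, the same update rule can be run inline on the AND gate. AND is linearly separable, so the perceptron is guaranteed to converge; the learning rate and epoch count below are illustrative.

```python
import numpy as np

# AND gate truth table
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(2)
b = 0.0
lr = 0.1
for _ in range(20):
    for xi, yi in zip(X, y):
        pred = 1 if xi @ w + b >= 0 else 0   # step activation
        update = lr * (yi - pred)            # perceptron update rule
        w += update * xi
        b += update

preds = [1 if xi @ w + b >= 0 else 0 for xi in X]
print(preds)  # -> [0, 0, 0, 1]
```

Note that a single perceptron cannot learn XOR, which is not linearly separable; that limitation is what motivates the multi-layer networks below.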
Neural Networks Fundamentals
At its core, a neural network consists of layers of interconnected neurons. Each connection has a weight, and each neuron has an activation function that determines its output.
Simple Feed-Forward Neural Network Architecture
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Expects x to already be a sigmoid output: d/dz sigmoid(z) = s * (1 - s)
    return x * (1 - x)

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights and biases
        self.W1 = np.random.randn(input_size, hidden_size) * 0.5
        self.b1 = np.zeros((1, hidden_size))
        self.W2 = np.random.randn(hidden_size, output_size) * 0.5
        self.b2 = np.zeros((1, output_size))

    def forward(self, X):
        # Forward propagation
        self.z1 = np.dot(X, self.W1) + self.b1
        self.a1 = sigmoid(self.z1)
        self.z2 = np.dot(self.a1, self.W2) + self.b2
        self.a2 = sigmoid(self.z2)
        return self.a2

    def backward(self, X, y, output):
        # Backpropagation
        m = X.shape[0]
        # Output layer error
        self.dz2 = output - y
        self.dW2 = (1/m) * np.dot(self.a1.T, self.dz2)
        self.db2 = (1/m) * np.sum(self.dz2, axis=0, keepdims=True)
        # Hidden layer error
        self.da1 = np.dot(self.dz2, self.W2.T)
        self.dz1 = self.da1 * sigmoid_derivative(self.a1)
        self.dW1 = (1/m) * np.dot(X.T, self.dz1)
        self.db1 = (1/m) * np.sum(self.dz1, axis=0, keepdims=True)

    def update(self, lr=0.1):
        # Gradient descent
        self.W1 -= lr * self.dW1
        self.b1 -= lr * self.db1
        self.W2 -= lr * self.dW2
        self.b2 -= lr * self.db2
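Wiring the forward, backward, and update steps together gives a complete training loop. The self-contained sketch below inlines those same equations and trains on XOR; the learning rate, hidden size, and epoch count are illustrative choices, and the loss is tracked before and after to confirm learning.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

sigmoid = lambda z: 1 / (1 + np.exp(-z))
W1 = rng.normal(size=(2, 4)) * 0.5; b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1)) * 0.5; b2 = np.zeros((1, 1))
lr, m = 0.5, X.shape[0]

def loss():
    # Binary cross-entropy on the current parameters
    a1 = sigmoid(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)
    return -np.mean(y * np.log(a2) + (1 - y) * np.log(1 - a2))

initial_loss = loss()
for _ in range(5000):
    # Forward pass
    a1 = sigmoid(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)
    # Backward pass (sigmoid output + cross-entropy gives dz2 = a2 - y)
    dz2 = a2 - y
    dz1 = (dz2 @ W2.T) * a1 * (1 - a1)
    # Gradient descent update
    W2 -= lr * (a1.T @ dz2) / m; b2 -= lr * dz2.mean(axis=0, keepdims=True)
    W1 -= lr * (X.T @ dz1) / m;  b1 -= lr * dz1.mean(axis=0, keepdims=True)

final_loss = loss()
print(initial_loss, "->", final_loss)  # loss should drop substantially
```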
Activation Functions
Activation functions introduce non-linearity into neural networks, enabling them to learn complex patterns.
Sigmoid

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

Range: (0, 1)
Best for: Binary classification output

Tanh

def tanh(x):
    return np.tanh(x)

Range: (-1, 1)
Best for: Hidden layers (zero-centered)

ReLU

def relu(x):
    return np.maximum(0, x)

Range: [0, ∞)
Best for: Most hidden layers
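Evaluating the three functions on the same inputs makes their ranges concrete:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0, x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # squashed into (0, 1); sigmoid(0) = 0.5
print(tanh(x))     # squashed into (-1, 1); tanh(0) = 0, zero-centered
print(relu(x))     # negatives clipped to 0: [0. 0. 2.]
```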
Deep Learning Frameworks
TensorFlow / Keras
High-level API for quick prototyping
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
PyTorch
Dynamic computation graphs, research-focused
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)
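A PyTorch training loop follows the same pattern regardless of architecture: forward pass, loss, backward pass, optimizer step. The model dimensions, optimizer, and random data below are illustrative, not part of any particular task:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

X = torch.randn(16, 4)  # dummy inputs
y = torch.randn(16, 1)  # dummy targets

initial_loss = loss_fn(model(X), y).item()
for _ in range(200):
    optimizer.zero_grad()        # clear accumulated gradients
    loss = loss_fn(model(X), y)  # forward pass
    loss.backward()              # backpropagate
    optimizer.step()             # update parameters
final_loss = loss.item()
print(initial_loss, "->", final_loss)
```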
Convolutional Neural Networks (CNNs)
CNNs are designed to process grid-like data such as images. They use convolutional layers, pooling layers, and fully connected layers.
import tensorflow as tf

model = tf.keras.Sequential([
    # Convolutional Block 1
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # Convolutional Block 2
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # Convolutional Block 3
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # Classifier
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.summary()
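What a Conv2D layer actually computes can be seen in plain NumPy: slide a kernel over the image and sum the elementwise products of each patch. This toy sketch handles one channel with 'valid' padding (no padding), using a made-up image and a simple horizontal edge-detector kernel:

```python
import numpy as np

def conv2d_valid(image, kernel):
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Elementwise product of the kernel with one image patch
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [1, 1, 0, 0]], dtype=float)
edge_kernel = np.array([[1, -1]], dtype=float)  # responds to horizontal change
result = conv2d_valid(image, edge_kernel)
print(result)  # each row -> [0. 1. 0.]: fires only at the 1->0 edge
```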
Recurrent Neural Networks (RNNs) & LSTMs
RNNs are designed for sequential data. LSTMs mitigate the vanishing gradient problem with gated memory cells, letting them capture long-term dependencies.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(10000, 128, input_length=100),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
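The recurrence an LSTM refines can be written directly: a vanilla RNN carries a hidden state h through time, combining it with each new input. A self-contained sketch with illustrative dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, seq_len = 3, 5, 4

Wx = rng.normal(size=(d_in, d_hidden)) * 0.1      # input-to-hidden weights
Wh = rng.normal(size=(d_hidden, d_hidden)) * 0.1  # hidden-to-hidden weights
b = np.zeros(d_hidden)

xs = rng.normal(size=(seq_len, d_in))  # one sequence of 4 timesteps
h = np.zeros(d_hidden)
for x_t in xs:
    # Each step sees the current input AND the previous hidden state:
    # h_t = tanh(x_t Wx + h_{t-1} Wh + b)
    h = np.tanh(x_t @ Wx + h @ Wh + b)

print(h)  # final hidden state summarizes the whole sequence
```

Repeated multiplication by Wh is exactly what makes gradients vanish or explode over long sequences; the LSTM's gates are designed to control that flow.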
Transformers & Attention
The Transformer architecture uses self-attention mechanisms and has become the foundation of modern NLP (BERT, GPT).
Attention Mechanism
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V"""
    d_k = K.shape[-1]
    # Transpose the last two axes of K so this works for any batched shape
    scores = np.matmul(Q, np.swapaxes(K, -1, -2)) / np.sqrt(d_k)
    attention_weights = softmax(scores, axis=-1)
    output = np.matmul(attention_weights, V)
    return output, attention_weights
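A quick shape check makes the mechanism concrete: each query row's attention weights form a probability distribution over the keys. The tensors and dimensions below are random and purely illustrative; the computation is restated inline so the sketch runs on its own:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4, 8))  # (batch, seq_len, d_k)
K = rng.normal(size=(2, 4, 8))
V = rng.normal(size=(2, 4, 8))

d_k = K.shape[-1]
scores = np.matmul(Q, np.swapaxes(K, -1, -2)) / np.sqrt(d_k)
weights = softmax(scores)        # (2, 4, 4): one distribution per query
output = np.matmul(weights, V)   # (2, 4, 8): weighted mix of the values

print(output.shape, weights.sum(axis=-1))  # weight rows each sum to 1
```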
Generative Adversarial Networks (GANs)
GANs consist of a generator and a discriminator that compete against each other, producing realistic synthetic data.
Generator: Creates fake samples from random noise
Discriminator: Tries to distinguish real from fake
Training: Min-max game between generator and discriminator
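The min-max game can be made concrete with the standard GAN losses: the discriminator is rewarded for scoring real samples near 1 and fakes near 0, while the generator is rewarded when its fakes fool the discriminator. The discriminator outputs below are made-up numbers purely for illustration:

```python
import numpy as np

def bce(p, target):
    # Binary cross-entropy between predicted probabilities and targets
    eps = 1e-8
    return -np.mean(target * np.log(p + eps) + (1 - target) * np.log(1 - p + eps))

d_real = np.array([0.9, 0.8])  # D(x): discriminator scores on real samples
d_fake = np.array([0.2, 0.3])  # D(G(z)): discriminator scores on fakes

# Discriminator loss: push real scores toward 1, fake scores toward 0
d_loss = bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))
# Generator loss: push the discriminator's fake scores toward 1
g_loss = bce(d_fake, np.ones_like(d_fake))
print(d_loss, g_loss)
```

In training, each network minimizes its own loss in alternating steps, so improving one increases pressure on the other.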
Deep Learning Applications
Computer Vision
- Image Classification
- Object Detection (YOLO, SSD)
- Semantic Segmentation
- Face Recognition
Natural Language Processing
- Machine Translation
- Text Summarization
- Sentiment Analysis
- Chatbots & LLMs
Speech & Audio
- Speech Recognition
- Text-to-Speech
- Music Generation
- Speaker Identification