AI Cheatsheet

AI Fundamentals

AI Overview

Artificial Intelligence (AI)

Field of computer science dedicated to creating systems capable of performing tasks that typically require human intelligence.

Machine Learning (ML)

Subset of AI that uses statistical techniques to enable machines to improve at tasks with experience.

Deep Learning (DL)

Subset of ML that uses multi-layered neural networks to learn representations directly from data.

Note: AI encompasses ML, which in turn encompasses DL. Each is a subset of the previous category.

Types of AI

1. Narrow AI (Weak AI)

Designed to perform a specific task (e.g., facial recognition, internet searches, self-driving cars).

2. General AI (Strong AI)

Hypothetical AI that exhibits human-like intelligence and can apply knowledge to different problems.

3. Superintelligent AI

Theoretical AI that surpasses human intelligence and cognitive abilities in virtually all domains.

Current State: All existing AI systems are considered Narrow AI. General AI remains a theoretical concept.

Machine Learning

ML Learning Types

Supervised Learning

Algorithm learns from labeled training data, then predicts labels or values for new, unseen inputs.

Examples: Classification, Regression

Unsupervised Learning

Algorithm learns patterns from unlabeled data without specific guidance.

Examples: Clustering, Association
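
As a rough illustration of clustering, here is a minimal 1-D k-means sketch in plain Python. K-means is only one clustering method, chosen here as an example; the function name `kmeans_1d`, the sample points, and the starting centroids are all made up for this sketch.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Cluster 1-D points by alternating assignment and centroid update."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups
points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
```

No labels are supplied; the grouping comes only from distances between the points.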

Reinforcement Learning

Algorithm learns through trial and error using feedback from its actions.

Examples: Game AI, Robotics
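
One way to make "learning from feedback" concrete is the tabular Q-learning update, sketched below. Q-learning is just one reinforcement learning method, used here as an assumption; the states, actions, `alpha` (learning rate), and `gamma` (discount factor) values are hypothetical.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one Q-learning step:
    Q(s, a) += alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


# Hypothetical two-state table of action values
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 1.0},
}
# The agent took "right" in s0, received reward 1.0, and landed in s1
new_value = q_update(q, "s0", "right", reward=1.0, next_state="s1")
```

Repeating this update over many interactions is what lets the value estimates improve by trial and error.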

Semi-supervised Learning

Combines a small amount of labeled data with a large amount of unlabeled data.

Common ML Algorithms

Linear Regression

Predicts continuous values based on a linear relationship between variables.

Use Case: Predicting house prices
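
A minimal sketch of single-feature linear regression using the closed-form least-squares solution. The house sizes and prices are made-up numbers, and `fit_line` is a name chosen for this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: size in square meters, price in thousands
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 90 + intercept  # predicted price for a 90 m^2 house
```

With several features, a library such as scikit-learn would normally be used instead of hand-written formulas.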

Logistic Regression

Models the probability of a categorical outcome; most commonly used for binary classification.

Use Case: Spam detection
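
The sketch below trains a one-feature logistic regression with batch gradient descent. It assumes a toy dataset where the feature is a count of suspicious words per message; the data, learning rate, and epoch count are illustrative choices, not tuned values.

```python
import math


def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=1000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # gradient of log loss w.r.t. the logit
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return w, b


# Hypothetical data: suspicious-word count per message, 1 = spam
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(xs, ys)
p_low = sigmoid(w * 0.5 + b)   # probability of spam for a low count
p_high = sigmoid(w * 4.5 + b)  # probability of spam for a high count
```

Thresholding the predicted probability at 0.5 turns the model into a binary classifier.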

Decision Trees

Tree-like model of decisions and their possible consequences.

Use Case: Customer segmentation
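
Decision trees pick splits that make the resulting groups purer. The sketch below scores splits with Gini impurity, which is one common criterion and an assumption here, since the text above does not name one. The labels are made up.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for mixed groups."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted impurity of the two groups produced by a split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


parent = ["a", "a", "b", "b"]
parent_impurity = gini(parent)
pure_split = split_gini(["a", "a"], ["b", "b"])   # separates the classes
mixed_split = split_gini(["a", "b"], ["a", "b"])  # separates nothing
```

A tree-building algorithm would compare candidate splits this way and keep the one with the lowest weighted impurity.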

Random Forest

Ensemble method using multiple decision trees for better accuracy.

Use Case: Disease prediction

Support Vector Machines (SVM)

Finds the optimal hyperplane that separates classes in feature space.

Use Case: Image classification

Deep Learning

Neural Networks

Artificial Neural Network (ANN)

Computing system inspired by biological neural networks.

Components: Input layer, Hidden layers, Output layer

# Simple Neural Network in Keras
from tensorflow import keras
from tensorflow.keras import layers

# Define model
model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(10,)),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])

# Compile model
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

Activation Functions

ReLU: Rectified Linear Unit - f(x) = max(0, x)

Sigmoid: f(x) = 1 / (1 + e^(-x))

Tanh: f(x) = (e^x - e^(-x)) / (e^x + e^(-x)), output in (-1, 1)

Softmax: f(x_i) = e^(x_i) / sum_j e^(x_j) - converts logits to probabilities that sum to 1
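
The activation functions listed above can be written in a few lines of plain Python. This is a sketch for scalar inputs (and a list of logits for softmax); deep learning frameworks provide vectorized versions. Subtracting the maximum logit in `softmax` is a common numerical-stability trick and does not change the result.

```python
import math


def relu(x):
    """Rectified Linear Unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Squashes x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squashes x into (-1, 1)."""
    return math.tanh(x)


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    largest = max(logits)  # shift for numerical stability
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([1.0, 2.0, 3.0])
```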

Specialized Networks

Convolutional Neural Networks (CNN)

Designed for processing structured grid data like images.

Layers: Convolutional, Pooling, Fully Connected

Use Cases: Image recognition, Object detection
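
To show what the convolutional and pooling layers compute, here is a plain-Python sketch of a single-channel "valid" convolution (stride 1, no padding) followed by non-overlapping max pooling. As in most deep learning frameworks, the kernel is applied without flipping. The tiny image and the edge-detecting kernel are made up for illustration.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Kernel that responds to a dark-to-bright transition from left to right
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
pooled = max_pool2d(feature_map)
```

The feature map is largest where the edge sits, and pooling keeps that strongest response while shrinking the output.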

Recurrent Neural Networks (RNN)

Designed for sequential data where context matters.

Variants: LSTM, GRU

Use Cases: Time series, Language modeling

Transformers

Attention-based architecture that processes all tokens of a sequence in parallel and now underlies most state-of-the-art NLP models.

Key Concept: Self-attention mechanism

Examples: BERT, GPT models
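
The self-attention mechanism can be sketched as scaled dot-product attention. The version below is deliberately simplified: it uses a single head and treats the input vectors directly as queries, keys, and values, whereas real Transformers apply learned projection matrices first. The token vectors are made up.

```python
import math


def softmax(scores):
    """Convert scores into weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no projections)."""
    d = len(x[0])
    output = []
    for query in x:
        # Similarity of this token to every token, scaled by sqrt(d)
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in x
        ]
        weights = softmax(scores)
        # Each output vector is a weighted average of all value vectors
        output.append([
            sum(w * value[j] for w, value in zip(weights, x))
            for j in range(d)
        ])
    return output


# Hypothetical sequence of three 2-dimensional token vectors
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens)
```

Every output vector mixes information from the whole sequence, which is how attention captures context without recurrence.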

# Simple CNN in Keras
from tensorflow import keras
from tensorflow.keras import layers

# Input: 28x28 grayscale images; output: probabilities over 10 classes
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

Natural Language Processing

NLP Fundamentals

Tokenization

Breaking text into words, phrases, or other meaningful elements.

Stemming & Lemmatization

Stemming: Reducing words to their word stem

Lemmatization: Reducing words to their lemma (dictionary form)

Word Embeddings

Representing words as vectors in a continuous vector space.

Examples: Word2Vec, GloVe, FastText
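
Once words are vectors, similarity between words becomes a geometric question. The sketch below compares vectors with cosine similarity. The 3-dimensional embeddings are invented for illustration and do not come from Word2Vec, GloVe, or any trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings; real ones typically have 50 to 300+ dimensions
embeddings = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.0, 0.9],
}

sim_king_queen = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_king_apple = cosine_similarity(embeddings["king"], embeddings["apple"])
```

In this toy setup the related pair scores higher than the unrelated pair, which is the property trained embeddings aim for.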

TF-IDF

Term Frequency-Inverse Document Frequency - statistical measure of how important a word is to a document relative to a corpus.
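
A minimal TF-IDF sketch in plain Python, assuming the basic unsmoothed definition: tf = term count / document length, idf = log(N / document frequency). Libraries often use smoothed or normalized variants, so their numbers can differ. The three documents are made up.

```python
import math


def tf_idf(docs):
    """Return one {term: score} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears
    doc_freq = {}
    for tokens in tokenized:
        for term in set(tokens):
            doc_freq[term] = doc_freq.get(term, 0) + 1

    scores = []
    for tokens in tokenized:
        doc_scores = {}
        for term in set(tokens):
            tf = tokens.count(term) / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            doc_scores[term] = tf * idf
        scores.append(doc_scores)
    return scores


docs = ["the cat sat", "the dog barked", "the cat purred"]
scores = tf_idf(docs)
```

Here "the" appears in every document and scores zero, while "sat" appears in only one and scores highest within the first document.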

# Text preprocessing with NLTK
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time downloads of tokenizer and WordNet data
# (recent NLTK releases may also need 'punkt_tab')
nltk.download('punkt')
nltk.download('wordnet')

# Tokenization
text = "Natural Language Processing is fascinating!"
tokens = word_tokenize(text)

# Stemming
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

# Lemmatization (treats each token as a noun unless pos= is given)
lemmatizer = WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(token) for token in tokens]

Advanced NLP Techniques

Transformer Models

Architecture built on self-attention, which lets every token weigh every other token in the input.

Examples: BERT, GPT, T5, RoBERTa

BERT (Bidirectional Encoder Representations from Transformers)

Pre-trained transformer model that understands context bidirectionally.

Use Cases: Question answering, Sentiment analysis

GPT (Generative Pre-trained Transformer)

Autoregressive language model that generates human-like text.

Use Cases: Text generation, Chatbots, Content creation

# Using Hugging Face Transformers
from transformers import pipeline

# Sentiment analysis (downloads a default model on first use)
classifier = pipeline('sentiment-analysis')
result = classifier("I love this amazing AI technology!")  # list of dicts with 'label' and 'score'

# Text generation
generator = pipeline('text-generation', model='gpt2')
generated_text = generator("The future of AI is", max_length=50)  # list of dicts with 'generated_text'

# Question answering
qa_pipeline = pipeline('question-answering')
answer = qa_pipeline(
    question='What is AI?',
    context='Artificial intelligence is intelligence demonstrated by machines.'
)  # dict with 'answer', 'score', 'start', 'end'

Computer Vision

CV Fundamentals

Image Processing

Basic operations on images to enhance or extract information.

Techniques: Filtering, Edge detection, Thresholding

Feature Extraction

Identifying and describing relevant patterns in images.

Methods: SIFT, SURF, HOG, ORB

Object Detection

Identifying objects within images and locating them.

Algorithms: R-CNN, YOLO, SSD
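
Locating an object usually means predicting a bounding box, and a standard way to compare a predicted box with a reference box is Intersection over Union (IoU). IoU is not named in the text above; it is added here as a commonly used overlap measure. The sketch assumes boxes in `(x1, y1, x2, y2)` corner format, and the coordinates are made up.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle, if any
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
overlap = iou(predicted, ground_truth)
```

An IoU of 1.0 means the boxes coincide, and 0.0 means they do not overlap at all.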

Image Segmentation

Partitioning an image into multiple segments.

Types: Semantic, Instance, Panoptic segmentation

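# Image processing with OpenCV
import cv2

# Read image (cv2.imread returns None if the file cannot be read)
image = cv2.imread('image.jpg')
if image is None:
    raise FileNotFoundError("Could not read 'image.jpg'")

# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Edge detection (100 and 200 are the lower and upper hysteresis thresholds)
edges = cv2.Canny(gray, 100, 200)

# Display image until a key is pressed
cv2.imshow('Edges', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()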

Advanced CV Techniques

Transfer Learning

Using pre-trained models as starting points for new tasks.

Popular Models: VGG, ResNet, Inception, EfficientNet

Generative Adversarial Networks (GANs)

Framework in which two neural networks, a generator and a discriminator, are trained against each other to generate new data.

Applications: Image generation, Style transfer, Data augmentation

Image Captioning

Generating textual descriptions of images.

Approach: CNN + RNN/Transformer architectures

Face Recognition

Identifying or verifying a person from a digital image.

Techniques: FaceNet, DeepFace, OpenFace

# Using pre-trained models with TensorFlow Hub
import tensorflow as tf
import tensorflow_hub as hub

# Load a pre-trained image feature vector model
model_url = "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4"
model = hub.load(model_url)

# Preprocess image
image = tf.io.read_file('image.jpg')
image = tf.image.decode_jpeg(image, channels=3)
image = tf.image.resize(image, [224, 224])
image = image / 255.0

# Extract features
features = model(tf.expand_dims(image, axis=0))

AI Tools & Frameworks

Python Libraries

TensorFlow & Keras

Open-source platform for machine learning.

Use Cases: Deep learning, Neural networks

PyTorch

Open-source machine learning library based on Torch.

Strengths: Research, Dynamic computation graphs

Scikit-learn

Machine learning library for classical algorithms.

Use Cases: Classification, Regression, Clustering

OpenCV

Library for computer vision and image processing.

Use Cases: Image processing, Object detection

NLTK & spaCy

Libraries for natural language processing.

NLTK: Education, Research

spaCy: Production, Efficiency

Hugging Face Transformers

Library for state-of-the-art natural language processing.

Use Cases: BERT, GPT, and other transformer models

Deployment & MLOps

MLflow

Platform to manage the ML lifecycle.

Features: Tracking, Projects, Models, Registry

Kubeflow

Kubernetes-based platform for ML workflows.

Use Cases: Scalable ML deployments

TensorFlow Serving

Flexible, high-performance serving system for ML models.

ONNX (Open Neural Network Exchange)

Open format to represent deep learning models.

Benefit: Interoperability between frameworks

Cloud AI Platforms

AWS SageMaker: End-to-end ML service

Google Vertex AI (formerly AI Platform): ML model development and deployment

Azure Machine Learning: Cloud-based ML environment

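# Model serving with TensorFlow Serving

# Save model in SavedModel format inside a numeric version directory;
# TensorFlow Serving looks for <model_base_path>/<version>/
# (on Keras 3, use model.export('my_model/1') instead)
model.save('my_model/1', save_format='tf')

# Install TensorFlow Serving

# Run serving (command line); model_base_path must be an absolute path
tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=my_model \
  --model_base_path=/path/to/my_model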

Additional Resources

Learning Resources

Courses: Coursera ML by Andrew Ng, Fast.ai, Udacity AI Nanodegree
Books: "Hands-On ML with Scikit-Learn, Keras & TensorFlow", "Deep Learning" by Ian Goodfellow
Research Papers: arXiv, Google Scholar, Papers with Code
Blogs: Towards Data Science, Google AI Blog, OpenAI Blog
Communities: Kaggle, GitHub, Stack Overflow, Reddit ML

Datasets & Competitions

Dataset Repositories: Kaggle Datasets, UCI ML Repository, Google Dataset Search
Competition Platforms: Kaggle, DrivenData, AIcrowd
Benchmark Datasets: MNIST, ImageNet, COCO, GLUE
Data Generation: Synthetic data, Data augmentation techniques
Data Annotation: Labelbox, Prodigy, Amazon SageMaker Ground Truth
