Machine Learning step by step (Beginner's intro)
I decided to learn more about machine learning at my university. I want to share what I have learned. I will learn and teach you as we go through this journey together.
What is Deep learning?
Deep learning is a subset of Machine learning, which is itself a subset of Artificial intelligence (AI): three nested fields, each sitting inside the next.
Machine Learning: a type of artificial intelligence in which computers use huge amounts of data to learn how to do tasks rather than being programmed to do them. — Oxford Learners Dictionaries
Simply put, it's when we teach a computer to recognise patterns in data like a human would.
For example, we humans know the difference between a cat and a dog, a lion and a tiger or even between a cow and a fish.
Neural networks
“inspired by the neurons of the human brain.”
They work in 3 steps:
1 — Take in input data
2 — Train themselves to recognise the patterns in the data
3 — Output possible predictions
1 — Learning Process
1 — Forward Propagation
This is the process of propagating information from the input layer, through any hidden layers, to the output layer.
2 — Back Propagation
This is the process where the neural network evaluates its own accuracy: it measures the difference between the expected output and the actual output, then adjusts the weights and biases to reduce that difference and improve the prediction.
So in simple terms: given input → random initial weights and biases → output → backward propagation measures the difference between expected and actual output and updates the weights → repeat for all data.
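The whole loop can be sketched in a few lines of plain Python. This is an illustrative single-"neuron" example (the data, starting values, and learning rate are made up for the demo), not code from any particular framework:

```python
# Fit y = 2x + 1 with one weight w and one bias b, using squared error.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # (input, expected output) pairs

w, b = 0.5, 0.0   # starting weight and bias (would normally be random)
lr = 0.05         # learning rate

for epoch in range(200):
    for x, y_true in data:
        y_pred = w * x + b        # forward propagation
        error = y_pred - y_true   # difference from the expected output
        # backward propagation: gradients of the squared error w.r.t. w and b
        w -= lr * 2 * error * x
        b -= lr * 2 * error

print(round(w, 2), round(b, 2))   # should end up close to w = 2, b = 1
```

After a couple of hundred passes over the data, the weight and bias settle near the true values — exactly the "repeat for all data" step above.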
2 — Activation Functions
- Introduce non-linearity into the network
- Decide whether a neuron's signal passes on to the next layer
What determines this?
A FEW EXAMPLES
Step Function: e.g. if value > 0, activate; else DO NOT. In other words, if the input is below the threshold (here, 0), the neuron does not activate.
Linear Functions: a straight-line function where the activation is proportional to the input.
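The two functions above — plus ReLU and sigmoid, two other common activations added here for comparison — can be sketched in plain Python:

```python
import math

def step(x):
    # step function: activate only when the input is above the threshold 0
    return 1.0 if x > 0 else 0.0

def linear(x, slope=1.0):
    # linear function: activation proportional to the input (no non-linearity)
    return slope * x

def relu(x):
    # ReLU: zero for negative inputs, identity for positive ones
    return max(0.0, x)

def sigmoid(x):
    # sigmoid: squashes any input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

print(step(-2.0), relu(-2.0), sigmoid(0.0))  # 0.0 0.0 0.5
```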
3 — Optimization
While the model is being trained, we change the parameters to reduce the loss function and so optimize our model.
How is this done?
Optimizers tie the loss function and the model parameters together: they update the network's weights based on the output of the loss function. In effect, the loss tells the optimizer whether the model is moving in the correct direction.
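As an illustration, here is the simplest optimizer, plain gradient descent, minimising a made-up one-parameter loss (w − 4)²:

```python
def loss(w):
    # a toy loss with its minimum at w = 4
    return (w - 4.0) ** 2

def grad(w):
    # derivative of the loss: tells the optimizer which direction reduces it
    return 2.0 * (w - 4.0)

w = 0.0      # parameter to optimize
lr = 0.1     # learning rate

for _ in range(100):
    w -= lr * grad(w)   # step opposite the gradient to reduce the loss

print(round(w, 3), round(loss(w), 6))  # w approaches 4, loss approaches 0
```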
4 — Parameters and Hyperparameters
Parameters are the model's internal configuration. They are estimated from the data while training the model.
Hyperparameters are the explicitly specified parameters that control the training process. These are set before the beginning of the training of the model.
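A hypothetical training setup makes the split concrete; every name and value below is illustrative:

```python
# Hyperparameters: chosen by us BEFORE training starts
learning_rate = 0.01
epochs = 10
batch_size = 32

# Parameters: internal to the model, estimated DURING training
weights = [0.0, 0.0, 0.0]   # updated by the optimizer on every step
bias = 0.0
```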
5 — Key Terminology
These key terms matter when the dataset is very large: to cope, we divide the dataset into chunks (batches) and pass them through the neural network one by one.
Epochs — when the entire dataset is passed forward and backwards through the neural network only ONCE
Overfitting — the model has memorized the patterns in the training data and performs terribly on new data
Batch Size — the number of training examples in a single batch
Iterations — the number of batches needed to complete one epoch
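These terms fit together with simple arithmetic. For a hypothetical dataset of 1,000 examples:

```python
import math

dataset_size = 1000   # total training examples
batch_size = 32       # examples passed through the network at once
epochs = 5            # full passes over the entire dataset

# iterations per epoch: how many batches it takes to see the whole dataset once
iterations_per_epoch = math.ceil(dataset_size / batch_size)
total_iterations = iterations_per_epoch * epochs

print(iterations_per_epoch, total_iterations)  # 32 160
```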
6 — Types of Learning
6.1 — Supervised Learning
This algorithm is designed to learn from examples that use well-labelled data. It's almost as if a human is overseeing the entire process.
6.2 — Unsupervised learning
A branch of machine learning that finds the underlying patterns in data.
Used in exploratory data analysis. It DOES NOT use labelled data but relies on the data's features. The goal is to find patterns in the data.
Two types of unsupervised learning
Clustering: groups data into clusters based on patterns.
Association: tries to find associations between entities.
6.3 — Reinforcement Learning
The process of learning from trial and error: the model improves based on feedback from its actions and experiences, using rewards and punishments for positive and negative behaviour.
7 — Regularization
Tackling overfitting
- Dropout — randomly removes nodes (and their edges) at each iteration [this works well because the extra randomness means the network memorises less of the training data]
- Dataset Augmentation — we can create synthetic data and add it to the dataset; how effective this is depends on the task at hand, e.g. rotating an image or shifting it by a few pixels.
- Early Stopping — we get a better model by stopping training at the point where the training error is still decreasing but the validation error starts to increase.
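As a sketch, dropout can be implemented as a random mask over a layer's activations. This is the common "inverted dropout" variant, where surviving values are scaled up so the expected total stays the same; the layer values below are made up:

```python
import random

def dropout(activations, rate=0.5):
    # randomly zero each activation with probability `rate`, and scale the
    # survivors by 1 / (1 - rate) so the expected sum is unchanged
    out = []
    for a in activations:
        if random.random() < rate:
            out.append(0.0)               # this unit is dropped this iteration
        else:
            out.append(a / (1.0 - rate))  # this unit survives, scaled up
    return out

random.seed(0)
layer = [1.0, 2.0, 3.0, 4.0]
print(dropout(layer, rate=0.5))  # a different random subset each call
```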
8 — Neural Network Architecture
The more neurons we add to a hidden layer, the wider the network becomes; the more hidden layers we add, the deeper the network becomes, which demands larger computational resources.
Feed-Forward Neural Network: takes a fixed-size input and returns a fixed-size output. This does not allow us to model every possible problem.
RNN (Recurrent Neural Networks): uses a feedback loop in the middle layer, which allows it to use the output of the previous step as part of the input for the next.
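A toy sketch of the recurrent idea in plain Python: the hidden state from the previous step is fed back in alongside the next input. The weights here are made-up constants, not trained values:

```python
import math

def rnn_step(x, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    # combine the current input with the previous hidden state (the feedback
    # loop), then squash the result with tanh
    return math.tanh(w_x * x + w_h * h_prev + b)

h = 0.0                       # initial hidden state
for x in [1.0, 0.5, -0.3]:    # a short input sequence
    h = rnn_step(x, h)        # each step reuses the previous step's output

print(round(h, 3))
```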
9 — Steps in creating a Deep Learning Model
Data gathering
- picking the right data is key
- bad data = bad model
- The size of datasets can differ.
- Quality of data is important
Dataset
- UCI Machine Learning Repository
- Kaggle
- Google dataset search
Processing the Data
- splitting the dataset into subsets
- make sure training, validation and testing data are similar
- Formatting — the data might not be in the format we need, e.g. it lives in a database but we want a CSV file.
- Missing Data — some records may have missing values, which may need to be removed (or filled in).
- Too much data means heavy computational requirements, so it can be better to use a smaller portion of the dataset.
Pre-processing the Data
- Normalization: re-scaling values to the range between 0 and 1 using min/max scaling.
- Standardization: centres the data at mean 0 (typically with standard deviation 1).
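A toy sketch of both rescalings in plain Python; the height values are made up:

```python
def normalize(values):
    # min/max scaling: map the smallest value to 0 and the largest to 1
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    # centre at mean 0 and rescale to standard deviation 1
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]

heights = [150.0, 160.0, 170.0, 180.0, 190.0]
print(normalize(heights))                          # [0.0, 0.25, 0.5, 0.75, 1.0]
print([round(v, 2) for v in standardize(heights)])
```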
Training the model
- Feed data
- Forward propagation
- Loss Function
- Backward propagation
Evaluation
- test how good the model is by using the evaluation set.
Optimization
- Hyperparameter tuning, e.g. increasing the number of epochs
- Adjust learning rate
- Addressing overfitting — can be helped by getting more data or reducing the size of the network.
- Regularization — constrains the complexity of the network.
- Data Augmentation — e.g. zooming in, converting to greyscale, flipping the picture.
- Dropout — randomly drop units (neurons) so the model memorises less of the training data.
References
- https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained
- https://www.youtube.com/jasmcaus
- https://aws.amazon.com/what-is/deep-learning/#:~:text=Deep%20learning%20is%20a%20method,produce%20accurate%20insights%20and%20predictions.
- https://www.javatpoint.com/model-parameter-vs-hyperparameter#:~:text=Parameters%20are%20the%20configuration%20model,essential%20for%20optimizing%20the%20model.