In an era where machines have surpassed humans at many tasks, it's fascinating to see how they learn simply by looking at data, analyzing it, and making decisions based on what they've processed. As easy as that sounds, it's not so simple when you try to do it yourself.
If you're working with offline machine learning, you've probably chosen it because you can train on your own server and because it costs less than online machine learning. Even so, you can find yourself in trouble when you don't have a high-spec machine and you need to run a large training job. There's no need to worry, though, if you know about the concept of mini batches.
Before we dive into mini batches, I'd like to go over two basic machine learning concepts: weights and epochs. These will make mini batches much easier to understand.
So, let's first talk about weights in machine learning. According to Deepai.org:
“Weight is the parameter within a neural network that transforms input data within the network’s hidden layers. A neural network is a series of nodes, or neurons. Within each node is a set of inputs, weight, and a bias value.”
In simple terms, weights are the learnable parameters of the model you've created; they are the numbers that the training process keeps adjusting.
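As a rough illustration (the numbers below are invented, not taken from the quote above), a single node just multiplies its inputs by its weights and adds a bias:

```python
import numpy as np

# A single node with three inputs (all values here are made up for illustration)
inputs  = np.array([0.5, -1.2, 3.0])
weights = np.array([0.8, 0.1, -0.4])   # the learnable parameters
bias    = 0.25

# The node "transforms" the input data: weighted sum of the inputs plus the bias
output = np.dot(inputs, weights) + bias
print(output)   # 0.5*0.8 + (-1.2)*0.1 + 3.0*(-0.4) + 0.25 = -0.67
```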
The word epoch in machine learning refers to one complete pass of the entire training dataset through the model, during which the weights are updated.
Mini Batch Learning
Before introducing mini batches, let's imagine a scenario in which you have a model and a training set with millions of images. When you run training, every image passes through the model, the data is analyzed, the corresponding outputs are generated, and only after that entire pass are the weights updated. This is the standard rule of offline (full-batch) learning and, as you may have noticed, the process can take a long time to complete, and if anything goes wrong with your hardware partway through, the whole pass is wasted. That's where mini batches come in and solve these problems.
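A minimal sketch of that offline, full-batch style of training, using a toy linear model and random data in place of the millions of images (everything below is assumed for illustration), looks like this: every example is processed before the weights change even once.

```python
import numpy as np

# Toy stand-in for the huge training set (sizes are made up)
X = np.random.randn(1000, 10)            # 1000 examples, 10 features each
y = np.random.randn(1000)                # 1000 target values

w  = np.zeros(10)                        # the model's weights
lr = 0.01                                # learning rate

for epoch in range(5):                   # one epoch = one pass over ALL the data
    preds = X @ w                        # every single example goes through the model
    grad  = X.T @ (preds - y) / len(X)   # error analyzed over the whole dataset
    w    -= lr * grad                    # ...and only then are the weights updated, once
```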
Now, let’s imagine the same scenario again but with the concept of mini batch learning.
In mini batch learning, that big chunk of millions of images is divided into smaller batches, called mini batches, which are processed one by one; the inputs and their corresponding outputs are batched together.
In this scenario, the batches are fed in one by one as input, the model's output is analyzed, the error is measured, and the weights are updated on that basis. The process continues until the last mini batch has been analyzed and the weights have been updated. It doesn't matter how large the dataset is, or whether training runs for a single epoch or thousands of epochs, because we're only ever giving the network a manageable amount of data at a time to process, analyze, and update the weights on.
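Here is the same toy setup reworked as a mini batch loop (again just a sketch with invented data and sizes): the dataset is split into mini batches and the weights are updated after every batch instead of once per full pass.

```python
import numpy as np

X = np.random.randn(1000, 10)            # toy inputs standing in for the images
y = np.random.randn(1000)                # toy targets

w  = np.zeros(10)
lr = 0.01
batch_size = 32                          # size of each mini batch

for epoch in range(5):                   # one epoch = one full pass over the data
    order = np.random.permutation(len(X))             # shuffle the data each epoch
    for start in range(0, len(X), batch_size):
        batch  = order[start:start + batch_size]      # indices of one mini batch
        Xb, yb = X[batch], y[batch]
        grad   = Xb.T @ (Xb @ w - yb) / len(Xb)       # error analyzed on this batch only
        w     -= lr * grad                            # weights updated after every batch
```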
Size of mini batch
The reason for using mini batches is that we can push smaller batches through the neural network and update the weights more frequently, so the batch size needs to be chosen carefully to get the most out of the technique. If the batch size is very large, each pass takes longer and the network is updated less frequently, which eventually hurts accuracy. If the batch size is very small, the weights are updated very frequently; a single noisy or mislabeled example can then push an update in the wrong direction, and the sheer number of updates adds overhead that slows down processing and increases the total training time. That's why it's important to pick a batch size that is neither too big nor too small. In practice, the batch size is usually a power of 2, such as 32, 64, 128, or 256.
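To make the trade-off concrete, here's a tiny sketch (the dataset size and batch sizes below are made-up numbers) showing how the batch size controls how many weight updates the network gets per epoch:

```python
dataset_size = 1_000_000                                  # imagined number of training images

for batch_size in [32, 256, 1024, 8192, dataset_size]:    # powers of 2, then the full batch
    updates_per_epoch = -(-dataset_size // batch_size)    # ceiling division
    print(f"batch size {batch_size:>9,}: {updates_per_epoch:>6,} weight updates per epoch")
```

Larger batches mean fewer, slower updates per epoch; smaller batches mean more frequent but noisier updates.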