Posts

Showing posts from October, 2020

A Brief Overview of GPT-3 by OpenAI

    You have probably already seen articles like "A robot wrote an entire article. Aren't you scared yet, human?" So, who is the robot here? It's the GPT-3 model, a transformer-based language model. GPT stands for Generative Pre-trained Transformer. The model was developed by OpenAI, which had previously released GPT-2 and other models. GPT-3 was released in May 2020 and is more capable than its predecessors, though architecturally it doesn't differ that much from them. GPT-3 can write articles, poems, and even working code for you*, given some context. There are some limitations, which I am going to explain later in this article. Being a language model means that, given a text, it probabilistically predicts which tokens from a known vocabulary will come next in that string. So it's sort of like the autocomplete we see on a phone keyboard: we type a word, and the keyboard suggests another word that could come next. What sets GPT
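The "autocomplete" analogy can be made concrete in a few lines of code. The sketch below is not GPT-3 itself (which is only reachable through OpenAI's API); as an assumption for illustration it uses the openly available predecessor GPT-2 via the Hugging Face transformers library, and simply shows how a language model assigns probabilities to the next token given a prompt.

```python
# Minimal sketch: next-token probabilities with GPT-2 (GPT-3's openly
# available predecessor), used here purely as an illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The weather today is"          # example prompt, made up
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits      # shape: (1, seq_len, vocab_size)

# Turn the logits for the last position into a probability distribution
# over the vocabulary, i.e. "what token comes next?"
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([token_id.item()]):>12}  {prob.item():.3f}")
```

The keyboard-autocomplete behaviour described above is exactly this step repeated: pick (or sample) one of the high-probability tokens, append it to the prompt, and ask for the next one.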

Machine Learning Loss Functions in Practice

  Error/loss functions are used to estimate the loss of a model so that its weights can be updated to reduce the error on the next iteration. Since you have clicked on this article, I am assuming you know the fundamentals of machine learning pipelines and want to learn about loss functions specifically. So, let's jump directly to loss functions. I will also show you how you can use these loss functions in scikit-learn/PyTorch. Broadly, we can categorize loss functions into two categories: loss functions for regression problems and loss functions for classification problems. Regression problems: the two most common loss functions for regression are MSE (Mean Squared Error) and MAE (Mean Absolute Error). MSE / Quadratic Loss / L2 Loss: if the target values follow a Gaussian/normal distribution, MSE is the preferred loss function for regression problems. MSE is the mean of the squared differences between the target values (ground truth) and the predicted values. The implemen
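Since the post promises scikit-learn/PyTorch usage and the preview cuts off before the implementation, here is a minimal sketch of how MSE is typically computed in both libraries; the toy y_true/y_pred values are made up for illustration.

```python
# Minimal sketch: MSE on the same toy predictions with scikit-learn and PyTorch.
import torch
from sklearn.metrics import mean_squared_error

y_true = [3.0, -0.5, 2.0, 7.0]   # ground-truth target values (made up)
y_pred = [2.5,  0.0, 2.0, 8.0]   # model predictions (made up)

# scikit-learn: a plain function over lists/arrays
print(mean_squared_error(y_true, y_pred))                            # 0.375

# PyTorch: a loss module that works on tensors and supports autograd
loss_fn = torch.nn.MSELoss()
print(loss_fn(torch.tensor(y_pred), torch.tensor(y_true)).item())    # 0.375
```

Both calls compute the same quantity, the mean of the squared differences; the PyTorch version is the one you would plug into a training loop so gradients can flow back through the predictions.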