# Logistic regression

11/1/2022

In this post, we will be coding a logistic regression model from the very basics using Python. So let's get started!

## Data

At the core of any machine learning algorithm is data. Your machine learning models will always need data to "learn" from. For our purpose today we will be using a very simple student admission dataset, which can be found here. This dataset contains the historical records of applicants. A record consists of the applicant's marks in two entrance exams and the final admission decision (whether the candidate was admitted or not). Our goal today is to build a logistic regression model that will be able to decide (or, better put, classify) whether an applicant should be granted admission.

In our dataset the first two columns are the marks in the two tests and the third column is the decision label (y), encoded in binary (i.e. y = 1 if admitted and y = 0 if not admitted). Our aim is to predict this label y.

Alright, so now that we know what our data looks like, let's define a function to load it (a minimal sketch of `load_data` appears at the end of this post). We will be calling this function later to load the dataset. It returns x and y: x is made up of the first two columns of the dataset, whereas y is the last column, since that is the result column; hence, in order to return x and y, we return the corresponding column slices of the data from the function.

You might have noticed another function call within the `load_data` function: `plot_data`, which is passed the feature and label columns. If you have followed the "Coding Linear Regression from Scratch" post you probably already know what this function does, and you could skip ahead to the juicy coding part of the post. If you haven't, stick around, because we are about to take a deeper dive into the distribution of our data.

## Plotting the data

Before we jump to coding our model, let's take a second to analyze our data. This will allow us to understand why (if at all) logistic regression is the way to go for our given dataset and the associated problem. In order to visualize the data, let's define the `plot_data` function we called from `load_data` (also sketched at the end of this post).

*Plot of the data (source: image by the author)*

From just a quick glance at the above plot it becomes quite clear that our data does have a decision boundary, which in this case appears to be roughly a straight line. It is very important to note that our decision boundary is simple only because our dataset happens to be distributed such that the boundary is approximately linear. Logistic regression can be used to learn far more complex decision boundaries than what our current problem shows; it is even possible to fit elliptical and other non-linear decision boundaries with logistic regression.

## Hypothesis

Well, now that we have our data ready to go, let's start coding the actual logistic regression model! The first step in defining the architecture of your model is to define the hypothesis. We know the hypothesis for logistic regression is the sigmoid function applied to a linear combination of the inputs, so mathematically speaking our hypothesis is:

hθ(x) = g(θᵀx) = 1 / (1 + e^(−θᵀx))

We first call the `load_data` function to load the x and y values: x contains the training examples and y contains the labels (the admission results in our case). You might have noticed that throughout the code we have been using matrix multiplication to achieve the expressions we want.
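To make the walkthrough above concrete, here is a minimal sketch of the `plot_data` and `load_data` helpers described in the Data and Plotting the data sections. The column layout follows the description above (two exam-score columns followed by a binary label); the file-path argument, the NumPy/Matplotlib usage, and the marker styles are assumptions for illustration, not the author's original code.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_data(x, y):
    """Scatter-plot the two exam scores, marking admitted vs. not admitted."""
    admitted = (y == 1)
    plt.scatter(x[admitted, 0], x[admitted, 1], marker="+", label="Admitted")
    plt.scatter(x[~admitted, 0], x[~admitted, 1], marker="o", label="Not admitted")
    plt.xlabel("Exam 1 score")
    plt.ylabel("Exam 2 score")
    plt.legend()
    plt.show()


def load_data(path):
    """Load the dataset and return x (first two columns) and y (last column)."""
    data = np.loadtxt(path, delimiter=",")
    x = data[:, :2]   # exam scores
    y = data[:, 2]    # admission label: 1 = admitted, 0 = not admitted
    plot_data(x, y)   # visualize the data while loading, as described above
    return x, y
```

Calling `load_data` with the path to the dataset returns x and y and reproduces the scatter plot discussed above, where the roughly straight-line decision boundary is visible.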
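And here is a sketch of the sigmoid hypothesis, written with matrix multiplication as described in the Hypothesis section, so that it scores all training examples at once. The function names and the bias-column handling are illustrative assumptions rather than the author's exact code.

```python
import numpy as np


def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def hypothesis(theta, x):
    """h_theta(x) = g(x @ theta): predicted admission probability for each example."""
    return sigmoid(x @ theta)


# Example usage (assumes x, y were returned by load_data):
# x_b = np.hstack([np.ones((x.shape[0], 1)), x])   # prepend a bias column of ones
# theta = np.zeros(x_b.shape[1])                   # initial parameters
# probs = hypothesis(theta, x_b)                   # one probability per applicant
```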