Video reading area
Origin Ten Technology is driven by a desire to make the world a better place.
This area has been designed to improve Video accessibility for user with hearing problems and for those who just prefer to read.
Algorithms and Data Structures Tutorial – Full Course for Beginners By freeCodeCamp.org
This is a falling course from Treehouse. We have free code Camp. Our longtime fans of their learning platform. They were kind enough to let our nonprofit make this course freely available on our YouTube channel. If you like. This course, Treehouse has a lot more courses. Like this one. The link is in the description along with time codes to the different sections in this course.
Hi, my name is poisson. I’m an instructor here at treehouse and welcome to introduction to algorithms. Whether you are a high school or college student, a developer in the industry, or someone who is learning to code, you have undoubtedly running to the term algorithm for many people. This word is kind of scary. It represents this body of knowledge. That seems just Out Of Reach, only people with computer science degrees, know about algorithms now to others, this brings up feelings.
Things of imposter syndrome you might already know how to code but you’re not a real developer because you don’t know anything about algorithms personally. It made me frame certain jobs as above my skill level because the interview contained algorithm questions. Well, whatever your reasons are in this course, our goal is to dispel all those feelings and get you comfortable with the basics of algorithms. Like any other subject. I like to start my courses, with what the course.
Is and is not in this course, we’re going to cover the very basic set of knowledge that you need. As a foundation for learning about algorithms. This course is less about specific algorithms and more about the tools. You will need to evaluate algorithms understand how they perform compare them to each other and make a statement about the utility of an algorithm in a given context. Now, don’t worry. None of this will be theoretical and we will learn these concepts by using well-known algorithms.
In this course, we will also be writing code. So I do expect you to have some programming experience. If you intend to continue with this topic, you can definitely stick around even if you don’t know how to code, but you might want to learn the basics of programming. In the meantime, in this course, we will be using the Python programming language python reads a lot, like regular English and is the language. You will most likely encounter when learning about algorithms these days, if you don’t know how to code, or
Or if you know how to code in a different language. Check out the notes section of this video, for links to other content that might be useful to you as long as you understand, the fundamentals of programming, you should be able to follow along pretty well, if you’re a JavaScript developer, or a student, who’s learning JavaScript, for example, chances are good. That you’ll still be able to understand the code. We write later. I’ll be sure to provide links along the way if you need anything to follow up on.
Let’s start with something simple. What is an algorithm? An algorithm is a set of steps or instructions for completing a task. This might sound like an over simplification. But really that’s precisely what an algorithm is a recipe is an algorithm your morning routine. When you wake up is an algorithm and the driving directions, you follow to get to a destination is also an algorithm in computer science. The term algorithm more specifically means
The set of steps a program takes to finish a task. If you’ve written code before any code really? Generally speaking, you have written an algorithm given that much of the code. We write can be considered an algorithm. What do people mean when they say you should know about algorithms. Now consider this, let’s say I’m a teacher in a classroom and I tell everyone, I have an assignment for them on their desks. They have a picture of a maze and their task is to come up with a way to find the quickest way.
A out of the Maze and everyone does their thing and comes up with a solution. Every single one of these Solutions is a viable solution. And is a valid example of an algorithm, the steps one needs to take to get out of the maze, but from being in classrooms, or any group of any sort, you know, that some people will have better ideas than others. We all have a diverse array of skill sets over time. Our class picks the best of these Solutions, and anytime we want to solve a maze, we go with
One of these Solutions, this is what the field of algorithms is about. There are many problems in computer science, but some of them are pretty common. Regardless of what project you’re working on different people have come up with different solutions to these common problems. And over time. The field of computer science has identified several that do the job. Well, for a given task, when we talk of algorithms, we’re referring to two points. We’re primarily saying there’s an established body.
Knowledge on how to solve particular problems. Well, and it’s important to know what the solutions are now, why is it important? If you’re unaware that a solution exists, you might try to come up with one yourself and there’s a likelihood that your solution won’t be as good or efficient. Whatever. That means, compared to those that have been thoroughly reviewed, but there’s a second component to it, as well. Part of understanding algorithms is not just knowing that an algorithm exists.
It’s but understanding when to apply it understanding, when to apply an algorithm requires properly, understanding the problem at hand and this arguably is the most important part of learning about algorithms and data structures. As you progress through this content. You should be able to look at a problem and break it down into distinct steps. When you have a set of stamps, you should then be able to identify which algorithm or a data structure is best for the task at hand. This concept is
Called algorithmic thinking and it’s something we’re going to try and cultivate together as we work through our content. Lastly learning about algorithms gives you a deeper understanding about complexity and efficiency in programming. Having a better sense of how your code will perform in different situations, is something that you’ll always want to develop and hone.
Algorithmic thinking is why algorithms also come up in big Tech interviews. Interviewers don’t care as much that you are. Able to write a specific algorithm in code, but more about the fact that you can break a seemingly insurmountable problem in two, distinct components and identify the right tools to solve each distinct component. And that is what we plan on doing. In this course though. We’re going to focus on some of the tools and Concepts, you will need to be aware of before we can dive.
The topic of algorithms if you’re ready, let’s get started.
Hey again, in this video we’re going to do something unusual. We’re going to play a game. And by we I mean me and my two friends here, Brittany and John, this game is really simple and you may have played it before. It goes something like this. I’m going to think of a number between 1 and 10 and they have to guess what? The number is busy, right? When they guess a number. I’ll tell them if they’re guess is too high or too low. The winner is the one with the fewest rise. All right, John, let’s start with you. I’m thinking of a number between 1 and 10. What is it?
Between you and me. The answer is three. Quick question. Does the range include 1 and 10. That is a really good question. So what John did right there was to establish the bounds of our problem. No solution works on every problem, and an important part of algorithmic thinking, is to clearly Define what the problem set is and clarify what values count as inputs. Yeah, 1 and 10 are both included. Is it one too? Low? Is it too, too low? Is it three, correct?
Okay, so that was an easy one. It took John three tries to get the answer. Let’s switch over to Brittany and play another round using the same number as the answer. Okay, Brittany. I’m thinking of a number between 1 and 10 inclusive. So, both 1 and 10 are in the range. What number am I thinking of is it five too high to too low? Is it three, correct? All right. So what we had there was two very different ways of playing the same game somehow with even such a simple game. We saw different approaches to figuring out a solution.
To go back to algorithmic thinking for a second. This means that with any given problem. There is no one best solution instead, what we should try and figure out is what solution works better for the current problem. In this first pass at the game. They both took the same amount of turns to find the answer. So it’s not obvious who has the better approach, and that’s mostly because the game was easy. Let’s try this one more time. This time. The answer is 10. All right, John, you first is it one too, low.
Is it to still too low? Is it three too low, is it for too low? Is it five still too low. Is it six too low? Is it seven too low. Is it 8:00? Hello. Is it? Nine below is a 10, correct? You got it. Okay. So now same thing but with Brittany this time, is it 5:00 to low? 82, low. Is it 1:51 Le? It’s 10. Alright, so here we start to see a difference between their strategies. When the answer was three. They both took the same.
Of tents. This is important when the number was larger, but not that much larger 10. In this case. We start to see that. Brittany strategy did better. She took four tries while John took 10. We’ve played two rounds so far, and we’ve seen a different set of results. Based on the number they were looking for. If you look at John’s way of doing things, then the answer being 10, the round. We just played, is his worst case scenario. You will take the maximum number of turns 10, to guess it when we picked a random.
A number like three it was hard to differentiate which strategy was better because they both performed exactly the same but in John’s worst case scenario a clear winner in terms of strategy, emerges in terms of algorithmic thinking, we’re starting to get a sense that the specific value, they’re searching for may not matter. As much as where that value lies in the range that they’ve been given. Identifying. This helps us understand our problem better. Let’s do this again for a range of numbers from 1.
100. We’ll start by picking five as an answer to trick them. Okay. So this time we’re going to run through the exercise. Again this time from 1 to 100 and both 1 and 100. I included it. Was it why this point without even having to run through it? We can guess how many tries John is going to take since he starts at 1 and keeps going. He’s going to take five tries as we’re about to see if our cool, correct. Okay. Now for Brittany’s turn
Is it 52? Hi, is it 25 still too high? Is it 13 too high? Is it 72? Hi, is it for too low? Is it 6-2 high? As if I’ve right. Let’s evaluate John took five tries Brittany? On the other hand took seven tries. So John wins this round but again in determining who strategy is preferred. There’s no clear winner. Right now, what this tells us is that it’s not particularly useful to look at the easy.
Is where we arrive at the number of fairly quickly because it’s at the start of the range. Instead. Let’s try one where we know John is going to do poorly. Let’s look at his worst case scenario, where the answer is 100 and see how Brittany performs in such a scenario. Okay, John, let’s do this one. More time 1 through 100 again. Is it one? We can fast-forward this scene because well, we know what happens. John takes 100 tries.
Hi, Brittany Europe. Is it 52 low? Is it 75 too? Low 88 too low 94 solo is a 97. Hello. 99 below 100. Okay, so that took Brittany seven turns again. And this time she is the clear winner. If you compare their individual performances for the same number set, you will see that Brittany’s approach leaves. John’s in the dust. When the answer was five. So right around the start of the range, John took 52
It’s, but when the answer was 100, right? At the end of the range, he took 100 tries. It took him 20 times. The amount of tries to get that answer compared to Brittany. On the other hand, if you compare Brittany’s efforts, when the number was five, she took seven tries, but when the number was 100, she took the same amount of tries. This is pretty impressive. If we pretend that. The number of tries is the number of seconds, it takes Brittany and John to run through their attempts. This is a good estimate for how fast their Solutions are.
Okay, we’ve done this a couple times and Brittney and John are getting tired. Let’s take a break in the next video. We’ll talk about the point of this exercise. In the last video. We ran through an exercise where I had some of my co-workers. Guess what number, I was thinking. What was the point of that exercise? You might be thinking. Hey, I thought I was here to learn about algorithms. The exercise. We just did, was an example of a real life situation. You will run into when building websites apps and writing code both.
Is taken by John and Brittany to find the number. I was thinking of our examples of searching for a valley. It might be weird to think that there’s more than one way to search, but as you saw in the game, the speed at which the result was obtained, differed between John and Brittany. Think about this problem from the perspective of a company, like Facebook, at the time of this recording, Facebook has two point one, nine billion active users. Let’s say you’re traveling in a different country and meet someone. You want to add on Facebook, you go.
Into the search bar and type out. This person’s name, if we simplify how the Facebook app works. It has to search across these two point one. Nine billion records and find the person you are looking for the speed at which you find this person, really matters. Imagine what kind of experience it would be if when you search for a friend Facebook but a base spinning activity indicator and said come back in a couple hours. I don’t think we’d use Facebook as much if that was the case from the company’s perspective.
Active working on making search as fast as possible using different, strategies. Really matters. Now. I said that the two strategies Brittany and John used were examples of search more specifically these our search algorithms. These strategy John took where he started at the beginning of the range and just counted one number after the other is a type of search called linear search. It is also called sequential search, which is a better description of how it works or
Or even simple search since it really is quite simple. But what makes his approach an algorithm as opposed to just looking for something. Remember, we said that an algorithm is a set of steps or instructions to complete. A task linear, search is a search algorithm and we can Define it like this. We start at the beginning of the list or the range of values. Then we compare the current value to the Target if the current value is the target value that we’re looking for. We’re done.
If it’s not, we’ll move on sequentially to the next value in the list. And then repeat Step 2. If we reach the end of the list, then the target value is not in the list. This definition has nothing to do with programming. And in fact, you can use it in the real world. For example, I could tell you to walk into a bookstore and find me a particular book, and one of the ways you could do it is using the linear search algorithm. You could start at the front of the bookstore and read the cover or the spine.
Of every book to check that it matches the book that you’re looking for. If it doesn’t, you go to the next book and repeat until you find it or run out of books. What makes this an algorithm is the specificity of how it is defined in contrast to just jumping into a problem and solving it as we go along and algorithm follows a certain set of guidelines and we use the same steps to solve the problem. Each time. We faced it. An important first step to defining the algorithm isn’t the
Itself, but the problem we’re trying to solve our first guideline is that an algorithm must have a clear problem statement? It’s pretty hard to Define an instruction set when you don’t have a clear idea of what problem. You’re trying to solve in defining, the problem. We need to specify how the input is defined and what the output looks like. When the algorithm has done, its job for linear search, the input can be generally described as a series of values and the output is a
You matching the one we’re looking for right now. We’re trying to stay away from anything code related. So this problem statement definition is pretty generic. But once we get to code, we can actually tighten this up. Once we have a problem, an algorithm is a set of steps that solves this problem. Given that the next guideline is that an algorithm definition must contain a specific set of instructions in a particular order. We really need to be clear about the order in which these instructions are executed.
Taking our simple definition of linear search. If I switched up the order and said, move sequentially to the next value before specifying that first comparison, step if the first value where the target one, our algorithm wouldn’t find it because we move to the second value before. Comparing that you might think. Okay. That’s just an avoidable mistake. Kind of Common Sense. The thing is computers, don’t know any of that and just do exactly as we tell them. So specific order is really important.
The third guideline is that each step in our algorithm definition with zombie, a complex, one and needs to be explicitly. Clear what I mean by that is that you shouldn’t be able to break down any of the steps into further into additional some tasks. Each step needs to be a distinct one. We can’t define linear search as search until you find this value because that can be interpreted in many ways and further broken down into many more steps. It’s not clear.
Next and this one might seem obvious, but algorithms should produce a result. If it didn’t, how would we know whether the algorithm works or not to be able to verify that our algorithm works correctly. We need a result. Now when using a search algorithm, the end result can actually be nothing which indicates that? The value wasn’t found. But that’s perfectly fine. There are several ways to represent nothing in code. And as long as the algorithm can produce, some result, we can
Stand Its Behavior. The last guideline is that the algorithm should actually complete and cannot take an infinite amount of time. If we let John loose in the world’s largest library and asked him to find a novel, we have no way of knowing whether he succeeded or not, unless he came back to us with a result. Okay. So quick recap, what makes an algorithm an algorithm and not just something you do.
One, it needs to have a clearly defined problem statement, input and output. When using linear search, the input needs to be just a series of values, but to actually use Brittany strategy. There’s one additional precondition. So to speak. If you think about her strategy, it required that the numbers be sorted. In ascending order. This means that where the input for John is just a series of values to solve the problem. The input to Britney’s, algorithm needs to be a sordid series of values.
He’s so clearly defined problem statement, clearly defined input and clearly defined output. S the steps in the algorithm need to be in a very specific order.
The steps also need to be distinct. You should not be able to break it down into further subtasks next, the algorithm should produce a result. And finally, the algorithm should complete in a finite amount of time. These guidelines not only help us Define what an algorithm is, but also helps us verify that the algorithm is correct. Executing. The steps in an algorithm for a given input must result in the same output every time.
If in the game I played the answer was 50 every time. Then every single time John must take 50 turns to find out that the answer is 50. If somehow he takes 50 turns in one round, then 30. The next. And we technically don’t have a correct algorithm consistent results for the same set of values is how we know that the algorithm is correct. I should stress that we’re not going to be designing any algorithms on our own and we’ll start off and spend most of our time.
Learning the tried-and-true algorithms that are known to efficiently. Solve problems. The reason for talking about what makes for a good algorithm though, is that the same set of guidelines makes for good algorithmic thinking, which is one of the most important skills. We want to cultivate when we encounter a problem before rushing in and thinking about Solutions. Well, we want to do is work through the guidelines. First we break down the problem into any possible number of smaller problems. Where each problem can be.
Clearly defined in terms of an input and an output. Now that we know how to generally Define an algorithm. Let’s talk about what it means to have a good algorithm. An important thing to keep in mind is that there’s no one single way to measure whether an algorithm is the right solution because it is all about context. Earlier. We touched on two concepts, correctness and efficiency. Let’s define correctness more clearly because before we can evaluate an algorithm on efficiency, we need
Ensure its correctness before. We Define our algorithms. We start by defining our problem. In the definition of that problem. We have a clearly defined input, satisfying, any preconditions and a clearly defined output. An algorithm is deemed. Correct. If on every run of the algorithm against all possible values in the input data, we always get the output. We expect part of correctness means that for any possible input the algorithm.
Should always terminate or end if these two are not true, then our algorithm is incorrect. If you were to pick up an algorithms textbook and look up correctness, you will run into a bunch of mathematical Theory. This is because traditionally algorithm correctness is proved my mathematical induction, which is a form of reasoning used in mathematics to verify that a statement is correct. This approach involves writing, what is called a specification and a correctness.
Proof, we won’t be going into that in this course. Prove through induction is an important part of Designing algorithms, but we’re confident that you can understand algorithms both in terms of how and when to use them without getting into the math. So if you pick up a textbook and feel daunted, don’t worry, I do too. Well. We can still figure things out without it. Alright. So once we have a correct algorithm, we can start to talk about how efficient and algorithm is remember that this efficiency
He ultimately matters because they help us solve problems faster and deliver a better, end user experience in a variety of fields. For example, algorithms are used in the sequencing of DNA and more efficient sequencing algorithms, allow us to research and understand diseases better and faster, but let’s not get ahead of ourselves. We’ll start simple by evaluating John’s linear search algorithm in terms of its efficiency.
First, what do we mean by efficiency? There are two measures of efficiency when it comes to algorithms time and space. It sounds really cool and very sci-fi, huh? Efficiency measured by time something. You’ll hear called time complexity is a measure of how long it takes. The algorithm. To run. Time complexity can be understood generally outside the context of code and computers because how long it takes to complete. A job, is a universal measure of efficiency the less time.
Time you take the more efficient you are. These second measure of efficiency is called space complexity and this is pretty computer specific. It deals with the amount of memory taken up on the computer. Good algorithms need to balance between these two measures to be useful. For example, you can have a blazingly fast algorithm, but it might not matter if the algorithm consumes more memory than you have available. Both of these Concepts time and space complexity are measured using the same.
Check but it is a very technical sounding metric. So let’s build up to it slowly and start simple a few videos ago. I played a game with Brittany and John where they tried to guess the number. I was thinking of effectively, they were searching for a value. So how do we figure out how efficient each algorithm is and which algorithm was more suited to our purposes? If we consider the number of tries. They took to guess or search for the value as an indicator of the time.
They take to run through the exercise. This is a good indicator of how long the algorithm runs for a given set of values. This measurement is called the running time of an algorithm and we’ll use it to Define time complexity in the game. We played four rounds. Let’s recap those here. Focusing on John’s performance in round one. We had ten values, the target was three. And John took three turns in round, 2, we had ten values. The target was 10. And John took ten turns around.
B, we had 100 values, the target was five. John took five tries. And finally, in round 4, when the target was 100, given 100 values. John took 100 tries on paper. It’s hard to gauge anything about this performance when it comes to anything with numbers though. I like to put it up on a graph and compare visually on the vertical or y-axis. Let’s measure the number of tries. It took John to guess the answer or the running time of the algorithm on the horizontal.
Or x axis. What do we put for each turn? We have a number of values as well as a Target value. We could plot the target value on the horizontal axis, but that leaves some context and meaning behind, it’s far more impressive that John took five tries. When the range went up to 100 then when he took three tries for a maximum of ten values, we could plot the maximum range of values, but then we’re leaving out the other half of the picture.
There are data points. However, that satisfy both requirements. If we only plot the values, where the target, the number John was looking for was the same as the maximum range of values. We have a data point that includes both the size of the data set as well as his effort. There’s an additional benefit to this approach as well. There are three ways. We can measure how well John does or in general, how well any algorithm does.
First, we can check how well John does in the best case or good scenarios from the perspective of his strategy. In the range of 100 values. The answer being alone. Number like three at the start of the range is a good scenario. He can guess it fairly quickly. One is his best case scenario or we could check how well he does on average. We could run this game a bunch of times and average out the running time. This would give us a much better picture of John’s performance over time, but our estimates would
Too high, if the value, he was searching for was at the start of the range or far too low if it was at the end of the range, let’s imagine a scenario where Facebook naively implements linear search. When finding friends, they looked at the latest US Census so that 50% of names start with the letters, a through J, which is the first 40 percent of the alphabet and thought, okay, on average linear search serves us well, but what about the rest of those whose names start with a letter? After J in the alphabet?
Searching for my name would take longer than the average and much longer for someone whose name starts with the letter Z. So while measuring the run time of an algorithm on average might seem like a good strategy. It won’t necessarily provide an accurate picture by picking the maximum in the range. We’re measuring how our algorithm does. And the worst case scenario analyzing the worst case scenarios quite useful because it indicates that the algorithm will never perform worse than we expect. There’s
Room for surprises. Back to our graph, we’re going to plot the number of tries a proxy for running time of the algorithm against the number of values in the range which will shorten to n. And here also represents John’s. Worst case scenario. When n is 10, you takes 10 turns when n is 100, you takes 100 turns. But these two values alone are insufficient to really get any sort of visual understanding. Moreover. It’s not realistic. John me.
Take a long time to work through 100 numbers, but a computer can do that in no time to evaluate the performance of linear search, in the context of a computer. We should probably throw some harder and larger ranges of values added. The nice thing is by evaluating a worst-case scenario. We don’t actually have to do that work. We know what the result will be for a given value of n using linear search. It will take n tries to find the value in the worst case scenario. So let’s add a few values.
In here to build out this graph. Okay. So, we have a good picture of what this is starting to look like as the values, get really large, the running time of the algorithm gets large as well. Well, we sort of already knew that before we dig into this runtime, any deeper. Let’s switch tracks and evaluate Brittany’s work by having something to compare against, it should become easier to build a mental model around time. Complexity, the algorithm John used linear search seemed familiar to us and you could understand it because
it’s how most of us search for things in real life. Anyway, Britney’s approach on the other hand got results quickly. It was a bit harder to understand. So let’s break it down. Just like John’s approach, Brittany started with a series of values or a list of numbers as her input where John just started at the beginning of the list and searched sequentially. Brittany strategy is to always start in the middle of the range from there. She asks a comparison question, is the number in the middle of the range equal to the answer.
She’s looking for. And if it’s not, is it greater than or less than the answer, if it’s greater than she can, eliminate all the values less than the one, she’s currently evaluating. If it’s lesser than the answer. She can eliminate all the values greater than the one she’s currently evaluating with the range of values that she’s left over with she repeats this process until she arrives at the answer. Let’s visualize how she did this by looking at round three. In round 3, the number of values.
In the range was 100. The answer was five. The bar here represents the range of values. One of the left 100 of the right. And this pointer represents the value. Britney chooses to evaluate. So she starts in the middle at 50. She asks, is it equal to the answer? I say it’s too high. So this tells her that the value she is evaluating is greater than our Target value, which means there’s no point in searching, any of the values to the right of 50 that is values greater than 50 in.
Range. So she can discard those values all together. She only has to consider values from 1 to 50. Now, the beauty of this strategy, and the reason why a Britney was able to find the answer in such few turns is that with every value. She evaluates she can discard half of the current range on her second turn. She picks the value in the middle of the current range, which is 25. She asks the same question. I say that the value is too high again, and this tells her that she can discard.
Art everything greater than 25 and the range of values drops from 1 to 25. Again. She evaluates the number in the middle roughly. So that be 13 here. I tell her, this is still too high. She discards the values greater loose devalued 7, which is still too high. Then she moves to 4, which is now too low. She can discard everything less than 4, which leaves the numbers 4 through 7, he or she picked 6, which was too high, which only leaves one value 5.
This seems like a lot of work, but being able to get rid of half the values. With each turn is what makes this algorithm much more efficient. Now, there’s one subtlety to using binary search and you might have caught on to this for this search method to work. As we’ve mentioned, the values need to be sorted with linear search. It doesn’t matter if the values are sorted since a linear search algorithm, just progresses sequentially checking, every element in the list. If the target value exists in the list, it will.
Be found, but let’s say this range of values, 1, 200 was unsorted, rittany would start at the Middle with something like 14 and ask if this value is too low or too high. I say it’s too high. So she discards everything less than 14. Now, this example starts to fall apart here because well, britni knows what numbers are less than 14 and greater than 1. She doesn’t need an actual range of valleys to solve this a computer. However, does need that. Remember search algorithms are running.
Lists containing all sorts of data. It’s not always just a range of values. Containing numbers in a real use case of binary search, which we’re going to implement in a bit. The algorithm wouldn’t return the target value because we already know that it’s a search algorithm. So we’re providing something to search for instead. What it returns is the position in the list of the target. Occupies without the list being sorted, a binary search algorithm would discard all the values to the left of 14, which over here could include the
Ian, where our Target value is, eventually we’d get a result back saying the target value doesn’t exist in the list, which is inaccurate earlier when defining linear simple search. I said that the input was a list of values and the output was the target value or more specifically the position of the target value in the list. So, with binary search, there’s also that precondition the input list must be sorted. So let’s formally defined binary search first, the input, a sorted list of values.
The output, the position in the list of the target value were searching for, or some sort of values indicate that the target does not exist in the list. Remember, our guidelines for defining an algorithm. Let me put those up again really quick. The steps in the algorithm need to be in a specific order. The steps also need to be very distinct. The algorithms should produce a result and finally the algorithm should complete in a finite amount of time. Let’s use those to Define this algorithm Step 1.
We determine the middle position of the sorted list Step 2, We compare the element in the middle position to the Target element, step 3, if the elements match, we return the middle position and end if they don’t match in Step 4, we check whether the element in the middle position is smaller than the target element. If it is, and we go back to step 2 with a new list that goes from the middle position of the current list to the end of the current list in step 5.
The element in the middle position is greater than the target element. Then again, we go back to step 2 with a new list that goes from the start of the current list to the middle position of the current list. We repeat this process until the target element is found or until a sub list contains only one element. If that single element sublist does not match the target element, then we end the algorithm indicating that the element does not exist in the list.
Okay, so that is the magic behind. How Brittany managed to solve the round, much faster in the next video. Let’s talk about the efficiency of binary search. We have a vague understanding that Brittany’s approach is better in most cases, but just like with linear search, it helps to visualize this much like we did with linear search when determining the efficiency of an algorithm. And remember we’re still only looking at efficiency in terms of
Same time, complexity as its called. We always want to evaluate how the algorithm performs in the worst case scenario. Now, you might be thinking all that doesn’t seem fair because given a series of data, if the target value were searching for is somewhere near the front of the list, then linear search, May perform just as well. If not slightly better than binary search, and that is totally true. Remember a crucial part of learning algorithms is understanding what works better in a given context when
I’m measuring efficiency though. We always use the worst-case scenarios as a benchmark because remember it can never perform worse than the worst case. Let’s plot these values on the graph. We started earlier with the number of tries or the run time of the algorithm on the y axis and the maximum number of values in the series or n on the horizontal axis to represent the worst case scenario. We have two data points when n equals 10. We need to look for it, tries using binary search. And when any
Was 100, it took seven tries, but even side-by-side these data points are sort of meaningless. Remember, that while there is quite a difference between the runtime of linear search and binary, search at an N, value of hundred for computer. That shouldn’t matter. What we should check out is how the algorithm performs at levels of n that might actually slowly computer down as n grows larger and larger. How do these algorithms compared to one? Another? Let’s add that to the graph.
Okay. Now a picture starts to emerge as n gets really large the performance of these two algorithms differ significantly. The difference is kind of Staggering actually, even with the simple game. We saw that binary search was better. But now we have a much more complete idea of how much better. For example, when n is 1000. The runtime of linear search measured by the number of operations or turns his also 1000 for binary search. It takes just
Ten operations. Now. Let’s look at what happens when we increase n by a factor of ten. A ten thousand linear, search takes 10,000 operations, while binary, search takes 14 operations and increase by a factor of 10 in binary. Search only needs four more operations to find a value. If we increase It Again by a factor of 10, once more to an N value of 100,000 binary, search takes only 17 operations. It is blazing fast.
What we’ve done here is plotted on a graph how the algorithm performs as the input set. It is working on increases. In other words. We’ve plotted the growth rate of the algorithm also known as the order of growth different algorithms grow at different rates and by evaluating the growth rates, we get a much better picture of their performance because we know how the algorithm will hold up as n grows larger.
This is so important. In fact, it is the standard way of evaluating an algorithm and brings us to a concept called Big O, you might have heard this word thrown about and if you found it confusing, don’t worry. We’ve already built up a definition in the past few videos. We just need to bring it all together. Let’s start with a common statement. You’ll see in studies and algorithms. Big O is a theoretical definition of the complexity of an algorithm as a function of the size.
Wow, what a mouthful. This sounds really intimidating, but it’s really not. Let’s break it down. Big O is a notation used to describe complexity and what I mean by notation is that it simplifies everything. We’ve talked about down into a single variable, an example of complexity written in terms of Big O, looks like this. As you can see, it starts with an uppercase letter O, that’s why we call it Big. Oh, it’s literally a big O. The o
From order of magnitude of complexity. So that’s where we get the Big O from now complexity here refers to the exercise. We’ve been carrying out in measuring efficiency. If it takes Brittany for tries, when n is 10. How long does the algorithm take when n is 10 million? When we use Big O for this, the variable used which we’ll get to distills that information down. So that by reading the variable, you get a big picture view without having to run through data points and graphs just like we did.
It’s important to remember that complexity is relative. When we evaluate the complexity of the binary search algorithm. We’re doing it relative to other search algorithms. Not all algorithms, Big O is a useful notation for understanding both time, and space complexity. But only when comparing amongst algorithms, that solve the same problem. The last bit, in that definition of Big O is a function of the size and all this means is that Big O measures complexity.
As the input size grows because it’s not important to understand how an algorithm performs in a single data set. But in all possible data sets.
You will also see Big O referred to as the upper bound of the algorithm and what that means is that Big O measures how the algorithm performs in the worst case scenario. So that’s all Big O is nothing special. It’s just a notation that condenses the data points and graphs that we’ve built up down to one variable. Okay. So what do these variables look like for John strategy, linear search. We say that it has a Time complexity of Big O and then and so that’s again Big O.
With an N inside parentheses for Brittany strategy, binary search. We say that it has a Time complexity of Big O of log, n, that’s bigger with something called a log and an N inside parentheses. Now, don’t worry. If you don’t understand that, we’ll go into that in more detail later on in the course.
Each of these has a special meaning but it helps to work through all of them to get a big picture view. So over the next few videos, let’s examine what are called, common complexities, or a common values of Big O that you will run into and should internalized. In our discussions of complexity. We made one assumption that the algorithm as a whole had a single measure of complexity that isn’t true and we’ll get it how we arrived at these measures for the entire algorithm at the end of this exercise, but each step in the algorithm.
Them has its own space and time complexity in linear search. For example, there are multiple steps and the algorithm goes like this. Start at the beginning of the list or range of values, compared the current value to the Target. If the current value is the target value that we’re looking for. We’re done. If it’s not, we’ll move on sequentially to the next value in the list and repeat Step 2. If we reach the end of the list and the target value is not in the list. Let’s go back to step 2 for a second.
Comparing the current value to the Target, does the size of the data set matter for this step? When we’re at Step 2. We’re already at that position in the list and all we’re doing is reading the value to make a comparison. Reading the value is a single operation and if we were to plot it on a graph of run time per operations against n, it looks like this a straight line that takes constant time. Regardless of the size of n, since this takes the same amount of
In any given case we say that the runtime is constant time. It doesn’t change in Big O notation. We represent this as Big O with a one inside parentheses. Now when I first started learning all this, I was really confused as to how to read this. Even if it was in my own head. Should I say Big O of 1? When you see this written, you’re going to read this as constant time. So reading a value in a list is a constant time operation. This is the most ideal.
Eel case when it comes to run times because input size does not matter. And we know that regardless of the size of n, the algorithm runtime will remain the same The Next Step Up in complexity. So to speak is the situation. We encountered with the binary search algorithm traditionally explaining the time, complexity of binary search involves math. I’m going to try to do it both with and without
When we play the game using binary search, we notice that with every turn we were able to discard half of the data, but there’s another pattern that emerges that we didn’t explore. Let’s say n equals 10. How long does it take to find an item at the 10th position of the list? We can write this out. So we go from 10 to 5 to 8 to 9 and then down to 10 here. It takes us four tries to cut down the list to just one element and find the value. We’re looking for.
Or let’s double the value of n 2, 20, and see how long it takes for us to find an item at the 20th position. So we started 20 and then we pick 10, from there we go. To 15 17, 19. And finally 20. So here it takes us five tries. Okay, let’s double it again. So that n is 40 and we try to find the item in the 40th position. So when we start at 40, the first midpoint, we’re going to pick as 20. From there we go to 30 and 35 37.
Nine. And then 40. Notice that, every time we double the value of n, the number of operations, it takes to reduce the list down to a single element, only increases by 1. There’s a mathematical relationship to this pattern and it’s called a logarithm of n. You don’t really have to know what logarithms truly are. But I know that some of you like, underlying explainers, so I’ll give you a quick one. If you’ve taken algebra classes, you may have learned about exponents. Here’s
A quick Refresher.
Two times one equals two. Now. This can be written as 2 raised to the first Power because it is our base case 2 times 1 is 2. And 2 times 2 is 4. This can be written as 2 raised to the second power because we’re multiplying 2 Twice. First, we multiply 2 times 1, then the result of that times 2 2 times 2 times 2 is 8 and we can write this as 2 raised to the third power because we’re multiplying two three times.
In to raise to 2 and 2 raise to 3, the two and three there are called exponents and they Define how the number grows with to raise to 3, we start with the base value and multiply itself, three times. The inverse of an exponent is called a logarithm. So if I say log to the base 2 of 8 equals 3, I’m basically saying the opposite of an exponent. Instead of saying how many times do I have to multiply this value? I’m asking how many?
Times. Do I have to divide 8 by 2, to get the value one, this 63 operations. What about the result of log to the base 2 of 16 that evaluates to for? So why does any of this matter notice that this is sort of how binary search works? Log to the base 2 of 16 is 4? If n was 16, how many tries does it take to get to that last element? Well, we start in the middle at 8:00. Let’s too low. So we move to 12.
And we move to 14 then to 15 and then to 16, which is five tries or log to the base 2 of 16 plus 1, in general for a given value of n. The number of tries. It takes to find the worst case scenario is log of n plus 1 and because this pattern is overall a logarithmic pattern. We say that the runtime of such algorithms is logarithmic, if we plot these data points on our graph, a logarithmic run.
Looks like this in Big O notation. We represent a logarithmic runtime as Big O of log n, which is written as Big O with log n inside parentheses or even sometimes as lnn inside parentheses. When you see this, read it as logarithmic time, as you can see, on the graph as n grows really large the number of operations grows very slowly. And eventually flattens out. Since this line is below.
Below the line for a linear runtime, which we’ll look at in a second. You might often hear algorithms, with logarithmic runtimes being called sublinear, logarithmic or sub linear runtimes are preferred to linear because they’re more efficient. But, in practice linear, search has its own set of advantages, which will take a look at in the next video. Next up. Let’s look at the situation. We encountered with the linear search algorithm. We saw that in the worst case scenario, whatever the value of n,
Was John sick exactly that many tries to find the answer as in linear search. When the number of operations to determine the result in the worst case scenario is at most the same as n, we say that the algorithm runs in linear time. We represent this as Big O of n. Now you can read that as Big O of n. Like I just said, or you can say linear time, which is more common. When we put that up on a graph against constant time. And logarithmic time. We get a line that looks
It’s like this, any algorithm that sequentially reads the input will have linear time. So remember anytime, you know, a problem involves reading every item in a list that means a linear runtime. As you saw from the game. We played Brittany strategy using binary, search was clearly better and we can see that on the graph. So if we had the option, why would we use linear search? Which runs in linear time? Remember that binary search had a precondition the input set had?
Be sorted while we won’t be, looking at sorting algorithms. In this course, as you learn more about algorithms, you’ll find that sorting algorithms have varying complexities themselves, just like search does. So we have to do additional work prior to using binary search. For this reason in practice, linear search ends up being more, performant up, to a certain value of n, because the combination of sorting first, and then searching using binary search ads up.
The next common complexity you will hear about is when an algorithm runs in quadratic time, if the word quadratic sounds familiar to you. It’s because you might have heard about it in math class, quadratic is a word that means an operation raised to the second power or when something is squared. Let’s say you and your friends are playing a tower defense game and to start it off. You’re going to draw a map of the terrain. This map is going to be a grid and you pick a random number to determine
I mean how large this grid is. Let’s set n, the size of the grid to for next you need to come up with a list of coordinates. So you can place towers and enemies at stuff on this map. So how would we do this? If we start out horizontally, we’d have coordinate points that go 11 12, 13 and 14. Then you go up one level vertically and we have points 21, 22, 23 and 24. Go up, one more and you have the points 3 1, 3, 2 3.
33 and 34 and on that last row you have the points, 41, 42, 43, and 44. Notice that we have a pattern here for each row. We take the value and then create a point by adding to that, every column value the range of values go from 1 to the value of n. So we can generally think of it this way for the range of values from 1 to n. For each value in that range. We create a point by combining that value with the range of values.
One to end again doing it this way for each value in the range of 1 to n. We create an N number of values and we end up with 16 points, which is also n times n or N squared. This is an algorithm with a quadratic runtime because for any given value of n We Carry Out N squared number of operations. Now, I picked a relatively easy. So to speak example here because in English at least we often denote Map size.
Sizes by height times width. So we would call this a 4×4 grid, which is just another way of saying 4 squared or N squared in Big O notation. We would write this as Big O of N squared or say that this is an algorithm with a quadratic runtime. Many search algorithms have a worst-case quadratic runtime, which you’ll learn about soon. Now, in addition to quadratic run times. You may also run into cubic run times as you encounter different algorithms in such an algorithm.
For a given value of n. The algorithm executes n raised to the third power number of operations. These aren’t as common as quadratic algorithms though. So we won’t look at any examples, but I think it’s worth mentioning throwing up on our graph, quadratic and cubic runtimes. Look like this. So this is starting to look pretty expensive, computationally as they say. We can see here that for small changes in N. There is a pretty significant change in the number of operations that we need to carry out.
The next worst case runtime we’re going to look at is one that’s called quasi linear and is sort of easier to understand for lack of better word by starting with the Big O notation quasi-linear runtimes are written out as Big O of n Times. Log n, we learned what log n was, right. A logarithmic runtime. Whereas n grew the number of operations, only increase by a small factor with a quasi linear runtime. What we’re saying, is that for every value of n, we’re going to execute
Eight log and number of operations, hence, the runtime of n Times, log n. So, you saw earlier with the quadratic runtime that for each value of n, we conducted an operations. It’s sort of the same in that as we go through the range of values in N, we’re executing log n operations in comparison to other runtimes. A quasi linear algorithm has a runtime that lies somewhere between a linear runtime and a quadratic runtime. So, where would we expect to see this?
This kind of run time in Practical use, while sorting algorithms is one place. You will definitely see it. Mergesort for example is a sorting algorithm that has a worst case runtime of Big O of n log n. Let’s take a look at a quick example. Let’s say we start off with a list of numbers that looks like this and we need to sort it mergesort starts by splitting this list into two lists down the middle and then takes each sub lists and split that in half down the middle. Again. It keeps
He’s doing this until we end up with a list of just a single number. When we’re down to single numbers. We can do one sort operation and merge these sub lists back in the opposite direction. The first part of merge sort Cuts those lists into sub lists with half the numbers. This is similar to binary search where each comparison operation cuts down the range to half the values, you know, the worst case runtime in binary search is log n. So the splitting
operations have the same runtime Big O of log n or logarithmic. But splitting into half isn’t the only thing we need to do with merge sort. We also need to carry out comparison operations so we can sort those values. And if you look at each step of this algorithm, we carry out an N number of comparison operations, and that brings the worst-case running time of this algorithm to end times log. In also known as quasi linear.
Don’t worry, if you didn’t understand how merge sort works, that wasn’t a point of this demonstration. We will be covering mergesort soon in a future course. The runtimes, we’ve looked at so far are all called polynomial. Runtimes an algorithm is considered to have a polynomial runtime. If for a given value of n its worst case runtime is in the form of n raised to the K power or k, just means some value. So you could be N squared where k equals 2 for quadratic runtime and
Cubed for cubic runtime and so on. All of those are in the form of n raised to some power. Anything that is bounded by this. And what I mean by that is if we had a hypothetical line on our graph of n raised to the K power, anything that falls under this graph is considered to have a polynomial running time, algorithms with an upper bound or a runtime with a big 0 value that is polynomial. Are considered efficient algorithms and are likely to be used in practice.
Next class of run times that we’re going to look at our a run times that we don’t consider efficient. And these are called exponential run times with these run times as n increases slightly the number of operations increases exponentially. And as we will see in a second, these algorithms are far too expensive to be used an exponential. Runtime is an algorithm with a big old value of some number raised to the nth power. Imagine that you wanted to break into.
A locker that had a padlock on it. Let’s assume you forgot your code. This lock takes a two digit code and the digit for the code, ranges from 0 to 9. You start by setting the dials to zero and then with the first dial remaining on zero, you change. The second dial to one and try and open it. If it doesn’t work decided to to then try again, you would keep doing this and if you still haven’t succeeded with the second dial set to 9, then you go back to that first dial, set it to 1 and start the second dial.
Over the range of values, you’d have to go through is 0 0 to 99, which is 100 values. This can be generalized as 10 to the second power. Since there are 10 values on each dial, raised to two dials searching through each individual value, until you stumble on the right. One is a strategy called Brute Force, and Brute Force algorithms, have exponential, runtimes here. There are two dials. So, n is 2, and each dial has 10 values.
Again, we can generalize this algorithm as 10 raised to n where n represents the number of dials. The reason that this algorithm is so inefficient is because with just one more dial on the lock. The number of operations increases significantly, with three dials, the number of combinations in the worst case scenario, where the correct code is the last digit. In the range is 10 raised to 3 or 1,000 values with an additional wheel. It becomes 10, raise to 4 or 10 thousand values.
As n increases the number of operations increases exponentially to a point where it’s unsolvable in a realistic amount of time. Now, you might think well any computer can crack a four digit numerical lock. And that’s true because n here is sufficiently small but this is the same principle that we use for passwords in a typical password field implemented. Well, users are allowed to use letters of the English alphabet. So up to 26 characters numbers from 0 to 9.
And a set of special characters of which they can be around 33. So typically that means each character in a password can be one out of 69 values. This means that for a one character, password. It takes 69 to the nth power. So 1, which equals sixty nine operations in the worst case scenario, to figure out the password, just increasing, n22 increases. The number of operations needed to guess the password 269 squared.
Or 4760 one operations now usually on a secure website, there isn’t really a limit but in general, passwords are limited to around 20 characters in length, with each character, being a possible, 69 values and they’re being 20 characters. The number of operations needed to guess. The password in the worst case scenario is 69 raised to the 20th power or approximately 6 followed by 3600 number of operations.
An Intel CPU with five-course can carry out roughly about 65,000 million instructions per second. That’s a funny one number. I know to crack our 20 digit passcode In This Very simplistic model. It would take this Intel CPU to race. It 20th, power years to Brute Force the password. So while this algorithm would eventually produce a result. It is so inefficient that it’s pointless. This is one of the reasons why people
I’ll recommend you have longer passwords since brute-forcing is exponential. In the worst case. Each character. You add increases the number of combinations by an exponential, the next class of exponential algorithms is best highlighted by a popular problem. Known as the traveling salesman. The problem statement goes like this, given a list of cities and the distance between each pair of cities. What is the shortest possible route that visits each City and then returns to the
City. This seems like a simple question, but let’s start with a simple case three cities, a b and c to figure out what the shortest route is. We need to come up with all the possible routes with three cities. We have six routes in theory, at least some of these routes can be discarded because ABC is the same as CBA but in the opposite direction, but as we do know, sometimes going from a to c through be may go through a different route than C to a through b, so we’ll stick.
To the six routes and from there, we could determine the shortest. No big deal. Now, if we increase this 24 cities, we jump to 24 combinations, the mathematical relationship that defines this is called a factorial and is written out as n followed by an exclamation point. Factorials are basically n times. N minus 1 repeated until you reach the number one. So for example, the factorial of 3 is 3 times 2 times 1, which is 6, which is
Number of combinations. We came up with, for three cities. The factorial of 4 is 4 times, 3 times 2 times, 1 or 24, which is the number of combinations. We arrived at with four cities in solving, the traveling salesman problem. The most efficient algorithm will have a factorial runtime or a combinatorial runtime. As it’s also called at low values of n algorithms with a factorial, runtime may be used, but with an N value of say, 200 it.
Would take longer than humans have been alive to solve the problem for sake of completeness. Let’s plot a combinatorial runtime on our graph so that we can compare an algorithm such as one that solves the traveling salesman problem as a worst-case runtime of Big O of n. Factorial studying exponential run times, like this are useful for two reasons first in studying, how to make such algorithms efficient. We develop strategies that are useful across the board and
Potentially be used to make existing algorithms even more efficient. Second. It’s important to be aware of problems that take a long time to solve knowing right off the bat. That a problem is somewhat unsolvable. In a realistic time means you can focus your efforts on other aspects of the problem. As beginners though. We’re going to steer clear of all this and focus. Our efforts on algorithms with polynomial, run times, since we’re much more likely to work with and learn about such algorithms. Now that we know some of the
Owen complexities in the next video. Let’s talk about how we determine the complexity of an algorithm because there are some nuances over the last few videos. We took a look at common complexities that we would encounter in studying algorithms. But the question remains, how do we determine what the worst case complexity of an algorithm is earlier. I mentioned that even though we say that an algorithm has a particular upper bound or worst case, runtime each step in a given algorithm can have
Run times, let’s bring up the steps for binary search again. Assuming the list is sorted. The first step is to determine the middle position of the list in general. This is going to be a constant time operation, many programming languages hold on to information about the size of the list. So we don’t actually need to walk through the list to determine the size. Now, if we didn’t have information about the size of the list, we would need to walk through counting each item, one by one.
Until we reach the end of the list and this is a linear time operation. But realistically, this is a big O of 1 or constant time. Step two is to compare the element in the middle position to the Target element. We can assume that in most modern programming, languages. This is also a constant time operation because the documentation for the language tells us it is step three is our success case. And the algorithm ends. This is our best.
Yes, and so far, we have only incurred to constant time operations. So, we would say that the best case runtime of binary search is constant, which is actually true. But remember that best case is not a useful metric set. For if we don’t match is splitting the list in two sub lists, assuming the worst case scenario. The algorithm would keep splitting in two sub lists until a single element list, is reached with the value that we’re searching for the runtime for.
This step is logarithmic since we discard half the values each time. So in our algorithm, we have a couple steps that are constant time and one step that is logarithmic overall, when evaluating the runtime for an algorithm. We say that the algorithm has as its upper bound the same runtime as the least efficient step in the algorithm. Think of it this way. Let’s say you’re participating in a triathlon, which is a race that has a swimming running and it’s like
Component, you could be a phenomenal swimmer and a really good cyclist, but you’re pretty terrible run, no matter how fast you are at swimming or cycling. Your overall race time, is going to be impacted the most, by a running race time. Because that’s the part that takes you. The longest, if you take an hour, 30 to finish the running component, 55 minutes to swim and 38 minutes to bike. It won’t matter if you can fine-tune your swimming technique down to finish in 48 minutes.
And your cycle time to 35 because you’re still bounded at the top by your running time, which is close to almost double your bike time. Similarly, with the binary search algorithm. It doesn’t matter how fast we make the other steps there already. As fast as they can be. In the worst case scenario, the splitting of the list down to a single element list is what will impact the overall running time of your algorithm. This is why we say that the time complexity or run time of the algorithm in the worst case.
Is Big O of log n or logarithmic as I alluded to though your algorithm may hit a best case runtime and in between the two best and worst case have an average runtime as well. This is important to understand because algorithms don’t always hit their worst case, but this is getting a bit too complex for us for now. We can safely ignore average case performances and focus. Only on the worst case in the future. If you decide to stick around, we’ll Circle back and talk about this more. Now that you
About algorithms, complexities and big. Oh, let’s take a break from all of that and write code in the next video.
So far, we’ve spent a lot of time in theory. And while these things are all important, things to know, you get a much better understanding of how algorithms work when you start writing some code. As I mentioned earlier. We’re going to be writing python code in this and all subsequent algorithm courses. If you do have programming experience, but in another language, check the notes section of this video for an implementation in your language, if you don’t have any experience, I’ll try my best to explain as we go along.
On the video, you are watching right now. You should see a launch workspaces button. We’re going to use a treehouse coding environment call workspaces to write all of our code. If you’re familiar with using python in a local environment, then feel free to keep doing. So work spaces is an in-browser coding environment and will take care of all the setup and installation so you can focus on just writing and evaluating code workspaces is quite straightforward to use on the
left here. We have a file Navigator pane, which is currently empty since we haven’t created a new file on the top. We have an editor where we write all our code and then below that we have a terminal or a command line prompt where we can execute the scripts that we write. Let’s add a new file here. So at the top in the editor area, we’re going to go to file new file and we’ll name. This linear underscore search dot. Hi.
In here, we’re going to Define our linear search algorithm as a standalone function. We start with the keyword def which defines a function or a block of code. And then we give it the name, linear underscore search. This function will accept two arguments first. The list were searching through and then the target value. We’re looking for both of these arguments are enclosed in a set of parentheses and there’s no space between the name of the function.
Ian and the arguments. After that we have a colon. Now, there might be a bit of confusion here since we already have this target value. What are we searching for? Unlike the game? We played at the beginning where John’s job was to find the value. In a true implementation of linear search. We’re looking for the position in the list where the value exists if the target is in the list, then we return its position. And since this is a list that position is going to be denoted by
Index value. Now if the target is not found, we’re going to return, none. The choice of what to return in. The failure case may be different in other implementations of linear search. You can return -1 since that isn’t typically an index value. You can also raise an exception which is python speak for indicating an error occurred and I think for us the most straightforward value we can return here is none. Now, let’s add a comment to clarify this so hit enter to go to the next line and then we’re going to add
Three single quotes and then below that. On the next line, will say, a Returns the position or the index position of the target if found else returns, none. And then on the next line will close off those three quotes. This is called a docstring and is a python convention for documenting. Your code. The linear search algorithm is a sequential algorithm that compares each item in the list.
Till the target is found to iterate or Loop or walk through our list. Sequentially. We’re going to use a for Loop. Now, typically when iterating over a list in Python, we would use a loop like this would say for item in list. This assigns the value at each index position to that, local variable item. We don’t want this though. Since we primarily care about the index position, instead. We’re going to use the range function.
Ian in Python to create a range of values that start at 0 and end at the number of items in the list. So I’ll say, for I, I stands for index here in range starting at 0 and going all the way up to the length of the list.
We can get the number of items in the list using the Len function. Now, going back to our talk on complexity and how individual steps in an algorithm can have its own runtimes. This is a line of code that we would have to be careful about python keeps track of the length of a list. So this function call here. Land list is a constant time operation. Now, If This Were A naive implementation, let’s say we wrote The implementation of the list and we iterate over the list, every
Can we call this length function? Then? We’ve already incurred a linear cost. Okay. So once we have a range of values that represent index positions, in this list, we’re going to iterate over that using the for Loop and assign each index value to this local variable. I using this index value. We can obtain the item at that position using subscript notation on the list. Now. This is also a constant time operation because the language says so so we’ll do if list so
So once we have this value, which will get by using subscript notation, so it’s a list. I once we have this value will check if it matches the target. So if the value at I equals Target, well, if it does, then we’ll return that index value because we want the position. And once we hit this return statement, we’re going to terminate our function, if the entire for Loop is executed. And we don’t hit this return statement. Then the target does not exist in the list. So at the bottom here,
Say return. None, even though all the individual operations in our algorithm run in constant time, in the worst case scenario, this for Loop here will have to go through the entire range of values and read every single element in the list. Therefore giving the algorithm a bigger value of n or running in linear time. Now, if you’ve written code, before you’ve definitely written code like this a number of times and I bet you didn’t know, but all along you are implementing. What is essentially a well.
Unknown algorithm. So, I hope this goes to show you that algorithms are pretty approachable topic. Like everything else. This does get Advanced but as long as you take things slow, there’s no reason for it to be impossible. Remember that not any block of code counts as an algorithm to be a proper implementation of linear search. This block of code must return. A value must complete execution in a finite amount of time and must output the same result every time for a given input set. So let’s
Verify this with a small test. Let’s write a function called verify that. Accepts an index value. If the value is not none. It prints the index position. If it is none, it informs us that the target was not found in the list. So deaf verify, and this is going to take an index value. And will say, if index is not none, then we’ll print
Target found at index.
Oops, that’s a colon here.
Index else.
That needs to go back. There we go. Else will say Target.
Not found in list. Okay, using this function. Let’s define a range of numbers now, so, this will be a list numbers and we’ll just go from one to let’s say 10.
Now, if you’ve written python code before, you know, that I can use a list comprehension to make this easier, but we’ll keep things simple. We can. Now use our linear search function to search for the position of a Target value in this list. So we can say, result equal linear, underscore search and we’re going to pass in the numbers list. That’s the one we’re searching through and we want to look for the position where the value 12 exists and then we’ll verify this result.
If our algorithm works correctly, the verify function should inform us that the target did not exist. So, make sure you save the file which you can do by going up to file and save or hitting command s. And then Below in the terminal.
You’re going to type out python linear search or you can hit Tab and it should autocomplete linear search dot pi as you can see. Correct. The target was not found in the list. So the output of our script is what we expect for a second test. Let’s search for the value 6 in the list. So you can copy this command C to copy and then paste it again and we’ll just change 12 here to 6 and then come back down to the terminal, hit the up Arrow to execute the same.
And again and hit enter. You’ll notice that I forgot to hit save so it did not account for that. New change. Will try that again and there, you’ll see that if it works correctly, which it’s did, the index should be number five. Run the program on your end and make sure everything works as expected. Our algorithm returned a result in each case, it executed in a finite time and the results were the ones we expect in the next video. Let’s tackle binary search in the last video. We left off with an
Implementation of linear search. Let’s do the same for binary search so that we get an understanding of how this is represented in code. So we’ll do this in a new file back to file new file. And we’ll name this one binary search dot py like before we’re going to start with a function named binary search. So we’ll say def binary underscore search that takes a list and a Target if you remember by.
Search works by breaking the array or list down into smaller sets until we find the value. We’re looking for, we need a way to keep track of the position of the list that were working with. So, let’s create two variables first, and last to point to the beginning and end of the array. So first equal
Zero. Now, if you’re new to programming list, positions are represented by index values that start at zero instead of one. So here, we’re setting first, 20 to point to the first element in the list. Last is going to point to the last element in the list. So we’ll say last equal Len list minus 1. Now. This may be confusing to you. So a quick sidebar to explain what’s going on. Let’s say we have a list containing five elements.
If we called Len on that list, we should get five back because there are five elements. But remember that because the position number start at 0, the last value is not at position 5. But at for in nearly all programming languages, getting the position of the last element in the list is obtained. By determining the length of the list and deducting one, which is what we’re doing. Okay, so we know what the first and last positions are when we start the algorithm for our next line of code.
We’re going to create a while loop. A while loop takes a condition and keeps executing the code inside. The loop until the condition evaluates to false for our condition. We’re going to say to keep executing this Loop until the value of first is less than or equal to the value of last. So while first less than or equal to last, well, why you ask, why is this our condition? Well, let’s work through this implementation and then a
Jules ation should help inside the while loop. We’re going to calculate the midpoint of our list since that’s the first step of binary search midpoint equal. So we’ll say first plus last and then we’ll use the floor division double slash here divided by 2. Now the two forward. Slashes here are what python calls a floor division operator. What it does is it rounds down to the nearest whole number. So if we have an eight elements,
DeRay first is 0, last is 7, if we divided 0 plus 7, which is 7 by 2, we would get 3.5. Now 3.5 is not a valid index position. So we round that down to three using the floor division operator. Okay. So now we have a midpoint. The next step of binary search is to evaluate whether the value. Add. This midpoint is the same as the target we’re looking for. So, say if list value at midpoint equals the
It. Well if it is then we’ll go ahead and return the midpoint. So we’ll say return midpoint the return statement terminates our algorithm and over here. We’re done. This is our best case scenario.
Next, we’ll say else. If list at midpoint.
Or value at midpoint is less than the Target. Now here. If the value is less, the value at midpoint is less than the target. Then we don’t care about any of the values lower than the midpoint. So we redefine first to point to the value after the midpoint. So we’ll say midpoint + 1. Now, if the value at the midpoint is greater than the target, then we can discard the values after the midpoint and redefine last to point to the value. Prior to the
Point. So I say else.
Last equal, midpoint minus 1. Let’s visualize this, we’re going to start with a list of nine integers to make this easier to understand. Let’s specify these integers to be of the same value as its index position. So we have a range of values from 0 to 8. Our Target is the worst case scenario. We’re looking for the position of the value 8 at the start. Our algorithm sets first to point to the index 0 and last two point two.
Length of the list – 1 which is 8. Next, we hit our while loop. The logic of this Loop is going to be executed. As long as the value of first is not greater than the value of last. Or as we’ve defined it. We’re going to keep executing the contents of the loop as long as first is less than or equal to last on the first pass. This is true. So we enter the body of the loop. The midpoint is first plus last divided by 2 and rounded down, so we get a nice even for
The value at this position is for now. This is not equal to the Target. So we move to the first else if 4 is less than 8. So now we redefine first 2.2 midpoint plus 1, which is 5, first is still less than last, so we run through the body of the loop. Again. The midpoint is now 66 is less than 8. So we move first two point two. Seven seven is still less than or equal to 8, so we
For another iteration of the loop. The midpoint is 7, oddly enough and seven is still less than the target. So we move first 2.28, first is equal to last now, but our condition says, keep the loop going as long as first is less than or equal to last. So this is our final time through the loop. The midpoint is now 8, which makes the value at the midpoint equal to the Target. And we finally exit our algorithm and return the position.
Mission of the Target now. What if we had executed all this code and never hit a case where midpoint equal the target? Well, that would mean the list did not contain the target value. So, after the while loop at the bottom will return none. We have several operations that make up our binary search algorithm. So, let’s look at the run time of each step. We start by assigning values to first, and last, the value assigned to last involves a
To the Len function to get the size of the list, but we already know this is a constant time operation in Python. So both of these operations run in constant time inside the loop. We have another value assignment and this is a simple division operation. So, again, the runtime is constant in the next line of code were reading a value from the list and comparing the midpoint to the Target. Both of these again, are constant time operations. The remainder of the code is
In a series of comparisons and value assignments. And we know that these are all constant time operations as well. So if all we have are a series of constant time operations, why does this algorithm have in the worst case? A logarithmic runtime? It’s hard to evaluate by just looking at the code. But the while loop is, what causes the runtime to grow, even though all we’re doing is a comparison operation by redefining first and last over here, or rather in the
Two steps over here. We’re asking the algorithm to run as many times as it needs until first is equal, or greater than last each time. The loop. Does this the size of the data set, the size of the list grows, smaller by a certain Factor, until it approaches a single element, which is what results in the logarithmic runtime.
Okay, just like with linear search. Let’s test that our algorithm works. So we’ll go back to linear. Search that pie and we’re going to copy paste. So command C to copy, if you’re on a Mac, then go back to binary search and at the bottom.
Oops, we’re going to paste in that verify function. Okay, we’ll also go back and grab this numbers. You know what, let’s go ahead and copy. All all of these things so numbers and the to verify cases will paste that in as well. And the only thing we need to change her is instead of calling linear search. This is going to call binary search.
Okay, we’ll hit command s to save the file. And then I’m going to drag up my console and will run python binary, search .pi and hit enter. And you’ll see like, just like before we get the same results back. Now note that an extremely important distinction needs to be made here. The numbers list that we’ve defined for our test cases, right here has to be sorted. The basic logic of binary search, relies on the fact that if the
I get is greater than the midpoint, then, our potential values. Lie to the left or vice versa since the values are sorted in ascending order. If the values are unsorted, our implementation of binary search May return, none, even if the value exists in the list and just like that, you’ve written code, to implement to search algorithms. How fun was that? Hopefully, this course has shown you that it isn’t a topic to be afraid of and that algorithms like any other
Epic with code can be broken down and understood piece by piece. Now. We have a working implementation of binary search, but there’s actually more than one way to write it. So in the next video, let’s write a second version. I’m going to create a new file, as always file, new file and we’ll name this recursive underscore, binary underscore search .p, why?
Okay, so we’re going to add our new implementation here so that we don’t get rid of that. First implementation. We wrote let’s call this new function, recursive binary search. Unlike our previous implementation. This version is going to behave slightly differently in that. It won’t return the index value of the target element. If it exists instead, it will just return a True Value. If it exists in a false, if it doesn’t. So recursive.
Underscore binary underscore search. And like, before this is going to take a list. It accepts a list and a Target to look for in that list will start the body of the function by considering what happens if an empty list is passed in, in that case, we would return false. So I would say if the length of the list which is one way to figure out if it’s empty, if it’s equal to 0, then we’ll return false. Now, you might be thinking that in the previous.
Of binary search. We didn’t care if the list was empty. Well, we actually did but in a roundabout sort of way, so in the previous version of binary search, our function had a loop and that Loop condition was true when first was less than or equal to last. So as long as it’s less than or equal to last, we continue the loop. Now if we have an empty list then first is greater than last and the loop would never execute and we return none at the bottom. So
This is the same logic. We’re implementing here. We’re just doing it in a slightly different way. If the list is not empty, will Implement an else Clause. Now here we’ll calculate the midpoint by dividing the length of the list by 2 and rounding down. Again. There’s no use of first and last here. So we’ll say length of list and then using the floor division operator will divide that by 2. If the value at the midpoint, which will check
I think if list using subscript notation will say, midpoint as index. Now, if this value at the midpoint is the same as the target, then we’ll go ahead and return true so far. This is more or less the same except for the value that were returning. Let me actually get rid of all that.
Okay. Alright. So if this isn’t the case, let’s Implement an else Clause. Now, here we have two situations. So first, if the value at the midpoint is less than the target. So if value at midpoint
Is less than the target. Then we’re going to do something new. We’re going to call this function. Again this recursive binary search function that were in the process of defining. We’re going to call that again. And we’re going to give it the portion of the list that we want to focus on in the previous version of binary search. We moved the first value to point to the value after the midpoint. Now here, we’re going to create a new list using what is called.
They slice operation and create a sub list that starts at midpoint, plus 1 and goes all the way to the end. We’re going to specify the same Target as a search Target, and when this function call is done, will return the value. So we’ll see a return. The return is important. Then we’ll call this function again, recursive binary search, and this function takes a list. And here we’re going to use that subscript notation.
Perform a slice operation by using two indexes. A start and an end. So we’ll say our new list that were passing in needs to start at midpoint plus one, and then we’ll go all the way to the end. And this is a python syntactic sugar. So to speak. If I don’t specify an end index python knows to just go all the way to the end. All right. So this is our new list that we’re working with and we need a Target will just pass it through. If you confuse bear with me, just like before will visualize this at
And okay, we have another else case here. And this is a scenario where the value at. The midpoint, is greater than the target. Which means we only care about the values in the list from the start, going up to the midpoint. Now in this case as well. We’re going to call the binary search function again and specify a new list to work. With this time. The list is going to start at the beginning and then go all the way up to the midpoint. So it looks the same will say a return recursive.
Binary search.
We’re going to pass in a list here. So if we just put a colon here, without a start index, python knows to start at the beginning and we’re going to go all the way up to the midpoint. The target here is the same, and this is our new binary search function. So, let’s see if this works.
Actually, yes, down here will make some space and will Define a verify function. We’re going to copy paste, the previous one because we’re not returning none or an integer here. So, we’ll verify the result that we pass in and we’ll say, print Target found.
And this is just going to say true or false, whether we found it. Okay, so like before we need a numbers list and we’ll do something 1. 2 3 4 all the way up to 8.
Okay, and now let’s test this out. So we’ll call our recursive, binary search function and will pass in the numbers list. And the target here is 12.
We’re going to verify this.
Verify the result, make sure it works and then we’ll call it again this time, making sure that we give it a Target that is actually in the list. So here we’ll say 6 and we’ll verify this again.
Make sure you hit command s to save. And then in the console below, we’re going to type out python recursive, binary search .pi run it. And you’ll see that we verified that search works. Well, we can’t verify the index position of the target value which is a modification to how our algorithm works. We can guarantee by running across all valid inputs. That search works as intended. So why write a different search algorithm?
A different binary search algorithm. And what’s the difference between these two implementations? Anyway, the difference lies in these last four lines of code that you see here. We did something unusual here. Now before we get into this, a small word of advice. This is a confusing topic and people get confused by all the time. Don’t worry, that doesn’t make you any less of a programmer. In fact, I have trouble with it often and always look it up.
Including when I made this video, this version of binary search is a recursive binary search. A recursive function is one that calls itself. This is hard for people to grasp sometimes because there’s few easy analogies that make sense, but you can think of it and sort of this way. So, let’s say you have this book that contains answers to multiplication problems. You’re working on a problem. And you look up an answer in the book, the answer for your problem says
Ten to the answer for problem 52. Okay. So you look up problem 52 and there it says add 12 to the answer for problem 85. Well, then you go and look up the answer to problem 85. And finally, instead of redirecting, you somewhere else that answer says 10. So you take that 10 and then you go back to problem 52. Because remember, the answer for problem. 52 was to add 12 to the answer for problem 85. So you take that.
10. And then you now have the answer to problem. 85. So you add 10 to 12 to get 22, then you go back to your original problem where it said to add 10 to the answer for problem 52. So you add 10 to 22, and you get 32 to end up with your final answer. So, that’s a weird way of doing it. But this is an example of recursion, the solution to your first look up in. The book was the value obtained by another look up in the same book, which was followed by yet another
Cup in the same book, the book told you to check the book until you arrived at some base. Value, our function Works in a similar manner. So let’s visualize this with an example of list. Like before we have a 9 element list here with values 0 through 8. The target were searching for is the value 8 will check if the list is empty by calling Len on it. This list is not empty. So we go to the else Clause. Next. We calculate the midpoint 9 divided by 2.
Who is 4.5 rounded down is 4. So our first midpoint value is for will perform. Our first. Check is the value at the midpoint equal to the Target. Not true. So we go to our else Clause will perform another check. Here is the value at the midpoint less than the Target. Now, in our case. This is true earlier. When we evaluated this condition. We simply change the value of first here. We’re going to call the recursive binary search function again and give it a new list to
With the list starts at midpoint plus 1. So at index position, 5 all the way to the end, notice that this call to recursive, binary search in sight of recursive, binary search, includes a return statement. This is important and we’ll come back to that in a second. So now we’re back at the top of a new call to recursive binary search with effectively a new list. Although technically just a sub list of the first one.
The list here contains the numbers, six, seven, and eight, starting with the first check. The list is not empty. So we move to the else. The midpoint, in this case, length of the list, 3 divided by 2 rounded down, is one, is the value at the midpoint equal to the Target on the value. At that position is 7. So no in the else we perform. The first check is the value at the midpoint less than the target, indeed, it is. So we call
Recursive binary search again and provided a new list. This list starts at midpoint plus 1 and goes to the end. So, in this case, that’s a single element list since this is a new call to recursive, binary, search. We start back up at the top is the list, empty know, the midpoint is 0, is the value at the midpoint, the same as the target it is. So now we can return true. Remember, a minute ago. I pointed out that when we
all recursive binary search from inside the function itself. It’s preceded by a return statement that plays a pretty important role here. So, back to our visualization, we start at the top and we call binary search with a new list, but because that’s got a return statement before it. What we’re saying is, hey, when you run binary, search on this, whatever value, you get back return it to the function that called you. Then at the second level. We call binary search again along with another
Return statement, like, with the first call. We’re instructing the function to return a value back to the code that called it at this level. We find the target. So, the function returns true back to the caller. But since this inner function was also called, by a function with instructions to return. It keeps returning the True Value. Back up, until we reach the very first function that called glean back to our book of answers, recursive binary search instructs itself to keep working on.
The problem until it has a concrete answer. Once it does, it works its way backwards giving the answer to every function that called it until the original caller, has an answer. Now. Like I said at the beginning, this is pretty complicated so you should not be concerned. If this doesn’t click honestly, this is not one thing that you’re going to walk away with knowing fully how to understand recursion. After your first try. I’m really not lying when I say, I have a pretty hard time with recursion.
Now, before we move on, I do want to point out. One thing, even though the implementation of recursion is harder to understand, it is easier. In this case to understand how we arrived at the logarithmic run time, since we keep hauling the function with smaller lists. Let’s take a break here in the next video. Let’s talk a bit more about recursion and why it matters in the last video we rotate
Version of binary search that uses a concept called recursion, recursion might be a New Concept for you. So let’s formalize how we use it.
A recursive function is one that calls itself. In our example, the recursive binary search function called itself, inside the body of the function. When writing a recursive function. You always need a stopping condition. And typically, we start the body of the recursive function, with this stopping Condition. It’s common to call the stopping condition. The base case in our recursive, binary search function. We had to stop,
And conditions. The first was what the function should return, if an empty list is passed in. It seems weird to evaluate an empty list because you wouldn’t expect to run search on an empty list. But if you look at how our function Works, recursive binary search keeps calling itself. And with each call to itself, the size of the list is cut in half if we searched for a Target that didn’t exist in the list. Then the function would keep having it.
Self until it got to an empty list. Consider a three element list with numbers 1 2 3. We’re we’re searching for a target of for on the first past. The midpoint is 2 so the function would call itself with the list three on the next past. The midpoint is 0 and the target is still a greater. So the function would call itself this time passing in an empty list because an index of 0 plus 1, in a single element list doesn’t exist.
When we have an empty list, this means that after searching through the list, the value wasn’t found. This is why we Define an empty list as a stopping condition or a base case that returns false. If it’s not an empty list, then we have an entirely different set of instructions. We want to execute first. We obtain the midpoint of the list. Once we have, the midpoint. We can introduce our next base case or stopping condition. If the value at the midpoint is the same as the target then.
We return true with these two stopping conditions. We’ve covered all possible. Paths of logic through the search algorithm. You can either find the value or you don’t. Once you have the base cases. The rest of the implementation of the recursive function is to call the function on smaller, some lists, until we hit. One of these base cases, going back to our visualization for a second. We see that recursive binary search calls itself. A first time, which then calls itself.
Self again, for the initial list. We started with the function, only calls itself a few times before a stopping condition. Is reached. The number of times, a recursive function calls itself is called recursive depth. And the reason I bring all of this up is because if after you start learning about algorithms, you decide you want to go off and do your own research. You may start to see a lot of algorithms implemented using recursion.
The way we implemented binary search. The first time is called an iterative solution. Now, when you see the word iterative, It generally means the solution was implemented. Using a loop structure of some kind, a recursive Solution. On the other hand is one that involves a set of stopping conditions and a function that calls itself, computer scientists and computer science textbooks, particularly from back in the day, favor and are written in what are called functional languages.
In functional languages, we try to avoid changing data that is given to a function in our first version of binary search. We created first and last variables using the list and then modified first. And last, as we needed to arrive at a solution functional languages, don’t like to do this. All this modification of variables and prefer a solution using recursion and language, like python, which is what we’re using, is the opposite and doesn’t like recursion. In fact,
Has a maximum recursion depth after it, which are function will halt execution. Python, prefers an iterative solution. Now, I mentioned all of this for two reasons, if you decide that, you want to learn how to implement the algorithm in a language of your choice. That’s not python. Then you might see a recursive solution as the best implementation. In that particular language. I’m an iOS Developer for example, and I work with a language called Swift. Swift is
Different from python in that it doesn’t care about recursion depth and doesn’t need tricks. Where doesn’t even matter how many times your function calls itself? So if you want to see this in Swift Code, then you need to know how recursion works well, and now you have some idea and the second reason I bring it up is actually way more important and to find out onto the next video at the beginning of this series. I mentioned that there were two ways of measuring the efficiency of an algorithm. The first was time complexity or how the run time of an algorithm
Grows as n grows larger. The second is space complexity. We took a pretty long route to build up this example, but now we’re in a good place to discuss. Space complexity space complexity is a measure of how much working storage or extra storage is needed as a particular algorithm grows. We don’t think about it much these days, but every single thing we do on a computer takes up space in memory, in the early days of computing, considering memory.
Usage was of Paramount importance, because memory was limited and really expensive. These days were spoiled. Our devices are rich with memory. This is okay. When we write everyday code because most of us aren’t dealing with enormously large data sets. When we write algorithms. However, we need to think about this because we want to design our algorithms to perform as efficiently as it can, as the size of the data set. N grows really large like time complexity.
Space complexity is measured in the worst-case scenario using Big O notation, since you are familiar with the different kinds of complexities. Let’s Dive Right into an example.
In our iterative implementation of binary search. The first one, we wrote that uses a while loop. Let’s look at what happens to our memory usage, as n gets large. Let’s bring up that function. Let’s say we start off with a list of 10 elements. Now inspecting the code. We see that our solution relies heavily on these two variables first, and last
First points to the start of the list and last to the end, when we eliminate a set of values, we don’t actually create a sub list. Instead. We just redefine first and last, as you see here to point to a different section of the list since the algorithm only considers the values between first and last, when determining the midpoint by redefining first. And last, as the algorithm proceeds, we can find a solution using just the original list.
This means that for any value of n, the space complexity of the iterative version of binary search is constant or the iterative version of binary search takes constant space. Remember that we would write this as Big O of one. This might seem confusing because as n grows, we need more storage to account for that larger list size. Now, this is true, but that storage is not what space complexity cares about measuring. We
About what additional storage is needed as the algorithm runs and tries to find a solution. If we assume something simple, say that for a given size of a list represented by a value. N, it takes n amount of space to store it, whatever that means, then for the iterative version of binary search, regardless of how large the list is, at the start middle and end of the algorithm process. The amount of storage required does not get larger than
An n. And this is why we consider it to run in constant space. Now. This is an entirely different story with the recursive version. However, in the recursive version of binary search, we don’t make use of variables to keep track of, which portion of the list. We’re working with. Instead. We create new lists every time with a subset of values or sub lists with every recursive function call. Let’s assume we have a list of size n and in the worst case scenario the target element,
Is the last in the list calling the recursive implementation of binary search on this list and Target would lead to a scenario like this. The function would call itself and create a new list that goes from the midpoint to the end of the list, since we are discarding, half the values, the size of the sub list is n by 2. This function will keep calling itself creating a new sub list, that’s half the size of the current one until it arrives at a single element list and a
Open condition this pattern that you see here, where the signs of the sub list is reduced by a factor on each execution of the algorithmic logic. What we’ve seen that pattern before you remember where? This is exactly how binary search works. It discards half the values every time until it finds a solution. Now, we know that because of this pattern, the running time of binary search is logarithmic. In fact, this space complexity of the recursive version of binary search is the same.
If we start out with a memory allocation of size, n that matches the list on each function call of recursive, binary search. We need to allocate additional memory of size n by 2 and by 4 and so on until we have a sub list that is either empty or contains a single value because of this, we say that the recursive version of the binary search algorithm runs in logarithmic, time with a big O of log n. Now, there’s an important caveat here.
This totally depends on the language. Remember how I said that a programming language? Like, Swift can do some tricks to wear recursion depth, doesn’t matter. The same concept applies here. If you care to read more about this concept, it’s called tail optimization that’s called tail optimization. Because if you think of a function as having a head and a tail, if the recursive function, call is the last line of code in the function, as it is. In our case, we
All this tail recursion since it’s the last part of the function that calls itself and the trick that Swift does to reduce the amount of space and therefore, Computing overhead to keep track of this recursive calls is called tail, call optimization or tail call elimination. It’s one of those things that you’ll see thrown around a lot in algorithm discussions, but may not always be relevant to you. Now. What if any of this is relevant to us? Well, python does not
Implement tail, call optimization. So the recursive version of binary search takes logarithmic space if we had to choose between the two implementations, given that time complexity or runtime of both versions, The iterative and recursive version are the same. We should definitely go with the iterative implementation in Python, since it runs in constant space. Whoo. Okay, that was a loss. But all of this, with all of this, we’ve now established two important ways to distinguish
Between algorithms but handle the same task, and determine which one we should use. We’ve arrived at what I think is a good spot to take a long break and let all of these new Concepts sink in. But before you go off to the next course, let’s take a few minutes to recap. Everything. We’ve learned so far while we did Implement two algorithms in this course and actual code, much of what we learned here was conceptual and will serve as building blocks for everything. We’re going to learn in the future. So let’s list all of it out.
The first thing we learned about and arguably the most important was algorithmic thinking, algorithmic thinking is an approach to problem-solving that involves breaking a problem down into a clearly defined input and output along with a distinct set of steps that solves the problem. By going from input to Output algorithmic thinking is not something you develop overnight by taking one course, so don’t worry, if you’re thinking I still don’t truly know how to apply what I learned here. Algorithmic thinking.
Sinks in after you go through several examples, in a similar fashion to what we did today. It also helps to apply these Concepts in the context of a real example, which is another thing we will strive to do moving forward, regardless. It is important to keep in mind that the main goal here is not to learn how to implement a specific data structure or algorithm off the top of your head. I’ll be honest. I had to look up a couple code Snippets for a few of the algorithms myself in writing this course, but in going through this
This you now know that binary search exists and can apply it to a problem, where you need a faster search algorithm. Unlike most courses, where you can immediately apply what you have learned to build something. Cool, learning about algorithms and data structures will pay off more in the long run. These second thing we learned about is how to define and Implement algorithms. We’ve gone over these guidelines several times. So I won’t bore you here again at the end, but I will remind you that. If you’re often confused about how
To effectively break down a problem in code to something more manageable following those algorithm guidelines is a good place to start. Next, we learned about Big O and measuring the time complexity of algorithms. This is a mildly complicated topic, but once you’ve abstracted the math away, it isn’t as hazy a topic as it seems. Now, don’t get me wrong. The math is pretty important. But only for those designing and analyzing algorithms. Our goal is more about how to understand and evaluate.
Rhythms, we learned about common runtimes, like constant linear, or logarithmic, and quadratic runtimes. These are all fairly new Concepts. But in time, you will immediately be able to distinguish the run time of an algorithm based on the code. You write and have an understanding of where it sits on an efficiency scale. You will also in due time, internalized runtimes of popular algorithms, like the fact that binary search runs in logarithmic, time and constant space and be able to
Commend alternative algorithms for a given problem. All in all over time. The number of tools in your tool. Belt will increase. The next we learned about two important, search algorithms and the situations in which we select one over the other. We also implemented these algorithms in code so that you got a chance to see them work. We did this in Python, but if you are more familiar with a different language and haven’t gotten the chance to check out the code Snippets, we’ve provided you should try your hand at
getting it yourself. It’s a really good exercise to go through. Finally. We learned about an important concept. And a way of writing, algorithmic code through recursion, recursion is a tricky thing, and depending on the language, you write code with you may run into it more than others. It is also good to be aware of, because as we saw in our implementation of binary search, whether recursion was used or not affected the amount of space we used,
Don’t worry, if you don’t fully understand how to write recursive functions. I don’t truly know. Either. The good part is, you can always look these things up and understand how other people do it. Anytime you encounter recursion in our courses moving forward. You’ll get a full explanation of how and why the function is doing what it’s doing and that brings us to the end of this course, I’ll stress again that the goal of this course, was to get you prepared for learning about more specific algorithms by
Reducing you to some of the tools and Concepts, you will need moving forward. So if you’re sitting there thinking I still don’t know how to write many algorithms or how to use algorithmic thinking. That’s okay. We’ll get there, just stick with it. As always. Have fun and happy coding.
Hi, my name is poisson. I’m an instructor at treehouse and welcome to the introduction to data structures course. In this course, we’re going to answer one. Fundamental question. Why do we need more data structures than a programming language provides? Before? We answer that question, some housekeeping, if you will. In this course, we’re going to rely on Concepts. We learned in the introduction to algorithms course. Namely, Big O notation.
Ocean space and time complexity and recursion, if you’re unfamiliar with those Concepts or just need a refresher, check out the prerequisites courses listed. In addition. This course does assume that you have some programming experience. We’re going to use data structures that come built into nearly all programming languages, as our point of reference, while we will explain the basics of how these structures work. We won’t be going over, how to use them in practice. If you’re looking to learn how to program before digging.
Into this content check the notes section of this video for helpful. Links. If you’re good to go, then awesome. Let’s start with an overview of this course. The first thing we’re going to do is to explore a data structure. We are somewhat already familiar with arrays. If you’ve written code before, there’s a high chance you have used an array. In this course, we’re going to spend some time understanding, how arrays work. What are the common operations on an array and what are the run times associated with those operations. Once we’ve done.
Done that. We’re going to build a data type of our own called a linked list in doing. So we’re going to learn that there’s more than one way to store data. In fact, there’s way more than just one way. We’re also going to explore what motivates us to build specific kinds of structures. And look at the pros and cons of these structures will do that by exploring for common operations, accessing a value searching for a value. Inserting a value, and deleting a value.
After that, we’re actually going to circle back to algorithms and Implement a new one. A sorting algorithm in the introduction to algorithms course, we implemented. A binary search algorithm a precondition to binary search. Was that the list needed to be sorted. We’re going to try our hand at sorting a list and open the door to an entirely new category of algorithms. We’re going to implement our sorting algorithm on two different data structures and explore. How the implementation of one algorithm can defer.
Were based on the data structure being used. We’ll also look at how the choice of data structure potentially influences, the run time of the algorithm in learning about sorting. We’re also going to encounter another General concept of algorithmic thinking called divide and conquer along with recursion dividing conquer will be a fundamental tool that we will use to solve complex problems. All in due time in the next video. Let’s talk about a raise.
A common data structure built into nearly every programming language is the array array, is, are a fundamental data structure in can be used to represent a collection of values, but it is much more than that. Arrays are also used as building blocks to create even more custom data types and structures. In fact, in most programming, languages text is represented using the string type and under the hood strings are just a bunch of characters stored in a particular order in an array before we go.
A further and dig into arrays. What exactly is a data structure? A data structure, is a way of storing data. When programming, it’s not just a collection of values and the format, they’re stored in. But the relationship between the values in the collection, as well as the operations applied on the data stored in the structure, an array is one of very many data structures in general. An array is a data structure that stores a collection of values where each value is referenced using it.
Index or a key, a common analogy for thinking about arrays is as a set of train cars. Each car has a number and these cars are ordered sequentially inside each car, the array, or the train in this analogy store some data. While this is the general representation of an array, it can differ slightly from one language to another, but for the most part, all these fundamentals, remain the same in a language like Swift or Java arrays are homogeneous.
Containers which means they can only contain values of the same type. If you use an array to store integers in Java, it can only store integers in other languages. Arrays are hydrogenous structures that can store any kind of value in Python. For example, you can mix numbers and text with no issues. Now, regardless of this Nuance, the fundamental concept of an array is the index. This index value is used for every operation on the array from accessing.
I’ll use to inserting updating and deleting in Python the language. We’re going to be using for this course. It’s a tiny bit confusing, the type that we generally refer to as an array. In most languages is best represented by the list. Type in python, python does have a type called array as well, but it’s something different. So we’re not going to use it. While python calls it, a list. When we use a list in this course, we’ll be talking about Concepts that apply to arrays.
As well in other languages. So definitely don’t skip any of this as one more thing in computer science. A list is actually a different data structure than an array. And in fact, we’re going to build a list later on in this course, generally though. This structure is called a linked list as opposed to just list. So hopefully the terminology isn’t too confusing.
To properly. Understand how arrays work? Let’s take a peek at how arrays are stored under the hood. An array is a contiguous data structure. This means that the array is stored in blocks of memory that are right beside each other with no gaps. The advantage of doing this is that retreating values is very easy in a non-contiguous data structure. We’re going to build one soon. The structures stores, a value, as well as a reference to where the next value is.
To retrieve that next value, the language has to follow that reference also called a pointer to the next block of memory. This adds some overhead, which as you will see increases the runtime of common operations. A second ago. I mentioned that depending on the language erase can either be homogeneous containing the same type of value or head erogenous, where any kind of value can be mixed. This Choice also affects the memory layout of the array, for example, in a language, like, C Swift or
Ava where a razor homogeneous when an array is created since the kind of value is known to the language compiler and you can think of the compiler as the brains behind the language, it can choose a contiguous block of memory that fits the array size and values created, if the values were integers, assuming, an integer took up space represented by one of these blocks, then for a five item array, the compiler can allocate 5 blocks of equally sized memory.
In Python, however, this is not the case. We can put any value in a python list. There is no restriction. The way this works is a combination of contiguous memory and the pointers or references. I mentioned earlier when we create a list in Python, there is no information about what will go into that array, which makes it hard to allocate. Contiguous memory of the same size. There are several advantages to having contiguous memory since the values are stored beside each other.
Accessing the values happens in almost constant time. So, this is a characteristic. We want to preserve the way python gets around. This is by allocating contiguous memory and storing in it. Not the value we want to store but a reference or a pointer to the value that’s stored somewhere else in memory by doing this, it can allocate equally sized contiguous memory. Since regardless of the value size, the size of the pointer, to that value, is always going to be equal.
This incurs an additional cost in that when a value is accessed. We need to follow the pointer to the block of memory where the value is actually stored. But python has ways of dealing with these costs that are outside the scope of this course. Now that we know how an array stores its values. Let’s look at common operations that we execute on an array. Regardless of the kind of data structure. You work with all data structures are expected to carry out for kinds of operations at minimum. We need to be able to access and read values.
Stored in the structure. We need to be able to search for an arbitrary value. We also need to be able to insert a value at any point into the structure. And finally, we need to be able to delete structures. Let’s look at how these operations are implemented on the array structure. In some detail, starting with access elements in an array are identified. Using a value known as an index. And we use this index to access and read the value. Most programming languages follow a zero-based.
Whirring system when it comes to a raise. And all this means is that the first index value is equal to 0, not 1, generally speaking, when an array is declared a base amount of contiguous memory is allocated, as the array storage computers, refer to memory through the use of an address, but instead of keeping a reference to all the memory allocated for an array, the array only has to store the address of the first location because the memory is contiguous using the Base address of the array can
Calculate the address of any value by using the index position of that value as an offset. If you want to be more specific, think of it this way. Let’s say we want to create an array of integers and in each integer takes up a certain amount of space in memory that we’ll call Em. Let’s also assume that we know how many elements were going to create. So the size of the array is some number of elements will call n. The total amount of space that we need to allocate is n times the space per item, m.
If the array keeps track of the location in memory where the first value is held. So let’s label that M 0, then it has all the information it needs to find any other element in the list. When accessing a value in an array, we use the index to get the first element in the list. We use the 0th index to get the second. We use the index value 1 and so on, given that the array knows, how much storage is needed. For each element. It can get the address of any element by starting off with the
Dress for the first element and adding to that the index value times the amount of storage per element for example to access the second value. We can start with M 0 and 2 that add M times the index value, 1 giving us m 1 as the location in memory. For the second address. This is a very simplified model, but that’s more or less how it works. This is only possible because we know that array memory is contiguous with no gaps.
Let’s switch over to some code. As I mentioned earlier. We’re going to be using python in this course, if you don’t know how to code or you’re interested in this content, but no language other than python check, the notes section of this video. For more information, while the code will be in Python. The concepts are Universal and more importantly, simple enough, that you should have no issue following along in your favorite programming language. And to get started, click on the launch workspaces button on the video.
Your page that you’re watching right now. This should spin up an instance of a treehouse workspace and in browser coding environment, right now, your workspace should be empty and that’s expected. So let’s add a new file in here. I’m going to go to file new file and we’ll call this a raised dot p. Y Pi, creating a list in Python is quite simple. So we’ll call this new underscore list. We use a set of square brackets around a set of values.
To create a list. So one and we comma separate them. So space to and Space 3, this allocates a base amount of memory for the array to use. Or when I say, array know that in Python, I mean, a list, since this is python, the values aren’t stored in memory instead. The values 1 2, & 3 are stored elsewhere in memory and the array stores references. To each of those objects to access a value.
Use a subscript along with an index value. So, to get the first value, we use the index 0. And if we were to assign this to another variable would say, result equal new list. We write out new list since this is the array that we’re accessing the value from and then a subscript notation, which is a square bracket and then the index value as we saw since the array has a reference to the base location in memory the position of any element.
Ament can be determined pretty easily. We don’t have to iterate over the entire list. All we need to do is a simple calculation of an offset from the base memory, since we’re guaranteed that the memory is contiguous for this reason. Access is a constant time operation on an array or a python list. This is also why an array crashes if you try to access a value using an index that is out of bounds of what the array stores.
If you’ve used an array before you’ve undoubtedly, run into an error or a crash where you tried to access a value using an index that was larger than the number of elements in the array. Since the array, calculates, the memory address on the fly. When you access a value with an out-of-bounds index, as it’s called, the memory address returned, is not one. That’s part of the array structure and therefore, cannot be read by the array and python. This is represented by an index error, and we can make this happen by
by using an index. We know our array won’t contain. Now. I’m writing out my code here inside of a text editor, which obviously doesn’t run the code. So, let’s drag up this console area here and I’m going to write python to bring up the python interpreter. And in here we can do the same thing. So I can say new list equal 1 comma 2 comma 3 and now this is an interpreter, so it’s actually going to evaluate our code.
Alright, so now we have a new list. If I type out new list, it gets printed out into the console. Okay, I can also do new list square bracket zero. And you’ll see that I get the value 1, which is the value stored at beat 0th index. Now, to highlight, that index error, we can do new list and inside the square brackets. We can provide an index. That we know our array doesn’t contain. So here, I’ll say index, 10, and if I hit enter, you’ll see it’s a index error list index.
Dex out of range. And those are the basics of how we create and read values from an array in the next video. Let’s take a look at searching in the last video. We learned what happens under the hood when we create an array and read a value using an index. In this video. We’re going to look at how the remaining data structure operations work on a raise. If you took the introduction to algorithms course, we spend time learning about to search algorithms, linear search, and binary search.
While arrays are really fast and accessing values. They’re pretty bad at. Searching taking an array. As is the best we can do, is use linear search for a worst-case. Linear runtime, linear search works by accessing and reading each value in the list until the element in concern is found. If the element we’re looking for is at the end of the list, then every single element in the list will have been accessed and compared even though accessing and comparing our constant time operations having to do this.
Every element results in an overall linear time. Let’s look at how search works in code in Python. We can search for an item in an array. In one of two ways. We can use the in operator to check whether a list contains an item. So I can say, if one in new underscore list then print true the in operator actually calls a contains method
That is defined on the list type which runs a linear search operation. In addition to this. We can also use a for Loop to iterate over the list manually and perform a comparison operation. So I can say, for n in new list.
If n equals 1, then print true. And then after that break out of the loop, this is more or less the implementation of linear search. If the array of were sorted. However, we could use binary search but because sort operations, incur a cost of their own languages. Usually stay away from sorting the list and running, binary search. Since for smaller arrays, linear search on its own, maybe faster. Now again, remember that.
This is an editor. This is just a text file. None of these lines are of code are evaluated so you can try that out in here. So we’ll copy that and come down here and say Python and hit enter and then it starts up. We can paste in our list and now we can try it. What we just did. So if one in new list, print true,
there you go. It prints true now because we’ve already learned about linear and binary search in a previous course. There’s nothing new going on here. What’s more interesting to look at? In my opinion is inserting and deleting values in an array. Let’s start with inserting in general most array implementations support, three types of insert operations. The first is a true insert using an index value where we can insert an element anywhere in the list. This
Shane has a linear runtime. Imagine you wanted to insert an item at the start of the list. When we insert into the first position. What happens to the item that is currently in that spot. Well, it has to move to the next spot at index value 1, what happens to the second item and index position one, that one moves to the next spot at index position to this keeps happening until all elements have been shifted forward, one index position. So in the worst case scenario,
you inserting at the 0th position of an array. Every single item in the array has to be shifted forward and we know that any operation that involves iterating through every single value means a linear runtime. Now, the second way we can insert an item into an array, is by appending. A pending, although technically an insert operation in that it inserts, an item into an existing array doesn’t incur the same runtime cost because appends simply add the
To the end of the list we can simplify and say that this is constant time. This is a constant time operation, but it depends on the language implementation of array to highlight. Why that matters? Let’s consider how lists in Python work in Python. When we create a list. The list doesn’t know anything about the size of the list and how many elements were going to store creating a new empty list. Like so so numbers equal and too?
Empty brackets, so this creates a list and allocates a space of size n plus 1. Since n here is 0 there are no elements in this array. In this list space is allocated for a one-element list to start off because the space allocated for the list and the space used by the list are not the same. What do you think happens when we asked python for the length of this list? So I can sail in numbers?
We correctly, get zero back. This means that the list doesn’t use the memory allocation as an indicator of its size. Because as I mentioned it has allocated space for a one-element list, but it returns zero, so it determines it in other ways. Okay, so numbers this list currently has space for one element. Let’s use the append method defined on the type to insert a number at the end of the list. So you say numbers.’
Hend, I’ll pass in to now the memory allocation and the size of the list are the same since the list contains one element. Now, what if I were to do something like this numbers dot append and needs to be a DOT.
And I had another value, 200, that, since the list, only has an allocation for one item at this point before I can add the new element to the list. It needs to increase the memory allocation. And thereby the size of the list. It does this by calling a list resize operation list resizing is quite interesting because it shows the Ingenuity in solving problems like this.
Python doesn’t resize the list. Accommodate just the element. We want to add instead. In this case. It would allocate for blocks of memory to increase the size to a total of four contiguous blocks of memory. It does this so that it doesn’t have to resize the list, every single time. We add an element, but at very specific points. The growth pattern of the list type in Python, is 048 1625.
Thirty-five, forty six and so on, this means that as the list size approaches, these specific values, resize is called again. If you look at when the size of the list is, for this means that when a pending for more values until the size of eight, each of those append operations, do not increase the amount of space taken at specific points. However, when resizing is triggered, space required increases as memory, allocation increase,
Uses this might signify that the append method has a non constant space complexity, but it turns out that because some operations don’t increase space and others do. When you average all of them out, append operations. Take constant space. We say that it has an amortized constant space complexity. This also happens with insert operations. If we had four element array, we would have four elements and a memory allocation of for
And insert operation at that point doesn’t matter where it happens on the list, but at that point, it would trigger a resize inserting is still more expensive though because after the resize, every element needs to be shifted over one, the last insert operation that is supported in most languages is the ability to add one list to another in Python. This is called an extent and looks like this. So it’s a numbers and if you let me actually clear out the console.
She, well, let’s exit python will clear this out. Sore, back of the top and we’ll start again. So I’ll say numbers and we’ll set it to an empty list. And now we can say numbers that extent and as an argument, we’re going to pass in a new list entirely. So here we’ll say 4 comma, 5 comma 6. And then once I hit enter, if I were to print out numbers, you’ll see that it now contains the values 4 5 & 6.
Six, so extend takes another list to add extend effectively, makes a series of append calls on each of the elements in the new list until all of them have been appended to the original list. This operation has a runtime of Big O of K where K represents the number of elements in the list that we’re adding to our existing list. The last type of operation, we need to consider our delete operations, deletes are similar to inserts in that when
Delete operation occurs. The list needs to maintain correct index values. So we’re an insert shifts every element to the right and delete operation shifts, every element to the left, just like an insert as well. If we delete the first element in the list, every single element, in the list, needs to be shifted to the left. Delete operations. Have an upper bound of Big O of n. Also known as a linear runtime. Now that we’ve seen how common operations work on a data.
Structure that we’re quite familiar with, let’s switch tracks and build our own data structure over the next few videos. We’re going to build a data structure that you may have worked with before a linked list before we get into what a linked list is. Let’s talk about why we build data structures. Instead of just using the ones that come built into our languages.
Each data structure solves a particular problem. We just went over the basics of the array data structure and looked at the cost of common operations that we carry out on arrays. We found that arrays were particularly good at accessing reading values happens in constant time, but arrays are pretty bad at inserting and deleting both of which run in. Linear time, linked lists on the other hand are somewhat better at this. Although there are some caveats and if we’re trying to solve a problem that involves far more,
Inserts and deletes than accessing a linked list, can be a better tool than an array. So what is a linked list, a linked list is a linear data structure where each element in the list is contained in a separate object called a node. A node models, two pieces of information, an individual item of the data. We want to store and a reference to the next node. In the list. The first node in the linked list is called the head of the list. While the last note, is called a tail.
The Head and the tail nodes are special the list. Only maintains a reference to the Head. Although in some implementations. It keeps a reference to the tail as well. This aspect of linked list is very important and as you’ll see most of the operations on the list, need to be implemented quite differently compared to an array, the opposite of the head, the tail denotes, the end of the list every node other than the tail points to the next node in the list, but tail doesn’t point to anything.
This is basically how we know. It’s the end of the list. Nodes are what are called self-referential objects. The definition of a node includes a link to another node and self-referential. Here means the definition of node includes. The note itself. Linked lists often come in two forms. A singly linked list where each node stores a reference to the next node in the list or a doubly linked list where each node stores, a reference to both the node before and after if an array
Is a train with a bunch of cars in order. Then a linked list is like a treasure hunt. When you start the hunt. You have a piece of paper with a location of the first treasure. You go to that location and you find an item along with the location to the next item of treasure when you finally find an item. That doesn’t also include a location, you know that the Hunt is ended now that we have a high level view of what a linked list is. Let’s jump into code and build one together. Will focus on building a singly linked list.
This course, there are advantages to having a doubly linked list, but we don’t want to get ahead of ourselves.
Let’s start here by in creating a new file.
We’re going to put all our code for our linked list. So we’ll call this linked underscore list dot pi. And first we’re going to create a class to represent a node.
Say class node.
Now node is a simple object in that it won’t model much. So first, We’ll add a data variable. It’s an instance, variable here called data and will assign the value. None, initially. And then, we’ll add one more will call this next node and to this will assign none as well. So, we’ve created two instance variables data, to hold onto the data that were storing. And next node to point to the next node in the list. Now. We need to
Add a Constructor to make this class easy to create. So, we’ll add an init method here, that takes self, and some data to start off. And all we’re going to do is assign data to that instance, variable, we created. So that’s all we need to model node. Before we do anything else, though. Let’s document this. So right after the class definition, let’s create a doc string. So three quotes next line and we’ll say an object first.
Restoring a single node of a linked list and then on the next line will say, models to attributes.
Data and the link to the next node in the list and then we’ll close this doc string off with three more quotation marks. Okay? Using the node class is fairly straightforward so we can create a new instance of node with some data to store. Now, the way we’re going to do this is we’re going to bring up the console, and we’re going to type out, like we’ve been typing out before python followed by the name of this.
Skip to that we wrote which is linked list, linked underscore list at Pi. But before we do that, we’re going to pass an argument to the python command. We’re going to say Dash or hyphen. I and then the name of the script linked underscore list dot Pi. So what this does is this is going to run the python Rebel. The read evaluate print Loop in the console, but it’s going to load the contents of our file into that so that we can use it. So I’ll hit enter and we have a new instance going.
And now we can use the node in here so we can say, N1 equal node. And since we Define that Constructor, we can pass it some data. So we’ll say 10 here. Now. If we try to inspect this object, the representation return isn’t very useful, which will make things really hard to debug as our code grows. So for example, if I type out N1, you’ll see that we have invalid instance here, but it’s not very helpful the way it’s printed out. So we can customize this by adding a
Presentation of the object using the wrapper function. Now in the terminal still will type out. Exit like that, hit enter to exit the console and then down here. Let’s add in some room.
Okay, and here we’ll say def double underscore rapper, another set of double underscores. And then this function takes the argument self. And in here we can provide a string representation of what we want printed to the console. When we inspect that object inside of it, inside of the console. So here, we’ll say return again. This is a string representation. So inside quotes will say node. So this represents a node instance and the
Data, it contains. He will say percent s, which is a python way of substituting something into a string string interpolation and outside of the string, we can say percent again. And here we’re saying we want to replace this %, s with self dot data. Okay, let’s hit save. And before we move on. Let’s verify that this works. So I’m going to come in here.
Type clear to get rid of everything and then we’ll do what we did again and you can just hit the up Arrow. A couple times to get that command. All right, so hit enter and now just so you know, every time you run this, you start off, you know, from scratch so N1 that we created earlier not there anymore. So let’s go ahead and create it and one equal node 10, and we can type n 1 again and hit enter and you have a much better representation now. So we can see that we have a node and it contains the data 10.
We can also create another one and two equal node that contains the data 20. And now we can say N1 dot next node equal n 2. So N1 now points to end to and if we say N1 dot next node.
You’ll see that it points to that node. The node containing twenty nodes are the building blocks for a list. And now that we have a node object, we can use it to create a singly linked list. So again, I’m going to exit out of this.
And then go back to the text editor.
And here we’ll create a new class. So class linked list. The linked list class is going to define a head and this attribute models. The only node that the list is going to have a reference to. So here, we’ll say ahead and we’ll assign none initially, and then like we did earlier, let’s create a Constructor. So double underscore in it. Double underscore the stakes self.
And then inside like before we’ll say self dot head equal, none. This is the same as doing this, so we can actually get rid of that and just use the Constructor. Okay. So again, this head attribute models. The only note that the list will have a reference to since every node points to the next node to find a particular node. We can go from one note to the next in a process called list traversal in the class Constructor.
It will have set the default value of Head To None, So that new lists created are always empty. Again. You’ll notice here that I didn’t explicitly, declare, the head, attribute at the top of the class definition. And don’t worry. That’s not an oversight. The self dot head, in the initializer means that it’s still created. Okay. So that’s all there is to modeling a linked list. Now, we can add methods that make it easier to use this data structure. First, a really simple docstring to provide some information.
So here we’ll create a doc string, three quotation marks, and then we’ll say a singly linked list and then close it off.
A common operation carried out on data. Structures is checking whether it contains any data or whether it’s empty at the moment to check. If a list is empty. We would need to query, these instance variables head and so on every time ideally we would like to not expose the inner workings of our data structure to code that uses it. Instead. Let’s make this operation more explicit by defining a method. So we’ll say def is empty.
And this method takes self as an argument and here, we’ll say it return. Self dot head, double equal, none. All we’re doing here is checking to see if head is none. If it is this condition evaluates to True, which indicates the list is empty. Now, before we end this video, let’s add one more convenience method, to calculate the size of our list, the name convenience method indicates that what this method is doing is not providing any additional functionality that our
Structure can’t handle right now. But instead making existing functionality easier to use, we could calculate the size of our linked list by traversing it every time using a loop until we hit a tail node, but doing that every time is a hassle. Okay, so we’ll call this method size. And as always, it takes self.
Unlike calling Len on a python list. Not to be confused with the linked list, which is a constant time operation. Our size operation is going to run in linear time. The only way we can count, how many items we have is to visit each node and call next until we hit the tail node. So we’ll start by getting a reference to the Head will say, current equal self dot head. Let’s also Define a local variable named count with an initial value of zero.
That will increment. Every time we visit a node, once we have the tail count will reflect the size of that list next. We’ll Define a while loop that will keep going until there are no more nodes. So say, while current while current is the same as writing out while current does not equal none, but it’s more succinct. So we’ll go with this. Former. If the latter is more precise for you. You can go with that. Now, inside this Loop will increment the count.
It value so count plus equal 1 plus equal. If you haven’t encountered it before is the same as writing count equals count, plus 1. So if count is zero initially, so it’s 0 plus 1 is 1 and then we’ll assign that back to count. Okay, so count plus equal 1. Next, we’re going to assign the next node in the list to current. So, current equal current.next.
Note this way, once we get to the tail and call next node, current will equal none and the while loop terminates. So the end we can return count. As you can see. We need to visit every node to determine the size. Meaning our algorithm runs in linear time. So, let’s document this up in our doc string, which will add now to size will say Returns the number of nodes in the list.
Takes linear time. Let’s take a break here. We can now create lists check if they’re empty and check the size in the next video. Let’s Start implementing some common operations at the moment. We can create an empty list, but nothing else. Let’s define a method to add data to our list. Technically speaking. There are three ways we can add data to a list. We can add nodes at the head of the list, which means that the most recent node, we created will be the head and the
The first node, we created will be the tail or we could flip that around most recent nodes are the tail of the list. And the first node to be added is the head. I mentioned that one of the advantages of linked lists over arrays, is that inserting data into the list, is much more efficient than to the array. This is only true if we’re inserting at the head or the tail. Technically speaking. This isn’t an insert and you’ll often see this method called add prepend? If the data is added to the head or append, if it’s added to the tail.
A true insert is where you can insert the data at any point in the list, which is our third way of adding data. We’re going to circle back on that if we wanted to insert at the tail, then the list needs a reference to the tail node. Otherwise, we would have to start at the head and walk down the length of the list or Traverse it to find the tail. Since our list, only keeps a reference to the Head. We’re going to add new items at the head of the list.
Now before we add our new method, I’d forgot that I didn’t show you in the last video, how to actually use the code. We just added and how to check every time. You know, when we add new code that it works correctly. So like before we’re gonna bring up the console and here we’re going to say python Dash. I linked underscore list dot Pi which should load it load the contents of our file and now we’ll start here by creating a linked list. So L equal linked list.
And then we’ll use a node. So N1 equal node with the value 10 and now we can assign N1 to the nodes or to the linked lists head attribute. So l 1 dot head, equal n 1 and then we can see if size works correctly. So if we call l 1 dot size and since this is a method we need a set of parentheses the n and enter you’ll see that we get back one correctly. Okay, so it works.
Now let’s add our new method which we’re going to call AD ad is going to accept some data to add to the list inside of a node, will say def, add and every python method takes self as an argument. And then we want to add some data to this node. So we’re going to say data for the second argument inside the method first. We’ll create a new node to hold onto the data. So new underscore node equal.
Node with the data before we set the new node as the head of the list. We need to point. The new nodes next property at whatever note is currently at head this way, when we set the new node as the head of the list. We don’t lose a reference to the old head. So new underscore node, dot next node equal, self dot head. Now if there was no no dat head this correctly sets next node to none.
Now, we can set the new note as the head of the node. So say self dot head equal new underscore node, because the insert operation is simply a reassignment of the head and next node properties. This is a constant time operation. So let’s add that in as a doc string.
First, what the method does. So it adds a new node.
Containing data at the head of the list.
This operation takes constant time which is our best case scenario. Okay, let’s test this out. So I’m going to bring the console back up will exit out of our current Ripple and we’ll load the contents of the file again, and now we don’t need to create a node like we did earlier. So we can say, L, equal linked list l dot, add one.
Okay, let’s see if this works. Will call size and if it worked, then linked list. Should now have a size of 1. There we go. You can also do l dot, add to l dot add 3.
And l, dot size. Should now be three. There we go. Now, for, I were to Type L and just hit print again. What we get in the rebel is nothing useful. So like before we’ll implement the wrapper function for our linked list. Now. I’m just going to copy paste this in and we’ll walk through it. Okay. So this is what our implementation of repper looks like for the linked list object. You can grab this code from the notes section.
Of this video. Okay. So the top, you’ll see a doc string, where it says. It returns a string representation of the list and like, everything we need to do with a linked list. We need to Traverse it. So this is going to take linear time. We start by creating an empty list. Now. I need to distinguish. This is a python list, not a linked list. So we create an empty list called nodes. And two nodes. We’re going to add strings that have a description that provided description of each node, but we’re not going to use the description.
That we implemented in the node class because we’re going to customize it a bit here. Next we start by assigning self dot had to current. So we sort of have a pointer to the Head node. As long as current does not equal. None, which means we’re not at the tail. We’re going to implement some logic. So in the first scenario, if the node assigned to current is the same as the head, then we’re going to append this string to our nodes list and the string is simply going to say,
That hey, this is a head node, and it contains some data which will extract using current data. Next scenario is if the node assigned to currents next node is none. Meaning we’re at the tail node, then we’ll assign a different kind of string. So it’s the same as earlier, except we’re saying tail here and then finally in any other scenario, which means we’re not at the head or not at the tail will simply print the nodes value inside and again, will extract a using current data with every iteration.
Of the loop will move current forward by calling current.next node and reassigning it. And then at the very end when we’re done, we’ll join all the strings that are inside the nodes list together, using the python join method and will say that with every joint. So when you join these two strings together to make one string, you need to put this set of characters in between. All right. So, let’s see what this looks like. So I’m going to come down here, exit out of the console again, clear it out.
Load the contents of the file again, and let’s try that. So we’ll say L, equal linked list.
All right, so l dot, add 1, l dot add to l dot, add three, that seems enough. And then now if I type out L and hit enter, we get a nice string representation of the list. So you can see that we add every new node to the head. So we added one first one ends up being the tail because it keeps getting pushed out then too. And then finally 3. So 3 is at the head so far. We’ve only implemented a single method which functions much like
Append method on a python list or an array except it adds it to the start of the linked List. It pre pens it like a pen. This happens in constant time in the next video. Let’s add the ability to search through our list for the search operation. We’re going to define a method that takes a value to search for and returns either the node containing the value. If the value is found or none, if it isn’t, so right after I actually know what will make sure represent the
Last function or last method in our class. So we’ll add it above it. So here will say def search self and then key.
In the last video, we implemented the wrapper method to provide a string representation of the list. So we’re going to use similar logic here to implement. The search function will start by setting a local variable current to point to the head of the list. While the value assigned to current is a valid node. That is it isn’t none will check if the data on that node matches the key that were searching for so while current
We’ll say, if current dot data is the key, then we’ll return current. If it does match will go ahead and return it like we’ve done here, but if it doesn’t, we’ll assign the next node in the list to current and check again, so we’ll say else, current equal current next node.
Once we hit the tail node and haven’t found the key current get set to none and the while loop exits at this point. We know the list doesn’t contain the key so we can return none. Okay? That completes the body of our method. Let’s add a doc string to document this. So up at the top will say search for the first node containing data that matches.
The key that is important because if our linked list contains more than one node with the same value it is a matter. We’re going to return the first one with this implementation will also say here that it Returns the node or none. If not found in the worst case scenario will need to check every single node in the list before we find the key or fail. And as a result, this operation runs in linear time.
So, say tanks o of n or linear time.
So far, we haven’t seen anything that indicates this data structure, has any advantage over an array or a python list, but we knew that I mentioned the strength of linked lists comes in inserts and deletes at specific positions. Will check that out in the next video. But as always, before we end this one, let’s make sure everything works. So we’ll load the contents of the file again.
L equal linked list.
And then we’ll say l dot, add 10 mL dot, add 22, doesn’t matter. Now that add 45 and one more. Now that add 15, now we can say l dot search and we need to give it a value. So we’ll say 45 and this returns a node or none. So we’ll say n equal and then would enter if this works and should be a node. Okay? Weirdly and
Does not work here at least. It says it’s not a node, which means I made a mistake in timing out our code. And looking at it immediately. It’s fairly obvious. So this return none needs to be outside of the while loop. Okay, so I’m going to hit save now. So make sure it’s on the same indentation here, which means it’s outside the while loop and then we’ll run through this again.
Okay, so L is linked list L, then add 10. Now, that add to now that add 45, and what was the last one? We did. I believe it was 15 and now we should be able to say, l dot search. Remember, we’re assigning this to an Ode to a variable. So, l dot search.
45.
And there you go. We get that node back and we can hit L, and we’ll see a representation of our list. Okay. So, again, in the next video, inserts and deletes at specific positions, insert operations on linked lists are quite interesting, unlike arrays, where, when you insert an element into the array, all elements, after the particular index need to be shifted with a linked list. We just need to change the references to next on a few notes and we’re good to go since each node points.
Next one, by swapping out these references. We can insert a node at any point in the list in constant time. Much, like binary search though. There’s a catch, to find the node at that position. We want to insert, we need to Traverse the list and get to that point. We just implemented our search algorithm for the linked list type, and we know that this runs in linear time. So, while actually inserting is fast finding the position in the list you want to insert it is not. This is why I
Mentioned that there were some caveats to inserting. Anyway, let’s see what this looks like in code.
Will Define a method named insert that takes data to insert along with an index position? So we’ll do this after search right here, say def insert, and this takes some data to insert and a position to insert it at
You may be thinking, wait a minute, linked lists don’t have index positions. Right? And you’re correct, but we can mimic that behavior by just counting the number of times we access next node, if the index value passed into this argument is 0. That means we want to insert the new node at the head of the list. This is effectively the same behavior as calling ad which means the logic is the same. So we don’t need to repeat it. We can call the add method we wrote earlier. So we’ll say if
Index.
If index equals zero or if index is 0, then self dot add data. If the index is greater than 0 then we need to Traverse the list to find the current node at that index. So if index is greater than 0, now before we do that, we need to create a new node, containing the data. We want to insert. So say nu equal node with some data.
I’m going to assign index the argument passed to our function to a local variable named position and the head of the list to a variable named current position equal index.
Current equal self dot head. Every time we call current.next node. Meaning we’re moving to the next node in the list will decrease the value of position by one when position is 0 will have arrived at the note. That’s currently at the position. We want to insert it in reality though. We don’t want to decrease it all the way to zero. Imagine. We have a list with five nodes and we want to insert a node at position 3 to insert an image.
At position 3, we need to modify the nodes at positions, two and three, no twos. Next node attribute is going to point to the new node and the new nodes. Next note, attribute will point to node 3 in this way. An insert is a constant time operation. We don’t need to shift every single element. We just modify a few. Next node, references in a doubly linked list. We can use node 3 to carry out both of these operations. Note 3 in a
Doubly linked list, would have a reference to node 2 and we can use this reference to modify all the unnecessary links and a singly linked list, though, which is what we have if we kept decreasing position until we’re at zero, we arrived at node 3. We can then set the new node next node property to point to node 3. We have no way of getting a reference to node to which we also need for. This reason. It’s easier to decrease position to just one when it equals.
One and stop at node 2. So in here will say,
while position is greater than 1. Now, while the position is greater than 1, will keep calling next node and reassigning, the current node. So current equal, no dot next node, and at the same time will decrement position. So position equal to position.
– 1, which you can also succinctly. Write as minus equal 1 this way. When the position equals 1, the loop exits and current will refer to the node at the position before the insert point. So, outside the while loop will say previous equal current and next equal, current.next node to make things more clear. What I’ve done here is name the node before.
The new one previous and the node after the new one. Next, all that’s left to do now is to insert the new node between previous and next. So we’ll say previous dot next node equal new and then new DOT next node equal next. Now. It seems like there’s an issue in variable naming here. And I’m most probably conflicting with some globally named next variable. So actually, go ahead and call this next.
Snowed and previous node so that we don’t mess things up here. Previous node.
So the dot next node is obviously the attribute on a node, but this is just a local variable. Let’s document this method. So up at the top. Well a doc string and you will say inserts, a new node, containing data at index position.
Insertion takes constant time.
But finding the node.
At the insertion point takes linear.
Time.
Let’s add this to the next line. There we go. And then we’ll say therefore it takes an overall linear time.
This is why even though we can easily insert a new node without having to shift the rest, ultimately adding to either the head or the tail. If you have a reference is much more efficient. We have one more operation to add to our linked list, that will make it a robust data structure. Much like inserts. Removing a note, is actually quite fast and occurs in constant time, but actually get to the node that we want to remove and modify the next connections. We need to Traverse the entire list in our worst case.
In the worst case this takes linear time. Let’s add this operation to our data structure.
There are two ways we can Define the remove method one, where we provide a key to remove as an argument and one where we provide an index. Now, in the former the key refers to the data, the node stores. So in order to remove that note, we would first need to search for data that matches. The key. I’m going to implement that first method, which will call remove, and I’ll leave it up to you to get some practice in and Implement a remove at index method to complete our data structure. So we’ll add.
After the insert method right here.
Remove is going to accept a key which will need to search for before we can remove a node earlier. We defined a search method that found a node containing data that matches a key. But we can’t use that method. As is for the implementation of remove. When we remove a node much like the insert operation. We need to modify. The next node references the node before the match needs to point to the node after the match. If we use the search method, we defined earlier.
We get the node. We want to remove as a return value, but because this is a singly linked list. We can’t obtain a reference to the previous node. Like I said, earlier, if this was a doubly linked list, we could use the search method. Since we would have a reference to that previous node, will start here by setting a local variable named current to point to the Head. Let’s also defined a variable named previous that will set To None to keep track of the previous node.
Note as we Traverse the list. Finally, let’s declare a variable named found that will set to false found is going to serve as a stopping condition for the loop. That will Define will use the loop to keep traversing the linked list. As long as found is false, meaning we haven’t found the key that we’re looking for. Once we found, it will set found to true and the loop terminates. So let’s set up our Loop. So it’s a while current and not found.
Here we are defining a while loop that contains two conditions. First. We tell the loop to keep iterating as long as current does not equal none. When current equals none. This means we’ve gone past the tail node, and the key doesn’t exist. The second condition asks the loop to keep evaluating as long as not found equals true. Now, this might be tricky because it involves a negation here right now.
And is set to false so not found not false equals true, this not operator. Flips the value. When we find the key and we set found two true, not true, not found with equal false then and the loop will stop the end in the while loop means that both conditions current being a valid node and not found equaling. True, both have to be true. If either one of them evaluates to false, then the loop will terminate
Now, inside the loop, there are three situations that we can run into first. The key matches. The current nodes data and current is still at the head of the list. This is a special case because the head doesn’t have a previous node, and it’s the only node being referenced by the list. Let’s handle this case. So we’ll say if current data double equals the key and current is self dot head, which you can write out as current equal self-taught head or current.
Is self dot head. Now, if we had this case.
Will indicate that we found the key by setting found to true.
And then this means that on the next pass, this is going to evaluate to false because not Tru will be false and then the loop terminates. Once we do that. We want to remove the current node. And since it’s the head node, all we need to do is point head to the second node in the list, which we can get by referencing. The next node, attribute on current self dot head equal, current.next node. So when we do this, there’s nothing.
Getting to that first node. So it’s automatically removed. The next scenario is when the key matches data in the