13. Open Source Robot Navigation, with Steve Macenski
In this episode, Audrow Nash speaks to Steve Macenski, who is the Open Source Robotics Engineering Lead at Samsung Research America. Steve leads the Nav2 project, which is an open source robot navigation framework. In this conversation, they talk about the problem and challenges of robot navigation, how Nav2 works at a high level, hybrid planners, and about Steve’s experience working with the Nav2 community.
- 0:00:00 - Start
- 0:00:39 - Introducing Steve and Nav2
- 0:01:07 - History of the navigation stack and Nav2
- 0:06:10 - What problems does Nav2 solve?
- 0:08:05 - How Nav2, MoveIt, and ROS Control relate
- 0:10:59 - Challenges in navigation
- 0:14:22 - Working in a grid world
- 0:24:36 - Motion primitives
- 0:40:26 - Potential fields
- 0:46:12 - Unknown maps
- 0:48:03 - Relating paths to wheel velocities
- 0:50:55 - When is path planning used?
- 0:52:57 - Hybrid planners
- 0:54:59 - Changing environments
- 0:57:11 - Behavior trees
- 1:01:42 - Behavior tree example
- 1:06:16 - Behavior trees versus alternatives
- 1:08:19 - Nav2 and SLAM
- 1:10:41 - Nav2 for different types of surface environments
- 1:13:27 - Experience with the Nav2 community
- 1:18:02 - Open Source as a business model
- 1:19:09 - Getting involved
- 1:22:29 - Career advice
The transcript is for informational purposes and is not guaranteed to be correct.
(0:00:02) Audrow Nash
This is a conversation with Steve Macenski, who is the Open Source Robotics Engineering Lead at Samsung Research America. Steve leads the Nav2 project, which is an open source robot navigation framework. In this interview, Steve and I talk about the problem and challenges in robot navigation, how Nav2 works at a high level, hybrid planners, and about Steve's experience working with the Nav2 community. This is the Sense Think Act Podcast. I'm Audrow Nash. Thank you to our founding sponsor, Open Robotics. And now here's my conversation with Steve. Steve, would you introduce yourself?
(0:00:42) Steve Macenski
Hi, yeah, I'm Steven Macenski. I'm the open source engineering lead for robotics at Samsung Research, so I'm the lead developer on the ROS 2 navigation stack, and I sit on the TSC for ROS 2 as well. And the TSC is the Technical Steering Committee, correct? Yeah. So it's like the board for ROS, let's say. It's a bunch of different companies with interests in ROS 2 that kind of come together to make sure that we steer the project in the right direction.
(0:01:07) Audrow Nash
So yeah, tell me a bit about the navigation stack, and then how that relates to Nav2.
(0:01:14) Steve Macenski
So the navigation stack was the software built at Willow Garage to do mobile base applications, primarily for the PR2 robot. But even in those early days, they understood that a lot of the same technology could be applied to other mobile platforms. At the time they made it, it was awesome, the fact that there was software out there that was free and open source, with a strong community around it, with documentation and examples and all that kind of good stuff. But you know, that was in the very early days of robotics, where basically the only robot you could buy for your home was something like an iRobot Roomba, and there were really few if any kind of service robots in application. What year do you think this was? Was this like 2010? Yeah, 2010-ish is when it started. And then there was a big push to update some algorithms, I think in the summer of 2011, 2012, when a bunch of interns came in and built new variations of the algorithms.
(0:02:12) Audrow Nash
Okay, so you had this open source navigation stack, and that was with Willow Garage. And then how did that evolve? How did we go from there?
(0:02:23) Steve Macenski
Yeah, so after Willow closed in the 2013 timeframe, all major development on the navigation stack kind of halted. You had folks like David Lu and Mike Ferguson that continued to maintain things over time, to date. So David was working at Locus Robotics and working as a consultant doing various navigation work. And Mike Ferguson was starting what was the precursor to Fetch Robotics, and then later Fetch Robotics. And so he was using some of the navigation stuff in the early days. These days, I have to assume everything's been stripped out of their systems at this point, but in the early days, to try and build that MVP to get the venture capital money, he was working on the navigation stuff as well. But then nothing really major changed from 2013, basically, until about 2019 or so, when ROS 2 started hitting kind of more mature milestones. I think it was around Bouncy when Matt Hansen, on the open source robotics team at Intel, had a big post on Discourse saying that, hey, they want to work on this ROS 2 navigation stack. They didn't want it to just be a straight port from ROS 1, they wanted to redesign it from the ground up based on the current requirements. So they were basically asking people to air their grievances about the ROS 1 stack, or what kinds of things made them swap out the stack because it didn't meet their requirements. And that Discourse post, I still today occasionally open it up and go down the list, refer to it, to make sure, like, okay, oh, I forgot about this, we still haven't addressed this point, and I'll add that to my queue. But at this point, a lot of those broad things that people said have been handled for the most part. So really, Matt and the team at Intel put the legwork in to get the ball rolling.
And when they did, that's kind of when I jumped on the bandwagon to start working on things. But that's essentially how the Nav2 project got going, and how we got a lot of the architectural elements we see today.
(0:04:20) Audrow Nash
Gotcha. So if I understand correctly, it was started at Willow Garage in 2010-ish as a general navigation library, and then ROS 1 came out and it was used with ROS 1, so the general navigation code was using ROS 1. It continued and people were using that. And then ROS 2 came out in an early version, you said Bouncy, and then you asked the community for what was wrong or what could be improved. And this is how we have come to Nav2.
(0:04:58) Steve Macenski
Yeah, I mean, that's it, except it wasn't me, that was Matt and the team at Intel. They were really the ones in the early days, when they were porting stuff over and starting to do some of the core architectural elements, that were taking the lead. I think they had like six people working on it, so there was a lot of professional software engineering time put into it in those early
(0:05:16) Audrow Nash
days. Nice, and how did you get involved?
(0:05:19) Steve Macenski
Yeah, so I'd been kind of doing open source stuff for a bit before that point. When I was on the robotics team at Simbe Robotics, I built the spatio-temporal voxel layer, as well as the SLAM Toolbox library, and I'd already done a couple of ROSCon talks on those topics. So Samsung was looking to hire somebody to work with the open source group on various open source robotics technologies, and they reached out to Matt Hansen asking who he might recommend that would be good for this. And he said that, you know, I might be a good candidate. At the time, I was leaving Simbe Robotics, and so it was a really good transition to go over to Samsung Research to work on this open source technology full time, versus just working on problems that are specific to Simbe for the end products, and then open sourcing the little bits here and there that make sense to open source. Now I can start working on open source things and make big infrastructure changes that are fully in the open.
(0:06:11) Audrow Nash
Yeah, it's awesome. It's an awesome position to be in. It's fun. So tell me, kind of at a high level, what problems does Nav2 solve?
(0:06:24) Steve Macenski
What problems does it solve? Yeah, so Nav2 is basically sitting between a bunch of motors and a motor controller that can take velocity commands to go forward, and your autonomy application where you say, I want the robot to go there. So it kind of deals with everything in the middle. The mapping and localization systems I consider to be slightly separate from navigation: I consider navigation one problem set and localization another problem set, but we do have reference implementations that work with AMCL and SLAM Toolbox for that localization element. And then on the navigation side, we say, well, now that we know where we are within some global representation of the world, and we have this global representation of the world, how do we navigate the space in order to go to our final goal? This could be, in a warehouse, the position on a shelf where you want it to go pick a box, or with your Roomba, saying, what's the little pattern I want to map out on the floor in order to clean my space? So we solve that problem of saying, how do we take some sensor data from the hardware and populate a representation of our environment, given the current state of the world? And then once we have the state of the world, how do we make a route that goes from where I am to where I want to be within that environment? And then finally, dealing with the lower-level system of, okay, how do I now follow this path that goes from where I am to where I want to be, accurately and efficiently? And there are obviously more little bells and whistles and widgets here and there to help make that either more efficient or more configurable, or deal with fault tolerances and things like that. But that'd be kind of the core problem Nav2 actually tries to solve.
(0:08:07) Audrow Nash
So it sounds like several things. How do you find a good delimiter between Nav2 and other ROS libraries? Like ROS 2 Control, for example, or MoveIt, or any of these other ones?
(0:08:27) Steve Macenski
Yeah, so I guess these are all parallel pillars. I mean, these are all by themselves really significant applications that have their own specific developer code bases and their own individual ecosystems, and they are themselves equivalently complex challenges to solve. So MoveIt, in the most simplistic terms, is arm navigation, which was actually the library name from Willow Garage that became MoveIt: arm_navigation. It solves the same kinds of problems as the mobile robot navigation that Nav2 solves, but for arms. So it's taking potentially sensor data to build an environmental representation. In the case of MoveIt, they're using OctoMap with 3D octrees. Then they're using sampling-based motion planning in order to get a rough plan from point A to point B, and using various smoothing techniques, usually optimization-based, to smooth those out to be a little bit less clunky on the motors than what the sampling-based planner would give you naturally. And then finally, they have the same kind of local trajectory controllers we have in Nav2. So it's a very analogous problem; pretty much everything that exists in one of Nav2 or MoveIt has some sort of direct analog in the other, for the most part. And then ROS Control is kind of sitting at that lower level, on the actual electronics, solving the motor control issue of saying, I have this motor and I want to control this motor to do something interesting for me. So if you have a differential drive robot, how do I take this velocity command I get from navigation and follow that with my motor efforts? Or for manipulation, which is frankly more of what ROS Control is used for, it has more advanced impedance controllers and other types of modeling to deal with the kind of manipulation problem.
For the most part, the navigation control systems at the lower level are very simple, right? This is the kind of stuff that undergraduates might do in their intro to controls lab. It's not overly complex.
(0:10:30) Audrow Nash
Like a proportional-integral-derivative, or PID, controller kind of thing, to try to get it to a position?
(0:10:39) Steve Macenski
Yeah, PIDs are pretty common. Although potentially, if you're working on a warehousing robot where you have different weights of objects on top of it, there are other slightly more advanced control schemes you might use that are based on efforts rather than just velocities.
(0:10:59) Audrow Nash
Perfect. Now, can you tell me some of the challenges in making something perform navigation? Like, what are some of the problems that Nav2 is solving? And there's some flexibility around the ways it solves them, but...
(0:11:15) Steve Macenski
Yeah, so one of the biggest challenges is just taking both the representation of your environment that you have from mapping or localization, or other previous data that you've collected and somehow assembled, and the current data you have about the environment. So you might have a 2D laser scanner on the robot, or an RGB-D camera, which provides you depth information as well as color camera information, or radars, sonars, ToF sensors. I mean, there's a large variety of different sensors that you can put on a robot, and most robot systems actually have multiple of these types of sensors on them. So the problem is dealing with, how do we take all that sort of information and populate something that is now useful to solve the navigation problem? We do that in the navigation stack with the costmap_2d package, which is essentially a direct port from ROS 1. We have some additional plugin layers we've added for more advanced capabilities, but at a high level it's the exact same package. Long term, we'd like to actually replace that with something a bit more advanced, but at the moment, that's where we're at. So that's one of the big problems. Another problem is saying, okay, well, now that we have this populated representation of my environment, with current sensor data being fused into it, how do I actually plan within that space so that my robot can achieve its goal? And there's a variety of different methods of doing that, obviously. So there are planners that work very well with holonomic robots, basically 2D grid-based techniques.
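The sensor-fusion idea here can be sketched in a few lines of Python. This is only a generic illustration of marking laser returns into a grid, under made-up names and parameters; Nav2's costmap_2d additionally raytraces to clear free space, runs plugin layers, and inflates costs around obstacles.

```python
import numpy as np

def mark_scan(costmap, robot_xy, ranges, angles, resolution, lethal=254):
    """Project 2D laser returns into a grid costmap.

    An illustration of fusing current sensor data into an environmental
    representation, not Nav2's actual implementation.
    """
    for r, a in zip(ranges, angles):
        if not np.isfinite(r):
            continue  # skip out-of-range returns
        gx = int((robot_xy[0] + r * np.cos(a)) / resolution)
        gy = int((robot_xy[1] + r * np.sin(a)) / resolution)
        if 0 <= gy < costmap.shape[0] and 0 <= gx < costmap.shape[1]:
            costmap[gy, gx] = lethal  # mark the hit cell as occupied
    return costmap
```

In a real system this would run every sensor cycle, with each sensor's returns transformed into the map frame before marking.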
(0:12:44) Audrow Nash
Holonomic. That means it's like a car or something, right? And a non-holonomic robot can drive, like, sideways and things, where a car has to go forward or backward. Is that what holonomic means, or?
(0:12:58) Steve Macenski
Yeah, so holonomic means it can move in any direction. So think about a robot with Swedish wheels, or whatever they're called, those little roller wheels. Or potentially where you can pivot the wheels themselves to be able to move sideways. Those are considered to be holonomic robots. To a first-order approximation, we also say that differential drive robots, if they're circular, are also holonomic. Because you can pivot perfectly in place as a differential drive robot, and you're circular, so your footprint isn't moving, you can achieve the same kind of behavior as with a truly holonomic robot.
(0:13:33) Audrow Nash
Because you can change the angle it's facing and then drive any way. I see. Yeah.
(0:13:37) Steve Macenski
Yeah, precisely. So for those kinds of robots, it's really easy to leverage just, you know, your usual A* or Dijkstra's algorithm that you might learn in an intro to algorithms class. Those are perfectly serviceable. You probably want to do some smoothing if you're doing a direct grid search, so that you don't have little jaggedy lines on things. And there are different ways we do that, through navigation functions, which are basically generating these potential fields of cost that you're tracing on, instead of just directly doing a grid search. But at a conceptual level, they're the same kind of class of algorithm as A*. They solve the same kinds of problems; they're giving you a path through space.
(0:14:24) Audrow Nash
Yep. And if I remember correctly, for A*, it's a heuristic search, where you say the goal is that way and I'm going to try to search in that direction. And in a grid world, you'll find the optimal solution of what's the shortest path to get there. And what was Dijkstra's? Is it a more general thing? I forget how it relates.
(0:14:55) Steve Macenski
It's just breadth-first; it just goes in every direction at the same time. There's not a heuristic that's saying, like, go in that direction. It's just going to go out and expand the whole space and find the optimal solution.
(0:15:06) Audrow Nash
Okay. And then for these, if you have a start and an end spot, it's going to search the space, and it's going to find it. And we're representing space in a grid world, or some sort of representation. And so that's why you would have these kind of jagged edges, rather than a smooth way to get there, if you go at an angle diagonal to the grid.
(0:15:29) Steve Macenski
Yeah, so basically the motion model, so to speak, that you're using is the search neighborhood. So you're saying, I can search left, right, up, down, for the four-connected neighborhood, or up, down, left, right, and then the diagonals, which is the eight-connected space. That will let you plan to go 45 degrees, but you're still restricted to exactly the angle increments of going 0, 90, or 45 degrees for your planner. So this works fairly well when you're working with just a simple base, like you might see on your Roomba. Where this kind of breaks down and becomes more of a significant issue is when you're working with larger, noncircular robots. So think about a forklift, or think about the ones Fetch Robotics sells, like a big platform robot, I don't know what they call it, it's like the Freight 1500 or something. Those are obviously very noncircular. And so even though they are differential drive robots and can pivot in place, you need to do a bit more work to make sure the plan is viable when you're in a narrow aisle way where maybe the robot can't just trivially rotate in place.
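The four- and eight-connected grid search being described can be sketched as a minimal A*. This is a textbook illustration under assumed names, not Nav2's planner; note that setting the heuristic to zero recovers the Dijkstra-style expansion that spreads uniformly in every direction, as discussed above.

```python
import heapq
import math
from itertools import count

def astar_grid(grid, start, goal):
    """Minimal 8-connected A* on a 2D occupancy grid (0 = free, 1 = obstacle).

    Moves are restricted to the 0/45/90-degree increments of the search
    neighborhood, which is why raw grid paths come out jagged and usually
    get a smoothing pass afterwards.
    """
    rows, cols = len(grid), len(grid[0])
    # 8-connected neighborhood: axis moves cost 1, diagonals cost sqrt(2)
    moves = [(-1, 0, 1.0), (1, 0, 1.0), (0, -1, 1.0), (0, 1, 1.0),
             (-1, -1, math.sqrt(2)), (-1, 1, math.sqrt(2)),
             (1, -1, math.sqrt(2)), (1, 1, math.sqrt(2))]

    def h(cell):  # admissible heuristic: straight-line distance to the goal
        return math.hypot(cell[0] - goal[0], cell[1] - goal[1])

    tie = count()  # tie-breaker so the heap never compares cells/parents
    open_set = [(h(start), next(tie), 0.0, start, None)]
    best_g = {start: 0.0}
    came_from = {}  # parent pointers; doubles as the closed set
    while open_set:
        _, _, g, cell, parent = heapq.heappop(open_set)
        if cell in came_from:
            continue  # already expanded via a cheaper route
        came_from[cell] = parent
        if cell == goal:  # walk the parent chain back to the start
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        for dr, dc, step in moves:
            nxt = (cell[0] + dr, cell[1] + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + step < best_g.get(nxt, float("inf"))):
                best_g[nxt] = g + step
                heapq.heappush(open_set,
                               (g + step + h(nxt), next(tie), g + step, nxt, cell))
    return None  # no path exists
```

The priority queue ordered by cost-so-far plus heuristic is what biases the expansion toward the goal instead of flooding outward.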
(0:16:42) Audrow Nash
Because it could swing and hit something, because it's rectangular or something. Okay.
(0:16:48) Steve Macenski
Yeah, exactly. And the other big class of problem that you have working with those techniques is when you have drivetrains that aren't trivial, that you can't treat as essentially holonomic. So you mentioned before, like cars, the Ackermann model, which is the car model we talked about. That's one example where, if you just plan a path that says go here, then go straight to the right, your car can't do that, right? That's not a feasible way of getting from point A to point B for the actual drivetrain of the robot. So you have to then take into account not just the motion model of this four-connected space, but instead use, in your search-based planner, actual motion primitives that better describe the capabilities of your drivetrain. And that's where we have some of the new algorithms that Nav2 provides in the Smac Planner, which are the Hybrid-A* planner and the state lattice planner. Those enable us to set different search patterns, so that you can create physically feasible paths that robots with constrained drivetrains can actually navigate exactly, and not treat them as holonomic.
(0:17:57) Audrow Nash
So by search patterns, you mean kind of simulating the position in space, to try to see if it's valid to go from one position to another? So like, I have a car, I'm going to try to move it from one spot to another, and I'm going to kind of project it out into the future and try to see if that path gets us there. Or what do you mean by search?
(0:18:21) Steve Macenski
Yeah, so imagine in this 2D example we said, if I'm at a cell, the cells I can feasibly traverse are either up, down, left, or right, in this four-connected space. So that would be the motion model that we're using to expand the search. Then once you go up, you apply the same thing, and you just keep doing that recursively, and eventually you get, like, a chessboard expansion, basically.
(0:18:44) Steve Macenski
So let's talk about the simpler one first. For Hybrid-A*, the name kind of breaks down what it is: it's a hybrid of A*, so you're still using this heuristic search algorithm at its core. And what's hybrid about it is the sampling. Instead of just naively saying we're going to visit each of these grid cells following the grid structure, we're going to treat our search neighborhood as primitives of actual velocity commands the robot can achieve. So for a car-like model, you can either turn left and drive a bit, go straight and drive a bit, or turn right. And those are the primitives.
(0:19:27) Audrow Nash
Turn left a little bit, turn right a little bit, go straight. And I suppose backwards as well, or whatever. Okay.
(0:19:35) Steve Macenski
Yeah, and backwards, precisely. So those are the basic ones. And then, for the sake of implementation, the length of these primitives you want to set is something interesting and not arbitrary. You're not just driving into the future a bunch, because otherwise you'd end up going in circles on the left, circles on the right, and straight forever. You have to decide, what are the lengths of these things I want to work with?
(0:19:55) Audrow Nash
Okay, and you want to chop it up, basically, so that you can find things. So you don't go all the way around that circle if you turn your wheel a little bit to the left, kind of thing. Yeah.
(0:20:05) Steve Macenski
Right, but to go back, because we're still working in this grid space, we need to make sure that we go a small amount, but it needs to be enough to guarantee that we leave our current cell. So that as we're expanding the search over time, we don't have situations where we're revisiting the same cell we're currently in. That's certainly something that can be dealt with. It's all geometry; if you read the code, there are like six lines where I go through, step by step, how to mentally derive this idea. The Hybrid-A* paper itself, from Thrun and folks from the DARPA Urban Challenge, or Grand Challenge, one of the challenges, I think allows that, and they have some sort of solution for how they select which pose to use. I instead say, I don't want to deal with that problem; I make the primitives sufficiently long that we're always guaranteed to leave a cell. So they're a minimum of square root of two cell lengths away from each other, and that way, any time a primitive hits a cell, it's always guaranteed to leave that cell in the next iteration.
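A rough sketch of what such arc primitives could look like for a car-like model, with the primitive length chosen as sqrt(2) cells so every expansion is guaranteed to exit its current cell. The function name and the curvature and cell-size values in the test are illustrative assumptions, not what Nav2's planner actually generates.

```python
import math

def car_primitives(x, y, heading, min_turn_radius, cell_size):
    """Project left / straight / right arc primitives from a pose.

    Each primitive's arc length is sqrt(2) cells, the minimum that
    guarantees the search leaves its current grid cell on every expansion.
    """
    length = math.sqrt(2.0) * cell_size  # minimum guaranteed cell exit
    poses = []
    for curvature in (1.0 / min_turn_radius, 0.0, -1.0 / min_turn_radius):
        if curvature == 0.0:  # straight segment
            nx = x + length * math.cos(heading)
            ny = y + length * math.sin(heading)
            nh = heading
        else:  # constant-curvature arc of the same arc length
            dtheta = length * curvature
            radius = 1.0 / curvature
            nx = x + radius * (math.sin(heading + dtheta) - math.sin(heading))
            ny = y - radius * (math.cos(heading + dtheta) - math.cos(heading))
            nh = (heading + dtheta) % (2.0 * math.pi)
        poses.append((nx, ny, nh))
    return poses  # [left, straight, right] end poses
```

Reverse motion would mirror these three with a negative arc length, giving the six primitives mentioned later in the conversation.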
(0:21:06) Audrow Nash
And are the cells always squares?
(0:21:11) Steve Macenski
Yeah. Well, I guess they don't have to be, but for us they are, yeah. Gotcha. They could be circles, they could be hexagons. I actually did some research, and a couple of papers I've just never written up for publication, regarding actually having an environmental model where you have hexagonal cells rather than grid cells. But that ends up having a whole lot of other issues. It's interesting to think
(0:21:38) Audrow Nash
about. Okay, so you have this grid, and you have a way of describing how your robot moves in this grid. If you have a robot that's like a car or something, you have some idea, if you tell it to drive forward and left at this angle or something, of how it goes into another cell. It sounds like a hard representation, because I'm imagining, if I turn the wheel 10 degrees to the left and I drive, I'm not in the middle of a cell. But I guess I still am in a cell, and you can use that for planning your trajectory.
(0:22:19) Steve Macenski
So with these grid-based methods, where we're essentially visiting, let's say, up, down, left, and right, you could ask: where in that cell are you visiting? Because if you take any cell and blow it up really big to look at it, it's like, well, are we visiting some random place in it, or, let's say, the center or an edge, depending on how you want to define it? With the grid methods, each cell we visit, we visit the center of. But when you're working with these motion primitives, you're right, we don't have exact angles; these primitives aren't ending exactly in the center of every single cell every single time. So we actually cache, in the planning sequence, where within that cell we've actually landed. And then when we go back and visit that cell to expand it again, we're not using the center of that cell, we're again using the position within the cell that we actually visited. Which then, if you compound that over and over again to build a path, means that it's exactly drivable in this grid
(0:23:17) Audrow Nash
space. I'm a little confused by that.
(0:23:23) Steve Macenski
Yeah, so if you were to have, for instance, a naive implementation of this, you might think about saying: let's approximate. I have this primitive that I'm projecting forward as a curve, and it ends within a cell, but not in the center of the cell. It might be natural, if you come from a 2D grid-planning mindset, to say, oh well, these cells are sufficiently small that I can just store that I've entered the cell. And then on the next iteration, when you visit that cell to expand it and get its neighbors, you could say, okay, let's just pretend we started at the center and expand again. But if you keep doing that, then the path you get at the end actually isn't respecting the constraints of the motion model, because the small errors compound with each individual primitive. So instead, when we visit a cell and we're trying to find its neighbors, we're not using an approximation of the position, like the center; we're using the real continuous position stored at those cell coordinates, so that as it compounds together, the path is guaranteed to be drivable within the kinematics of the robot system.
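The continuous-pose caching trick just described could look roughly like this: nodes are indexed by grid cell (plus an angular bin, a common addition that is an assumption here, as are all the names), but each node remembers the exact pose where its primitive landed, and expansion restarts from that pose rather than the cell center.

```python
import math

class HybridNode:
    """Search node indexed by grid cell but remembering its exact pose.

    Open/closed bookkeeping is keyed by (cell_x, cell_y, angle_bin), but
    expansion always restarts from the true continuous (x, y, heading)
    where the primitive actually landed, not the cell center, so the
    kinematic feasibility of each primitive compounds correctly along
    the whole path.
    """
    def __init__(self, x, y, heading, cell_size, angle_bins=72):
        self.x, self.y, self.heading = x, y, heading  # exact continuous pose
        self.key = (int(x // cell_size), int(y // cell_size),
                    int(heading / (2.0 * math.pi) * angle_bins) % angle_bins)

def expand(node, primitive_fn, cell_size):
    """Expand a node from its cached continuous pose, not the cell center."""
    return [HybridNode(nx, ny, nh, cell_size)
            for nx, ny, nh in primitive_fn(node.x, node.y, node.heading)]
```

The naive alternative, snapping each visited node back to its cell center before expanding, is exactly what lets the small per-primitive errors compound into an undrivable path.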
(0:24:37) Audrow Nash
Interesting. So how does it work if I want to go from one spot to another spot, and I have these motion primitives, which say I can turn, and so on? For these motion primitives, if I have a car, do you have basically infinite motion primitives, because you have the whole range of angles that the car can go? How do you go from one spot to another? Do you know what I'm asking, with motion primitives? You have the continuous range between, say, minus 45 and positive 45 degrees that the car can travel in, that the wheels could face.
(0:25:14) Steve Macenski
Yeah, so we subsample that space. Essentially, for the Hybrid-A* in the current implementation that exists today, you have a left turn, you have a right turn, you have straight, and then you have the reverse equivalents of those things. So you have either three or six motion primitives you're working with, and those are the most extreme turns, right, that your robot model can attain.
(0:25:37) Audrow Nash
And does it randomly sample in that space, or just use one of the six to expand?
(0:25:42) Steve Macenski
We expand all of them, right? So every time we visit a cell in A*, we expand every single primitive, and we set a heuristic cost on the end value of each, which goes into a priority queue sorted toward the most optimal trajectory possible. And then when we go back through and visit the next cell in that priority queue, you're getting back out the next best cell that you want to go to. If it were not a heuristic search, and you were using, say, Dijkstra's algorithm on this kind of thing, you'd expand all six, all six, all six, and just be spreading and spreading and spreading out, which is problematic. That's where your priority
(0:26:20) Audrow Nash
queue comes in. Which is very clever. Yeah, it seems like the way you'd want to do it.
(0:26:26) Steve Macenski
Yeah. That's pretty standard for A* searches; at least to my understanding, that's how most people do it.
(0:26:33) Audrow Nash
Gotcha. So you're only using these primitives of extreme angles, so it's all the way left, all the way right, or straight, if we're just doing the forward case. In this case, how do you arrive at a solution where the robot doesn't drive incredibly maniacally, going all the way left, all the way right, on its way towards its goal?
(0:27:02) Steve Macenski
Yeah, so that's where we have a smoothing step at the end of this. So we do this for
(0:27:07) Audrow Nash
the whole search, basically. This is for arriving at a solution, and then from that solution, you smooth it.
(0:27:15) Steve Macenski
Yeah. So once we're through this, we now have a path that is drivable. Maybe a little swervy, but drivable. And so what we do is we take that path and put it into a smoothing algorithm, which will take out some of those extreme turns and make them a little bit more, or I guess, less aggressive. And then it also goes through the boundary conditions to make sure that those remain valid during the entire process. So that part
(0:27:43) Audrow Nash
about boundary conditions, and making sure... yeah, what do you mean by that?
(0:27:47) Steve Macenski
So, sorry to use that word. But essentially...
(0:27:51) Audrow Nash
You mean optimality? But what do you mean, like a good solution, or... I don't know what you mean, actually.
(0:27:59) Steve Macenski
Yeah, so the Hybrid-A* paper that is published, that an autonomous driving company might use, for instance, has a pretty advanced smoother in it. And in fact, this is some of what we used to use within the Smac Planner. It takes into account keeping the spacing the same, which helps with smoothness, using a Voronoi diagram to help you drive away from obstacles, and then limiting the curvature so that you keep those constraints that you had during the planning process, and I think a couple of other terms as well. Those then spin through an optimization problem to give you an optimal solution given those cost functions. This ends up being a very computationally expensive thing to do, and it's ultimately why I removed it and instead changed up a bit of the internal systems of how the search operates, so that we don't have to use that kind of smoothing technique. What we use instead is essentially a super naive smoother, like you might find in, I don't know, one of those Udacity nanodegree kind of things. We just go through very simply and say, let's make these things as close together as possible, and then iterate until we have very little change left in our output path between
(0:29:12) Audrow Nash
These things, which — what things are close to each other? What do you mean by that?
(0:29:17) Steve Macenski
Each of the points on the path. Okay, so—
(0:29:23) Steve Macenski
An optimization like that is maybe not as intuitive to somebody who hasn't worked with it, but it's something that's generally a very successful way of smoothing: just say, for all the points on my path, I want to make sure that they're evenly distributed from each other. And when you iterate over that and try to minimize that cost function, you end up actually smoothing out the path from turn to turn, which is a convenient and easy way to smooth.
(0:29:50) Audrow Nash
I see. It's basically like, the shortest path between two things is a straight line, and so if you're making all your points closer to each other, you're probably working towards that, because otherwise they would be further apart. That is a cool heuristic to use to smooth. Yeah. Interesting. So if—
(0:30:09) Steve Macenski
you had no other terms constraining that solution, then essentially what you'd get is a straight line — it would just ignore everything. Yeah, yeah. But that's why you have to add in things like collision checks — otherwise, yeah, it would definitely run into things, for sure. Yeah. And then, because we derived a kinematically feasible path, we also want to add some weight on the original path itself. So that, like, if you have a little squiggle that could be made into a straight line, we want that to smooth out, but we don't necessarily want to take a nice smooth turn and turn it into, just, you know—
(0:30:41) Audrow Nash
An erratic driver kind of thing.
(0:30:44) Steve Macenski
Yeah, where it's no longer actually valid by the current motion model. And so, I describe this more in the actual README file, and I don't think it's worth getting into the specific details of this at the moment. But essentially, you can, through logical proofs, find that the only time you break feasibility is during sharp turns at the beginning and the end of the path. And so those are the boundary conditions — the boundary conditions totally break during this process. So if you were to try and drive the robot using just that smoothed output, both the start and the end would not be guaranteed to be drivable. So what we do is we have an additional post-processing step on the beginning and the end of it, where we use the kinematic motion model of the robot — the one we used previously during planning — to take the start pose and then sample across the new smooth path until we find the shortest path that goes from the actual starting point to a point on the smooth path, to give us a drivable connection between those two areas. And so by doing that, we end up with a smooth path that also meets the boundary conditions of drivability, and therefore the end path is—
(0:31:51) Audrow Nash
drivable. So I don't know that I fully understand that. But can you think of it as: if the start and end will be wrong because of some weird boundary conditions in the optimization, you just extend them a little bit? So you consider, like, a little before the start and a little after the end, assuming the motion primitives just continue — something like this, or?
(0:32:12) Steve Macenski
So I actually tried that — it didn't work. But yeah, I did try that. I tried a bunch of stuff before I found the solution that we currently use. So essentially — this is where it gets a little out into the weeds — within these planners, we also have a concept called analytic expansion, where occasionally during our search process, we take our search position and our goal position, and we use a library called OMPL, the Open Motion Planning Library, which is commonly used for sampling-based planning implementations. They also have a very good implementation of the Ackermann models that you can use both for sampling but also in our—
(0:32:51) Audrow Nash
Ackermann models — those are for, like, four tires with two steering ones, the front ones, yeah?
(0:32:59) Steve Macenski
Yeah, yeah, like a double bicycle model or whatever. So we can use that to then say: here's our start point, here's our end point — and the points are poses, right, they have orientation associated with them — and give me a path from where I am to where I want to go. And then we check for collisions to make sure it's actually valid. And most of the time during your planning, right, it isn't valid — it's going through a wall or something — so you throw it out and continue searching, and then maybe in another 20 iterations you try again, until you find a solution. And that helps a lot with the speed of the planner. Most of your planning time in a kinematically feasible planner is spent on just getting to that exact goal position and orientation, so if you can speed it up by having something that will find exact curve fits, that helps you quite a bit. So we apply that same concept after the smoothing process, to say: here's our starting point, which is fixed, and our end goal point, which is also fixed. And then we basically say, every — I think it's like five or ten or some number of points on the smoothed output — we find the feasible solution between those two things, and we measure the length of it, and we check the feasibility of it. And we do that — I think it's bounded, but far into the path — until we find the shortest collision-free path possible. The "shortest" point here is actually very important: if you were to try to find the feasible solution between your starting point and a point that's just directly in front of it, but shifted just a little bit away, you can get this loop-de-loop maneuver to make that feasible. You don't really want your robot doing a loop-de-loop. So we continue to march through the path a few times until essentially that loop-de-loop goes away,
and a shorter path appears than that full circular path.
(0:34:41) Audrow Nash
I'm thinking of, if you have like a piece of thread or something and you grab it in the middle somewhere with two hands, and then you move your hands close to each other, it'll do a loop. And your hands are like the points, and it has to do a loop to go through both points. That kind of thing?
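To make the "march until the loop disappears" idea concrete, here's a toy Python sketch. The `connection_length` function is a hypothetical stand-in for an OMPL Dubins/Reeds-Shepp query — it is not Nav2's actual code — with the simplifying assumption that a target closer than the minimum turning radius costs roughly a full loop, while anything further costs a near-straight connection:

```python
import math

def connection_length(start, target, r_min):
    # Toy stand-in for an OMPL Dubins/Reeds-Shepp connection query:
    # a target closer than the minimum turning radius forces (roughly)
    # a full loop-de-loop; beyond it, assume a near-straight connection.
    d = math.dist(start, target)
    return 2 * math.pi * r_min if d < r_min else d

def best_connection_index(start, smoothed_path, r_min, stride=5):
    # March along the smoothed path every `stride` points and keep the
    # candidate with the shortest feasible connection -- this is what
    # discards the loop-de-loop fits near the start.
    best_i, best_len = None, float("inf")
    for i in range(stride, len(smoothed_path), stride):
        length = connection_length(start, smoothed_path[i], r_min)
        if length < best_len:
            best_i, best_len = i, length
    return best_i
```

With points spaced 0.2 m apart and a 1 m minimum turning radius, the nearby candidates all cost a full 2πr loop, so the search settles on the first point at or beyond the turning radius — the loop "disappears" exactly as described.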
(0:34:58) Steve Macenski
Yeah, exactly. So we don't just randomly march through by, like, five points or whatever — there's a more principled approach we use based on, you know, the fact that we know the loop we'd build is exactly a circle, and therefore we can select interesting parts of the path based on that circle, the quarter circle. And yeah, that's how we constrain the boundary conditions to be reasonable. And while that sounds like a very naive approach, I will say it is very effective, and other things I did, which were maybe a bit more principled, did not work nearly as well. So this is the solution, ultimately, we selected.
(0:35:35) Audrow Nash
It strikes me that that's often the way in robotics, where it's like, this simple thing works really well, and is fast, and solves most of the problems. Like a lot of our heuristics seem to be this way.
(0:35:51) Steve Macenski
Um, yeah, yeah. Yeah. I mean, yeah.
(0:35:55) Audrow Nash
So yeah, I mean, there's obviously exceptions. But I'm just amazed how far you can get with simple representations for a lot of things. So, actually — once you have your grid world, in your grid world you use your motion primitives to kind of explore it, to get from one cell to another, and then you get this rough path that goes from start to end. And that rough path has, like — if you were driving it, crazy all-the-way-left, all-the-way-right turns — it's all jagged. Once you have that, then you run this smoothing function on it, and that's going to make it so that you have a smoother path. And the clever thing about that is you look for what makes the points the closest distance, because that is more likely the more optimal solution. Right? Is that correct?
(0:37:03) Steve Macenski
Yeah, more or less. More or less. One important caveat, though: when you get that path out from Hybrid-A*, it's not nearly as crazy as you might think, because, right, we're talking about these really small motion primitives. So if one of them has a little tweak, you barely even notice it in the path when you have something happening.
(0:37:24) Audrow Nash
It's true. But if you had a robot, it would be crazy — like, imagine being in a car that was following that: it would turn all the way left, then all the way right, to try to do something crazy. So, but—
(0:37:35) Steve Macenski
You also have a local trajectory planner that's processing that information, which will smooth it out as well. But before we get into that too much, the point I want to make there was that because they're small, they really don't make that much of an impact — they're super small, like two grid cells wide, you know, kind of a—
(0:37:52) Audrow Nash
big a grid. So we have an autonomous car — how big is a grid cell in this—
(0:37:58) Steve Macenski
application? Yeah, in that example it's five centimeters — so it'd be five to ten centimeters in size. Yeah, so it's very small. And because we're working with admissible heuristics, it's not like, here's the free space and you're wobbling all over the place — it's very much being driven towards that goal in a very reasonable way. And so that output path is actually already pretty smooth and pretty good-looking; it's mostly just small discrete planning errors that you're dealing with, because we are working in this grid space, not continuous coordinates. So that kind of feeds a little bit into the heuristic functions that we're using here. Like I said, we're able to simplify the smoother quite a bit because we made some changes internally to the planner for how the search is performed. The two ways that we approach that: first, in our traversal function, rather than just looking at the cost value and the length of the primitive, we also add penalty functions, so that we penalize the path for making those extreme turns — if it starts to turn, we penalize it to stop turning, or if it's going straight, we penalize it to start turning — just a little bit, you know, one or two percent, something very small, which ends up making things quite a bit smoother on the output. And the second thing we do is that rather than just using a distance heuristic — here's my goal, here I am, compute the Euclidean distance between those two points and use that as the heuristic function — instead we use two heuristics and take the maximum. One is the actual length of the path it would represent in Ackermann space: using that same idea we talked about with the analytic expansions and the boundary conditions, we take our current pose, our actual pose, and we find what that minimum curve would be.
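A rough sketch of those two changes, in Python. Everything here is illustrative — the penalty values, primitive names, and function signatures are made up, not Nav2's — but it shows the shape of the idea: small multiplicative penalties in the traversal cost discourage turning and direction changes, and the heuristic takes the max of several under-estimates:

```python
def traversal_cost(step_len, cell_cost, prev_prim, prim,
                   non_straight_penalty=0.01, change_penalty=0.02):
    # Base cost: primitive length plus the costmap value it crosses.
    cost = step_len + cell_cost
    # Penalize turning at all, and penalize switching primitive,
    # each by only a percent or two -- enough to prefer smooth motion.
    if prim != "straight":
        cost *= 1.0 + non_straight_penalty
    if prev_prim is not None and prim != prev_prim:
        cost *= 1.0 + change_penalty
    return cost

def heuristic(euclidean_dist, kinematic_dist, obstacle_heuristic):
    # The max of several admissible under-estimates is still admissible,
    # and the tightest of them guides the search best.
    return max(euclidean_dist, kinematic_dist, obstacle_heuristic)
```

The design point is that you never pay much for any single turn, but a path full of direction flips accumulates enough penalty that the search naturally prefers the smooth alternative.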
So we take the maximum of that value and the value of a heuristic that's new to this implementation — that I'm writing a paper about, actually — where it takes into account the obstacles within the space, the same way the Hybrid-A* paper has a similar heuristic, but it bakes that into a cost-aware environmental representation. So rather than the smoother having a cost function in it that takes into account the Voronoi diagram — which is similar to a potential field — to smooth the path away from obstacles, we have our actual heuristic function that's driving the search take the cost into account during that process.
(0:40:26) Audrow Nash
And the potential field — that's just a way of keeping things away from something else, right? It basically says, don't come close to this obstacle.
(0:40:36) Steve Macenski
Is that true, or—? Yeah, yes, essentially. So the Voronoi diagram is computing a potential field where costs are higher near obstacles and lower in the centers of, like, aisle spaces. So essentially it finds — it's a graph of the centers, and the nodes are where those spaces, you know, collide with each other. Which, if you're dealing with a static environment — if you're planning on a map that does not change over time, like you might in an autonomous driving application — makes a lot of sense: you can compute that once, super efficiently, and you know exactly where things are, and the centers don't change over time. But when you're working with an environmental representation that's changing constantly — there's a bunch of—
(0:41:19) Audrow Nash
People walking around or something, and you're driving a robot. Yeah, okay.
(0:41:23) Steve Macenski
Yeah, you have to compute that every single iteration, which is super expensive. But we already actually have potential fields that we're working with — it's already pre-computed as part of the inflation layer of the navigation stack. So we actually use the potential field computed from the inflation layer. Yeah — so we inflate from physical obstacles, like my couch behind me; we're going to add a potential function off of that, so that you have more cost the closer you come in proximity to that obstacle. And then, given the radius of your robot, if it's a circular robot, we apply a lethal band of cost in front of it, so that if the robot were centered at that cell, part of its edge might be hitting the couch. So because we have this potential field already computed, we can use that to drive the heuristic search away from the obstacles it would be passing.
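That inflation potential can be sketched as a simple function of distance to the nearest obstacle. The shape below follows the classic ROS costmap convention (lethal and inscribed constants, exponential decay), but treat it as an illustration rather than Nav2's exact code, and the default parameters as assumptions:

```python
import math

LETHAL = 254       # cell is occupied by an obstacle
INSCRIBED = 253    # a circular robot centered here definitely collides

def inflation_cost(dist_m, inscribed_radius=0.3, cost_scaling=3.0):
    # Cost as a function of distance to the nearest obstacle: lethal at
    # the obstacle, "inscribed" within the robot radius, then a smooth
    # exponential falloff -- the potential field the planner reads.
    if dist_m <= 0.0:
        return LETHAL
    if dist_m <= inscribed_radius:
        return INSCRIBED
    return int((INSCRIBED - 1) *
               math.exp(-cost_scaling * (dist_m - inscribed_radius)))
```

Raising `cost_scaling` makes the potential drop off faster, which is exactly the kind of tuning knob that trades staying in the middle of an aisle against letting the robot hug a shelf.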
(0:42:21) Audrow Nash
So because you have these — sorry, what did you call them? Was it the cost, or was it the potential fields? You have the potential fields, and you basically say: this area, because it's static, we know we don't want to go near, and that one area will be defined. That's what you're saying?
(0:42:46) Steve Macenski
Yeah, yeah — but we also apply that to new obstacles too, right? People and stuff, you still want to route around.
(0:42:52) Audrow Nash
Yeah, for sure. But it's similar — you can use this kind of computationally expensive representation for static objects, but then a different one, that's not as expensive, for moving objects.
(0:43:07) Steve Macenski
So the potential field that we use for the inflation layer is very quick to update, so it's actually not very computationally expensive at all. And that's why we use that within the planning for this application — it's not necessarily what you'd do for autonomous driving. Another reason why we do this — it's a little subtle to explain — is that in autonomous driving, in general, you want to stay in the center of the lane, or the center of the space in which you're allowed to drive, to maintain maximum distance away from everything else around you, or the center of your lane at least. But in the case of autonomous robotics, that's not always optimal. Sometimes you do want to do that, that's true. But a lot of times, maybe you want the robot to hug a shelf so that things can go in the opposite direction, if you're working in a scene with a lot of robot systems — or there are just different behaviors you may want to have. In the nav stack, we allow you to do that, essentially by tuning the distribution of that inflation layer potential field. And basically, because we're using that information in the planning itself, that means your plans are going to actually respect that tuning — the tuned system behavior you want — because the planning uses that potential field. So you're actually getting more flexibility: you could still define the system to operate that way if you want to, but you don't have to.
(0:44:29) Audrow Nash
That's cool. How does — what do you call it, inflation?
(0:44:35) Steve Macenski
So essentially, it's inflating the cost of static obstacles or dynamic obstacles in the scene. Yeah, it inflates the cost in the grid map representation.
(0:44:44) Audrow Nash
I was thinking inflation like in the economy or something, and it's like — I don't get the map. Okay, okay. It's inflating the cost — it's making it larger.
(0:44:53) Steve Macenski
Another way of thinking about it is risk, which I think is really the proper term I should be using here. It's the risk at that location, let's say.
(0:45:01) Audrow Nash
Okay. And then, so when you are planning, what you'll do is, at each location, you evaluate it for its potential — or for, like, how good the position is. And one of the things that you take into consideration, after being like, is this valid or not, is the risk. And so if you want to minimize the risk, then this is used in your priority queue that you mentioned, to sample more feasible and lower-risk locations.
(0:45:35) Steve Macenski
Yeah, and that's another cost function that can be tuned. So, right, you can say: I'm super sensitive to risk, and so I always want to stay in the center of aisles — that's really important to me. Or you can say — I mean, obviously, nothing would ever give you a path that's in collision, because that's invalid; they're always going to be valid — you can say, maybe, yeah, I'm okay if it's a little closer to obstacles, because I want to be able to, you know, hug a wall or hug a shelf or something, for instance. So these are all tunable behaviors you have access to, because we're using the cost information in the search itself, and using the inflation layer to kind of seed that based on the behavior of the robot system you want. You want to—
(0:46:12) Audrow Nash
How does this work — is this only true of known maps? So if I have a known costmap, then I can use these? Or is the ability to tune inflation or risk something that I can do as I am building a map of the environment?
(0:46:32) Steve Macenski
That's a good question. Um,
(0:46:34) Audrow Nash
maybe you have, like heuristics for features of the environment? Yeah.
(0:46:38) Steve Macenski
Yeah, I mean, both, I think. I mean, if you blindfold me and kidnap me and throw me somewhere and say, build me a map, and, like, choose some parameters — right, there is a reasonable band of parameter space that you can select, and, you know, if you're in narrow spaces, some parameters might be better than if you're in super wide-open spaces. So if I know nothing, then, right, yeah, I couldn't tell you. But usually when you deploy a robot, you have some general understanding of the kind of—
(0:47:03) Audrow Nash
working environment. So you can parameterize it based on heuristics about what you know about the space — if it's very open, or if it's very tight, you might set parameters differently, and that would change how the inflation of the costmap is generated? Yeah. Yeah. Can you tell me a bit about the priority queue? I think that's really interesting.
(0:47:27) Steve Macenski
I mean, that's just part of A*, right? So I think any resource talking about A* search would explain it more articulately than I would in the moment.
(0:47:37) Audrow Nash
Gotcha. Actually, yeah, I guess you're right that it is just a normal thing of A*, because you're only searching a subset of possible options on your way to the solution, until hopefully you reach your solution.
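That open set is exactly a priority queue. A minimal A* on a 4-connected grid — a generic textbook version, nothing like Nav2's actual planner — makes the role of the queue clear: pop the node with the lowest cost-so-far plus heuristic, expand it, push its neighbors:

```python
import heapq

def a_star(grid, start, goal):
    # grid[x][y] == 0 means free; returns a list of cells or None.
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan
    open_set = [(h(start), 0, start, [start])]  # (f, g, cell, path)
    visited = set()
    while open_set:
        _, g, cell, path = heapq.heappop(open_set)  # lowest f first
        if cell == goal:
            return path
        if cell in visited:
            continue
        visited.add(cell)
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0 and nxt not in visited):
                heapq.heappush(open_set,
                               (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None
```

The heuristic-guided queue is what makes the search explore only a subset of the grid: nodes pointing away from the goal sink down the queue and often never get popped at all.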
(0:47:51) Steve Macenski
Actually, there are other ways of implementing it as well. That's just, at least to me, my knee-jerk reaction: I have to sort by lowest cost — ergo, priority queue. So easy. But—
(0:48:03) Audrow Nash
So then, once you have this path that started as kind of coarse — but I guess the grid cells are so small that it's not that coarse, or not that jerky — you perform smoothing on it, and then you map it to velocities. Do you get this kind of for free with the motion primitives? Or how do you relate your path to the velocities that each of the wheels should go?
(0:48:35) Steve Macenski
Yeah, so that's where the local trajectory planning — also referred to as the controllers within the Nav2 ecosystem — kind of operates. So we use what's referred to as the hybrid planning scheme, which MoveIt has recently started to migrate over to as well, but this has been a built-in feature of the navigation stack since the very beginning. We separate the concern of building a route from where we are to where we want to go from the process of saying, okay, how do we follow that path? And the follow-the-path part is where we take that path that currently exists and we say, how do I generate a trajectory? — where the difference between a path and a trajectory is the velocity and/or acceleration and/or jerk and/or other derivative information about the dynamics involved.
(0:49:22) Audrow Nash
That's interesting. So I wonder — if you were traveling really fast, might that change the path that you were able to take? Things like this. Is it always fair that you can divorce the two?
(0:49:38) Steve Macenski
I mean, can you always know anything all the time? No. So yeah, there are going to be some weird edge cases. But generally speaking, for a reasonable mobile robot system, that's going to be a valid way of operating. While I'm not an expert in the autonomous driving space, my understanding of what they do is primarily large-scale route planning. So rather than having a path like we have when doing the grid search, they have a sparse route saying, go to this intersection, turn left, then go straight for a long time, then turn right, whatever. And these are basically, like, event markers for when to change what you're doing. And you have your local trajectory planner, which is computing, like, how do I stay in my lane, or how do I pass a car, so I can meet that route at a given speed. So there is no, like—
(0:50:29) Audrow Nash
path planning, per se?
(0:50:33) Steve Macenski
Yeah, exactly. So in that situation, you basically only have the velocity planner — that local planner — you don't have the global path planner at all. So it doesn't always even make sense to have the concept of a long-term path at all; there are situations, especially if you're doing strict waypoint following, where you start—
(0:50:54) Audrow Nash
So let's see — I want to get to that in a little bit, because that's interesting. I guess: when is path planning most useful? Like, when do you really need it, rather than just kind of waypoint planning? I guess any time you're using waypoints, you have to get from waypoint to waypoint, and so then you have to do path planning? Or how do you think of it?
(0:51:19) Steve Macenski
It's a good question. I guess I haven't thought about it in those terms before. I think it's usually for certain classes of application. If you have, like, a last-mile delivery application, which is somewhat analogous to an autonomous driving system, where you have just an insanely large environment, it just doesn't really make sense to dynamically plan from point A to point B that's, like, seventy miles away or whatever — seventy is a little crazy, let's say seven miles away — going on sidewalks. So you might want to do something where you have some, you know, Google Maps data or OpenStreetMap data to say, here are the intersections, and you're going through a sparse route based on intersection information. And then you're basically just locally saying, go towards the intersection, stay on the sidewalk, dynamically route around things — and you don't really need a dense path between those two spaces. But then if you're working in, like, a warehouse space, there are two kinds of things you could do: you could have waypoints on critical points at the ends of aisle-ways or whatever and have a sparse route going between them, but you could also do a dense path planning methodology, to be able to dynamically go around a lot of areas. And I think it just kind of depends on what kind of robot behavior you're looking for — largely static, staying in these lanes, staying in straight lines, or being more dynamic and able to navigate free space more flexibly.
(0:52:57) Audrow Nash
Gotcha. And then, going back — so you separate out the planning of the path from planning the velocity commands you were talking about.
(0:53:10) Steve Macenski
Yeah, so this is all part of what's referred to as hybrid planning, which is this technique of splitting these things apart. So within the nav stack, we have several different options for local trajectory planners, which take that path planner output — just the kinematic plan of, here are some points I want you to follow — and convert that into velocity commands for the motors. The simplest to explain is probably the Regulated Pure Pursuit controller that we currently have, which is new to Nav2 — it wasn't available in the last one. And what that'll do, essentially, is pick a point on the path that's at least a certain distance away, maybe up to a maximum distance away — it can dynamically change depending on the robot's speed, for stability reasons. So it picks some point on the path and says: what is the curvature required for me, if I were to pick one velocity command, to hit that point exactly on the dot? And from that curvature and your current position, you can then back out what velocities you need to apply in order to achieve that command. And so you say, okay, robot, drive in this direction towards that point on the path. And then the next iteration — let's say one one-hundredth of a second later, or one-twentieth of a second later — you say, okay, well now what point on the path am I interested in? And you do the exact same process over and over again. And as you're driving forward towards that goal, that point is constantly changing on you, so you're essentially following this carrot over time. So that's one very simplistic way to convert your path into velocity commands, for your lower-level electronic motor controllers to then use a PID controller or whatever to convert into voltages or currents.
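The core of pure pursuit fits in a few lines — a generic textbook sketch, not the Regulated Pure Pursuit source. Transform the lookahead point into the robot frame; the curvature of the circular arc through it is 2y/L², which maps directly to an angular velocity for a chosen linear velocity:

```python
import math

def pure_pursuit_cmd(robot_pose, lookahead_pt, linear_vel):
    # robot_pose = (x, y, heading); lookahead_pt = (x, y) on the path.
    x, y, theta = robot_pose
    dx = lookahead_pt[0] - x
    dy = lookahead_pt[1] - y
    # lateral offset of the lookahead point in the robot's frame
    lateral = -math.sin(theta) * dx + math.cos(theta) * dy
    # curvature of the circular arc through the lookahead point: 2y / L^2
    curvature = 2.0 * lateral / (dx * dx + dy * dy)
    return linear_vel, linear_vel * curvature  # (v, omega) command
```

A point dead ahead yields zero angular velocity; a point off to the left yields a positive turn rate. Recomputing this every control cycle against a fresh lookahead point is exactly the carrot-following loop described above.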
(0:54:55) Audrow Nash
Gotcha. So how does it work — we talked a little bit about potential functions for the inflation, but how does it work with a bunch of people walking around, or constantly changing environments? Do you just recompute all this? How do you think of it?
(0:55:26) Steve Macenski
Yeah, so the sensor data is really important, so that you have information about what has now changed in the world. And once you have that information, you can input it into the costmap. This might be from sensors like sonars, or RGB-D cameras that provide you depth maps — which give you points as well as color information — or 2D lidars. For the vast majority of things, Nav2 operates out of the box with primarily depth values: it wants to know where a thing is relative to a sensor frame. With that said, it's not hard to integrate modern AI techniques in as well — we have plugin interfaces, so this is all possible. But once we have this dense information, we input it into our grid map representation. And when we say that we had a sensor reading hit a person that is walking, we can say, okay, there's something here, and put that into the grid map as a lethal obstacle. And then we can inflate our costmap with that inflation layer we talked about, which adds that potential function to it. So the next cycle, when we want to compute a path, that obstacle is now represented in our environment and the plan can navigate around it. Then most of these algorithms have clearing mechanisms, so that once the obstacle is gone from the scene, we can clear it out: we use raycasting from the sensor to a clearing measurement — basically a measurement of nothing, right? If you don't see it anymore, it's therefore gone. So you can raycast from the sensor frame to that point and clear out any cells in the middle — clearly these are going to be unoccupied; if you're able to view something much further away, everything between you and that measurement is clear. Which then removes that lethal obstacle, which then removes that inflation potential. So the next time you plan, it updates, and those things are gone.
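The clearing step can be sketched with Bresenham's line algorithm — again a generic sketch, not the actual costmap code. Every cell between the sensor and the measured hit must have been seen through, so it gets marked free, while the hit cell itself stays occupied:

```python
def bresenham(p0, p1):
    # Integer grid cells along the ray from p0 to p1, inclusive.
    x0, y0 = p0
    x1, y1 = p1
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    cells = []
    while True:
        cells.append((x0, y0))
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return cells

def clear_along_ray(grid, sensor_cell, hit_cell, free=0):
    # Everything strictly between the sensor and the hit was seen
    # through, so mark it free; the hit cell itself stays as measured.
    for cx, cy in bresenham(sensor_cell, hit_cell)[:-1]:
        grid[cx][cy] = free
```

Running one of these per beam, each sensor update, is what lets a stale obstacle — the person who already walked away — disappear from the costmap on the next planning cycle.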
(0:57:10) Audrow Nash
Gotcha. So is the general way that you run this: you run it over and over again, and you're constantly planning and replanning?
(0:57:20) Steve Macenski
Yeah — you don't have to, but that is the default behavior currently. Actually, there are some pull requests open, and I'm working with some folks to refine the behavior additionally, to remove some oscillation behaviors that can sometimes exist in that situation. But essentially, yeah, every one second or so we're computing a new path given the new representation of the environment. And this is where the way that Nav2 is structured differently from the navigation stack starts to really be helpful: we're using behavior trees to orchestrate these different capabilities. The environmental representation, the planner, the behaviors, the local trajectory controllers — these things are all controlled by a central system to control the flow of data, so that we can manipulate how the navigation system is operating. And this is done through what's referred to as a behavior tree. I definitely recommend reading more about that beyond what I'm going to describe here.
(0:58:21) Audrow Nash
Like, yeah, imagine.
(0:58:24) Steve Macenski
Yeah, so imagine you had a tree-like structure, like you might find in a computer science course. And if you don't have a computer science background, I'm sorry — go read a book about it; I can't go into any less detail than that.
(0:58:39) Audrow Nash
I've heard good things about these — like, video games often use them. And the behavior seems kind of like fallback conditions. So like, if you want to get into a house, a behavior tree might say: first try the door, and if that fails, you fall back to, do I have keys? And if that fails, then you see if you can knock, and if that fails — no one comes — then you break the window. This kind of thing, where it's fall back, fall back, fall back, all structured by the I-want-to-get-into-the-house thing?
(0:59:17) Steve Macenski
Yeah, that's one of the primitives you work with. Fallback nodes and sequence nodes are the two big control-flow types; you can also define other types that are more use-case specific. But essentially, this is all modeled as a giant tree. You have a root node, which has different branches, which then have their own branches, with different methodologies for controlling the nodes' execution — these fallback or sequence control nodes — until you reach your leaf nodes, which are conditions or actions that go do something valuable. And so all the navigation behavior is defined through these behavior trees, which can then be reconfigured to do different things for you. So if you don't want to replan every second, you can replan every 10 seconds. If you don't want to replan ever, you can set it up so that you use only one static path. If you'd like to replan only when the current path isn't valid, you can do that as well. And these things are all really easy to implement — they're all through plugin interfaces, so you don't even need to fork Nav2; you just need to implement your own custom plugins, run your own custom behavior tree model, and you can still use the available binaries from apt. We try to make it as simple as possible. We offer significant documentation about this, because we understand this is going to be a new concept for users: what a behavior tree is and why you should care; what nodes we give you access to and why you should care; what example behavior trees we offer and why you should care. 
And then what the default behavior is, walking you through exactly how the execution model of that behavior tree works on a tick-by-tick basis, so you can have a reasonable understanding of how the current behavior acts.
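To make the two control-flow types concrete, here is a minimal sketch of sequence and fallback tick semantics in Python. This is illustrative only — all names here are hypothetical, and Nav2 itself builds on the C++ BehaviorTree.CPP library rather than anything like this code:

```python
# Minimal sketch of behavior-tree control-flow semantics (hypothetical names;
# Nav2 itself uses the C++ BehaviorTree.CPP library, not this code).
SUCCESS, FAILURE = "SUCCESS", "FAILURE"

class Sequence:
    """Tick children in order; fail fast on the first child that fails."""
    def __init__(self, *children):
        self.children = children

    def tick(self):
        for child in self.children:
            if child.tick() == FAILURE:
                return FAILURE
        return SUCCESS

class Fallback:
    """Tick children in order; succeed on the first child that succeeds."""
    def __init__(self, *children):
        self.children = children

    def tick(self):
        for child in self.children:
            if child.tick() == SUCCESS:
                return SUCCESS
        return FAILURE

class Action:
    """Leaf node wrapping a function that returns SUCCESS or FAILURE."""
    def __init__(self, fn):
        self.fn = fn

    def tick(self):
        return self.fn()

# The "get into the house" example as a fallback: try the door, then keys,
# then knock; later children are only ticked if earlier ones fail.
tree = Fallback(
    Action(lambda: FAILURE),   # try the door: locked
    Action(lambda: SUCCESS),   # use keys: works
    Action(lambda: FAILURE),   # knock: never reached
)
print(tree.tick())  # SUCCESS
```

The asymmetry is the whole idea: a sequence encodes "do all of these in order," a fallback encodes "try these until one works."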
(1:01:04) Audrow Nash
And so this helps your program figure out what it should do, and when, by defining certain events? Or how does it — if I have, say, a simple example of a robot getting from one position to another, and a human gets in the way or something, how would that look in behavior-tree space?
(1:01:31) Steve Macenski
So we don't have the environmental data set up in the behavior tree interface, so that really wouldn't be important for how the current navigation behavior works. What would be a good example... if we can trace through what a behavior tree does for some simple case, that would be a good example. Let me pull up an actual one rather than just typing stuff up on the fly — something relatively simple. Okay, that's not simple. Let's do distance. So, for instance: in behavior-tree land, a sequence says do this task; if successful, do the next task; if successful, do the next task. A fallback says do this task; if it fails, fall back to the next one; if that fails, fall back to the next one, and so on — across all the children nodes under that sequence or fallback control flow. So they might be used to, very simply, replan every second, or every time you travel a certain distance, for instance. So you might have a sequence node that says, okay, go to my first child, and my first child would be a decorator node called a distance controller. A decorator node is something that has one child and defines when that child is ticked. For this instance, it would be checking whether we've moved one meter. Another example would be whether the time has advanced by one second, or my speed is 10 meters per second — any sort of condition you want to set to define whether or not you want to tick this child. So we have our sequence node, we go to our decorator, which says: do we actually want to compute a path, yes or no? 
It says yes; we go down to that compute-path behavior tree node, and that's going to put the path onto what's referred to as the blackboard, which is essentially a parameter system shared across all the nodes in the behavior tree, so that any node can input, change, or modify parameters for the other nodes. You're not supposed to use it for state with behavior trees — the state should be defined by the tree itself. It should be mostly parameters or dynamic values that you need.
(1:04:04) Audrow Nash
Or could it be, like, "we've traveled one meter" — would that be something you put on the blackboard?
(1:04:14) Steve Macenski
Yeah, as long as it's not state about the execution of the behavior tree. Like, you shouldn't have a node checking the blackboard for a one or a zero for whether to execute. You could do that — I mean, nothing's stopping you — but you shouldn't. It should be for values computed separately; the way to think about it with behavior trees is as parameters or dynamic values.
(1:04:37) Audrow Nash
What would be an example of something you could put there, if you needed to?
(1:04:42) Steve Macenski
Yeah, so the example here is the path. Once we're in our sequence, we go to our decorator and compute the path, and the way we get that computed path to the next node of the tree — the "follow path" node — is through the blackboard. These nodes can't directly talk to each other; they can only pass values. So we have our compute-path node update the blackboard variable for the path. Then we go back up the tree to our sequence, which goes down to our follow-path node; it checks and says, hey, there's been an update to this path blackboard variable, I'm going to hand that to my local trajectory planner, which computes the control effort required to execute it. And what's nice about the decorator node is that if it fails — it says no, we've not moved one meter — it fails and goes back up to the sequence, and the sequence node only continues down to further children when the previous one succeeded; otherwise it exits and says, okay, I'm done. So it's not going to try to execute the follow-path node, because —
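The flow Steve traces here — a distance-based decorator gating the planner, which posts its path to a shared blackboard that the follow-path node reads — can be sketched roughly like this. All names are hypothetical, and this is a simplification of the described behavior, not the real Nav2 or BehaviorTree.CPP API:

```python
# Sketch of the walkthrough above: a decorator gates replanning, and nodes
# exchange the path through a shared "blackboard" dict (hypothetical names;
# not the real Nav2/BehaviorTree.CPP API).
import math

SUCCESS, FAILURE = "SUCCESS", "FAILURE"
blackboard = {"pose": (0.0, 0.0), "last_plan_pose": None, "path": None}

def distance_controller():
    """Decorator condition: tick the child only after moving >= 1 m."""
    last = blackboard["last_plan_pose"]
    if last is None:
        return True
    return math.dist(blackboard["pose"], last) >= 1.0

def compute_path():
    """Decorator's child: plan and post the path to the blackboard."""
    blackboard["path"] = [blackboard["pose"], (5.0, 5.0)]  # stand-in planner
    blackboard["last_plan_pose"] = blackboard["pose"]
    return SUCCESS

def follow_path():
    """Reads the blackboard path; would hand it to the local controller."""
    return SUCCESS if blackboard["path"] is not None else FAILURE

def tick_once():
    """One tick of the sequence: decorator -> compute path -> follow path."""
    if not distance_controller():
        return FAILURE  # decorator fails: sequence exits, no follow path
    if compute_path() == FAILURE:
        return FAILURE
    return follow_path()

tick_once()                      # first tick: plans (no previous plan pose)
blackboard["pose"] = (0.5, 0.0)  # moved only 0.5 m ...
tick_once()                      # ... so the decorator skips replanning
print(blackboard["last_plan_pose"])  # (0.0, 0.0) — path not recomputed
```

The key point the sketch illustrates is that the planner and controller never call each other: the path travels only through the shared blackboard, and the decorator alone decides when replanning happens.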
(1:05:49) Audrow Nash
— so it doesn't try anything further?
(1:05:51) Steve Macenski
Yeah, as a result. So in a very simple behavior tree with, you know, three or four nodes, essentially we just did dynamic replanning of a path and then handed it to a local trajectory controller to execute. That is, at the core level, the most simplistic way of representing dynamic navigation.
(1:06:16) Audrow Nash
How does this compare to, like, a state machine way of representing this information? I assume there are advantages to choosing a behavior tree to represent these transitions.
(1:06:32) Steve Macenski
It's preference more than anything else. They're both execution models you can use to define how you'd like your system to behave, and there's not necessarily a benefit of one over the other — it's more about preference, and maybe certain applications are simpler to model one way versus another. For instance, move_base was a state machine back in ROS 1, statically set — it wasn't using some sort of external, reconfigurable state machine library; all that good stuff was hard-coded: here's my state machine. And that works pretty well, and it's pretty easy to define simple state machines to do something like this type of replanning. Essentially, what I just described was the move_base node, except we're not replanning every meter, we're replanning every second — or I think it's configurable, so you can do it less often — but essentially that's the exact same thing. What we just described as a behavior tree, though, is an XML file, which is loaded one time, dynamically allocates the nodes, and then executes — so you can change your behavior over time. There are other state machine libraries that could do those exact same things, and almost anything you can model as a behavior tree you can also model as a hierarchical or finite state machine, and back and forth. But we just prefer, in general, working with behavior trees. We actually ran a poll on LinkedIn last year asking users what they prefer to work with, and 70% of the people said they'd rather work with behavior trees — so that's why we invested in them. But certainly there are other options, and they're all valid under the sun; it just depends on whatever makes it work for you.
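For concreteness, an abbreviated behavior tree in the XML form Steve mentions: this is modeled on the simple replanning tree in Nav2's documentation (a rate-gated planner feeding a controller through blackboard variables like `{path}`), but the exact node names and structure may differ between Nav2 versions — check navigation.ros.org for the current default:

```xml
<root main_tree_to_execute="MainTree">
  <BehaviorTree ID="MainTree">
    <PipelineSequence name="NavigateWithReplanning">
      <RateController hz="1.0">
        <!-- Writes the planned path to the {path} blackboard variable -->
        <ComputePathToPose goal="{goal}" path="{path}" planner_id="GridBased"/>
      </RateController>
      <!-- Reads {path} from the blackboard and executes it -->
      <FollowPath path="{path}" controller_id="FollowPath"/>
    </PipelineSequence>
  </BehaviorTree>
</root>
```

Because the tree is plain XML loaded at startup, changing the replanning policy (e.g. the `hz` rate, or swapping the decorator) is a configuration change rather than a code change.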
(1:08:00) Audrow Nash
Okay, so Nav2 — there are a lot of parts to this. So it goes from — and actually, we haven't talked about the cost map that you can generate through this. So for your initial mapping — kind of more SLAM-like, actually creating a map while you're driving around — Nav2 does this as well?
(1:08:34) Steve Macenski
Yeah — the SLAM library we use is not actually built into the navigation stack. If you go to github.com/ros-planning/navigation2, there's no SLAM implementation there. But we have a strong integration with a library called SLAM Toolbox, which I largely built while at Simbe Robotics and then continued to refine at Samsung Research in my first few months there. That's what we primarily use for generating maps — all our tutorials are based off of it, and there are binaries released for it. So from an end user's perspective, it basically is part of the application stack, but it's sufficiently complex that it shouldn't live in the same place — I think SLAM Toolbox has almost as much code as half of Nav2 by itself. It's very complicated; there's a lot going on. So it's worth keeping these projects separated. I'd also actually like to remove AMCL from the stack as well — the adaptive Monte Carlo localization system, which is the methodology for localization in an existing map. AMCL is the very common probabilistic-robotics implementation of this algorithm. It works on 2D maps, these 2D occupancy grids we were discussing earlier. I want to remove that from the stack into a potentially different repository, because I want to get the point across that Nav2 is not specific to 2D navigation and does not require 2D localization and mapping systems. If you want to work with a 3D map and a 3D position, that's totally valid; if you want to work with a visual SLAM system or visual localization, that's totally valid. I think even ROS 1 navigation got a bad rap about that, because AMCL was included in the stack and there weren't examples of other localization systems within the stack to showcase that you can utilize different methodologies. So I'd like to remove that. 
And I'd also like to add support for 3D positioning methodologies, so we get the point across that you can use really anything you want — this is just what we provide by default, because it works well and it's what most users are running.
(1:10:41) Audrow Nash
Yeah. So if you had, say, a quadrotor or some sort of flying robot, you could use Nav2 just as well, because of the plugin architecture? Or —
(1:10:53) Steve Macenski
Um, maybe — I don't know anyone using it that way. This is mostly ground robot systems: mobile robots, surface marine applications, that kind of stuff. It's meant for working in a surface environment. Our environmental models don't really work with three-dimensional spaces that have large height variations. So if you have a drone flying very high, or a submersible robot going very low, you don't currently have the capability of modeling those effectively. It's not that the planners couldn't be made to work in three dimensions; it's just not the current niche we're targeting.
(1:11:35) Audrow Nash
So I'm a little confused about the last point, then — you're saying you want to remove AMCL? I don't quite understand, because I was assuming this was primarily for 2D planning, so if you want to move on the ground or something like this. Are you saying, like, with different floors, or?
(1:12:00) Steve Macenski
Yeah, so these are two different topics. One is: how do we represent our space? Right now we do this through a 2D planar costmap representation of the environment — a 2D plane. But you can add layers to represent things like height changes, so you can work on other terrain models. With that said, it's not really optimal, and longer term we want to be replacing or adapting costmap_2d to incorporate height information as well as risk information as well as AI-detection-based methods, so that you can work with all these kinds of spaces. But at the end of the day, the Nav2 system is serving the niche of surface robots — on a ground surface, on a lake top — but generally speaking, working on a surface. Then there's the other problem: localization. Right now we do primarily 2D localization using a 2D laser scanner, which takes a slice of your environment — you don't see a couch, you see a straight line where the edge of the couch is. So you're throwing out a lot of information: the texture of the couch, which might have some interesting features, and the shape of the couch outside of that 2D slice of the world. So we want to build support for localization systems that can take advantage of more contextual information about the environment than just 2D laser scans in a 2D, kind of flat-Earth world.
(1:13:26) Audrow Nash
Gotcha. Cool. I want to segue a little bit. How has it been for you, being so involved in the community? Can you talk a bit about your experience working with the Nav2 community — growing it, working with developers, and running an open source project?
(1:13:48) Steve Macenski
Yeah, it's definitely been different. It's a different challenge to work with people who aren't paid to work for you, right? These aren't my colleagues, where we're all being paid by the same company to accomplish a particular task, and I'm not managing people where I can just say: here's what you must do, because we must get this thing done; here's the project we're working on this month. It has to be collaborative — you're building coalitions with companies and collaborations with individual people to get things done. And there has to be a lot more mentorship going on. So I'm not just assigning projects, but talking to people about what they're interested in learning, where they are right now, where they want to be going, what kinds of things interest and engage them — and trying to carve out individual projects, or small sections of larger projects, that they can work on that have those personal growth elements but also contribute back to the project at large. There's a lot of mentorship that goes on, which would hopefully also happen in a corporate environment, but it's a little different because no one's being paid to work on this. These are all volunteers working in their off time — or maybe some companies give engineers, you know, four hours every other Friday to play around with things, write documentation, or help out in various ways. So it's a lot more collaborative, building coalitions of multiple companies to work on larger projects together. 
Since these are worked on primarily on a volunteer, part-time basis, when we take on some of these larger projects it's often either just me working on something myself, or me building coalitions of, say, a couple of students, a representative from company A, and a senior engineer at company B, each doing their own kinds of things — with me popping in to help resolve disputes, help with design decisions, and implement some sections myself so we can continue moving forward. The Smac Planner is a great example of a coalition project. I put out the bones of that largely myself after many, many months of work; it only contained the Hybrid-A* planner, and it had some bugs in it and all that kind of stuff. Over time, companies started using it; a couple of companies went through and really started benchmarking performance, helped me find some areas to improve, and made some pretty substantial improvements that we're currently working through merging at the moment. And then I've also worked with a couple of students where, instead of just having this Hybrid-A* planner, I also wanted to add a state lattice planner, so we could model arbitrary motion primitives and not just Ackermann models. I found a couple of folks who were really interested in this as well, and they're helping me build out file formats for caching information, and generating the motion primitive files offline that define the actual primitives of that search pattern beyond Ackermann models. We have folks working on improving the analytic expansions to be more efficient and to deal with cost information more effectively. A lot of folks are working on a lot of different things that end up coming together to build something new and interesting. 
So yeah, a lot of my time is spent coalition building — getting new people involved in the project, finding out what excites them, and figuring out where I can place them in the project in a way that's helpful for the direction we're going. Or even potentially changing the direction we're going: if there's somebody who can spend a lot of time on something, or someone very senior who wants to see a certain thing happen, sometimes I'll shift my direction to help support them doing that thing. A lot of development in the Nav2 stack is opportunity-based — there's an opportunity to integrate something new, or somebody has time they want to spend working on a thing. Sometimes I'll drop what I'm doing and go work over there, because if I can get some help in some area, I'm not going to say no to help.
(1:17:44) Audrow Nash
It's interesting to hear that it shifts to community building and coalition building. And you mentioned it's a little different, not being able to easily give people deadlines and things. What do you think of open source as, like, a business model?
(1:18:13) Steve Macenski
You know, I honestly don't think about that, so I couldn't really give you too much. I mean, I know folks like PickNik are doing this, Open Robotics is doing this, and some of the folks I've chatted with are doing this as well, and I think that's a good direction for bringing money into a project to fund development in a particular area. But I really like being able to work on the hard problems that are necessary — because they're gaps and they're needed — and not necessarily just working on whoever's going to throw money in my face and say, go do this thing. Sometimes it's hard to get funding for the things that are really important; you're at Open Robotics, so you must understand that with the build farm and documentation and things like that — it's hard to get people to throw money at the important things, the big gaps that are felt. So maybe down the line there's an opportunity for Nav2 to have a similar kind of open source business model to get that funded, but that's not something I'm currently thinking too much about, have executed on, or have any plans for.
(1:19:07) Audrow Nash
Well, I'd love your thoughts on that anyway. Let's see — how can someone get involved if they want to? Say someone is interested in learning more, and in the possibility of mentorship. What should they do?
(1:19:24) Steve Macenski
Yeah, the first place to go is navigation.ros.org — that's our documentation website. I've spent an inordinate amount of time on it. (Audrow: It's a very good documentation site — I'm very impressed.) Yeah, it's a lot of nickel-and-diming — you know, a page here and there, one thing every Friday kind of thing. But if you go there, you can find our tutorials and our getting-started guides for going from absolutely nothing to installing ROS, getting the binaries, and doing your first demonstrations in RViz. That's a good place to get started with the actual Nav2 stack itself. If you want to start contributing, we have a community Slack — I can give you the URL to include in the description of this podcast — and that's a great place to reach out, either for assistance, to ask questions, or if you want to get involved and see where you can help. And then, generally, just look at our project issues. Our issue tracker is very active; it's where everything being worked on is discussed openly, and it's also searchable later on. But don't restrict yourself to those things. If there are other features you see missing, or gaps you want filled for your product or your research or whatever, file a ticket — let's talk about it. I'm definitely not omniscient; I don't know what your product needs that we're not servicing, and I want to understand those gaps, because I want to build this to be not just a navigation system, but the navigation system. I want this to be the standard library for robot navigation, so that it doesn't make sense to use anything else — because it's feature complete and has so many options and plugin interfaces. And I think we're definitely getting there.
So yeah, if there's anything missing, I want to hear about it.
(1:21:05) Audrow Nash
And just going back to mentoring and things like this — if someone wants to contribute, should they already be a roboticist? What level of skills should someone have if they want to contribute?
(1:21:25) Steve Macenski
Yeah, it kind of depends on what you're interested in working on. If you're okay with doing some smaller, discrete tasks, then basically no experience is required. Oftentimes, folks who reach out start by working on things like testing frameworks, helping me out with CI issues, and these kinds of things. Then over time, as you build some confidence in the concepts and in the code base, we can build those into larger and larger projects that are maybe more substantial and rewarding. In general, having some ROS background — especially a ROS 2 background — is always helpful, but I understand that for a lot of folks this will be their first experience working with ROS 2, so we're always willing to help work through those problems. So really, not much is required. But if you are more experienced and have more background in these concepts, you can certainly make a much bigger impact much more quickly — because you can understand the gaps, where they're important for you and your research, and where they sit in Nav2's roadmap, and you can help fill in those gaps with some of the more substantial projects.
(1:22:28) Audrow Nash
And what advice do you have for someone who's at the beginning of their career?
(1:22:39) Steve Macenski
Yeah, I mean, just dive in. I think that's the biggest part — don't wait for someone to teach you something. Like, my background is in aerospace engineering; I wrote zero Python for any coursework during all of college. I learned embedded C when I was in high school, just because my dad is a software engineer and we did some projects together. At my first job, at Simbe, like day one, it was the first time I'd ever really written C++ code — on the plane to San Francisco I was reading the C++ standard manual to learn the semantics, since it had been a while since I'd worked with C. So just dive in headfirst and start breaking stuff; look at errors, get confused. If you're willing to be excited and interested and to spend time doing it, you can go from zero to expert fairly quickly. It just requires grit and the desire to learn.
(1:23:37) Audrow Nash
And a kind of big-picture question: where do you think robotics is headed in the next two to five years?
(1:23:52) Steve Macenski
A continuation of what's happening right now, I think. A lot of the warehouse companies are starting to go from medium scale to really large-scale deployments, and I think that trend will continue. I think delivery robots will be a lot less teleoperated and a bit more autonomous, hopefully — but I think there's still a long way to go there. There's a technology gap there that exists.
(1:24:19) Audrow Nash
What is it? What is the gap today?
(1:24:26) Steve Macenski
Going from, like, your first semantic segmentation and detection models to something that's super robust and works in all situations — I think that's a big part of it. There are a lot of folks working with just out-of-the-box demos of TensorFlow, and they work well enough, but not well enough that you could actually just let a robot go do its thing — especially in the outdoor space, where you have cars and everything else around it. There's a big gap there, and the small number of companies that can get over it do so because they had millions and millions to spend to hire all the right people and collect huge training datasets and all that kind of stuff — and that's not something everyone can do. And I hope we see a lot more mobile manipulators starting to reach the market and actually accomplishing useful tasks. Right now there's still that promise from Willow Garage — folks have had products, but they didn't really go anywhere outside the research space. I'd really love to see those start taking off. That would get me excited to go back into the manipulation space, if I started seeing that really take off again.
(1:25:41) Audrow Nash
All right, awesome. Thank you. — Thanks for listening to this conversation with Steve Macenski. Thank you again to our founding sponsor, Open Robotics. See you next time.