Solver¶
Base Trajectory Solver (abstract)¶
-
class
mantrap.solver.base.trajopt.
TrajOptSolver
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, goal: torch.Tensor, t_planning: int = 5, modules: Union[List[Tuple], List] = None, attention_module: abc.ABCMeta = None, eval_env: abc.ABCMeta = None, config_name: str = 'unknown', is_logging: bool = False, is_debug: bool = False, **solver_params)¶ General abstract solver implementation.
The idea of this general implementation is that in order to build a solver class only the optimize() method has to be implemented, that determines the “optimal” value for the optimization variable given its initial value and the internally stored scene, while the general solver implementation deals with multi-threading and provides methods for computing the objective and constraint values (given a list of modules which should be taken into account, see below).
Initialise solver class by building objective and constraint modules as defined within the specific definition of the (optimisation) problem. Over all definitions it is shared that the optimization variable are the control inputs of the robot, however the exact formulation, i.e. which objective and constraint modules are used depends on the implementation of the child class.
Internally, the solver stores two environments, the environment it uses for planning (optimization etc) and the environment it uses for evaluation, i.e. which is actually unknown for the solver but encodes the way the scene actually changes from one time-step to another. If eval_env = None the planning and evaluation environment are the same.
- Parameters
env – environment the solver’s forward simulations are based on.
goal – goal state (position) of the robot (2).
t_planning – planning horizon, i.e. how many future time-steps shall be taken into account in planning.
multiprocessing – use multiprocessing for optimization.
modules – List of optimization modules and according kwargs (if required).
attention_module – Filter module name (None = no filter).
eval_env – environment that should be used for evaluation (“real” environment).
config_name – name of solver configuration.
is_logging – should all the results be logged (necessary for plotting but very costly !!).
is_debug – logging debug mode (for printing).
-
constraints
(z: numpy.ndarray, ado_ids: List[str] = None, tag: str = 'optimization', return_violation: bool = False) → numpy.ndarray¶ Determine constraints vector for some optimization vector z.
This function generally defines the constraints vector for some input optimization vector z, and some list of ados that should be taken into account in this computation. Therefore all constraint modules are called, their results are concatenated.
For logging purposes additionally the overall constraints violation, i.e. a scalar value representing the amount of how much each constraint is violated (zero if constraint is not active), summed over all constraints, is determined and logged, together with the violation by constraint module. Since the optimization parameters itself already have been logged in the objective() method, there are not logged within the constraints() method.
If the optimization is unconstrained, an empty constraint vector as well as zero violation are returned.
- Parameters
z – optimization vector (shape depends on exact optimization formulation).
ado_ids – identifiers of ados that should be taken into account during optimization.
tag – name of optimization call (name of the core).
return_violation – flag whether to return the overall constraint violation value as well.
- Returns
constraints vector w.r.t. z.
-
encode
() → torch.Tensor¶ Encode the current environment-goal-setup in a four-dimensional continuous space.
To represent a given scene completely the following elements have to be taken into account: the robot state (position, acceleration, type), the ado states (position, velocity, history, type) and the goal state. However to reduce dimensionality of the representation, we simplify the problem to only encode the current states (no history) and assume single integrator ado and double integrator robot dynamics. As a consequence the ado’s velocity can be ignored as well (since it can change instantly due to the single integrator dynamics).
The most important information with respect to the trajectory optimization surely is the relative position of the goal state, in robot coordinates. Therefore as an encoding the ado positions are transformed into the coordinate system spanned by the line from the robot’s to the goal position (and its orthogonal). The transformed coordinates of the closest pedestrian (w.r.t. the robot) as well as the robot’s velocity form the scene encoding.
- Returns
four-dimensional scene representation.
-
static
module_defaults
() → Union[List[Tuple], List]¶ List of optimization modules (objectives, constraint) and according dictionary of module kwargs, such as weight (for objectives), etc.
-
static
module_hard
() → Union[List[Tuple], List]¶ List of “hard” optimization modules (objectives, constraint). Hard modules are used for warm-starting the trajectory optimization and should therefore be simple to solve while still encoding a good guess of possible solutions.
By default these modules are assumed to be the goal objective function and the controls limit constraint, dynamics constraint fulfilled by the solver’s structure.
-
objective
(z: numpy.ndarray, ado_ids: List[str] = None, tag: str = 'optimization') → float¶ Determine objective value for some optimization vector z.
This function generally defines the value of the objective function for some input optimization vector z, and some input list of ados that should be taken into account in this computation. Therefore all objective modules are called, their results are summed (module-internally weighted).
- Parameters
z – optimization vector (shape depends on exact optimization formulation).
ado_ids – identifiers of ados that should be taken into account during optimization.
tag – name of optimization call (name of the core).
- Returns
weighted sum of objective values w.r.t. z.
-
optimize
(z0: torch.Tensor, tag: str, **kwargs) → torch.Tensor¶ Optimization core wrapper function.
Filter the agents by using the attention module, execute the optimization, log the results and return the optimization results.
- Parameters
z0 – initial value of optimization variable.
tag – name of optimization call (name of the core).
kwargs – additional arguments for optimization core function.
- Returns
z_opt (optimal values of optimization variable vector)
-
abstract
optimize_core
(z0: torch.Tensor, tag: str, ado_ids: List[str], **kwargs) → Tuple[torch.Tensor, Dict[str, torch.Tensor]]¶ Optimization function for single core to find optimal z-vector.
Given some initial value z0 find the optimal allocation for z with respect to the internally defined objectives and constraints. This function is executed in every thread in parallel, for different initial values z0. To simplify optimization not all agents in the scene have to be taken into account during the optimization but only the ones with ids defined in ado_ids.
- Parameters
z0 – initial value of optimization variables.
tag – name of optimization call (name of the core).
ado_ids – identifiers of ados that should be taken into account during optimization.
- Returns
z_opt (optimal values of optimization variable vector) optimization_log (logging dictionary for this optimization = self.log)
-
solve
(time_steps: int, warm_start_method: str = 'hard', **kwargs) → Tuple[torch.Tensor, torch.Tensor]¶ Find the ego trajectory given the internal environment with the current scene as initial condition. Therefore iteratively solve the problem for the scene at t = t_k, update the scene using the internal simulator and the derived ego policy and repeat until t_k = horizon or until the goal has been reached.
This method changes the internal environment by forward simulating it over the prediction horizon.
- Parameters
time_steps – how many time-steps shall be solved (not planning horizon !).
warm_start_method – warm-starting method (see .warm_start()).
- Returns
derived ego trajectory [T, 5] (T <= horizon + 1 in case goal is reached earlier).
- Returns
derived actual ado trajectories [T, horizon + 1, 5] (T <= horizon + 1, see above).
-
visualize_actual
(tag: str = 'optimization', as_video: bool = False, **vis_kwargs)¶ Visualize actual trajectories over the solved time-steps, as simulated, in a single 2D plot.
- Parameters
tag – logging tag to plot, per default optimization tag.
as_video – output visualization as video over time-steps.
-
visualize_scenes
(tag: str = 'optimization', **vis_kwargs)¶ Visualize planned trajectory over full time-horizon as well as simulated ado reactions (i.e. their trajectories conditioned on the planned ego trajectory).
- Parameters
tag – logging tag to plot, per default optimization tag.
-
visualize_step
(tag: str = 'optimization', iteration: int = 0, **vis_kwargs)¶ Visualize planned trajectory over full time-horizon as well as simulated ado reactions (i.e. their trajectories conditioned on the planned ego trajectory) for one specific iteration.
- Parameters
tag – logging tag to plot, per default optimization tag.
iteration – solver iteration to visualize, per default 0th iteration (start).
-
warm_start
(method: str = 'hard') → torch.Tensor¶ Compute warm-start for optimization decision variables z.
hard: solve the same optimization process but use the hard optimization modules only.
encoding: query pre-computed solutions for scenario close to current one (using encoding).
soft: solve the same optimization process but use the hard and safety optimization modules.
potential: warm-start using full formulation of PotentialFieldEnvironment.
zeros: no warm-start, assignment to zeros.
- Parameters
method – method to use.
- Returns
initial z values.
-
class
mantrap.solver.base.search.
SearchIntermediate
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, goal: torch.Tensor, t_planning: int = 5, modules: Union[List[Tuple], List] = None, attention_module: abc.ABCMeta = None, eval_env: abc.ABCMeta = None, config_name: str = 'unknown', is_logging: bool = False, is_debug: bool = False, **solver_params)¶ -
evaluate
(z: numpy.ndarray, ado_ids: List[str], tag: str) → Tuple[float, float]¶ Evaluate “value” of z-values, i.e. some choice of optimization variable assignment, by computing the overall objective value as well as the constraint violation (aka the feasibility of the choice).
-
optimize_core
(z0: torch.Tensor, ado_ids: List[str], tag: str = 'optimization', max_cpu_time: float = 0.5, **solver_kwargs) → Tuple[torch.Tensor, Dict[str, torch.Tensor]]¶ Optimization function for single core to find optimal z-vector.
Given some initial value z0 find the optimal allocation for z with respect to the internally defined objectives and constraints. This function is executed in every thread in parallel, for different initial values z0. To simplify optimization not all agents in the scene have to be taken into account during the optimization but only the ones with ids defined in ado_ids.
In general searching algorithms (at least the ones implemented within this project) have a similar “frame”, which is some termination constraint, here the computation runtime, and some inner optimization loop which repeats until the algorithm has either terminated or is done.
- Parameters
z0 – initial value of optimization variables.
tag – name of optimization call (name of the core).
ado_ids – identifiers of ados that should be taken into account during optimization.
max_cpu_time – maximal cpu runtime for optimization.
- Returns
z_opt (optimal values of optimization variable vector) optimization_log (logging dictionary for this optimization = self.log)
-
IPOPT¶
-
class
mantrap.solver.ipopt.
IPOPTSolver
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, goal: torch.Tensor, t_planning: int = 5, modules: Union[List[Tuple], List] = None, attention_module: abc.ABCMeta = None, eval_env: abc.ABCMeta = None, config_name: str = 'unknown', is_logging: bool = False, is_debug: bool = False, **solver_params)¶ -
gradient
(z: numpy.ndarray, ado_ids: List[str] = None, tag: str = 'optimization') → numpy.ndarray¶ Gradient computation function.
Compute the objectives gradient for some value of the optimization variable z based on the gradient implementations of the objective modules. Sum all these gradients together for the final gradient estimate.
-
jacobian
(z: numpy.ndarray, ado_ids: List[str] = None, tag: str = 'optimization') → numpy.ndarray¶ Jacobian of constraints computation function.
Compute the constraints jacobian for some value of the optimization variable z based on the jacobian implementations of the constraints modules. Concatenate all these gradients together for the final jacobian estimate.
-
jacobian_structure
(ado_ids: List[str] = None, tag: str = 'optimization') → Tuple[numpy.ndarray, numpy.ndarray]¶ Sparsity structure of Jacobian matrix.
The structure of the Jacobian matrix is defined by the indices of non-zero elements in the Jacobian matrix. As it is defined by module, the sparsity structures of all modules are concatenated.
-
static
module_hard
() → Union[List[Tuple], List]¶ List of “hard” optimization modules (objectives, constraint). Hard modules are used for warm-starting the trajectory optimization and should therefore be simple to solve while still encoding a good guess of possible solutions.
The IPOPT solver already uses the optimization variable boundaries as control limit, since we optimize for z = controls. Therefore only the goal module is required as a hard module.
-
optimize_core
(z0: torch.Tensor, ado_ids: List[str], tag: str = 'optimization', max_cpu_time: float = 2.0, approx_jacobian: bool = False, **solver_kwargs) → Tuple[torch.Tensor, Dict[str, torch.Tensor]]¶ Optimization function for single core to find optimal z-vector.
Given some initial value z0 find the optimal allocation for z with respect to the internally defined objectives and constraints. This function is executed in every thread in parallel, for different initial values z0. To simplify optimization not all agents in the scene have to be taken into account during the optimization but only the ones with ids defined in ado_ids.
IPOPT-Solver poses the optimization problem as Non-Linear Program (NLP) and uses the non-linear optimization library IPOPT (with Mumps backend) to solve it. Documentation: https://pythonhosted.org/ipopt/reference.html
- Parameters
z0 – initial value of optimization variables.
tag – name of optimization call (name of the core).
ado_ids – identifiers of ados that should be taken into account during optimization.
max_cpu_time – maximal cpu time until return.
approx_jacobian – if True automatic approximation of Jacobian based on finite-difference values.
- Returns
z_opt (optimal values of optimization variable vector) objective_opt (optimal objective value) optimization_log (logging dictionary for this optimization = self.log)
-
Solver Baselines¶
Baseline - MCTS¶
-
class
mantrap.solver.baselines.mcts.
MonteCarloTreeSearch
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, goal: torch.Tensor, t_planning: int = 5, modules: Union[List[Tuple], List] = None, attention_module: abc.ABCMeta = None, eval_env: abc.ABCMeta = None, config_name: str = 'unknown', is_logging: bool = False, is_debug: bool = False, **solver_params)¶
Baseline - ORCA¶
-
class
mantrap.solver.baselines.orca.
ORCASolver
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, goal: torch.Tensor, t_planning: int = 5, modules: Union[List[Tuple], List] = None, attention_module: abc.ABCMeta = None, eval_env: abc.ABCMeta = None, config_name: str = 'unknown', is_logging: bool = False, is_debug: bool = False, **solver_params)¶ ORCA-based planning algorithm.
Implementation of a planning algorithm according to ‘Reciprocal n-body Collision Avoidance’ by Jur van den Berg, Stephen J. Guy, Ming Lin, and Dinesh Manocha (short: ORCA, O = optimal). In this environment the optimal velocity update is determined for every agent so that a) there is guaranteed no collision in the next time-steps (agent_safe_dt) and b) the new velocity is the closest velocity to the preferred velocity (here: constant velocity to goal) there is in the set of non-collision velocities. Thereby we have some main assumptions:
all agents behave according to the ORCA formalism
all agents are single integrators i.e. do not have holonomic constraints
synchronized discrete time updates
perfect observability of every agents state at the current time (pref. velocities unknown)
In order to find the “optimal” velocity linear constraints are derived using the ORCA formalism and solved in a linear program. The ego agent is then regarded as one of the (N+1) agents in the scene and the scene, just with a known control input. Since ORCA assumes all parameters to be shared and known to the other agents it can neither be probabilistic nor multi-modal.
Also the update of some agent is affected by the ego, if and only if the ego agent imposes an active constraint on this agent, which is usually not the case for every agent. Therefore when differentiating the ado positions with respect to the ego’s state trajectory, usually it will no gradient (i.e. no connection in the computation graph), while in case of a connection having a very long network connection (i.e. comparably large computational effort to compute gradient).
-
class
LineConstraint
(point, direction)¶ -
property
direction
¶ Alias for field number 1
-
property
point
¶ Alias for field number 0
-
property
-
static
module_defaults
() → Union[List[Tuple], List]¶ List of optimization modules (objectives, constraint) and according dictionary of module kwargs, such as weight (for objectives), etc.
-
optimize_core
(z0: torch.Tensor, ado_ids: List[str], tag: str = 'optimization', agent_radius: float = 0.5, safe_time: float = 0.8, **solver_kwargs) → Tuple[torch.Tensor, Dict[str, torch.Tensor]]¶ Optimization function for single core to find optimal z-vector.
Given some initial value z0 find the optimal allocation for z with respect to the internally defined objectives and constraints. This function is executed in every thread in parallel, for different initial values z0. To simplify optimization not all agents in the scene have to be taken into account during the optimization but only the ones with ids defined in ado_ids.
Implementation of the ORCA algorithm for collision-free pedestrian avoidance. This solver only takes into account the current states, not any future state, which makes it efficient to solve, but also not very much anticipative.
- Parameters
z0 – initial value of optimization variables.
tag – name of optimization call (name of the core).
ado_ids – identifiers of ados that should be taken into account during optimization.
agent_radius – ado collision-avoidance safety radius [m].
safe_time – time interval of guaranteed no collisions [s].
- Returns
z_opt (optimal values of optimization variable vector) optimization_log (logging dictionary for this optimization = self.log)
Baseline - Search¶
-
class
mantrap.solver.baselines.random_search.
RandomSearch
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, goal: torch.Tensor, t_planning: int = 5, modules: Union[List[Tuple], List] = None, attention_module: abc.ABCMeta = None, eval_env: abc.ABCMeta = None, config_name: str = 'unknown', is_logging: bool = False, is_debug: bool = False, **solver_params)¶
Baseline - RRT Star¶
-
class
mantrap.solver.baselines.rrt_star.
RRTStarSolver
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, goal: torch.Tensor, t_planning: int = 5, modules: Union[List[Tuple], List] = None, attention_module: abc.ABCMeta = None, eval_env: abc.ABCMeta = None, config_name: str = 'unknown', is_logging: bool = False, is_debug: bool = False, **solver_params)¶ -
optimize_core
(z0: torch.Tensor, ado_ids: List[str], tag: str = 'optimization', **solver_kwargs) → Tuple[torch.Tensor, Dict[str, torch.Tensor]]¶ Optimization function for single core to find optimal z-vector.
Given some initial value z0 find the optimal allocation for z with respect to the internally defined objectives and constraints. This function is executed in every thread in parallel, for different initial values z0. To simplify optimization not all agents in the scene have to be taken into account during the optimization but only the ones with ids defined in ado_ids.
Implementation of optimal sampling-based path planning algorithm RRT* by Karaman and Frazzoli (Sampling-based Algorithms for Optimal Motion Planning). Since the algorithm is (runtime) asymptotically optimal and complete, search until the runtime has ended, then convert the path into robot controls, The implementation from PythonRobotics (https://github.com/AtsushiSakai/PythonRobotics) is used here.
This solver only takes into account the current states, not any future state, which makes it efficient to solve, but also not very much anticipative.
- Parameters
z0 – initial value of optimization variables.
tag – name of optimization call (name of the core).
ado_ids – identifiers of ados that should be taken into account during optimization.
- Returns
z_opt (optimal values of optimization variable vector) optimization_log (logging dictionary for this optimization = self.log)
-