Objective and Constraint Modules¶
Base Optimization Modules (abstract)¶
Base class¶
-
class
mantrap.modules.base.optimization_module.
OptimizationModule
(t_horizon: int, weight: float = 0.0, env: mantrap.environment.base.graph_based.GraphBasedEnvironment = None, has_slack: bool = False, slack_weight: float = 0.0)¶ -
compute_constraint
(ego_trajectory: torch.Tensor, ado_ids: List[str], tag: str) → Optional[torch.Tensor]¶ Determine internal constraints + slack constraints.
Compute internal constraints and convert them to equality constraints by updating and adding the slack variables. Then add further constraints for the slack variables themselves (>= 0).
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
compute_gradient_analytically
(ego_trajectory: torch.Tensor, grad_wrt: torch.Tensor, ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Compute objective gradient vector analytically.
While the gradient vector of the objective can be computed automatically using PyTorch’s automatic differentiation package there might be an analytic solution, which is when known for sure more efficient to compute. Although it is against the convention to use torch representations whenever possible, this function returns numpy arrays, since the main gradient() function has to return a numpy array. Hence, not computing based on numpy arrays would just introduce an un-necessary .detach().numpy().
When no analytical solution is defined (or too hard to determine) return None.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
grad_wrt – vector w.r.t. which the gradient should be determined.
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
compute_gradient_auto_grad
(x: torch.Tensor, grad_wrt: torch.Tensor) → numpy.ndarray¶ Compute derivative of x with respect to grad_wrt.
Compute the gradient/jacobian/etc. of some vector x with respect to some tensor grad_wrt using the PyTorch autograd, automatic differentiation package. Here we assume that both are connected by some computational graph (PyTorch graph) that can be used for differentiation.
A comment about multiprocessing: Computing the gradients in parallel would be a good match for multiple processing, since it is fairly independent from each other, given the shared memory of the computation graph.
import torch.multiprocessing as mp mp.set_start_method('spawn') x.share_memory_() grad_wrt.share_memory_() gradient.share_memory_() def compute(x_i, grad_wrt_i): grad = torch.autograd.grad(element, grad_wrt, retain_graph=True, only_inputs=True)[0] return grad.flatten().detach() processes = [] for i_process in range(8): p = mp.Process(target=compute, args=(x[i_process], grad_wrt, )) p.start() processes.append(p) for p in processes: p.join()
Here the torch.multiprocessing library is used to compute the gradient over the whole tensor x in multiple parallel processes. Therefore the tensors of both x and grad_wrt are shared over all processes using the .share_memory() method and all processes are launched with a different element of the tensor x. However as shown below sharing a computation graph, i.e. tensors that require a gradient, being attached to this graph, over multiple processes is not supported in PyTorch and therefore not possible.
def reduce_tensor(tensor): storage = tensor.storage() if tensor.requires_grad and not tensor.is_leaf: raise RuntimeError("Cowardly refusing to serialize non-leaf tensor which requires_grad, " "since autograd does not support crossing process boundaries. " "If you just want to transfer the data, call detach() on the tensor " "before serializing (e.g., putting it on the queue).")
To avoid this issue, the full computation graph would have to be re-built for every single element of x, which would create a lot of overhead due to repeated computations (as well as being quite not general and unreadable due to nesting instead of batching) and therefore not accelerate the computations.
- Parameters
x – gradient input flat vector.
grad_wrt – tensor with respect to gradients should be computed.
- Returns
flattened gradient tensor (x.size * grad_wrt.size)
-
compute_jacobian_analytically
(ego_trajectory: torch.Tensor, grad_wrt: torch.Tensor, ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Compute Jacobian matrix analytically.
While the Jacobian matrix of the constraint can be computed automatically using PyTorch’s automatic differentiation package there might be an analytic solution, which is when known for sure more efficient to compute. Although it is against the convention to use torch representations whenever possible, this function returns numpy arrays, since the main jacobian() function has to return a numpy array. Hence, not computing based on numpy arrays would just introduce an un-necessary .detach().numpy().
When no analytical solution is defined (or too hard to determine) return None.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
grad_wrt – vector w.r.t. which the gradient should be determined.
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
compute_objective
(ego_trajectory: torch.Tensor, ado_ids: List[str], tag: str) → Optional[torch.Tensor]¶ Determine internal objective value + slack variables.
Add slack based part of objective function. The value of the slack variable can only be updated if the constraints have been computed before. However using general optimization frameworks we cannot enforce the order to method calls, therefore to be surely synced we have to compute the constraints here first (!). Add the slack-based objective if the constraint is violated, otherwise add zero (since a constraint should not be optimised, just be feasible). The large slack_weight will thereby force the optimiser to make some decision to become feasible again.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
compute_violation
(ego_trajectory: torch.Tensor, ado_ids: List[str], tag: str) → float¶ Determine constraint violation based on some input ego trajectory and ado ids list.
The violation is the amount how much the solution state is inside the constraint active region. When the constraint is not active, then the violation is zero. The calculation is based on the last (cached) evaluation of the constraint function.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
compute_violation_internal
(tag: str) → float¶ Determine constraint violation, i.e. how much the internal state is inside the constraint active region. When the constraint is not active, then the violation is zero. The calculation is based on the last (cached) evaluation of the constraint function.
-
constraint
(ego_trajectory: torch.Tensor, ado_ids: List[str], tag: str) → numpy.ndarray¶ Determine constraint value for passed ego trajectory by calling the internal compute() method.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
constraint_boundaries
(ado_ids: List[str]) → Tuple[List[Optional[float]], List[Optional[float]]]¶ Compute module constraint boundaries.
Assuming that the limits of a constraint are constant over the full time horizon, only the (scalar) limits are specific to the module, while the boundaries, i.e. the stacked limits over the time-horizon are generally stacked (and normalized !) for any module.
-
gradient
(ego_trajectory: torch.Tensor, grad_wrt: torch.Tensor, ado_ids: List[str], tag: str) → numpy.ndarray¶ Determine gradient vector for passed ego trajectory. Therefore determine the objective value by calling the internal compute() method and en passant build a computation graph. Then using the pytorch auto-grad library compute the gradient vector through the previously built computation graph.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
grad_wrt – vector w.r.t. which the gradient should be determined.
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
abstract
gradient_condition
() → bool¶ Condition for back-propagating through the objective/constraint in order to obtain the objective’s gradient vector/jacobian (numerically). If returns True and the ego_trajectory itself requires a gradient, the objective/constraint value, stored from the last computation (_current_-variables) has to require a gradient as well.
-
jacobian
(ego_trajectory: torch.Tensor, grad_wrt: torch.Tensor, ado_ids: List[str], tag: str) → numpy.ndarray¶ Determine jacobian matrix for passed ego trajectory.
Therefore at first check whether an analytical solution is defined, if not determine the constraint values by calling the internal compute() method and en passant build a computation graph. Then using the PyTorch autograd library compute the jacobian matrix based on the constraints computation graph.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
grad_wrt – vector w.r.t. which the gradient should be determined.
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
jacobian_structure
(ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Return the sparsity structure of the jacobian, i.e. the indices of non-zero elements.
When not defined otherwise the jacobian structure is determined by determining the jacobian for some random input structure (and the current environment), so that the non-zero indices can be determined afterwards. However this way of computing the jacobian structure highly depends on efficient calculation of the jacobian matrix, and is therefore only available if the the compute_jacobian_analytically() function is defined.
- Parameters
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
- Returns
indices of non-zero elements of jacobian.
-
abstract
normalize
(x: Union[numpy.ndarray, float]) → Union[numpy.ndarray, float]¶ Normalize the objective/constraint value for improved optimization performance.
- Parameters
x – objective/constraint value in normal value range.
- Returns
normalized objective/constraint value in range [0, 1].
-
objective
(ego_trajectory: torch.Tensor, ado_ids: List[str], tag: str) → float¶ Determine objective value for passed ego trajectory by calling the internal compute() method.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
Pure objective¶
-
class
mantrap.modules.base.pure_objective.
PureObjectiveModule
(t_horizon: int = None, weight: float = 1.0, env: mantrap.environment.base.graph_based.GraphBasedEnvironment = None)¶ Pure constraint module class.
For an unified and general implementation of objective and constraint function modules, this superclass implements methods for computing both, either analytically or numerically based on the PyTorch autograd package. Thereby all objective and constraint computations should be purely based on the robot’s (ego) trajectory, as well as the possibility to perform further roll-outs in a given simulation environment.
The PureConstraintModule implements the general optimization module as pure constraint module, i.e. for hard constraints without any inter-connection to the objective function.
-
compute_jacobian_analytically
(ego_trajectory: torch.Tensor, grad_wrt: torch.Tensor, ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Compute Jacobian matrix analytically.
While the Jacobian matrix of the constraint can be computed automatically using PyTorch’s automatic differentiation package there might be an analytic solution, which is when known for sure more efficient to compute. Although it is against the convention to use torch representations whenever possible, this function returns numpy arrays, since the main jacobian() function has to return a numpy array. Hence, not computing based on numpy arrays would just introduce an un-necessary .detach().numpy().
When no analytical solution is defined (or too hard to determine) return None.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
grad_wrt – vector w.r.t. which the gradient should be determined.
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
Pure constraint¶
-
class
mantrap.modules.base.pure_constraint.
PureConstraintModule
(t_horizon: int = None, env: mantrap.environment.base.graph_based.GraphBasedEnvironment = None)¶ Pure constraint module class.
For an unified and general implementation of objective and constraint function modules, this superclass implements methods for computing both, either analytically or numerically based on the PyTorch autograd package. Thereby all objective and constraint computations should be purely based on the robot’s (ego) trajectory, as well as the possibility to perform further roll-outs in a given simulation environment.
The PureConstraintModule implements the general optimization module as pure constraint module, i.e. for hard constraints without any inter-connection to the objective function.
-
compute_gradient_analytically
(ego_trajectory: torch.Tensor, grad_wrt: torch.Tensor, ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Compute objective gradient vector analytically.
While the gradient vector of the objective can be computed automatically using PyTorch’s automatic differentiation package there might be an analytic solution, which is when known for sure more efficient to compute. Although it is against the convention to use torch representations whenever possible, this function returns numpy arrays, since the main gradient() function has to return a numpy array. Hence, not computing based on numpy arrays would just introduce an un-necessary .detach().numpy().
When no analytical solution is defined (or too hard to determine) return None.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
grad_wrt – vector w.r.t. which the gradient should be determined.
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
Goal-Driven Objectives¶
Goal-Norm¶
-
class
mantrap.modules.goal_norm.
GoalNormModule
(goal: torch.Tensor, env: mantrap.environment.base.graph_based.GraphBasedEnvironment, optimize_speed: bool = False, weight: float = 0.5, **unused)¶ Objective based on goal distance of every point of planned robot trajectory.
Next to avoiding interaction with other agents the robot should reach the goal state in a finite amount of time. Therefore the distance of every trajectory point to the goal state is taken to account, which is minimized the faster the robot gets to the goal.
\[objective = 1/T \sum_{T} (pos_t - goal)^2\]However, it is more important for the last rather than the first trajectory points to be close to the goal. Using some strictly-increasing distribution to weight the importance of the distance at every point in time did not lead to the expect result, while complicating the optimization. When we want to trade-off the goal cost with other cost, simply adapting its weight is sufficient as well.
Additionally a cost for the velocity at the goal state can be included in this objective, a cost for non-zero velocity to be exact. This cost is weighted continuously based on the distance to the goal, i.e. the closer the a large speed (= velocity L2 norm) occurs, the higher its cost.
\[objective = \sum_{T} w_t(d_{goal}(t)) || v_t ||_2\]- Parameters
goal – goal state/position for robot agent (2).
optimize_speed – include cost for zero velocity at goal state.
-
compute_gradient_analytically
(ego_trajectory: torch.Tensor, grad_wrt: torch.Tensor, ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Compute objective gradient vector analytically.
While the gradient vector of the objective can be computed automatically using PyTorch’s automatic differentiation package there might be an analytic solution, which is when known for sure more efficient to compute. Although it is against the convention to use torch representations whenever possible, this function returns numpy arrays, since the main gradient() function has to return a numpy array. Hence, not computing based on numpy arrays would just introduce an un-necessary .detach().numpy().
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
grad_wrt – vector w.r.t. which the gradient should be determined.
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
gradient_condition
() → bool¶ Condition for back-propagating through the objective/constraint in order to obtain the objective’s gradient vector/jacobian (numerically). If returns True and the ego_trajectory itself requires a gradient, the objective/constraint value, stored from the last computation (_current_-variables) has to require a gradient as well.
Since the objective value computation depends on the ego_trajectory (and the ego goal) only, this should always hold.
-
normalize
(x: Union[numpy.ndarray, float]) → Union[numpy.ndarray, float]¶ Normalize the objective/constraint value for improved optimization performance.
Compute normalize factor as the maximal possible ego goal distance in the given environment (i.e. length of rectangular diagonal line).
- Parameters
x – objective/constraint value in normal value range.
- Returns
normalized objective/constraint value in range [0, 1].
Goal-Weighted¶
-
class
mantrap.modules.baselines.goal_weighted.
GoalWeightedModule
(goal: torch.Tensor, env: mantrap.environment.base.graph_based.GraphBasedEnvironment, **unused)¶ Objective based on goal distance of every point of planned robot trajectory.
This module merely serves as a baseline comparison for the GoalMeanModule, by using a weighted sum instead of the mean for combining the point-wise objectives. Following the idea that it is more important for the last point of the trajectory to be close to the goal than the first one, the larger the trajectory index the larger the weight of the point-wise distance.
-
compute_gradient_analytically
(ego_trajectory: torch.Tensor, grad_wrt: torch.Tensor, ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Compute objective gradient vector analytically.
While the gradient vector of the objective can be computed automatically using PyTorch’s automatic differentiation package there might be an analytic solution, which is when known for sure more efficient to compute. Although it is against the convention to use torch representations whenever possible, this function returns numpy arrays, since the main gradient() function has to return a numpy array. Hence, not computing based on numpy arrays would just introduce an un-necessary .detach().numpy().
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
grad_wrt – vector w.r.t. which the gradient should be determined.
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
Interaction-Driven Objectives¶
Probability Interaction¶
-
class
mantrap.modules.prob_interact.
InteractionProbabilityModule
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, t_horizon: int, weight: float = 1.0, **unused)¶ Loss based on unconditioned probability value in distribution conditioned on ego motion.
The general idea of an interactive objective module is to compare the ado trajectory distributions as if no ego/robot would be in the scene with the distributions conditioned on the robot trajectory, in order to drive the robot’s trajectory optimisation to some state in which the ados are (in average) minimally disturbed by the robots motion. For general ado trajectory probability distributions including multi-modality, this is not a trivial problem. In fact its is not analytically solved for most multi-modal distributions. However in the following some common approaches:
1) KL-Divergence: The KL-Divergence expresses the similarity some distribution q with respect to another distribution p:
While this is a well-defined and commonly used similarity measure for “simple” distributions for more complex ones, such as GMM (e.g. as Trajectron’s output) it is not analytically defined. Methods to approximate the KL-Divergence for GMMs embrace Monte Carlo sampling, optimisation (itself) and several more which however are not computationally feasible for an online application, especially since the objective’s gradient has to be computed. Other methods simply the real GMM to a single Gaussian, by a weighted average over its parameters, which is a) not guaranteed to be a meaningful distribution and b) looses the advantages of predicting multi-modal distributions in the first place.
Especially for trajectron one could also not use the output distribution, but some intermediate (maybe simpler distribution) instead, e.g. the latent space, but this one does not depend on the ego trajectory so cannot be used for its optimisation.
2) Unconditioned path projection: Another approach is to compute (and maximize) the probability of the mean un-conditioned trajectories (mode-wise) appearing in the conditioned distribution. While it only takes into account the mean values (and weights) it is very efficient to compute while still taking the full conditioned distribution into account and has shown to be “optimise-able” in the training of Trajectron. Since the distributions itself are constant, while the sampled trajectories vary, the objective is also constant regarding the same scenario, which also improves its “optimise-ability”.
- Parameters
env – solver’s environment environment for predicting the behaviour without interaction.
-
gradient_condition
() → bool¶ Condition for back-propagating through the objective/constraint in order to obtain the objective’s gradient vector/jacobian (numerically). If returns True and the ego_trajectory itself requires a gradient, the objective/constraint value, stored from the last computation (_current_-variables) has to require a gradient as well.
If the internal environment is itself differentiable with respect to the ego (trajectory) input, the resulting objective value must have a gradient as well.
-
normalize
(x: Union[numpy.ndarray, float]) → Union[numpy.ndarray, float]¶ Normalize the objective/constraint value for improved optimization performance.
When the projected trajectory is very unlikely in the computed distribution, then the log likelihood grows to very large values, which would result in very un-balanced objectives. Therefore it is clamped. The objective value is clamped to some maximal value which enforces the resulting objective values to be in some range. And can hence serve as normalize factor. However it turned out that (linear) transformation are not sufficient to deal with the partly very small probability values that occur here, therefore we log the values.
To perform it this “correction” on both the objective and gradient, the normalization function is used, not the objective core function itself (which is a little bit suboptimal but the only way).
- Parameters
x – objective/constraint value in normal value range.
- Returns
normalized objective/constraint value in range [-1, 1].
Acceleration Interaction¶
-
class
mantrap.modules.baselines.acc_interact.
InteractionAccelerationModule
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, t_horizon: int, weight: float = 1.0, **unused)¶ Loss based on difference of accelerations due to interaction between robot and ados.
As a proxy for interaction based on the mean acceleration of every ado is computed in a (fictional) scene without an ego (robot) and compared to the actual occurring accelerations in the scene. As for autonomous driving the acceleration can be expressed “moving comfort”, since a change in acceleration, especially a sudden change like strong de-acceleration, decreases the comfort of the agent.
Re-Predicting it every time-step would be more correct, however it would also require a lot more computational effort (horizon times as much to be exact). Therefore merely the behavior of the ado without ego is computed that would occur, if the ego is not there from the beginning.
\[objective = 1/T \sum_{T} \sum_{ados} || acc_{t,i} - acc_{t,i}^{wo} ||_2\]- Parameters
env – environment for predicting the behaviour without interaction.
-
gradient_condition
() → bool¶ Condition for back-propagating through the objective/constraint in order to obtain the objective’s gradient vector/jacobian (numerically). If returns True and the ego_trajectory itself requires a gradient, the objective/constraint value, stored from the last computation (_current_-variables) has to require a gradient as well.
If the internal environment is itself differentiable with respect to the ego (trajectory) input, the resulting objective value must have a gradient as well.
-
normalize
(x: Union[numpy.ndarray, float]) → Union[numpy.ndarray, float]¶ Normalize the objective/constraint value for improved optimization performance.
The objective value is clamped to some maximal value which enforces the resulting objective values to be in some range. And can hence serve as normalize factor.
- Parameters
x – objective/constraint value in normal value range.
- Returns
normalized objective/constraint value in range [-1, 1].
-
summarize_distribution
(dist_dict: Dict[str, torch.distributions.distribution.Distribution]) → torch.Tensor¶ Compute ado-wise accelerations from velocity distribution dict mean values.
Position Interaction¶
-
class
mantrap.modules.baselines.pos_interact.
InteractionPositionModule
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, t_horizon: int, weight: float = 1.0, **unused)¶ Loss based on difference of positions due to interaction between robot and ados.
As a proxy for interaction based on the mean position of every ado is computed in a (fictional) scene without an ego (robot) and compared to the actual occurring positions in the scene, as in intuitive measure for the change the robot’s presence introduces to the scene.
Re-Predicting it every time-step would be more correct, however it would also require a lot more computational effort (horizon times as much to be exact). Therefore merely the behavior of the ado without ego is computed that would occur, if the ego is not there from the beginning.
\[objective = 1/T \sum_{T} \sum_{ados} || pos_{t,i} - pos_{t,i}^{wo} ||_2\]- Parameters
env – solver’s environment environment for predicting the behaviour without interaction.
-
summarize_distribution
(dist_dict: Dict[str, torch.distributions.distribution.Distribution]) → torch.Tensor¶ Compute ado-wise positions from velocity distribution dict mean values.
Control-Effort Constraints¶
-
class
mantrap.modules.control_limits.
ControlLimitModule
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, t_horizon: int, **unused)¶ Maximal control input at every point in time.
For computing this constraint simply the norm of the planned control input is determined and compared to the maximal agent’s control limit. For 0 < t < T_{planning}:
\[||u(t)|| < u_{max}\]-
compute_jacobian_analytically
(ego_trajectory: torch.Tensor, grad_wrt: torch.Tensor, ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Compute Jacobian matrix analytically.
While the Jacobian matrix of the constraint can be computed automatically using PyTorch’s automatic differentiation package there might be an analytic solution, which is when known for sure more efficient to compute. Although it is against the convention to use torch representations whenever possible, this function returns numpy arrays, since the main jacobian() function has to return a numpy array. Hence, not computing based on numpy arrays would just introduce an un-necessary .detach().numpy().
When the gradient shall be computed with respect to the controls, then computing the gradient analytically is very straight-forward, by just applying the following formula:
\[\frac{ d ||u|| }{ du_i } = \frac{u_i}{||u||}\]- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
grad_wrt – vector w.r.t. which the gradient should be determined.
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
gradient_condition
() → bool¶ Condition for back-propagating through the objective/constraint in order to obtain the objective’s gradient vector/jacobian (numerically). If returns True and the ego_trajectory itself requires a gradient, the objective/constraint value, stored from the last computation (_current_-variables) has to require a gradient as well.
Since the ego trajectory directly depends on the controls, the gradient always exists.
-
jacobian_structure
(ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Return the sparsity structure of the jacobian, i.e. the indices of non-zero elements.
- Parameters
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
- Returns
indices of non-zero elements of jacobian.
-
normalize
(x: Union[numpy.ndarray, float]) → Union[numpy.ndarray, float]¶ Normalize the objective/constraint value for improved optimization performance.
Use the maximally allowed controls as normalize factor, which are a property of the robot itself, that is stored in the internal environment representation.
- Parameters
x – objective/constraint value in normal value range.
- Returns
normalized objective/constraint value in range [0, 1].
-
Safety-Constraints¶
HJ-Reachability¶
-
class
mantrap.modules.hj_reachibility.
HJReachabilityModule
(env: mantrap.environment.base.graph_based.GraphBasedEnvironment, t_horizon: int, weight: float = 10.0, data_file: str = '2D.mat', interp_method: str = 'linear', **unused)¶ Soft constraint based on Hamilton-Jacobi Reachability.
Use HJ (aka backward) reachability to constraint the ego agent to not become “unsafer” when moving, i.e. to chose some trajectory so that the backward reachability value function does not become larger. This is equivalent to a min-max-game between the robot and each pedestrian, in which the pedestrian tries to catch the ego robot and while the robot tries to avoid the pedestrian. When the value function is negative, it is provably safe which means no matter what the pedestrian does (regarding safety trying to “catch” the robot is the worst case) they cannot reach each other.
with f_{rel}(x, u_R, u_P) being the system dynamics of the relative state. Since the robot is modelled as double integrator while all pedestrians are modelled as single integrators, the relative (coupled) systems state only depends on the relative position as well as the velocity of the robot, not the velocity of the pedestrian (which is a system input).
The coupled system dynamics then are given as:
Since sigma is a slack variable the according weight in the objective function should be comparably large.
-
compute_jacobian_analytically
(ego_trajectory: torch.Tensor, grad_wrt: torch.Tensor, ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Compute Jacobian matrix analytically.
While the Jacobian matrix of the constraint can be computed automatically using PyTorch’s automatic differentiation package there might be an analytic solution, which is when known for sure more efficient to compute. Although it is against the convention to use torch representations whenever possible, this function returns numpy arrays, since the main jacobian() function has to return a numpy array. Hence, not computing based on numpy arrays would just introduce an un-necessary .detach().numpy().
In the following we assume that assume that we look for the jacobian with respect to the ego controls, but to keep this general we will test it at the beginning. However since this module is based on look-up table interpolated values, we cannot compute the full derivative using torch’s autograd framework only, as in other modules, therefore if grad_wrt is not the controls, an error is raised (since otherwise autograd would be used and fail).
Since we pre-computed both the value function and its gradient computing the jacobian is quite straight-forward. Using the chain rule we get:
\[\frac{dJ}{du} = \frac{dJ}{dx_{rel}} \frac{dx_{rel}}{du}\]with dJ/dx_rel being pre-computed we only have to determine dx_rel/du.
- Parameters
ego_trajectory – planned ego trajectory (t_horizon, 5).
grad_wrt – vector w.r.t. which the gradient should be determined.
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
-
gradient_condition
() → bool¶ Condition for back-propagating through the objective/constraint in order to obtain the objective’s gradient vector/jacobian (numerically). If returns True and the ego_trajectory itself requires a gradient, the objective/constraint value, stored from the last computation (_current_-variables) has to require a gradient as well.
The constraint and objective evaluation basically are table-lookups which are in general not differentiable (although there are approaches to do so, like some kind of function fitting, but they are not used here).
-
jacobian_structure
(ado_ids: List[str], tag: str) → Optional[numpy.ndarray]¶ Return the sparsity structure of the jacobian, i.e. the indices of non-zero elements.
- Parameters
ado_ids – ghost ids which should be taken into account for computation.
tag – name of optimization call (name of the core).
- Returns
indices of non-zero elements of jacobian.
-
normalize
(x: Union[numpy.ndarray, float]) → Union[numpy.ndarray, float]¶ Normalize the objective/constraint value for improved optimization performance.
- Parameters
x – objective/constraint value in normal value range.
- Returns
normalized objective/constraint value in range [0, 1].
-
static
state_relative
(x_r: torch.Tensor, u_r: torch.Tensor, x_ped: torch.Tensor, u_ped: torch.Tensor, dt: float) → torch.Tensor¶ Determine the relative state and dynamics (see module description):
Since the states of robot and pedestrian as well as the action of the robot are known, at the time of evaluating the constraint, only the pedestrians controls are unknown. However in HJ reachability we look for the controls that minimize the value function, by evaluating the value function for several assignments of u_ped and finding the arg-min. In this context this function assumes that all states as well as the robots control are “unique” while it can handle a whole bunch of different pedestrian controls and computes the relative state for each of them.
- Parameters
x_r – robot’s state (x, y, vx, vy)
u_r – robot’s control input (ux, uy).
x_ped – pedestrian’s state (px, py, vpx, vpy).
u_ped – pedestrian control input (upx, upy).
dt – time-step [s] for applying dynamics on current state.
- Returns
next state for each pedestrian control input.
-