Agents¶
Base Discrete Agent (abstract)¶
-
class
mantrap.agents.base.discrete.
DTAgent
(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), time: float = 0, history: torch.Tensor = None, is_robot: bool = False, color: numpy.ndarray = None, identifier: str = None, **agent_params)¶ General agent representation.
An agent, whether in environment or real world, has a five-dimensional state vector and a state history, which are defined as follows:
\[s_t = (pos_x(t), pos_y(t), vel_x(t), vel_y(t), time)\]

\[history = (s_{t0}, s_{t1}, ..., s_{t})\]

For unique identification each agent has an id (3-character string) and, for visualisation purposes, a unique color. Both are created randomly during initialization.
The internal state of the agent can be altered by calling the update() function, which uses some (control) input to update the agent’s state using its dynamics, or the reset() function, which sets the internal state to some value directly. All other methods do not alter the internal state.
- Parameters
position – current 2D position vector (2).
velocity – current 2D velocity vector (2).
time – current time stamp, default = 0.0.
history – current agent’s state history (N, 5), default = no history.
identifier – agent’s pre-set identifier, default = None, in which case it is created randomly during initialization.
agent_params – dictionary of additional parameters stored in agent object.
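A minimal usage sketch, assuming the IntegratorDTAgent subclass documented below (the top-level import path is an assumption):

    import torch
    from mantrap.agents import IntegratorDTAgent  # import path assumed

    # Create a single-integrator agent at the origin with zero velocity.
    agent = IntegratorDTAgent(position=torch.zeros(2), velocity=torch.zeros(2))

    # update() forward-integrates the internal state with a control input.
    control, state = agent.update(torch.tensor([1.0, 0.0]), dt=0.4)

    # reset() sets the internal state directly (5-dimensional state vector).
    agent.reset(state=torch.tensor([0.4, 0.0, 1.0, 0.0, 0.4]))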
-
check_feasibility_controls
(controls: torch.Tensor) → bool¶ Check feasibility of a given set of controls to be executed by the internal agent.
To check feasibility, it suffices to verify that the controls lie inside the internal control boundaries. Since changes of the control input from one step to the next are not limited, they are not checked here.
- Parameters
controls – controls to be checked (N, 2).
-
check_feasibility_trajectory
(trajectory: torch.Tensor, dt: float) → bool¶ Check feasibility of a given trajectory to be followed by the internal agent.
To check feasibility, convert the trajectory to control inputs using the agent’s internal inverse dynamics and check whether all of them lie inside its control boundaries.
- Parameters
trajectory – trajectory to be checked (N, 4).
dt – time interval [s] between discrete trajectory states.
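A sketch of this check, assuming roll_trajectory() and control_limits() as documented here; whether the bound applies componentwise (as below) or to the control norm depends on the agent type:

    import torch

    def check_feasibility_trajectory_sketch(agent, trajectory: torch.Tensor, dt: float) -> bool:
        # Convert the trajectory to control inputs via the inverse dynamics ...
        controls = agent.roll_trajectory(trajectory, dt=dt)
        # ... and verify that every control lies inside the control boundaries.
        lower, upper = agent.control_limits()
        return bool(((controls >= lower) & (controls <= upper)).all())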
-
static
compute_acceleration
(trajectory: torch.Tensor, dt: float) → torch.Tensor¶ Compute the accelerations (ax, ay) of the agent given some trajectory. To do so, use central differences to numerically differentiate the velocity at every time-step.
- Parameters
trajectory – agent trajectory (t_horizon, >=4 = px, py, vx, vy).
dt – time interval [s] between discrete trajectory states.
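A sketch of the central-difference computation, assuming a trajectory with columns (px, py, vx, vy); the boundary handling is an assumption:

    import torch

    def compute_acceleration_sketch(trajectory: torch.Tensor, dt: float) -> torch.Tensor:
        velocities = trajectory[:, 2:4]  # (vx, vy) at every time-step
        acceleration = torch.zeros_like(velocities)
        # Central differences in the interior: a_i = (v_{i+1} - v_{i-1}) / (2 * dt).
        acceleration[1:-1] = (velocities[2:] - velocities[:-2]) / (2 * dt)
        # One-sided differences at the end-points (the library may differ here).
        acceleration[0] = (velocities[1] - velocities[0]) / dt
        acceleration[-1] = (velocities[-1] - velocities[-2]) / dt
        return acceleration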
-
abstract
control_limits
() → Tuple[float, float]¶ Returns agent’s control input limitations, i.e. lower and upper bound.
-
abstract
control_norm
(controls: torch.Tensor) → torch.Tensor¶ Compute the agent’s control norm ||u||.
- Parameters
controls – controls to calculate the norm from (N, 2).
- Returns
control norm (N, 2).
-
detach
()¶ Detach the agent’s internal variables (position, velocity, history) from the computation graph. This is sometimes required to completely separate subsequent computations in PyTorch.
-
abstract
dx_du
(controls: torch.Tensor, dt: float) → torch.Tensor¶ Compute the derivative of the agent’s state with respect to its control input evaluated over controls u with discrete state time-step dt.
-
dynamics
(state: torch.Tensor, action: torch.Tensor, dt: float) → torch.Tensor¶ Forward integrate the agent’s motion given some state-action pair and an integration time-step.
Passing the state, instead of using the internal state, allows the method to be used for state vectors other than the internal one, e.g. for forward predicting over a time horizon > 1. Since every agent type has different dynamics (like single-integrator or Dubins car), this method is implemented abstractly.
- Parameters
state – state to be updated @ t = k (5).
action – control input @ t = k (size depending on agent type).
dt – forward integration time step [s].
- Returns
updated state vector with time @ t = k + dt (5).
-
static
expand_state_vector
(state_4: torch.Tensor, time: float) → torch.Tensor¶ Expand a 4-dimensional state vector (x, y, vx, vy) by time information.
-
expand_trajectory
(path: torch.Tensor, dt: float) → torch.Tensor¶ Derive (position, orientation, velocity)-trajectory information from position data only, using numerical differentiation, i.e. v_i = (x_{i+1} - x_i) / dt.
Since the path is not constrained to be dynamically feasible, stretch it so that its maximal velocity equals the maximally allowed velocity of the dynamics.
- Parameters
path – sequence of states (position, velocity) without temporal dimension (N, 4).
dt – time interval which is assumed to be constant over full path sequence [s].
- Returns
temporally-expanded trajectory (N, 5).
-
inverse_dynamics
(state: torch.Tensor, state_previous: torch.Tensor, dt: float) → torch.Tensor¶ Determine the agent’s motion given its current and previous state.
Passing the state, instead of using the internal state, allows the method to be used for state vectors other than the internal one, e.g. for forward predicting over a time horizon > 1. Since every agent type has different dynamics (like single-integrator or Dubins car), this method is implemented abstractly.
- Parameters
state – state @ t = k (4 or 5).
state_previous – previous state @ t = k - dt (4 or 5).
dt – forward integration time step [s].
- Returns
control input @ t = k (size depending on agent type).
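For illustration, a single-integrator agent (whose controls are velocities) inverts its dynamics roughly as follows; a sketch, not the library’s implementation:

    import torch

    def inverse_dynamics_single_integrator(state: torch.Tensor, state_previous: torch.Tensor, dt: float) -> torch.Tensor:
        # With pos_{t+1} = pos_t + u * dt, the control is the position
        # difference divided by the time-step (first two entries are px, py).
        return (state[0:2] - state_previous[0:2]) / dt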
-
make_controls_feasible
(controls: torch.Tensor) → torch.Tensor¶ Make controls feasible by clipping them between the agent’s lower and upper control boundaries.
Return the transformed, feasible controls. The direction of the controls is thereby not changed; only the length of the control vector is adapted. If the controls are all zeros, they are returned unchanged.
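A direction-preserving clipping sketch matching the description (the bound is applied to the control vector’s length, and zero controls pass through unchanged):

    import torch

    def make_controls_feasible_sketch(controls: torch.Tensor, upper: float) -> torch.Tensor:
        norms = controls.norm(dim=-1, keepdim=True)
        # Scale factor < 1 only for controls whose norm exceeds the bound;
        # clamping the denominator keeps zero controls unchanged.
        scale = torch.clamp(upper / torch.clamp(norms, min=1e-8), max=1.0)
        return controls * scale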
-
abstract
reachability_boundary
(time_steps: int, dt: float) → mantrap.utility.maths.Shape2D¶ Compute the forward reachable set over a time-horizon of time_steps discrete steps with time-step dt. While the reachable set can in general be computed numerically, an exact (and more efficient) implementation has to be done in the child classes. Return this exact solution as a class object of Shape2D type.
- Parameters
time_steps – number of discrete time-steps in reachable time-horizon.
dt – time interval which is assumed to be constant over full path sequence [s].
-
reset
(state: torch.Tensor, history: torch.Tensor = None)¶ Reset the complete state of the agent by resetting its position and velocity. If history is given as None, the agent’s history is adapted to the new state (i.e. the state is appended to the already existing history); otherwise the history is set to the given trajectory.
- Parameters
state – new state (5).
history – new state history (N, 5).
-
roll_trajectory
(trajectory: torch.Tensor, dt: float) → torch.Tensor¶ Determine the controls by iteratively applying the agent’s model inverse dynamics. A perfect model, i.e. one without uncertainty, is assumed.
To guarantee that rolling and unrolling are inverse operations, i.e. that transforming the resulting controls back into a trajectory yields the original trajectory, no checks for the feasibility of the controls are made. Also, this function does not actually update the agent; rather, it determines the theoretical controls given the agent’s dynamics and the trajectory.
- Parameters
trajectory – sequence of states to apply to the robot (N, 4).
dt – time interval [s] between discrete trajectory states.
- Returns
inferred controls (assuming no uncertainty in the dynamics), (N, input_size).
-
sanity_check
() → bool¶ Sanity check for the agent. In order to evaluate the sanity of the agent in the most general form, several internal states such as the position and velocity are checked to be of the right type. Also, the history should always reflect the states up to and including the current state.
-
unroll_trajectory
(controls: torch.Tensor, dt: float) → torch.Tensor¶ Build the trajectory from some controls and the current state, by iteratively applying the model dynamics. A perfect model, i.e. one without uncertainty, is assumed.
To guarantee that the unrolled trajectory is invertible, i.e. that transforming the resulting trajectory back yields the same controls, no checks for the feasibility of the controls are made. Also, this function does not actually update the agent; rather, it determines the theoretical trajectory given the agent’s dynamics and controls.
- Parameters
controls – sequence of inputs to apply to the robot (N, input_size).
dt – time interval [s] between discrete trajectory states.
- Returns
resulting trajectory (assuming no uncertainty in the dynamics), (N, 4).
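A generic sequential sketch of the unrolling, assuming dynamics() as documented above; the initial 5D state is passed explicitly here (the actual method reads it from the agent), and the stacked result keeps the initial state, unlike the documented (N, 4) return:

    import torch

    def unroll_trajectory_sketch(agent, state_0: torch.Tensor, controls: torch.Tensor, dt: float) -> torch.Tensor:
        states = [state_0]
        # Apply the dynamics step by step, feeding back the previous state.
        for u in controls:
            states.append(agent.dynamics(states[-1], u, dt=dt))
        return torch.stack(states)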
-
update
(action: torch.Tensor, dt: float) → Tuple[torch.Tensor, torch.Tensor]¶ Update internal state (position, velocity and history) by forward integrating the agent’s dynamics over the given time-step. The new state and time are then appended to the agent’s state history.
- Parameters
action – control input (size depending on agent type).
dt – forward integration time step [s].
- Returns
executed control action and new agent state.
-
update_inverse
(state_next: torch.Tensor, dt: float) → Tuple[torch.Tensor, torch.Tensor]¶ Update the internal state (position, velocity and history) by backward integrating the agent’s dynamics: use the inverse dynamics to compute the executed action between the current and the given next position, then execute this action again to obtain the complete updated state and to ensure feasibility.
- Parameters
state_next – next state (5).
dt – forward integration time step [s].
- Returns
executed control action and new agent state.
Linear Agents¶
-
class
mantrap.agents.base.linear.
LinearDTAgent
(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), history: torch.Tensor = None, dt: float = None, max_steps: int = 20, **agent_kwargs)¶ Intermediate agent class for agents having linear state dynamics.
For linear state dynamics, the dynamics can be computed very efficiently using matrix-vector multiplication with constant (state-space) matrices.

\[x_{i + n} = A^n x_i + \sum_{k=0}^{n-1} A^{n-k-1} B u_{i+k}\]

Passing the simulation time-step directly, instead of passing it to every function individually, surely is not nice, but it enables pre-computing the agent’s dynamics matrices, which speeds up computation of the dynamics by at least a factor of 5 (see examples/tools/timing), especially since the dynamics() method is the most-called function in the full program. Since storing the same information twice is generally a bad programming paradigm, and restricting the controller to a single time-step would reduce its capabilities a lot, as a trade-off a dictionary of dynamics matrices is built, enabling multiple time-steps while still computing each set of matrices only once.
- Parameters
dt – default dynamics time-step [s], default none (no dynamics pre-computation).
max_steps – maximal number of pre-computed rolling steps.
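The closed-form expression above can be evaluated directly for a generic state-space pair (A, B); a sketch with hypothetical helper names, not the library’s implementation:

    import torch

    def rollout_closed_form(A: torch.Tensor, B: torch.Tensor, x0: torch.Tensor, controls: torch.Tensor) -> torch.Tensor:
        # x_{i+n} = A^n x_i + sum_{k=0}^{n-1} A^{n-k-1} B u_{i+k}
        n = controls.shape[0]
        x_n = torch.linalg.matrix_power(A, n) @ x0
        for k in range(n):
            x_n = x_n + torch.linalg.matrix_power(A, n - k - 1) @ B @ controls[k]
        return x_n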
-
dx_du
(controls: torch.Tensor, dt: float) → torch.Tensor¶ Compute the derivative of the agent’s state with respect to its control input, evaluated over controls u with discrete state time-step dt.
From the expression for the state at time-step n, x_n, in terms of the initial state x_0 and the control inputs u_k, we have

\[x_n = A^n x_0 + \sum_{k=0}^{n-1} A^{n-k-1} B u_k\]

Due to the assumption that the control inputs are independent of each other, their derivatives do not depend on each other, i.e. du_i/du_j = 0 for all i != j. Also, the state at time-step n only depends on the initial state and the control inputs, not on the states in between, when the equation above is used! Therefore the derivative simplifies to

\[\frac{\partial x_n}{\partial u_k} = A^{n-k-1} B\]
-
dynamics_rolling_matrices
(dt: float, max_steps: int) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor]¶ Determine the matrices for batched trajectory-rolling dynamics, using the equation shown in the definition of the class, stacked into the matrices An, Bn and Tn.

\[An = [I, A, A^2, ..., A^n]\]

\[Bn = [[B, 0, ..., 0], [AB, B, 0, ..., 0], ..., [A^{n-1} B, ..., B]]\]

\[Tn = [[0, 0, 0, 0, 0], [0, 0, 0, 0, 1], ..., [0, 0, 0, 0, n]]\]

- Parameters
dt – dynamics integration time-step [s].
max_steps – maximal number of pre-computed steps.
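A sketch of how An and Bn could be stacked for a generic (A, B) pair (Tn is omitted; the block layout follows the equations above):

    import torch

    def rolling_matrices_sketch(A: torch.Tensor, B: torch.Tensor, n: int):
        x_dim, u_dim = B.shape
        # An = [I, A, A^2, ..., A^n], stacked vertically.
        An = torch.cat([torch.linalg.matrix_power(A, i) for i in range(n + 1)], dim=0)
        # Bn is block lower-triangular with blocks A^{i-k-1} B; the first block
        # row stays zero since x_0 does not depend on any control input.
        Bn = torch.zeros((n + 1) * x_dim, n * u_dim)
        for i in range(1, n + 1):
            for k in range(i):
                Bn[i * x_dim:(i + 1) * x_dim, k * u_dim:(k + 1) * u_dim] = \
                    torch.linalg.matrix_power(A, i - k - 1) @ B
        return An, Bn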
-
reachability_boundary
(time_steps: int, dt: float) → mantrap.utility.maths.Circle¶ Generally, for linear agents the N-th state can easily be expressed in terms of the controls and the initial state by nesting the linear dynamics, resulting in

\[x_{i + N} = A^N x_i + \sum_{k = 0}^{N - 1} A^{N - k - 1} B u_{i + k}\]

Also, linear agents are assumed to behave isotropically here, i.e. the control input can change its direction instantly and without limitation. Therefore the forward reachable set within time_steps is just a circle (in general an ellipse, but the agent has the same control bounds in both x- and y-direction) around the current position, with the radius determined by the maximally allowed agent speed and the time horizon.
- Parameters
time_steps – number of discrete time-steps in reachable time-horizon.
dt – time interval which is assumed to be constant over full path sequence [s].
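A sketch of the circle construction; the Circle constructor signature and the exact radius formula are assumptions, but for an isotropic agent the radius grows linearly with the horizon:

    import torch
    from mantrap.utility.maths import Circle  # constructor args assumed

    def reachability_boundary_sketch(position: torch.Tensor, speed_max: float, time_steps: int, dt: float) -> Circle:
        # Isotropic controls: every point within speed_max * horizon is reachable.
        return Circle(center=position, radius=speed_max * time_steps * dt)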
-
roll_trajectory
(trajectory: torch.Tensor, dt: float) → torch.Tensor¶ Determine the controls by iteratively applying the agent’s model inverse dynamics. A perfect model, i.e. one without uncertainty, is assumed.
To guarantee that rolling and unrolling are inverse operations, i.e. that transforming the resulting controls back into a trajectory yields the original trajectory, no checks for the feasibility of the controls are made. Also, this function does not actually update the agent; rather, it determines the theoretical controls given the agent’s dynamics and the trajectory.
For linear agents, determining the controls from the trajectory is very straight-forward and, most importantly, does not have to be done sequentially (at least not for the types of agents used within this project). Therefore the roll_trajectory() method can be improved.
- Parameters
trajectory – sequence of states to apply to the robot (N, 4).
dt – time interval [s] between discrete trajectory states.
- Returns
inferred controls (assuming no uncertainty in the dynamics), (N, input_size).
-
unroll_trajectory
(controls: torch.Tensor, dt: float) → torch.Tensor¶ Build the trajectory from some controls and the current state, by iteratively applying the model dynamics. A perfect model, i.e. one without uncertainty, is assumed.
To guarantee that the unrolled trajectory is invertible, i.e. that transforming the resulting trajectory back yields the same controls, no checks for the feasibility of the controls are made. Also, this function does not actually update the agent; rather, it determines the theoretical trajectory given the agent’s dynamics and controls.
- Parameters
controls – sequence of inputs to apply to the robot (N, input_size).
dt – time interval [s] between discrete trajectory states.
- Returns
resulting trajectory (assuming no uncertainty in the dynamics), (N, 4).
Single Integrator¶
-
class
mantrap.agents.integrator_single.
IntegratorDTAgent
(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), history: torch.Tensor = None, dt: float = None, max_steps: int = 20, **agent_kwargs)¶ Linear single integrator dynamics:
\[vel_{t+1} = action\]

\[pos_{t+1} = pos_t + vel_{t+1} * dt\]

-
property
acceleration_max
¶ Since single integrators are assumed to react instantly, their maximal acceleration is in fact infinite!
-
control_limits
() → Tuple[float, float]¶ - \[[- v_{max}, v_{max}]\]
-
control_norm
(controls: torch.Tensor) → torch.Tensor¶ Compute the agent’s control norm ||u|| = L2-norm.
- Parameters
controls – controls to calculate the norm from (N, 2).
- Returns
control norm (N, 1).
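A minimal sketch of the single-integrator equations above, operating on a plain 5D state tensor:

    import torch

    def single_integrator_dynamics(state: torch.Tensor, action: torch.Tensor, dt: float) -> torch.Tensor:
        # vel_{t+1} = action;  pos_{t+1} = pos_t + vel_{t+1} * dt
        position = state[0:2] + action * dt
        return torch.cat([position, action, state[4:5] + dt])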
Double Integrator¶
-
class
mantrap.agents.integrator_double.
DoubleIntegratorDTAgent
(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), history: torch.Tensor = None, dt: float = None, max_steps: int = 20, **agent_kwargs)¶ Linear double integrator dynamics:
\[vel_{t+1} = vel_t + action * dt\]

\[pos_{t+1} = pos_t + vel_{t+1} * dt\]

-
control_limits
() → Tuple[float, float]¶ - \[[- a_{max}, a_{max}]\]
-
control_norm
(controls: torch.Tensor) → torch.Tensor¶ Compute the agent’s control norm ||u|| = L1-norm.
- Parameters
controls – controls to calculate the norm from (N, 2).
- Returns
control norm (N, 2).
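Analogously, a minimal sketch of the double-integrator equations above:

    import torch

    def double_integrator_dynamics(state: torch.Tensor, action: torch.Tensor, dt: float) -> torch.Tensor:
        # vel_{t+1} = vel_t + action * dt;  pos_{t+1} = pos_t + vel_{t+1} * dt
        velocity = state[2:4] + action * dt
        position = state[0:2] + velocity * dt
        return torch.cat([position, velocity, state[4:5] + dt])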