Agents

Base Discrete Agent (abstract)

class mantrap.agents.base.discrete.DTAgent(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), time: float = 0, history: torch.Tensor = None, is_robot: bool = False, color: numpy.ndarray = None, identifier: str = None, **agent_params)

General agent representation.

An agent, whether in environment or real world, has a five-dimensional state vector and a state history, which are defined as follows:

\[s_t = (pos_x(t), pos_y(t), vel_x(t), vel_y(t), time)\]
\[history = (s_{t0}, s_{t1}, ..., s_{t})\]

For unique identification, each agent has an id (a 3-character string) and, for visualisation purposes, a unique color. Both are created randomly during initialization.

The internal state of the agent can be altered by calling the update() function, which uses some (control) input to update the agent’s state using its dynamics, or the reset() function, which sets the internal state to some value directly. All other methods do not alter the internal state.

Parameters
  • position – current 2D position vector (2).

  • velocity – current 2D velocity vector (2).

  • time – current time stamp, default = 0.0.

  • history – current agent’s state history (N, 5), default = no history.

  • is_robot – flag describing whether the agent is the robot, default = False.

  • color – agent’s unique color for visualisation purposes, default = None so created randomly during initialization.

  • identifier – agent’s pre-set identifier, default = None, in which case a random identifier is created during initialization.

  • agent_params – dictionary of additional parameters stored in agent object.

check_feasibility_controls(controls: torch.Tensor) → bool

Check feasibility of a given set of controls to be executed by the internal agent.

To check feasibility, simply check whether the controls lie inside the agent’s internal control boundaries. Since changes of the control input from one step to the next are not limited, they are not checked here.

Parameters

controls – controls to be checked (N, 2).

check_feasibility_trajectory(trajectory: torch.Tensor, dt: float) → bool

Check feasibility of a given trajectory to be followed by the internal agent.

To check feasibility, convert the trajectory to control inputs using the agent’s internal inverse dynamics and check whether all of them lie inside its control boundaries.

Parameters
  • trajectory – trajectory to be checked (N, 4).

  • dt – time interval [s] between discrete trajectory states.

static compute_acceleration(trajectory: torch.Tensor, dt: float) → torch.Tensor

Compute the accelerations (ax, ay) of the agent given some trajectory, using a central difference scheme to numerically differentiate the velocity at every time-step.

Parameters
  • trajectory – agent trajectory (t_horizon, >=4 = px, py, vx, vy).

  • dt – time interval [s] between discrete trajectory states.
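A minimal sketch of this central-difference scheme (the function name and the boundary handling are illustrative assumptions, not the library’s internals):

    import torch

    def central_difference_acceleration(trajectory: torch.Tensor, dt: float) -> torch.Tensor:
        # Velocities are columns 2:4 of the (t_horizon, >=4) trajectory tensor.
        vel = trajectory[:, 2:4]
        acc = torch.zeros_like(vel)
        # Central differences at interior points: a_i = (v_{i+1} - v_{i-1}) / (2 * dt).
        acc[1:-1] = (vel[2:] - vel[:-2]) / (2 * dt)
        # One-sided differences at the boundaries (an assumption of this sketch).
        acc[0] = (vel[1] - vel[0]) / dt
        acc[-1] = (vel[-1] - vel[-2]) / dt
        return acc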

abstract control_limits() → Tuple[float, float]

Returns agent’s control input limitations, i.e. lower and upper bound.

abstract control_norm(controls: torch.Tensor) → torch.Tensor

Compute the agent’s control norm ||u||.

Parameters

controls – controls to calculate the norm from (N, 2).

Returns

control norm (N, 2).

detach()

Detach the agent’s internal variables (position, velocity, history) from the computation graph. This is sometimes required to completely separate subsequent computations in PyTorch.

abstract dx_du(controls: torch.Tensor, dt: float) → torch.Tensor

Compute the derivative of the agent’s state with respect to its control input evaluated over controls u with discrete state time-step dt.

dynamics(state: torch.Tensor, action: torch.Tensor, dt: float) → torch.Tensor

Forward integrate the agent’s motion given some state-action pair and an integration time-step.

Passing the state, instead of using the internal state, allows the method to be used for state vectors other than the internal one, e.g. for forward predicting over a time horizon > 1. Since every agent type has different dynamics (such as single-integrator or Dubins car), this method is implemented abstractly.

Parameters
  • state – state to be updated @ t = k (5).

  • action – control input @ t = k (size depending on agent type).

  • dt – forward integration time step [s].

Returns

updated state vector with time @ t = k + dt (5).
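For intuition, a single-integrator realization of this interface could look as follows (a sketch based on the dynamics documented in the Single Integrator section below, not the library’s actual implementation):

    import torch

    def single_integrator_dynamics(state: torch.Tensor, action: torch.Tensor, dt: float) -> torch.Tensor:
        # state = (px, py, vx, vy, t); for a single integrator the action is the
        # new velocity: vel_{t+1} = action, pos_{t+1} = pos_t + vel_{t+1} * dt.
        position = state[0:2] + action * dt
        time = state[4] + dt
        return torch.cat((position, action, time.view(1)))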

static expand_state_vector(state_4: torch.Tensor, time: float) → torch.Tensor

Expand a 4-dimensional (x, y, vx, vy) state vector by time information.

expand_trajectory(path: torch.Tensor, dt: float) → torch.Tensor

Derive (position, velocity)-trajectory information from position data only, using numerical differentiation, i.e. v_i = (x_{i+1} - x_i) / dt.

Since the path is not constrained to be dynamically feasible, stretch it so that the maximal velocity is the maximally allowed velocity for the dynamics.

Parameters
  • path – sequence of states (position, velocity) without temporal dimension (N, 4).

  • dt – time interval which is assumed to be constant over full path sequence [s].

Returns

temporally-expanded trajectory (N, 5).

inverse_dynamics(state: torch.Tensor, state_previous: torch.Tensor, dt: float) → torch.Tensor

Determine the agent’s motion given its current and previous state.

Passing the state, instead of using the internal state, allows the method to be used for state vectors other than the internal one, e.g. for forward predicting over a time horizon > 1. Since every agent type has different dynamics (such as single-integrator or Dubins car), this method is implemented abstractly.

Parameters
  • state – state @ t = k (4 or 5).

  • state_previous – previous state @ t = k - dt (4 or 5).

  • dt – forward integration time step [s].

Returns

control input @ t = k (size depending on agent type).
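For intuition again, the single-integrator inverse could look like this (a sketch; for a single integrator the control is simply the velocity connecting the two positions):

    import torch

    def single_integrator_inverse_dynamics(state: torch.Tensor,
                                           state_previous: torch.Tensor,
                                           dt: float) -> torch.Tensor:
        # u_k = (pos_k - pos_{k-1}) / dt recovers the velocity command
        # that moved the agent between the two states.
        return (state[0:2] - state_previous[0:2]) / dt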

make_controls_feasible(controls: torch.Tensor) → torch.Tensor

Make controls feasible by clipping them between the agent’s lower and upper control boundaries.

Return the transformed feasible controls. The direction of the controls is not changed, since only the length of the control vector is adapted. If the controls are all zeros, they are returned unchanged.
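A minimal sketch of such direction-preserving clipping, assuming a scalar norm bound u_max (illustrative, not the library’s implementation):

    import torch

    def clip_controls_by_norm(controls: torch.Tensor, u_max: float) -> torch.Tensor:
        # Scale each control vector down to the norm bound while keeping its
        # direction; zero controls stay zero since their scale factor is zero.
        norms = torch.norm(controls, dim=-1, keepdim=True)
        scale = torch.clamp(norms, max=u_max) / torch.clamp(norms, min=1e-8)
        return controls * scale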

abstract reachability_boundary(time_steps: int, dt: float) → mantrap.utility.maths.Shape2D

Compute the forward reachable set within the time-horizon with time_steps number of discrete steps with time-step dt. While the reachable set can be computed numerically in general, an exact (and more efficient) implementation has to be done in the child classes. Return this exact solution as class object of Shape2D type.

Parameters
  • time_steps – number of discrete time-steps in reachable time-horizon.

  • dt – time interval which is assumed to be constant over full path sequence [s].

reset(state: torch.Tensor, history: torch.Tensor = None)

Reset the complete state of the agent by resetting its position and velocity. If history is None, adapt the agent’s history to the new state (i.e. append the new state to the already existing history); otherwise set the history to the given trajectory.

Parameters
  • state – new state (5).

  • history – new state history (N, 5).

roll_trajectory(trajectory: torch.Tensor, dt: float) → torch.Tensor

Determine the controls by iteratively applying the agent’s inverse dynamics. A perfect model, i.e. correct dynamics without uncertainty, is assumed.

To guarantee that the result is invertible, i.e. that unrolling the inferred controls reproduces the original trajectory, no checks for the feasibility of the controls are made. Note that this function does not update the agent; it merely determines the theoretical controls given the agent’s dynamics and the trajectory.

Parameters
  • trajectory – sequence of states (N, 4).

  • dt – time interval [s] between discrete trajectory states.

Returns

inferred controls (no uncertainty in dynamics assumption !), (N, input_size).

sanity_check() → bool

Sanity check for agent. In order to evaluate the sanity of the agent in the most general form, several internal states such as the position and velocity are checked to be of the right type. Also the history should always reflect the states until and including (!) the current state.

unroll_trajectory(controls: torch.Tensor, dt: float) → torch.Tensor

Build the trajectory from some controls and the current state by iteratively applying the model dynamics. A perfect model, i.e. correct dynamics without uncertainty, is assumed.

To guarantee that the unrolled trajectory is invertible, i.e. that transforming it back yields the same controls, no checks for the feasibility of the controls are made. Note that this function does not update the agent; it merely determines the theoretical trajectory given the agent’s dynamics and controls.

Parameters
  • controls – sequence of inputs to apply to the robot (N, input_size).

  • dt – time interval [s] between discrete trajectory states.

Returns

resulting trajectory (no uncertainty in dynamics assumption !), (N, 4).
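The invertibility property can be checked with a round-trip (hypothetical usage based on the signatures documented here, using the single-integrator agent described below; the exact trajectory shape, e.g. whether the initial state is included, follows the conventions above):

    import torch
    from mantrap.agents.integrator_single import IntegratorDTAgent

    agent = IntegratorDTAgent(position=torch.zeros(2), dt=0.4)
    controls = torch.rand(10, 2)

    trajectory = agent.unroll_trajectory(controls, dt=0.4)
    controls_back = agent.roll_trajectory(trajectory, dt=0.4)
    assert torch.allclose(controls, controls_back)  # rolling back recovers the controls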

update(action: torch.Tensor, dt: float) → Tuple[torch.Tensor, torch.Tensor]

Update internal state (position, velocity and history) by forward integrating the agent’s dynamics over the given time-step. The new state and time are then appended to the agent’s state history.

Parameters
  • action – control input (size depending on agent type).

  • dt – forward integration time step [s].

Returns

executed control action and new agent state

update_inverse(state_next: torch.Tensor, dt: float) → Tuple[torch.Tensor, torch.Tensor]

Update the internal state (position, velocity and history) by backward integrating the agent’s dynamics: use the inverse dynamics to compute the action executed between the current and the given next state, then execute this action again to obtain the complete updated state and to ensure feasibility.

Parameters
  • state_next – next state (5).

  • dt – forward integration time step [s].

Returns

executed control action and new agent state

Linear Agents

class mantrap.agents.base.linear.LinearDTAgent(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), history: torch.Tensor = None, dt: float = None, max_steps: int = 20, **agent_kwargs)

Intermediate agent class for agents having linear state dynamics.

For linear state dynamics the dynamics can be computed very efficiently, using matrix-vector multiplication, with constant matrices (state-space matrices).

\[x_{i + n} = A^n x_i + \sum_{k=0}^{n-1} A^{n-k-1} B u_{i+k}\]

Passing the simulation time-step at construction, instead of passing it to every function individually, is admittedly not elegant, but it enables pre-computing the agent’s dynamics matrices, which speeds up the dynamics computation by at least a factor of 5 (see examples/tools/timing), especially since the dynamics() method is the most frequently called function in the full program. Storing the same information twice is generally a bad programming paradigm, and restricting the controller to a single time-step would reduce its capabilities a lot. As a trade-off, a dictionary of dynamics matrices is built, supporting multiple time-steps while still computing each set of matrices only once (see the sketch after the parameter list).

Parameters
  • dt – default dynamics time-step [s], default none (no dynamics pre-computation).

  • max_steps – maximal number of pre-computed rolling steps.
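A sketch of the pre-computation trade-off described above, assuming a 4-dimensional (px, py, vx, vy) state for simplicity (the time dimension is handled separately, cf. the Tn matrix below); the function and dictionary names are illustrative, not the library’s internals:

    import torch

    def single_integrator_matrices(dt: float):
        # x_{k+1} = A x_k + B u_k for the state (px, py, vx, vy); the velocity
        # entries are overwritten by the control, since vel_{t+1} = action.
        A = torch.tensor([[1.0, 0.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0, 0.0],
                          [0.0, 0.0, 0.0, 0.0],
                          [0.0, 0.0, 0.0, 0.0]])
        B = torch.tensor([[dt, 0.0],
                          [0.0, dt],
                          [1.0, 0.0],
                          [0.0, 1.0]])
        return A, B

    # Compute the matrices once per time-step and re-use them afterwards.
    _matrix_cache = {}

    def matrices(dt: float):
        if dt not in _matrix_cache:
            _matrix_cache[dt] = single_integrator_matrices(dt)
        return _matrix_cache[dt]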

dx_du(controls: torch.Tensor, dt: float) → torch.Tensor

Compute the derivative of the agent’s state with respect to its control input, evaluated over controls u with discrete state time-step dt.

As follows from the expression for the state at time-step n, x_n, in dependence of the initial state x_0 and the control inputs u_k,

\[x_n = A^n x_0 + \sum_{k=0}^{n-1} A^{n-k-1} B u_k\]

the control inputs are assumed to be independent of each other, so their derivatives do not depend on each other, i.e. du_i/du_j = 0 for all i != j. Also, when the equation above is used, the state at time-step n depends only on the initial state and the control inputs, not on the intermediate states. Therefore the derivative simplifies to

\[\frac{\partial x_n}{\partial u_k} = A^{n-k-1} B \quad \forall \, 0 \leq k < n\]

dynamics_rolling_matrices(dt: float, max_steps: int) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor]

Determine the matrices for batched trajectory-rolling dynamics, using the equation shown in the definition of the class, stacked into the matrices An, Bn and Tn.

\[An = [I, A, A^2, ..., A^n]\]
\[Bn = [[B, 0, ..., 0], [AB, B, 0, ..., 0], ..., [A^{n-1} B, ..., B]]\]
\[Tn = [[0, 0, 0, 0, 0], [0, 0, 0, 0, 1], ..., [0, 0, 0, 0, n]]\]
Parameters
  • dt – dynamics integration time-step [s].

  • max_steps – maximal number of pre-computed steps.
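A sketch of how the An and Bn stacks could be assembled from single-step state-space matrices A and B (the Tn matrix, which propagates the time entry, is omitted; names and the exact block ordering are illustrative assumptions):

    import torch

    def rolling_matrices(A: torch.Tensor, B: torch.Tensor, max_steps: int):
        # Stack X = An x0 + Bn U with An = [I, A, A^2, ...] and Bn the lower
        # block-triangular matrix [[0, ...], [B, 0, ...], [AB, B, 0, ...], ...].
        n_x, n_u = A.shape[0], B.shape[1]
        A_powers = [torch.eye(n_x)]
        for _ in range(max_steps):
            A_powers.append(A_powers[-1] @ A)
        An = torch.cat(A_powers, dim=0)  # ((max_steps + 1) * n_x, n_x)
        Bn = torch.zeros((max_steps + 1) * n_x, max_steps * n_u)
        for i in range(1, max_steps + 1):        # block row i corresponds to x_i
            for k in range(i):                   # x_i depends on u_0 ... u_{i-1}
                block = A_powers[i - k - 1] @ B  # A^{i-k-1} B
                Bn[i * n_x:(i + 1) * n_x, k * n_u:(k + 1) * n_u] = block
        return An, Bn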

reachability_boundary(time_steps: int, dt: float) → mantrap.utility.maths.Circle

Generally, for linear agents the N-th state can easily be expressed in terms of the controls and the initial state by nesting the linear dynamics, resulting in

\[x_{i + N} = A^N x_i + \sum_{k = 0}^{N - 1} A^{N - k - 1} B u_{i + k}\]

Also, linear agents are assumed to behave isotropically (here), i.e. the control input can change its direction instantly and without limitation. Therefore the forward reachable set within time_steps is just a circle (in general an ellipse, but the agent has the same control bounds in both x- and y-direction) around the current position, with radius equal to the maximal distance the agent can cover within the horizon, i.e. the maximally allowed speed times the horizon length.

Parameters
  • time_steps – number of discrete time-steps in reachable time-horizon.

  • dt – time interval which is assumed to be constant over full path sequence [s].
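Under this isotropy assumption the boundary reduces to a circle around the current position, and its radius is just the maximal distance reachable within the horizon. A tiny, hypothetical helper (not the library’s code):

    def reachable_radius(speed_max: float, time_steps: int, dt: float) -> float:
        # The agent can travel at most speed_max in any direction
        # for the whole horizon of time_steps * dt seconds.
        return speed_max * time_steps * dt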

roll_trajectory(trajectory: torch.Tensor, dt: float) → torch.Tensor

Determine the controls by iteratively applying the agent’s inverse dynamics. A perfect model, i.e. correct dynamics without uncertainty, is assumed.

To guarantee that the result is invertible, i.e. that unrolling the inferred controls reproduces the original trajectory, no checks for the feasibility of the controls are made. Note that this function does not update the agent; it merely determines the theoretical controls given the agent’s dynamics and the trajectory.

For linear agents, determining the controls from the trajectory is very straightforward and, most importantly, does not have to be done sequentially (at least not for the agent types used within this project). Therefore roll_trajectory() can be implemented more efficiently here.

Parameters
  • trajectory – sequence of states (N, 4).

  • dt – time interval [s] between discrete trajectory states.

Returns

inferred controls (no uncertainty in dynamics assumption !), (N, input_size).

unroll_trajectory(controls: torch.Tensor, dt: float) → torch.Tensor

Build the trajectory from some controls and the current state by iteratively applying the model dynamics. A perfect model, i.e. correct dynamics without uncertainty, is assumed.

To guarantee that the unrolled trajectory is invertible, i.e. that transforming it back yields the same controls, no checks for the feasibility of the controls are made. Note that this function does not update the agent; it merely determines the theoretical trajectory given the agent’s dynamics and controls.

Parameters
  • controls – sequence of inputs to apply to the robot (N, input_size).

  • dt – time interval [s] between discrete trajectory states.

Returns

resulting trajectory (no uncertainty in dynamics assumption !), (N, 4).

Single Integrator

class mantrap.agents.integrator_single.IntegratorDTAgent(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), history: torch.Tensor = None, dt: float = None, max_steps: int = 20, **agent_kwargs)

Linear single integrator dynamics:

\[vel_{t+1} = action\]
\[pos_{t+1} = pos_t + vel_{t+1} * dt\]
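Hypothetical usage, based on the update() signature documented above (for a single integrator the commanded control is a velocity):

    import torch
    from mantrap.agents.integrator_single import IntegratorDTAgent

    agent = IntegratorDTAgent(position=torch.tensor([1.0, 0.0]), dt=0.4)
    action = torch.tensor([0.5, 0.0])  # velocity command
    control, state = agent.update(action, dt=0.4)
    # By the equations above: state = (px, py, vx, vy, t) = (1.2, 0.0, 0.5, 0.0, 0.4).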
property acceleration_max

Since single integrators are assumed to react instantly, their maximal acceleration is in fact infinite!

control_limits() → Tuple[float, float]
\[[- v_{max}, v_{max}]\]
control_norm(controls: torch.Tensor) → torch.Tensor

Compute the agent’s control norm ||u||, i.e. the L2-norm of the controls.

Parameters

controls – controls to calculate the norm from (N, 2).

Returns

control norm (N, 1).

Double Integrator

class mantrap.agents.integrator_double.DoubleIntegratorDTAgent(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), history: torch.Tensor = None, dt: float = None, max_steps: int = 20, **agent_kwargs)

Linear double integrator dynamics:

\[vel_{t+1} = vel_t + action * dt\]
\[pos_{t+1} = pos_t + vel_{t+1} * dt\]
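Analogous hypothetical usage (for a double integrator the commanded control is an acceleration):

    import torch
    from mantrap.agents.integrator_double import DoubleIntegratorDTAgent

    agent = DoubleIntegratorDTAgent(position=torch.zeros(2),
                                    velocity=torch.tensor([1.0, 0.0]), dt=0.4)
    action = torch.tensor([0.5, 0.0])  # acceleration command
    control, state = agent.update(action, dt=0.4)
    # By the equations above: vel_{t+1} = (1.2, 0.0) and pos_{t+1} = (0.48, 0.0).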
control_limits() → Tuple[float, float]
\[[- a_{max}, a_{max}]\]
control_norm(controls: torch.Tensor) → torch.Tensor

Compute the agent’s control norm ||u||, i.e. the L1-norm of the controls.

Parameters

controls – controls to calculate the norm from (N, 2).

Returns

control norm (N, 2).