Environment¶
Graph-Based Environment (abstract)¶
-
class
mantrap.environment.base.graph_based.
GraphBasedEnvironment
(ego_position: torch.Tensor = None, ego_velocity: torch.Tensor = tensor([0., 0.]), ego_history: torch.Tensor = None, ego_type: abc.ABCMeta = <class 'mantrap.agents.integrator_double.DoubleIntegratorDTAgent'>, x_axis: Tuple[float, float] = (-10, 10), y_axis: Tuple[float, float] = (-10, 10), dt: float = 0.4, time: float = 0.0, config_name: str = 'unknown')¶ General environment engine for obstacle-free, interaction-aware, probabilistic and multi-modal agent environments. As used in a robotics use-case the environment separates between the ego-agent (the robot) and ado-agents (other agents in the scene which are not the robot).
The internal states are the states of the ego and the ados. They can only be changed by calling step(), which simulates how the environment reacts to an action performed by the ego, or step_reset(), which resets the environment directly to some given states.
The simulated world is two-dimensional and defined in the area limited by the passed x_axis and y_axis. It has a constant environment time-step dt.
-
add_ado
(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), history: torch.Tensor = None, **ado_kwargs) → mantrap.agents.integrator_single.IntegratorDTAgent¶ Add ado (i.e. non-robot) agent to environment as single integrator.
Single-integrator dynamics are well suited to represent pedestrians (ados), since pedestrians change their motion quickly and reactively.
While the ego is added to the environment during initialization, ado agents have to be added afterwards, one at a time. To do so, initialize a single-integrator agent from its state vectors, namely its position, velocity and state history. The ado id, color and other parameters can either be passed via ado_kwargs or are created automatically during the agent’s initialization.
After initialization, check whether the given states are valid, i.e. that they lie within the 2D space in which the environment is defined.
- Parameters
position – ado initial position (2D).
velocity – ado initial velocity (2D).
history – ado state history (if None then just current state as history).
ado_kwargs – additional kwargs for ado initialization.
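The single-integrator model and the bounds check described above can be sketched as follows. This is a minimal illustration in plain Python rather than PyTorch, not the mantrap implementation; `single_integrator_step` and `in_bounds` are hypothetical helper names, and the default values are taken from the class signature.

```python
from typing import Tuple

Vec2 = Tuple[float, float]


def single_integrator_step(position: Vec2, velocity: Vec2, dt: float = 0.4) -> Vec2:
    """Single integrator: the velocity is the control input, x' = x + v * dt."""
    return (position[0] + velocity[0] * dt, position[1] + velocity[1] * dt)


def in_bounds(position: Vec2,
              x_axis: Tuple[float, float] = (-10, 10),
              y_axis: Tuple[float, float] = (-10, 10)) -> bool:
    """Check that a 2D position lies inside the area spanned by x_axis and y_axis."""
    return (x_axis[0] <= position[0] <= x_axis[1]
            and y_axis[0] <= position[1] <= y_axis[1])


# Hypothetical ado at (1, 2) moving with velocity (0.5, -1) for one time-step.
next_position = single_integrator_step((1.0, 2.0), (0.5, -1.0))
```

Because the velocity acts directly as the control input, a single-integrator agent can change its direction of motion within one time-step.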
-
agent_by_id
(agent_id: str) → Optional[mantrap.agents.base.discrete.DTAgent]¶ Return an agent object by its id, including the ego agent.
- Parameters
agent_id – identifier of agent to return.
-
check_distribution
(distribution: Dict[str, torch.distributions.distribution.Distribution], t_horizon: int)¶ Check the distribution dictionary for correctness.
-
compute_distributions
(ego_trajectory: torch.Tensor, **kwargs) → Dict[str, torch.distributions.distribution.Distribution]¶ Build a dictionary of velocity distributions for every ado as it would be with the presence of a robot in the scene.
Build the graph conditioned on some ego_trajectory, which is assumed to be fixed while the ados in the scene react to it. Because the dictionary is built using PyTorch, a computational graph is created in the background, which can later be used to automatically differentiate the outputs with respect to the inputs.
- Parameters
ego_trajectory – ego’s trajectory (t_horizon, 5).
- Kwargs
additional graph building arguments.
- Returns
dictionary over every state of every agent in the scene for t in [0, t_horizon].
-
compute_distributions_wo_ego
(t_horizon: int, **kwargs) → Dict[str, torch.distributions.distribution.Distribution]¶ Build a dictionary of velocity distributions for every ado as it would be without the presence of a robot in the scene.
- Parameters
t_horizon – number of prediction time-steps.
- Kwargs
additional graph building arguments.
- Returns
ado_id-keyed velocity distribution dictionary for times [0, t_horizon].
-
copy
(env_type: str = None) → mantrap.environment.base.graph_based.GraphBasedEnvironment¶ Create copy of environment.
However just using deepcopy is not supported for tensors that are not detached from the PyTorch computation graph. Therefore re-initialize the objects such as the agents in the environment and reset their state to the internal current state.
While copying the environment-type can be defined by the user, which is possible due to standardized class interface of every environment-type. When no environment is defined, the default environment will be used which is the type of the executing class object.
-
detach
()¶ Detach all internal agents (ego and all ados) from computation graph. This is sometimes required to completely separate subsequent computations in PyTorch.
-
predict_w_controls
(ego_controls: torch.Tensor) → Optional[torch.Tensor]¶ Predict the ado path distribution means conditioned on the robot controls.
- Parameters
ego_controls – ego control input (prediction_horizon, 2).
- Returns
predicted ado paths (num_ados, num_samples, prediction_horizon+1, num_modes, 2). If no ado is in the scene, None is returned instead.
-
predict_w_trajectory
(ego_trajectory: torch.Tensor) → Optional[torch.Tensor]¶ Predict the ado path distribution means conditioned on the robot trajectory.
- Parameters
ego_trajectory – ego trajectory (prediction_horizon + 1, 5).
- Returns
predicted ado paths (num_ados, num_samples, prediction_horizon+1, num_modes, 2). If no ado is in the scene, None is returned instead.
-
predict_wo_ego
(t_horizon: int) → Optional[torch.Tensor]¶ Predict the unconditioned ado path distribution means (i.e. as if no robot were in the scene).
- Parameters
t_horizon – prediction horizon, number of discrete time-steps.
- Returns
predicted ado paths (num_ados, num_samples, t_horizon+1, num_modes, 2). If no ado is in the scene, None is returned instead.
-
same_initial_conditions
(other: mantrap.environment.base.graph_based.GraphBasedEnvironment)¶ Similar to the __eq__() function, but without enforcing the environment parameters to be completely equivalent; merely the initial conditions, such as the states of the agents in the scene, have to be equal. Hence, parameters that only affect the prediction do not have to be equal.
- Parameters
other – comparable environment object.
-
sample_w_controls
(ego_controls: torch.Tensor, num_samples: int = 1) → Optional[torch.Tensor]¶ Predict the ado path samples conditioned on the robot controls.
- Parameters
ego_controls – ego control input (prediction_horizon, 2).
num_samples – number of samples to return.
- Returns
predicted ado paths (num_ados, num_samples, prediction_horizon+1, num_modes, 2). If no ado is in the scene, None is returned instead.
-
sample_w_trajectory
(ego_trajectory: torch.Tensor, num_samples: int = 1) → Optional[torch.Tensor]¶ Predict the ado path samples conditioned on the robot trajectory.
- Parameters
ego_trajectory – ego trajectory (prediction_horizon + 1, 5).
num_samples – number of samples to return.
- Returns
predicted ado paths (num_ados, num_samples, prediction_horizon+1, num_modes, 2). If no ado is in the scene, None is returned instead.
-
sample_wo_ego
(t_horizon: int, num_samples: int = 1) → Optional[torch.Tensor]¶ Predict the unconditioned ado path samples (i.e. as if no robot were in the scene).
- Parameters
t_horizon – prediction horizon, number of discrete time-steps.
num_samples – number of samples to return.
- Returns
predicted ado paths (num_ados, num_samples, t_horizon+1, num_modes, 2). If no ado is in the scene, None is returned instead.
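The shape convention shared by the predict and sample methods can be summarized in a small helper. This is only a sketch of the documented return shapes; `expected_path_shape` is a hypothetical name and not part of mantrap.

```python
from typing import Optional, Tuple


def expected_path_shape(num_ados: int, num_samples: int, t_horizon: int,
                        num_modes: int = 1) -> Optional[Tuple[int, int, int, int, int]]:
    """Shape of the predicted ado paths, or None if there is no ado in the scene."""
    if num_ados == 0:
        return None
    # One extra time-step, since the path includes the current position (t = 0).
    return (num_ados, num_samples, t_horizon + 1, num_modes, 2)


shape = expected_path_shape(num_ados=2, num_samples=10, t_horizon=5)
```

The trailing dimension of size 2 holds the (x, y) position, so the result describes paths rather than full states.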
-
sanity_check
(check_ego: bool = False) → bool¶ Check for the sanity of the scene and agents.
For the environment to be “sane”, general environment properties should hold and all agents living in the environment should be “sane” as well. For example, it is ensured that the number and order of ado_ids match the list of ados itself.
-
states
() → Tuple[torch.Tensor, torch.Tensor]¶ Return current states of ego and ado agents in the scene. Since the current state is known for every ado the states are deterministic and uni-modal. States are returned as vector including temporal dimension.
- Returns
ego state vector including temporal dimension (5).
- Returns
ado state vectors including temporal dimension (num_ados, 5).
-
step
(ego_action: torch.Tensor) → Tuple[torch.Tensor, Optional[torch.Tensor]]¶ Run environment step (time-step = dt).
Attention: This method changes the states of all environment agents by executing the ego action and sampling from the conditioned ado velocity distribution.
- Parameters
ego_action – planned ego control input for current time step (2).
- Returns
ado_states (num_ados, num_modes, 1, 5), ego_next_state (5) in next time step.
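For the ego part of a step, a double-integrator update over one time-step dt might look as follows. This is a plain-Python sketch under two assumptions that are not stated in this section: the 5-dimensional state is laid out as (x, y, vx, vy, t), and the ego action is an acceleration that is held constant over the time-step. `double_integrator_step` is a hypothetical name.

```python
from typing import Sequence, Tuple

# Assumed state layout: (x, y, vx, vy, t).
State = Tuple[float, float, float, float, float]


def double_integrator_step(state: State, action: Sequence[float], dt: float = 0.4) -> State:
    """Advance a double integrator by one time-step; the action is an acceleration."""
    x, y, vx, vy, t = state
    ax, ay = action
    return (
        x + vx * dt + 0.5 * ax * dt ** 2,  # position under constant acceleration
        y + vy * dt + 0.5 * ay * dt ** 2,
        vx + ax * dt,                      # velocity integrates the acceleration
        vy + ay * dt,
        t + dt,                            # time advances by the environment step
    )


ego_next = double_integrator_step((0.0, 0.0, 1.0, 0.0, 0.0), (1.0, 0.0))
```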
-
step_reset
(ego_next: Optional[torch.Tensor], ado_next: Optional[torch.Tensor])¶ Run environment step (time-step = dt). Instead of predicting the behaviour of every agent in the scene, it is given as an input and the agents are merely updated. All the ghosts (modes of an ado) will collapse to the same given state, since the update is deterministic.
- Parameters
ego_next – ego state for next time step (5).
ado_next – ado states for next time step (num_ados, num_modes, 1, 5).
-
visualize_prediction
(ego_trajectory: torch.Tensor, num_samples: int = 10, **vis_kwargs)¶ Visualize the predictions for the scene based on the given ego trajectory.
-
visualize_prediction_w_controls
(ego_controls: torch.Tensor, num_samples: int = 10, **vis_kwargs)¶ Visualize the predictions for the scene based on the given ego controls.
-
visualize_prediction_wo_ego
(t_horizon: int, num_samples: int = 10, **vis_kwargs)¶ Visualize the predictions for the scene without the ego, over t_horizon time-steps.
-
Particle Environments¶
-
class
mantrap.environment.base.particle.
ParticleEnvironment
(ego_position: torch.Tensor = None, ego_velocity: torch.Tensor = tensor([0., 0.]), ego_history: torch.Tensor = None, ego_type: abc.ABCMeta = <class 'mantrap.agents.integrator_double.DoubleIntegratorDTAgent'>, x_axis: Tuple[float, float] = (-10, 10), y_axis: Tuple[float, float] = (-10, 10), dt: float = 0.4, time: float = 0.0, config_name: str = 'unknown')¶ -
abstract
create_particles
(num_particles: int, param_dicts: Dict[str, Dict[str, Dict]] = None, const_dicts: Dict[str, Dict[str, Any]] = None, **particle_kwargs) → Tuple[List[List[mantrap.agents.integrator_single.IntegratorDTAgent]], torch.Tensor]¶ Create particles from internal parameter distribution.
To create the particles, sample from the underlying parameter distributions, which are modelled as independent, uni-modal Gaussian distributions, individual for each ado. That is, build the distribution, sample N = num_particles values for each ado and parameter, and create N copies of each ado (particles), each storing one sampled parameter value.
Attention: The two dictionary arguments do not share the same convention: param_dicts is keyed by parameter name first, const_dicts by ado id first.
- Parameters
num_particles – number of particles per ado.
param_dicts – parameter dictionaries for varying parameters between particles. {param_name: {ado_id: (mean, variance)}, ….}
const_dicts – dictionary mapping ado-wise constant parameters to parameter name. {ado_id: {param_name: values}, ….}
- Returns
list of N = num_particles particles for every ado in the scene.
- Returns
probability (pdf) of each particle (num_ados, num_particles).
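The sampling scheme described above can be sketched for a single parameter as follows. This is a plain-Python illustration, not the mantrap implementation: `gaussian_pdf` and `sample_parameter` are hypothetical names, the ado ids and values are made up, and the inner dictionary follows the documented {ado_id: (mean, variance)} convention.

```python
import math
import random
from typing import Dict, List, Tuple


def gaussian_pdf(x: float, mean: float, variance: float) -> float:
    """Density of a uni-modal Gaussian N(mean, variance) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def sample_parameter(param_dict: Dict[str, Tuple[float, float]], num_particles: int,
                     rng: random.Random) -> Tuple[Dict[str, List[float]], Dict[str, List[float]]]:
    """Sample num_particles values per ado and evaluate the pdf of each sample."""
    values, pdfs = {}, {}
    for ado_id, (mean, variance) in param_dict.items():
        std = math.sqrt(variance)  # random.gauss expects the standard deviation
        values[ado_id] = [rng.gauss(mean, std) for _ in range(num_particles)]
        pdfs[ado_id] = [gaussian_pdf(v, mean, variance) for v in values[ado_id]]
    return values, pdfs


# Hypothetical ado ids and (mean, variance) pairs for the parameter v0.
v0_dict = {"ado_0": (4.0, 0.5), "ado_1": (2.0, 0.1)}
values, pdfs = sample_parameter(v0_dict, num_particles=3, rng=random.Random(0))
```

Each ado ends up with num_particles sampled values and one pdf value per sample, which corresponds to the (num_ados, num_particles) shape of the returned probabilities.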
-
abstract
simulate_particle
(particle: mantrap.agents.integrator_single.IntegratorDTAgent, means_t: torch.Tensor, ego_state_t: torch.Tensor = None) → mantrap.agents.integrator_single.IntegratorDTAgent¶ Forward simulate particle for one time-step (t -> t + 1).
- Parameters
particle – particle agent to simulate.
means_t – means of positional and velocity distribution at time t (num_ados, 4).
ego_state_t – ego/robot state at time t.
-
Social Forces¶
Social Forces Simulation.
Pedestrian dynamics based on “Social Force Model for Pedestrian Dynamics” (D. Helbing, P. Molnar). The idea of Social Forces is to determine interaction forces by taking into account the following entities:
Goal force: Each ado has a specific goal state/position in mind, towards which it moves. The goal pulling force is modelled as a correction term between the current velocity v_a and the desired velocity v_a^0 e_a, where e_a is the goal direction (unit vector from the current position to the goal).
\[F_{goal} = \frac{1}{\tau_a} (v_a^0 e_a - v_a)\]
Interaction force: To model interaction between multiple agents, such as collision avoidance, each agent has a repulsion field that decays exponentially with increasing distance. The interaction term is constructed from this field together with the scalar product of the velocities of each agent pair (in order not to alter close but non-interfering agents, e.g. agents moving parallel to each other).
\[V_{aB}(x) = V^0_a \exp(-x / \sigma_a)\]
\[F_{interaction} = - \nabla_{r_{ab}} V_{aB}(||r_{ab}||)\]
Although theoretically a force introduces an acceleration as the ado’s control input (double, not single, integrator dynamics), accelerations are simplified to act as velocities. This is reasonable because pedestrians react much faster than the environment sampling rate.
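The two force terms can be evaluated directly from the equations above. The following is a plain-Python sketch, not the mantrap implementation: `goal_force` and `interaction_force` are hypothetical names, r_ab is taken as the vector from agent b to agent a, and the velocity scalar-product weighting mentioned in the text is omitted for brevity. Carrying out the gradient gives F_interaction = (V0 / sigma) exp(-||r_ab|| / sigma) r_ab / ||r_ab||.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def goal_force(position: Vec2, velocity: Vec2, goal: Vec2,
               desired_speed: float, tau: float) -> Vec2:
    """F_goal = (v0 * e - v) / tau, with e the unit vector towards the goal."""
    dx, dy = goal[0] - position[0], goal[1] - position[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        ex, ey = 0.0, 0.0  # at the goal: the term only damps the current velocity
    else:
        ex, ey = dx / dist, dy / dist
    return ((desired_speed * ex - velocity[0]) / tau,
            (desired_speed * ey - velocity[1]) / tau)


def interaction_force(pos_a: Vec2, pos_b: Vec2, potential_v0: float, sigma: float) -> Vec2:
    """F = -grad V(||r_ab||) for V(x) = V0 * exp(-x / sigma); pushes a away from b."""
    rx, ry = pos_a[0] - pos_b[0], pos_a[1] - pos_b[1]
    dist = math.hypot(rx, ry)
    if dist == 0.0:
        return (0.0, 0.0)  # direction undefined for coinciding agents
    magnitude = potential_v0 / sigma * math.exp(-dist / sigma)
    return (magnitude * rx / dist, magnitude * ry / dist)


f_goal = goal_force((0.0, 0.0), (0.0, 0.0), (2.0, 0.0), desired_speed=1.0, tau=0.5)
f_near = interaction_force((1.0, 0.0), (0.0, 0.0), potential_v0=1.0, sigma=1.0)
f_far = interaction_force((2.0, 0.0), (0.0, 0.0), potential_v0=1.0, sigma=1.0)
```

The repulsion points away from the other agent and shrinks with distance, while the goal force vanishes once the agent moves towards its goal at the desired speed.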
Add ado (i.e. non-robot) agent to environment as single integrator.
While the ego is added to the environment during initialization, ado agents have to be added afterwards, one at a time. To do so, initialize a single-integrator agent from its state vectors, namely its position, velocity and state history. The ado id, color and other parameters can either be passed via ado_kwargs or are created automatically during the agent’s initialization.
After initialization, check whether the given states are valid, i.e. that they lie within the 2D space in which the environment is defined.
In addition, a goal state is added to the ado definition, as required by the social forces model.
- Parameters
position – ado initial position (2D).
velocity – ado initial velocity (2D).
history – ado state history (if None then just stacked current state).
goal – ado goal position (2D).
ado_kwargs – additional kwargs for ado initialization.
Create particles from internal parameter distribution.
To create the particles, sample from the underlying parameter distributions, which are modelled as independent, uni-modal Gaussian distributions, individual for each ado. That is, build the distribution, sample N = num_particles values for each ado and parameter, and create N copies of each ado (particles), each storing one sampled parameter value.
- Parameters
num_particles – number of particles per ado.
v0_dict – Gaussian distribution (mean, variance) of the parameter v0 by ado_id; if None, a Gaussian with mean, variance = mantrap.constants.SOCIAL_FORCES_DEFAULT_V0 is used for every ado.
sigma_dict – Gaussian distribution of the parameter sigma (similar to v0_dict).
tau – tau parameter, by default mantrap.constants.SOCIAL_FORCES_DEFAULT_TAU, which has to be shared by all agents.
- Returns
list of N = num_particles particles for every ado in the scene.
- Returns
probability (pdf) of each particle (num_ados, num_particles).
Forward simulate particle for one time-step (t -> t + 1).
Use the social forces equations written in the class description to update the particles. Thereby, take into account repulsive forces from both the robot and other ados as well as a pulling force between the particle and its “goal” state.
- Parameters
particle – particle agent to simulate.
means_t – means of positional and velocity distribution at time t (num_ados, 4).
ego_state_t – ego/robot state at time t.
Potential Field¶
-
class
mantrap.environment.simplified.potential_field.
PotentialFieldEnvironment
(ego_position: torch.Tensor = None, ego_velocity: torch.Tensor = tensor([0., 0.]), ego_history: torch.Tensor = None, ego_type: abc.ABCMeta = <class 'mantrap.agents.integrator_double.DoubleIntegratorDTAgent'>, x_axis: Tuple[float, float] = (-10, 10), y_axis: Tuple[float, float] = (-10, 10), dt: float = 0.4, time: float = 0.0, config_name: str = 'unknown')¶ Simplified version of social forces environment class.
The simplified model assumes static agents (ados) in the scene, which have zero velocity (if not stated otherwise) and remain at rest since no forces are applied. The graph model is thereby reduced to the pure interaction between ego and ado, without inter-ado interaction or goal pulling force. Since the ados would not move at all without an ego agent in the scene, the interaction loss simply comes down to the distance of every ado position over time from its initial (static) position.
Although theoretically a force introduces an acceleration as the ado’s control input (double, not single, integrator dynamics), accelerations are simplified to act as velocities. This is reasonable because pedestrians react much faster than the environment sampling rate.
-
create_particles
(num_particles: int, v0_dict: Dict[str, Tuple[float, float]] = None, **particle_kwargs) → Tuple[List[List[mantrap.agents.integrator_single.IntegratorDTAgent]], torch.Tensor]¶ Create particles from internal parameter distribution.
To create the particles, sample from the underlying parameter distributions, which are modelled as independent, uni-modal Gaussian distributions, individual for each ado. That is, build the distribution, sample N = num_particles values for each ado and parameter, and create N copies of each ado (particles), each storing one sampled parameter value.
- Parameters
num_particles – number of particles per ado.
v0_dict – Gaussian distribution (mean, variance) of the parameter v0 by ado_id; if None, a Gaussian with mean, variance = mantrap.constants.POTENTIAL_FIELD_V0_DEFAULT is used for every ado.
- Returns
list of N = num_particles particles for every ado in the scene.
- Returns
probability (pdf) of each particle (num_ados, num_particles).
-
simulate_particle
(particle: mantrap.agents.integrator_single.IntegratorDTAgent, means_t: torch.Tensor, ego_state_t: torch.Tensor = None) → mantrap.agents.integrator_single.IntegratorDTAgent¶ Forward simulate particle for one time-step (t -> t + 1).
As described in the class description the potential field environment merely takes into account the repulsive force with respect to the ego, not to other ados, and introduces it as direct control input for the particle.
- Parameters
particle – particle agent to simulate.
means_t – means of positional and velocity distribution at time t (num_ados, 4).
ego_state_t – ego/robot state at time t.
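A one-step particle update in this spirit might look as follows. This is a plain-Python sketch under loud assumptions: the repulsion is taken from the same exponential potential as in the Social Forces section (the potential actually used by PotentialFieldEnvironment is not stated here, and `sigma` is a made-up parameter), and `simulate_particle_step` is a hypothetical name. The repulsive force is applied directly as the particle's velocity.

```python
import math
from typing import Optional, Tuple

Vec2 = Tuple[float, float]


def simulate_particle_step(position: Vec2, ego_position: Optional[Vec2], v0: float,
                           sigma: float = 1.0, dt: float = 0.4) -> Tuple[Vec2, Vec2]:
    """Move an ado particle by the repulsion from the ego, used directly as velocity."""
    if ego_position is None:
        return position, (0.0, 0.0)  # no ego, no force: the particle stays at rest
    rx, ry = position[0] - ego_position[0], position[1] - ego_position[1]
    dist = math.hypot(rx, ry)
    if dist == 0.0:
        return position, (0.0, 0.0)  # repulsion direction is undefined
    speed = v0 / sigma * math.exp(-dist / sigma)  # assumed exponential potential
    velocity = (speed * rx / dist, speed * ry / dist)
    new_position = (position[0] + velocity[0] * dt, position[1] + velocity[1] * dt)
    return new_position, velocity


resting = simulate_particle_step((1.0, 0.0), None, v0=2.0)
pushed, pushed_velocity = simulate_particle_step((1.0, 0.0), (0.0, 0.0), v0=1.0)
```

Other ados do not appear in the update at all, which reflects the missing inter-ado interaction of this simplified model.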
-
Kalman¶
-
class
mantrap.environment.simplified.kalman.
KalmanEnvironment
(ego_position: torch.Tensor = None, ego_velocity: torch.Tensor = tensor([0., 0.]), ego_history: torch.Tensor = None, ego_type: abc.ABCMeta = <class 'mantrap.agents.integrator_double.DoubleIntegratorDTAgent'>, x_axis: Tuple[float, float] = (-10, 10), y_axis: Tuple[float, float] = (-10, 10), dt: float = 0.4, time: float = 0.0, config_name: str = 'unknown')¶ Kalman (Filter) - based Environment.
The Kalman environment implements the update rules defined in the Kalman Filter to update the agents’ states iteratively. No interactions between the agents are taken into account. The uncertainty of the prediction, which grows with the number of predicted time-steps, is modelled as a variance increasing at a constant rate, which in the Kalman equations is a constant noise Q_k. Assuming constant state space matrices (F, B), the Kalman prediction and update equations are applied,
with z_k = H x_k (H = H_k = eye(n)) the observations, I = eye(n) the identity matrix and R = 0 (perfect perception assumption). In the notation used in mantrap.agents the state space matrix A is used instead of F, but they are equivalent.
Since no interactions are assumed, the ados’ control inputs (i.e. velocities) are assumed to stay constant over the full prediction horizon.
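The effect of the constant noise Q on the predicted uncertainty can be illustrated with the standard Kalman prediction step, x_{k+1} = F x_k + B u_k and P_{k+1} = F P_k F^T + Q. The sketch below is a one-dimensional, plain-Python illustration for a single integrator (F = 1, B = dt) with a constant velocity as control input; it is not the mantrap implementation, and `kalman_predict` is a hypothetical name.

```python
from typing import List, Tuple


def kalman_predict(position: float, velocity: float, variance: float,
                   q: float, dt: float, t_horizon: int) -> List[Tuple[float, float]]:
    """Iterate the Kalman prediction step without measurements (1D, F = 1, B = dt)."""
    predictions = [(position, variance)]
    for _ in range(t_horizon):
        position = position + dt * velocity  # x_{k+1} = F x_k + B u_k
        variance = variance + q              # P_{k+1} = F P_k F^T + Q with F = 1
        predictions.append((position, variance))
    return predictions


predictions = kalman_predict(position=0.0, velocity=1.0, variance=0.0,
                             q=0.1, dt=0.4, t_horizon=3)
```

With F = 1 the variance grows linearly by Q per predicted time-step, which matches the constant-rate increase described above.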
Trajectron¶
- class
mantrap.environment.trajectron.
Trajectron
(ego_position: torch.Tensor = None, ego_velocity: torch.Tensor = tensor([0., 0.]), ego_history: torch.Tensor = None, ego_type: abc.ABCMeta = <class 'mantrap.agents.integrator_double.DoubleIntegratorDTAgent'>, dt: float = 0.4, **env_kwargs)¶Trajectron-based environment model (B. Ivanovic, T. Salzmann, M. Pavone).
The Trajectron model requires some robot position as input. Therefore, in order to minimize the impact of the ego robot on the trajectories (since the prediction should not be conditioned on the robot), a pseudo trajectory is used that is very far away from the actual scene.
Within the trajectory optimisation, the ados’ trajectories conditioned on the robot’s planned motion are compared with their trajectories without taking any robot into account. If separate conditioned and un-conditioned models were used for these predictions, and they behaved differently, there would be some base difference (even if no robot affects any ado at all), which might be larger in scale than the difference that conditioning on the robot makes. Minimizing the difference would then miss the goal of minimizing interaction.
add_ado
(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), history: torch.Tensor = None, **ado_kwargs) → mantrap.agents.integrator_single.IntegratorDTAgent¶Add ado (i.e. non-robot) agent to environment as single integrator.
While the ego is added to the environment during initialization, ado agents have to be added afterwards, one at a time. To do so, initialize a single-integrator agent from its state vectors, namely its position, velocity and state history. The ado id, color and other parameters can either be passed via ado_kwargs or are created automatically during the agent’s initialization.
After initialization, check whether the given states are valid, i.e. that they lie within the 2D space in which the environment is defined.
In addition to the internal representation of each ado within this class, the Trajectron model has its own graph, into which the ado has to be introduced as a node during initialization. Also, the Trajectron model is trained to predict accurately only if the agent has some history of length > 1. Therefore, if no history is given, a custom history is built by stacking the given (position, velocity) state over multiple time-steps. To enforce zero history, pass a non-None history argument.
- Parameters
position – ado initial position (2D).
velocity – ado initial velocity (2D).
history – ado state history (if None then just stacked current state).
ado_kwargs – additional kwargs for ado initialization.
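Building a history by stacking the current state might be sketched as follows. This is a plain-Python illustration with several assumptions that are not confirmed by this section: the state layout (x, y, vx, vy, t), the history length of 3, and the choice to space the time stamps by dt while keeping position and velocity fixed. `stack_history` is a hypothetical name.

```python
from typing import List, Tuple

Vec2 = Tuple[float, float]
# Assumed state layout: (x, y, vx, vy, t).
State = Tuple[float, float, float, float, float]


def stack_history(position: Vec2, velocity: Vec2, time: float = 0.0,
                  dt: float = 0.4, length: int = 3) -> List[State]:
    """Repeat the current state `length` times, oldest first, ending at `time`."""
    return [
        (position[0], position[1], velocity[0], velocity[1], time - k * dt)
        for k in range(length - 1, -1, -1)
    ]


history = stack_history((1.0, 2.0), (0.0, 0.0))
```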
agent_by_id
(agent_id: str) → Optional[mantrap.agents.base.discrete.DTAgent]¶Return an agent object by its id, including the ego agent.
- Parameters
agent_id – identifier of agent to return.
- static
agent_id_from_node
(node: str) → str¶In Trajectron, nodes have an identifier structured as “node_type/node_id”. As initialized, the node_id is identical to the internal agent id, while the node_type is e.g. “ROBOT” or “PEDESTRIAN”. However, the node_type is not assumed to be robot or pedestrian, since it does not change the structure.
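Splitting such an identifier is straightforward. The following is a minimal sketch of the described “node_type/node_id” structure, not the library's implementation; `split_node_identifier` is a hypothetical name and the example identifier is made up.

```python
from typing import Tuple


def split_node_identifier(node: str) -> Tuple[str, str]:
    """Split a "node_type/node_id" identifier into its two parts."""
    # Split only at the first slash, so the node type itself is never inspected.
    node_type, node_id = node.split("/", 1)
    return node_type, node_id


node_type, agent_id = split_node_identifier("PEDESTRIAN/ado_1")
```

Since the node type is only separated off and not validated, any type string works, in line with the remark above that the type does not have to be robot or pedestrian.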
detach
()¶Detaching the whole graph (which is the whole neural network) might be hard. Therefore just rebuild it completely from scratch, using the most up-to-date states of the agents.
SGAN¶
- class
mantrap.environment.sgan.
SGAN
(ego_position: torch.Tensor = None, ego_velocity: torch.Tensor = tensor([0., 0.]), ego_history: torch.Tensor = None, ego_type: abc.ABCMeta = <class 'mantrap.agents.integrator_double.DoubleIntegratorDTAgent'>, dt: float = 0.4, **env_kwargs)¶SGAN from “Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks” (Gupta, 2018).
The SGAN model is a GAN-based model for pedestrian prediction. As it is not conditioned on the robot, it is not of use for robot trajectory optimization here. However, it still can be utilized as an independent, but accurate pedestrian prediction model for evaluation purposes.
As a consequence, merely the trajectory sampling functions are implemented here. Since SGAN is based on a GAN, it merely outputs an empirical distribution; mean values (for the prediction functions) therefore cannot be evaluated without averaging over many samples.
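Estimating a mean path from an empirical distribution amounts to averaging the sampled paths per time-step. The sketch below shows this for a single ado in plain Python; `mean_path` is a hypothetical name, the sample values are made up, and it is not part of the SGAN environment.

```python
from typing import List, Tuple

Path = List[Tuple[float, float]]


def mean_path(samples: List[Path]) -> Path:
    """Average sampled 2D paths of equal length, time-step by time-step."""
    num_samples = len(samples)
    horizon = len(samples[0])
    return [
        (sum(sample[t][0] for sample in samples) / num_samples,
         sum(sample[t][1] for sample in samples) / num_samples)
        for t in range(horizon)
    ]


# Two hypothetical sampled paths for one ado.
samples = [[(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (3.0, -1.0)]]
estimate = mean_path(samples)
```

The quality of such an estimate depends on the number of samples, which is why many samples are needed for a reliable mean.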
add_ado
(position: torch.Tensor, velocity: torch.Tensor = tensor([0.0, 0.0]), history: torch.Tensor = None, **ado_kwargs) → mantrap.agents.integrator_single.IntegratorDTAgent¶Add ado (i.e. non-robot) agent to environment as single integrator.
While the ego is added to the environment during initialization, ado agents have to be added afterwards, one at a time. To do so, initialize a single-integrator agent from its state vectors, namely its position, velocity and state history. The ado id, color and other parameters can either be passed via ado_kwargs or are created automatically during the agent’s initialization.
After initialization, check whether the given states are valid, i.e. that they lie within the 2D space in which the environment is defined.
The SGAN model is trained to predict accurately only if the agent has some history of length > 1. Therefore, if no history is given, a custom history is built by stacking the given (position, velocity) state over multiple time-steps. To enforce zero history, pass a non-None history argument.
- Parameters
position – ado initial position (2D).
velocity – ado initial velocity (2D).
history – ado state history (if None then just stacked current state).
ado_kwargs – additional kwargs for ado initialization.
sample_w_trajectory
(ego_trajectory: torch.Tensor, num_samples: int = 1) → Optional[torch.Tensor]¶Predict the ado path samples conditioned on the robot trajectory.
As described above, the SGAN predictions are not conditioned on the ego trajectory. Consequently, this method is equivalent to the sample_wo_ego() method.
- Parameters
ego_trajectory – ego trajectory (prediction_horizon + 1, 5).
num_samples – number of samples to return.
- Returns
predicted ado paths (num_ados, num_samples, prediction_horizon+1, num_modes=1, 2). If no ado is in the scene, None is returned instead.
sample_wo_ego
(t_horizon: int, num_samples: int = 1) → Optional[torch.Tensor]¶Predict the unconditioned ado path samples (i.e. as if no robot were in the scene).
For prediction, simply call the SGAN generator model and sample trajectories from it.
- Parameters
t_horizon – prediction horizon, number of discrete time-steps.
num_samples – number of samples to return.
- Returns
predicted ado paths (num_ados, num_samples, t_horizon+1, num_modes=1, 2). If no ado is in the scene, None is returned instead.