Skip to main content

TF-Agents

A robust, scalable and easy to use Reinforcement Learning Library

Motivation

  • Greate for Learning RL: Colabs, examples, documentation
  • Well suited for solving complex problems with RL
  • Develop new RL algorithms quickly
  • Well tested and easy to configure with gin-config

Installation

pip install tf-agents-nightly

Example: Cart Pole

class CartPole(tf_agents.py_environment.PyEnvironment):

def observation_spec(self):
"""Defines the Observations"""

def action_spec(self):
"""Defines the Actions"""

def _reset(self):
"""Reset the environment and return an initial time_step(reward, observation)."""

def _step(self, action):
"""Apply the action an return the next time_step(reward, observation)."""

Trying to balance the Pole:

# Load the environment
env = suite_gym.load("CartPole-V1")

# Define a Policy
policy = ActorPolicy(...)

time_step = env.reset()
episode_return = 0.0

# Start playing
while not time_step.is_last():
policy_step = policy.action(time_step)
time_step = env.step(policy_step.action)
episode_return += time_step.reward
  • Actor Policy: Takes in observations and emits probability over the actions

Prepare to Train with TF Agents

# Create the Environment
tf_env = tf_py_environment.TFPyEnvironment(suite_gym.load("CartPole-V1"))

# Create the Network
action_net = actor_distribution_network.ActorDistributionNetwork(
tf_env.observation_spec(), tf_env.action_spec(),
fc_layer_params=[32, 64])

# Create the Agent
tf_agent = reinforce_agent.ReinforceAgent(
tf_env.time_step_spec(),
tf_env.action_spec(),
actor_network=actor_net,
optimizer=AdamOptimizer(learning_rate=learning_rate))

Collect Experience and Train with TF Agents

replay_buffer = TFUniformReplayBuffer()
driver = DynamicEpisodicDriver(
tf_env, agent.collect_policy,
observers=[replay_buffer.add_batck],
num_episodes=1)

for _ in range(num_iterations):
# Get experience
driver.run()
# train the Agent
experience = replay_buffer.gather_all()
agent.train(experience)
replay_buffer.clear()

TF-Agent @ Google IO 2019

TF-Agent @ TF Summit

TF-Agents Github repository

TF-Agents Colabs