alphaghost package

The app package.

Submodules

alphaghost.alphazero module

A basic AlphaZero implementation.

A version of open_spiel.python.algorithms.alpha_zero.alpha_zero modified to use alphaghost.mcts.MCTSBot and alphaghost.model.AlphaGhostModel to support imperfect information games, specifically Phantom Go.

class alphaghost.alphazero.Buffer

Bases: object

A fixed size buffer that keeps the newest values.

__init__(max_size)
Parameters:

max_size (int)

append(val)
Parameters:

val (Any)

extend(batch)
Parameters:

batch (Iterable[Any])

sample(count)
Parameters:

count (int)
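
The behaviour documented for Buffer (a bounded container that drops its oldest entries) can be sketched as follows. This is a minimal illustration and not the package's implementation: the name SketchBuffer, the internal list and the choice of uniform sampling without replacement are all assumptions.

```python
import random
from typing import Any, Iterable


class SketchBuffer:
    """Hypothetical stand-in for a fixed-size buffer that keeps the newest values."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self.data: list[Any] = []

    def __len__(self) -> int:
        return len(self.data)

    def append(self, val: Any) -> None:
        # Appending one value is extending by a batch of one.
        self.extend([val])

    def extend(self, batch: Iterable[Any]) -> None:
        self.data.extend(batch)
        # Drop the oldest entries so that at most max_size values remain.
        excess = len(self.data) - self.max_size
        if excess > 0:
            del self.data[:excess]

    def sample(self, count: int) -> list[Any]:
        # Assumption: uniform sampling without replacement.
        return random.sample(self.data, count)
```

With max_size=3, extending by five values leaves only the last three in the buffer, and sample(count) requires count to be no larger than the current length.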

class alphaghost.alphazero.Config

Bases: NamedTuple

Configuration for the AlphaGhOst algorithm.

actors: int

Alias for field number 9

board_size: int

Alias for field number 0

checkpoint_freq: int

Alias for field number 8

eval_levels: int

Alias for field number 12

evaluation_window: int

Alias for field number 11

evaluators: int

Alias for field number 10

classmethod from_json(path)

Load a config from a JSON file.

Parameters:

path (Path)

learning_rate: float

Alias for field number 2

max_simulations: int

Alias for field number 14

max_steps: int

Alias for field number 7

nn_depth: int

Alias for field number 20

nn_width: int

Alias for field number 19

observation_shape: Any

Alias for field number 21

output_size: int

Alias for field number 22

path: str

Alias for field number 1

policy_alpha: float

Alias for field number 15

policy_epsilon: float

Alias for field number 16

quiet: bool

Alias for field number 23

replay_buffer_reuse: int

Alias for field number 6

replay_buffer_size: int

Alias for field number 5

temperature: float

Alias for field number 17

temperature_drop: int

Alias for field number 18

train_batch_size: int

Alias for field number 4

uct_c: float

Alias for field number 13

weight_decay: float

Alias for field number 3
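
The "Alias for field number N" entries above mean that each attribute is also reachable by position, as with any NamedTuple. The sketch below shows this on a three-field subset and gives one plausible shape for from_json. It is an assumption-laden illustration: SketchConfig, the JSON layout (one object whose keys match the field names) and the example values are hypothetical, not taken from the package.

```python
import json
import tempfile
from pathlib import Path
from typing import NamedTuple


class SketchConfig(NamedTuple):
    """Hypothetical three-field subset of the documented Config."""

    board_size: int        # field number 0
    path: str              # field number 1
    learning_rate: float   # field number 2

    @classmethod
    def from_json(cls, path: Path) -> "SketchConfig":
        # Assumption: the file holds a single JSON object keyed by field name.
        with open(path, encoding="utf-8") as f:
            return cls(**json.load(f))


# Illustrative values only; they are not defaults of the real Config.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    cfg_path.write_text(
        json.dumps({"board_size": 9, "path": "runs/demo", "learning_rate": 0.001}),
        encoding="utf-8",
    )
    cfg = SketchConfig.from_json(cfg_path)

# Named and positional access refer to the same field.
same_field = cfg.board_size == cfg[0]
```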

class alphaghost.alphazero.Trajectory

Bases: object

A sequence of observations, actions and policies, and the outcomes.

__init__()

add(information_state, action, policy)

class alphaghost.alphazero.TrajectoryState

Bases: object

A particular point along a trajectory.

__init__(observation, current_player, legals_mask, action, policy, value)

alphaghost.alphazero.actor(*, config, game, logger, queue)

Generate games and return trajectories.

Parameters:
  • config (Config)

  • game (pyspiel.Game)

alphaghost.alphazero.alphazero(config=None)

Start all the worker processes for a full AlphaZero setup.

Parameters:

config (Config | None)

alphaghost.alphazero.evaluator(*, game, config, logger, queue)

Play the latest checkpoint vs standard MCTS.

Parameters:
  • game (pyspiel.Game)

  • config (Config)

alphaghost.alphazero.learner(*, game, config, actors, evaluators, broadcast_fn, logger)

Consume the replay buffer and train the network.

Parameters:
  • game (pyspiel.Game)

  • config (Config)

  • actors (list[open_spiel.python.utils.spawn.Process])

  • evaluators (list[open_spiel.python.utils.spawn.Process])

  • broadcast_fn (Callable[[Any], None])

alphaghost.alphazero.update_checkpoint(logger, queue, model, az_evaluator)

Read the queue for a checkpoint to load, or an exit signal.

Return type:

bool

Parameters:
  • model (AlphaGhostModel)

  • az_evaluator (open_spiel.python.algorithms.alpha_zero.evaluator.AlphaZeroEvaluator)
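
A queue that carries either a checkpoint to load or an exit signal can be polled as in the sketch below. It is a simplified illustration, not the package's code: the function names, the load_fn callback (standing in for the model and evaluator arguments) and the convention that an empty string means "exit" are assumptions.

```python
import queue
from typing import Callable, Optional


def drain_latest(q: queue.Queue) -> Optional[str]:
    """Return the most recent message in the queue, or None if it is empty."""
    latest = None
    while True:
        try:
            latest = q.get_nowait()  # Older messages are overwritten.
        except queue.Empty:
            return latest


def sketch_update_checkpoint(q: queue.Queue, load_fn: Callable[[str], None]) -> bool:
    """Return False when an exit signal was received, True otherwise."""
    path = drain_latest(q)
    if path:
        # A non-empty path: load that checkpoint.
        load_fn(path)
    elif path is not None:
        # Assumption: an empty string is the exit signal.
        return False
    return True
```

Draining the queue before acting means a slow worker skips intermediate checkpoints and loads only the newest one.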

alphaghost.alphazero.watcher(fn)

Provide the wrapped function with a logger and log any exceptions it raises.

Parameters:

fn (Callable)
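
A decorator of this kind can be sketched with the standard logging module. This is a hedged illustration only: sketch_watcher, the keyword-only calling convention and the use of logging.getLogger are assumptions, and the real watcher may build its logger differently (for example from the config).

```python
import functools
import logging


def sketch_watcher(fn):
    """Hypothetical decorator: inject a logger and log exceptions before re-raising."""

    @functools.wraps(fn)
    def wrapper(**kwargs):
        logger = logging.getLogger(fn.__name__)
        logger.info("%s started", fn.__name__)
        try:
            # The wrapped function receives the logger as a keyword argument.
            return fn(logger=logger, **kwargs)
        except Exception:
            # Record the traceback, then let the caller see the error.
            logger.exception("%s failed", fn.__name__)
            raise
        finally:
            logger.info("%s exiting", fn.__name__)

    return wrapper


@sketch_watcher
def worker(*, logger, value):
    logger.info("working on %s", value)
    return value * 2


@sketch_watcher
def broken(*, logger):
    raise ValueError("boom")
```

Calling worker(value=21) runs the function with an injected logger, while broken() logs the traceback and still raises ValueError.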

alphaghost.mcts module

Monte-Carlo Tree Search algorithm for Phantom Go.

A version of open_spiel.python.algorithms.mcts.MCTSBot extended to support imperfect information games, specifically Phantom Go.

class alphaghost.mcts.MCTSBot

Bases: Bot

Bot that uses the Monte-Carlo Tree Search algorithm.

__init__(game, uct_c, max_simulations, evaluator, solve=True, random_state=None, child_selection_fn=open_spiel.python.algorithms.mcts.SearchNode.uct_value, dirichlet_noise=None, verbose=False, dont_return_chance_node=False)

Initialize an MCTS search algorithm in the form of a bot.

Parameters:
  • game (Game) – A pyspiel.Game to play.

  • uct_c (float) – The exploration constant for UCT.

  • max_simulations (int) – How many iterations of MCTS to perform, i.e. how many nodes in the search tree are evaluated. Each simulation results in one call to the evaluator. Memory usage should grow linearly with simulations * branching factor, and the count also correlates with tree depth.

  • evaluator (Evaluator) – An Evaluator object used to evaluate a leaf node.

  • solve (bool) – Whether to back up solved states.

  • random_state (RandomState | None) – An optional numpy RandomState to make the search deterministic.

  • child_selection_fn (Callable) – A function to select the child in the descent phase. The default is UCT.

  • dirichlet_noise (tuple[float, float] | None) – A tuple of (epsilon, alpha) for adding Dirichlet noise to the policy at the root. This is from the AlphaZero paper.

  • verbose (bool) – Whether to print information about the search tree before returning the action. Useful for confirming the search is working sensibly.

  • dont_return_chance_node (bool) – If true, do not stop expanding at chance nodes. Enabled for AlphaZero.

Return type:

None
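
To make the role of uct_c concrete, the sketch below computes the standard UCT (UCB1 applied to trees) value of a child node. It does not reproduce the library's SearchNode.uct_value; the function name and argument names are assumptions chosen for illustration.

```python
import math


def sketch_uct_value(total_reward: float, explore_count: int,
                     parent_explore_count: int, uct_c: float) -> float:
    """UCB1-style value of a child node during the descent phase."""
    if explore_count == 0:
        # Unvisited children are tried first.
        return float("inf")
    # Average reward observed so far through this child.
    exploitation = total_reward / explore_count
    # Bonus that shrinks as the child is visited more often.
    exploration = uct_c * math.sqrt(math.log(parent_explore_count) / explore_count)
    return exploitation + exploration
```

A larger uct_c increases the exploration bonus, so rarely visited children are selected more often; a smaller value makes the search concentrate on children with the highest average reward.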

Search with Monte-Carlo Tree Search algorithm.

Return type:

SearchNode

Parameters:

state (pyspiel.State)

restart_at(state)
Return type:

None

step(state)

Return the bot’s action at the given state.

Return type:

int

Parameters:

state (pyspiel.State)

step_with_policy(state)

Return the bot’s policy and action at the given state.

Return type:

tuple[list[tuple[int, float]], int]

Parameters:

state (pyspiel.State)
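
The return type indicates a sparse policy, given as (action, probability) pairs, together with the chosen action. Code that trains on such a policy usually needs a dense vector instead. The helper below is a minimal sketch of that conversion; policy_to_dense is a hypothetical name, and it assumes actions are integer indices in the range 0 to num_actions - 1.

```python
def policy_to_dense(policy: list[tuple[int, float]], num_actions: int) -> list[float]:
    """Expand (action, probability) pairs into a dense probability vector."""
    dense = [0.0] * num_actions
    for action, prob in policy:
        # Actions missing from the sparse policy keep probability 0.
        dense[action] = prob
    return dense


# Illustrative input shaped like the documented return value.
example_policy = [(0, 0.25), (3, 0.75)]
example_dense = policy_to_dense(example_policy, num_actions=5)
```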

alphaghost.model module

An AlphaZero style model with a policy and value head.

class alphaghost.model.AlphaGhostModel

Bases: Module

__init__(input_shape, output_size, nn_width, nn_depth, learning_rate, weight_decay, path)

forward(x)

classmethod from_checkpoint(path)

Load a model from a checkpoint.

Parameters:

path (Path)

inference(obs, mask)

Run a forward pass through the network.

load_checkpoint(path)

Load model checkpoint.

Return type:

None

Parameters:

path (str | Path)

save_checkpoint(step)

Save model checkpoint.

Return type:

Path

Parameters:

step (int)

update(train_inputs)

Perform a training step.

Return type:

Losses

Parameters:

train_inputs (list)

class alphaghost.model.Losses

Bases: NamedTuple

Losses from a training step.

l2: float

Alias for field number 2

policy: float

Alias for field number 0

property total: float

value: float

Alias for field number 1
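
The total property is listed without a description. One plausible reading, shown here purely as an assumption, is that it is the plain sum of the three components. SketchLosses is a hypothetical stand-in that keeps the documented field order.

```python
from typing import NamedTuple


class SketchLosses(NamedTuple):
    """Hypothetical stand-in for the documented Losses tuple."""

    policy: float  # field number 0
    value: float   # field number 1
    l2: float      # field number 2

    @property
    def total(self) -> float:
        # Assumption: the total is the unweighted sum of the components.
        return self.policy + self.value + self.l2


example_losses = SketchLosses(policy=0.5, value=0.25, l2=0.125)
```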

class alphaghost.model.TrainInput

Bases: NamedTuple

Inputs for training the Model.

legals_mask: ndarray

Alias for field number 1

observation: ndarray

Alias for field number 0

policy: ndarray

Alias for field number 2

static stack(train_inputs)
Return type:

TrainInput

Parameters:

train_inputs (list)

value: ndarray

Alias for field number 3
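
Stacking turns a list of per-position records into a single record with one batch per field. The sketch below shows that transposition with plain Python lists to stay dependency-free; the documented fields are ndarrays, so the real method presumably produces arrays instead. SketchTrainInput and its internals are assumptions.

```python
from typing import Any, NamedTuple, Sequence


class SketchTrainInput(NamedTuple):
    """Hypothetical stand-in for the documented TrainInput tuple."""

    observation: Any  # field number 0
    legals_mask: Any  # field number 1
    policy: Any       # field number 2
    value: Any        # field number 3

    @staticmethod
    def stack(train_inputs: Sequence["SketchTrainInput"]) -> "SketchTrainInput":
        # zip(*records) transposes a list of records into one tuple per field.
        observation, legals_mask, policy, value = zip(*train_inputs)
        return SketchTrainInput(
            list(observation), list(legals_mask), list(policy), list(value)
        )


# Two illustrative samples with two-element observations.
samples = [
    SketchTrainInput([0.0, 1.0], [True, False], [1.0, 0.0], 1.0),
    SketchTrainInput([1.0, 0.0], [True, True], [0.5, 0.5], -1.0),
]
batch = SketchTrainInput.stack(samples)
```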

alphaghost.parsers module

Parsers for game states.

Methods for extracting additional information from pyspiel.State objects beyond the core python API.

alphaghost.parsers.get_board(state, player_id=None)

Return the board positions visible to a player as a matrix.

Return type:

ndarray

Parameters:
  • state (pyspiel.State)

  • player_id (int | None)

alphaghost.parsers.get_board_size(state)

Return the board size.

Return type:

int

Parameters:

state (pyspiel.State)

alphaghost.parsers.get_board_svg(state, player_id=None)

Return the board as an SVG string.

Return type:

str

Parameters:
  • state (pyspiel.State)

  • player_id (int | None)

alphaghost.parsers.get_previous_move_info_string(state, player_id=None)

Return the previous move information.

Return type:

str

Parameters:
  • state (pyspiel.State)

  • player_id (int | None)

alphaghost.parsers.get_stones_count(state)

Return the number of stones for each player.

Return type:

ndarray

Parameters:

state (pyspiel.State)

alphaghost.parsers.get_visible_actions(state, player_id=None)

Return the indices of the visible board positions.

Return type:

list[ndarray]

Parameters:
  • state (pyspiel.State)

  • player_id (int | None)

alphaghost.parsers.render_board(state, player_id=None)

Render the board as a vector graphic.

Return type:

Drawing

Parameters:
  • state (pyspiel.State)

  • player_id (int | None)

alphaghost.phantom_go module

Module for playing Phantom Go against various bots.

class alphaghost.phantom_go.GoColor

Bases: Enum

Black = 0

White = 1

class alphaghost.phantom_go.PhantomGoGame

Bases: object

__init__(config, black='human', white='ag', ag_model=None, verbose=False)

Return type:

None

auto_play(num_games=1)

Automatically play a number of games.

Return type:

None

Parameters:

num_games (int)

auto_step()

Automatically make steps until the turn ends.

Return type:

None

bot_step()

Make a bot step.

Return type:

None

property current_player: int

Return the current player.

restart()

Restart the game.

Return type:

None

step(pos_str)

Make a step.

Return type:

None

Parameters:

pos_str (str)