alphaghost package

The app package.

Submodules

alphaghost.alphazero module

A basic AlphaZero implementation.

A version of open_spiel.python.algorithms.alpha_zero.alpha_zero modified to use alphaghost.mcts.MCTSBot and alphaghost.model.AlphaGhostModel to support imperfect information games, specifically Phantom Go.

class alphaghost.alphazero.Buffer

Bases: object

A fixed size buffer that keeps the newest values.

__init__(max_size)

Parameters:: max_size (int)

append(val)

Parameters:: val (Any)

extend(batch)

Parameters:: batch (Iterable[Any])

sample(count)

Parameters:: count (int)
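The buffer behaviour described above can be sketched as follows. This is a minimal illustration, not the package's code: `SketchBuffer` is a hypothetical name, and uniform sampling without replacement is an assumption, since the reference does not say how `sample` draws its values.

```python
import random
from typing import Any, Iterable


class SketchBuffer:
    """Hypothetical stand-in for Buffer: keeps only the newest max_size values."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self.data: list[Any] = []

    def append(self, val: Any) -> None:
        self.extend([val])

    def extend(self, batch: Iterable[Any]) -> None:
        self.data.extend(batch)
        # Drop the oldest entries once the buffer overflows.
        overflow = len(self.data) - self.max_size
        if overflow > 0:
            del self.data[:overflow]

    def sample(self, count: int) -> list[Any]:
        # Assumption: uniform sampling without replacement.
        return random.sample(self.data, count)


buf = SketchBuffer(max_size=3)
buf.extend([1, 2, 3, 4, 5])  # only 3, 4 and 5 are kept
```

Trimming inside `extend` lets `append` reuse the same eviction path, so the size limit is enforced in one place.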

class alphaghost.alphazero.Config

Bases: NamedTuple

Configuration for the AlphaGhOst algorithm.

actors: int: Alias for field number 9

board_size: int: Alias for field number 0

checkpoint_freq: int: Alias for field number 8

eval_levels: int: Alias for field number 12

evaluation_window: int: Alias for field number 11

evaluators: int: Alias for field number 10

classmethod from_json(path)

Load a config from a json file.

Parameters:: path (Path)

learning_rate: float: Alias for field number 2

max_simulations: int: Alias for field number 14

max_steps: int: Alias for field number 7

nn_depth: int: Alias for field number 20

nn_width: int: Alias for field number 19

observation_shape: Any: Alias for field number 21

output_size: int: Alias for field number 22

path: str: Alias for field number 1

policy_alpha: float: Alias for field number 15

policy_epsilon: float: Alias for field number 16

quiet: bool: Alias for field number 23

replay_buffer_reuse: int: Alias for field number 6

replay_buffer_size: int: Alias for field number 5

temperature: float: Alias for field number 17

temperature_drop: int: Alias for field number 18

train_batch_size: int: Alias for field number 4

uct_c: float: Alias for field number 13

weight_decay: float: Alias for field number 3
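Because Config is a NamedTuple, each field is reachable both by name and by position, which is what "Alias for field number N" refers to. The sketch below shows the pattern with a hypothetical three-field `MiniConfig`. The JSON layout (one object keyed by field name) is an assumption, not a documented format.

```python
import json
import tempfile
from pathlib import Path
from typing import NamedTuple


class MiniConfig(NamedTuple):
    """Hypothetical three-field subset of Config, for illustration only."""

    board_size: int
    learning_rate: float
    uct_c: float

    @classmethod
    def from_json(cls, path: Path) -> "MiniConfig":
        # Assumption: the file holds a single JSON object keyed by field name.
        with open(path, encoding="utf-8") as f:
            return cls(**json.load(f))


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"
    config_path.write_text(
        json.dumps({"board_size": 9, "learning_rate": 0.001, "uct_c": 2.0})
    )
    config = MiniConfig.from_json(config_path)

# config.board_size and config[0] refer to the same field.
```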

class alphaghost.alphazero.Trajectory

Bases: object

A sequence of observations, actions and policies, and the outcomes.

__init__()

add(information_state, action, policy)

class alphaghost.alphazero.TrajectoryState

Bases: object

A particular point along a trajectory.

__init__(observation, current_player, legals_mask, action, policy, value)

alphaghost.alphazero.actor(*, config, game, logger, queue)

Generate games and return trajectories.

Parameters:

config (Config)
game (pyspiel.Game)

alphaghost.alphazero.alphazero(config=None)

Start all the worker processes for a full alphazero setup.

Parameters:: config (Config | None)

alphaghost.alphazero.evaluator(*, game, config, logger, queue)

Play the latest checkpoint vs standard MCTS.

Parameters:

game (pyspiel.Game)
config (Config)

alphaghost.alphazero.learner(*, game, config, actors, evaluators, broadcast_fn, logger)

Consume the replay buffer and train the network.

Parameters:

game (pyspiel.Game)
config (Config)
actors (list[open_spiel.python.utils.spawn.Process])
evaluators (list[open_spiel.python.utils.spawn.Process])
broadcast_fn (Callable[[Any], None])

alphaghost.alphazero.update_checkpoint(logger, queue, model, az_evaluator)

Read the queue for a checkpoint to load, or an exit signal.

Return type:

bool

Parameters:

model (AlphaGhostModel)
az_evaluator (open_spiel.python.algorithms.alpha_zero.evaluator.AlphaZeroEvaluator)
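One way to read a queue for either a checkpoint path or an exit signal is to drain it without blocking and act only on the newest message. The sketch below uses the standard-library `queue` module. The helper names, the use of an empty string as the exit signal, and the `loaded` list standing in for model loading are all assumptions made for illustration.

```python
import queue
from typing import Optional


def latest_message(q: "queue.Queue[str]") -> Optional[str]:
    """Drain the queue without blocking and return the newest message, if any."""
    latest = None
    while True:
        try:
            latest = q.get_nowait()
        except queue.Empty:
            return latest


def should_continue(q: "queue.Queue[str]", loaded: list) -> bool:
    """Hypothetical helper: record a checkpoint path, or stop on an empty string."""
    path = latest_message(q)
    if path:
        loaded.append(path)  # stand-in for loading the checkpoint into a model
    elif path is not None:
        return False  # assumed exit signal: an empty string
    return True


q: "queue.Queue[str]" = queue.Queue()
loaded: list = []
q.put("checkpoint-1")
q.put("checkpoint-2")
keep_going = should_continue(q, loaded)  # only the newest path is recorded
```

Returning a bool lets a worker loop use the call directly as its continuation condition.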

alphaghost.alphazero.watcher(fn)

Provide a logger and log exceptions.

Parameters:: fn (Callable)
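A decorator of this kind can be sketched with the standard `logging` module. `sketch_watcher` is a hypothetical name. Passing the logger as a `logger` keyword argument matches the worker signatures listed here, but re-raising after logging is an assumption.

```python
import functools
import logging
from typing import Callable


def sketch_watcher(fn: Callable) -> Callable:
    """Hypothetical decorator: inject a logger and log any exception."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        logger = logging.getLogger(fn.__name__)
        try:
            return fn(*args, logger=logger, **kwargs)
        except Exception:
            # Record the traceback, then let the caller see the failure.
            logger.exception("%s failed", fn.__name__)
            raise

    return wrapper


@sketch_watcher
def work(x: int, *, logger: logging.Logger) -> int:
    logger.info("working on %d", x)
    return 10 // x
```

Calling `work(5)` runs normally, while `work(0)` logs the traceback before the `ZeroDivisionError` propagates.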

alphaghost.mcts module

Monte-Carlo Tree Search algorithm for Phantom Go.

A version of open_spiel.python.algorithms.mcts.MCTSBot extended to support imperfect information games, specifically Phantom Go.

class alphaghost.mcts.MCTSBot

Bases: Bot

Bot that uses the Monte-Carlo Tree Search algorithm.

__init__(game, uct_c, max_simulations, evaluator, solve=True, random_state=None, child_selection_fn=open_spiel.python.algorithms.mcts.SearchNode.uct_value, dirichlet_noise=None, verbose=False, dont_return_chance_node=False)

Initialize an MCTS search algorithm in the form of a bot.

Parameters:

game (Game) – A pyspiel.Game to play.
uct_c (float) – The exploration constant for UCT.
max_simulations (int) – How many iterations of MCTS to perform, that is, how many nodes in the search tree are evaluated. Each simulation results in one call to the evaluator. Memory usage should grow linearly with simulations * branching factor, and the value also correlates with tree depth.
evaluator (Evaluator) – An Evaluator object used to evaluate a leaf node.
solve (bool) – Whether to back up solved states.
random_state (RandomState | None) – An optional numpy RandomState to make it deterministic.
child_selection_fn (Callable) – A function to select the child in the descent phase. The default is UCT.
dirichlet_noise (tuple[float, float] | None) – A tuple of (epsilon, alpha) for adding Dirichlet noise to the policy at the root, as described in the AlphaZero paper.
verbose (bool) – Whether to print information about the search tree before returning the action. Useful for confirming the search is working sensibly.
dont_return_chance_node (bool) – If true, do not stop expanding at chance nodes. Enabled for AlphaZero.

Return type:

None
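The roles of uct_c and dirichlet_noise can be illustrated with the textbook formulas. The sketch below is not the bot's internal code: both function names and their argument lists are hypothetical, and the blend (1 - epsilon) * prior + epsilon * noise is the standard AlphaZero form, assumed here to be what the (epsilon, alpha) tuple controls.

```python
import math

import numpy as np


def uct_value(total_reward, explore_count, parent_explore_count, uct_c):
    """UCT score of a child: mean reward plus an exploration bonus."""
    if explore_count == 0:
        return float("inf")  # unvisited children are tried first
    exploit = total_reward / explore_count
    explore = uct_c * math.sqrt(math.log(parent_explore_count) / explore_count)
    return exploit + explore


def add_dirichlet_noise(priors, epsilon, alpha, rng):
    """Blend root priors with Dirichlet noise: (1 - eps) * p + eps * noise."""
    noise = rng.dirichlet([alpha] * len(priors))
    return (1 - epsilon) * np.asarray(priors) + epsilon * noise


rng = np.random.RandomState(0)  # fixed seed keeps the sketch deterministic
noisy = add_dirichlet_noise([0.5, 0.3, 0.2], epsilon=0.25, alpha=1.0, rng=rng)
```

A larger uct_c raises the bonus for rarely visited children, so the search explores more broadly. The noisy priors still sum to 1 because both inputs to the blend are probability distributions.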

mcts_search(state)

Search with the Monte-Carlo Tree Search algorithm.

Return type:: SearchNode
Parameters:: state (pyspiel.State)

restart_at(state)

Return type:: None

step(state)

Return the bot’s action at the given state.

Return type:: int
Parameters:: state (pyspiel.State)

step_with_policy(state)

Return the bot’s policy and action at the given state.

Return type:: tuple[list[tuple[int, float]], int]
Parameters:: state (pyspiel.State)
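The return type pairs a sparse policy, a list of (action, probability) tuples, with the chosen action. A caller that needs a fixed-length vector, for example as a training target, could expand it as sketched below. `policy_to_dense` is a hypothetical helper, and the sample values are made up rather than taken from a real search.

```python
import numpy as np


def policy_to_dense(policy, num_actions):
    """Expand a sparse [(action, prob), ...] policy into a dense vector."""
    dense = np.zeros(num_actions)
    for action, prob in policy:
        dense[action] = prob
    return dense


# Made-up values in the documented shape, not output from a real search.
policy, action = [(0, 0.1), (3, 0.7), (4, 0.2)], 3
dense = policy_to_dense(policy, num_actions=6)
```

Actions missing from the sparse list keep a probability of zero in the dense vector.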

alphaghost.model module

An AlphaZero style model with a policy and value head.

class alphaghost.model.AlphaGhostModel

Bases: Module

__init__(input_shape, output_size, nn_width, nn_depth, learning_rate, weight_decay, path)

Parameters:

input_shape (list[int])
output_size (int)
nn_width (int)
nn_depth (int)
learning_rate (float)
weight_decay (float)
path (str | Path)

forward(x)

classmethod from_checkpoint(path)

Load a model from a checkpoint.

Parameters:: path (Path)

inference(obs, mask)

Run a forward pass through the network.

load_checkpoint(path)

Load model checkpoint.

Return type:: None
Parameters:: path (str | Path)

save_checkpoint(step)

Save model checkpoint.

Return type:: Path
Parameters:: step (int)

update(train_inputs)

Perform a training step.

Return type:: Losses
Parameters:: train_inputs (list)

class alphaghost.model.Losses

Bases: NamedTuple

Losses from a training step.

l2: float: Alias for field number 2

policy: float: Alias for field number 0

property total: float

value: float: Alias for field number 1
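The reference does not say how `total` is computed. A plausible reading, assumed here, is the plain sum of the three components. `SketchLosses` is a hypothetical stand-in for Losses that shows how a NamedTuple can expose such a derived property.

```python
from typing import NamedTuple


class SketchLosses(NamedTuple):
    """Hypothetical stand-in for Losses, with the same field order."""

    policy: float
    value: float
    l2: float

    @property
    def total(self) -> float:
        # Assumption: the total is the unweighted sum of the components.
        return self.policy + self.value + self.l2


losses = SketchLosses(policy=1.5, value=0.25, l2=0.125)
combined = losses.total
```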

class alphaghost.model.TrainInput

Bases: NamedTuple

Inputs for training the Model.

legals_mask: ndarray: Alias for field number 1

observation: ndarray: Alias for field number 0

policy: ndarray: Alias for field number 2

static stack(train_inputs)

Return type:: TrainInput
Parameters:: train_inputs (list)

value: ndarray: Alias for field number 3
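Stacking turns a list of per-sample tuples into one tuple of batched arrays. The sketch below shows one way to do this with `zip` and NumPy. `SketchTrainInput` is a hypothetical stand-in with the same field order, and the array shapes are arbitrary.

```python
from typing import NamedTuple

import numpy as np


class SketchTrainInput(NamedTuple):
    """Hypothetical stand-in for TrainInput, with the same field order."""

    observation: np.ndarray
    legals_mask: np.ndarray
    policy: np.ndarray
    value: np.ndarray

    @staticmethod
    def stack(train_inputs):
        # zip(*...) regroups per-sample tuples into per-field columns.
        return SketchTrainInput(*(np.array(col) for col in zip(*train_inputs)))


samples = [
    SketchTrainInput(np.zeros(4), np.ones(3, dtype=bool), np.full(3, 1 / 3), np.array([1.0])),
    SketchTrainInput(np.ones(4), np.ones(3, dtype=bool), np.full(3, 1 / 3), np.array([-1.0])),
]
batch = SketchTrainInput.stack(samples)  # each field gains a leading batch axis
```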

alphaghost.parsers module

Parsers for game states.

Methods for extracting additional information from pyspiel.State objects beyond the core python API.

alphaghost.parsers.get_board(state, player_id=None)

Return the board positions visible to a player as a matrix.

Return type:

ndarray

Parameters:

state (pyspiel.State)
player_id (int | None)

alphaghost.parsers.get_board_size(state)

Return the board size.

Return type:: int
Parameters:: state (pyspiel.State)

alphaghost.parsers.get_board_svg(state, player_id=None)

Return the board as an SVG string.

Return type:

str

Parameters:

state (pyspiel.State)
player_id (int | None)

alphaghost.parsers.get_previous_move_info_string(state, player_id=None)

Return the previous move information.

Return type:

str

Parameters:

state (pyspiel.State)
player_id (int | None)

alphaghost.parsers.get_stones_count(state)

Return the number of stones for each player.

Return type:: ndarray
Parameters:: state (pyspiel.State)

alphaghost.parsers.get_visible_actions(state, player_id=None)

Return the indices of the visible board positions.

Return type:

list[ndarray]

Parameters:

state (pyspiel.State)
player_id (int | None)

alphaghost.parsers.render_board(state, player_id=None)

Render the board as a vector graphic.

Return type:

Drawing

Parameters:

state (pyspiel.State)
player_id (int | None)

alphaghost.phantom_go module

Module for playing Phantom Go against various bots.

class alphaghost.phantom_go.GoColor

Bases: Enum

Black = 0

White = 1

class alphaghost.phantom_go.PhantomGoGame

Bases: object

__init__(config, black='human', white='ag', ag_model=None, verbose=False)

Parameters:

config (Config)
black (str)
white (str)
ag_model (Path | None)
verbose (bool)

Return type:

None

auto_play(num_games=1)

Automatically play a number of games.

Return type:: None
Parameters:: num_games (int)

auto_step()

Automatically make steps until the turn ends.

Return type:: None

bot_step()

Make a bot step.

Return type:: None

property current_player: int: Return the current player.

restart()

Restart the game.

Return type:: None

step(pos_str)

Make a step.

Return type:: None
Parameters:: pos_str (str)