GWD


GWD is my Polish draughts program that I have developed together with Klaas Bor, author of Turbo Dambase. Recently I decided to implement parallel alpha-beta search in GWD. Parallelising alpha-beta is not easy as the serial algorithm is already quite efficient. There are two main issues when distributing moves among threads: the time spent searching moves can vary considerably (so one thread may spend all it's time on one move, while another finishes the remaining moves in less time) and if one thread finds a beta-cutoff the results of the other threads become irrelevant. Also, when you distribute moves you do not know if a thread is available or when it becomes available.
The approach I have taken to solve these issues is to use a publish/subscribe mechanism. When a thread wants to distribute moves to other threads it publishes a workload consisting of the moves, whether a move has been searched, the best move, the best score, alpha and beta. The move status, the best move and the best score are shared between threads and exclusive access is controlled with semaphores. Threads continuously loop over all workloads and check whether another thread is already working on it. If not it allocates the workload to itself and starts working on it. It asks the workload for the next move that has not been searched yet. It evaluates the move and updates the move status and if necessary also updates the best move and best score in the workload. Each workload is assigned a unique id when it is published. When a thread detects a beta-cutoff it flags all remaining moves as searched. When a thread asks for the next move that has not been searched yet or wants to update the best move and the best score it first checks if the id has changed. If so the thread knows that a beta-cutoff has occurred and will stop searching that node.
Using this publish/subscribe mechanism leads to quite efficient parallelisation. If a thread wants to publish a workload it can of course check whether all other workloads are currently assigned to threads and then decide not to publish it, but another thread could become available shortly wasting a parallelisation opportunity. Currently I allow twice as many workloads as threads so threads can publish workloads even if no other threads are currently available. I assign one thread to one workload, but I can easily assign multiple threads to a workload. You can for example imagine a scheduling policy where a thread first checks if a workload has not been assigned to a thread yet, but if all workloads have been assigned it allocates itself to an already assigned workload.
The only synchronisations that I have are the waits for the semaphores to update the move status, the best move and the best score and the node that publishes a workload has to wait for the other thread (if any) to finish it's search of the move if a beta-cutoff did not occur.
The following describes the idea in (pseudo) Free Pascal code:
  function alpha_beta(my_alpha, my_beta): integer;
  begin
    best_move := 0;
    best_score := -SCORE_INFINITY;

    workload_id := 0;
    imove := 0;
    while true do
    begin
      Inc(imove);
      if (workload_id = 0) and (imove = 2) then
      begin
        workload_id := publish_workload(my_alpha, my_beta, best_move, best_score,
          unique_id);
      end;

      if (workload_id <> 0) then
      begin
        Assert(imove > 1);
        Assert(workloads[workload_id].choose_next_move(unique_id, imove) = true);
      end;

      if imove > nmoves then Break;

      temp_alpha := my_alpha;
      if best_score > temp_alpha then temp_alpha := best_score;

      if imove = 1 then
      begin
        do_move(imove);
        temp_score := -alpha_beta(-my_beta, -temp_alpha);
        undo_move(imove);
      end
      else
      begin
        do_move(imove);
        temp_score := -alpha_beta(-temp_alpha - 1, -temp_alpha);
        if (temp_score > temp_alpha) and (temp_score < my_beta) then
        begin
          temp_score := -alpha_beta(-my_beta, -temp_alpha);
        end;
        undo_move(imove);
      end;
      if temp_score > best_score then
      begin
        best_move := imove;
        best_score := temp_score;
      end;
      if workload_id <> 0 then
      begin
        Assert(workloads[workload_id].update_best_move_and_score(unique_id, best_move,
          best_score) = true);
      end;
      if best_score >= my_beta then Break;
    end;

    Assert(best_move <> 0);
    Assert(best_score <> -SCORE_INFINITY);

    if workload_id <> 0 then
    begin
      Assert(workloads[workload_id].wait_moves_done(unique_id) = true);
      Assert(workloads[workload_id].return_best_move_and_score(unique_id,
          best_move, best_score)= true);
      Assert(workloads[workload_id].unpublish(unique_id) = true);
    end;
  end;

  //Chooses the next move in the workload that has not been searched yet
  //  or >nmoves  if all moves have been searched.
  //Returns false if the unique_id of the workload has changed
  function workload_class.choose_next_move(unique_id; var imove): boolean;

  //Updates best move and best score in workload if the score has improved or
  //  returns the current best score if not.
  //Flags all moves in workload as searched if best_score >= my_beta.
  //Returns false if the unique_id of the workload has changed
  function workload_class.update_best_move_and_score(unique_id; best_move;
    var best_score): boolean;

  //Returns current best move and score in the workload.
  //Returns false if the unique_id of the workload has changed.
  function workload_class.return_best_move_and_score(unique_id;
    var best_move; var best_score): boolean;

  //Waits until all moves in the workload have been searched.
  //Returns false if the unique_id of the workload has changed.
  function workload_class.wait_moves_done(unique_id): boolean;

  //Invalidates the unique_id of the workload
  function workload_class.unpublish(unique_id): boolean;

  //Creates a workload and returns the unique_id assigned to the workload.
  //Returns 0 if all workload slots are occupied or allocated to threads.
  function publish_workload(my_alpha; my_beta; best_move; best_score;
    var unique_id): integer;