Search & Planning

Introduction to AI (I2AI)

Andy Weeger

Neu-Ulm University of Applied Sciences

April 4, 2026

Agenda

  • Warm-up 10 min
  • Search problems & modelling 20 min
  • Search algorithms in action 25 min
  • Adversarial search & MCTS 25 min
  • Wrap-up 10 min

Search problems & modelling

Recap: Problem-solving process

Agents in simple environments follow a four-phase process:

  1. Goal formulation: identify the objectives based on the current situation
  2. Problem formulation: decide which states and actions to consider
  3. Search: explore possible action sequences to find a solution path
  4. Execution: carry out the solution in the environment

Formal search problem

A search problem is formally defined by:

  • State space: all possible states the environment can be in
  • Initial state: the state the agent starts in
  • Actions \(\textrm{ACTIONS}(s)\) – available actions when in state \(s\)
  • Transition model \(\textrm{RESULT}(s, a)\) – state resulting from action \(a\) in state \(s\)
  • Goal test: condition determining whether a state is a goal state
  • Action cost function \(\textrm{ACTION-COST}(s, a, s')\) – numeric cost of performing \(a\) from \(s\) to \(s'\)
  • Path: a sequence of actions leading from one state to another
  • Solution: a path from the initial state to a goal state
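The components above can be collected into a small container. A minimal sketch, assuming nothing beyond the standard library; the names (`SearchProblem`, `is_goal`, ...) are our own choice, not a particular library's API:

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable

@dataclass(frozen=True)
class SearchProblem:
    initial: Hashable                                 # initial state
    actions: Callable[[Hashable], Iterable[str]]      # ACTIONS(s)
    result: Callable[[Hashable, str], Hashable]       # RESULT(s, a)
    is_goal: Callable[[Hashable], bool]               # goal test
    cost: Callable[[Hashable, str, Hashable], float]  # ACTION-COST(s, a, s')

# Toy instance: count from 0 to 2 with a single "inc" action.
p = SearchProblem(
    initial=0,
    actions=lambda s: ["inc"] if s < 2 else [],
    result=lambda s, a: s + 1,
    is_goal=lambda s: s == 2,
    cost=lambda s, a, t: 1.0,
)
```

Any search algorithm can then be written once against this interface and reused across problems.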

Recap: modelling

A simplified map of Romania with road distances in miles (Russell & Norvig, 2022)
Figure 1: A partial search tree for finding a route from Arad to Bucharest based on the simplified map of Romania (Russell & Norvig, 2022)

Definition sequence

Problem formulation must follow goal formulation.

Is this statement true or false? Why?

Problem formulation

There are six glass boxes in a row, each with a lock. Each of the first five boxes holds a key unlocking the next box in line; the last box holds a banana. You have the key to the first box and want the banana.

Write out all formal components:

  • State space
  • Initial state
  • Actions
  • Transition model
  • Goal test
  • Action cost function
Time: 10 min

Solution note

  • State space: A state is defined by which boxes are open (or equivalently, which keys you hold). Since each box is either open or closed, there are 2⁶ = 64 possible states in theory, though only 7 are reachable in practice: you can have boxes 1 through k open (for k = 0, 1, …, 6).
  • Initial state: All six boxes are closed. You hold the key to box 1. You do not have the banana.
  • Actions: \(\textrm{ACTIONS}(s) = \{ \textit{unlock and open box } k \}\), where k is the next locked box for which you hold the key. In each reachable state, there is exactly one available action (or zero, if all boxes are open).
  • Transition model: \(\textrm{RESULT}(s, \textit{open box } k)\) = the state where box k is now open. If k < 6, you now hold the key to box k+1 (and no longer need the key to box k). If k = 6, you now hold the banana.
  • Goal test: You have the banana, i.e., box 6 is open.
  • Action cost function: \(\textrm{ACTION-COST}(s, a, s')\) = 1 for each unlock action (uniform cost). Alternatively, 0 if only the final state matters, not the path.
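The formulation above can be checked mechanically. A minimal sketch, assuming (as in the solution note) that a state is simply the number of open boxes k = 0..6; function names are our own:

```python
from collections import deque

def actions(k):
    # The only available action: open the next box, if any remain closed.
    return [f"open box {k + 1}"] if k < 6 else []

def result(k, action):
    return k + 1  # opening a box reveals the key to the next one

def bfs(initial=0, goal=6):
    # Breadth-first search over the (tiny, linear) state space.
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for a in actions(state):
            nxt = result(state, a)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [a]))
    return None
```

Since each state has at most one successor, the search degenerates to following the single chain: the unique solution is the six-step path "open box 1" through "open box 6".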

Search algorithms in action

Recap: uninformed search strategies

Overview of uninformed search strategies

Trace the frontier

Simplified graph (edge costs in parentheses):

Arad -> Sibiu (140) -> Fagaras (99) -> Bucharest (211);
Sibiu -> Rimnicu Vilcea (80) -> Pitesti (97) -> Bucharest (101)

Tasks

  1. Breadth-first search from Arad to Bucharest.
    List the frontier after each expansion. Which path does BFS return?
  2. Uniform-cost search on the same graph.
    List the frontier with cumulative costs after each expansion. Which path does UCS return? Compare with BFS.
  3. (Bonus) Straight-line distance to Bucharest:
    Arad=366, Sibiu=253, Fagaras=176, Rimnicu Vilcea=193, Pitesti=100.
    Trace A*: how does the heuristic change the expansion order?
Time: 15 min
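To verify a hand trace, the two searches can be run on the simplified graph. A sketch under the assumption that the graph is exactly the six edges listed above (adjacency dict and function names are our own):

```python
import heapq
from collections import deque

GRAPH = {
    "Arad": {"Sibiu": 140},
    "Sibiu": {"Fagaras": 99, "Rimnicu Vilcea": 80},
    "Fagaras": {"Bucharest": 211},
    "Rimnicu Vilcea": {"Pitesti": 97},
    "Pitesti": {"Bucharest": 101},
    "Bucharest": {},
}

def bfs(start, goal):
    # Expand in FIFO order; returns the path with the fewest edges.
    frontier = deque([[start]])
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in GRAPH[path[-1]]:
            frontier.append(path + [nxt])

def ucs(start, goal):
    # Expand cheapest-first; returns (total cost, path) with the lowest cost.
    frontier = [(0, [start])]
    while frontier:
        cost, path = heapq.heappop(frontier)
        if path[-1] == goal:
            return cost, path
        for nxt, step in GRAPH[path[-1]].items():
            heapq.heappush(frontier, (cost + step, path + [nxt]))
```

BFS returns the three-edge Fagaras route (total cost 450); UCS returns Arad → Sibiu → Rimnicu Vilcea → Pitesti → Bucharest at cost 418, illustrating that fewest hops and lowest cost can disagree.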

Adversarial search & MCTS

Recap: MinMax

An adversarial game tree

Recap: Alpha-beta pruning

Monte Carlo Tree Search (MCTS)

Instead of exhaustive traversal, MCTS uses repeated simulation:

  1. Selection: follow tree policy (e.g., UCT) from root to a leaf
  2. Expansion: add a new child node at the selected leaf
  3. Simulation: play out the game randomly to a terminal state
  4. Backpropagation: update win/playout counts up to the root
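The four phases can be sketched on a toy game: players alternately take 1 or 2 tokens from a pile, and whoever takes the last token wins. All names (`Node`, `mcts`, `rollout`, ...) are our own illustration, not a library API:

```python
import math
import random

def legal_actions(state):
    tokens, _ = state
    return [a for a in (1, 2) if a <= tokens]

def apply_action(state, action):
    tokens, player = state
    return (tokens - action, 1 - player)

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state = state                   # (tokens_left, player_to_move)
        self.parent, self.action = parent, action
        self.children = []
        self.untried = legal_actions(state)  # actions not yet expanded
        self.wins = 0.0                      # from the view of the player who moved here
        self.visits = 0

def uct_select(node, c=1.4):
    # Tree policy: child maximizing the UCT score.
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(state):
    # Default policy: random play to a terminal state; return the winner.
    while state[0] > 0:
        state = apply_action(state, random.choice(legal_actions(state)))
    return 1 - state[1]                      # whoever took the last token won

def mcts(root_state, iterations=2000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCT while fully expanded
        while not node.untried and node.children:
            node = uct_select(node)
        # 2. Expansion: add one new child below the selected node
        if node.untried:
            a = node.untried.pop()
            node.children.append(Node(apply_action(node.state, a), node, a))
            node = node.children[-1]
        # 3. Simulation: random playout from the new node
        winner = rollout(node.state)
        # 4. Backpropagation: update counts up to the root
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.state[1]:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).action
```

From a pile of 4 tokens the winning move is to take 1 (leaving the opponent a multiple of 3); with a few thousand playouts the visit counts concentrate on that action, with no hand-crafted evaluation function involved.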

MCTS: expansion and simulation

Pruning

Your task:

  1. Draw a game search tree with 3 levels of depth (MAX at root, MIN at level 1, terminal values at level 2). Assign arbitrary terminal values.
  2. Apply alpha-beta pruning step by step, tracking \(\alpha\) and \(\beta\) at each node.
  3. Mark the pruned branches and explain why each is skipped (i.e., state the \(\alpha \geq \beta\) condition).
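A pruning trace can be checked against code. A minimal sketch, assuming the tree is given as nested lists (MAX at the root, MIN below, terminal values at the leaves); the `pruned` list of skipped leaves is our own addition for inspection:

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, pruned=None):
    if pruned is None:
        pruned = []
    if not isinstance(node, list):       # terminal value
        return node, pruned
    value = -math.inf if maximizing else math.inf
    for i, child in enumerate(node):
        v, _ = alphabeta(child, alpha, beta, not maximizing, pruned)
        if maximizing:
            value = max(value, v)
            alpha = max(alpha, value)
        else:
            value = min(value, v)
            beta = min(beta, value)
        if alpha >= beta:                # the prune condition from step 3
            pruned.extend(node[i + 1:])  # remaining siblings are skipped
            break
    return value, pruned
```

On the classic example [[3, 12, 8], [2, 4, 6], [14, 5, 2]], the root value is 3 and the leaves 4 and 6 are pruned: after seeing 2 in the middle MIN node, beta drops to 2 while alpha is already 3, so alpha >= beta holds and the rest of that node is skipped.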

Can MinMax be beaten?

Can the MinMax algorithm be beaten by an opponent playing suboptimally?

Why or why not?

  • Discuss and reach a verdict
  • Sketch a small game tree where MAX wins against a suboptimal MIN
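As a sanity check for the verdict, a minimal sketch on a toy two-ply tree of our own choosing: the minimax value is a lower bound on MAX's outcome no matter which move MIN actually plays.

```python
TREE = [[3, 5], [2, 9]]   # MAX at root, MIN at level 1, terminals at level 2

def minimax_value(node, maximizing=True):
    if not isinstance(node, list):
        return node
    vals = [minimax_value(child, not maximizing) for child in node]
    return max(vals) if maximizing else min(vals)

guaranteed = minimax_value(TREE)   # MAX moves left: min(3, 5) = 3
# Whatever leaf MIN picks in that subtree, MAX gets at least the minimax
# value -- a suboptimal MIN (choosing 5 instead of 3) only helps MAX.
outcomes = [TREE[0][choice] for choice in (0, 1)]
```

Here an optimal MIN yields 3 and a suboptimal MIN yields 5, so every outcome satisfies outcome >= guaranteed.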

Wrap-up

Key Takeaways

Search

  • Search operates over models; planning is “in simulation”
  • Search is only as good as the models it uses
  • BFS finds shortest paths (fewest hops); UCS finds cheapest paths (lowest cost); A* adds a heuristic to guide expansion toward the goal

Adversarial search

  • MinMax assumes optimal adversarial play; against a suboptimal opponent, MAX is still guaranteed at least the minimax value
  • Alpha-Beta Pruning achieves the same result as MinMax while evaluating fewer nodes (\(\alpha \geq \beta\) prune)
  • MCTS estimates state value via simulation rather than exhaustive search – no evaluation function needed

xkcd

xkcd 1263: reassuring

Q&A

Literature

Russell, S., & Norvig, P. (2022). Artificial intelligence: A modern approach. Pearson Education.