Artificial Intelligence: Q-Learning

Endow NPCs and other autonomous agents with the ability to acquire new behavior through reinforcement learning, an algorithmic approach to decision making that mimics the way humans and other animals learn.


In this example, the Q-learning equation is used to solve a “match-to-sample” puzzle in which the NPC learns that it must activate a switch within the level while a light is on in order to receive a “food reward”. Similar puzzles have been used in a wide variety of animal learning experiments that explore instrumental and associative learning abilities. Q-learning is also the foundation for more advanced systems of intelligent behavior, such as those found in the AI Emotions Toolkit.
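For readers unfamiliar with the equation, the standard tabular Q-learning update can be sketched in Python as follows. This is a minimal illustration, not the asset's Blueprint logic: the function name `q_update` and the parameters `alpha` (learning rate) and `gamma` (discount factor), along with their default values, are assumptions made for this example.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q is a 2D list indexed as q[state][action].
    alpha and gamma are illustrative defaults, not values from the asset.
    """
    # Value of the best action available from the state we landed in.
    best_next = max(q[next_state])
    # Target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]
```

Each call nudges one table entry toward the reward just received plus the discounted value of the best follow-up action, which is how the value of an early step (touching the switch) can come to reflect a later reward (the food bowl).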

Here, the agent must use reinforcement learning to predict that an action earns a reward only under a specific circumstance: when the light is on, it must touch the switch and then proceed to a food bowl. The task begins with a training phase in which the NPC travels at random among four locations: three food bowls and one switch, represented by balls and a cone respectively. During the training phase, it learns the value of each of these elements, the order in which they should be visited, and how they are affected by the light that periodically turns on and off. Specifically, if the NPC goes to the cone before proceeding to a specific food bowl while the light is on, it receives a reward.
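The training phase described above can be approximated outside the engine with a short Python simulation. This is a sketch under stated assumptions, not a port of the Blueprint: the state encoding (light status plus current location), the choice of `BOWL_A` as the rewarded bowl, the light period, the learning rate, the discount factor, and the option of "moving" to the current location are all hypothetical simplifications.

```python
import random

SWITCH, BOWL_A, BOWL_B, BOWL_C = range(4)  # one cone (switch) and three bowls
REWARD_BOWL = BOWL_A                       # assumption: which bowl holds the food
N_LOCATIONS = 4

def state_index(light_on, location):
    """Encode (light status, current location) as a row of the 2D Q-table."""
    return int(light_on) * N_LOCATIONS + location

def train(steps=50_000, alpha=0.05, gamma=0.5, light_period=10, seed=0):
    """Random-walk training; returns a Q-table indexed as q[state][destination]."""
    rng = random.Random(seed)
    q = [[0.0] * N_LOCATIONS for _ in range(2 * N_LOCATIONS)]
    light_on, location = False, BOWL_B
    for step in range(1, steps + 1):
        state = state_index(light_on, location)
        action = rng.randrange(N_LOCATIONS)  # training phase: pick a destination at random
        # Reward only for leaving the switch for the food bowl while the light is on.
        rewarded = light_on and location == SWITCH and action == REWARD_BOWL
        reward = 1.0 if rewarded else 0.0
        if step % light_period == 0:         # the light toggles periodically
            light_on = not light_on
        next_state = state_index(light_on, action)
        # Standard tabular Q-learning update.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        location = action
    return q
```

After `q = train()`, the largest entry in the row `q[state_index(True, BOWL_B)]` indicates which destination the agent has learned to prefer when it stands at that bowl with the light on.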

After training, the agent displays intentional behavior by first going to the switch (the cone) and then going directly to the food reward bowl whenever the light is on. Q-learning can be used to produce a wide variety of intentional behaviors, including avoiding enemy players, collecting health points, and many of the other behaviors a human player can exhibit within a game environment.
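Once values have been learned, intentional behavior amounts to always picking the highest-valued destination. The sketch below shows that greedy selection in Python. The table `Q_LIGHT_ON` contains made-up numbers chosen only to illustrate the pattern (they are not measured from the asset), and the names `LOCATIONS` and `greedy_path` are hypothetical.

```python
LOCATIONS = ["switch", "bowl_a", "bowl_b", "bowl_c"]

# Illustrative values only (not taken from the asset), for the light-on case.
# Row = current location, column = next destination.
Q_LIGHT_ON = [
    [0.6, 1.3, 0.6, 0.6],  # at the switch: the reward bowl is the best next stop
    [0.6, 0.3, 0.3, 0.3],  # at bowl_a: heading to the switch is best
    [0.6, 0.3, 0.3, 0.3],  # at bowl_b: heading to the switch is best
    [0.6, 0.3, 0.3, 0.3],  # at bowl_c: heading to the switch is best
]

def greedy_path(q_rows, start, n_moves):
    """Follow the highest-valued destination n_moves times from start."""
    path = [start]
    for _ in range(n_moves):
        row = q_rows[path[-1]]
        path.append(row.index(max(row)))  # greedy choice: largest Q-value
    return [LOCATIONS[i] for i in path]

print(greedy_path(Q_LIGHT_ON, start=2, n_moves=2))
# prints ['bowl_b', 'switch', 'bowl_a']
```

With values shaped like these, the agent starting at any bowl goes to the switch first and then straight to the rewarded bowl, matching the behavior described above.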

Technical Details


· 1 Custom Structure (2-dimensional array)
· 1 AI Behavior Tree
· 1 AI Blackboard
· 1 AI Character Controller
· 1 AI Character Blueprint

Number of Blueprints: 1

Input: None

Network Replicated: No

Supported Development Platforms: Unreal Engine 4.15 and Up

Supported Target Build Platforms: All


Important/Additional Notes:

To disable visualization of the training phase, set all delay nodes after the AI movement code to 0 and change both delay nodes for the light turning off from 10 to 0. This somewhat defeats the purpose of learning whether the light is on or off, but it is necessary if you want to train the agent as quickly as possible.


