A YouTube video by SethBling — researched and verified by Depth
A program called MarI/O, written by SethBling in Lua using the NEAT algorithm, learned to beat Super Mario World's Donut Plains 1 level from zero knowledge in 34 generations over 24 hours — demonstrating how neuroevolution can grow neural network topology and weights simultaneously, mirroring biological evolution, to solve real-time control tasks without any hand-designed architecture.
MarI/O is a Lua script written by SethBling (YouTube creator) that implements the NEAT algorithm to autonomously learn to play Super Mario World on the SNES. The original script (~1,200 lines) is at https://pastebin.com/ZZmSNaHX and runs inside the BizHawk emulator (Windows-only for Lua scripting). The video was published in 2015 and remains one of the most-watched demonstrations of neuroevolution.
NEAT (NeuroEvolution of Augmenting Topologies) is a genetic algorithm for evolving artificial neural networks, developed by Kenneth O. Stanley and Risto Miikkulainen at the University of Texas at Austin, published in Evolutionary Computation (2002, Vol. 10, No. 2, pp. 99–127). DOI: 10.1162/106365602320169811. The full paper PDF is freely available at: https://nn.cs.utexas.edu/downloads/papers/stanley.ec02.pdf. Stanley's official NEAT page with all papers and implementations: https://www.cs.ucf.edu/~kstanley/neat.html.
The canonical Python implementation is neat-python by CodeReclaimers: https://github.com/CodeReclaimers/neat-python — pure Python, no dependencies beyond the standard library, licensed under 3-clause BSD, supports Python 3.8–3.14 and PyPy3.
An improved fork of MarI/O with bug fixes and enhanced fitness functions: https://github.com/mam91/neat-genetic-mario.
The three pillars of NEAT (each is necessary; ablation studies show removing any one degrades performance severely):
Historical Markings (Innovation Numbers): Every new gene (connection or node) added by mutation receives a globally incrementing innovation number. This acts as a chronological tag, allowing NEAT to align genes from two differently-structured parent networks during crossover without expensive topological analysis — solving the "competing conventions" problem that plagued earlier topology-evolving algorithms.
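That alignment can be sketched in a few lines of Python (an illustrative toy, not neat-python API: genomes are reduced to dicts keyed by innovation number, and `crossover` is a hypothetical helper):

```python
import random

def crossover(parent_a, parent_b, fitness_a, fitness_b):
    """Align two genomes by innovation number (historical markings).

    Each genome is a dict: innovation number -> gene (here, just a weight).
    Matching genes are inherited randomly from either parent; disjoint and
    excess genes are taken from the fitter parent, as in the NEAT paper.
    """
    fitter, other = ((parent_a, parent_b) if fitness_a >= fitness_b
                     else (parent_b, parent_a))
    child = {}
    for innov, gene in fitter.items():
        if innov in other:
            child[innov] = random.choice((gene, other[innov]))  # matching gene
        else:
            child[innov] = gene  # disjoint/excess gene, fitter parent only
    return child
```

Because the keys are globally assigned innovation numbers, two parents of different sizes line up gene-by-gene in linear time, with no graph matching.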
Speciation: The population is divided into species based on topological similarity (measured via a compatibility distance formula using excess genes, disjoint genes, and average weight differences). Individuals compete primarily within their own species, not against the whole population. This protects new structural innovations — which initially reduce fitness — giving them time to optimize before facing elimination. Without speciation, random-starting NEAT was 7× slower and failed to find a solution within 1,000 generations 5% of the time.
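The paper's compatibility distance can be sketched as follows (illustrative only; genomes again reduced to innovation-number-to-weight dicts, with the paper's suggested coefficients c1 = c2 = 1.0, c3 = 0.4 as defaults):

```python
def compatibility(genome_a, genome_b, c1=1.0, c2=1.0, c3=0.4):
    """NEAT compatibility distance: c1*E/N + c2*D/N + c3*W_bar, where E is
    the excess gene count, D the disjoint gene count, N the larger genome
    size, and W_bar the mean weight difference of matching genes."""
    innovs_a, innovs_b = set(genome_a), set(genome_b)
    cutoff = min(max(innovs_a), max(innovs_b))
    matching = innovs_a & innovs_b
    non_matching = innovs_a ^ innovs_b
    excess = sum(1 for i in non_matching if i > cutoff)  # beyond the other's range
    disjoint = len(non_matching) - excess                # inside the other's range
    n = max(len(genome_a), len(genome_b))
    w_bar = (sum(abs(genome_a[i] - genome_b[i]) for i in matching) / len(matching)
             if matching else 0.0)
    return c1 * excess / n + c2 * disjoint / n + c3 * w_bar
```

Species membership then falls out of a single threshold on this value (compatibility_threshold in neat-python); the paper additionally sets N = 1 for small genomes.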
Incremental Growth from Minimal Structure: NEAT starts every run with networks containing only input and output nodes — no hidden layers. Structure is added only when mutations insert new nodes (by splitting an existing connection) or new connections. This keeps the search space minimal and prevents bloat.
Two mutation types:
- Add connection: Links two previously unconnected nodes; assigned the next innovation number.
- Add node: Splits an existing connection, disabling it and inserting a new node with two new connections. The new node initially acts as an identity function, minimizing disruption.
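The add-node mutation can be sketched on the same toy genome encoding (illustrative: `(src, dst)` keys and a module-level innovation counter, nothing from the actual MarI/O script):

```python
import itertools

# Global, ever-incrementing counter; innovation 1 is assumed taken by the
# genome's initial connection.
innovation_counter = itertools.count(2)

def add_node(genome, connection_key, new_node_id):
    """Split an existing connection: disable it, insert a node, and add two
    new connections.  The incoming connection gets weight 1.0 and the
    outgoing one inherits the old weight, so the new node starts out as an
    identity and barely perturbs the network's behavior."""
    old = genome[connection_key]
    old['enabled'] = False
    src, dst = connection_key
    genome[(src, new_node_id)] = {'weight': 1.0, 'enabled': True,
                                  'innov': next(innovation_counter)}
    genome[(new_node_id, dst)] = {'weight': old['weight'], 'enabled': True,
                                  'innov': next(innovation_counter)}
    return genome
```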
MarI/O specifics:
- Input grid: BoxRadius = 6, so InputSize = (6×2+1)² = 169 tiles + 1 bias = 170 inputs.
- Outputs: 8 buttons (A, B, X, Y, Up, Down, Left, Right) for SMW; 6 for SMB.
- Population: 300 genomes per generation (visible in source code).
- Fitness: rightward X-position + time bonus.
- Save state: DP1.state (Donut Plains 1 start) must be placed in both the Lua folder and BizHawk root.
- The original save/load function in SethBling's script crashes on load; community forks fix this.
BizHawk constraint: Lua scripting is only available on Windows. Mac/Linux users must use Wine.
Path A: Run MarI/O exactly as SethBling did (Windows, BizHawk + Lua)
Download BizHawk from https://github.com/TASEmulators/BizHawk/releases. Run the prerequisite installer first, then extract and launch EmuHawk.exe.
Obtain a legal Super Mario World ROM (Super Mario World (USA).sfc). If you own the cartridge, dump it yourself. The ROM must match the filename the script expects.
Get the MarI/O script. Use the improved community fork instead of the original (which has a broken save/load): clone https://github.com/mam91/neat-genetic-mario and place the neat-mario folder inside BizHawk\Lua\SNES\.
Configure the script. Open config.lua and set _M.BizhawkDir to your BizHawk installation path.
Create a save state. In BizHawk, load the ROM, navigate to the start of Donut Plains 1, then go to File > Save State > Save Named and name it DP1.state. Copy this file to both the Lua\SNES\neat-mario\ folder and the BizHawk root directory.
Load the script. In BizHawk, open Tools > Lua Console, then Script > Open Script, and select mario-neat.lua. The NEAT control window will appear — click Start.
Let it run. Early generations will mostly idle or walk a few steps. Meaningful progress typically appears after 5–10 generations. A full run to level completion takes hours (SethBling's took 24 hours at normal emulation speed; you can increase emulation speed in BizHawk to accelerate training significantly).
Save your pool periodically. The improved fork fixes the crash-on-load bug. Use the save button in the NEAT control window to checkpoint progress between sessions.
Path B: Apply NEAT to your own problem in Python
pip install neat-python
import neat

# 1. Define your fitness function
def eval_genomes(genomes, config):
    for genome_id, genome in genomes:
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        # Run your simulation, get a score
        genome.fitness = your_simulation(net)  # must be >= 0

# 2. Load config (copy from neat-python/examples/xor/config-feedforward)
config = neat.Config(
    neat.DefaultGenome,
    neat.DefaultReproduction,
    neat.DefaultSpeciesSet,
    neat.DefaultStagnation,
    'config-feedforward'  # path to your config file
)

# 3. Run evolution
p = neat.Population(config)
p.add_reporter(neat.StdOutReporter(True))
p.add_reporter(neat.StatisticsReporter())
winner = p.run(eval_genomes, n=50)  # 50 generations
print('Best genome:', winner)
Start with the XOR example at neat-python/examples/xor/ — it solves in ~20 generations and is the canonical "hello world" for NEAT. Documentation: https://neat-python.readthedocs.io/
Key config parameters to tune:
- pop_size: population size (default 150; MarI/O uses 300)
- fitness_threshold: stop when this fitness is reached
- compatibility_threshold: controls species granularity (lower = more species)
- node_add_prob / conn_add_prob: structural mutation rates
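For orientation, an abridged config sketch showing where these parameters live (values illustrative; the real config-feedforward file in the XOR example carries many more required [DefaultGenome] keys):

```ini
[NEAT]
fitness_criterion     = max
fitness_threshold     = 3.9
pop_size              = 150
reset_on_extinction   = False

[DefaultGenome]
num_inputs            = 2
num_hidden            = 0
num_outputs           = 1
node_add_prob         = 0.2
conn_add_prob         = 0.5
# ... activation, aggregation, weight, and bias options omitted

[DefaultSpeciesSet]
compatibility_threshold = 3.0

[DefaultStagnation]
species_fitness_func  = max
max_stagnation        = 20

[DefaultReproduction]
elitism               = 2
survival_threshold    = 0.2
```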
1. NEAT's core mechanism — innovation numbers — was glossed over. The video shows speciation visually but never explains why crossover between differently-structured networks works. The answer is innovation numbers: every new gene gets a globally incrementing tag, allowing NEAT to align homologous genes across networks of different sizes without topological analysis. This is the algorithm's most important contribution.
2. The "10% of your brain" claim is neuroscience myth. SethBling uses it as an analogy for sparse network activation. The claim that humans only use 10% of their brains is scientifically false — brain imaging shows activity throughout the brain. The analogy is harmless but misleading.
3. The original script's save/load is broken. The Pastebin version crashes every time you try to load a saved pool state. Community forks (mam91/neat-genetic-mario, SngLol/NEATEvolve) fix this — use them instead.
4. Training is level-specific and does not generalize. The learned network is optimized for Donut Plains 1 only. For each new level, you must restart training from scratch with a new save state. NEAT does not learn transferable representations.
5. BizHawk Lua scripting is Windows-only. Mac and Linux users must use Wine, which adds setup complexity. An alternative is the FCEUX port (https://github.com/juvester/mari-o-fceux) which runs on Linux.
6. NEAT does not scale well to high-dimensional inputs. The major limitation of vanilla NEAT is that it evolves a single network that must simultaneously extract features and select actions. For raw pixel inputs or large state spaces, this becomes intractable. The research community has addressed this with HyperNEAT (scales to millions of connections via indirect encoding), DeepNEAT (evolves layer-level topology), and hybrid approaches like NEAT+PPO. MarI/O sidesteps this by using a hand-crafted 13×13 tile grid rather than raw pixels — a significant design choice the video doesn't highlight.
7. Alternatives to NEAT for game-playing AI:
- PPO / DQN (deep RL): Gradient-based, far more sample-efficient on complex tasks, requires GPU but handles raw pixels natively. OpenAI's PPO beat many Atari games; NEAT cannot match this at scale.
- Evolution Strategies (OpenAI ES): Gradient-free like NEAT but parallelizes trivially across CPUs; used by OpenAI to train MuJoCo locomotion policies.
- HyperNEAT: Direct descendant of NEAT, evolves large-scale geometric connectivity patterns; better for spatially structured problems.
- neat-python + Gymnasium: The modern way to apply NEAT to control tasks — includes BipedalWalker, InvertedDoublePendulum, and Hopper examples out of the box.
The paper is confirmed: Stanley and Miikkulainen, University of Texas at Austin, published in Evolutionary Computation journal in 2002, DOI 10.1162/106365602320169811.
— Source: Multiple GitHub forks confirm the script is Lua for BizHawk; the emulator name is slightly misspelled in the video ('Bisok') but is definitively BizHawk.
— Source: The 34-generation figure is specific to SethBling's single run; NEAT is stochastic and results vary across runs — no published benchmark confirms this exact figure.
— Source: Source code confirms 8 buttons for SMW (A, B, X, Y, Up, Down, Left, Right) and a 13×13 tile grid (BoxRadius = 6) as inputs, totaling 169 + 1 = 170 inputs.
— Source: The paper explicitly states NEAT starts from minimal structure (no hidden nodes) and grows complexity only as beneficial mutations survive selection.
— Source: Speciation is one of NEAT's three core innovations; the paper confirms it was novel in the context of topology-evolving neural networks (TWEANNs).
— Source: No independent source confirms the 24-hour duration; it is plausible given BizHawk's default emulation speed and 34 generations with population 300, but cannot be verified.
— Source: The '10% of your brain' claim is a well-documented neuroscience myth; brain imaging shows activity throughout the brain. SethBling uses it loosely as an analogy, not a scientific claim.
Install BizHawk on Windows, clone https://github.com/mam91/neat-genetic-mario, obtain a legal SMW ROM, create the DP1.state save state, and run MarI/O at 4x emulation speed to observe NEAT learning firsthand within a weekend
Direct hands-on experience with the system makes abstract concepts like speciation and fitness progression concrete and observable; accelerated emulation speed cuts the 24-hour run to ~6 hours
Run the neat-python XOR example locally: pip install neat-python, copy the config-feedforward file from neat-python/examples/xor/, run evolve.py, and instrument it to print species count and innovation numbers each generation to see the three NEAT pillars in action
XOR solves in ~20 generations and is the canonical NEAT hello-world; adding print statements to expose innovation numbers and species boundaries builds intuition for the algorithm's core mechanisms before tackling harder problems
Read the original NEAT paper (https://nn.cs.utexas.edu/downloads/papers/stanley.ec02.pdf) focusing specifically on Section 3 (historical markings), Section 4 (speciation), and Table 1 (ablation results showing 7x slowdown without speciation)
The video glosses over innovation numbers entirely; the paper's ablation study quantifies exactly what each pillar contributes, giving you the ability to make informed decisions when tuning NEAT for your own problems
Apply neat-python to a Gymnasium control task: pip install neat-python gymnasium, then wire NEAT's eval_genomes to CartPole-v1 or BipedalWalker-v3, log fitness per generation, and compare convergence speed against a PPO baseline from stable-baselines3
Directly benchmarking NEAT against gradient-based RL on the same task produces concrete data on where neuroevolution is competitive versus where it falls short, informing when to choose each approach
Modify MarI/O's fitness function in config.lua to add a bonus for enemy kills or coins collected (beyond rightward X-position), retrain from generation 0, and compare the learned behavior and network topology against the original fitness function
Fitness function design is the single highest-leverage variable in any evolutionary system; this experiment makes the reward-shaping tradeoff visceral and teaches you how NEAT exploits whatever objective you give it
Train MarI/O on three different SMW levels (Donut Plains 1, Yoshi's Island 1, Vanilla Dome 1) using separate save states, then compare the final network topologies and generation counts to quantify how level-specific the learned solutions are
This directly tests the generalization limitation identified in the research — that NEAT learns level-specific solutions — and produces concrete evidence for or against transfer learning claims, which is critical if you plan to use NEAT in production robotics or game AI
Implement a minimal NEAT from scratch in ~200 lines of Python (innovation counter as a global dict, compatibility distance function, speciation loop, mutation functions) without using neat-python, then verify it solves XOR
Writing NEAT from scratch forces you to resolve every ambiguity in the paper — how to handle disjoint vs excess genes, when to reset innovation numbers, how to assign offspring counts per species — producing deep algorithmic understanding that library use cannot provide
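As a starting scaffold for that from-scratch exercise, the speciation loop can be sketched as follows (illustrative; `distance` is any compatibility measure and species IDs are plain integers):

```python
def speciate(population, representatives, threshold, distance):
    """Assign each genome to the first species whose representative lies
    within `threshold` compatibility distance; genomes that match no
    existing species found a new one and become its representative."""
    species = {sid: [] for sid in representatives}
    next_id = max(representatives, default=0) + 1
    for genome in population:
        for sid, rep in representatives.items():
            if distance(genome, rep) <= threshold:
                species[sid].append(genome)
                break
        else:  # no compatible species found
            representatives[next_id] = genome
            species[next_id] = [genome]
            next_id += 1
    return species
```

With genomes stood in for by plain numbers and absolute difference as the distance, two nearby values share a species while an outlier founds its own, which is the behavior to verify before wiring in a real compatibility function.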
Read the HyperNEAT paper (Stanley et al. 2007, available at https://www.cs.ucf.edu/~kstanley/neat.html) and prototype a HyperNEAT experiment on a spatially structured task like a 2D maze, comparing network size and convergence against vanilla NEAT
HyperNEAT is the direct answer to NEAT's scaling limitation identified in the research; understanding indirect encoding is the necessary next step if you want to apply neuroevolution to robotics or vision tasks with large input spaces