How Did AI Learn to Pick Seismic Phases? From PhaseNet to SeisBench

How did AI come to take over the job of picking seismic phases? In just a few years, the field went from every lab writing its own code to calling someone else's pre-trained model in a single line. This is the story of how that path got laid, with three protagonists: PhaseNet, EQTransformer, and the SeisBench toolbox that tied them together.

2019: PhaseNet. Recast phase picking as an image-segmentation problem (U-Net's home turf), with Gaussian-curve targets that train well; accurate enough to become the field's benchmark.

2020: EQTransformer. Brought in the attention mechanism and split detection, P, and S into three independent heads so they don't interfere.

2022: SeisBench. Wrapped many models and datasets in one interface, so results could finally be compared fairly and reproduced.

PhaseNet: turning phase picking into an image-segmentation task

In 2019, Weiqiang Zhu and Greg Beroza at Stanford introduced PhaseNet. Its most important move was to redefine the problem. Picking had been treated as "finding one precise arrival time in a long stream of signal"; PhaseNet recast it as an image-segmentation task: label, sample by sample, whether each point is a P-wave, an S-wave, or nothing at all. That is exactly what the U-Net architecture was built for: U-Net was originally designed for biomedical image segmentation (its first targets were cells in microscope images) and has since become a standard tool for outlining organs in scans such as X-rays and MRIs. Picking seismic phases and outlining an organ are, at heart, the same kind of problem.

The second change is subtler but just as important. The "correct" arrival time of a seismic wave is a single instant, but asking a model to hit one exact point is very hard to train: the target is sharp and sparse, and being slightly off counts as almost entirely wrong. PhaseNet spreads that sharp instant into a Gaussian curve, centered on the human-labeled position with a little width on each side, which writes the labeling error humans always have straight into the training target. A wider target gives the model room for error and trains far more easily.
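To make the difference between a sharp target and a Gaussian one concrete, here is a minimal sketch in plain Python. The window length, pick position, and width (`n_samples`, `p_pick`, `sigma`) are illustrative assumptions, not values taken from the PhaseNet paper, and `gaussian_target` is a hypothetical helper name.

```python
import math

def gaussian_target(n_samples, pick_index, sigma):
    """Return a per-sample label in [0, 1] that peaks at the picked sample."""
    return [
        math.exp(-((i - pick_index) ** 2) / (2.0 * sigma ** 2))
        for i in range(n_samples)
    ]

# Assumed numbers: a 3000-sample window with an analyst P pick at sample 1200.
n_samples, p_pick, sigma = 3000, 1200, 10.0

# Sharp target: 1 at the picked sample, 0 everywhere else.
hard_label = [1.0 if i == p_pick else 0.0 for i in range(n_samples)]
# Gaussian target: the same pick, spread over neighboring samples.
soft_label = gaussian_target(n_samples, p_pick, sigma)

# A prediction that lands 5 samples late gets no credit from the sharp label,
# but still scores about 0.88 against the Gaussian target.
print(hard_label[p_pick + 5], round(soft_label[p_pick + 5], 2))
```

With the sharp label, a near miss and a wild miss look the same to the training loss; with the Gaussian label, a near miss is only slightly penalized.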

Feed the three-component waveform into a U-Net, and out come three probability curves (P, S, and noise); wherever a curve spikes, that phase has arrived. Trained on decades of hand-labeled data from Northern California, it was fast and accurate, accurate enough to become the field's benchmark: when a new method appears, the first thing people do is compare it against PhaseNet.
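The "wherever a curve spikes" step can be sketched as a simple peak search over one probability curve. This is an illustrative stand-in, not PhaseNet's own post-processing: the function names, the threshold of 0.5, the minimum spacing of 100 samples, and the synthetic curve are all assumptions.

```python
import math

def pick_peaks(prob, threshold=0.5, min_separation=100):
    """Return sample indices of local maxima in `prob` at or above `threshold`.

    A peak closer than `min_separation` samples to an already accepted,
    stronger peak is dropped.
    """
    candidates = [
        i for i in range(1, len(prob) - 1)
        if prob[i] >= threshold and prob[i - 1] < prob[i] >= prob[i + 1]
    ]
    accepted = []
    # Visit the strongest candidates first, then enforce the spacing rule.
    for i in sorted(candidates, key=lambda k: prob[k], reverse=True):
        if all(abs(i - j) >= min_separation for j in accepted):
            accepted.append(i)
    return sorted(accepted)

def bump(n, center, height, sigma=10.0):
    """Synthetic Gaussian-shaped bump, standing in for a model output."""
    return [height * math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(n)]

# Assumed P-probability curve: a confident peak at sample 1200, a weak one at 2000.
p_prob = [a + b for a, b in zip(bump(3000, 1200, 0.9), bump(3000, 2000, 0.3))]

sampling_rate = 100.0  # assumed samples per second
p_picks = pick_peaks(p_prob)
print([i / sampling_rate for i in p_picks])  # [12.0]
```

Only the peak that clears the threshold becomes a pick; the weak bump at sample 2000 is ignored. The same function would be applied separately to the S curve.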

Figure: a real phase-picking example, showing a three-component waveform and the P and S Gaussian probability outputs. The top three traces are the three components (Z / N / E) of a single earthquake, and the bottom panel is the model's output. The light curves are the human-labeled Gaussian targets (the ground truth); the dark ones are the predictions, with the P (blue) and S (red) peaks landing almost exactly on the targets. Both ideas show up here: the input is the whole waveform read point by point (the image-segmentation view), and the target is not a single instant but a curve with width.
Image / SeisBlue

EQTransformer: pulling three tasks apart

In 2020, Mousavi and colleagues, also at Stanford (Zhu and Beroza are among the co-authors), introduced EQTransformer, published in Nature Communications.

If PhaseNet proved that "deep learning can pick phases," EQTransformer asked "can we do it more cleverly?" It brought in the attention mechanism that was then sweeping the AI world (the same underlying idea that large language models later run on), letting the model weigh distant and nearby cues simultaneously as it reads a stretch of waveform.

Its key design is to pull three tasks apart: detecting whether there is an earthquake in the segment at all, picking the P-wave, and picking the S-wave, each handled by its own independent output branch (head) so they don't interfere. The reason lies in how the targets differ. PhaseNet predicts P, S, and noise together as three side-by-side channels. A detection signal (is there an earthquake?) is a different kind of target: it spans the whole event, wide and long, and so carries far more weight than the two narrow Gaussian peaks of P and S. Learn them all mixed together and the tiny P and S signals are easily drowned out by the detection one. Splitting them into separate heads lets each task focus on its own job. EQTransformer was trained on STEAD, a large public dataset bigger than the one PhaseNet used.
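A rough way to see the imbalance is to build the three label curves and add up how much positive target each one carries. This is a sketch under assumed numbers: the window length, pick positions, event end, and Gaussian width are hypothetical, and the box-shaped detection label is a simplification rather than EQTransformer's exact labeling rule.

```python
import math

n_samples = 6000              # assumed 60 s window at 100 Hz
p_pick, s_pick = 1000, 1800   # assumed analyst picks (sample indices)
event_end = 3500              # assumed end of the event (sample index)
sigma = 10.0                  # assumed Gaussian width in samples

def gaussian(center):
    """Gaussian label curve centered on one pick."""
    return [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(n_samples)]

# One label curve per task, mirroring the three output branches.
labels = {
    "detection": [1.0 if p_pick <= i <= event_end else 0.0 for i in range(n_samples)],
    "P": gaussian(p_pick),
    "S": gaussian(s_pick),
}

# "Mass" of each label: how many samples' worth of positive target it carries.
mass = {name: sum(curve) for name, curve in labels.items()}
for name, m in mass.items():
    print(f"{name:9s} {m:8.1f}")
print(f"detection / P ratio: {mass['detection'] / mass['P']:.0f}")
```

With these numbers the detection label carries roughly a hundred times more positive samples than either pick label, which is the sense in which the narrow P and S peaks can be drowned out when everything shares one output.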

The trouble after a hundred flowers bloom

After PhaseNet and EQTransformer, all kinds of models appeared. That's a good thing in principle, but the field soon ran into a practical problem: nobody could compare the models fairly.

Each model used differently formatted data, different preprocessing, different scoring. To find out whether model A really beat model B, you first had to spend weeks rebuilding both data pipelines and environments from scratch. Just the "load the data in" step was enough to put many people off. If research can't produce reproducible comparisons, the field struggles to move forward.

SeisBench: a USB port for seismic AI

In 2022, a German research team (Woollam, Münchmeyer, and colleagues) released SeisBench, which addresses exactly this.

Think of it as a USB port for seismic AI: any pre-trained model (PhaseNet, EQTransformer, and more) and any public dataset plug in the same way. Want to swap models? Change one line. Want a different dataset? Change another. What used to take weeks to set up now runs in a few lines of code, and everyone's results can be compared directly.
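As a rough illustration of what "change one line" looks like, here is a minimal sketch built on SeisBench's `seisbench.models` module. The wrapper names `load_picker` and `annotate_stream` are hypothetical, the weight set name "original" is assumed to be available for the chosen model, and the waveform file in the usage comment is a placeholder. It needs `seisbench` and `obspy` installed, plus network access the first time weights are downloaded.

```python
def load_picker(model_name="PhaseNet", weights="original"):
    """Load a pre-trained picker by class name from seisbench.models."""
    import seisbench.models as sbm  # needs `pip install seisbench`

    model_class = getattr(sbm, model_name)  # e.g. sbm.PhaseNet, sbm.EQTransformer
    return model_class.from_pretrained(weights)


def annotate_stream(stream, model_name="PhaseNet", weights="original"):
    """Run the chosen picker over an ObsPy Stream.

    Returns an ObsPy Stream of probability traces, one per output curve.
    """
    model = load_picker(model_name, weights)
    return model.annotate(stream)


# Usage (hypothetical file name; swapping models is the one-line change):
#   import obspy
#   stream = obspy.read("my_waveforms.mseed")
#   phasenet_curves = annotate_stream(stream, "PhaseNet")
#   eqt_curves = annotate_stream(stream, "EQTransformer")
```

Datasets follow the same pattern through the `seisbench.data` module, so switching the benchmark data is likewise a matter of naming a different dataset class.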

SeisBench didn't invent a new phase picker, but what it did matters just as much: it turned the field from "everyone doing their own thing" into "anyone can pick it up, and results can be reproduced." Today a newcomer can try out the results of several past papers in a single afternoon, something that was barely possible five years ago.

So what's next?

Over these few years, AI phase picking grew from a new idea into an open, reproducible ecosystem. The infrastructure is in place, and the barrier to entry has dropped.

But mature tools don't mean every problem is solved. Models still fail in certain situations: the S-wave amplitude can get suppressed, and performance can drop on regions and instruments the models never saw during training. Once the foundation is solid, attention naturally shifts from "can it be done at all" to these more specific flaws.

For a look at how one of those specific problems gets taken apart, read on: Why Does AI Keep Missing S-Waves? →
