A real fully-connected network learning to read digits. Every ball is a neuron, every line is a weight. Click any ball to open it and watch its exact sum; a line is bright because upstream activation × its weight is large in magnitude.
∇-check pending…
epoch: 0
samples seen: 0
train loss: —
test accuracy: —
parameters: —
status: idle
forward · activation signal
excitatory (+ weight)
inhibitory (− weight)
backward · error gradient
gradient pulse →
lines = the model (weights)
balls = activations (per input)
drag to orbit · scroll to zoom · click a ball to open it
During training: known vs learned
INPUT · values KNOWN: the pixels we feed
HIDDEN 1 · weights UNKNOWN: meaning self-taught, no target
HIDDEN 2 · weights UNKNOWN: meaning self-taught, no target
OUTPUT · meaning + target KNOWN, weights UNKNOWN
Every wire is a weight, and every weight is UNKNOWN; training solves for all of them. We design the layers, sizes, activations & loss; training gives us the weights. Hidden layers get no answer key, only blame passed back from the output.
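The number of unknowns follows from the layer sizes alone. The sketch below counts them for a fully-connected stack. The 256 inputs (16×16) and 10 outputs come from this page; the hidden sizes 32 and 16 and the one-bias-per-neuron convention are assumptions made only for illustration, since the page does not state them.

```python
def count_parameters(layer_sizes, with_bias=True):
    """Count weights (and optionally biases) in a fully-connected network."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out      # one weight per wire between two layers
        if with_bias:
            total += fan_out           # assumed: one bias per receiving neuron
    return total

# 256 and 10 are from the page; 32 and 16 are hypothetical hidden-layer sizes.
sizes = [256, 32, 16, 10]
n_params = count_parameters(sizes)
print(n_params)
```

Changing either hidden size changes the count, which is why the parameter readout depends on the architecture we design rather than on anything the training learns.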
Inside the neuron · the exact sum
Why is a line bright? · the mechanism
Each wire carries one number: the upstream neuron's activation times this wire's weight
(a_i · w_ij). A wire is bright when that product is large in magnitude, because the
source neuron is firing hard, or the weight is big, or both. Warm = pushing the target
up, cool = pulling it down. Open a neuron, then hover its bars to light up the exact wire.
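A minimal sketch of that mechanism for one neuron: each wire contributes a_i · w_ij, and the neuron's sum is the total of those contributions. The activation and weight values are made up for illustration, and the bias term is an assumption; the page does not say whether neurons carry one.

```python
def neuron_sum(activations, weights, bias=0.0):
    """Return each wire's contribution a_i * w_ij and their total plus bias."""
    contributions = [a * w for a, w in zip(activations, weights)]
    return contributions, sum(contributions) + bias

# Hypothetical values: three upstream activations and the weights into one neuron.
a = [0.9, 0.1, 0.5]
w = [0.8, -2.0, -0.4]
contribs, z = neuron_sum(a, w)

# The "brightest" wire is the one whose product has the largest magnitude.
brightest = max(range(len(contribs)), key=lambda i: abs(contribs[i]))
for i, c in enumerate(contribs):
    direction = "warm (pushes up)" if c > 0 else "cool (pulls down)"
    print(f"wire {i}: {c:+.2f} {direction}")
print("sum:", round(z, 4), "brightest wire:", brightest)
```

Note that the second wire has the largest weight in magnitude but a weak source activation, so its product stays small: a big weight alone does not make a wire bright.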
Supervision · what's given
probing · the image: always known
answer: —
net's guess: — —
loss: — (how wrong it is)
correction sent backward
Each training image arrives paired with its correct answer; the net shifts its weights toward it. At probe / test time the answer is withheld — the net must produce it alone.
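One way to make "how wrong it is" and "correction sent backward" concrete is sketched below. It assumes the loss is cross-entropy on softmax probabilities, which is a common pairing but is not confirmed by this page; the probability values are invented for the example.

```python
import math

def cross_entropy(probs, target):
    """Loss for one example: negative log of the probability given to the true class."""
    return -math.log(probs[target])

def output_gradient(probs, target):
    """Gradient of softmax + cross-entropy w.r.t. the output scores: p - one_hot."""
    return [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]

# Hypothetical guess over 10 digits: 55% on digit 3, 5% on each of the others.
probs = [0.05] * 10
probs[3] = 0.55
target = 3  # the correct answer paired with the training image

loss = cross_entropy(probs, target)
grad = output_gradient(probs, target)
print("loss:", round(loss, 4))
print("correction at the output:", [round(g, 2) for g in grad])
```

Under this assumption the correction is negative only at the correct digit (raise its score) and positive everywhere else (lower theirs). At probe time there is no target, so neither quantity can be computed.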
Output · the 10 answer neurons (softmax)
The last layer's 10 neurons compete; softmax turns their scores into probabilities that sum to 1.
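A short sketch of softmax over 10 scores. The score values are illustrative only; subtracting the maximum before exponentiating is a standard numerical-stability step and does not change the result.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities; subtracting the max avoids overflow."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores from the 10 output neurons (digits 0 through 9).
scores = [1.2, -0.3, 0.5, 3.1, 0.0, -1.0, 0.7, 2.0, -0.5, 0.1]
probs = softmax(scores)
guess = max(range(10), key=lambda k: probs[k])
print("net's guess:", guess, "with probability", round(probs[guess], 3))
```

Because the probabilities must sum to 1, raising one neuron's score necessarily lowers every other neuron's share, which is the sense in which the outputs compete.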
Probe & draw · feed it a digit
Send a clean digit, or paint one (16×16 is the network's real input size). The whole graph re-fires instantly.
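A painted digit has to become one value per input neuron before the graph can fire. The sketch below flattens a 16×16 grid into 256 inputs; the row-major order and the 0-to-1 ink scale are assumptions, since the page does not describe how the canvas is encoded.

```python
def flatten_grid(grid, size=16):
    """Flatten a size x size grid of ink values into one input vector (row-major)."""
    if len(grid) != size or any(len(row) != size for row in grid):
        raise ValueError(f"expected a {size}x{size} grid")
    return [float(v) for row in grid for v in row]

# Hypothetical drawing: a blank canvas with one vertical stroke in column 8.
canvas = [[0.0] * 16 for _ in range(16)]
for r in range(2, 14):
    canvas[r][8] = 1.0

inputs = flatten_grid(canvas)
print(len(inputs), "input values,", int(sum(inputs)), "pixels inked")
```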