Neuron 1863 (Layer 0, mlp)

Universal number detector (decomposition case study)

Total weights: 21
Inputs (W_in): 10
Outputs (W_out): 11

Input Weights (residual dim → neuron 1863)

Each input weight carries a different signal into the neuron.

Rank   Weight              Indices       Max KL    Mean KL
#16    blocks.0.mlp.W_in   [464, 1863]   26.7662   0.009274
#23    blocks.0.mlp.W_in   [71, 1863]    22.7326   0.004690
#30    blocks.0.mlp.W_in   [395, 1863]   17.1137   0.002023
#34    blocks.0.mlp.W_in   [799, 1863]   16.0585   0.002183
#40    blocks.0.mlp.W_in   [578, 1863]   13.6505   0.005442
#103   blocks.0.mlp.W_in   [131, 1863]   8.0407    0.030843
#163   blocks.0.mlp.W_in   [678, 1863]   5.9731    0.004917
#1032  blocks.0.mlp.W_in   [87, 1863]    1.9184    0.000055
#2101  blocks.0.mlp.W_in   [558, 1863]   1.0570    0.000906
#3751  blocks.0.mlp.W_in   [299, 1863]   0.0121    0.000036
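
The rows above share a fixed layout (rank, weight name, [residual dim, neuron] index pair, Max KL, Mean KL), so they can be read into records for sorting or filtering. The following is a minimal sketch only: the `WeightRow` type, the `parse_row` helper, and the regular expression are hypothetical names introduced for illustration, not part of any tool referenced on this page.

```python
import re
from typing import NamedTuple


class WeightRow(NamedTuple):
    """One table row. Hypothetical record type, introduced for illustration."""
    rank: int
    matrix: str
    row: int      # first index: residual dim for W_in rows
    col: int      # second index: neuron for W_in rows
    max_kl: float
    mean_kl: float


# Matches e.g. "#16 blocks.0.mlp.W_in [464, 1863] 26.7662 0.009274".
ROW_RE = re.compile(
    r"#(\d+)\s+(\S+)\s+\[(\d+),\s*(\d+)\]\s+([\d.]+)\s+([\d.]+)"
)


def parse_row(line: str) -> WeightRow:
    """Parse one table line into a WeightRow, raising on malformed input."""
    match = ROW_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognised row: {line!r}")
    rank, matrix, row, col, max_kl, mean_kl = match.groups()
    return WeightRow(int(rank), matrix, int(row), int(col),
                     float(max_kl), float(mean_kl))


# Three rows copied from the input-weight table.
rows = [parse_row(text) for text in [
    "#16 blocks.0.mlp.W_in [464, 1863] 26.7662 0.009274",
    "#103 blocks.0.mlp.W_in [131, 1863] 8.0407 0.030843",
    "#3751 blocks.0.mlp.W_in [299, 1863] 0.0121 0.000036",
]]

# The row with the largest Mean KL need not be the one with the largest Max KL.
top_by_mean = max(rows, key=lambda r: r.mean_kl)
print(top_by_mean.row, top_by_mean.mean_kl)
```

In the table, ranks appear to follow Max KL, so sorting by Mean KL instead can surface a different ordering: residual dim 131 has the largest Mean KL among the input weights despite its lower rank.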

Output Weights (neuron 1863 → residual dim)

Each output weight routes the neuron's signal to a different downstream channel.

Rank   Weight              Indices       Max KL    Mean KL
#718   blocks.0.mlp.W_out  [1863, 968]   2.4633    0.003398
#1166  blocks.0.mlp.W_out  [1863, 773]   1.7583    0.001808
#2617  blocks.0.mlp.W_out  [1863, 495]   0.8604    0.001381
#3180  blocks.0.mlp.W_out  [1863, 360]   0.6373    0.000885
#3694  blocks.0.mlp.W_out  [1863, 915]   0.4525    0.001033
#3714  blocks.0.mlp.W_out  [1863, 126]   0.4328    0.000428
#3745  blocks.0.mlp.W_out  [1863, 903]   0.2944    0.000787
#3747  blocks.0.mlp.W_out  [1863, 756]   0.2026    0.000323
#3748  blocks.0.mlp.W_out  [1863, 396]   0.1069    0.000297
#3749  blocks.0.mlp.W_out  [1863, 267]   0.0995    0.000183
#3750  blocks.0.mlp.W_out  [1863, 696]   0.0833    0.000105
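
This page does not state how Max KL and Mean KL are computed. One plausible reading, offered here purely as an assumption, is that a single weight is zeroed and the KL divergence between the original and ablated output distributions is measured at each position, then summarised by its maximum and mean. The sketch below illustrates that idea on a toy one-layer MLP. The sizes, the ReLU activation, the random weights, and the `ablate_weight_kl` helper are all hypothetical; only the index convention (W_in is [residual dim, neuron], W_out is [neuron, residual dim]) comes from the tables above.

```python
import numpy as np

# Toy sizes, chosen for illustration; the real model dimensions are not given.
D_MODEL, D_MLP, VOCAB, N_POS = 8, 16, 10, 32


def softmax(x):
    """Numerically stable softmax over the last axis."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def forward(resid, w_in, w_out, w_unembed):
    """Residual stream -> MLP (ReLU assumed) -> output distribution."""
    hidden = np.maximum(resid @ w_in, 0.0)   # W_in: [residual dim, neuron]
    resid_out = resid + hidden @ w_out       # W_out: [neuron, residual dim]
    return softmax(resid_out @ w_unembed)


def kl_per_position(p, q, eps=1e-12):
    """KL(p || q) for each row of two [positions, vocab] distributions."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)


def ablate_weight_kl(resid, w_in, w_out, w_unembed, matrix, index):
    """Zero one weight and return (max KL, mean KL) over positions.

    Assumed procedure, not confirmed by the source page.
    """
    base = forward(resid, w_in, w_out, w_unembed)
    w_in_abl, w_out_abl = w_in.copy(), w_out.copy()
    target = w_in_abl if matrix == "W_in" else w_out_abl
    target[index] = 0.0                      # single-weight ablation
    ablated = forward(resid, w_in_abl, w_out_abl, w_unembed)
    kl = kl_per_position(base, ablated)
    return float(kl.max()), float(kl.mean())


rng = np.random.default_rng(0)
resid = rng.normal(size=(N_POS, D_MODEL))
w_in = rng.normal(size=(D_MODEL, D_MLP))
w_out = rng.normal(size=(D_MLP, D_MODEL))
w_unembed = rng.normal(size=(D_MODEL, VOCAB))

# Ablate the weight from residual dim 3 into neuron 5 (toy indices).
max_kl, mean_kl = ablate_weight_kl(resid, w_in, w_out, w_unembed, "W_in", (3, 5))
print(f"W_in[3, 5]: max KL = {max_kl:.4f}, mean KL = {mean_kl:.4f}")
```

Under this reading, a weight with a large Max KL but a small Mean KL would matter strongly at a few positions and hardly at all elsewhere, which matches the pattern seen in most rows of both tables.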