wip: explain intro better

nobody 2025-11-04 13:14:50 -08:00
commit 515a0e6d81
Signed by: GrocerPublishAgent
GPG key ID: D460CD54A9E3AB86


@@ -8,11 +8,20 @@ import { Math } from "~/components/math/math"
A couple of days ago I came across
[github.com/mattneary/salience](https://github.com/mattneary/salience) by Matt Neary. I thought it
was quite neat how he took sentence embeddings and in just a few lines of code
was able to determine the significance of all sentences in a document.
was quite neat how, armed with a good understanding of math, he was able to take sentence embeddings and, in fewer lines of code
than this introduction, determine the significance of all sentences in a document.
This post is an outsider's view of how salience works. If you're already working with ML models in Python, this will feel
torturously detailed. I wrote this for the rest of us old world programmers: compilers, networking, systems programming looking at
This is not a description of [all the changes I made and extra book-keeping involved to turn Matt's script into a proper web app demo](/grunt-work).
This post is an outsider's view of how Matt's salience code works. If you're
already working with ML models in Python, this will feel torturously detailed.
I'm going to be explaining everything 3 times: the equations an ML engineer would doodle out, the element-by-element matrix operations to give you a feel for the dataflow, and the numpy code that implements it.
When you see `sims /= norms.T` in numpy, I want to explain the matrix dimensions.
I wrote this for the rest of us old world programmers: compilers, networking, systems programming looking at
C++/Go/Rust, or the poor souls in the frontend Typescript mines.
For us refugees of the barbarian past, the tooling and notation can look foreign. I wanted to walk through the math and
numpy operations in detail to show what's actually happening with the data.
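
To give a taste of what "explaining the matrix dimensions" could look like for `sims /= norms.T`, here is a minimal sketch. It assumes `sims` is an n×n matrix of dot products between sentence embeddings and `norms` is an n×1 column of embedding lengths; the hunk above does not show those definitions, so the setup lines and the toy `embeddings` array are hypothetical and not necessarily how Matt's script builds them.

```python
import numpy as np

# Hypothetical toy data: 3 "sentence embeddings" of dimension 2.
embeddings = np.array([[3.0, 4.0],
                       [1.0, 0.0],
                       [0.0, 2.0]])

# Assumed setup (not shown in the diff): pairwise dot products, shape (3, 3).
sims = embeddings @ embeddings.T

# Assumed setup: length of each embedding, kept as a column, shape (3, 1).
norms = np.linalg.norm(embeddings, axis=-1, keepdims=True)

# (3, 3) / (3, 1): the column is broadcast across columns,
# so every entry in row i is divided by norms[i].
sims /= norms

# norms.T has shape (1, 3): the row is broadcast down the rows,
# so every entry in column j is divided by norms[j].
sims /= norms.T

print(sims.shape, norms.shape, norms.T.shape)
```

After both divisions, entry `[i, j]` is the dot product of embeddings i and j divided by both of their lengths, which is the cosine similarity. The transpose is what switches the broadcast from "per row" to "per column".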