Stephen Wolfram's "What Is ChatGPT Doing … and Why Does It Work?" gives an example where certain small neural nets cannot learn a function.

We'll examine how larger trained networks approximate the function. Then, by handcrafting weights ourselves, we'll discover that the function can actually be approximated arbitrarily well with fewer neurons than expected.

This will be part of my in-progress book "LLM Foundations" (working title). A YouTube video will also be produced, but likely after the jam's conclusion.

Recent Activity

!til &reveng-ann As a sign-off for the teaching portion, I decided to majorly revise the first part of my writings (on solving Caesar ciphers using a transformer) to ensure overall quality. Gave a project recap and had a fun time!

!til &reveng-ann I was coding along with a popular article on ChatGPT by Stephen Wolfram. It suggests that tiny neural nets can't reproduce a certain function (at least via automated training). By reverse-engineering how some larger nets solved it, I discovered that the tiny nets actually can reproduce the function, and arbitrarily well. This discovery will be the basis of my jam entry.