Ask HN: How to learn AI from first principles?
121 points by HardikVala 5 days ago | 54 comments
A variant of this question seems to get asked every six months, but so far I haven't seen this one tackled directly: if I want to learn the concepts and fundamentals of AI from first principles, what educational resources should I use?
I'm not interested in hands-on guides (e.g., how to train a DNN classifier in TensorFlow) or LLM-centric resources.
So far, I've put together the following curriculum:
1. Artificial Intelligence: A Modern Approach (https://aima.cs.berkeley.edu/) - Great for learning the breadth of foundational concepts, e.g., local search algorithms, building up to modern AI.
2. Probabilistic Machine Learning: An Introduction (https://probml.github.io/pml-book/book1.html) - Going more in-depth into ML.
3. Dive into Deep Learning (https://d2l.ai/) - Going deep into DL, including contemporary ideas like Transformers and Diffusion models.
4. Neural Networks and Deep Learning (http://neuralnetworksanddeeplearning.com/) could also be a great resource, but the content probably overlaps significantly with 3.
Would anybody add/update/remove anything? (Don't have to limit recommendations to textbooks. Also open to courses, papers, etc.)
Sorry for the semi-redundant post.
noduerme 5 days ago
The following is not a take that will get you a job or teach you precisely how LLMs work, because you can look that up yourself. However, it may inspire you and you may create something that has a better-than-lottery-ticket chance of being an improvement over the AI status quo:
Without reading about how it's done now, just think about how you think a neural network should function. It ostensibly has input, output, and something in the middle. Maybe its input is a 64x64 pixel handwritten character, and its output is a Unicode number. Between the input pixels (a 64x64 array) and the output are a bunch of neurons: layers of neurons that talk to each other and learn or un-learn (are rewarded or punished).
Build that. Build a cube where one side is a pixel grid and the other side delivers a number. Decide how the neurons influence each other and how they train their weights to deliver the result at the other end. However you think it should go. Just raw code it with arrays in whatever dimensions you want and make it work; you can do it in JavaScript or BASIC. Link them however you want. Don't worry about performance, because you can assume that whatever marginally works can be tested on a massive scale and show "impressive" results.
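To make that concrete, here is a minimal sketch of one way it could go, not the way it has to go. Everything in it is an assumption for illustration: a toy 3x3 grid stands in for the 64x64 one, the three glyphs and the names GLYPHS, score, predict, and train are made up, and the reward/punish rule is the plain multiclass perceptron update, swapped in because it is short. There is no "middle" here at all, just one output neuron per character; deciding what goes in between is the interesting part.

```python
# Toy stand-in for the idea above: pixel grid in, Unicode code point out.
# Assumptions: 3x3 grids instead of 64x64, three hand-drawn glyphs,
# one output neuron per glyph, no hidden layers.
GLYPHS = {
    "|": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "-": [0, 0, 0,
          1, 1, 1,
          0, 0, 0],
    "\\": [1, 0, 0,
           0, 1, 0,
           0, 0, 1],
}


def score(weights, pixels):
    """Weighted sum of the input pixels for one output neuron."""
    return sum(w * p for w, p in zip(weights, pixels))


def predict(net, pixels):
    """Return the code point whose output neuron scores highest."""
    return max(net, key=lambda code: score(net[code], pixels))


def train(examples, epochs=20):
    """examples: list of (pixels, code_point). Returns {code_point: weights}."""
    n_inputs = len(examples[0][0])
    net = {code: [0] * n_inputs for _, code in examples}
    for _ in range(epochs):
        mistakes = 0
        for pixels, target in examples:
            guess = predict(net, pixels)
            if guess != target:
                mistakes += 1
                for i, p in enumerate(pixels):
                    net[target][i] += p  # reward the neuron that should have won
                    net[guess][i] -= p   # punish the neuron that won by mistake
        if mistakes == 0:
            break  # every training glyph is classified correctly
    return net


examples = [(pixels, ord(ch)) for ch, pixels in GLYPHS.items()]
net = train(examples)
for ch, pixels in GLYPHS.items():
    print(repr(ch), "->", predict(net, pixels))
```

Swapping the update rule, adding layers between input and output, or changing how neurons are linked are all places to try your own ideas before looking up how it is usually done.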