Here are some non-exhaustive notes on the frontiers of my understanding of various topics. I read about Feynman doing something similar; I’ve found that keeping these notes has been helpful for me since I started in late 2017.

If you know a lot about something that I list here and you want to tell me about it, I’d love to hear your explanation!

## Paul Christiano’s work

Concepts I don’t understand

- Meta-execution
- learning the right model
- Annotated functional programming
- benign model-free RL
- benign model-based RL (https://ai-alignment.com/aligned-search-366f983742e9)

Concepts I do understand

- reliability and robustness
- oversight/reward learning
- Two approaches: IRL, learning from human feedback
- deliberation and amplification
- secure
- competitive
- scalable
- adversarial training
- ensembling and consensus
- Against mimicry
- When can mimicry go wrong? robot and blocks example
- Weak and strong HCH
- KWIK (“knows what it knows”) learning

## Reinforcement learning

- Generalized advantage estimation
- vanishing gradients
- I’ve forgotten the RL stuff I learned before. At some point I should draw a list of connections between RL ideas and things in deep RL.
- E.g. TD(λ)
- WRT policy gradients: Finite difference methods, likelihood ratio methods, REINFORCE, natural policy gradients
- Variance reduction, in policy gradients
- local response normalization
- q-prop
- control variate
- How powerful is AlphaZero’s value function
- According to this post, deep RL only works well (and better than other things) when (a) you can run simulations cheaply (so sample efficiency doesn't matter much) but (b) you can't plan against the true model (so you can't use normal control theory or tree search). Do conditions (a) and (b) ever actually obtain in the real world? Alternatively, is this claim just wrong? (HT Daniel Filan)

Things formerly on this list:

- linear policies
- I wasn't quite clear on how convolution across multiple timesteps works. Now I understand
- Conv nets—I don’t really understand them very well and should do the tutorial
- I feel a little better now
- Action advantage/Advantage function
- I think I know this now! This is how much better an action is than other actions would be.
- Specifics of convolution: what are the arguments? I should consider rewriting these functions myself.
- Max pooling
- Python’s `namedtuple` things
- Q-function network vs target network
- Dueling DQN
- Double DQN
- model-free RL
- model-based RL
- batch normalization
- TRPO
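
A minimal sketch of the advantage function described above, assuming a tiny tabular setting where the Q-values and the policy's action probabilities for a single state are already known. The numbers and the names `q_values`, `policy` and `advantages` are made up for illustration.

```python
def advantages(q_values, policy):
    """Return A(s, a) = Q(s, a) - V(s) for every action in one state.

    V(s) is the policy-weighted average of the Q-values, so the advantage
    measures how much better an action is than what the policy does on average.
    """
    state_value = sum(p * q for p, q in zip(policy, q_values))
    return [q - state_value for q in q_values]


# Hypothetical numbers: three actions available in a single state.
q_values = [1.0, 2.0, 4.0]
policy = [0.5, 0.25, 0.25]
adv = advantages(q_values, policy)
print(adv)
```

By construction the policy-weighted average of the advantages is zero, which is one reason it is a convenient baseline-subtracted signal for policy gradients.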

## Math

- calculus of variations, Lagrange multipliers
- I understood this for a brief period once!
- Lagrange multiplier type things for constrained optimization
- Why is group theory cool? This is embarrassing to admit: I like abstract algebra but I don’t actually know of concrete situations where it’s very helpful.
- Contour integration
- What are the Clebsch Gordan coefficients? How does this work?
- Representation theory
- I don’t know about the different types of infinities
- Category theory
- I should relearn the definitions of sum, product, and exponentials
- I now know sum and product, I think
- Sigma algebras
- Is there biological life inside Game of Life? Eg, self-replicating structures. Also, what is the very large scale, very long term behavior of GoL?
- What is the proper scoring rule for confidence intervals?
- answer is here but I haven’t read it
- Linear algebra
- I’m sure there’s a whole bunch of conceptual linear algebra points I don’t know.

## Economics

- How has productivity changed over time? I hear it hasn’t increased?
- What does “returns to capital” vs “returns to labor” mean?
- International trade stuff
- I don’t understand monetary policy very well
- Where does money come from?
- Why do fiat currencies work? Is it just something about stickiness of prices?
- Why don’t more contracts involve inflation-indexed prices? Seems like this would often be better for both parties (eg employees, especially when getting paid by a multinational corporation)

## Statistics

- Time series modelling
- ANOVA
- correlation coefficient
- In general there are a bunch of things about linear regression where I don’t think I know the words for things.

## Statistical mechanics

- Equipartition theorem
- Apparently this implies the Dulong-Petit law: the heat capacity of a solid is about 3 times the gas constant per mole of atoms (equivalently, $$3 k_B$$ per atom). So elements with lower atomic weights have higher heat capacity per unit mass.
- This law overestimates the heat capacity, because nearby molecules have correlated motion.
- Thermal wavelength
- Maximum entropy thermodynamics
- Suppose I have a bunch of oxygen and a bunch of nitrogen in tanks next to each other, at the same temperature. This has lower entropy than the state where the gases are at diffusive equilibrium. How in general do I extract energy from the process of mixing them?
- I don’t know about the different kinds of cycles and stuff
- I have now decided that I don’t care much about this
- Deriving the Sackur-Tetrode equation for entropy from information theory
- "In addition to using the thermodynamic perspective of entropy, the tools of information theory can be used to provide an information perspective of entropy. The Sackur–Tetrode equation for entropy can be derived in information theoretic terms. The equation can be seen to consist of the sum of four entropies (missing information) due to positional uncertainty, momenta uncertainty, the quantum mechanical uncertainty principle and the indistinguishability of the particles." This sounds pretty dope and I don’t understand it.
- Calculating a black hole radius from no-hair and thermodynamics.
- $$dU = T \cdot dS - P \cdot dV + \mu \cdot dN$$
- I am sort of familiar with these facts now
- How do we use fundamental physical models to calculate:
- thermal conductivity
- phase transitions
- Why does a mix of oxygen and nitrogen start to liquefy below the freezing point of either oxygen or nitrogen? (This is explained in 5.4 of Schroeder.)
- I still don’t know anything about how engines and refrigerators work, but I don’t know if I care
- I still don’t quite understand how black body radiation works. I should read Manton and Mee section 10.10 properly.
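
A rough sketch of the Dulong-Petit point from the top of this list, assuming the law holds exactly (molar heat capacity of $$3R$$) and using rounded standard atomic weights. The element selection and the function name `dulong_petit_specific_heat` are just for illustration.

```python
R = 8.314462618  # molar gas constant, J / (mol K)

# Approximate molar masses in g/mol (rounded standard atomic weights).
MOLAR_MASS = {"Al": 26.98, "Fe": 55.85, "Cu": 63.55, "Pb": 207.2}


def dulong_petit_specific_heat(molar_mass_g_per_mol):
    """Dulong-Petit estimate of specific heat in J / (g K): 3R per mole of atoms."""
    return 3 * R / molar_mass_g_per_mol


specific_heat = {el: dulong_petit_specific_heat(m) for el, m in MOLAR_MASS.items()}
for el, c in specific_heat.items():
    print(f"{el}: {c:.3f} J/(g K)")
```

Because the per-mole value is the same for every element in this approximation, the per-gram value just scales as one over the atomic weight, so lighter elements come out higher.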

Things I now understand:

- Some systems have their temperature fall as their energy rises. And allegedly some have their entropy fall as their energy rises? I don’t get this.
- Negative temperatures. For example, a paramagnet with more than half its dipoles in the higher-energy state. These try to give their energy to any positive-temperature system that’s nearby.
- $$\frac1T = \frac{\partial S}{\partial U}$$
- Gibbs free energy is defined as $$G = U + PV - TS$$: this is the amount of energy required to create the thing and make space for it, if you’re allowed to take heat from the environment.
- What proportion of atoms in a paramagnet have aligned spins, as a function of temperature?
- Enthalpy: the energy of a system plus the work required to make room for it, in an environment with constant pressure. Enthalpy is $$H = U + PV$$.
- What precisely is a degree of freedom? How come I can’t turn my degree of freedom into more degrees of freedom by interpreting “horizontal position” with some weird ugly function from $$\mathbb{R} \to \mathbb{R}^2$$?
- How do fractional degrees of freedom work?
- The answer to this is that “degrees of freedom” are just an approximation to the underlying reality. What actually matters is all of the definite energy wavefunctions available, and their energy levels. You can’t turn your degree of freedom into more degrees of freedom with a space-filling curve because your degree of freedom was just being approximated as a continuous space and if you apply a space-filling curve then you break the approximation. The reason we bother talking about “degrees of freedom” at all is that often in practice we actually do have degrees of freedom that can be modelled pretty well as uncorrelated. If our degrees of freedom of a system (let’s call them a and b) are "orthogonal", this means that we can write the energy of the system as the sum of a function of a and a function of b.
- Fractional degrees of freedom occur when your energy function can’t be separated into a sum of functions of each of its degrees of freedom.
- An example model of fractional degrees of freedom is the Ising model.
- Imagine you have a cylinder that's a centimeter wide and 1000km tall, with a base here on Earth somewhere. You have some amount of gas in it. How does pressure vary with height in the cylinder?
- I think this is the barometric pressure equation.
- What is the distribution of velocities and positions in this cylinder?
- When we extract energy from increasing entropy (eg by mixing two substances), where does the energy come from? Kinetic energy of the molecules?
- yes
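
A small sketch of the barometric pressure equation for the cylinder question above, $$p(h) = p_0 e^{-Mgh/RT}$$. It assumes an ideal gas at a single temperature and a constant gravitational acceleration. Both assumptions are poor over a full 1000 km column, so treat it as an illustration of the exponential form only. The default molar mass (roughly air) and temperature are assumed values. Under the same assumptions, the number density falls off with height in the same exponential way while the velocity distribution is the same Maxwell-Boltzmann distribution at every height.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def scale_height(molar_mass=0.029, temperature=288.0, g=9.81):
    """Height over which pressure falls by a factor of e, in metres."""
    return R * temperature / (molar_mass * g)


def pressure_ratio(height_m, molar_mass=0.029, temperature=288.0, g=9.81):
    """p(h) / p(0) for an isothermal ideal gas in a uniform gravitational field."""
    return math.exp(-height_m / scale_height(molar_mass, temperature, g))


for h in (0.0, 1_000.0, 10_000.0, 100_000.0):
    print(f"h = {h:>9.0f} m   p/p0 = {pressure_ratio(h):.3e}")
```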

## Machine learning

- Deep learning
- Adam, RMSProp
- Details of attentional models like the Transformer
- Variational autoencoders
- global average pooling
- Inception score
- ResNet
- Variational inference, evidence lower bound, etc
- Markov chain Monte Carlo
- SVMs
- What’s the statistical interpretation of logistic regression? It’s something about a maximum likelihood Gaussian or something, but I don’t remember the details.
- What’s the statistical interpretation of an MLP?

Things I don’t know from theoretical ML:

- Jacobian
- the matrix of partial derivatives in a function from vectors to vectors
- Hessian
- Kalman filters
- Mean field approximation
- Legendre transform
- Tree-reweighted approximation
- Junction tree algorithm
- Turbo codes
- Thermodynamic limit
- Density free energy
- What’s the relationship between Markov random fields and the exponential family
- Marginal polytope?
- the geometric structure which consistent marginal distributions need to be on
- Is Bethe free energy always greater than Gibbs free energy, or smaller, or can it go either way?
- Why does Gibbs free energy correspond to the amount of energy you can get out of a system in ML?
- Are phase transitions (from Bethe lattices) related to anything in ML?
- What is the Dirichlet distribution
- It’s the generalization of the beta distribution from 2 categories to n
- What does “linear Gaussian” mean
- What are a bunch of sufficient statistics for the exponential family?
- more generally, what does “sufficient statistic” mean
- How do you do that MLE thing for exponential family distributions in general?
- how does it work for graphical models?
- Can you do some kind of dynamic programming thing where you find the best causal model of a bunch of data by mixing over PGMs of different connectivities?
- How do conjugate priors work in general for the exponential family?
- Naive Bayes
- What’s the connection between [Markov random fields and the exponential family?](https://en.wikipedia.org/wiki/Markov_random_field#Exponential_family)
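
To make the Jacobian definition above concrete, here is a small sketch that approximates it with central finite differences. The helper `jacobian` and the polar-to-Cartesian test function are illustrative choices, not anything from a particular library.

```python
import math


def jacobian(f, x, eps=1e-6):
    """Approximate the Jacobian of f at x by central differences.

    Entry [i][j] is the partial derivative of output i with respect to input j.
    """
    columns = []
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        columns.append([(p - m) / (2 * eps) for p, m in zip(f_plus, f_minus)])
    # Each entry of `columns` holds derivatives w.r.t. one input; transpose to rows.
    return [list(row) for row in zip(*columns)]


def polar_to_cartesian(v):
    r, theta = v
    return [r * math.cos(theta), r * math.sin(theta)]


J = jacobian(polar_to_cartesian, [2.0, 0.0])
print(J)
```

For this map the exact Jacobian is $$\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}$$, which at $$r = 2, \theta = 0$$ is the diagonal matrix with entries 1 and 2.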

## Biology

- What exactly is protein folding?
- What exactly is Adenosine triphosphate (ATP)? How does it work?
- I should more clearly understand how DNA works. It’s something like:
- DNA gets turned into RNA somehow
- You use ribosomes to express the DNA as proteins somehow?
- The base pairs code for proteins by using three consecutive letters to give you one of twenty amino acids. How does this work?
- How does DNA get reproduced?
- How do we sequence DNA?
- What precisely do they do in bioinformatics?
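
A sketch of the "three consecutive letters give one amino acid" step above, assuming the standard genetic code and reading the DNA coding strand directly (so the transcription to RNA is skipped, and T appears where the mRNA would have U). Amino acids are written with their one-letter codes and `*` marks a stop codon. The function name `translate` and the example sequence are made up.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each of the three positions.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA string codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))
```

There are 64 codons but only twenty amino acids plus stop, so most amino acids are encoded by several codons.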

## Engineering

- Materials science
- Why are some things brittle, vs being ductile or malleable? (My guess is that it’s the valence electrons that allow this—as you move the metal, you have to shift it between local minima in the energy of the structure, which is why it resists you moving it, but why it doesn’t shatter or crack.)
- What shapes of crystals do different things have? Why? What effects does this have on their strain resistance etc?
- How do you model strain in materials? I have a guess but I don’t know if it’s correct. (My guess is that you describe the strain energy in the material as something like a quadratic function in the distance it is from equilibrium in some sense.)
- Why is it that if you break an object and then push the two halves together, it doesn’t fuse?
- My guess is that this is because there are tiny little irregularities in the material which don’t match when you put it back together. Or maybe it oxidizes quickly.
- Why don’t cells break when I squeeze my fingers together? Intuitively it seems like they’re small so they should be fragile. I think the intuition is something like “but the force is spread over many many of them”
- Electrical engineering
- I want to understand all of the components of a computer, both functionally (as in, how to model their behavior) and from a physical perspective. Here are some components, but there are probably more:
- capacitor
- resistor
- inductor
- diode
- transistor
- MOSFET?
- DMZ told me that there’s some neat duality between resistance and something else. Or between current and voltage or something. I would love to know what this is.

## Philosophy

- I can never remember the names for the different metaethical views. I should memorize those.
- What are “dialectics”? Is it just like debate except you’re trying to not be dishonestly persuasive?
- Hegelian dialectics? What really is “thesis, antithesis, synthesis”?

## History

- It seems like the American internment of Japanese during World War 2 was totally unhelpful for national security (though also not harmful). Why did it happen anyway? Possibilities:
- People in the government thought it was a good idea
- People in the government didn’t really think about whether it would be helpful and were just reacting emotionally to their fear of the Japanese
- It was extremely popular and the government gave in to public pressure
- Questions about WW2:
- I would like a better picture of how the strategy and tactics of WW2 warfare worked
- Before the rise of the Nazis, how popular was antisemitism?
- How did Germans feel about the Holocaust when they heard about it?
- How did the Nazis feel about war crimes as they were committing them?
- To what extent did Truman understand that Hiroshima was a city rather than a military base?
- Why did Americans engage in strategic bombing? How strategically reasonable does this look in retrospect?
- Why didn’t strategic bombing get banned before WW2?
- How did the attitude of the Japanese towards other Asians compare to the attitude of Nazis to Slavs?
- What’s the history of German, Slavic, and Japanese attitudes to other races?
- My impression of WW2 is that the US and UK were pretty morally upstanding compared to other nations, with the main stain on their reputation coming from their bombing of civilians in cities. Am I missing something here?

## Computer science

- If I want to hear someone better when they’re talking to me on Skype, does it help if I turn off my video? What about if they turn off theirs?
- Programming language theory
- Generalised Algebraic Data Types
- Existential quantification
- Rank n types
- Liquid Haskell
- Dependent types
- Refinement types
- what do these not have that dependent types have
- Lots of higher-kinded types: I’m solid on monads, but need to think more about comonads, applicatives, etc
- Lambda-pi calculus

## International relations/political science

- theoretical IR stuff:
- Why did people argue for realism over neoliberalism?
- How much empirical evidence is there for constructivism?
- How do leaders generally value their own power vs the welfare of their nation?
- How much has the defense-dominance vs offense-dominance of military technology changed over time?
- Is Japan purposefully pretending to have a weaker economy than it actually has, to prevent other countries from imposing tariffs on it out of fear? Someone told me this and I don’t know if it’s true.
- How strong a democracy is America compared to other countries? What metrics could we use? I want this because it would be interesting to see if this allows predictions of future stability. Eg if the probability of a civil war in a given year is equal to one tenth of the probability that the head of the government lies to the judicial branch of the government that year, then we could use the more common event to forecast the less common event. Possible metrics:
- Rate of civil wars/coups
- Rate of rigging elections
- Rate of federal government lying to other branches of government. I know of two cases: Nixon and the sketchiness around Japanese internment. How many more are there?

## Physics

- Classical mechanics
- I don’t totally understand Lagrangian mechanics. I forget the Euler-Lagrange equation.
- I don’t know much about orbital mechanics
- Legendre transforms
- I think the most important hole is that I don’t really understand where Poisson brackets come from in Hamiltonian mechanics
- Cosmology
- I don’t quite know how we measure some stuff.
- Things I think I know: We know how far away stars are because many of them happen to be a very similar size to each other? (Why is that?) And then we know how fast they’re going because they have hydrogen spectra that are redshifted.
- Why did our solar system form planets instead of remaining a cloud of stuff? Why does mass congeal?
- I should read the first few chapters of the Loeb book about this
- How do we know the age of the universe?
- I think you can guess it by looking at how fast things are moving away from you as a function of how far away they are, and then extrapolating to see when they would have all been in the same place.
- What is the distribution of baryonic matter? This one is absurdly hard to look up.
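
The extrapolation guess above (run the recession backwards until everything is in one place) amounts to taking $$t \approx 1/H_0$$. Here is a rough sketch of that estimate, assuming a constant expansion rate and a round value of $$H_0 = 70$$ km/s/Mpc. Both are simplifying assumptions, and real age estimates account for the expansion rate changing over time.

```python
KM_PER_MPC = 3.0857e19      # kilometres in one megaparsec
SECONDS_PER_YEAR = 3.156e7  # approximate length of a year in seconds


def hubble_time_gyr(h0_km_s_mpc):
    """Return 1 / H0 in billions of years, for H0 given in km/s/Mpc."""
    seconds = KM_PER_MPC / h0_km_s_mpc
    return seconds / SECONDS_PER_YEAR / 1e9


age = hubble_time_gyr(70.0)
print(f"1/H0 is about {age:.1f} billion years")
```
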
- Special relativity
- What is the question whose answer is preserved by Lorentz boosts and rotations?
- The literal answer is “what is the time squared minus distance squared between the two events, where time is converted to distance by multiplication by the speed of light”. This follows from wanting the speed of light to appear as the same from all coordinate frames.
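
A quick numerical check of the invariant described above, assuming one space dimension and units where $$c = 1$$. The event coordinates and boost velocity are arbitrary illustrative values.

```python
import math


def boost(t, x, v, c=1.0):
    """Lorentz boost of the event (t, x) into a frame moving at velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    t_new = gamma * (t - v * x / c ** 2)
    x_new = gamma * (x - v * t)
    return t_new, x_new


def interval_squared(t, x, c=1.0):
    """Time squared minus distance squared, with time converted to distance by c."""
    return (c * t) ** 2 - x ** 2


t, x = 3.0, 1.0
t_boosted, x_boosted = boost(t, x, v=0.6)
print(interval_squared(t, x), interval_squared(t_boosted, x_boosted))
```

The coordinates change under the boost, but the two printed intervals agree up to floating-point error.
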
- General relativity
- I’m on the path to understanding this; I just need to read the relevant chapters of “The Physical World” more carefully. Example concepts that I am currently shaky on:
- Christoffel symbols
- Gaussian curvature
- What’s the Gaussian curvature of a quadratic bowl
- Riemann curvature tensor
- Ricci tensor
- Einstein tensor
- Einstein field equation
- Stress-energy tensor
- Ricci curvature
- https://en.wikipedia.org/wiki/Scalar_curvature "In general relativity, the scalar curvature is the Lagrangian density for the Einstein–Hilbert action. The Euler–Lagrange equations for this Lagrangian under variations in the metric constitute the vacuum Einstein field equations, and the stationary metrics are known as Einstein metrics. The scalar curvature of an n-manifold is defined as the trace of the Ricci tensor, and it can be defined as n(n − 1) times the average of the sectional curvatures at a point."
- Linearized Einstein field equations
- Gauss-Bonnet theorem

- After I learn more GR, here are some questions I want to be able to answer:
- Given a mass distribution, how do I calculate the geodesic of a light ray? Eg, calculate the path taken by light near a star.
- Also, how do I calculate the geodesic of a small test mass?
- Does the path of an object get closer and closer to the path of light as speed gets closer to c?
- How do I calculate the paths of planets with first-order corrections from GR? (Or second order corrections.)
- How much of a correction do I need in order to see gravitational waves?
- Does GR make sense in 2D?
- Quantum field theory
- What’s the 4-momentum of a photon?
- Relatedly, what’s the conjugate momentum of the Maxwell field?
- Under what conditions does a Lagrangian ensure that the L2 norm of a field stays constant over time
- I think this is just an easy Noether’s theorem application? I think it might be something like “if there’s a phase symmetry in this field, the total amount of thing stays constant”. But I don’t really know.
- What is gauge invariance?
- It’s something like “This field seems physically real, but everything is the same if the field has a constant factor added”. Or something like that. People seem to think it’s very physically important and I don’t understand why.
- Gauge covariant derivatives
- What does it mean when the Hamiltonian “couples” to something?
- I think it means that there’s a term in the potential energy part of the Hamiltonian that relates the two things
- What does positive definiteness mean; why does energy have to be positive definite? P&S p77
- How does statistics arise from spin?
- this is covered in P&S in chapter 4
- Why do you get the Dirac equation for spin 1/2 things, but the Klein-Gordon equation for spin 0 things? What do you get for things with spins other than that? Is there some more general equation that takes a spin and gives you the right equation?
- "The Hamiltonian and Lagrangian which are rather abstract constructions in classical mechanics get a very simple interpretation in relativistic quantum mechanics. Both are proportional to the number of phase changes per unit of time. The Hamiltonian runs over the time axis while the Lagrangian runs over the trajectory of the moving particle, the t’-axis." from here
- Why do same-charge particles repel each other in electromagnetism but attract in gravity?
- From a different page: "See Zee (Quantum Field Theory in a Nutshell), Chapter 1.5, for a complete discussion."
- Can you have stable muons inside a white dwarf? I don’t understand the arguments here.
- I want to have a list of all possible types of interaction in QFT, a la the discussion in P&S section 4.1
- Nonrelativistic quantum mechanics
- I still don’t totally get Slater determinants
- Higher-order perturbation theory?
- Why do interactions between electrons and photons have to take particles to eigenstates?
- it’s something about how matrices diagonalize or something
- I should revise the use of time-dependent perturbation theory to predict transition probabilities
- Interaction part of the Hamiltonian
- Ladder operators for the harmonic oscillator
- Scattering
- Rotation group as applied to the quantum mechanics of spin
- Zeeman effect
- canonical commutation relations—why do some people start their explanation of QM with these?
- Ah, I understand this better now—it naturally follows from the Poisson brackets in Hamiltonian classical mechanics. But I still don’t really know where these come from. In general I don’t understand the Heisenberg picture very well.
- After I understand this properly, I should look at the justification for doing QFT with canonical quantization
- Why does Galilean relativity lead to conservation of mass?
- Quantum Monte Carlo methods
- This is an approach where you directly simulate the many-electron system. Instead of trying to explicitly represent the complete many-body wavefunction, you sample it.
- Variational Monte Carlo
- One of the largest gains in accuracy over writing the wave function separably comes from the introduction of the so-called Jastrow factor. In this formulation the wave function assumes independence except that it incorporates a term that depends on the distance between two particles in a configuration whose form is to be determined.
- Diffusion Monte Carlo
- "DMC is potentially numerically exact, meaning that it can find the exact ground state energy within a given error for any quantum system. When actually attempting the calculation, one finds that for bosons, the algorithm scales as a polynomial with the system size, but for fermions, DMC scales exponentially with the system size. This makes exact large-scale DMC simulations for fermions impossible; however, DMC employing a clever approximation known as the fixed-node approximation can still yield very accurate results."
- “Chemical accuracy”: this is $$k_B T$$ at 300 K, about 0.026 eV
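
A toy sketch of the "sample the wavefunction instead of representing it" idea behind variational Monte Carlo. To keep it short it assumes a single particle in a 1D harmonic oscillator (units with $$\hbar = m = \omega = 1$$) rather than a many-electron system, with the trial wavefunction $$\psi_\alpha(x) = e^{-\alpha x^2}$$ and no Jastrow factor. The function names, step size and sample count are arbitrary choices.

```python
import math
import random


def local_energy(x, alpha):
    """(H psi) / psi for psi = exp(-alpha x^2) and H = -1/2 d^2/dx^2 + 1/2 x^2."""
    return alpha + x * x * (0.5 - 2.0 * alpha * alpha)


def vmc_energy(alpha, n_steps=20000, step_size=1.0, seed=0):
    """Estimate the energy of the trial state by Metropolis sampling of |psi|^2."""
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step_size, step_size)
        # Acceptance ratio |psi(proposal)|^2 / |psi(x)|^2.
        ratio = math.exp(-2.0 * alpha * (proposal * proposal - x * x))
        if rng.random() < ratio:
            x = proposal
        total += local_energy(x, alpha)
    return total / n_steps


for alpha in (0.3, 0.4, 0.5, 0.6):
    print(alpha, vmc_energy(alpha))
```

For this trial family the exact expectation is $$\alpha/2 + 1/(8\alpha)$$, minimised at $$\alpha = 1/2$$, where the trial state is the true ground state and the local energy is the constant 1/2 with zero variance.
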
- I still don’t know why things look different. That is: white, black, green, transparent, reflective.
- Part of the story is that a photon can only be absorbed if it corresponds to a gap in energy levels. In glass, the band gap is big and so visible light isn’t energetic enough to knock an electron into an excited state.
- But this would predict that metals would be black. ???
- I think that the search term “electromagnetic response” is part of the key here.
- You can allegedly use the Kubo formula for calculating the linear response of an observable quantity from a perturbation.
- Density functional theory
- What exactly is the effective Hamiltonian? Am I right that the electronic density function is a function from position to $$\mathbb{R}$$?
- What behavior is present in the approximations made in DFT in practice? How much of normal chemistry is present?
- Density matrices
- Condensed matter physics
- I basically don’t know any of this stuff.
- I’m told Fermi liquid theory is cool.
- I was recommended "Fundamentals of Condensed Matter Physics" by Cohen and Louie
- Nuclear physics
- There’s a shell model and a liquid drop model. The shell model takes into account the fact that the nucleons have to occupy states with energy that increases quadratically and capacity that increases according to $$2n^2$$. But these models aren’t super solid. How inaccurate are they? Is the inaccuracy just because we don’t know how the strong nuclear force works? How did we try to model them in the 1940s?
- I now understand these better. In particular I understand that the models are semi-empirical—we don’t even try to model the interactions from first principles, we just try to fit empirical parameters into our models to match reality as well as we can.
- How do atomic bombs work? How complicated are the rules required to see what reactions are possible?
- How good is the liquid drop model at predicting all observed binding energies?
- What is enriched uranium?
- Computing stuff
- I still don’t understand the rules about fundamental limits on computation. How does the Landauer limit work? Why is entropy a thing? I know it’s something to do with the unitarity of physics but I don’t quite understand it.
- I now understand this
- I don’t understand what negentropy is
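
For a sense of scale on the Landauer limit mentioned above: the minimum energy dissipated when erasing one bit at temperature $$T$$ is $$k_B T \ln 2$$. A small sketch evaluating it, with room temperature of 300 K as an assumed example value:

```python
import math

K_B = 1.380649e-23               # Boltzmann constant, J/K
JOULES_PER_EV = 1.602176634e-19  # one electronvolt in joules


def landauer_limit_joules(temperature_kelvin):
    """Minimum energy dissipated to erase one bit at the given temperature."""
    return K_B * temperature_kelvin * math.log(2)


energy = landauer_limit_joules(300.0)
print(f"{energy:.3e} J per bit, or {energy / JOULES_PER_EV:.4f} eV per bit")
```
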
- Quantum computing
- I basically don’t remember anything about quantum computing
- Acoustics
- Why does Jeff Kaufman’s pipe sound like it does? What is the physics that tells me how to guess the resonant frequency of any pipe system? What is the analogy between air pipe resonance and circuit resonance?
- Helmholtz resonance
- Misc
- How do lasers work?
- How do LEDs work?
- How do semiconductors work?
- Crazy speculative stuff
- Why don’t magnetic monopoles exist? What would happen if they did?