SLAHMR and New Years Updates

January 7, 2025
By Aathreya Kadambi

Blog Updates

Happy New Year! I’m a bit late… unfortunately. My backlog of blog posts has grown astonishingly, all because I got slightly stuck on my second ladder operators post! I’m really close to being done, but the proof of the result is evading me. But now I have to make this one, which makes me feel guilty about the other planned ones which are nearly done but not ready yet… maybe at the end of this post I’ll sketch out some of the posts I plan to finish up and release in this next week. :-)

In any case, in the last two days, I was playing around with this thing called SLAHMR! It’s a model for pose detection and tracking, but it’s especially cool because it also models the motion of the camera itself, along with a ground plane!

In this post, I mainly want to discuss a few features I noticed of the SLAHMR model:

  • Covers Bodies Accurately
  • Slight Foot Skating Can Still Occur (e.g. if there are multiple people in the scene, and there is a complicated tradeoff in the losses by fixing a ground plane which allows for greater joint stiffness elsewhere)
  • Joint Stiffness (likely due to the optimization on the shape and pose priors)
  • Increased Stability (due to modeling the camera, we can adjust view much better)

Experiments with SLAHMR

For my experiments, I used three 3-second clips: one with my mom recording me shoveling snow, one of my brother and dad coming back from my brother’s bike race, and one of a crowd of people in SF enjoying the Fleet Week air show.

After running SLAHMR, we get the following:

The first two look pretty good! 🔥 The third one is a bit wacky though. I’ll discuss each video in a bit more detail below.

Bike Race

This one is very good! Let’s take a closer look at the input PHALP videos in comparison to the resulting ones:

The left videos are the PHALP inputs and the right ones are the output ones (you can tell by the presence of the ground plane). Here are some things I’m noticing:

Overall, Covers Bodies! The SMPL body models seem to line up with the people in the video, which is great!

Slight Foot Skating: You can see the biker (my brother) seems to experience some foot skating in the upper right video, while my dad seems to be standing still. The PHALP input seems to capture the motion of both people slightly better, although the joints seem to be bent more unnaturally. That being said, the videos on the right seem to be making good contact with the ground plane. It appears that maybe the energy penalty from the foot skating was counteracted by the better stability from my dad’s feet being stuck to the ground plane.
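A rough way to put a number on foot skating is to measure how fast a foot slides horizontally while it is supposed to be planted. The sketch below is only an illustration: the function name `foot_skate`, the convention that z is height above the ground plane, and the 5 cm contact threshold are all assumptions, not anything taken from SLAHMR.

```python
import math

def foot_skate(foot_positions, fps=30.0, contact_height=0.05):
    """Mean horizontal foot speed over steps where the foot is near the ground.

    foot_positions: list of (x, y, z) tuples, one per frame, where z is the
    height above the ground plane (an assumed convention, not SLAHMR's).
    """
    speeds = []
    for (x0, y0, z0), (x1, y1, z1) in zip(foot_positions, foot_positions[1:]):
        # Only count steps where the foot is in contact at both ends.
        if z0 <= contact_height and z1 <= contact_height:
            speeds.append(math.hypot(x1 - x0, y1 - y0) * fps)
    return sum(speeds) / len(speeds) if speeds else 0.0

planted = [(0.0, 0.0, 0.0)] * 4                      # foot stays put
sliding = [(0.1 * i, 0.0, 0.0) for i in range(4)]    # foot drifts along x
print(foot_skate(planted))             # 0.0
print(foot_skate(sliding, fps=10.0))   # roughly 1.0 (units per second)
```

A planted foot scores zero, while a foot that drifts along the ground scores its sliding speed, so a higher value would mean more visible skating.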

Joint Stiffness: It seems to me that compared to the PHALP input, the SLAHMR results tend to have stiffer leg joints. We see that here with my dad’s legs. I believe this makes sense based on the shape/pose priors and optimization.

Here are some other plots:

From left to right, top to bottom, these are the losses from Bone Length, Contact Height, Contact Velocity, and 2D Joints. I'm actually not totally sure what the box and whisker necessarily represents (perhaps over all the people in the video?) but at the very least, we see a general pattern of decay. The spikes seem to be due to the fact that we progressively increase how much of the video we are processing.
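To see why growing the processed window could produce spikes like these, here is a toy model, not SLAHMR's actual schedule. It assumes each stage adds a chunk of new frames with a fresh (large) residual, and that every optimization step halves all residuals; the function name and all the numbers are made up for illustration.

```python
def staged_loss_curve(num_frames=12, chunk=4, steps_per_stage=5,
                      decay=0.5, init_residual=1.0):
    """Toy loss trace: each stage adds `chunk` unoptimized frames, then every
    frame's residual shrinks by `decay` per step. All numbers are made up."""
    residuals = []   # one residual per frame currently in the window
    trace = []
    while len(residuals) < num_frames:
        residuals += [init_residual] * chunk    # new frames enter unoptimized
        for _ in range(steps_per_stage):
            trace.append(sum(residuals))        # total loss before this step
            residuals = [r * decay for r in residuals]
    return trace

trace = staged_loss_curve()
print(trace[4], trace[5])   # 0.25 4.125
```

The total loss decays within a stage and then jumps back up the moment new frames join the window, which gives the same sawtooth shape as a decaying curve with periodic spikes.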

Shoveling Snow

While the previous video had two subjects, this one just has one.

Overall, Covers Bodies! The SMPL body model seems to line up with the person in the video, which is great!

Joint Stiffness: Again, we see a bit of joint stiffness, but in this case it is actually a good thing, since it matches the original much better: the hand stays in better contact with the shovel.

Increased Stability: The right top video has significantly better stability than the left top one.

The plots are relatively the same in terms of interpretation. The shape prior was particularly interesting though:

I'm not sure why it seems to oscillate so smoothly like this. Perhaps it is coincidence?

Here’s one more comparison to showcase the increased stability and joint stiffness (see the legs) when we change the view:

Blue Angels

This one was particularly funny but had issues.

It’s interesting to note that here, the PHALP inputs aren’t particularly bad (except for that blue guy casually floating around hehe). I think the issue here was more that the ground plane was abnormally slanted in the original video (it’s San Francisco) and it probably also had local curvature (which ideally, shouldn’t be too much of a problem to be honest). But I think this caused the initial body models to be very offset from things like the ground plane, which probably caused a weird initial loss/gradient that just threw it on an entirely different course towards noise.

Looking at these losses we get maybe another side of the story (?): perhaps as the camera panned around, several new people were added. This is why we see an increasing step situation, and the losses likely barely decreased each time because the initial conditions themselves were not very good. In fact, I think it might be that the initial optimization step on the first chunk of the video might have interfered too much with the remaining chunks, causing everything to kind of jumble up. After all, I think pose detection/matching tends to be a sort of “jigsaw puzzle problem”, so you need to be very close to the true solution to have good convergence. I think here, the PHALP inputs were just too far off themselves to produce good SLAHMR results.
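The "need to start close to the true solution" intuition can be illustrated with a toy nonconvex objective. This is not SLAHMR's loss, just a made-up 1D function with a good minimum near x = -1 and a worse one near x = +1; the names `f`, `grad_f`, and `descend` are hypothetical.

```python
def f(x):
    # Toy nonconvex objective: global minimum near x = -1, local minimum near x = +1.
    return (x * x - 1.0) ** 2 + 0.3 * x

def grad_f(x):
    # Derivative of f.
    return 4.0 * x * (x * x - 1.0) + 0.3

def descend(x0, step=0.05, iters=200):
    """Plain gradient descent from the starting point x0."""
    x = x0
    for _ in range(iters):
        x -= step * grad_f(x)
    return x

x_good = descend(-0.5)   # starts in the basin of the better minimum
x_bad = descend(1.5)     # starts in the basin of the worse minimum
print(x_good, f(x_good))
print(x_bad, f(x_bad))
```

Both runs converge, but the one that starts on the wrong side settles into the worse minimum and stays there, which is the kind of behavior a poor initialization could cause in a much higher-dimensional pose problem.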

While I still have to read up on how PHALP works, perhaps a solution to this would therefore be to interleave steps of PHALP (detecting humans and then lifting to 3D) and SLAHMR optimization, or regenerating views each time and rerunning PHALP. These might take a while though, and perhaps other models might perform better at this particular task. Or maybe, one could change the regularization/lambda parameters mentioned in the paper, to decrease the effect or do the optimization with a lower step size.
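As a sketch of what changing the lambda parameters would mean, the snippet below treats the objective as a weighted sum of named loss terms. The term names echo the plots above, but the values, the weights, and the `total_loss` helper are all hypothetical and not taken from the SLAHMR code or paper.

```python
def total_loss(terms, weights):
    """Weighted sum of named loss terms; a term missing from `weights` gets weight 1."""
    return sum(weights.get(name, 1.0) * value for name, value in terms.items())

# Made-up per-term loss values for one optimization step.
terms = {"joints2d": 2.0, "bone_length": 0.5, "contact_height": 4.0, "contact_vel": 1.0}

default = {"joints2d": 1.0, "bone_length": 1.0, "contact_height": 1.0, "contact_vel": 1.0}
softer_ground = dict(default, contact_height=0.25)   # trust the ground plane less

print(total_loss(terms, default))        # 7.5
print(total_loss(terms, softer_ground))  # 4.5
```

Lowering one weight shrinks that term's share of the gradient, so a badly estimated ground plane would pull the bodies around less.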

Other Videos

I also tried to see if it would do anything with this video that could fuel skinwalker conspiracies:

but it didn’t, mainly just errored. I guess it’s to be expected, since the joints are all wack and the only subject isn’t human. Perhaps I’ll try to fidget around one day to get this to work.

January Planned Posts

The issue with some of these posts is that after I’ve figured out what I want to write about and lay it out in my head too well, actually writing it out becomes a pain! And also, everything kind of temporarily slowed down in early November because I decided to go back to a Pomodoro schedule for the rest of the semester.

But… now with the new year and all, I promise to be more on top of my posts 🤧, especially in this next week so that I can get it all out of the way and out of my system before the semester starts!

General Posts:

  1. Tales From Berkeley: I’ve been thinking, people should start documenting their lives as legends more in the modern day! Myths and legends from the ancient world are often just funny stories from people’s lives. I think this would make life more interesting (at the least, we do have memes 😂).

I’ll be compiling this one for a while, and I’ll post maybe next month or later!

Math Posts:

  1. Motivating Ladder Operators II: Boundedness of the spectrum of an operator might imply discreteness of the point spectrum (and maybe even the other parts of the spectrum?). I really believe in this result, even if the true one might have extra conditions.
  2. What is dx: Finally, I think I’ve come to better understand variational methods, what differentials really mean, forces, energy, and gradient descent. Lots of tangents (oh boy, this might be a long post).
  3. Rectified Flow: I’ve recently been interested in optimal transport, and I checked out the TRELLIS paper and the one on rectified flow! They’re amazing!
  4. Krylov Methods: BiCGStab and all the other numerical methods for solving systems have always been a bit of a jumble in my head, but apparently, they’re all special cases of Krylov methods!
  5. Kernel Methods and Mercer’s Theorem: A proof of Mercer’s theorem which we learned in my ML class, and then Sturm-Liouville equations.
  6. USD: An apologetic post about why USD is actually designed extremely well.
  7. Words I Pretend to Know: A confession about several words I use a lot but secretly don’t actually understand! 😂 Of course, now I do though.
  8. Graph Theory in n Dimensions: Simplicial complexes and generalizing graph theoretic analyses to them! Perhaps there are many more generalizations of things like the Euler characteristic, and if we take all combinations of the different counts of n-dimensional cells of a simplicial complex (moments being things like vertices, edges, faces, etc.) what do we get?

The good thing is, math posts 1 through 3, 5, and 6 are well on their way to being done. These are the ones I can hope to finish this week or next. 4, 7, and 8 might take a while though.



As a fun fact, it might seem like this website is flat because you're viewing it on a flat screen, but the curvature of this website actually isn't zero. ;-)

Copyright © 2024, Aathreya Kadambi

Made with Astrojs, React, and Tailwind.