As mentioned previously, my ICM finals was to design a Pose-Karaoke experience. For my motivations and background on the project, please go to the post here.
While I had major issues with getting the ICM code to work on the ml5js platform, much of it has been rectified by the maintainers of the code-base and this current example solves pretty much all the problems.
But this was the easy part.
While starting the project, I did not realise the complexity of comparing 2 poses. A lot of what I had to do relied on being able to compare 2 images and it wasn’t a trivial problem. Trawling through the depths of the internet, I came across this post by Google research where they had worked on a similar problem. This post is a wealth of information on how to compare poses and it was outside my technical ability to be able to incorporate everything in my work. But the chief things that I could incorporate were:
1) Cosine similarity: It is a measure of similarity between two vectors: basically, it measures the angle between them and returns -1 if they’re exactly opposite, 1 if they’re exactly the same. Importantly, it’s a measure of orientation and not magnitude.
2) L2 normalization: which just means we’re scaling the vector to have a unit norm. This helps in ensuring that the scale does not play a factor in comparison and the 2 images can be compared normally.
The cosine similarity helped my code run faster and the L2 normalization ensured that the relative distance from the camera won’t play a role in the comparison.
Getting these 2 things to work proved to be a big challenge and once that was done, the comparison went pretty smoothly as seen in the video below:
I ran out of time to build a complete experience for the users which involve an engaging UI but that gives me something to do for the winter break. While I could not match the scope I had set initially, I am very happy that I could dive into algorithmic complexities and solve those issues to make something working. This gives me a lot of hope for the future and my coding abilities. All in all, time well spent!