Face Emoji Week 5 - Choose your path!

    By AI Club on 3/17/2025

    Hello all! Starting this week, we are going to leave you on your own for a bit to make progress on this project. We have about 3 weeks left in this project. Going forward, we need you to build out the project as you see fit. We are going to provide two pathways. You need to pick one, weighing the pros and cons of each path, and code out the project that way.

    Path 1: Train PyTorch on an emotion-based image dataset

    In this path, you will:

    • Gather a facial expression dataset from the internet. The dataset will contain images of different facial expressions like "angry", "sad", "happy", etc.

    • Start small and train a PyTorch NN on those images, using the expressions as the labels.

    • Once trained, you will use the camera and MediaPipe to detect the faces in frame. However, you will NOT be making use of the facial landmark features. Instead, for each face you detect, you will crop it and send that crop as an image to your PyTorch model to detect the expression, as in the sketch below.
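
    To give you an idea, here is a rough, hypothetical sketch of what that inference loop could look like. The model file name (emotion_cnn.pt), the label list, and the 48x48 grayscale input size are all placeholder assumptions you would swap for whatever you actually train:

    ```python
    import cv2
    import mediapipe as mp
    import torch

    # Hypothetical label list - must match the order you used during training
    EMOTIONS = ["angry", "sad", "happy"]

    # Hypothetical file name for your trained Path 1 model
    model = torch.load("emotion_cnn.pt", weights_only=False)
    model.eval()

    detector = mp.solutions.face_detection.FaceDetection(min_detection_confidence=0.5)
    cap = cv2.VideoCapture(0)

    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = detector.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        for det in results.detections or []:
            # MediaPipe gives the box in coordinates relative to the frame size
            box = det.location_data.relative_bounding_box
            h, w, _ = frame.shape
            x, y = max(int(box.xmin * w), 0), max(int(box.ymin * h), 0)
            face = frame[y:y + int(box.height * h), x:x + int(box.width * w)]
            if face.size == 0:
                continue
            # Resize the crop to the training resolution, otherwise the model
            # will see inputs it was never trained on
            gray = cv2.cvtColor(cv2.resize(face, (48, 48)), cv2.COLOR_BGR2GRAY)
            inp = torch.from_numpy(gray).float().div(255).view(1, 1, 48, 48)
            with torch.no_grad():
                label = EMOTIONS[model(inp).argmax(dim=1).item()]
            cv2.putText(frame, label, (x, max(y - 10, 20)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
        cv2.imshow("Face Emoji", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()
    ```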

    Pros

    This would be an easier route just because the datasets are readily available on the internet. You can find many datasets on Kaggle and elsewhere that have images of people expressing different emotions. So, it would be quicker to implement this path.

    Cons

    However, training the model on images might take a while, and you might need a computer with a good (Nvidia) GPU for training. Otherwise, I would recommend starting small with maybe 5 or fewer images for each expression. Training on a small dataset, though, also means your model will have low overall accuracy, which may negatively affect your final product and your competitiveness in the competition. Moreover, since you need to send each face image through your model, it might also take a bit of time to detect the emotion. Finally, when cropping the face from the camera feed, you need to make sure to resize it so that it matches your training image resolution; otherwise, your model will spit out wrong detections.

    Path 2: Train on the MediaPipe facial landmarks directly

    In this path, you will:

    • Gather facial expression images from the internet, EXACTLY like in Path 1.

    • Instead of training the PyTorch model on the images you collected, you will run MediaPipe on those images and extract the facial landmarks. You will also need to create the labels for the landmarks. Then, you will feed the landmarks and the labels to PyTorch to train it.

    • Once trained, you can simply use the camera and MediaPipe to detect faces. Then, extract landmarks in real time from the faces and feed those landmarks to the model you have trained, as in the sketch below.
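
    Here is a rough, hypothetical sketch of that real-time loop. It assumes you have saved a trained model as landmark_mlp.pt that takes a flat 468 × 3 = 1,404 vector; again, the file name and label list are placeholders:

    ```python
    import cv2
    import mediapipe as mp
    import torch

    EMOTIONS = ["angry", "sad", "happy"]  # must match your training label order
    model = torch.load("landmark_mlp.pt", weights_only=False)  # hypothetical file
    model.eval()

    mesh = mp.solutions.face_mesh.FaceMesh(max_num_faces=1)
    cap = cv2.VideoCapture(0)

    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        for face in results.multi_face_landmarks or []:
            # Flatten the 468 (x, y, z) landmarks into a single 1,404-value vector
            coords = [c for lm in face.landmark for c in (lm.x, lm.y, lm.z)]
            inp = torch.tensor(coords).unsqueeze(0)  # shape (1, 1404)
            with torch.no_grad():
                label = EMOTIONS[model(inp).argmax(dim=1).item()]
            cv2.putText(frame, label, (30, 40),
                        cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        cv2.imshow("Face Emoji", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()
    ```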

    Pros

    This would be much faster than the previous path. Firstly, training might be faster since you are training on the facial landmarks directly instead of the images, so you can probably train on a larger number of images. Secondly, real-time emotion detection will also be much faster, since you will be feeding MediaPipe facial landmarks to your PyTorch model rather than whole images.

    Cons

    The only con of this route is that it involves an additional step. For training, you will have to:

    • Run MediaPipe on all the images you collect and extract the landmarks.

    • Make a list of all the corresponding labels (happy, sad, angry, etc.).

    • Feed the PyTorch model the landmark data and the labels. A sketch of this preprocessing loop follows.
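
    Here is a rough sketch of what that preprocessing loop could look like, assuming your images are organized as dataset/<emotion>/<image>.jpg so the folder names double as labels (adjust the paths to match whatever dataset you download):

    ```python
    import os

    import cv2
    import mediapipe as mp
    import torch

    # static_image_mode=True tells MediaPipe to treat every image independently
    mesh = mp.solutions.face_mesh.FaceMesh(static_image_mode=True, max_num_faces=1)

    features, labels = [], []
    emotions = sorted(os.listdir("dataset"))  # folder names double as labels
    for idx, emotion in enumerate(emotions):
        folder = os.path.join("dataset", emotion)
        for name in os.listdir(folder):
            img = cv2.imread(os.path.join(folder, name))
            if img is None:
                continue  # skip non-image files
            results = mesh.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
            if not results.multi_face_landmarks:
                continue  # skip images where MediaPipe found no face
            lms = results.multi_face_landmarks[0].landmark
            features.append([c for lm in lms for c in (lm.x, lm.y, lm.z)])
            labels.append(idx)

    # Save tensors so the training script can load them in one call
    torch.save({"x": torch.tensor(features), "y": torch.tensor(labels)},
               "landmarks.pt")
    print(f"Saved {len(labels)} samples across {len(emotions)} emotions")
    ```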

    To be fair, this is not much more work or more difficult than the first path, and the speed gains you will get might make you more competitive for the competition.

    Moving on

    You might be wondering: both the images and the facial landmarks are just numbers at the end of the day when the model trains on them, so what's the difference? Well, look at the sizes:

    Images:

    • A typical webcam image might be 640x480 pixels with 3 color channels

    • This means 640 × 480 × 3 = 921,600 values (about 1MB uncompressed)

    • Even a small 64x64 face crop is still 12,288 values

    MediaPipe Landmarks:

    • MediaPipe's face mesh has 468 landmarks

    • Each landmark has 3 coordinates (x, y, z)

    • This means 468 × 3 = 1,404 values (about 5.5KB as 32-bit floats)

    So a landmark vector holds roughly 650 times fewer values than a full webcam frame, and still nearly 9 times fewer than even a 64x64 face crop.
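
    You can sanity-check those numbers yourself:

    ```python
    frame = 640 * 480 * 3      # 921,600 values per full webcam frame
    crop = 64 * 64 * 3         # 12,288 values per small face crop
    landmarks = 468 * 3        # 1,404 values per face mesh

    print(frame / landmarks)   # ~656: a full frame vs. one landmark vector
    print(crop / landmarks)    # ~8.75: even a small crop vs. landmarks
    ```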

    So, we recommend Path 2, but you are free to choose Path 1 as well.

    Tasks for this week

    Regardless of which path you choose, the first step is the same: collecting the facial images dataset. So, your task for this week is to research online and collect the best dataset of images showing different emotions that you can find. Here is one example:

    https://www.kaggle.com/datasets/jonathanoheix/face-expression-recognition-dataset

    But please, feel free to do your own research and collect a better dataset if you find one.

    Secondly, we have already gone over the MediaPipe material where we extracted landmarks, and we have also gone over training with PyTorch. So, for your next task, you will need to train your model. For those choosing Path 1, train your PyTorch model on the images. Just train it, using as many images as you can. For those choosing Path 2, gather the images, run MediaPipe in a loop to get the facial landmarks for each image, and train the PyTorch model on those landmarks. A minimal training sketch follows.
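
    For Path 2, a minimal training sketch could look like this. It assumes the landmarks.pt file from the preprocessing sketch earlier; for Path 1 the idea is the same, just with a small CNN over image tensors instead of this MLP:

    ```python
    import torch
    import torch.nn as nn

    # Load the tensors produced by the preprocessing loop shown earlier
    data = torch.load("landmarks.pt")
    x, y = data["x"], data["y"]  # shapes: (N, 1404) and (N,)

    # A small MLP: 1,404 landmark values in, one score per emotion class out
    model = nn.Sequential(
        nn.Linear(468 * 3, 128),
        nn.ReLU(),
        nn.Linear(128, int(y.max()) + 1),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(100):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        if epoch % 10 == 0:
            print(f"epoch {epoch}: loss {loss.item():.4f}")

    torch.save(model, "landmark_mlp.pt")  # loaded by the real-time loop above
    ```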

    This week is just for gathering data and training. Beyond the rough sketches above, we will not be providing complete code for this week. Please reference code from prior weeks and online docs to code out the process. It should be very similar to what we have done in the past, just without the real-time video feed.

    Coming weeks

    Afterwards, once you are done gathering data and training, we can move on to testing the model on a real-time camera feed and seeing if emotions are detected correctly. After that, we will map those emotions to emojis. Lastly, we will set up a frontend to package the project nicely. And keep in mind: the more you do, the better your chances of winning the competition and making it to our end-of-semester showcase, where you can compete for prize money!
