Transcript Editor

    Built with: HTML, CSS, JavaScript, Python, FFmpeg, Django

    Inspiration

    Have you ever tried editing a video and wanted to remove a mistake you made while recording, or cut out some awkward pauses? This is a very common video-editing task, yet it is far more difficult than it needs to be. When cutting specific words or phrases from a video, you often have to replay the footage again and again until you find exactly where a word begins and where it ends. With our Transcript web application, we wanted to make editing these small portions of a video much easier.

    What it does

    Our Transcript web application is designed to make video editing much easier. Once a user uploads a video, an interactive transcript of the video is generated, with each word linked to the specific portion of the video where it is spoken. When the user clicks on any word, the video automatically jumps to the moment that word is first said. There is also an editing mode for the transcript that lets the user select the words they wish to delete. Once the user is finished making changes, they can save the transcript, and every deleted word is cut from the video. Additionally, a timeline of the video allows for editing larger portions, giving the user more than one way to edit.
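
    To make the word-level editing concrete, here is a minimal sketch of the kind of data the interactive transcript can be built from, and of how deleted words might translate into the portions of the video to keep. The field names, timings, and helper function are illustrative assumptions rather than our exact implementation.

```python
# Sketch of the transcript data model: each word carries the time range
# (in seconds) in which it is spoken. Field names are illustrative.
transcript = [
    {"word": "Hello",    "start": 0.00, "end": 0.42},
    {"word": "everyone", "start": 0.42, "end": 0.95},
    {"word": "um",       "start": 0.95, "end": 1.30},  # filler the user deletes
    {"word": "welcome",  "start": 1.30, "end": 1.85},
]

def keep_segments(words, deleted_indices, video_end):
    """Turn a set of deleted word indices into the time ranges to keep."""
    segments, cursor = [], 0.0
    for i, w in enumerate(words):
        if i in deleted_indices:
            if w["start"] > cursor:
                segments.append((cursor, w["start"]))
            cursor = w["end"]
    if cursor < video_end:
        segments.append((cursor, video_end))
    return segments

# Deleting the filler word "um" (index 2) keeps everything around it:
print(keep_segments(transcript, {2}, video_end=1.85))  # [(0.0, 0.95), (1.3, 1.85)]
```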

    How we built it

    For the front-end, we used HTML, CSS, and JavaScript, and for the back-end, we used Python and Django. The model used to transcribe the video is a fork of the OpenAI Whisper model, and FFmpeg handles the video processing.
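
    As a rough sketch of that pipeline, the snippet below shows how word-level timestamps can be pulled from Whisper and how FFmpeg can extract and rejoin the segments to keep. The exact transcription call depends on the fork used; the word_timestamps option of the standard openai-whisper package is shown here as an approximation, and the paths, model size, and encoder settings are placeholders.

```python
import subprocess
import whisper  # openai-whisper; the fork we used exposes a similar interface

def transcribe_words(video_path):
    """Return a flat list of {word, start, end} dicts for the whole video."""
    model = whisper.load_model("base")
    result = model.transcribe(video_path, word_timestamps=True)
    words = []
    for segment in result["segments"]:
        for w in segment.get("words", []):
            words.append({"word": w["word"].strip(),
                          "start": w["start"],
                          "end": w["end"]})
    return words

def cut_video(video_path, segments, output_path):
    """Extract each (start, end) segment to keep, then concatenate them."""
    parts = []
    for i, (start, end) in enumerate(segments):
        part = f"part_{i}.mp4"
        # Re-encode so the cuts land on exact timestamps, not just keyframes.
        subprocess.run(["ffmpeg", "-y", "-i", video_path,
                        "-ss", str(start), "-to", str(end),
                        "-c:v", "libx264", "-c:a", "aac", part], check=True)
        parts.append(part)
    with open("parts.txt", "w") as f:
        f.writelines(f"file '{p}'\n" for p in parts)
    subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0",
                    "-i", "parts.txt", "-c", "copy", output_path], check=True)
```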

    Challenges we ran into

    The main challenge we faced when developing this project was getting all of its moving parts to work together. For example, to generate the transcript, we had to receive the transcription from our Python back-end, transfer it to JavaScript, walk through the data to extract the start and end time of each word, and finally render it all in HTML for the user to interact with. There was also the engineering of the back-end: we had to keep track of uploads and edited videos, assigning each a unique ID so it can be tracked. Additionally, there was no ready-made JavaScript library that generated the kind of timeline we needed, so we built it from scratch. In all, these challenges taught us many lessons in web development that we expect to use in the future.
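
    The upload tracking mentioned above can be pictured with a small Django view sketch: each upload is assigned a UUID, the file is stored under that ID, and the word-level transcript is returned as JSON for the front end to render. The view name, storage layout, and the transcribe_words helper (from the sketch above) are hypothetical, not our exact code.

```python
import uuid

from django.core.files.storage import default_storage
from django.http import JsonResponse
from django.views.decorators.http import require_POST

from transcription import transcribe_words  # hypothetical module holding the Whisper helper above

@require_POST
def upload_video(request):
    """Store an uploaded video under a unique ID and return its transcript."""
    video = request.FILES["video"]
    video_id = uuid.uuid4().hex                       # tracks this upload end to end
    path = default_storage.save(f"uploads/{video_id}.mp4", video)

    # Transcribe the stored file and hand the word timings to the front end.
    words = transcribe_words(default_storage.path(path))
    return JsonResponse({"id": video_id, "words": words})
```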

    Accomplishments that we're proud of

    We are proud to have a fully working project that does everything we planned. We were able to create a unique solution to a common problem in video editing: being able to cut a video by selecting individual words or phrases removes much of the tedious scrubbing that traditional editing requires.

    What we learned

    During the development of Transcript, we learned many valuable lessons. We learned how to work with different programming languages and how to integrate different components of a web application. We also learned how to work with speech recognition technology and how to use machine learning models to transcribe video. Finally, we learned how to create a timeline for video editing from scratch.

    What's next for Transcript

    Firstly, we wish to make the video editing process more seamless by reducing loading time. We also want to extend the timeline so the user can rearrange portions of the video and drag in additional video or audio clips. Further down the road, we want to add deepfake generation: using an AI model to produce video of a person saying the words in a script, animating not only the lips but also the head to make it more realistic. This would be a great feature to build, and we believe it is very feasible to implement.

    Project Team

    Mohammad Alshaikhusain
