1. Prompt it to extract the audio track, send the audio to a speech-to-text API, translate the transcript into another language, then add the translation back to the video file as a subtitle track.
2. Retrain the model so that it does this implicitly when you say "hey can you add Portuguese subtitles to this for me?"
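For what it's worth, option 1 written out by hand might look roughly like the sketch below. This is only an illustration under assumptions: it assumes `ffmpeg` is installed and on the PATH, that the output is a Matroska (.mkv) file, and that `transcribe` and `translate` are hypothetical callables you supply yourself to wrap whatever speech-to-text and translation APIs you use. None of those names come from a real library.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable

# One subtitle cue: (start_seconds, end_seconds, text)
Segment = tuple[float, float, str]


def extract_audio_cmd(video: Path, audio: Path) -> list[str]:
    # -vn drops the video stream; mono 16 kHz WAV is an assumed,
    # commonly accepted input format for speech-to-text services.
    return ["ffmpeg", "-y", "-i", str(video), "-vn",
            "-ac", "1", "-ar", "16000", str(audio)]


def srt_timestamp(seconds: float) -> str:
    # SubRip timestamps look like HH:MM:SS,mmm
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def to_srt(segments: list[Segment]) -> str:
    # Each cue is: index, time range, text, separated by a blank line.
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


def mux_subtitles_cmd(video: Path, srt: Path, out: Path, lang: str) -> list[str]:
    # Copy the original video and audio streams untouched and add the
    # SRT file as a subtitle stream tagged with a language code.
    return ["ffmpeg", "-y", "-i", str(video), "-i", str(srt),
            "-map", "0:v", "-map", "0:a", "-map", "1:0",
            "-c", "copy", "-c:s", "srt",
            "-metadata:s:s:0", f"language={lang}", str(out)]


def add_subtitles(
    video: Path,
    out: Path,
    transcribe: Callable[[Path], list[Segment]],  # hypothetical STT wrapper
    translate: Callable[[str, str], str],         # hypothetical: (text, lang) -> text
    lang: str = "por",
) -> None:
    with tempfile.TemporaryDirectory() as tmp:
        audio = Path(tmp) / "audio.wav"
        srt = Path(tmp) / "subs.srt"
        subprocess.run(extract_audio_cmd(video, audio), check=True)
        segments = transcribe(audio)
        translated = [(s, e, translate(t, lang)) for s, e, t in segments]
        srt.write_text(to_srt(translated), encoding="utf-8")
        subprocess.run(mux_subtitles_cmd(video, srt, out, lang), check=True)
```

You would call it as something like `add_subtitles(Path("talk.mp4"), Path("talk.mkv"), my_transcribe, my_translate)`, where both helper functions are placeholders for real API calls. The point is that every step here is glue code, which is exactly the part a prompted model can now write or drive on its own.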
I don't have words for how trivial this seems now. A year ago I would have laughed at anyone who suggested this would be possible within 5 years.
I'm feeling a mix of emotions that I can't begin to describe.