pleroma.debian.social

This year, as an experiment, we're also generating VTT subtitles for all released videos at @fosdem. These are generated after a video is published, using OpenAI Whisper.
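For anyone curious what that pipeline roughly looks like: a minimal sketch of rendering Whisper-style segments as WebVTT, assuming segments shaped like the `start`/`end`/`text` dicts Whisper's `transcribe()` returns. The segment contents below are hypothetical, not from an actual FOSDEM talk.

```python
def fmt_ts(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

def segments_to_vtt(segments) -> str:
    """Render Whisper-style segments as a WebVTT document."""
    cues = [
        f"{fmt_ts(seg['start'])} --> {fmt_ts(seg['end'])}\n{seg['text'].strip()}"
        for seg in segments
    ]
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"

# Hypothetical segments, shaped like Whisper's transcribe() output.
segments = [
    {"start": 0.0, "end": 2.5, "text": " Welcome to FOSDEM."},
    {"start": 2.5, "end": 6.0, "text": " This talk is about subtitles."},
]
print(segments_to_vtt(segments))
```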

I would love to get feedback about these subtitles. Are people using them? Are they useful?

As they're plain text files, it should also be possible to perform some analysis at scale on the contents of all the talks given at FOSDEM. If anyone is doing something like this, please let me know!
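As a starting point for that kind of analysis, here's a stdlib-only sketch that strips a VTT file down to its spoken text (it doesn't handle optional cue identifiers or styling, and the sample cue text is made up):

```python
def vtt_text(vtt: str) -> str:
    """Extract spoken text from a simple WebVTT document by
    dropping the WEBVTT header, cue timing lines, and blanks."""
    kept = []
    for line in vtt.splitlines():
        line = line.strip()
        if not line or line == "WEBVTT" or "-->" in line:
            continue
        kept.append(line)
    return " ".join(kept)

# Hypothetical subtitle file contents.
sample = """WEBVTT

00:00:00.000 --> 00:00:02.500
Welcome to FOSDEM.

00:00:02.500 --> 00:00:06.000
This talk is about subtitles.
"""
print(vtt_text(sample))
```

From there, word counts, keyword searches, or topic models across all talks are a few lines more.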

@wouter @fosdem
*To clarify: they're using locally running Whisper models, not OpenAI's APIs.

@fosdem @wouter oh ffs is this not just more use of worldburning theft machines?

@mirabilos
Maybe do some research before assuming 'AI' == power wastage? The tiny model is small enough that you can use it to transcribe live audio on a Raspberry Pi.
@fosdem

@wouter @fosdem hmmmh. And how energetically and environmentally expensive, and how legal and ethical, was its training?

@mirabilos
Audio transcription is not an LLM. I don't know how much power OpenAI spent training the Whisper models, because honestly I don't care. That said, this type of thing was possible with the state of the art 20 years ago, so I can't imagine it needed that much.

Thanks for the feedback (I know I did ask), but EOT for me.
@fosdem

@wouter interesting. Is thre ny validation done as to the orecness of the subtitles?

@lieter
None whatsoever at this point. This is part of it being an experiment 😉

@lieter @wouter love how there are three typos in your message, on purpose I assume? 🙃