Using an LLM to process video

I have experimented a lot with passing screenshots into an LLM to give it additional context, but what I really want to try is passing video directly into the model. I believe Gemini is the only one that supports this. This post summarizes Simon Willison’s recent experiment with Gemini for video.