Today wraps up my one-week special on prompting Gemini in Google AI Studio (where it’s free to start prototyping and building!). Before I dive into today’s examples, I wanted to quickly recap some of the things we’ve explored so far:
Getting festive - finding creative ways to hide the Elf on the Shelf
Captioning photos
Playing games - guess the movie, pick the odd one out
Planning meals
Predicting patterns
Coming up with activities to do with kids
Organizing my life
Making choices - selecting the best option based on criteria
To end things off, I thought I’d see how vision understanding works on some of the more common Generative AI use cases…
Summarization
Usually summarization applies to large bodies of text, but I thought it’d be interesting to see how Gemini summarized what it understood about a photo.
Sentiment understanding
This use case typically applies to extracting sentiment from things like user reviews. I decided to go a little more abstract, and see if Gemini could extract sentiment from photos of my family.
Categorization
For this task, I took a picture of a collection of items that I thought would easily fall into three different categories. Then I asked Gemini to identify which three categories it thought should exist, and which objects should go into each. I was actually a little surprised by what it came up with (it wasn’t what I was thinking).
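If you want to try the categorization idea in code rather than in the AI Studio UI, here is a minimal sketch of the prompt plus a sanity check on the reply. It deliberately makes no API call: the function names, the prompt wording, and the JSON reply shape are all my own assumptions for illustration, not what I actually typed into AI Studio or anything Gemini guarantees.

```python
import json


def build_categorization_prompt(num_categories: int = 3) -> str:
    """Build the text part of the prompt (the photo is attached separately)."""
    return (
        f"Look at the attached photo. Group every object you can see into "
        f"exactly {num_categories} categories that you choose yourself. "
        'Reply with JSON only, shaped like {"category name": ["object", ...]}.'
    )


def parse_categories(reply_text: str, num_categories: int = 3) -> dict[str, list[str]]:
    """Pull the JSON object out of a model reply and check its shape.

    Assumes the reply contains a single JSON object, possibly surrounded
    by extra chatter, which is a hypothetical format requested by the prompt.
    """
    start = reply_text.find("{")
    end = reply_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")

    data = json.loads(reply_text[start : end + 1])
    if not isinstance(data, dict) or len(data) != num_categories:
        raise ValueError(f"expected exactly {num_categories} categories")

    seen = set()
    for name, items in data.items():
        if not isinstance(items, list):
            raise ValueError(f"category {name!r} is not a list")
        for item in items:
            if not isinstance(item, str):
                raise ValueError(f"object {item!r} is not a string")
            # Each object should land in one category only.
            if item in seen:
                raise ValueError(f"object {item!r} appears more than once")
            seen.add(item)
    return data
```

You would send the string from `build_categorization_prompt()` alongside the photo, then hand the model’s text reply to `parse_categories()`. The check for duplicates is there because a model can happily put the same object in two groups.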
Applying All This in Real Life
One thing I’ve glossed over in this special one-week series is how much back and forth it took to get these prompts to work well. Even with all of that, I still ended up settling on single, standalone prompts. In a lot of cases, including a few examples in the prompt would improve repeatability and overall quality, but for the sake of brevity and levity I’ve stopped here.
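For anyone curious what “including examples in my prompts” could look like, here is a minimal sketch of a few-shot prompt builder. The `Example` class, the function name, and the layout of the prompt are hypothetical choices of mine. I am also assuming text labels stand in for the example images; with a real multimodal model you would attach the actual images next to each answer instead.

```python
from dataclasses import dataclass


@dataclass
class Example:
    image_label: str  # stand-in description of an example image (assumption)
    expected: str     # the answer you would like the model to imitate


def build_few_shot_prompt(
    instruction: str, examples: list[Example], query_label: str
) -> str:
    """Lay out the instruction, the worked examples, then the open question."""
    lines = [instruction.strip(), ""]
    for i, ex in enumerate(examples, start=1):
        lines.append(f"Example {i}")
        lines.append(f"Image: {ex.image_label}")
        lines.append(f"Answer: {ex.expected}")
        lines.append("")
    # The final block ends on an empty "Answer:" for the model to complete.
    lines.append("Now your turn")
    lines.append(f"Image: {query_label}")
    lines.append("Answer:")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Describe the overall sentiment of each photo in one word.",
        [
            Example("kids opening presents", "joyful"),
            Example("toddler who dropped an ice cream", "sad"),
        ],
        "family hiking at sunset",
    )
    print(prompt)
```

Keeping the examples in a list makes it easy to swap them in and out while you iterate, which is most of the back and forth anyway.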
That said, this series was meant to provide a glimpse into some fun and creative ways for you all to start thinking about how to use multimodal models. I hope it has achieved just that - and I can’t wait to see what the world builds!