Comparison of free and paid AI captioning models (2024) [updated 14 May]

What’s the state of AI captioning in 2024? Can it replace a human? Are paid models better than free ones? And of course, which model is the best? What exactly did each model output?

If you have access to a model that isn't described here, please contact me so I can send you better-quality photos to test it with.

Basic questions

To answer my own questions: image captioning is one of the strongest points of current AI. In my opinion, it works much better than generating text from a prompt or generating code. I think it can’t replace a human yet, at least not as an unassisted generator. But with assistance, it does pretty well. It can create a very good description, but there’s no single winner, and one would have to try a few websites (or install local applications, in the case of the Hugging Face models) before getting a good one. It seems that free applications don’t necessarily generate worse results. Overall, generating descriptions with different models and validating them could take longer than describing photos for stock manually. At least until third-party tools like DeepMeta let you click a button to fill in an AI-generated title, description and tags.
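Trying several Hugging Face models on the same photo can be scripted instead of clicked through. The snippet below is only a minimal sketch, not the setup used for this comparison (the results below come from the online demos). It assumes the `transformers` and `Pillow` packages are installed and that the installed `transformers` version still provides the `image-to-text` pipeline. The helper names, the model IDs and the file name are my own assumptions for illustration.

```python
from typing import Callable, Dict


def make_hf_captioner(model_id: str) -> Callable[[str], str]:
    """Build a captioner backed by a locally run Hugging Face model.

    The import is lazy, so the comparison helper below still works
    when `transformers` is not installed.
    """
    from transformers import pipeline  # assumption: transformers is installed

    pipe = pipeline("image-to-text", model=model_id)

    def caption(image_path: str) -> str:
        # The pipeline returns a list of dicts with a "generated_text" key.
        return pipe(image_path)[0]["generated_text"].strip()

    return caption


def compare_captions(
    image_path: str, captioners: Dict[str, Callable[[str], str]]
) -> Dict[str, str]:
    """Run every captioner on one image and collect captions by model name."""
    results: Dict[str, str] = {}
    for name, captioner in captioners.items():
        try:
            results[name] = captioner(image_path)
        except Exception as exc:  # keep going if one model fails
            results[name] = f"<failed: {exc}>"
    return results


# Hypothetical usage (model IDs and file name are assumptions):
# models = {
#     "BLIP": make_hf_captioner("Salesforce/blip-image-captioning-base"),
#     "GIT": make_hf_captioner("microsoft/git-base-coco"),
# }
# for name, text in compare_captions("rooster.jpg", models).items():
#     print(f"{name}: {text}")
```

Because `compare_captions` accepts any callable, a wrapper around a paid API could be added to the same dictionary, which keeps the side-by-side validation in one place.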

The test and where the difficulty lies

To make this comparison I used three photos. They all have some perks and quirks which may be difficult for AI. Just one or two models correctly recognized the point of the photo of the “KATOWICE” sign (i.e. the sun shining through the letter “O”), and none recognized the concrete fence in the rooster photo. The other difficulty was the background of the butterfly photo: the butterfly sits on the grass and there’s nothing else in the very blurred background, the so-called bokeh. One model recognized it correctly (I think it was a lucky guess). And Bing did a really awesome job with all photos. Without threatening to destroy everything I love, as it did in the past to somebody.

Subjective results

For applications requiring a prompt, I put the prompt in the first column. I kept all formatting: lists, bold font and Bing’s emoticons. In my opinion, if you want to go with free apps, Google Gemini and Microsoft Copilot are good starting choices. Among the paid alternatives, the pro version of Gemini doesn’t make a big difference, while GPT-4 results are rather nice, but comparable to the free Copilot and Gemini. For me, the models can be assessed using two attributes: how specific (vs. poetic or humorous) they are and how correct (vs. delusional or wrong) they are. Here is the totally subjective comparison:

Chart with X axis showing correct-wrong models and Y with poetic-specific.
Fig. 1 Subjective comparison of AI captioning models: red are paid, yellow are demos from Hugging Face and blue are others

Objective, raw results

The last note before the main part: I used this formatting to highlight errors in recognition and this one for my own notes


free models
free, but noticeably limited
paid models


HuggingFace - GIT
Photo 1 (KATOWICE sign): the sun behind the sign [my note: ...that's all?]
Photo 2 (butterfly): a blue butterfly on a blade of grass. [my note: Wow, it's recognized the grass!]
Photo 3 (rooster): a black and white rooster with a white head standing in front of a brick wall.

HuggingFace - BLIP1
Photo 1 (KATOWICE sign): a large sign that says katotowce on top of a building
Photo 2 (butterfly): a small blue butterfly that is sitting on a plant
Photo 3 (rooster): a black rooster with a white and red comb on its head

HuggingFace - BLIP2
Photo 1 (KATOWICE sign): katowice sign with sun shining behind it
Photo 2 (butterfly): a blue butterfly is sitting on top of some grass
Photo 3 (rooster): a black and white rooster standing in the grass
HuggingFace - InstructBLIP
prompt: describe photo for a stock
Photo 1 (KATOWICE sign): photo a sign that says katowice with the sun shining behind it
Photo 2 (butterfly): photo of a blue butterfly sitting on a leaf
Photo 3 (rooster): photo a rooster with a mohawk standing in front of a wall
HuggingFace - Fuyu8B
Detailed mode
Photo 1 (KATOWICE sign):
In the image, there is a darkened sky with a gleaming light shining down. The gleaming light appears to be in the center of the sky, creating a glowing circle. The sky is dotted with clouds, creating a moody atmosphere.
Photo 2 (butterfly):
In the image, a beautiful blue butterfly is perched on a wooden stick. The butterfly is facing the camera, and it appears to be resting. The insect is surrounded by green leaves, giving the impression of a natural setting. The butterfly is positioned towards the left side of the image, occupying a significant portion of the [my note: sic! no end. Further testing shows random cuts of the description.]
Photo 3 (rooster):
In the image, a large, rooster-like bird is standing on a grassy area next to a brick wall. The bird is displaying its long, black tail feathers. The bird is positioned towards the left side of the image, with its tail hanging down.
HuggingFace - Fuyu8B
Normal mode
Photo 1 (KATOWICE sign): The words kato and tove are displayed on a dark background.
Photo 2 (butterfly): A close up view of a blue butterfly on a branch
Photo 3 (rooster): A rooster is standing in the grass near a building.
HuggingFace - SRD Image Caption
Photo 1 (KATOWICE sign): a sunset scene with a sign that says "no parking"
Photo 2 (butterfly): a blue and white bird is sitting on a blue surface
Photo 3 (rooster): a bird with a long beak and a black tail
Bing Copilot
• Overuses emojis like a crazy teenager, tries to be poetic, but gets pretentious.
• Semi-free: officially it needs Windows and Microsoft Edge; unofficially it can work using some hackish ways, but that's against the terms of use.
used prompt: describe this photo for a stock
Photo 1 (KATOWICE sign):
This captivating image features the word “KATOWICE” in bold capital letters. The letters are silhouetted against a bright sky, with the sun positioned directly behind the letter “O.” This creates a striking effect of light and shadow, emphasizing the contours of the letters. The sky appears mostly clear, and the intense sunlight generates rays that extend outward, adding to the visual impact of the installation. Overall, it’s a dramatic and eye-catching composition. 🌞🔠
[my note: That's very nice!]
Photo 2 (butterfly):
Title: “Azure Wings Unfurled”
Description: This mesmerizing image captures the delicate beauty of a bright blue butterfly. With its wings fully spread, the butterfly reveals intricate patterns and textures. Its body is adorned with fine, iridescent blue hairs, while the wings transition from a deep azure near the body to softer shades towards the edges. Against a blurred backdrop of lush greenery, the butterfly perches gracefully, its antennae reaching out to the world [my note: what the hell?]. This fleeting moment encapsulates the wonder of nature’s artistry. 🦋🌿
Photo 3 (rooster):
This captivating image features a distinctive rooster with a fluffy, white head and a dark body. The rooster stands against a backdrop of a concrete wall and lush greenery, creating a striking contrast between natural and man-made elements. The composition draws attention to the rooster’s unique appearance and the interplay of textures. 🐓🌿
Google Gemini (ex-Bard)
used prompt: describe this photo for a stock
Photo 1 (KATOWICE sign):
Subject: The photo features the word "Katowice" illuminated by sunlight breaking through clouds.
Keywords: Katowice, Poland, city, travel, sunshine, clouds, light, perspective
Places: This image could be used to represent the city of Katowice in Poland.

Additional details:

The photo does not show any people.
There are no recognizable landmarks in the background.
The focus is on the illuminated text and the sunbeams.
Photo 2 (butterfly):
Subject: Close-up of a blue butterfly perched on a green plant.
Keywords: Butterfly, blue, nature, macro, insect, wings, plant, flower (potential)
Concepts: Nature, wildlife, beauty, fragility, summer

Additional details:

The butterfly is commonly identified as a Common Blue butterfly (Polyommatus icarus) [my note: the only one to correctly identify not just family, but also species!] based on its appearance.
The background is out of focus and difficult to discern.
There are no people in the image.
Photo 3 (rooster):
This image features a close-up of a Polish Chicken, a breed known for its distinctive feathered headwear. The chicken has a white body with black markings and a red comb and wattles. The most prominent feature is the abundant white feathers on its head, which cascade down the sides of its face. The chicken's beak is slightly open, and its eyes are alert and curious.
(...) then, after a few questions requesting a better description, it says:
Breeds: If the specific breed is important, you can mention other characteristics of Polish chickens, such as their docile temperament or their ability to lay white eggs. [my note: I assure you: this rooster won't lay ANY egg]
GPT 4o
(free, but very limited trial model)
Photo 1 (KATOWICE sign):
photograph captures the silhouette of a sign spelling "KATOWICE" against a vibrant sky. The sun is positioned directly behind the letters "T" and "O," creating a striking visual effect with rays of light radiating through the gaps. The photo employs a high-contrast technique, with the sign appearing in dark shadow against the bright, glowing sky, highlighting the details of the letters. The composition and lighting create a dramatic and dynamic image, suitable for use in themes related to urban landscapes, city life, or travel in Katowice, Poland.
Photo 2 (butterfly):
The image showcases a vibrant blue butterfly, prominently positioned against a blurred natural background. The butterfly's wings are fully open, revealing intricate details and textures with shades of blue and subtle hints of purple. The delicate veins and edges of the wings are clearly visible, adding to the butterfly's delicate and ethereal appearance. The background consists of soft green hues, suggesting a grassy or leafy environment, which contrasts beautifully with the butterfly's striking color. The image captures a moment of stillness and natural beauty, perfect for themes related to nature, tranquility, and the elegance of wildlife.
Photo 3 (rooster):
The image features a distinctive rooster standing confidently in an outdoor setting. The rooster has striking plumage, with a prominent white crest atop its head, contrasting sharply with its black body feathers. Its red comb and wattles add a vibrant touch of color to its appearance. The background shows a textured concrete wall [my note: so far the only AI that almost recognized concrete fence] with some green foliage climbing up the side, suggesting an urban or semi-rural environment. The ground is covered with grass and patches of earth, adding to the natural setting. The rooster's pose and the detailed feathers convey a sense of pride and elegance, making it an eye-catching subject for themes related to farm life, unique animals, and rural scenes.
Gemini Pro (paid model)
Photo 1 (KATOWICE sign):
Title: Katowice, Poland
Description: The photo shows the city of Katowice, Poland. The sun sets behind the buildings of the city.
Tags: Katowice, Poland, skyline, sunset, city
Photo 2 (butterfly):
Title: Blue Butterfly
Description: A close-up of a blue butterfly perched on a leaf. The butterfly is mostly a deep blue color with some black markings on its wings. The background is soft green.
Tags: butterfly, blue, wings, macro, nature, insect
Photo 3 (rooster):
Title: Proud Rooster
Description: A proud rooster stands tall and proud in front of a stone wall.
Tags: rooster, chicken, bird, animal, proud, majestic, colorful, vibrant, feathers, beak, wattle

Second try with the following prompt: “Generate the title, description and tags for the following image. Make a description at least 5 sentences long.”
Title: Proud Rooster
Description: A proud rooster stands tall and proud in the barnyard, his feathers gleaming in the sunlight. He is the king of the roost and he knows it [my note: lol]. He is a beautiful creature and he knows it too [my note: lol too]. He struts around the barnyard, showing off his feathers and his tail. He is a very proud rooster indeed.
Tags: rooster, chicken, barnyard, proud, feathers, tail
GPT-4-Turbo-Vision (paid model)
Photo 1 (KATOWICE sign):
Title: Silhouette of Katowice Sign Against the Sun

Description: This striking image captures the bold letters spelling out 'KATOWICE' in a dramatic silhouette. The sun is perfectly positioned behind the sign, creating a halo effect that highlights the contours of each letter. The clear sky provides a serene backdrop, contrasting with the dark outlines of the sign. This photo encapsulates the beauty of backlighting and the power of simple yet impactful composition.

Tags: Katowice, silhouette, sun, backlighting, sky, signage, halo effect, photography, contrast, dramatic lighting, city name sign.
Photo 2 (butterfly):
Title: Close-Up of a Delicate Blue Butterfly

Description: This captivating image features a close-up view of a beautiful blue butterfly, showcasing its delicate wings spread wide. The rich hues of blue are highlighted by the intricate patterns and fine scales on the wings, while the butterfly's furry body and long antennae are also visible in fine detail. The natural green background provides a soft contrast to the butterfly's vibrant color, emphasizing its ethereal beauty.

Tags: blue butterfly, insect, close-up, nature, wildlife, macro, delicate, wings, antennae, natural beauty, entomology, blue, scales, patterns.

Second try:
Title: Splendid Blue Butterfly in Natural Habitat
Description: The image features a striking blue butterfly with its wings fully extended, revealing the intricate patterns and textures that grace its wings. The butterfly's fuzzy thorax and slender antennae are also prominently displayed, set against a backdrop of soft-focus greenery that accentuates the creature's vivid coloration. This photograph captures the serene beauty of a butterfly at rest in the wild.
Tags: blue butterfly, nature, wildlife, macro photography, insect wings, fuzzy thorax, natural habitat, green background, delicate insect, antennae, butterfly close-up.
Photo 3 (rooster):
Title: Majestic Rooster with a Stylish Crest
Description: This vibrant image showcases a black rooster with an impressive white crest that resembles a stylish hairdo. The bird's striking features, including its red comb and wattles, stand out against the neutral background of a gray brick wall and green foliage. Its tail feathers arc elegantly upward, adding to its regal and distinctive appearance. This photograph captures the unique charm and character of poultry in a natural setting.
Tags: rooster, bird, crest, comb, wattles, feathers, poultry, farm animal, stylish, nature, wildlife, brick wall, greenery.
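
Several of the models above answer in a "Title: / Description: / Tags:" layout, which is easy to turn into structured data before pasting it into a stock submission form. The snippet below is a minimal sketch of such a parser using only the Python standard library. The names `StockMetadata` and `parse_stock_metadata` are hypothetical, and it assumes each field starts on its own line, which not every model output above does.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class StockMetadata:
    title: str = ""
    description: str = ""
    tags: List[str] = field(default_factory=list)


# A field starts with its label at the beginning of a line.
_FIELD_RE = re.compile(r"^(Title|Description|Tags)\s*:\s*(.*)$", re.IGNORECASE)


def parse_stock_metadata(text: str) -> StockMetadata:
    """Split a 'Title/Description/Tags' style answer into its three fields."""
    parts = {"title": [], "description": [], "tags": []}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = _FIELD_RE.match(line)
        if match:
            current = match.group(1).lower()
            line = match.group(2)
        # Lines before the first label are ignored; later unlabeled
        # lines are treated as a continuation of the current field.
        if current and line:
            parts[current].append(line)

    # Some models end the tag list with a full stop; drop it.
    tags_text = " ".join(parts["tags"]).rstrip(".")
    tags = [tag.strip() for tag in tags_text.split(",") if tag.strip()]
    return StockMetadata(
        title=" ".join(parts["title"]),
        description=" ".join(parts["description"]),
        tags=tags,
    )
```

Feeding it the first Gemini Pro rooster answer above gives the title "Proud Rooster" and a list of eleven tags. Answers that use other labels (for example Gemini's "Subject" and "Keywords") would need the label list extended.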



