ChatGPT can now respond to images with text thanks to GPT-4

OpenAI has released a new version of its GPT language model: GPT-4. Overall, the improvements over its predecessor are subtle, but there are also a number of new features. In this article we list some of them for you.

GPT-4: the latest GPT language model

OpenAI is now running the latest version of its GPT language model, GPT-4. The company announced this in a news post. According to OpenAI, the differences with GPT-3.5 are subtle: “In casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle. The difference emerges when the complexity of the task reaches a sufficient threshold – GPT-4 is more reliable, more creative and capable of handling much more nuanced instructions than GPT-3.5.”

1. Images and text as input

There is, however, one major addition in GPT-4: it can take both images and text as input and produce text as output. OpenAI shows an example of this, in which a user asks GPT-4 what is unusual about a particular image. The user uploads a photo of a man ironing on the back of a taxi, and GPT-4 describes what is off, noting that it is quite unusual for a man to iron clothes on an ironing board attached to a moving taxi.
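For developers, this image input is expected to be available through OpenAI’s API. The sketch below shows roughly how such a request could look with the official openai Python library; the model name and the availability of image input to regular API users are assumptions, since OpenAI is rolling this out gradually.

# A minimal sketch of sending an image plus a question to a multimodal
# GPT-4 model via OpenAI's Chat Completions API. The model name
# "gpt-4-vision-preview" and the example image URL are assumptions;
# use whichever vision-capable model OpenAI actually makes available.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # assumed model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is unusual about this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/man-ironing-on-taxi.jpg"},
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)

The answer comes back as ordinary text, just like a normal ChatGPT reply.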

2. Process more text

Compared to GPT-3.5, version four can hold much more text in its short-term memory. It can take in more text from the user and also generate longer, more detailed answers in one go. Answers should also improve because it can now work through large scientific articles in a single pass and base its answers on them. In simple terms: where GPT-3.5 could handle just over 3,000 words, GPT-4 can handle about 25,000 words.
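Under the hood these limits are counted in tokens rather than words: roughly 4,000 tokens for GPT-3.5 versus up to about 32,000 for the largest GPT-4 variant, which works out to the word counts above. The sketch below uses OpenAI’s tiktoken library to check whether a text would fit; the exact limits and the model names are approximations taken from OpenAI’s announcement, not guarantees.

# Rough check of whether a text fits in a model's context window, using
# OpenAI's tiktoken tokenizer. The limits below are approximate figures
# from OpenAI's GPT-4 announcement.
import tiktoken

CONTEXT_LIMITS = {
    "gpt-3.5-turbo": 4_096,   # roughly 3,000 English words
    "gpt-4-32k": 32_768,      # roughly 25,000 English words
}

def fits_in_context(text: str, model: str = "gpt-4-32k") -> bool:
    enc = tiktoken.encoding_for_model("gpt-4")  # cl100k_base tokenizer
    n_tokens = len(enc.encode(text))
    print(f"{n_tokens} tokens, limit for {model} is {CONTEXT_LIMITS[model]}")
    return n_tokens <= CONTEXT_LIMITS[model]

# Example: check a long article (hypothetical file) before sending it to the API.
with open("article.txt") as f:
    print(fits_in_context(f.read()))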

3. A better Bing

Microsoft, one of the companies backing OpenAI, has confirmed that GPT-4 is the model that powers Bing’s chat experience. The chatbot still doesn’t always work well and has even threatened people, but the company says it is still learning and will continue to improve. You can read more about Bing’s chatbot in our related articles.

4. Helping visually impaired people

The Be My Eyes app is getting a virtual volunteer. Be My Eyes is a free app that aims to make the world more accessible to people with a visual impairment. Among other things, the app connects them with volunteers who can help them with certain problems via a live video call, during which the volunteer guides them.

The latest version of this app integrates GPT-4. This allows users of the app to send images to a virtual volunteer powered by AI. This ‘volunteer’ answers questions about those images and offers visual assistance. As an example, someone with poor eyesight sends a photo of the inside of a gym. The person is looking for the treadmill and can ask, based on the photo, where it is. The virtual volunteer then guides them there step by step.

Still not perfect

OpenAI itself reports that the GPT language model is still far from perfect. Providing answers to the thousands of different questions it receives remains a very complex process. The algorithms are getting better, but they are certainly not perfect yet. Because of this, the GPT language model will still come up with wrong answers and make things up. OpenAI does report that GPT-4 “shows human-level performance”, but it is not yet as proficient as humans in real-life scenarios and conversations.

What do you think of the GPT language model and chatbots? Let us know in the comments below this article.
