How to solve the ‘strawberry’ problem: AI’s limitations
Large language models (LLMs) such as ChatGPT and Claude are now commonplace around the world, and it is no surprise that many people worry about AI taking their jobs. It is ironic, then, that almost all LLMs fail at a task as simple as counting the letters in the word “strawberry.” LLMs are powerful AI programs trained to generate and understand human-like text. By predicting tokens and composing coherent responses, they excel at answering questions, translating languages, summarizing information, and even producing creative writing. LLMs are designed to recognize patterns in text, which allows them to handle a wide range of language-related tasks with impressive accuracy.
Despite their prowess, their failure to count the “r”s in the word “strawberry” is a reminder that LLMs are not capable of “thinking” like humans. They do not process the information we feed them the way a human would.
Conversation with ChatGPT and Claude about the number of “r”s in strawberry.
Most of the high-performance LLMs available today are built on transformers, a deep learning architecture that does not ingest raw text directly. Instead, the text is first converted into numerical tokens through a process called tokenization. Some tokens are complete words (like “monkey”), while others are fragments of a word (like “mon” and “key”). Each token is a numerical code the model can work with, and breaking everything down into tokens is what lets the model predict the next token more effectively.
LLMs do not memorize words; rather, they learn the ways these tokens tend to be arranged, which helps them guess what comes next. For the word “hippopotamus,” the model might see the tokens “hip,” “pop,” “o” and “tamus,” without ever knowing that the word is made of the letters “h,” “i,” “p,” “p,” “o,” “p,” “o,” “t,” “a,” “m,” “u,” “s.” The sketch below shows what this looks like in practice.
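To make the token view concrete, here is a rough sketch using OpenAI’s tiktoken library. The choice of the “cl100k_base” encoding, and the exact splits it produces, are assumptions for illustration; each model family uses its own tokenizer and may split these words differently.

```python
# A rough sketch of how tokenization splits words into sub-word pieces,
# using OpenAI's tiktoken library (pip install tiktoken). The "cl100k_base"
# encoding is an illustrative assumption; other models split words differently.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for word in ["strawberry", "hippopotamus"]:
    token_ids = enc.encode(word)                    # numerical codes the model sees
    pieces = [enc.decode([t]) for t in token_ids]   # the sub-word fragments they stand for
    print(f"{word} -> {pieces}")

# The model receives the numerical token IDs, not individual letters,
# so it never directly "sees" how many r's a word contains.
```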
A model architecture that looks directly at individual characters, without tokenizing them, could avoid this problem, but for today’s transformer architectures it is not computationally feasible.
Further, consider how LLMs generate output: they predict the next token based on the previous input and output tokens. This is great for producing contextually aware, human-sounding text, but poor for simple tasks such as counting letters. When asked how many “r”s appear in the word “strawberry,” an LLM is purely predicting an answer from the structure of the input sentence rather than actually counting.
Here’s a workaround
While LLMs might not be able to “think” or reason logically, they are adept at producing structured text, and computer code is a prime example of structured text. ChatGPT will likely give the right answer if asked to use Python to count the “r”s in “strawberry.” Whenever an LLM needs to count or perform any other task that requires logical reasoning or arithmetic, the broader software can be designed so that the prompt asks the model to use a programming language to process the input query, as in the sketch below.
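As a sketch of what that looks like, the snippet below shows the kind of Python a model might be asked to write and run; the function name and wording are illustrative, not drawn from any particular product.

```python
# A minimal sketch of the kind of code an LLM can be prompted to write and
# execute instead of guessing. String counting in Python is deterministic,
# so the answer does not depend on how the word was tokenized.
def count_letter(word: str, letter: str) -> int:
    """Return how many times `letter` occurs in `word`, ignoring case."""
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # prints 3
```

In practice, the prompt can simply instruct the model to write and run code that counts the “r”s in “strawberry” and then report the program’s output rather than a predicted answer.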
Conclusion
A simple letter-counting experiment exposes a fundamental limitation of LLMs like ChatGPT and Claude. These models can generate human-like text, write code, and answer questions, yet they cannot “think” as humans do. The experiment shows that they are not intelligent in the human sense, but pattern-matching algorithms. Knowing which prompts are most effective can help work around this limitation.