While we have all been amazed by generative AI (GAI) text, images, or videos at some point in the last two years, the question remains: Can it really produce something truly original that is not predictable from its training data?
The answer depends on how we define emergent abilities and, according to an important study by Lu et al. (2024), is most probably no.
In this study, the authors examined LLM outputs across more than 1,000 experiments and asked whether the results were best explained by emergent capabilities, i.e. abilities not merely derived from the training data. Their analysis suggests that the outputs are equally well explained by in-context learning (ICL) and instruction-tuning: prompts that include examples of the expected outcome simply yield better results. Their further experiments confirmed this hypothesis, with ICL rather than any emergent ability accounting for the observed performance.
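To make the ICL idea concrete, here is a minimal sketch of the difference between a zero-shot prompt and a few-shot prompt. The toy sentiment task, the example reviews, and the labels are purely illustrative assumptions and are not taken from Lu et al. (2024); the point is only that the few-shot version shows the model worked examples of the expected output inside the prompt itself.

```python
# Minimal sketch: zero-shot prompt vs. few-shot (in-context learning) prompt.
# The task, reviews, and labels below are hypothetical and for illustration only.

zero_shot_prompt = (
    "Classify the sentiment of the following review as Positive or Negative.\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)

# Worked examples that demonstrate the expected format and labels.
few_shot_examples = [
    ("The camera quality is stunning.", "Positive"),
    ("Shipping took three weeks and the box was damaged.", "Negative"),
]

# In-context learning: the same instruction, but with examples prepended,
# so the model can infer the expected behavior from the prompt itself.
few_shot_prompt = "Classify the sentiment of the following review as Positive or Negative.\n"
for review, label in few_shot_examples:
    few_shot_prompt += f"Review: {review}\nSentiment: {label}\n"
few_shot_prompt += "Review: The battery died after two days.\nSentiment:"

print(zero_shot_prompt)
print("---")
print(few_shot_prompt)
```

On the study's account, the gains we see from the second style of prompt reflect the model following demonstrated patterns, not a newly emerged reasoning ability.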
That means, for the time being, we can assume that any LLM output, no matter how brilliant, is not emergent. LLMs perform well when they carry out instructions, and best when those instructions include details and samples of the expected output. There is no evidence that they can reason through problems and create novel solutions to them.
LLMs are therefore currently extremely unlikely to start providing material for hacking banks, recipes for atomic, biological, or chemical (ABC) weapons, or instructions for taking over a smart city. That does NOT mean all output is harmless: the new technology will certainly accelerate fake news on social networks, and examples are already popping up. This study should help us avoid overestimating LLM abilities while still watching closely how they develop.