Large language models are a type of artificial intelligence currently taking the world by storm. They include OpenAI’s ChatGPT, Google’s Bard and various others. All are trained on vast databases of written articles in which they measure the likelihood of a word appearing, given the sequence of words that appear before it.
Armed with that knowledge, the AI produces responses to a given prompt by listing the most likely sequence of words that the model suggests. Computer scientists have further refined these processes and fine-tuned the capabilities of these systems to improve the output.
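To make the idea of "the likelihood of a word appearing, given the words before it" concrete, here is a minimal sketch in Python. It is a deliberate simplification: it counts word pairs in a tiny invented corpus, whereas real large language models use neural networks trained on vastly more text and look at far more than one preceding word. The corpus and the function names (`train_bigrams`, `next_word_probs`, `generate`) are illustrative assumptions, not anything taken from ChatGPT or Bard.

```python
from collections import Counter, defaultdict

# Toy corpus, invented purely for illustration.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."


def train_bigrams(text):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Turn the raw follow-counts for `prev` into probabilities."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {word: n / total for word, n in followers.items()}


def generate(counts, start, length):
    """Greedily append the most likely next word, up to `length` times."""
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # no known continuation for this word
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)


model = train_bigrams(corpus)
print(next_word_probs(model, "sat"))  # "sat" is always followed by "on" here
print(generate(model, "the", 4))
```

Even this crude counter shows the basic behavior described above: given a prompt word, it extends the text with whatever most often came next in its training data.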
The results are by turns impressive, confusing and frightening. These AIs have the ability to write jokes, produce poetry and mimic literary styles. But they also confidently make mistakes, sometimes compounding them with a series of errors, a phenomenon that AI engineers call hallucinating.
Question Marks
Nevertheless, many observers predict a bright future for these AIs. Microsoft is building the capability into its Office products to help produce written reports and presentations and to analyze data. Google is taking a similar approach with its Workspace. The hope, at least as far as these technology giants are concerned, is that these AI systems can dramatically improve the productivity of workers and the companies that employ them.
And that raises the question of how people will use them and which jobs are likely to be most influenced by the emergence of large language models.
Now we get an answer of sorts thanks to the work of Tyna Eloundou at OpenAI, an AI startup based in San Francisco, and colleagues. This group asked whether ChatGPT3.5 could be useful for a list of almost 20,000 tasks associated with more than 1,000 occupations, ranging from computer system architects and nurses to journalists and mathematicians.
(The database is called O*NET and is maintained by the US Department of Labor.)
The team then determined the impact that ChatGPT3.5 would have. In each case they wanted to know whether the AI would make the task harder or easier, or whether it would need some additional software to have a positive effect.
For example, one task that an acute care nurse has to perform is to “set up, operate, or monitor invasive equipment and devices, such as colostomy or tracheotomy equipment, mechanical ventilators, catheters, gastrointestinal tubes, and central lines.” By contrast, one task for a kindergarten teacher is to “involve parent volunteers and older students in children’s activities to facilitate involvement in focused, complex play,” while one task for an online merchant is to “deliver e-mail confirmation of completed transactions and shipment.”
The team asked a group of people to assess this impact and also put the same question to GPT-4, the most advanced version of OpenAI’s large language model.
The results make for interesting reading. “Our findings indicate that approximately 80% of the U.S. workforce could have at least 10% of their work tasks affected by the introduction of GPTs,” say Eloundou and co. “While around 19% of workers may see at least 50% of their tasks impacted.”
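The arithmetic behind a headline figure of this kind can be sketched as follows. This is a simplified illustration, not the authors' method: it assumes each task has already been labeled as affected or not, and every occupation name, worker count and label below is invented for the example.

```python
# Hypothetical toy data: occupation -> (number of workers, per-task "affected" flags).
# None of these figures come from the paper or from O*NET.
occupations = {
    "writer": (200, [True, True, True, False]),             # 75% of tasks affected
    "nurse": (500, [True] + [False] * 9),                   # 10% of tasks affected
    "mason": (300, [False, False, False, False]),           # 0% of tasks affected
}


def affected_fraction(flags):
    """Fraction of an occupation's tasks that are flagged as affected."""
    return sum(flags) / len(flags)


def workforce_share(data, threshold):
    """Share of all workers whose occupation has at least `threshold`
    of its tasks flagged as affected."""
    total = sum(workers for workers, _ in data.values())
    hit = sum(
        workers
        for workers, flags in data.values()
        if affected_fraction(flags) >= threshold
    )
    return hit / total


print(workforce_share(occupations, 0.10))  # workers with at least 10% of tasks affected
print(workforce_share(occupations, 0.50))  # workers with at least 50% of tasks affected
```

With these made-up numbers, 70% of workers clear the 10% threshold and 20% clear the 50% threshold. That shows how a modest per-task threshold can cover most of a workforce while a stricter one covers far fewer people.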
But not all skills will be impacted in the same way. For example, the team says the skills associated with science and critical thinking will be less influenced, while writing and programming skills will be more so.
The team also analyzed the impact by industry. “We discover that information processing industries exhibit high exposure, while manufacturing, agriculture, and mining demonstrate lower exposure,” they say.
Game Changer
The team predicts that some jobs will not be influenced by ChatGPT3.5 at all. They include tire repairers and changers; motorcycle mechanics; short-order cooks; and cement masons and concrete finishers.
In general, jobs that have a higher barrier to entry for humans are more exposed to ChatGPT. These are the jobs that require the highest levels of education, experience and training.
But these conclusions come with an important caveat. A significant difficulty lies in knowing what kind of impact ChatGPT3.5 could have on any given task. The team acknowledges this, along with the fact that the results are entirely subjective.
Neither is it clear whether ChatGPT3.5 will replace human activity or enhance it. This is an important distinction for predicting future employment trends.
Nevertheless, ChatGPT3.5 and other similar AI systems are here to stay and likely to get better. They are also likely to have a pervasive influence throughout society and lead to a wide range of other innovations. For that reason, Eloundou and co conclude that large language models are a “general purpose technology” like electricity or information technology.
There’s not much argument about the impact that electricity and information technology have had on civilization and how crucial they have become for everyday life.
But just how large language models will influence society in the coming months, years and decades is perhaps one of the most important questions we face.
Ref: GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models: arxiv.org/abs/2303.10130