THE FIRST VIEW:

Early lessons in GenAI

Published: April 12, 2024

My first observation of ChatGPT was that a machine had passed the Turing Test. At first pass, that was probably a fair observation, but now I’d say it is possible to tell the difference between a human and GenAI. For one, GenAI is overly verbose; unless you are having a conversation with Charles Dickens, it says much more than a human would in response to any question.

This is partly down to how the output was initially assessed, when users preferred longer responses, but it has exposed a frailty. Humans approach interactions as a conversation: they are able to establish context, handle follow-up questions and ask questions themselves. GenAI isn’t particularly good at this, at least not for now.

Other issues have shown up in live applications, many of them due to its ‘general’ capabilities and a lack of defined context. For example, a chatbot powered by ChatGPT and deployed by a Chevrolet dealer got itself into a spot of bother that went viral: one user was able to get it to agree to sell a new car for $1. The deal was not legally binding.

New York City established an LLM-based advisor on state legal matters, designed to help local businesses comply with the law. The problem was that LLMs are not a good choice for this type of interaction; the app repeatedly left out crucial elements of legal advice and misinterpreted laws, leading to incorrect outcomes. LLMs are probabilistic, pattern-based systems that produce things that look like a good answer. They are not databases and, at a minimum, they need a lot of careful productisation before you can use them for something like this.

Hence, the NYC chatbot gives a lot of answers that sound right, but get the law wrong.

I have heard of several initiatives in finance around regulatory compliance that would fall foul of the same problem. A highly regulated industry like finance is right to be cautious of new technology until we work out how to use it effectively.

These are all just teething issues for a new technology. I still think the development of LLMs is a major step forward in computing, but it’s early days and the use cases so far are pretty weak. That is not surprising: we tend to imagine new technology in the context of the existing world, so it takes a while for new, differentiated uses to emerge. It reminds me of the early days of the internet: something very interesting has been born, but it needs to grow up before it becomes truly useful.

David Collins, Managing Director - First Derivative

David Collins
CEO
