Much has been made of how Generative AI might replace the need for software developers. That day’s not here yet, but already, shifts are underway in the software industry. One such shift is occurring for companies that adopt new AI technology in their products.
Before Generative AI and Large Language Models (LLMs) exploded onto the scene in the past year, most product development teams were used to a fair amount of certainty. You wrote code, you tested it to ensure it did what you wanted, and then you iterated. Reasonably straightforward.
Now that’s changed. Countless product teams are weaving LLM functionality into what they do, promising leaps forward in fields like customer experience, law, and finance. Increasingly, AI is popping up in the products we all use for work and play. It’s hugely promising, not only because of the compelling outputs for consumers, but also because of the extremely quick and low-cost deployment that’s now possible for software companies. Teams no longer have to embark on their own highly expensive machine learning initiatives; they can leverage powerful models from companies like OpenAI to get the work done.
But there’s a challenge underlying these advances. Use of LLMs makes product development much less predictable for the builders of those products, because a model doesn’t necessarily say the same thing twice, even when given the same prompt. Consider:
- How do you manage a prompt? How do you record and monitor what it actually did?
- How do you simulate what prompts will do in a product across different use cases?
- How do you (as a human) know what actually occurred?
- How do you judge whether the LLM passed or failed a test, or whether it’s still working?
- How do you choose among LLMs from distinct providers like OpenAI, Google, Anthropic, and more?
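To make a couple of these questions concrete, here is a minimal Python sketch of recording each prompt call and judging a non-deterministic output by its pass rate over repeated trials rather than by a single run. Everything in it is an illustrative assumption, not any vendor's API or Freeplay's product: `fake_llm` is a hypothetical stand-in for a real model call, and `CallRecord`, `PromptLog`, `mentions_refund`, and `run_trials` are made-up names.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CallRecord:
    """One recorded model call: what was asked, what came back, and the verdict."""
    prompt_version: str
    prompt: str
    output: str
    passed: bool


@dataclass
class PromptLog:
    """In-memory record of every call, so a human can inspect what occurred."""
    records: List[CallRecord] = field(default_factory=list)

    def pass_rate(self, prompt_version: str) -> float:
        # Fraction of recorded runs for this prompt version that passed the check.
        runs = [r for r in self.records if r.prompt_version == prompt_version]
        if not runs:
            return 0.0
        return sum(r.passed for r in runs) / len(runs)


def fake_llm(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a model call; it varies its wording on purpose."""
    return rng.choice([
        "Your refund was issued today.",
        "We have issued your refund.",
        "I cannot help with that.",
    ])


def mentions_refund(output: str) -> bool:
    """Illustrative pass/fail check: the answer must mention a refund."""
    return "refund" in output.lower()


def run_trials(
    prompt_version: str,
    prompt: str,
    model: Callable[[str], str],
    check: Callable[[str], bool],
    log: PromptLog,
    trials: int,
) -> float:
    """Call the model several times, record every call, and return the pass rate."""
    for _ in range(trials):
        output = model(prompt)
        log.records.append(CallRecord(prompt_version, prompt, output, check(output)))
    return log.pass_rate(prompt_version)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so the demo is repeatable
    log = PromptLog()
    rate = run_trials(
        "refund-v1",
        "Tell the customer the status of their refund.",
        lambda p: fake_llm(p, rng),
        mentions_refund,
        log,
        trials=20,
    )
    print(f"refund-v1 pass rate: {rate:.0%} over {len(log.records)} recorded calls")
```

The design choice worth noting is that "passed" becomes a rate with a threshold the team picks, not a single yes or no, since one run of an LLM says little about the next.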
These changes require new management systems and tools for creating software. And that need, in turn, is spawning companies to build them.
Freeplay.ai is one such firm. Based near Denver, Colorado, the company was founded in the past year by two former product and engineering leaders at Twitter. Ian Cairns, Freeplay’s CEO, explains the opportunity this way: “We saw there was going to be a generational shift in how people were building software. Prompt engineering can feel like a dark art where you’re just coaxing the computer to do what you ask, and it surprisingly disobeys at times. We saw that there were going to be different needs around how you experiment, test, and monitor these systems.”
Freeplay.ai is an example of a firm developing tools to manage the murky world of LLMs.
That shift is already underway. Cairns described a range of companies that are adopting development tools and practices previously unfamiliar to them. “An order of magnitude more teams are building with AI technology today than 9 months ago. Many don’t have experience with machine learning best practices. They’re looking for help to find the right workflows, not just tools.”
So, how do you create systems and tools to manage these challenges in a world that’s as fast-moving as LLMs? For Cairns, it starts with humility and inspiration from long ago. “We couldn’t make too many assumptions about exactly what workflows and tooling would look like for customers in a year or two. So we looked back to early UNIX commands, from the 1970s. They were really atomic, and that enabled them to talk to each other and for people to use them together in new ways. Using that approach, as things change we can re-arrange foundational tools in new ways to meet how the market moves.”
Work processes may also need to adapt for product developers. Human labeling of data can be essential to machine learning, and it matters for reliable testing and evaluation of LLMs. System outputs need to be monitored to ensure they produce appropriate results. AI models may drift and decay over time as changes are made, and those changes are outside your control if you rely on a third party like OpenAI. Software development and maintenance become an always-on endeavor.
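As a rough illustration of what that ongoing monitoring could look like, the Python sketch below compares the pass rate of recent human-labeled (or automatically checked) outputs against a baseline and flags a drop larger than a chosen tolerance. The function names and the 10-point default tolerance are assumptions made up for this example; a real team would pick its own checks and thresholds.

```python
from typing import Sequence


def pass_rate(results: Sequence[bool]) -> float:
    """Fraction of evaluated outputs that were judged acceptable."""
    if not results:
        raise ValueError("need at least one evaluated output")
    return sum(results) / len(results)


def has_drifted(
    baseline: Sequence[bool],
    recent: Sequence[bool],
    tolerance: float = 0.10,  # assumed threshold, chosen only for illustration
) -> bool:
    """Flag drift when the recent pass rate falls more than `tolerance` below baseline."""
    return pass_rate(baseline) - pass_rate(recent) > tolerance


if __name__ == "__main__":
    # Hypothetical labels: True means a reviewer judged the output appropriate.
    last_month = [True] * 19 + [False]
    this_week = [True] * 14 + [False] * 6
    if has_drifted(last_month, this_week):
        print("Pass rate dropped beyond tolerance; review recent outputs.")
    else:
        print("Pass rate is within tolerance of the baseline.")
```

A check like this says only that something changed, not why; the cause could be a prompt edit, a shift in user inputs, or a model update on the provider's side.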
Even as LLMs make it far easier and faster to execute certain types of AI interactions, they create demands for new tools and approaches. Software is always an exciting industry, but these may be some of its most thrilling times.