They say writers fear a blank page. I harbour another fear – the blinking prompt on a command line, while a computer patiently waits for my input.
I want Auto-GPT to:
the machine asks. Almost innocently. As if it weren’t poised to submit my request to the massive bank of computers running OpenAI’s GPT-4 large language model (LLM), the brains behind the brightest and most capable version of ChatGPT.
This is the hard bit. What do I want? I don’t want trivia, nor something that’s easy. I want something that’s hard enough and interesting enough and useful enough that it justifies all of the computer time – and dollars – that will be invested in finding a solution.
Because this computer prompt isn’t like any other I’ve typed into over the 45 years I’ve been interacting with computers on a daily basis. This isn’t a simple request for a computer to print a document or send an email. This is the genesis of a system that will do whatever it takes to solve my problem. As soon as I can decide what that problem is.
Suddenly I hit upon it: This is Cosmos, after all – so let’s put that to the computer. I type,
Create a daily digest of the latest science news and scientific discoveries.
After I hit return, the computer thinks for a solid 15 seconds (I can almost imagine banks of blinking lights in some far-away data centre), then responds with, well, let’s call it what it is: a mission statement and a plan…
Name: SciDigestGPT
Role: a cutting-edge AI news curator that compiles and delivers a daily digest of the latest science news and groundbreaking discoveries from reputable sources, tailored to your interests.
Goals:
– Monitor and analyze a wide range of reputable science news sources to identify the most relevant and impactful stories for your daily digest.
– Summarize and present the selected news articles in a concise, engaging, and easily digestible format.
– Continuously refine and adapt the content selection process based on your preferences and feedback to ensure the daily digest remains relevant and engaging.
– Stay up-to-date with the latest advancements in science and technology to provide accurate and timely information.
– Maintain a user-friendly interface and seamless delivery process to ensure a positive and convenient user experience.
The computer has translated my request into a ‘role’ (what it plans to do) and ‘goals’ (how it will grade its progress in fulfilling that role). Then it sets to work.
THOUGHTS: I need to gather the latest science news for the daily digest.
REASONING: As a science news curator, my primary task is to collect and summarize relevant news articles.
PLAN:
– Browse reputable science news websites
– Identify and summarize impactful stories
– Refine content selection based on user preferences
CRITICISM: I should ensure that I am using the most efficient methods to gather and summarize news articles.
NEXT ACTION: COMMAND = browse_website ARGUMENTS = {‘url’: ‘https://www.sciencedaily.com/’, ‘question’: ‘What are the latest science news headlines?’}
This is something I’ve never seen a computer do before. It’s breaking its goals down into distinct steps, and working out a path to the solution of each step. At the end of each step there’s some action – the computer needs to go out and read data off a website, or feed that data into GPT-4 to summarise it, or write those summaries to my hard drive. Each time, the computer asks me for permission to perform these operations – because I’ve asked it to do so.
I could have asked the computer to do all of this on its own, without my permission – entirely autonomously. In that case it would simply plough through tasks, one after another, until it satisfied its goals. I could monitor its activity – and intervene, if I felt the need – but the computer really needs nothing more from me. All it needs is a task. After that it operates entirely on its own.
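For readers who like to see the moving parts, the plan-ask-act cycle described above can be sketched in a few lines of Python. This is an illustration only, not Auto-GPT's actual code: `Step`, `run_agent`, `scripted_planner` and the fake commands are hypothetical names, the `write_to_file` command name is an assumption, and the planner is a canned script standing in for a call to GPT-4.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One planned step: the agent's stated thought plus the command it wants to run."""
    thought: str
    command: str
    arguments: Dict[str, str] = field(default_factory=dict)


def run_agent(
    plan_next_step: Callable[[List[str]], Optional[Step]],
    commands: Dict[str, Callable[..., str]],
    approve: Callable[[Step], bool],
    continuous: bool = False,
    max_steps: int = 10,
) -> List[str]:
    """Plan, ask permission, act and record the result, until the planner stops."""
    history: List[str] = []
    for _ in range(max_steps):
        step = plan_next_step(history)
        if step is None:  # the planner considers its goals satisfied
            break
        # In continuous mode the human is never asked; otherwise every action needs a yes.
        if not continuous and not approve(step):
            history.append(f"{step.command}: denied by user")
            break
        result = commands[step.command](**step.arguments)
        history.append(f"{step.command}: {result}")
    return history


# Hypothetical stand-ins so the sketch runs without a network connection or API key.
def fake_browse(url: str, question: str) -> str:
    return f"headlines from {url}"


def fake_write(filename: str, text: str) -> str:
    return f"wrote {len(text)} characters to {filename}"


SCRIPT = [
    Step("Gather the latest science news.", "browse_website",
         {"url": "https://www.sciencedaily.com/",
          "question": "What are the latest science news headlines?"}),
    Step("Save the digest.", "write_to_file",
         {"filename": "digest.txt", "text": "1. ..."}),
]


def scripted_planner(history: List[str]) -> Optional[Step]:
    """Return the next canned step; a real planner would ask the language model."""
    return SCRIPT[len(history)] if len(history) < len(SCRIPT) else None


COMMANDS = {"browse_website": fake_browse, "write_to_file": fake_write}

if __name__ == "__main__":
    for line in run_agent(scripted_planner, COMMANDS, approve=lambda step: True):
        print(line)
```

Passing `continuous=True` skips the `approve` callback entirely, which corresponds to the fully autonomous mode; the `max_steps` cap is an added safety assumption so that a runaway planner cannot loop forever.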
This is Auto-GPT, an open source project which converts these AI tools into the equivalent of an electric motor – capable of powering their way toward a solution of almost any conceivable task.
Auto-GPT does not deliver the back-and-forth banter of ChatGPT or Google Bard or Microsoft Bing. While all three are interesting and useful, they’re somewhat sterile, walled away within the plastic bubble of a web browser, able to converse but not much more. Instead, Auto-GPT gives GPT-4 deep links into my computer – which has loads of other programs (including, for example, email and messaging and word processing), connections to other systems (via the Web) and other devices (such as webcams and microphones and smart lights).
Instantly, the scope of what’s possible with GPT-4 changes completely. It’s no longer about getting just the right wording on an application for a job or share house; instead it’s about using the tools at hand (both on my computer and across the Internet) to fashion a workflow of tools and data and intelligence to perform a task.
Over the course of the next 15 minutes, Auto-GPT learns that it can find what it’s looking for on the website for Science News (bit of a miss that it didn’t go for Cosmos, which had published eight of the ten articles), goes and scrapes the site for headlines and article content, feeds all of that into GPT-4, then creates summaries of the top ten news articles, leaving me with this file on my hard drive:
1. Rapid ice melting in Greenland and its impact on sea level rise estimation.
2. Conscious-like activity found in the dying brain.
3. Previously unknown intercellular electricity possibly powering biology.
4. Detection of a nearby black hole devouring a star.
5. Creation of a human pangenome reference for understanding genomic diversity.
6. Inheritance of a nose shape gene from Neanderthals.
7. Novel ultrasound technique using microbubbles to treat glioblastoma.
8. Saturn’s rings being younger than previously thought.
9. The largest cosmic explosion ever seen.
10. A new approach to explore the earliest universe dynamics with gravitational waves.
Other research topics discussed include leaky-wave metasurfaces, gene regulation mapping, supermassive stars and globular clusters, infrared atlas of stellar nurseries, the universe’s expansion rate, 3D-printed robot heads, symmetric graphene quantum dots, AI’s influence on trust in human interaction, and many-legged robots inspired by centipedes.
Its work done, Auto-GPT finishes execution and leaves me back at the computer prompt.
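The workflow Auto-GPT settled on (collect articles, summarise each one, write a numbered file) could be sketched roughly as follows. This is a hedged illustration under loud assumptions: `summarise` is a hypothetical placeholder that simply keeps the first sentence, where Auto-GPT handed the text to GPT-4, and the sample articles are dummy text rather than real headlines.

```python
from pathlib import Path
from typing import List, Tuple


def summarise(text: str) -> str:
    """Placeholder summariser: keep only the first sentence.

    Assumption: a real run would send the text to a language model instead.
    """
    first, _, _ = text.strip().partition(". ")
    return first.rstrip(".") + "."


def build_digest(articles: List[Tuple[str, str]], limit: int = 10) -> str:
    """Turn up to `limit` (headline, body) pairs into a numbered digest."""
    lines = [
        f"{number}. {headline}: {summarise(body)}"
        for number, (headline, body) in enumerate(articles[:limit], start=1)
    ]
    return "\n".join(lines) + "\n"


def write_digest(articles: List[Tuple[str, str]], path: Path) -> Path:
    """Write the digest to disk, the last step of the workflow."""
    path.write_text(build_digest(articles), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Dummy articles for illustration only.
    sample = [
        ("Headline A", "First point. More detail."),
        ("Headline B", "Second point. More detail."),
    ]
    print(build_digest(sample), end="")
```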
I could have done all of this: scraped a website, and either written my own summaries or had them written for me by a tool like ChatGPT. All of this was well within the scope of my capacities. But that’s not the point. Auto-GPT got from a request to a completed task by “reasoning”, establishing a role and goals, then breaking those down into step-by-step operations that methodically moved toward fulfilling its goals.
Did it get through all of its goals? No. It clearly fulfilled the first two, and somehow managed to avoid the rest. But this is only the first time I’ve run this newly-created ‘SciDigestGPT’ tool. The next time I run it, it may move toward fulfilling other goals. But it’s early days yet, and I’m not surprised that Auto-GPT bit off a bit more than it could chew. Don’t we all when we’re learning how to do something?
What can Auto-GPT do? It may serve us better to ask what lies beyond it: nearly anything that involves a lot of real-time integration with the real world (such as driving a car in traffic or flying a drone over a crowd) will – for the moment – be beyond its grasp. But the open source nature of Auto-GPT has encouraged a legion of programmers to improve the tool, adding functionality that will give it many of the capabilities it doesn’t yet possess.
Although I’ve used the most powerful PC I own for this test drive of Auto-GPT, it turns out that wasn’t necessary, for Auto-GPT needs little more than a connection to GPT-4 in the cloud to operate. We should expect some version of Auto-GPT on our smartphones sooner rather than later.
So where does this leave us? The explosive popularity of ChatGPT has inadvertently hidden its fundamental utility as a universal problem-solver – something far more useful, in far more use cases, than what you can squeeze into a conversation in a web browser. Just as electric motors went into everything as mains electricity went everywhere, we are about to see large language models in everything – not just in our computers, but in nearly every tool we touch. These tools will run their own versions of Auto-GPT, asking us what needs to be done, then working out for themselves how to do it. That’s what’s new here.
We’re all now sitting before a blinking prompt, as the computer waits patiently for its next problem to solve.