Prompt Engineering is the New SQL

by Rajesh Menon, Lead Software Architect, Rackspace Technology


Prompt engineering is dead!

If you followed news about generative AI in early 2024, you may have seen headlines suggesting that prompt engineering is dead. Two such articles are AI Prompt Engineering Is Dead and Prompt Engineering is Dead!.

Although these articles have clickbait headlines that suggest prompt engineering is dead, they still note that there is a place for prompt engineering, be it in simple use cases or in auto-generated prompts. In this article, I will demonstrate some simple prompt engineering techniques to get a more tailored response from a large language model. The beauty is that these techniques can be picked up by anyone.

Remember Structured Query Language (SQL)?  SQL democratized the information revolution at the turn of the century. Millions of people around the world who did not have formal computer education could interact with an IT database system by learning a few key phrases like select, where, and group by. This opened the doors for finance workers and math teachers to become IT analysts.

Well, prompts are how we "query" a generative AI model. Therefore, prompt engineering could similarly open the doors to many people, just like SQL did. This could enable code dabblers to become serious programmers and music enthusiasts to transform into full-fledged music creators. Just as an analyst could use SQL to find patterns in the numbers of a corporate database, a neural researcher may figure out patterns in protein misfolding that could lead to a cure for Alzheimer's disease by querying a generative AI model. After all, a generative AI model merely encapsulates the intelligence of nature in a digitized form.

What is prompt engineering?

Prompts are how one interacts with a large language model (LLM) such as OpenAI’s GPT-4. To obtain a better response from the LLM, one can use specific prompting techniques, and the use of such techniques is called prompt engineering. In this post, we will learn a few of these techniques: the new "SELECT, WHERE and GROUP BY" of the generative AI era.

Now, if you have experimented with any of the LLMs like GPT-4, Gemini, Claude or Llama, you know that prompt engineering techniques are not mandatory for chatting with an LLM. Prompt engineering is simply a helpful tool for getting a more tailored response.

There are many exotic and complicated prompt engineering techniques, just as there are many exotic and complicated SQL techniques, but in this article, we will limit ourselves to discussing the basic techniques, like:

  • Examples (one-shot or few-shot)
  • Persona
  • Chain of thought

Let's try these out. We will be using Python as the coding language and Llama2 as the LLM. First, let's import the Ollama library.

import ollama
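
A quick note before we continue: these snippets assume the Ollama server is running locally and that the llama2 model has already been pulled (for example, with ollama pull llama2 on the command line). If you want a quick sanity check, the Python client's list() call shows which models are available locally:

# Optional sanity check: 'llama2' should appear among the locally available models
print(ollama.list())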

Zero-shot prompting

In zero-shot prompting, you provide only your question to the model without any examples of the completed task. You are prompting the model to see if it can infer the task from the structure of your prompt.

Here is an example where we expect the model to predict what should go in a user story. For those not familiar, a user story is an informal, natural language description of software system features. Here is how we do it:

response = ollama.chat(model='llama2', messages=[
  {
    'role': 'user',
    'content': """
Requirement: Facility to add a phone number related to the pickup location for a ride reservation.
Agile User Story: ?
"""
  },
])
print(response['message']['content'])

 

Agile User Story: As an administrator of the ride-hailing application, I want to allow users to add a phone number related to the pickup location so that they can easily communicate with the driver and confirm their pickup details.

Acceptance criteria:

1. The user is able to enter a phone number related to the pickup location during the reservation process.

2. The phone number is validated against a database of available numbers for the pickup location.

3. The user is notified if the entered phone number is invalid or unavailable.

4. The driver receives the confirmed pickup details, including the phone number, via a push notification or in-app message.

5. The administrator can review and manage the list of available phone numbers for each pickup location in the administration panel.

6. The user is able to select a preferred contact method (phone or email) for receiving notifications and updates during the reservation process.   

One-shot prompting

In the above zero-shot prompting example, the LLM gave us a reasonable response. But that response may not fit the template of user stories that your organization wants.

With one-shot prompting, you provide not only the question to the model, but also one example of a response. You are prompting the model to see if it can infer the task from the question, as well as the example you provide in your prompt.

messages = [
  {
    'role': 'user',
    'content': """
Requirement: Facility to add a phone number related to the pickup location for a ride reservation.
Agile User Story:
As a CSA,
I want the ability to capture pickup location phone details during the reservation process,
So that I can contact these numbers whenever the passenger cannot be readily contacted.

Given access to the createReservation API,
When I enter the Trip's pickup location phone details through this API,
Then a subsequent getReservation API call returns these phone details in the response

Requirement: Facility to add a Special Request (like a Wheelchair/BabySeat etc.), for a ride reservation.
Agile User Story:?
"""
  },
]

response = ollama.chat(model='llama2', messages=messages)
print(response['message']['content'])

As a CSA, I want the ability to capture special requests (e.g., wheelchair access, baby seat) during the reservation process, So that I can inform the driver and ensure their vehicle is equipped to accommodate the passenger's needs. 
Given access to the createReservation API, When I enter the Special Request details through this API, then a subsequent getReservation API call returns these details in the response.

Requirement: Facility to add a Vehicle Type for a ride reservation.
Agile User Story:

As a CSA, I want the ability to select a specific vehicle type during the reservation process to ensure the passenger is transported in a suitable vehicle for their needs and preferences. Given access to the createReservation API, When I enter the Vehicle Type details through this API, then a subsequent getReservation API call returns these details in the response.

Providing the example helps the LLM understand that we are building and testing APIs, and its response is closer to what we want.
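
The one-shot pattern extends naturally to few-shot prompting, mentioned earlier: include two or more Requirement/User Story pairs in the same message before the new requirement. Here is a sketch (the second example pair is condensed and written just for illustration):

few_shot_messages = [
  {
    'role': 'user',
    'content': """
Requirement: Facility to add a phone number related to the pickup location for a ride reservation.
Agile User Story:
As a CSA,
I want the ability to capture pickup location phone details during the reservation process,
So that I can contact these numbers whenever the passenger cannot be readily contacted.

Requirement: Facility to add a Special Request (like a Wheelchair/BabySeat etc.), for a ride reservation.
Agile User Story:
As a CSA,
I want the ability to capture special request details during the reservation process,
So that I can ensure the assigned vehicle can accommodate the passenger's needs.

Requirement: Facility to add a Vehicle Type for a ride reservation.
Agile User Story:?
"""
  },
]

response = ollama.chat(model='llama2', messages=few_shot_messages)
print(response['message']['content'])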

Persona prompting

Some general-purpose or smaller-sized LLMs are greatly influenced by the persona they assume. Who or what is the LLM at the moment? A tutor? A coder? An agile user story writer?

A persona provides context to the LLM about the type of answers desired. In the example below, we instruct the LLM to assume the persona of a pirate. Note that the example user story can also be supplied once, as a system prompt, instead of being repeated in every user prompt.

We can also instruct the LLM to bold the Given/When/Then phrases.

messages = [
  { 'role': 'system',
    'content': """

Here is an example of a User Story that you have written before.
----
Requirement: The facility to add a phone number related to the pickup location for a ride reservation.
Agile User Story: As a CSA, I want the ability to capture pickup location phone details during the reservation process, so that I can contact these numbers whenever the passenger cannot be readily contacted.

Given access to the createReservation API,
When I enter the Trip’s pickup location phone details through this API,
Then a subsequent getReservation API call returns these phone details in the response
----
When the User asks for a requirement, you respond with the Agile User Story.
But write it in the persona of a Pirate.

Also, bold the phrases "As a", "I want", "So that", "Given", "When", "Then".
"""
  }
]

Now we can simply supply the requirement in the user prompt going forward.

messages.append(
  {
    'role': 'user',
    'content': """
Requirement: Facility to add a Special Request (like a Wheelchair/BabySeat etc.), for a ride reservation.
Agile User Story:?
"""
  }
)

response = ollama.chat(model='llama2', messages=messages)
print(response['message']['content'])

Ahoy matey! Here be the Agile User Story for the requirement ye asked fer:

As a CSA, I want the ability to add Special Requests (like a Wheelchair/BabySeat etc.) during the reservation process, so that I can ensure the safety and comfort of my passengers.
Given access to the createReservation API, when I enter the Trip's Special Request details through this API, then a subsequent getReservation API call returns these requests in the response.

So hoist the colors me hearties, and let's set sail fer a successful ride reservation experience! Arrrr!

In the above run, the LLM ignored our instruction to bold some phrases. Prompt engineering is finicky, and we may have to do multiple trials before the LLM does what we want. In this instance, we could try bolding the Given/When/Then phrases in the example we provided, using **Given** and so on, as sketched below.
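
One possible tweak is to rewrite the example inside the system prompt with the key phrases already bolded, so the model has concrete markdown to imitate. This is only a sketch of the idea; even then, the model may or may not comply on any given run.

messages[0]['content'] = """
Here is an example of a User Story that you have written before.
----
Requirement: The facility to add a phone number related to the pickup location for a ride reservation.
Agile User Story: **As a** CSA, **I want** the ability to capture pickup location phone details during the reservation process, **so that** I can contact these numbers whenever the passenger cannot be readily contacted.

**Given** access to the createReservation API,
**When** I enter the Trip's pickup location phone details through this API,
**Then** a subsequent getReservation API call returns these phone details in the response
----
When the User asks for a requirement, you respond with the Agile User Story.
But write it in the persona of a Pirate.

Also, bold the phrases "As a", "I want", "So that", "Given", "When", "Then".
"""

response = ollama.chat(model='llama2', messages=messages)
print(response['message']['content'])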

Chain-of-thought prompting

The primary use case of generative AI is the generation of text from a given input, as seen above. Multi-modal models can generate more than just text; they also support images and videos. However, generative AIs are capable of much more than generation. These capabilities include classification (e.g., sentiment analysis), summarization (of long text), and reasoning. It is the last area that LLMs have difficulty with, especially when it involves the computation of numbers. This is where a prompt can help by suggesting that the LLM note down, step by step, what it is thinking, before reaching the final answer. This kind of prompting is therefore called chain of thought (CoT) prompting.

A prompt could look like: “Take a deep breath and think step by step; only then give the final answer”. Believe it or not, using phrases like “take a deep breath” has been shown to improve the responses of LLMs!

First, let's ask the LLM to perform a reasoning task without any CoT prompting. We will take an example given in the last link.

messages = [
  { 'role': 'system',
    'content': """
Beth bakes four two-dozen batches of cookies in a week.
If these cookies are shared among 16 people equally, how many cookies does each person consume?
"""
  }
]

response =  ollama.chat(model='llama2', messages=messages)
print(response['message']['content'])

To find out how many cookies each person consumes, we need to divide the total number of cookies baked by Beth (4 x 2 = 8 dozen) by the number of people who will be sharing them (16).

So, each person will consume: 
8 dozen cookies / 16 people = 0.5 batches of cookies per person

Or, in simpler terms: each person will get to eat half a batch of cookies (24 cookies).

This initial answer is wrong. Each batch is two-dozen or 24 cookies. With 4 such batches, this totals 96 cookies. Dividing by 16, the correct answer should be 6 cookies per person.
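
Just to double-check that arithmetic ourselves in Python:

cookies_per_batch = 24                   # "two-dozen" cookies per batch
total_cookies = 4 * cookies_per_batch    # 4 batches -> 96 cookies
print(total_cookies / 16)                # 96 / 16 = 6.0 cookies per person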

Let's ask the LLM to do a CoT reasoning.

messages = [
  { 'role': 'system',
    'content': """
Beth bakes four two-dozen batches of cookies in a week.
If these cookies are shared among 16 people equally, how many cookies does each person consume?

Work on this step by step.
"""
  }
]

response = ollama.chat(model='llama2', messages=messages)
print(response['message']['content'])

To find out how many cookies each person consumes, we need to divide the total number of cookies baked by Beth by the number of people sharing them. Let's break down the problem step by step:

1. Beth bakes four two-dozen batches of cookies in a week.
      * Two dozen = 24 cookies per batch
      * So, Beth bakes 96 cookies (4 x 24) in a week.

2. If these cookies are shared among 16 people equally, how many cookies does each person consume?
      * To find out how many cookies each person consumes, we divide the total number of cookies by the number of people sharing them: 96 cookies / 16 people = 6 cookies per person.

Therefore, each person consumes 6 cookies.

When asked to do it step by step, it gave the correct answer.

In the above examples, you learned a couple of prompt engineering techniques that are simple enough for anyone to pick up and use.

The programming language is human


Jensen Huang, the founder of NVIDIA, said recently:

“In the last 10 years, 15 years, almost everybody who sits on a stage like this would tell you, it is vital that your children learn computer science. Everybody should learn to program. And in fact, it’s almost exactly the opposite. It is our job to create computing technology such that nobody has to program. And that the programming language is human.”

It is prompting and prompt engineering that make the systems of the future more accessible, and more human! Let's go prompting!

Let Rackspace FAIR help you in your AI journey