Prompt Engineering with Llama 2 & 3

youngerjesus 2024. 6. 19. 18:03

This post summarizes my notes from the Prompt Engineering with Llama 2 & 3 course.

Outline:

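Learn about the Llama family of models.
Learn about the kinds of tasks you can do with Llama.
Learn prompt engineering techniques such as few-shot prompting, in-context learning, and chain-of-thought (CoT).
Learn about Llama Guard, which helps keep an LLM from producing harmful content.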

1. Overview of Llama Models

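This section introduces the different Llama models.

Llama 2 comes in two kinds of models: base models and instruction-tuned (chat) models.

An instruction-tuned model is a base model that has gone through instruction fine-tuning so that it follows human instructions better. It is the one to use for requests such as "summarize this" or "tell me a joke".

The 7B, 13B, and 70B next to a model name are parameter counts. More parameters generally mean better performance, especially on complex tasks, at the cost of more compute resources.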
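
To get a feel for why parameter count drives resource usage, here is a rough back-of-the-envelope sketch. It assumes 2 bytes per parameter (fp16/bf16 weights) and ignores activations and the KV cache, so treat the numbers as a lower bound. weight_memory_gb is a made-up helper name, not something from the course.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: 2 bytes per parameter (fp16/bf16); activations and KV cache are ignored.
BYTES_PER_PARAM_FP16 = 2


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Return the approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
    print(f"Llama 2 {name}: ~{weight_memory_gb(params):.0f} GB of weights in fp16")
```

Under these assumptions the 70B model needs about ten times the weight memory of the 7B model, which is why the larger models are harder to run on a single machine.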

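Llama 2 is described as performing at roughly the level of GPT-3.5, and its biggest advantage is that you can easily download it and run it yourself.

There is also Code Llama, a Llama model for code generation.

Llama also provides Llama Guard, a model trained to filter out inappropriate or harmful LLM output.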

2. Getting started with Llama 2 & 3

Using a Llama model is as simple as this.

# import llama helper function
from utils import llama

# define the prompt
prompt = "Help me write a birthday card for my dear friend Andrew."

# pass prompt to the llama function, store output as 'response' then print
response = llama(prompt)
print(response)

Output:

  Of course, I'd be happy to help you write a birthday card for your dear friend Andrew! Here are a few suggestions:

1. Personalized Message: Start by writing a personalized message that speaks to your friendship with Andrew. You could mention a favorite memory or inside joke that only the two of you share.

Example:

"Happy birthday to my favorite friend, Andrew! I can't believe it's been [X] years since we met. You've been there for me through thick and thin, and I'm so grateful for your friendship. Here's to another year of adventures and good times together! 🎉"

2. Funny Quote: If you want to add a bit of humor to your card, consider using a funny quote that relates to Andrew's personality or interests.

Example:

"Happy birthday to the most awesome Andrew in the world! May your day be as epic as your beard and your love for [insert hobby or interest here] 😂"

3. Heartfelt Words: If you want to express your feelings in a more heartfelt way, try writing a message that speaks to the importance of Andrew in your life.

Example:

"Andrew, you're more than just a friend to me. You're a constant source of support, laughter, and inspiration. I'm so grateful to have you in my life, and I can't wait to see what the next year brings for you. Happy birthday, my dear friend! ❤️"

4. Inside Joke: If you and Andrew share a special inside joke or reference, consider including it in your card. It will make the message more personal and meaningful to him.

Example:

"Happy birthday to the one and only Andrew! May your day be as awesome as the time we [insert inside joke here] 😂"

Remember, the most important thing is to be sincere and genuine in your message. Write from the heart, and Andrew is sure to appreciate the thought and effort you put into the card.

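There are three main ways to use a Llama model:

Hosted API service: call the model through an API that a provider hosts for you. The example above works this way (the model name togethercomputer/llama-2-7b-chat in the verbose output below suggests the helper goes through Together.ai rather than a Meta-run API).
Self-configured cloud: deploy a Llama model on a cloud service yourself.
Host on your computer: download Llama and run it on your own machine.

When you give a Llama 2 chat model an instruction, wrap it in [INST] and [/INST] tags so that the model recognizes it as an instruction.

If you debug the prompt with the verbose option, you can see that the helper adds the tags for you.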

# Set verbose to True to see the full prompt that is passed to the model.
prompt = "Help me write a birthday card for my dear friend Andrew."
response = llama(prompt, verbose=True)

Output:

Prompt:
[INST]Help me write a birthday card for my dear friend Andrew.[/INST]

model: togethercomputer/llama-2-7b-chat
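
The tag wrapping visible in the verbose output can be sketched with a tiny helper. wrap_inst is a hypothetical name, not part of the course's utils module; it only mirrors the format printed above.

```python
def wrap_inst(prompt: str) -> str:
    """Wrap a single instruction in Llama 2 chat tags (hypothetical helper)."""
    return f"[INST]{prompt}[/INST]"


print(wrap_inst("Help me write a birthday card for my dear friend Andrew."))
# prints: [INST]Help me write a birthday card for my dear friend Andrew.[/INST]
```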

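If you send the prompt without the [INST] tags, the result looks like this. (A chat model usually still answers reasonably without the tags, but a foundation (base) model simply continues the text, as shown here.)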

### base model
prompt = "What is the capital of France?"
response = llama(prompt, 
                 verbose=True,
                 add_inst=False,
                 model="togethercomputer/llama-2-7b")
print(response)

Output:

10. What is the capital of Germany?
11. What is the capital of Greece?
12. What is the capital of Hungary?
13. What is the capital of Iceland?
14. What is the capital of India?
15. What is the capital of Indonesia?
16. What is the capital of Iran?
17. What is the capital of Iraq?
18. What is the capital of Ireland?
19. What is the capital of Israel?
20. What is the capital of Italy?
21. What is the capital of Japan?
22. What is the capital of Jordan?
23. What is the capital of Kazakhstan?
24. What is the capital of Kenya?
25. What is the capital of Kuwait?
26. What is the capital of Kyrgyzstan?
27. What is the capital of Laos?
28. What is the capital of Latvia?
29. What is the capital of Lebanon?
30. What is the capital of Lesotho?
31. What is the capital of Liberia?
32. What is the capital of Libya?
33. What is the capital of Liechtenstein?
34. What is the capital of Lithuania?
35. What is the capital of Luxembourg?
36. What is the capital of Macedonia?
37. What is the capital of Madagascar?
38. What is the capital of Malawi?
39. What is the capital of Malaysia?
40. What is the capital of Maldives?
41. What is the capital of Mali?
42. What is the capital of Malta?
43. What is the capital of Marshall Islands?
44. What is the capital of Mauritania?
45. What is the capital of Mauritius?
46. What is the capital of Mexico?
47. What is the capital of Micronesia?
48. What is the capital of Moldova?
49. What is the capital of Monaco?
50. What is the capital of Mongolia?
51. What is the capital of Montenegro?
52. What is the capital of Morocco?
53. What is the capital of Mozambique?
54. What is the capital of Myanmar?
55. What is the capital of Namibia?
56. What is the capital of Nauru?
57. What is the capital of Nepal?
58. What is the capital of Netherlands?
59. What is the capital of New Zealand?
60. What is the capital of Nicaragua?
61. What is the capital of Niger?
62. What is the capital of Nigeria?
63. What is the capital of Norway?
64. What is the capital of Oman?
65. What is the capital of Pakistan?
66. What is the capital of Palau?
...

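The settings you can adjust for a Llama 2 model include temperature and max_tokens.

temperature controls how varied the model's output is (higher values give more diverse responses). max_tokens caps the number of tokens the model may generate; the input tokens plus max_tokens must fit within the model's context window.

Look at the following example code.

Temperature setting: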

# run the code again - the output should be different
response = llama(prompt, temperature=0.9)
print(response)
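
The course helper only exposes temperature as a parameter. As a general illustration (not the helper's actual implementation), a common way temperature is applied during sampling is to divide the model's raw scores by it before the softmax. The scores below are made up.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (0.1, 0.9):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

A low temperature concentrates almost all probability on the top-scoring token, while a higher one spreads it out, which is why repeated runs at temperature=0.9 tend to differ.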

max_tokens setting. If the input tokens plus the requested output tokens exceed 4097, the limit the API enforces for Llama 2 here, the request fails with an error.

with open("TheVelveteenRabbit.txt", "r", encoding='utf-8') as file:
    text = file.read()

prompt = f"""
Give me a summary of the following text in 50 words:\n\n
{text}
"""
response = llama(prompt)

print(response)

Error output:

The message says the input takes 3974 tokens. That leaves only 4097 - 3974 = 123 tokens for the output, but the default max_tokens is 1024, so the request fails.

{'error': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4097. Given: 3974 `inputs` tokens and 1024 `max_new_tokens`', 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': None}}

Setting max_tokens so that it fits within the limit makes the request succeed.

# set max_tokens to stay within limit on input + output tokens
prompt = f"""
Give me a summary of the following text in 50 words:\n\n
{text}
"""
response = llama(prompt,
                max_tokens=123)
print(response)
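
The value 123 comes straight from the limit quoted in the error message above. A minimal sketch of that calculation, where output_token_budget is a hypothetical helper and the input token count is taken from the error message rather than computed with a tokenizer:

```python
CONTEXT_LIMIT = 4097  # limit quoted in the error message above


def output_token_budget(input_tokens: int, context_limit: int = CONTEXT_LIMIT) -> int:
    """Return how many tokens are left for the output, never below zero."""
    return max(context_limit - input_tokens, 0)


budget = output_token_budget(3974)
print(budget)  # 123
```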

3. Multi-turn Conversations

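LLMs like Llama 2 are stateless: they do not remember earlier turns of a conversation.

Multi-turn conversation prompting is how you make the model take the earlier turns into account, and it is simple.

Build the next prompt from the prompts and responses so far, as shown here. The important part is that every instruction has to be wrapped in [INST] and [/INST] tags.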

chat_prompt = f"""
<s>[INST] {prompt_1} [/INST]
{response_1}
</s>
<s>[INST] {prompt_2} [/INST]
"""
print(chat_prompt)

A full example that uses this template looks like this.

from utils import llama

prompt_1 = """
    What are fun activities I can do this weekend?
"""
response_1 = llama(prompt_1)

prompt_2 = """
Which of these would be good for my health?
"""

chat_prompt = f"""
<s>[INST] {prompt_1} [/INST]
{response_1}
</s>
<s>[INST] {prompt_2} [/INST]
"""
print(chat_prompt)

response_2 = llama(chat_prompt,
                 add_inst=False,
                 verbose=True)
print(response_2)
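
The same template extends to any number of turns. Here is a minimal sketch of how such a prompt could be assembled from lists of past prompts and responses. build_chat_prompt is a hypothetical helper written for these notes, not the course's implementation, and the earlier response is a made-up placeholder.

```python
def build_chat_prompt(prompts, responses):
    """Interleave past prompts/responses and end with the newest prompt.

    Hypothetical helper: expects exactly one more prompt than responses.
    """
    if len(prompts) != len(responses) + 1:
        raise ValueError("need exactly one more prompt than responses")
    parts = []
    # zip stops at the shorter list, so only completed turns are paired here.
    for prompt, response in zip(prompts, responses):
        parts.append(f"<s>[INST] {prompt} [/INST]\n{response}\n</s>")
    # The newest prompt is left open so the model writes the next response.
    parts.append(f"<s>[INST] {prompts[-1]} [/INST]")
    return "\n".join(parts)


chat_prompt = build_chat_prompt(
    ["What are fun activities I can do this weekend?",
     "Which of these would be good for my health?"],
    ["You could go hiking or visit a museum."],  # made-up earlier response
)
print(chat_prompt)
```

A prompt built this way would be sent with add_inst=False, since the tags are already in place.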

If writing this by hand is tedious, you can use the llama_chat helper function instead.

from utils import llama, llama_chat

prompt_1 = """
    What are fun activities I can do this weekend?
"""
response_1 = llama(prompt_1)

prompt_2 = """
Which of these would be good for my health?
"""

prompts = [prompt_1,prompt_2]
responses = [response_1]

# Pass prompts and responses to llama_chat function
response_2 = llama_chat(prompts,responses,verbose=True)

print(response_2)

4. Prompt Engineering Techniques

Let's go through some representative prompting techniques.

Zero-shot Prompting:

prompt = """
Message: Hi Amit, thanks for the thoughtful birthday card!
Sentiment: ?
"""
response = llama(prompt)
print(response)

Few-shot Prompting:

prompt = """
Message: Hi Dad, you're 20 minutes late to my piano recital!
Sentiment: Negative

Message: Can't wait to order pizza for dinner tonight
Sentiment: Positive

Message: Hi Amit, thanks for the thoughtful birthday card!
Sentiment: ?
"""
response = llama(prompt)
print(response)
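
When the examples live in a list, the few-shot prompt can be generated instead of typed by hand. A minimal sketch, where build_few_shot_prompt is a hypothetical helper that follows the Message/Sentiment layout used above:

```python
def build_few_shot_prompt(examples, new_message):
    """Format labelled examples followed by the unlabelled message."""
    blocks = [f"Message: {msg}\nSentiment: {label}" for msg, label in examples]
    blocks.append(f"Message: {new_message}\nSentiment: ?")
    return "\n\n".join(blocks)


examples = [
    ("Hi Dad, you're 20 minutes late to my piano recital!", "Negative"),
    ("Can't wait to order pizza for dinner tonight", "Positive"),
]
prompt = build_few_shot_prompt(examples, "Hi Amit, thanks for the thoughtful birthday card!")
print(prompt)
```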

Specifying the Output Format:

prompt = """
Message: Hi Dad, you're 20 minutes late to my piano recital!
Sentiment: Negative

Message: Can't wait to order pizza for dinner tonight
Sentiment: Positive

Message: Hi Amit, thanks for the thoughtful birthday card!
Sentiment: ?

Give a one word response.
"""
response = llama(prompt)
print(response)
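
Asking for a one-word response pays off when the output feeds into code, because the answer can be checked mechanically. A minimal sketch of that check: normalize_label and the ALLOWED label set are assumptions made for illustration, and the sample string stands in for a real model response.

```python
import string

ALLOWED = {"positive", "negative", "neutral"}  # assumed label set


def normalize_label(response: str):
    """Return the cleaned label, or None if the response is not a single allowed word."""
    cleaned = response.strip().strip(string.punctuation).lower()
    return cleaned if cleaned in ALLOWED else None


print(normalize_label("  Positive.\n"))  # positive
```

A None result is a signal that the model ignored the format instruction and the prompt may need tightening.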

Role Prompting:

role = """
Your role is a life coach \
who gives advice to people about living a good life. \
You attempt to provide unbiased advice.
You respond in the tone of an English pirate.
"""

prompt = f"""
{role}
How can I answer this question from my friend:
What is the meaning of life?
"""
response = llama(prompt)
print(response)

Providing New Information in the Prompt:

context = """
The 2023 FIFA Women's World Cup (Māori: Ipu Wahine o te Ao FIFA i 2023)[1] was the ninth edition of the FIFA Women's World Cup, the quadrennial international women's football championship contested by women's national teams and organised by FIFA. The tournament, which took place from 20 July to 20 August 2023, was jointly hosted by Australia and New Zealand.[2][3][4] It was the first FIFA Women's World Cup with more than one host nation, as well as the first World Cup to be held across multiple confederations, as Australia is in the Asian confederation, while New Zealand is in the Oceanian confederation. It was also the first Women's World Cup to be held in the Southern Hemisphere.[5]
This tournament was the first to feature an expanded format of 32 teams from the previous 24, replicating the format used for the men's World Cup from 1998 to 2022.[2] The opening match was won by co-host New Zealand, beating Norway at Eden Park in Auckland on 20 July 2023 and achieving their first Women's World Cup victory.[6]
Spain were crowned champions after defeating reigning European champions England 1–0 in the final. It was the first time a European nation had won the Women's World Cup since 2007 and Spain's first title, although their victory was marred by the Rubiales affair.[7][8][9] Spain became the second nation to win both the women's and men's World Cup since Germany in the 2003 edition.[10] In addition, they became the first nation to concurrently hold the FIFA women's U-17, U-20, and senior World Cups.[11] Sweden would claim their fourth bronze medal at the Women's World Cup while co-host Australia achieved their best placing yet, finishing fourth.[12] Japanese player Hinata Miyazawa won the Golden Boot scoring five goals throughout the tournament. Spanish player Aitana Bonmatí was voted the tournament's best player, winning the Golden Ball, whilst Bonmatí's teammate Salma Paralluelo was awarded the Young Player Award. England goalkeeper Mary Earps won the Golden Glove, awarded to the best-performing goalkeeper of the tournament.
Of the eight teams making their first appearance, Morocco were the only one to advance to the round of 16 (where they lost to France; coincidentally, the result of this fixture was similar to the men's World Cup in Qatar, where France defeated Morocco in the semi-final). The United States were the two-time defending champions,[13] but were eliminated in the round of 16 by Sweden, the first time the team had not made the semi-finals at the tournament, and the first time the defending champions failed to progress to the quarter-finals.[14]
Australia's team, nicknamed the Matildas, performed better than expected, and the event saw many Australians unite to support them.[15][16][17] The Matildas, who beat France to make the semi-finals for the first time, saw record numbers of fans watching their games, their 3–1 loss to England becoming the most watched television broadcast in Australian history, with an average viewership of 7.13 million and a peak viewership of 11.15 million viewers.[18]
It was the most attended edition of the competition ever held.
"""

prompt = f"""
Given the following context, who won the 2023 Women's World cup?
context: {context}
"""
response = llama(prompt)
print(response)

Chain-of-thought Prompting:

prompt = """
15 of us want to go to a restaurant.
Two of them have cars
Each car can seat 5 people.
Two of us have motorcycles.
Each motorcycle can fit 2 people.

Can we all get to the restaurant by car or motorcycle?

Think step by step.
"""
response = llama(prompt)
print(response)

A more explicit variant asks the model to explain every intermediate step before it answers:

prompt = """
15 of us want to go to a restaurant.
Two of them have cars
Each car can seat 5 people.
Two of us have motorcycles.
Each motorcycle can fit 2 people.

Can we all get to the restaurant by car or motorcycle?

Think step by step.
Explain each intermediate step.
Only when you are done with all your steps,
provide the answer based on your intermediate steps.
"""
response = llama(prompt)
print(response)
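
For reference, the arithmetic the model is expected to work through in the restaurant problem can be checked directly. This is a plain calculation from the numbers in the prompt, not model output.

```python
people = 15
cars, seats_per_car = 2, 5
motorcycles, seats_per_motorcycle = 2, 2

# Total seats across all vehicles.
capacity = cars * seats_per_car + motorcycles * seats_per_motorcycle
print(capacity)            # 14
print(capacity >= people)  # False
```

So the correct answer is no: the vehicles seat 14 people, one short of 15. This makes it easy to judge whether a step-by-step response actually reaches the right conclusion.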

5. Comparing Different Llama 2 & 3 Models

Using Llama 3 is much the same as using Llama 2.

from utils import llama, llama_chat

prompt = '''
Message: Hi Amit, thanks for the thoughtful birthday card!
Sentiment: Positive
Message: Hi Dad, you're 20 minutes late to my piano recital!
Sentiment: Negative
Message: Can't wait to order pizza for dinner tonight!
Sentiment: ?

Give a one word response.
'''

response = llama(prompt,
                 model = "META-LLAMA/Llama-3-8B-CHAT-HF")
print(response)

6. Code Llama

You can use Code Llama to generate code.

from utils import code_llama

temp_min = [42, 52, 47, 47, 53, 48, 47, 53, 55, 56, 57, 50, 48, 45]
temp_max = [55, 57, 59, 59, 58, 62, 65, 65, 64, 63, 60, 60, 62, 62]

prompt_2 = f"""
Write Python code that can calculate
the minimum of the list temp_min
and the maximum of the list temp_max
"""
response_2 = code_llama(prompt_2)
print(response_2)

Output:

[PYTHON]
def get_min_max(temp_min, temp_max):
    return min(temp_min), max(temp_max)
[/PYTHON]
[TESTS]
# Test case 1:
assert get_min_max([1, 2, 3], [4, 5, 6]) == (1, 6)
# Test case 2:
assert get_min_max([1, 2, 3], [4, 5, 6, 7]) == (1, 7)
# Test case 3:
assert get_min_max([1, 2, 3, 4], [4, 5, 6]) == (1, 6)
[/TESTS]
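
Since the response wraps the code in [PYTHON] tags, it can be pulled out with a regular expression before use. A minimal sketch, assuming the response always uses that tag pair. extract_python_block is a hypothetical helper, and the response string is copied from the output above.

```python
import re


def extract_python_block(response: str):
    """Return the text between [PYTHON] and [/PYTHON], or None if the tags are missing."""
    match = re.search(r"\[PYTHON\](.*?)\[/PYTHON\]", response, flags=re.DOTALL)
    return match.group(1).strip() if match else None


response = """[PYTHON]
def get_min_max(temp_min, temp_max):
    return min(temp_min), max(temp_max)
[/PYTHON]"""

code = extract_python_block(response)
namespace = {}
exec(code, namespace)  # only run generated code you have reviewed

temp_min = [42, 52, 47, 47, 53, 48, 47, 53, 55, 56, 57, 50, 48, 45]
temp_max = [55, 57, 59, 59, 58, 62, 65, 65, 64, 63, 60, 60, 62, 62]
print(namespace["get_min_max"](temp_min, temp_max))  # (42, 65)
```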

You can also give it partial code and have it fill in the missing part.

prompt = """
def star_rating(n):
'''
  This function returns a rating given the number n,
  where n is an integers from 1 to 5.
'''

    if n == 1:
        rating="poor"
    <FILL>
    elif n == 5:
        rating="excellent"

    return rating
"""

response = code_llama(prompt,
                      verbose=True)
print(response)

Output:

Prompt:
[INST]
def star_rating(n):
'''
  This function returns a rating given the number n,
  where n is an integers from 1 to 5.
'''

    if n == 1:
        rating="poor"
    <FILL>
    elif n == 5:
        rating="excellent"

    return rating
[/INST]

model: togethercomputer/CodeLlama-7b-Instruct

[PYTHON]
def star_rating(n):
    if n == 1:
        rating = "poor"
    elif n == 2:
        rating = "fair"
    elif n == 3:
        rating = "average"
    elif n == 4:
        rating = "good"
    else:
        rating = "excellent"
    return rating
[/PYTHON]

7. Llama Guard

Llama Guard is trained to detect the following six categories of unsafe content.

Violence and Hate
Sexual Content
Criminal Planning
Guns and Illegal Weapons
Regulated or Controlled Substances
Self-Harm

To see the policy, do the following.

from utils import policy
print(policy)

Output:

<BEGIN UNSAFE CONTENT CATEGORIES>
O1: Violence and Hate.
Should not
- Help people plan or engage in violence.
- Encourage people to engage in violence.
- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.
- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.
Can
- Provide information on violence and discrimination.
- Discuss the topics of hate and violence, and can discuss historical events involving violence.
O2: Sexual Content.
Should not
- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.
Can
- Discuss topics related to sexuality and sexual education.
- Provide general information regarding sexual activities.
- Provide guidance on topics related to sexual health.
O3: Criminal Planning.
Should not
- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.
- Encourage people to engage in criminal activities or condone criminal behavior.
Can
- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.
- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.
O4: Guns and Illegal Weapons.
Should not
- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.
- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.
Can
- Discuss firearms and the arguments for and against firearm ownership.
O5: Regulated or Controlled Substances.
Should not
- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.
- Assist or encourage people to create such substances when it is illegal to do so.
Can
- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).
- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.
O6: Self-Harm.
Should not
- Encourage people to harm themselves.
- Romanticize or condone self-harm or suicide.
- Provide information regarding the methods of suicide or self-harm.
- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.
Can
- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).
Should
- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.
<END UNSAFE CONTENT CATEGORIES>

Using llama_guard is simple.

from utils import llama_guard

response = llama_guard(prompt, verbose=True)

If the request is inappropriate, it answers unsafe.
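
To act on the verdict in code, the response has to be parsed. The sketch below assumes the first line of the response is safe or unsafe and that an optional second line lists the violated category codes (such as O3), comma-separated. These notes only confirm the unsafe verdict itself, so treat the second-line handling as an assumption; parse_guard_response is a hypothetical helper.

```python
def parse_guard_response(response: str):
    """Return (is_safe, categories) from a Llama Guard style response.

    Assumption: line 1 is 'safe' or 'unsafe'; an optional line 2 holds
    comma-separated category codes such as 'O3'.
    """
    lines = [line.strip() for line in response.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty response")
    verdict = lines[0].lower()
    if verdict not in ("safe", "unsafe"):
        raise ValueError(f"unexpected verdict: {lines[0]!r}")
    categories = []
    if verdict == "unsafe" and len(lines) > 1:
        categories = [c.strip() for c in lines[1].split(",") if c.strip()]
    return verdict == "safe", categories


print(parse_guard_response("unsafe\nO3"))  # (False, ['O3'])
```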

License: Attribution, NonCommercial, NoDerivatives
