## LLaMA-2 Model Deployment

In the article "NLP (59): Deploying the Baichuan Large Model with FastChat", I introduced the FastChat framework and showed how to use it to deploy the Baichuan model.

This article deploys the LLaMA-2 70B model behind an OpenAI-compatible API. The Dockerfile is as follows:

```dockerfile
FROM nvidia/cuda:11.7.1-runtime-ubuntu20.04

RUN apt-get update -y && apt-get install -y python3.9 python3.9-distutils curl
RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
RUN python3.9 get-pip.py
RUN pip3 install fschat
```

The docker-compose.yml file is as follows:

```yaml
version: "3.9"

services:
  fastchat-controller:
    build:
      context: .
      dockerfile: Dockerfile
    image: fastchat:latest
    ports:
      - "21001:21001"
    entrypoint: ["python3.9", "-m", "fastchat.serve.controller", "--host", "0.0.0.0", "--port", "21001"]
  fastchat-model-worker:
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - ./model:/root/model
    image: fastchat:latest
    ports:
      - "21002:21002"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0", "1"]
              capabilities: [gpu]
    entrypoint: ["python3.9", "-m", "fastchat.serve.model_worker", "--model-names", "llama2-70b-chat", "--model-path", "/root/model/llama2/Llama-2-70b-chat-hf", "--num-gpus", "2", "--gpus", "0,1", "--worker-address", "http://fastchat-model-worker:21002", "--controller-address", "http://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "21002"]
  fastchat-api-server:
    build:
      context: .
      dockerfile: Dockerfile
    image: fastchat:latest
    ports:
      - "8000:8000"
    entrypoint: ["python3.9", "-m", "fastchat.serve.openai_api_server", "--controller-address", "http://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "8000"]
```

After a successful deployment, the model occupies two A100 GPUs, each using about 66 GB of VRAM.

Test whether the model was deployed successfully:

```bash
curl http://localhost:8000/v1/models
```

The output is as follows:

```json
{
  "object": "list",
  "data": [
    {
      "id": "llama2-70b-chat",
      "object": "model",
      "created": 1691504717,
      "owned_by": "fastchat",
      "root": "llama2-70b-chat",
      "parent": null,
      "permission": [
        {
          "id": "modelperm-3XG6nzMAqfEkwfNqQ52fdv",
          "object": "model_permission",
          "created": 1691504717,
          "allow_create_engine": false,
          "allow_sampling": true,
          "allow_logprobs": true,
          "allow_search_indices": true,
          "allow_view": true,
          "allow_fine_tuning": false,
          "organization": "*",
          "group": null,
          "is_blocking": false
        }
      ]
    }
  ]
}
```

The LLaMA-2 70B model has been deployed successfully!
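Because the API server speaks the OpenAI protocol, it can also be called from Python instead of curl. Below is a minimal sketch, assuming a legacy (pre-1.0) version of the `openai` package is installed; the model name matches the `--model-names` flag in the compose file above:

```python
import openai

# Point the legacy (pre-1.0) openai SDK at the local FastChat API server.
openai.api_key = "EMPTY"  # FastChat does not require a real key by default
openai.api_base = "http://localhost:8000/v1"

completion = openai.ChatCompletion.create(
    model="llama2-70b-chat",  # matches --model-names in docker-compose.yml
    messages=[{"role": "user", "content": "What is your name?"}],
)
print(completion.choices[0].message.content)
```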
## Computing Prompt Token Length

The FastChat GitHub project provides an API for computing the token length of a prompt, implemented in fastchat/serve/model_worker.py. It is called as follows:

```bash
curl --location 'localhost:21002/count_token' \
--header 'Content-Type: application/json' \
--data '{"prompt": "What is your name?"}'
```

The output is as follows:

```json
{
  "count": 6,
  "error_code": 0
}
```
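If you call this endpoint often, it is convenient to wrap it in Python. A minimal sketch using `requests`; the helper name `count_tokens` is my own, and the worker address assumes the port mapping from the compose file above:

```python
import requests

def count_tokens(prompt: str, worker_url: str = "http://localhost:21002") -> int:
    """Ask the FastChat model worker to tokenize `prompt` and return the token count."""
    resp = requests.post(f"{worker_url}/count_token", json={"prompt": prompt})
    resp.raise_for_status()
    return resp.json()["count"]

print(count_tokens("What is your name?"))  # -> 6, as in the curl example above
```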
## Computing Conversation Token Length

Computing the token length of a conversation is more involved in FastChat.

First, we need to fetch the conversation template of the LLaMA-2 70B model. The API call is:

```bash
curl --location --request POST 'http://localhost:21002/worker_get_conv_template'
```

The output is as follows:

```json
{
  "conv": {
    "messages": [],
    "name": "llama-2",
    "offset": 0,
    "roles": ["[INST]", "[/INST]"],
    "sep": " ",
    "sep2": " </s><s>",
    "sep_style": 7,
    "stop_str": null,
    "stop_token_ids": [2],
    "system_message": "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.",
    "system_template": "[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n"
  }
}
```

FastChat's conversation file, fastchat/conversation.py, contains the code that assembles a conversation into a prompt; it is not reproduced here. To use it, simply copy the whole file: it does not depend on any third-party modules.

We need to assemble the conversation, given as OpenAI-style messages, into the corresponding prompt. The input messages and the Python code are as follows:

```python
# -*- coding: utf-8 -*-
# place: Pudong, Shanghai
# file: prompt.py
# time: 2023/8/8 19:24
from conversation import Conversation, SeparatorStyle

messages = [
    {"role": "system", "content": "You are Jack, you are 20 years old, answer questions with humor."},
    {"role": "user", "content": "What is your name?"},
    {"role": "assistant", "content": "Well, well, well! Look who's asking the questions now! My name is Jack, but you can call me the king of the castle, the lord of the rings, or the prince of the pizza party. Whatever floats your boat, my friend!"},
    {"role": "user", "content": "How old are you?"},
    {"role": "assistant", "content": "Oh, you want to know my age? Well, let's just say I'm older than a bottle of wine but younger than a bottle of whiskey. I'm like a fine cheese, getting better with age, but still young enough to party like it's 1999!"},
    {"role": "user", "content": "Where is your hometown?"},
]

llama2_conv = {"conv": {"name": "llama-2", "system_template": "[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n", "system_message": "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.", "roles": ["[INST]", "[/INST]"], "messages": [], "offset": 0, "sep_style": 7, "sep": " ", "sep2": " </s><s>", "stop_str": None, "stop_token_ids": [2]}}
conv = llama2_conv["conv"]

conv = Conversation(
    name=conv["name"],
    system_template=conv["system_template"],
    system_message=conv["system_message"],
    roles=conv["roles"],
    messages=list(conv["messages"]),  # prevent in-place modification
    offset=conv["offset"],
    sep_style=SeparatorStyle(conv["sep_style"]),
    sep=conv["sep"],
    sep2=conv["sep2"],
    stop_str=conv["stop_str"],
    stop_token_ids=conv["stop_token_ids"],
)

if isinstance(messages, str):
    prompt = messages
else:
    for message in messages:
        msg_role = message["role"]
        if msg_role == "system":
            conv.set_system_message(message["content"])
        elif msg_role == "user":
            conv.append_message(conv.roles[0], message["content"])
        elif msg_role == "assistant":
            conv.append_message(conv.roles[1], message["content"])
        else:
            raise ValueError(f"Unknown role: {msg_role}")

    # Add a blank message for the assistant.
    conv.append_message(conv.roles[1], None)
    prompt = conv.get_prompt()

print(repr(prompt))
```

The assembled prompt is:

```
"[INST] <<SYS>>\nYou are Jack, you are 20 years old, answer questions with humor.\n<</SYS>>\n\nWhat is your name? [/INST] Well, well, well! Look who's asking the questions now! My name is Jack, but you can call me the king of the castle, the lord of the rings, or the prince of the pizza party. Whatever floats your boat, my friend! </s><s>[INST] How old are you? [/INST] Oh, you want to know my age? Well, let's just say I'm older than a bottle of wine but younger than a bottle of whiskey. I'm like a fine cheese, getting better with age, but still young enough to party like it's 1999! </s><s>[INST] Where is your hometown? [/INST]"
```

Finally, call the token-count API (see the previous section on computing prompt token length); the token length of this conversation comes out to 199.
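The whole procedure can be chained into a single helper: fetch the template from the worker, assemble the prompt, and let the worker count the tokens. Below is a sketch under the same assumptions as above (conversation.py copied from FastChat next to the script, worker reachable on port 21002); the name `conversation_token_length` is my own:

```python
import requests

from conversation import Conversation, SeparatorStyle  # copied from FastChat

WORKER_URL = "http://localhost:21002"

def conversation_token_length(messages: list) -> int:
    """Assemble OpenAI-style messages into a LLaMA-2 prompt and count its tokens."""
    # Fetch the conversation template from the model worker.
    cfg = requests.post(f"{WORKER_URL}/worker_get_conv_template").json()["conv"]
    conv = Conversation(
        name=cfg["name"],
        system_template=cfg["system_template"],
        system_message=cfg["system_message"],
        roles=cfg["roles"],
        messages=list(cfg["messages"]),
        offset=cfg["offset"],
        sep_style=SeparatorStyle(cfg["sep_style"]),
        sep=cfg["sep"],
        sep2=cfg["sep2"],
        stop_str=cfg["stop_str"],
        stop_token_ids=cfg["stop_token_ids"],
    )
    for message in messages:
        if message["role"] == "system":
            conv.set_system_message(message["content"])
        elif message["role"] == "user":
            conv.append_message(conv.roles[0], message["content"])
        elif message["role"] == "assistant":
            conv.append_message(conv.roles[1], message["content"])
        else:
            raise ValueError(f"Unknown role: {message['role']}")
    conv.append_message(conv.roles[1], None)  # blank slot for the model's reply
    prompt = conv.get_prompt()
    resp = requests.post(f"{WORKER_URL}/count_token", json={"prompt": prompt})
    return resp.json()["count"]

if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "You are Jack, you are 20 years old, answer questions with humor."},
        {"role": "user", "content": "Where is your hometown?"},
    ]
    print(conversation_token_length(demo))
```

Fetching the template at runtime avoids hard-coding the llama2_conv dictionary from the previous script, so the same helper works for any model the worker serves.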
We can verify the input conversation's token length with FastChat's chat completion endpoint, v1/chat/completions. The request is:

```bash
curl --location 'http://localhost:8000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{"model": "llama2-70b-chat", "messages": [{"role": "system", "content": "You are Jack, you are 20 years old, answer questions with humor."}, {"role": "user", "content": "What is your name?"}, {"role": "assistant", "content": "Well, well, well! Look who'\''s asking the questions now! My name is Jack, but you can call me the king of the castle, the lord of the rings, or the prince of the pizza party. Whatever floats your boat, my friend!"}, {"role": "user", "content": "How old are you?"}, {"role": "assistant", "content": "Oh, you want to know my age? Well, let'\''s just say I'\''m older than a bottle of wine but younger than a bottle of whiskey. I'\''m like a fine cheese, getting better with age, but still young enough to party like it'\''s 1999!"}, {"role": "user", "content": "Where is your hometown?"}]}'
```

The output is:

```json
{
  "id": "chatcmpl-mQxcaQcNSNMFahyHS7pamA",
  "object": "chat.completion",
  "created": 1691506768,
  "model": "llama2-70b-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Ha! My hometown? Well, that's a tough one. I'm like a bird, I don't have a nest, I just fly around and land wherever the wind takes me. But if you really want to know, I'm from a place called \"The Internet\". It's a magical land where memes and cat videos roam free, and the Wi-Fi is always strong. It's a beautiful place, you should visit sometime!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 199,
    "total_tokens": 302,
    "completion_tokens": 103
  }
}
```

Note that prompt_tokens in the output is 199, which matches the conversation token length we computed above!

## Summary

This article described how to deploy the LLaMA-2 70B model with FastChat, and explained in detail how to compute the token length of a prompt and of a conversation. I hope it is helpful to readers.

One personal takeaway: reading the source code really matters.

My personal blog is at https://percent4.github.io/ and you are welcome to visit!

## References

1. NLP (59): Deploying the Baichuan Large Model with FastChat: https://blog.csdn.net/jclian91/article/details/131650918
2. FastChat: https://github.com/lm-sys/FastChat