13 | Let AI Write Your Tests: A Hands-On Look at Multi-Step Prompts
Narrated by: 徐文浩 (Xu Wenhao)
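Have GPT-4 write some code first
Design a slightly challenging small problem
Write the prompts step by step
Ask the AI to explain the code under test
Ask the AI to draw up a test plan from its explanation
Generate test code from the test plan
Check syntax with the AST library
See what bug the auto-generated tests caught for us
Summary
Questions to think about
Recommended reading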
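Using AI to generate unit test code automatically is one way to make software development more efficient. This article shows how to develop interactively with ChatGPT and how to use a large language model to generate test code automatically. The author walks through concrete code examples, including asking the AI to explain the code under test and then generating the test code. It also explains how to check the number of generated test cases and, when there are too few, ask the AI to generate more in order to improve test coverage. Finally, the generated test code is syntax-checked and executed, and it succeeds in catching a bug. Overall, the article shows how AI can simplify the writing of test code and is a useful reference for readers interested in automated testing. It also poses some questions to think about, such as reducing the impact of the prompts and wrapping the code so that it retries automatically, and recommends further reading for readers who want to go deeper.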
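The syntax check and the test-count check mentioned in this summary can be sketched with Python's standard `ast` module. This is a minimal sketch rather than the article's own code: the helper names `is_valid_python` and `count_test_functions` and the sample string `generated_tests` are hypothetical, and it assumes the generated tests are pytest-style functions whose names start with `test_`.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python. Nothing is executed."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def count_test_functions(source: str) -> int:
    """Count function definitions named `test_*` (an assumed naming convention)."""
    tree = ast.parse(source)
    return sum(
        1
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name.startswith("test_")
    )


# Hypothetical model output, used only for illustration.
generated_tests = '''
def test_add_small_numbers():
    assert add(1, 2) == 3

def test_add_negative_numbers():
    assert add(-1, -2) == -3
'''

if is_valid_python(generated_tests):
    case_count = count_test_functions(generated_tests)
else:
    case_count = 0
```

If `case_count` falls below whatever threshold you choose, that is the point at which you would go back to the model and ask for additional cases.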
All comments (12)
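- peter, 2023-04-10, from Beijing: A question for the instructor. Q1: Is there any usable software for AI singing (open source or paid, either is fine)? By AI singing I mean singing a complete song in one particular person's voice. For example, given a short recording of Trump (say, a 30-second voice clip), the software would use that clip to sing 《好汉歌》, so that listeners believe it is Trump singing it.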
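Author's reply: You can train a model of your own voice. Both cloud platforms such as Azure and open-source projects such as PaddleSpeech offer solutions for this. They are generally used for synthesizing speech, but I don't think switching to singing would be hard, although there may not be an off-the-shelf tool for it.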
- Toni, 2023-04-10, from Switzerland: Part 1. How to "play" with ChatGPT is a good topic. Following the 'Let's think step by step' idea from the chain-of-thought article the instructor recommended, I played a small game with ChatGPT. I ran the test in English, the language ChatGPT handles best, to rule out language as a confounding factor. Because of the 2000-character limit, I have to split the results into parts. This small test gave me a better understanding of ChatGPT. The first test question is the juggler puzzle below; the correct answer is 4 blue golf balls. Let's see how ChatGPT does:
Q: A juggler has 16 balls. Half of the balls are golf balls and half of the golf balls are blue. How many blue golf balls are there?
A: If a juggler has 16 balls and half of the balls are golf balls, then there are 16 / 2 = 8 golf balls in total. Since half of the golf balls are blue, there are 8 / 2 = 4 blue golf balls.
The answer is correct, which suggests ChatGPT 3.5 has improved over version 3.0, although I have doubts about how it arrived at '8 / 2 = 4'. Next I dug a pit for ChatGPT to see whether it could jump over it. The new question adds a distractor, an irrelevant white ball:
Q: A juggler has 16 balls. Half of the balls are golf balls and half of the golf balls are blue. A white ball is missing, how many blue golf balls are there?
A: If a juggler has 16 balls and half of the balls are golf balls, then there are 16 / 2 = 8 golf balls in total. However, since a white ball is missing, the total number of golf balls would be reduced by 1, which means there are 8 - 1 = 7 golf balls remaining. Since half of the golf balls are blue, there would be 7 / 2 = 3.5 blue golf balls. However, since it's not possible to have half a ball, we can conclude that there are no blue golf balls, as the number of golf balls is not evenly divisible by 2, after accounting for the missing white ball.
ChatGPT did not give the correct answer. It recognized that 3.5 blue golf balls is an unreasonable result, but its attempt to repair that led to a bigger error: that there are no blue golf balls at all ("we can conclude that there are no blue golf balls"). This error suggests that ChatGPT failed to separate the balls by color. Further tests are in Part 2.
Author's reply: 👍 Reasoning ability is where ChatGPT, and GPT-4 in particular, is strongest.
- Toni, 2023-04-10, from Switzerland: Part 2. To help ChatGPT cope with the white-ball distractor from Part 1, I told it that the balls come in two colors. The question was:
Q: The juggler has 16 balls, white or blue, golf and non-golf. The question is: if half of the balls are known to be golf balls, and half of the golf balls are blue. If one white ball is missing, how many blue golf balls does he have? Please answer if you understand the question. If something is unclear, please ask.
A: I apologize for the confusion in my previous responses. Based on the information provided, we can deduce the following: There are 16 balls in total. Half of the balls are golf balls, which means there are 16 / 2 = 8 golf balls. Half of the golf balls are blue, which means there are 8 / 2 = 4 blue golf balls. Now, if one white ball is missing, it does not affect the total number of golf balls or the proportion of blue golf balls, as the missing ball is white and not a golf ball. So, the number of blue golf balls would still be 4, based on the original information provided.
ChatGPT's conclusion of 4 blue golf balls is right, but its justification, 'as the missing ball is white and not a golf ball.', is not entirely correct, since the question never says the missing white ball is not a golf ball. Once reminded to distinguish white from blue, ChatGPT can basically follow the logic that losing a white ball does not change the number of blue balls. The point of these conversations is not simply to catch ChatGPT's mistakes but to keep tuning the question. Whether you can state the prompt clearly is the key to unlocking ChatGPT's potential, just as the instructor says in this lesson. I wonder how ChatGPT 4 would handle the challenges above. The last exchange is below:
Q: Considering the difference between white balls and blue balls, will losing white balls affect the number of blue balls?
A: Based on the information provided, which states that half of the balls are golf balls and half of the golf balls are blue, losing white balls would not affect the number of blue balls. ...
Author's reply: GPT-4's response: I understand the question. The juggler has 16 balls in total, with half of them being golf balls. This means there are 8 golf balls. Half of the golf balls are blue, so there are 4 blue golf balls. Since one white ball is missing, this does not affect the count of blue golf balls. Therefore, the juggler has 4 blue golf balls.
- Oli张帆, 2023-04-10, from Beijing: Since the instructor has covered applications of prompt engineering, I'd like to ask a related question. I have a fairly complex task that I run through the Turbo Chat API (GPT 3.5). The goal is to feed OpenAI an English article and then, according to the user's language, output JSON data that the front end uses to build a quiz. The prompt is roughly as follows:
```
Create a quiz with n questions (translated to Chinese) for the given context in JSON. Return a JSON string with an array of n quiz questions, each represented as a JSON object with a 'question', an 'answer', a 'options' and an 'explanation' property. Example output: [{"question":"What is the largest organ in the human body?","answer":"Skin","options":["Liver","Heart","Lungs"],"explanation":"The skin is the largest organ in the human body, and it serves as a protective barrier against external threats such as pathogens, ultraviolet radiation, and dehydration. It also plays a crucial role in regulating body temperature and maintaining a healthy immune system."},{"question":"What is the smallest country in the world?","answer":"Vatican City","options":["Monaco","Liechtenstein","Nauru"],"explanation":"Vatican City is the smallest country in the world, both in terms of area and population. It is an independent city-state enclaved within Rome, Italy, and it is the spiritual and administrative center of the Roman Catholic Church."}] ### The context: (an article, very long)
```
I ran into three problems. First, if the article exceeds 1024 tokens, OpenAI no longer produces correctly formatted JSON, so I have to split the article into chunks of at most 1024 tokens and send a separate request for each. Second, to get OpenAI to translate the data into the target language consistently, I have to translate the example in the prompt into that language first. Third, this request never returns within 30 seconds; the Heroku platform I use enforces a hard 30-second limit, so the request has to run in a background worker while the front end polls for the result. I already know these workarounds, but I'd like the instructor to explain the causes from the LLM's point of view so that I can understand them better.
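Author's reply: When the article or context is long, I sometimes set it off with >>> and <<< and say so in the prompt, similar to some of the examples in Langchain: """Between >>> and <<< are the raw search result text from google. Extract the answer to the question '{query}' or say "not found" if the information is not contained. Use the format Extracted:<answer or "not found"> >>> {requests_result} <<< Extracted:""" As for language, based on my testing so far, I suggest using English and then translating the result. For now, English gives the best results in both reasoning and knowledge. Also, in your example you could first generate only the Question and Options, and then ask the model to generate the Answer and Explanation from those Questions and Options. Otherwise, as far as I can see, the Answer is currently not even one of the Options. This kind of logic seems better suited to reasoning in separate steps than to having the LLM generate everything in one pass.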
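The delimiter approach and the two-step split described in this reply can be sketched as plain prompt-building functions. `build_extraction_prompt` reproduces the template quoted in the reply; the other two functions and their prompt wording are hypothetical illustrations of "questions and options first, answers and explanations second", not the author's code. No API call is made here, so how the prompts are sent is left open.

```python
def build_extraction_prompt(query: str, raw_text: str) -> str:
    """Reproduce the quoted template: the long context sits between >>> and <<<."""
    return (
        "Between >>> and <<< are the raw search result text from google.\n"
        f"Extract the answer to the question '{query}' or say \"not found\" "
        "if the information is not contained.\n"
        'Use the format\nExtracted:<answer or "not found">\n'
        f">>> {raw_text} <<<\n"
        "Extracted:"
    )


def build_questions_prompt(article: str, n: int) -> str:
    """Step 1 (hypothetical wording): ask only for questions and options."""
    return (
        f"Create {n} quiz questions for the text between >>> and <<<. "
        "Return a JSON array of objects with 'question' and 'options' only.\n"
        f">>> {article} <<<"
    )


def build_answers_prompt(article: str, questions_json: str) -> str:
    """Step 2 (hypothetical wording): answer using the options from step 1."""
    return (
        "For each question in the JSON below, choose the 'answer' from its "
        "'options' and add a short 'explanation', based on the text between "
        ">>> and <<<.\n"
        f"Questions: {questions_json}\n"
        f">>> {article} <<<"
    )


prompt = build_extraction_prompt("capital of France", "Paris is the capital.")
```

Because the second prompt receives the options produced by the first, the model is steered toward choosing an answer that actually appears among them.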
- Chloe, 2023-10-14, from the United States: The link for CoT has been changed to "https://github.com/openai/openai-cookbook/blob/main/articles/techniques_to_improve_reliability.md"
- Toni, 2023-06-01, from Switzerland: At the end of May, OpenAI announced that supervising the mathematical reasoning process with a reward model can greatly improve the AI's reasoning ability. ChatGPT can now easily handle problems far more complex than the ones I showed on April 10. In just a month and a half, progress has been rapid. Anyone who hopes to contribute to improving models can read the original post for ideas and inspiration. https://openai.com/research/improving-mathematical-reasoning-with-process-supervision We've trained a model to achieve a new state-of-the-art in mathematical problem solving by rewarding each correct step of reasoning (“process supervision”) instead of simply rewarding the correct final answer (“outcome supervision”). In addition to boosting performance relative to outcome supervision, process supervision also has an important alignment benefit: it directly trains the model to produce a chain-of-thought that is endorsed by humans.
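- Geek_d54869, 2023-10-08, from Beijing: "When checking the syntax, we need not only the generated test code but also the original function code." Why? In the scenario of thinking question 2, if the original function code has a syntax error while the test code is fine, there is no need at all to retry generating the test code.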
- 加油, 2023-08-31, from Shanghai: If a function depends on several private libraries, how can I get ChatGPT to write test cases for it?
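- 金hb.Ryan 冷空氣..., 2023-05-22, from Shanghai: I had GPT summarize openai-cookbook /techniques_to_improve_reliability.md >> This article describes how to improve the reliability of large language models on complex tasks, with techniques such as finding better prompts, splitting tasks into subtasks, and giving clearer instructions. It also covers step-by-step reasoning, few-shot examples, and fine-tuning, as well as chain-of-thought prompting, selection-inference prompting, the faithful reasoning architecture, and least-to-most prompting, all of which aim to break complex tasks into smaller, more reliable subtasks and to give the model more time and space to work through the problem. In addition, the article mentions how to apply these techniques to probabilistic graphical models in order to better select, combine, and discover new techniques.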