{"id":15175,"date":"2023-10-11T20:34:49","date_gmt":"2023-10-11T12:34:49","guid":{"rendered":"https:\/\/aif.amtbbs.org\/?p=15175"},"modified":"2023-10-11T20:34:49","modified_gmt":"2023-10-11T12:34:49","slug":"metamath%ef%bc%9a%e6%96%b0%e6%95%b0%e5%ad%a6%e6%8e%a8%e7%90%86%e8%af%ad%e8%a8%80%e6%a8%a1%e5%9e%8b%ef%bc%8c%e8%ae%ad%e7%bb%83%e5%a4%a7%e6%a8%a1%e5%9e%8b%e7%9a%84%e9%80%86%e5%90%91%e6%80%9d%e7%bb%b4","status":"publish","type":"post","link":"https:\/\/aif.amtbbs.org\/index.php\/2023\/10\/11\/15175\/","title":{"rendered":"MetaMath: A New Mathematical Reasoning Language Model That Trains Large Models in Reverse Thinking"},"content":{"rendered":"<div class=\"article-desc\">To improve models' forward and backward reasoning, researchers from Cambridge, HKUST, and Huawei built MetaMathQA on top of two widely used math datasets (GSM8K and MATH): a broad-coverage, high-quality mathematical reasoning dataset. MetaMathQA consists of 395K forward and backward math question-answer pairs generated by large language models.<\/div>\n<div id=\"postspictures\" class=\"article-content\">\n<div id=\"container\" class=\"container am-engine\" data-v-0719ea8d=\"\" data-element=\"root\">\n<p>Complex mathematical reasoning is an important measure of a large language model's reasoning ability. The math reasoning datasets in common use today are limited in size and lack problem diversity, which leaves large language models prone to the \u201creversal curse\u201d: a language model trained on \u201cA is B\u201d fails to generalize to \u201cB is A\u201d [1]. 
In mathematical reasoning this takes a specific form: given a math problem, a language model is good at solving it by forward reasoning but lacks the ability to solve it by backward reasoning. Backward reasoning is very common in math problems, as the following two examples show.<\/p>\n<h4>1. Classic problem &#8211; chickens and rabbits in the same cage<\/h4>\n<ul data-id=\"ua73dd4b-YU4062DA\">\n<li data-id=\"ld70c578-m0mf7AK1\">Forward reasoning: A cage holds 23 chickens and 12 rabbits. How many heads and how many feet are in the cage?<\/li>\n<li data-id=\"ld70c578-X0lkimZQ\">Backward reasoning: Some chickens and rabbits share a cage. Counting from the top there are 35 heads; counting from the bottom there are 94 feet. How many chickens and how many rabbits are in the cage?<\/li>\n<\/ul>\n<h4>2. GSM8K problem<\/h4>\n<ul data-id=\"ua73dd4b-O7encSXR\">\n<li data-id=\"ld70c578-XNNXgDOi\">Forward reasoning: James buys 5 packs of beef that are 4 pounds each. The price of beef is $5.50 per pound. How much did he pay?<\/li>\n<li data-id=\"ld70c578-mQk9DF0K\">Backward reasoning: James buys x packs of beef that are 4 pounds each. The price of beef is $5.50 per pound. How much did he pay? 
If we know the answer to the above question is 110, what is the value of unknown variable x?<\/li>\n<\/ul>\n<p>To improve models' forward and backward reasoning, researchers from Cambridge, HKUST, and Huawei built MetaMathQA on top of two widely used math datasets (GSM8K and MATH): a broad-coverage, high-quality mathematical reasoning dataset. MetaMathQA consists of 395K forward and backward math question-answer pairs generated by large language models. By fine-tuning LLaMA-2 on MetaMathQA, they obtained MetaMath, a large language model specialized in mathematical reasoning (both forward and backward) that reaches SOTA on mathematical reasoning datasets. The MetaMathQA dataset and MetaMath models of different sizes have been open-sourced for researchers to use.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15176\" src=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/387eaed2454c8bb8ad604123e34a1a989d7839.jpg\" width=\"664\" height=\"229\" alt=\"Image\" title=\"Image\" srcset=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/387eaed2454c8bb8ad604123e34a1a989d7839.jpg 664w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/387eaed2454c8bb8ad604123e34a1a989d7839-300x103.jpg 300w\" sizes=\"auto, (max-width: 664px) 100vw, 664px\" \/><\/p>\n<ul data-id=\"ua73dd4b-KoZ4in6n\">\n<li data-id=\"ld70c578-T68QGNLl\">Project page: https:\/\/meta-math.github.io\/<\/li>\n<li 
data-id=\"ld70c578-MGYU5OkI\">Paper: https:\/\/arxiv.org\/abs\/2309.12284<\/li>\n<li data-id=\"ld70c578-JlBrTOET\">Dataset: https:\/\/huggingface.co\/datasets\/meta-math\/MetaMathQA<\/li>\n<li data-id=\"ld70c578-crm5lB2p\">Models: https:\/\/huggingface.co\/meta-math<\/li>\n<li data-id=\"ld70c578-1h2ZfNAA\">Code: https:\/\/github.com\/meta-math\/MetaMath<\/li>\n<\/ul>\n<p>The paper constructs a backward reasoning dataset, GSM8K-Backward. Experiments show that current methods (e.g., SFT, RFT, WizardMath) perform far worse on backward reasoning problems than on forward reasoning. By comparison, MetaMath models achieve strong performance on both forward and backward reasoning.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15177\" src=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/b86a46c269be9fc916b03986069d9dcb623e54.png\" width=\"753\" height=\"549\" alt=\"Image\" title=\"Image\" srcset=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/b86a46c269be9fc916b03986069d9dcb623e54.png 753w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/b86a46c269be9fc916b03986069d9dcb623e54-300x219.png 300w\" sizes=\"auto, (max-width: 753px) 100vw, 753px\" \/><\/p>\n<h4>Method<\/h4>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15178\" src=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/b47edd9860f5a5fcf8c417a5d707ed1b758052.png\" width=\"1080\" height=\"476\" alt=\"Image\" title=\"Image\" 
srcset=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/b47edd9860f5a5fcf8c417a5d707ed1b758052.png 1080w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/b47edd9860f5a5fcf8c417a5d707ed1b758052-300x132.png 300w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/b47edd9860f5a5fcf8c417a5d707ed1b758052-1024x451.png 1024w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/b47edd9860f5a5fcf8c417a5d707ed1b758052-768x338.png 768w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/p>\n<p>MetaMathQA uses four data augmentation methods:<\/p>\n<p><strong>1. Answer Augmentation:<\/strong> Given a question, a large language model generates chains of thought that reach the correct result, and these serve as augmented data.<\/p>\n<ul data-id=\"ua73dd4b-eIQ1R9UX\">\n<li data-id=\"ld70c578-fIISu0Vj\">Question: James buys 5 packs of beef that are 4 pounds each. The price of beef is $5.50 per pound. How much did he pay?<\/li>\n<li data-id=\"ld70c578-tYHbhe52\">Answer: James buys 5 packs of beef that are 4 pounds each, so he buys a total of 5 * 4 = 20 pounds of beef. The price of beef is $5.50 per pound, so he pays 20 * $5.50 = $110. The answer is: 110.<\/li>\n<\/ul>\n<p><strong>2. 
Rephrasing Question:<\/strong> Given a meta-question, a large language model rewrites the question and generates a chain of thought that reaches the correct result, which serves as augmented data.<\/p>\n<ul data-id=\"ua73dd4b-1Fb3mUgV\">\n<li data-id=\"ld70c578-iTxrgc1r\">Question: What is the total amount that James paid when he purchased 5 packs of beef, each weighing 4 pounds, at a price of $5.50 per pound?<\/li>\n<li data-id=\"ld70c578-RmZCsa8E\">Answer: Each pack of beef weighs 4 pounds, so 5 packs weigh 4 * 5 = 20 pounds in total. The price per pound of beef is $5.50, so the total cost for 20 pounds is 20 * $5.50 = $110. Therefore, James paid a total of $110. The answer is: 110.<\/li>\n<\/ul>\n<p><strong>3. FOBAR Question (backward question augmentation):<\/strong> Given a meta-question, a number in its conditions is masked as x and the original answer is supplied, so that x must be inferred backward. This yields a backward question, and a correct chain of thought generated for it serves as augmented data (backward prompt example: \u201cIf we know the answer to the above question is 110, what is the value of unknown variable x?\u201d).<\/p>\n<ul data-id=\"ua73dd4b-joXTKDHf\">\n<li data-id=\"ld70c578-eXXZwqZr\">Question: James buys x packs of beef that are 4 pounds each. The price of beef is $5.50 per pound. How much did he pay? If we know the answer to the above question is 110, what is the value of unknown variable x?<\/li>\n<li data-id=\"ld70c578-VmmBdyRO\">Answer: James buys x packs of beef that are 4 pounds each, so he buys a total of 4x pounds of beef. The price of beef is $5.50 per pound, so the total cost of the beef is 5.50 * 4x = 22x. 
We are given that the total cost is $110, so we can write: 22x = 110. Dividing both sides by 22, we get: x = 5. The value of x is 5.<\/li>\n<\/ul>\n<p><strong>4. Self-Verification Question (backward question augmentation):<\/strong> Building on FOBAR, a large language model rewrites the question part of the backward question into a declarative statement, which serves as augmented data (rewrite example: \u201cHow much did he pay?\u201d (with the answer 110) is rewritten as \u201cHe paid 110\u201d).<\/p>\n<ul data-id=\"ua73dd4b-SHWj84Yp\">\n<li data-id=\"ld70c578-xrGAGqGw\">Question: James buys x packs of beef that are 4 pounds each. The price of beef is $5.50 per pound. He paid 110. What is the value of unknown variable x?<\/li>\n<li data-id=\"ld70c578-WNO22a4Y\">Answer: To solve this problem, we need to determine the value of x, which represents the number of packs of beef that James bought. Each pack of beef weighs 4 pounds and costs $5.50 per pound. The total amount James paid is $110. 
We can set up the equation as follows: Number of packs of beef * Weight per pack * Price per pound = Total amount paid; x * 4 * $5.50 = $110; Let&#8217;s simplify and solve for x: 4x * $5.50 = $110; 22x = $110; To isolate x, we divide both sides of the equation by 22: 22x \/ 22 = $110 \/ 22; x = $5; The value of x is 5.<\/li>\n<\/ul>\n<h4>Experimental results<\/h4>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15179\" src=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/356b65d14ac7a52b01835398c7bd230fb45832.png\" width=\"763\" height=\"1280\" srcset=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/356b65d14ac7a52b01835398c7bd230fb45832.png 763w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/356b65d14ac7a52b01835398c7bd230fb45832-179x300.png 179w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/356b65d14ac7a52b01835398c7bd230fb45832-610x1024.png 610w\" sizes=\"auto, (max-width: 763px) 100vw, 763px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15180\" src=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/9841f1850e2b22f95fe73276739b4278a92c69.png\" width=\"1080\" height=\"371\" alt=\"Image\" title=\"Image\" srcset=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/9841f1850e2b22f95fe73276739b4278a92c69.png 1080w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/9841f1850e2b22f95fe73276739b4278a92c69-300x103.png 300w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/9841f1850e2b22f95fe73276739b4278a92c69-1024x352.png 1024w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/9841f1850e2b22f95fe73276739b4278a92c69-768x264.png 768w\" sizes=\"auto, 
(max-width: 1080px) 100vw, 1080px\" \/><\/p>\n<p>Experiments on two common mathematical reasoning datasets (GSM8K and MATH) show that, without relying on external tools (e.g., a code interpreter), MetaMath significantly outperforms existing open-source LLMs. The MetaMath-7B model reaches 66.5% accuracy on GSM8K and 19.8% on MATH, exceeding the state-of-the-art models of the same size by 11.6% and 9.1%, respectively. Notably, MetaMath-70B reaches 82.3% accuracy on GSM8K, surpassing GPT-3.5-Turbo.<\/p>\n<p>According to the Superficial Alignment Hypothesis [2], a large language model's capabilities come from pretraining, while data from downstream tasks activates the inherent abilities the model learned during pretraining. This raises two important questions: (i) what type of data most effectively activates this latent knowledge, and (ii) why is one dataset better than another at such activation?<\/p>\n<p><strong>Why is MetaMathQA useful? It improves the quality of chain-of-thought data (Perplexity)<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15181\" src=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/325e7899614d69fdf22860d2e1333c8d5d8af0.png\" width=\"1080\" height=\"658\" 
alt=\"Image\" title=\"Image\" srcset=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/325e7899614d69fdf22860d2e1333c8d5d8af0.png 1080w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/325e7899614d69fdf22860d2e1333c8d5d8af0-300x183.png 300w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/325e7899614d69fdf22860d2e1333c8d5d8af0-1024x624.png 1024w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/325e7899614d69fdf22860d2e1333c8d5d8af0-768x468.png 768w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/p>\n<p>As the figure above shows, the researchers computed the perplexity of the LLaMA-2-7B model on answer-only data, GSM8K CoT, and each part of the MetaMathQA data. Every part of MetaMathQA has markedly lower perplexity than the other two datasets, which highlights that it is inherently easy to learn and may make it better at eliciting the model's latent knowledge.<\/p>\n<p><strong>Why is MetaMathQA useful? It increases the diversity of chain-of-thought data (Diversity)<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15182\" src=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/e80919a146b5750387c626703cea57be675201.png\" width=\"1080\" height=\"544\" srcset=\"https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/e80919a146b5750387c626703cea57be675201.png 1080w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/e80919a146b5750387c626703cea57be675201-300x151.png 300w, 
https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/e80919a146b5750387c626703cea57be675201-1024x516.png 1024w, https:\/\/aiforumimage.oss-cn-shanghai.aliyuncs.com\/wp-content\/uploads\/2023\/10\/e80919a146b5750387c626703cea57be675201-768x387.png 768w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/p>\n<p>By comparing the diversity gain of the data with the accuracy gain of the model, they found that adding the same amount of augmented data from Rephrasing, FOBAR, and SV brings a clear diversity gain and significantly improves model accuracy. In contrast, simply using answer augmentation leads to obvious accuracy saturation; once accuracy has saturated, adding more AnsAug data yields only limited improvement.<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>To improve models' forward and backward reasoning, researchers from Cambridge, HKUST, and Huawei built on two widely used math datasets (GSM8K and MATH [&hellip;]<\/p>\n","protected":false},"author":56,"featured_media":15176,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,23,80],"tags":[850],"class_list":["post-15175","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-23","category-80","tag-metamath"],"_links":{"self":[{"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/posts\/15175","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/users\/56"}],"replies":[{"embeddable":true,"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/comments?post=15175"}],"version-history":[{"count":1,"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/posts\/15175\/revisions"}],"predecessor-version":[{"id":15183,"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/posts\/15175\/revisions\/15183"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/media\/15176"}],"wp:attachment":[{"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/media?parent=15175"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/categories?post=15175"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aif.amtbbs.org\/index.php\/wp-json\/wp\/v2\/tags?post=15175"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}