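SageMaker TensorFlow Object Detection Models

This article describes how to run object detection inference in Amazon SageMaker using the TensorFlow Object Detection API.

First, based on an AWS sample notebook, I explain how to run the model on a single image using a SageMaker endpoint. This works for smaller images, but we run into problems with larger ones.

To solve these problems, I switch to batch transform jobs.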

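Starting point: model inference with the SageMaker TensorFlow Object Detection API

AWS provides some good examples on GitHub of how to use SageMaker.

I used one of these examples to run predictions with an object detection model built with the TensorFlow Object Detection API.

When the model is deployed as an endpoint, you invoke the endpoint to run inference on one image at a time. The following code, taken from the sample notebook, shows how to define a TensorFlowModel and deploy it as a model endpoint: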

import cv2
import sagemaker
from sagemaker.utils import name_from_base
from sagemaker.tensorflow import TensorFlowModel

role = sagemaker.get_execution_role()
model_artifact = '<your-model-s3-path>'  # S3 location of the packaged model

# Define the model that will be hosted
model_endpoint = TensorFlowModel(
    name=name_from_base('tf2-object-detection'),
    model_data=model_artifact,
    role=role,
    framework_version='2.2',
)

# Deploy the model as a real-time endpoint
predictor = model_endpoint.deploy(initial_instance_count=1, instance_type='ml.m5.large')

Next, load the image as a NumPy array and convert it to a list so that it can be passed to the endpoint:

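def image_file_to_tensor(path):
    # Read the image as 8-bit BGR and convert it to RGB
    cv_img = cv2.imread(path, 1).astype('uint8')
    cv_img = cv2.cvtColor(cv_img, cv2.COLOR_BGR2RGB)
    return cv_img

img = image_file_to_tensor('test_images/22673445.jpg')

# Named payload rather than input to avoid shadowing the Python built-in
payload = {
    'instances': [img.tolist()]
}

Finally, invoke the endpoint:

detections = predictor.predict(payload)['predictions'][0]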
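To make the returned detections easier to work with, here is a minimal sketch that keeps only the confident detections and converts their boxes to pixel coordinates. It assumes the model returns the usual TensorFlow Object Detection API keys (detection_boxes, detection_scores, detection_classes) with boxes normalized as [ymin, xmin, ymax, xmax]. The function name boxes_above_threshold and the default threshold of 0.5 are my own choices for illustration, not part of the sample notebook.

```python
def boxes_above_threshold(detections, image_height, image_width, min_score=0.5):
    """Return (class_id, score, pixel_box) tuples for confident detections.

    Assumption: boxes are [ymin, xmin, ymax, xmax], normalized to [0, 1].
    The returned pixel_box is (left, top, right, bottom).
    """
    results = []
    for box, score, class_id in zip(
        detections["detection_boxes"],
        detections["detection_scores"],
        detections["detection_classes"],
    ):
        if score < min_score:
            continue  # skip low-confidence detections
        ymin, xmin, ymax, xmax = box
        pixel_box = (
            round(xmin * image_width),
            round(ymin * image_height),
            round(xmax * image_width),
            round(ymax * image_height),
        )
        results.append((int(class_id), float(score), pixel_box))
    return results
```

With the image loaded above, this could be called as boxes_above_threshold(detections, img.shape[0], img.shape[1]).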

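Problem: the endpoint request payload is too large

This works well for small images, because the request payload of the API call is small enough. With larger images, however, the API returns a 413 error. This means that the payload exceeds the allowed size of 6 MB.

Of course, we could resize the image before invoking the endpoint, but I wanted to use batch transform jobs instead.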
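To see why larger images hit the limit, it helps to estimate how big the JSON request body becomes once every pixel is written out as text. The sketch below builds a dummy image as nested lists and measures the serialized size. The helper name json_payload_size and the constant ENDPOINT_LIMIT_BYTES are hypothetical, and the pixel value 255 is chosen as a worst case of three digits per channel.

```python
import json

# 6 MB limit mentioned above, expressed in bytes (assuming 1 MB = 1024 * 1024 bytes)
ENDPOINT_LIMIT_BYTES = 6 * 1024 * 1024


def json_payload_size(height, width, channels=3, pixel_value=255):
    """Return the size in bytes of the JSON body for an image of this shape."""
    row = [[pixel_value] * channels for _ in range(width)]
    # Rows share one list object, which is fine for read-only serialization
    image = [row] * height
    body = json.dumps({"instances": [image]})
    return len(body.encode("utf-8"))


for side in (100, 1000):
    size = json_payload_size(side, side)
    verdict = "fits" if size <= ENDPOINT_LIMIT_BYTES else "too large"
    print(f"{side}x{side}: {size / 1024 / 1024:.1f} MB ({verdict})")
```

The JSON text is several times larger than the raw pixel data, so an image can exceed the limit even when the file on disk is small.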

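Solution: use batch transform jobs instead

With SageMaker batch transform jobs you can define your own maximum payload size, so we do not run into 413 errors. In addition, these jobs can process a full set of images in one go.

The images need to be stored in an S3 bucket. All images are processed in batch mode (as the name suggests), and the predictions are stored on S3 as well.

To use a batch transform job, we define a TensorFlowModel again, but this time we also specify an entry point and a source directory: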

model_batch = TensorFlowModel(
    name=name_from_base('tf2-object-detection'),
    model_data=model_artifact,
    role=role,
    framework_version='2.2',
    entry_point='inference.py',  # script with the input and output handlers
    source_dir='.',  # folder that contains inference.py
)

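The code in inference.py transforms the input and output data of the model, as described in the documentation. It needs to turn the request payload (the image) into a NumPy array and convert that array to a list object.

Starting from this example, I changed the code so that it loads the image and converts it to a NumPy array. The input_handler function in inference.py becomes: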

import io
import json
import numpy as np
from PIL import Image

def input_handler(data, context):
    """ Pre-process request input before it is sent to TensorFlow Serving REST API
    Args:
        data (obj): the request data, in format of dict or string
        context (Context): an object containing request and configuration details
    Returns:
        (dict): a JSON-serializable dict that contains request body and headers
    """
    if context.request_content_type == "application/x-image":
        # Decode the raw image bytes and convert them to a nested list
        payload = data.read()
        image = Image.open(io.BytesIO(payload))
        array = np.asarray(image)
        return json.dumps({'instances': [array.tolist()]})
    raise ValueError('{{"error": "unsupported content type {}"}}'.format(
        context.request_content_type or "unknown"))

Note that the output_handler function is omitted from the code above.
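For completeness, a pass-through output_handler could look like the sketch below. It is modeled on the handler pattern in the SageMaker TensorFlow Serving container documentation, and it assumes that data behaves like a requests response (with status_code and content) and that context exposes accept_header. Treat it as an illustration, not as the exact code I used.

```python
def output_handler(data, context):
    """Post-process the TensorFlow Serving response before it is returned.

    Args:
        data (obj): the TensorFlow Serving response (assumed to expose
            status_code and content)
        context (Context): an object containing request and configuration details
    Returns:
        (bytes, string): response body and its content type
    """
    if data.status_code != 200:
        # Surface the error message of TensorFlow Serving to the caller
        raise ValueError(data.content.decode("utf-8"))
    return data.content, context.accept_header
```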

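The input_handler function requires the Python packages NumPy and Pillow, which are not installed on the machine that runs the batch inference job.

We could build our own image and use it (via the image_uri keyword when initializing the TensorFlowModel object).

Alternatively, we can provide a requirements.txt and store it in the folder that contains the notebook (referenced as source_dir='.'). While the image is bootstrapped, this file is used to install the required packages with pip. Its contents are: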

numpy
pillow

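At first I wanted to use OpenCV (as in the endpoint example), but that package is not as easy to install.

Instead of deploying the model as a model endpoint, we now use the model to create a transformer object: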

input_path = "s3://bucket/input"
output_path = "s3://bucket/output"

tensorflow_serving_transformer = model_batch.transformer(
    instance_count=1,
    instance_type="ml.m5.large",
    max_concurrent_transforms=1,
    max_payload=5,  # maximum payload size in MB
    output_path=output_path,
)

Finally, start the job with transform:

tensorflow_serving_transformer.transform(
    input_path,
    content_type="application/x-image",
)

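The images are processed by the model, and the results end up as JSON files in the output_path bucket. Each output file is named after its input file, followed by the .out extension. You can also tune the instance type, the maximum payload size, and other settings.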
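The naming rule above can be captured in a small helper, together with a function that pulls the detections out of a downloaded .out file. This is a sketch under two assumptions: the input images sit directly under the input prefix (no nested folders), and the .out file contains the same {"predictions": [...]} structure that the endpoint returned. Both function names are hypothetical.

```python
import json
import posixpath


def output_uri_for(input_uri, output_path):
    """Return the S3 URI where the result for input_uri is expected."""
    file_name = posixpath.basename(input_uri)
    return posixpath.join(output_path, file_name + ".out")


def first_prediction(out_file_text):
    """Parse the text of a .out file and return the detections for its image."""
    return json.loads(out_file_text)["predictions"][0]


print(output_uri_for("s3://bucket/input/22673445.jpg", "s3://bucket/output"))
```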

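Final thoughts

This is most likely not the most cost-effective approach, because we pass the image to the transformer as a NumPy array.

In addition, we could adjust the output_handler function in inference.py to compress the JSON that is stored on S3, or to return only the relevant detections.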
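As an illustration of the second idea, the sketch below shows an output_handler that keeps only detections above a score threshold and writes compact JSON. It assumes the response uses the usual TensorFlow Object Detection API keys and that data and context expose status_code, content and accept_header. MIN_SCORE and keep_confident are hypothetical names, and compact separators stand in for real compression such as gzip, which is not shown here.

```python
import json

MIN_SCORE = 0.5  # hypothetical threshold, tune it for your model


def keep_confident(prediction, min_score=MIN_SCORE):
    """Keep only boxes, scores and classes whose score reaches min_score."""
    keep = [
        i for i, score in enumerate(prediction["detection_scores"])
        if score >= min_score
    ]
    return {
        key: [prediction[key][i] for i in keep]
        for key in ("detection_boxes", "detection_scores", "detection_classes")
    }


def output_handler(data, context):
    """Trim the TensorFlow Serving response before it is written to S3."""
    if data.status_code != 200:
        raise ValueError(data.content.decode("utf-8"))
    predictions = json.loads(data.content.decode("utf-8"))["predictions"]
    trimmed = {"predictions": [keep_confident(p) for p in predictions]}
    # Compact separators drop the whitespace that json.dumps adds by default
    body = json.dumps(trimmed, separators=(",", ":"))
    return body, context.accept_header
```

Dropping the low-score detections and the other output tensors can shrink each .out file considerably, since the model typically returns a fixed number of candidate boxes per image.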

世界人工智能论坛

Author: yinhua
