Google Cloud Vision

GPT-4.icon nishio.icon

First, install the Google Cloud SDK, set up a project, and authenticate.

Install the Google Cloud SDK: https://cloud.google.com/sdk/docs/install

Set up the project with the gcloud init command: https://cloud.google.com/sdk/gcloud/reference/init

Create a service account key: https://cloud.google.com/docs/authentication/getting-started

To grant access to the project, give the service account the Project > Owner role.

To grant the role, find the "Select a role" list and choose Project > Owner.

Gotcha: the label is translated in the console, so searching the page for this text gets no hits nishio.icon

https://gyazo.com/13aa452634eb1f91bad91b4e53896024

Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the service account key file.
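A missing or mistyped key path tends to surface only later as an authentication error, so it may help to check it up front. The following is a minimal sketch using only the standard library; the helper name `check_credentials` is hypothetical and not part of any Google library.

```python
import os
from pathlib import Path


def check_credentials(env=os.environ):
    """Return the key path if GOOGLE_APPLICATION_CREDENTIALS points at an existing file."""
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not Path(key_path).is_file():
        raise FileNotFoundError(f"Service account key not found: {key_path}")
    return key_path
```

Calling `check_credentials()` before creating the client fails early with a readable message. It only checks that the file exists, not that the key is valid.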

The following code runs OCR using the Google Cloud Vision API.

code:python
 import os

 from google.cloud import vision

 # Point the client library at the service account key file.
 os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/your/service-account-key.json"


 def ocr_image(image_path):
     """Run text detection on one image and return the full detected text."""
     client = vision.ImageAnnotatorClient()
     with open(image_path, 'rb') as image_file:
         content = image_file.read()
     image = vision.Image(content=content)
     response = client.text_detection(image=image)
     # The API reports per-request failures in response.error.
     if response.error.message:
         raise RuntimeError(response.error.message)
     texts = response.text_annotations
     # The first annotation holds the whole text; later ones are individual words.
     if texts:
         return texts[0].description
     else:
         return ""


 def ocr_readable_images(images_dir):
     """OCR every file in images_dir whose name starts with "readable_page"."""
     ocr_texts = []
     for image_name in sorted(os.listdir(images_dir)):
         if image_name.startswith("readable_page"):
             image_path = os.path.join(images_dir, image_name)
             ocr_text = ocr_image(image_path)
             ocr_texts.append(ocr_text)
     return ocr_texts


 if __name__ == '__main__':
     images_dir = 'output/images'
     ocr_texts = ocr_readable_images(images_dir)
     for i, text in enumerate(ocr_texts):
         print(f'Page {i+1}:\n{text}\n{"="*40}')
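One thing to watch in the code above: `sorted(os.listdir(...))` orders names as plain strings. If the page files are numbered without zero padding (an assumption; the actual file names are not shown here), page 10 would sort before page 2. A hypothetical `natural_key` helper, sketched below, sorts by the numeric parts instead.

```python
import re


def natural_key(name):
    """Split a name into text and number chunks so 'page2' sorts before 'page10'."""
    # re.split with a capturing group puts the digit runs at odd indices.
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


# Hypothetical file names, for illustration only.
names = ["readable_page10.png", "readable_page2.png", "readable_page1.png"]
print(sorted(names))                   # plain string order
print(sorted(names, key=natural_key))  # numeric page order
```

To use it, pass `key=natural_key` to the `sorted` call in `ocr_readable_images`. If the names are already zero padded, the plain sort is fine.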

nishio.icon

$ pip install google-cloud-vision

PermissionDenied: 403 Cloud Vision API has not been used in project 573408197915 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/vision.googleapis.com/overview?project=... then retry.

https://gyazo.com/b87af84021d2ca24093e39f187394017

PermissionDenied: 403 This API method requires billing to be enabled. Please enable billing on project

input

https://gyazo.com/99abdce33c5963c4b4f9fbf061d31a2d

output

https://gyazo.com/dc950232861a0723e2c3ce96fba25ffc

price

https://gyazo.com/14463ed63ed3610cfbd680c0386e5b0b

Each request takes a little over 1 second.

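The rate limit is 1,800 requests per minute (30 per second), so at roughly 1 second per request about 30 parallel requests should be possible.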
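The estimate above is rate (requests per second) times latency (seconds). A rough sketch of running the OCR calls in a thread pool sized that way follows. `estimate_workers` and `ocr_in_parallel` are hypothetical helpers, the constants are just the figures quoted above, and `ocr_fn` stands in for a function such as `ocr_image`. This has not been measured against the real API.

```python
from concurrent.futures import ThreadPoolExecutor

RATE_LIMIT_PER_MINUTE = 1800  # figure quoted above
SECONDS_PER_REQUEST = 1.0     # rough latency observed above


def estimate_workers(rate_per_minute=RATE_LIMIT_PER_MINUTE,
                     latency_s=SECONDS_PER_REQUEST):
    """Concurrency that keeps throughput at or below the limit: rate (req/s) * latency (s)."""
    return max(1, int(rate_per_minute / 60 * latency_s))


def ocr_in_parallel(image_paths, ocr_fn, max_workers=None):
    """Apply ocr_fn to every path in a thread pool; results keep the input order."""
    workers = max_workers or estimate_workers()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ocr_fn, image_paths))
```

Threads are used because each call mostly waits on the network. Usage would look like `ocr_in_parallel(paths, ocr_image)`. This only caps concurrency, so if requests finish faster than expected the per-minute limit could still be exceeded.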

Google Cloud Vision, Cloud Vision