Question: Google Cloud speech transcription for Video Intelligence

Stack Overflow user

Asked 2021-01-24 04:48:58

Answers: 1 | Views: 65 | Followers: 0 | Votes: 0

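I plan to use Google Cloud speech transcription through the Video Intelligence API. The following code analyzes only a given segment of the video.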

from google.cloud import videointelligence
# The enums and types modules exist only in 1.x releases of the library.
from google.cloud.videointelligence import enums, types


def transcribe_speech(video_uri, language_code, segments=None):
    """Run speech transcription on a GCS video, optionally on given segments."""
    video_client = videointelligence.VideoIntelligenceServiceClient()
    features = [enums.Feature.SPEECH_TRANSCRIPTION]
    config = types.SpeechTranscriptionConfig(
        language_code=language_code,
        enable_automatic_punctuation=True,
    )
    context = types.VideoContext(
        segments=segments,
        speech_transcription_config=config,
    )

    print(f'Processing video "{video_uri}"...')
    operation = video_client.annotate_video(
        input_uri=video_uri,
        features=features,
        video_context=context,
    )
    # Block until the long-running operation finishes.
    return operation.result()


video_uri = "gs://cloudmleap/video/next/JaneGoodall.mp4"
language_code = "en-GB"
# Restrict the analysis to the window from 55 s to 80 s.
segment = types.VideoSegment()
segment.start_time_offset.FromSeconds(55)
segment.end_time_offset.FromSeconds(80)
response = transcribe_speech(video_uri, language_code, [segment])

How can I analyze the whole video automatically instead of defining a specific segment?

google-cloud-platform

speech-recognition

video-intelligence-api

1 Answer

Stack Overflow user

Accepted answer

Answered 2021-01-25 10:04:36

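You can follow the tutorial in the Video Intelligence documentation, which shows how to transcribe a whole video. The `segments` field of the video context is optional: if you leave it out, the video is treated as a single segment and transcribed from start to end. Your input has to be stored in a GCS bucket, and your sample code shows that your video already is, so you should not run into any problems.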
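As a minimal sketch of the difference between segment-limited and whole-video analysis: the helper below only builds the keyword arguments for `VideoContext` and drops `segments` when none are given, so no segment restriction is sent. The name `context_kwargs` is hypothetical and not part of the client library.

```python
def context_kwargs(speech_config, segments=None):
    """Build keyword arguments for videointelligence.VideoContext.

    `speech_config` is a SpeechTranscriptionConfig built by the caller.
    `segments` is an optional list of VideoSegment objects; when it is
    empty or None the key is omitted, so the request carries no
    segment restriction.
    """
    kwargs = {"speech_transcription_config": speech_config}
    if segments:
        kwargs["segments"] = list(segments)
    return kwargs


# Usage with the client library (not executed here, needs GCP credentials):
#   config = videointelligence.SpeechTranscriptionConfig(language_code="en-GB")
#   context = videointelligence.VideoContext(**context_kwargs(config))
```

Calling `context_kwargs(config)` without a second argument is the whole-video case; passing a list of segments reproduces the behaviour of the code in the question.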

Just make sure you have installed the latest Video Intelligence library:

pip install --upgrade google-cloud-videointelligence
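One thing to watch after upgrading: the code in the question imports from `enums` and `types`, which belong to the 1.x releases of the library, while the snippet in this answer uses the 2.x style where classes hang directly off `videointelligence`. The sketch below checks which major version is installed. The helper names `major_version` and `installed_major` are hypothetical, and the parsing assumes a plain `X.Y.Z` style version string.

```python
from importlib import metadata


def major_version(version):
    """Return the leading integer of a version string such as '2.11.0'."""
    return int(version.split(".")[0])


def installed_major(dist_name="google-cloud-videointelligence"):
    """Return the installed major version of `dist_name`, or None if absent."""
    try:
        return major_version(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("installed major version:", installed_major())
```

If this reports 2 or higher, use `videointelligence.Feature` and `videointelligence.VideoContext` as in the snippet from the documentation rather than `enums.Feature` and `types.VideoContext`.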

Here is the code snippet from the Video Intelligence documentation that transcribes the audio:

"""Transcribe speech from a video stored on GCS."""
from google.cloud import videointelligence

path="gs://your_gcs_bucket/your_video.mp4"
video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.SPEECH_TRANSCRIPTION]

config = videointelligence.SpeechTranscriptionConfig(
    language_code="en-US", enable_automatic_punctuation=True
)
video_context = videointelligence.VideoContext(speech_transcription_config=config)

operation = video_client.annotate_video(
    request={
        "features": features,
        "input_uri": path,
        "video_context": video_context,
    }
)

print("\nProcessing video for speech transcription.")

result = operation.result(timeout=600)

# There is only one annotation_result since only
# one video is processed.
annotation_results = result.annotation_results[0]
for speech_transcription in annotation_results.speech_transcriptions:

    # The number of alternatives for each transcription is limited by
    # SpeechTranscriptionConfig.max_alternatives.
    # Each alternative is a different possible transcription
    # and has its own confidence score.
    for alternative in speech_transcription.alternatives:
        print("Alternative level information:")

        print("Transcript: {}".format(alternative.transcript))
        print("Confidence: {}\n".format(alternative.confidence))

        print("Word level information:")
        for word_info in alternative.words:
            word = word_info.word
            start_time = word_info.start_time
            end_time = word_info.end_time
            print(
                "\t{}s - {}s: {}".format(
                    start_time.seconds + start_time.microseconds * 1e-6,
                    end_time.seconds + end_time.microseconds * 1e-6,
                    word,
                )
            )
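When a whole video is transcribed, the result usually arrives as several `speech_transcriptions` entries, so it can help to stitch them together. The following is a rough sketch under two assumptions: that word offsets are `datetime.timedelta` values (as the `.seconds` and `.microseconds` attributes above suggest), and that the first alternative of each entry is the most likely one. The helper names are hypothetical, and `SimpleNamespace` objects stand in for the API response in the dry run.

```python
from datetime import timedelta
from types import SimpleNamespace


def offset_seconds(offset):
    """Convert a timedelta offset to float seconds, including any days part."""
    return offset.total_seconds()


def format_word(word_info):
    """Format one word entry as '<start>s - <end>s: <word>'."""
    start = offset_seconds(word_info.start_time)
    end = offset_seconds(word_info.end_time)
    return f"{start:.3f}s - {end:.3f}s: {word_info.word}"


def full_transcript(speech_transcriptions):
    """Join the first alternative of every transcription entry."""
    parts = []
    for transcription in speech_transcriptions:
        # Entries without any alternative are skipped.
        if transcription.alternatives:
            parts.append(transcription.alternatives[0].transcript.strip())
    return " ".join(part for part in parts if part)


if __name__ == "__main__":
    # Stand-in object shaped like a word entry, for a local dry run.
    word = SimpleNamespace(
        word="hello",
        start_time=timedelta(seconds=1, microseconds=500000),
        end_time=timedelta(seconds=2),
    )
    print(format_word(word))
```

With a real response, `full_transcript(annotation_results.speech_transcriptions)` would return the text of the entire video as one string.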

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/65864221
