Illegal instruction error when running the TensorFlow C++ Inception v3 example

Stack Overflow user
Asked 2016-07-28 11:23:23
Answers: 1 · Views: 783 · Followers: 0 · Votes: 0

I am trying to follow the image recognition tutorial that uses the C++ API. After compiling TensorFlow with Bazel, I get an error when I try to execute label_image.

I performed the following steps:

# After installing the bazel dependencies, I get the bazel installer
$ mkdir ~/bazel-download && cd ~/bazel-download
$ wget https://github.com/bazelbuild/bazel/releases/download/0.3.0/bazel-0.3.0-installer-linux-x86_64.sh -O bazel-0.3.0-installer-linux-x86_64.sh

$ chmod +x bazel-0.3.0-installer-linux-x86_64.sh
# Install bazel in ~/bin
$ ./bazel-0.3.0-installer-linux-x86_64.sh --user

# Add bazel to the path, if not done already
$ printf '\nexport PATH="$PATH:$HOME/bin"\n' >> ~/.bashrc

# Before continuing, I open a new terminal to refresh the bash PATH
$ mkdir ~/inceptionV3 && cd ~/inceptionV3
# Get a stable version of TensorFlow
$ git clone https://github.com/tensorflow/tensorflow -b r0.9
$ cd tensorflow

# Add the InceptionV3 data/models for the C++ api
$ wget https://storage.googleapis.com/download.tensorflow.org/models/inception_dec_2015.zip -O tensorflow/examples/label_image/data/inception_dec_2015.zip
$ unzip tensorflow/examples/label_image/data/inception_dec_2015.zip -d tensorflow/examples/label_image/data/

# Configure tensorflow: set python path, no Google Cloud Platform support, no GPU support
$ ./configure
# Run bazel build with the allocated resources
$ bazel build -c opt --copt=-mavx --verbose_failures --local_resources 2048,2.0,1.0 -j 1 tensorflow/examples/label_image/...

# -- Here's the last log output from bazel --
INFO: From Compiling tensorflow/core/common_runtime/function.cc:
tensorflow/core/common_runtime/function.cc: In lambda function:
tensorflow/core/common_runtime/function.cc:392:60: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
                } else if (rets->size() != ctx->num_outputs()) {
                                                            ^
INFO: Elapsed time: 6929.927s, Critical Path: 69.23s

# It looks like there was no error during compilation, but when I run the generated executable:
$ ./bazel-bin/tensorflow/examples/label_image/label_image
Illegal instruction

Also, I am running this inside an Ubuntu 14.04.4 LTS x86_64 container on Docker (gcc/g++ version 4.8.4).

I tried other setups (such as installing Bazel through apt), but after recompiling and running the new executable I still get the Illegal instruction error.

The Python part of the tutorial works fine, however (using Python 2.7.6). Any idea how to solve the problem with the C++ API?

edit1: (adding more information about the CPU) Below is the output I get from /proc/cpuinfo.
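
Because the build passes `-mavx`, the part of `/proc/cpuinfo` that matters here is the `flags` line. A minimal sketch of that check, assuming a Linux x86 cpuinfo layout; the function name `has_cpu_flag` and its optional file argument are illustrative and not part of the original post:

```shell
#!/bin/sh
# has_cpu_flag FLAG [FILE]
# Succeeds (exit 0) if FLAG appears as a whole word on the first "flags"
# line of FILE. FILE defaults to /proc/cpuinfo; the parameter exists only
# so the check can be pointed at a saved copy (an assumption of this sketch).
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # -w matches whole words, so "avx" does not match "avx2".
    grep -m1 '^flags' "$file" | grep -qw -- "$flag"
}

if has_cpu_flag avx; then
    echo "CPU reports AVX support"
else
    echo "CPU does not report AVX; code built with -mavx may raise SIGILL"
fi
```

Inside a container the file reflects the host kernel's view of the CPU, so the check can be run in the same Docker container as the build.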

edit2: (trying to debug TensorFlow) I compiled with this command:

$ bazel build -c dbg --strip=always --copt=-mavx --verbose_failures --local_resources 2048,2.0,1.0 -j 1 tensorflow/examples/label_image/...

Then I tried debugging with gdb:

$ gdb -q bazel-bin/tensorflow/examples/label_image/label_image
Reading symbols from bazel-bin/tensorflow/examples/label_image/label_image...(no debugging symbols found)...done.

(gdb) set disable-randomization off

(gdb) run
Starting program: /root/.cache/bazel/_bazel_root/b54d699ba1afcab684f4628c78701dbe/execroot/tensorflow/bazel-out/local-dbg/bin/tensorflow/examples/label_image/label_image
During startup program terminated with signal SIGILL, Illegal instruction.

(gdb) backtrace
No stack.

(gdb) handle SIGILL nostop
Signal        Stop      Print   Pass to program Description
SIGILL        No        Yes     Yes             Illegal instruction

(gdb) run
Starting program: /root/.cache/bazel/_bazel_root/b54d699ba1afcab684f4628c78701dbe/execroot/tensorflow/bazel-out/local-dbg/bin/tensorflow/examples/label_image/label_image
During startup program terminated with signal SIGILL, Illegal instruction.

(gdb) backtrace
No stack.

(gdb) info files
Symbols from "/root/.cache/bazel/_bazel_root/b54d699ba1afcab684f4628c78701dbe/execroot/tensorflow/bazel-out/local-dbg/bin/tensorflow/examples/label_image/label_image".
Local exec file:
        `/root/.cache/bazel/_bazel_root/b54d699ba1afcab684f4628c78701dbe/execroot/tensorflow/bazel-out/local-dbg/bin/tensorflow/examples/label_image/label_image', file type elf64-x86-64.
        Entry point: 0x434b10
        0x0000000000400270 - 0x000000000040028c is .interp
        0x000000000040028c - 0x00000000004002ac is .note.ABI-tag
        0x00000000004002ac - 0x00000000004002cc is .note.gnu.build-id
        0x00000000004002d0 - 0x0000000000400380 is .gnu.hash
        0x0000000000400380 - 0x00000000004027e0 is .dynsym
        0x00000000004027e0 - 0x0000000000404667 is .dynstr
        0x0000000000404668 - 0x0000000000404970 is .gnu.version
        0x0000000000404970 - 0x0000000000404b70 is .gnu.version_r
        0x0000000000404b70 - 0x0000000000431360 is .rela.dyn
        0x0000000000431360 - 0x00000000004334a8 is .rela.plt
        0x00000000004334a8 - 0x00000000004334c2 is .init
        0x00000000004334d0 - 0x0000000000434b10 is .plt
        0x0000000000434b10 - 0x00000000027cfe2f is .text
        0x00000000027cfe30 - 0x00000000027cfe39 is .fini
        0x00000000027cfe40 - 0x0000000003890ed0 is .rodata
        0x0000000003890ed0 - 0x0000000003acc1ec is .eh_frame_hdr
        0x0000000003acc1f0 - 0x000000000441fc2c is .eh_frame
        0x000000000441fc2c - 0x000000000444474f is .gcc_except_table
        0x0000000004644dd0 - 0x0000000004644de0 is .tdata
        0x0000000004644de0 - 0x0000000004644df8 is .tbss
        0x0000000004644de0 - 0x0000000004645a70 is .init_array
        0x0000000004645a70 - 0x0000000004645a78 is .fini_array
        0x0000000004645a78 - 0x0000000004645a80 is .jcr
        0x0000000004645a80 - 0x00000000046a5d50 is .data.rel.ro
        0x00000000046a5d50 - 0x00000000046a5f90 is .dynamic
        0x00000000046a5f90 - 0x00000000046a6000 is .got
        0x00000000046a6000 - 0x00000000046a6b30 is .got.plt
        0x00000000046a6b40 - 0x00000000046a70d0 is .data
        0x00000000046a70e0 - 0x00000000046aae18 is .bss

(gdb) break main
Breakpoint 1 at 0x436cc0

(gdb) run
Starting program: /root/.cache/bazel/_bazel_root/b54d699ba1afcab684f4628c78701dbe/execroot/tensorflow/bazel-out/local-dbg/bin/tensorflow/examples/label_image/label_image
During startup program terminated with signal SIGILL, Illegal instruction.

(gdb) backtrace
No stack.

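So far, since the Illegal instruction error is raised by the SIGILL signal, I believe the current architecture does not match the generated machine code. However, I do not know how to deal with this particular problem.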
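
One way to probe that suspicion is to inspect what the compiler emitted. Code built with `-mavx` may reference `ymm` registers, so finding them in a disassembly suggests the binary needs AVX. A rough sketch, assuming `objdump` from binutils is installed and uses its default AT&T syntax; the helper name `count_ymm_lines` is hypothetical:

```shell
#!/bin/sh
# count_ymm_lines: reads a disassembly on stdin and prints how many lines
# mention a ymm register. This is only a heuristic: AVX instructions that
# use xmm registers are not counted.
count_ymm_lines() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function's status at 0 while the printed count is unchanged.
    grep -c '%ymm[0-9]' || true
}

# Usage (binary path taken from the question):
# objdump -d bazel-bin/tensorflow/examples/label_image/label_image | count_ymm_lines
```

A non-zero count on a machine whose CPU flags lack `avx` would be consistent with the SIGILL seen above.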

1 Answer

Stack Overflow user

Accepted answer

Posted 2016-07-30 00:14:39

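After some searching, it appears that --copt=-mavx is actually an argument passed to gcc to optimize for the architecture of OS X machines, as pointed out here. So it could not work on my Linux "PC" machine.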
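
For context, `-mavx` is a GCC option that lets the compiler emit AVX instructions, and a binary built that way raises SIGILL on a processor or virtualized environment that does not expose AVX. The following is a hedged sketch of adding the flag only when the CPU reports it. `bazel_build_cmd` and its file argument are hypothetical names, and the remaining build options are copied from the question:

```shell
#!/bin/sh
# bazel_build_cmd [CPUINFO_FILE]
# Prints a bazel command line, adding --copt=-mavx only when the CPU flags
# list contains "avx". CPUINFO_FILE defaults to /proc/cpuinfo (the argument
# is an assumption of this sketch, so a saved copy can be inspected).
bazel_build_cmd() {
    file=${1:-/proc/cpuinfo}
    copt=""
    # -w matches whole words, so "avx2" alone does not enable the flag.
    if grep -m1 '^flags' "$file" 2>/dev/null | grep -qw avx; then
        copt=" --copt=-mavx"
    fi
    echo "bazel build -c opt${copt} --verbose_failures --local_resources 2048,2.0,1.0 -j 1 tensorflow/examples/label_image/..."
}

# Print the command for this machine; review it before running it.
bazel_build_cmd
```

The function only prints the command, so nothing is built until the output is copied into the shell.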

Votes: 0
Page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/38634941
