
Error constructing a TensorFlow bijector

Stack Overflow user
Asked 2018-05-26 07:28:16

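I have just started using TensorFlow. I want to construct a bijector with the following property: it takes an n-dimensional probability distribution p(x1, x2, ..., xn) and transforms only two chosen dimensions i and j, such that xi' = xi and xj' = xj*exp(s(xi)) + t(xi), where s and t are two functions implemented with neural networks. My basic code is as follows: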

# Assumed setup (not shown in the original post); these aliases follow TF 1.x contrib.
import tensorflow as tf
tfd = tf.contrib.distributions
tfb = tfd.bijectors
DTYPE = tf.float32

def net(x, out_size, block_w_id, block_d_id, layer_id):
    # Two 256-unit hidden layers followed by an output layer of width out_size.
    # The scope suffix encodes the block and layer ids so each coupling layer gets its own variables.
    suffix = 'block_w_{}_block_d_{}_layer_{}'.format(block_w_id, block_d_id, layer_id)
    x = tf.contrib.layers.fully_connected(x, 256, reuse=tf.AUTO_REUSE, scope='x1_' + suffix)
    x = tf.contrib.layers.fully_connected(x, 256, reuse=tf.AUTO_REUSE, scope='x2_' + suffix)
    y = tf.contrib.layers.fully_connected(x, out_size, reuse=tf.AUTO_REUSE, scope='y_' + suffix)
#     return layers.stack(x, layers.fully_connected(reuse=tf.AUTO_REUSE), [512, 512, out_size])
    return y

class NVPCoupling(tfb.Bijector):
    """NVP affine coupling layer for 2D units.
    """
    def __init__(self, input_idx1, input_idx2, block_w_id = 0, block_d_id = 0, layer_id = 0, validate_args = False\
                 , name="NVPCoupling"):
        """
        NVPCoupling only manipulate two inputs with idx1 & idx2.
        """
        super(NVPCoupling, self).__init__(\
                                         event_ndims = 1, validate_args = validate_args, name = name)
        self.idx1 = input_idx1
        self.idx2 = input_idx2
        self.block_w_id = block_w_id
        self.block_d_id = block_d_id
        self.layer_id = layer_id
        # create variables
        tmp = tf.placeholder(dtype=DTYPE, shape = [1, 1])
        self.s(tmp) 
        self.t(tmp)

    def s(self, xd):
        with tf.variable_scope('s_block_w_id_{}_block_d_id_{}_layer_{}'.format(self.block_w_id,\
                                                                              self.block_d_id,\
                                                                              self.layer_id),\
                              reuse = tf.AUTO_REUSE):
            return net(xd, 1, self.block_w_id, self.block_d_id, self.layer_id)
    def t(self, xd):
        with tf.variable_scope('t_block_w_id_{}_block_d_id_{}_layer_{}'.format(self.block_w_id,\
                                                                              self.block_d_id,\
                                                                              self.layer_id),\
                              reuse = tf.AUTO_REUSE):
            return net(xd, 1, self.block_w_id, self.block_d_id, self.layer_id)
    def _forward(self, x):
        x_left, x_right = x[:, self.idx1:(self.idx1 + 1)], x[:, self.idx2:(self.idx2 + 1)]
        y_right = x_right * tf.exp(self.s(x_left)) + self.t(x_left)

        output_tensor = tf.concat([ x[:,0:self.idx1], x_left, x[:, self.idx1+1:self.idx2]\
                                   , y_right, x[:, (self.idx2+1):]], axis = 1)
        return output_tensor
    def _inverse(self, y):
        y_left, y_right = y[:, self.idx1:(self.idx1 + 1)], y[:, self.idx2:(self.idx2 + 1)]
        x_right = (y_right - self.t(y_left)) * tf.exp(-self.s(y_left))
        output_tensor = tf.concat([ y[:, 0:self.idx1], y_left, y[:, self.idx1+1 : self.idx2]\
                                  , x_right, y[:, (self.idx2+1):]], axis = 1)
        return output_tensor
    def _forward_log_det_jacobian(self, x):
        event_dims = self._event_dims_tensor(x)
        x_left = x[:, self.idx1:(self.idx1+1)]
        return tf.reduce_sum(self.s(x_left), axis=event_dims)

But it does not work as I expected. When I use this class, it raises an error:

base_dist = tfd.MultivariateNormalDiag(loc=tf.zeros([2], DTYPE))
num_bijectors = 4
bijectors = []
bijectors.append(NVPCoupling(input_idx1=0, input_idx2=1, \
                             block_w_id=0, block_d_id=0, layer_id=0))
bijectors.append(NVPCoupling(input_idx1=1, input_idx2=0, \
                             block_w_id=0, block_d_id=0, layer_id=1))
bijectors.append(NVPCoupling(input_idx1=0, input_idx2=1, \
                             block_w_id=0, block_d_id=0, layer_id=2))
bijectors.append(NVPCoupling(input_idx1=0, input_idx2=1, \
                             block_w_id=0, block_d_id=0, layer_id=3))
flow_bijector = tfb.Chain(list(reversed(bijectors)))
dist = tfd.TransformedDistribution(
    distribution=base_dist,
    bijector=flow_bijector)
dist.sample(1000)

The error is:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-04da05d30f8d> in <module>()
----> 1 dist.sample(1000)

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/distributions/distribution.pyc in sample(self, sample_shape, seed, name)
    708       samples: a `Tensor` with prepended dimensions `sample_shape`.
    709     """
--> 710     return self._call_sample_n(sample_shape, seed, name)
    711 
    712   def _log_prob(self, value):

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/distributions/transformed_distribution.pyc in _call_sample_n(self, sample_shape, seed, name, **kwargs)
    412       # returned result.
    413       y = self.bijector.forward(x, **kwargs)
--> 414       y = self._set_sample_static_shape(y, sample_shape)
    415 
    416       return y

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/distributions/distribution.pyc in _set_sample_static_shape(self, x, sample_shape)
   1220       shape = tensor_shape.TensorShape(
   1221           [None]*(ndims - event_ndims)).concatenate(self.event_shape)
-> 1222       x.set_shape(x.get_shape().merge_with(shape))
   1223 
   1224     # Infer batch shape.

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/tensor_shape.pyc in merge_with(self, other)
    671         return TensorShape(new_dims)
    672       except ValueError:
--> 673         raise ValueError("Shapes %s and %s are not compatible" % (self, other))
    674 
    675   def concatenate(self, other):

ValueError: Shapes (1000, 4) and (?, 2) are not compatible

I really hope some experts can help me understand where I went wrong and how to fix it. Many thanks! H.

1 Answer

Stack Overflow user

Accepted answer

Posted 2018-05-30 07:16:20

I think the problem is here (slightly reformatted for clarity):

output_tensor = tf.concat([
    x[:,0:self.idx1],
    x_left,
    x[:, self.idx1+1:self.idx2],
    y_right,
    x[:, (self.idx2+1):]
], axis = 1)

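This assumes that idx2 > idx1, which is not true when you pass idx1=1 and idx2=0. In that case the slices overlap, so you concatenate more columns than you intended and the second dimension becomes 4 instead of 2.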
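The width arithmetic can be checked without TensorFlow, because slicing a Python list follows the same start/stop rules as slicing the second axis of a tensor. The sketch below is only an illustration: `concat_width` is a hypothetical helper, not part of the original code, and it counts how many columns the five concatenated pieces contribute for an input of width `n`.

```python
def concat_width(n, idx1, idx2):
    """Width of the tensor produced by the five-piece concat, for input width n."""
    row = list(range(n))  # stand-in for one row of x
    pieces = [
        row[0:idx1],          # x[:, 0:idx1]
        [row[idx1]],          # x_left (always one column)
        row[idx1 + 1:idx2],   # x[:, idx1+1:idx2], empty when idx2 <= idx1 + 1
        [row[idx2]],          # y_right (always one column)
        row[idx2 + 1:],       # x[:, idx2+1:]
    ]
    return sum(len(p) for p in pieces)


# Same sequence of index pairs as the four bijectors in the question.
width = 2
for idx1, idx2 in [(0, 1), (1, 0), (0, 1), (0, 1)]:
    width = concat_width(width, idx1, idx2)
    print(idx1, idx2, width)
```

With idx1=1 and idx2=0, the first and last slices each pick up a column that is also covered by x_left or y_right, which is where the extra two columns come from. Once the width has grown to 4, the later layers simply pass that width along.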

I printed the shapes inside _forward, like this:

print("self.idx1: %s" % self.idx1)
print("self.idx2: %s" % self.idx2)
print("x[:,0:self.idx1]: %s" % x[:,0:self.idx1].shape)
print("x_left: %s" % x_left.shape)
print("x[:, self.idx1+1:self.idx2]: %s" %
      x[:, self.idx1+1:self.idx2].shape)
print("x_right.shape: %s" % x_right.shape)
print("y_right: %s" % y_right.shape)
print("x[:, (self.idx2+1):]: %s" % x[:, (self.idx2+1):].shape)
print("output_tensor.shape: %s" % output_tensor.shape)

and got the following output:

self.idx1: 0
self.idx2: 1
x[:,0:self.idx1]: (1000, 0)
x_left: (1000, 1)
x[:, self.idx1+1:self.idx2]: (1000, 0)
x_right.shape: (1000, 1)
y_right: (1000, 1)
x[:, (self.idx2+1):]: (1000, 0)
output_tensor.shape: (1000, 2)

self.idx1: 1
self.idx2: 0
x[:,0:self.idx1]: (1000, 1)
x_left: (1000, 1)
x[:, self.idx1+1:self.idx2]: (1000, 0)
x_right.shape: (1000, 1)
y_right: (1000, 1)
x[:, (self.idx2+1):]: (1000, 1)
output_tensor.shape: (1000, 4)

self.idx1: 0
self.idx2: 1
x[:,0:self.idx1]: (1000, 0)
x_left: (1000, 1)
x[:, self.idx1+1:self.idx2]: (1000, 0)
x_right.shape: (1000, 1)
y_right: (1000, 1)
x[:, (self.idx2+1):]: (1000, 2)
output_tensor.shape: (1000, 4)

self.idx1: 0
self.idx2: 1
x[:,0:self.idx1]: (1000, 0)
x_left: (1000, 1)
x[:, self.idx1+1:self.idx2]: (1000, 0)
x_right.shape: (1000, 1)
y_right: (1000, 1)
x[:, (self.idx2+1):]: (1000, 2)
output_tensor.shape: (1000, 4)

I think you need to think more carefully about how the pieces are reassembled in this concat block when idx1 > idx2.
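One possible way to make the reassembly independent of the order of idx1 and idx2 is to copy the input and overwrite only the transformed column, instead of stitching slices together. The following is a minimal pure-Python sketch of that idea on a single row, not the original poster's code. The functions `s` and `t` here are hypothetical stand-ins for the two networks, and `coupling_forward` / `coupling_inverse` are illustrative names. In TensorFlow, something similar might be done with `tf.unstack` and `tf.stack` along axis 1, though that is an untested suggestion.

```python
import math


def s(v):
    # Hypothetical stand-in for the scale network.
    return 0.5 * v


def t(v):
    # Hypothetical stand-in for the shift network.
    return v + 1.0


def coupling_forward(row, idx1, idx2):
    """Leave every column untouched except idx2, which is transformed using idx1."""
    out = list(row)
    out[idx2] = row[idx2] * math.exp(s(row[idx1])) + t(row[idx1])
    return out


def coupling_inverse(row, idx1, idx2):
    """Undo coupling_forward; column idx1 is unchanged, so s and t can be recomputed."""
    out = list(row)
    out[idx2] = (row[idx2] - t(row[idx1])) * math.exp(-s(row[idx1]))
    return out


x = [0.3, -1.2]
for idx1, idx2 in [(0, 1), (1, 0)]:
    y = coupling_forward(x, idx1, idx2)
    x_back = coupling_inverse(y, idx1, idx2)
    print(idx1, idx2, len(y), x_back)
```

Because only one position is replaced, the output always has the same width as the input, whichever of the two indices is larger.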

Hope this gets you back on track!

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/50537971
