cnn+rnn+attention

MachineLP · 2018-01-09

Below is the code for a single-layer RNN with attention. If you want a multi-layer RNN, see the author's post on the difference between tf.contrib.rnn.static_rnn and tf.nn.dynamic_rnn.

import tensorflow as tf


def attention(inputs, attention_size, time_major=False):
    if isinstance(inputs, tuple):
        # In case of Bi-RNN, concatenate the forward and the backward RNN outputs.
        inputs = tf.concat(inputs, 2)

    if time_major:
        # (T,B,D) => (B,T,D)
        inputs = tf.transpose(inputs, [1, 0, 2])

    inputs_shape = inputs.shape
    sequence_length = inputs_shape[1].value  # the length of sequences processed in the antecedent RNN layer
    hidden_size = inputs_shape[2].value  # hidden size of the RNN layer

    # Attention mechanism
    W_omega = tf.Variable(tf.random_normal([hidden_size, attention_size], stddev=0.1))
    b_omega = tf.Variable(tf.random_normal([attention_size], stddev=0.1))
    u_omega = tf.Variable(tf.random_normal([attention_size], stddev=0.1))

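    # Score each time step: v = tanh(h*W_omega + b_omega), vu = v*u_omega; a softmax over time then yields the attention weights alphas.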
    v = tf.tanh(tf.matmul(tf.reshape(inputs, [-1, hidden_size]), W_omega) + tf.reshape(b_omega, [1, -1]))
    vu = tf.matmul(v, tf.reshape(u_omega, [-1, 1]))
    exps = tf.reshape(tf.exp(vu), [-1, sequence_length])
    alphas = exps / tf.reshape(tf.reduce_sum(exps, 1), [-1, 1])

    # Output of Bi-RNN is reduced with attention vector
    output = tf.reduce_sum(inputs * tf.reshape(alphas, [-1, sequence_length, 1]), 1)

    return output

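For reference, here is a minimal standalone usage sketch of attention() (not from the original post; it assumes TensorFlow 1.x, and the shapes below are illustrative):

rnn_out = tf.placeholder(tf.float32, [None, 32, 256])      # batch-major (B, T, D)
pooled = attention(rnn_out, attention_size=50)              # -> (B, 256)

# Bi-RNN case: pass the (forward, backward) outputs as a tuple.
fw = tf.placeholder(tf.float32, [None, 32, 256])
bw = tf.placeholder(tf.float32, [None, 32, 256])
pooled_bi = attention((fw, bw), attention_size=50)          # -> (B, 512)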

# The CNN output has shape B*4*8*128
chunk_size = 128  
chunk_n = 32  
rnn_size = 256  
attention_size = 50
n_output_layer = MAX_CAPTCHA*CHAR_SET_LEN   # size of the output layer

# Define the network to be trained
def recurrent_neural_network(): 
    data = crack_captcha_cnn()    
   
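    # Reshape the B*4*8*128 CNN feature map into a length-chunk_n list of (B, chunk_size) tensors, the input format expected by static_rnn.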
    data = tf.reshape(data, [-1, chunk_n, chunk_size])  
    data = tf.transpose(data, [1,0,2])  
    data = tf.reshape(data, [-1, chunk_size])  
    data = tf.split(data, chunk_n, 0)
    
    # RNN only
    #layer = {'w_':tf.Variable(tf.random_normal([rnn_size, n_output_layer])), 'b_':tf.Variable(tf.random_normal([n_output_layer]))} 
    #lstm_cell = tf.contrib.rnn.BasicLSTMCell(rnn_size) 
    #outputs, status = tf.contrib.rnn.static_rnn(lstm_cell, data, dtype=tf.float32)  
    #output = tf.add(tf.matmul(outputs[-1], layer['w_']), layer['b_'])  
    
    # RNN + Attention    
    lstm_cell = tf.contrib.rnn.BasicLSTMCell(rnn_size) 
    outputs, status = tf.contrib.rnn.static_rnn(lstm_cell, data, dtype=tf.float32)     
    attention_output = attention(outputs, attention_size, True)
    
    # output
    drop = tf.nn.dropout(attention_output, keep_prob)
    # Fully connected layer
    W = tf.Variable(tf.truncated_normal([rnn_size, n_output_layer], stddev=0.1), name="W")
    b = tf.Variable(tf.constant(0., shape=[n_output_layer]), name="b")
    
    output = tf.nn.xw_plus_b(drop, W, b, name="scores")
    return output
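
The snippet above assumes that crack_captcha_cnn(), keep_prob, MAX_CAPTCHA and CHAR_SET_LEN are defined elsewhere in the full script. Below is a hedged sketch of one common way to train on the returned logits (a multi-label sigmoid cross-entropy over all captcha characters); it is an illustration, not necessarily the author's exact training code:

# Hypothetical training wiring; the placeholder Y and the learning rate are assumptions.
output = recurrent_neural_network()
Y = tf.placeholder(tf.float32, [None, MAX_CAPTCHA * CHAR_SET_LEN])
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=output, labels=Y))
train_op = tf.train.AdamOptimizer(0.001).minimize(loss)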