跨NAT,防火墙(firewall)的RAC监听配置(ORA-12545)

    对于存在NAT或防火墙的RAC数据库,在启用了服务器端的load balance后,经常会碰到ORA-12545连接错误,这是因为服务器端转发客户端连接请求到其它节点后,客户端使用返回的IP再次发出连接请求而出现不可识别的IP地址或主机名而造成的。本文描述了这个问题并给出了解决方案。

  有关RAC监听配置请参考 ORACLE RAC 监听配置 (listener.ora tnsnames.ora) ORACLE RAC 下非缺省端口监听配置(listener.ora tnsnames.ora)

一、NAT下监听描述图

二、配置情况

1、服务器端的配置情况
 ################Server node 1 listener.ora########################
 LISTENER_VMDB01P =
   (DESCRIPTION_LIST =
     (DESCRIPTION =
       (ADDRESS = (PROTOCOL = TCP)(HOST = vmdb01pvip)(PORT = 1314)(IP = FIRST))
       (ADDRESS = (PROTOCOL = TCP)(HOST = vmdb01p)(PORT = 1314)(IP = FIRST))
     )
   )
 
 SID_LIST_LISTENER_VMDB01P =
   (SID_LIST =
     (SID_DESC =
       (SID_NAME = PLSExtProc)
       (ORACLE_HOME = /u01/oracle/db)
       (PROGRAM = extproc)
     )
   )
 
 ################Server node 2 listener.ora########################
 LISTENER_VMDB02P =
   (DESCRIPTION_LIST =
     (DESCRIPTION =
       (ADDRESS = (PROTOCOL = TCP)(HOST = vmdb02pvip)(PORT = 1314)(IP = FIRST))
       (ADDRESS = (PROTOCOL = TCP)(HOST = vmdb02p)(PORT = 1314)(IP = FIRST))
     )
   )
 
 SID_LIST_LISTENER_VMDB02P =
   (SID_LIST =
     (SID_DESC =
       (SID_NAME = PLSExtProc)
       (ORACLE_HOME = /u01/oracle/db)
       (PROGRAM = extproc)
     )
   )
 
 ################Server side tnsnames.ora, same for both nodes########################
 remote_lsnr_lm5330 =
   (ADDRESS_LIST =
     (ADDRESS = (PROTOCOL = TCP)(HOST = vmdb01pvip)(PORT = 1314))
     (ADDRESS = (PROTOCOL = TCP)(HOST = vmdb02pvip)(PORT = 1314))
   )
 
 local_lsnr_lm5330a =
   (ADDRESS_LIST =
     (ADDRESS = (PROTOCOL = TCP)(HOST = vmdb01pvip)(PORT = 1314))
   )
 
 local_lsnr_lm5330b =
   (ADDRESS_LIST =
     (ADDRESS = (PROTOCOL = TCP)(HOST = vmdb02pvip)(PORT = 1314))
   )
 
 ################Server side host table, same for both nodes######################## 
 127.0.0.1          localhost
 10.200.48.17    vmdb01p.oradb.com   vmdb01p
 10.200.48.18    vmdb02p.oradb.com   vmdb02p
 10.200.48.15    vmdb01pvip.oradb.com        vmdb01pvip
 10.200.48.16    vmdb02pvip.oradb.com        vmdb02pvip
 192.168.48.17   vmdb01pph.oradb.com vmdb01pph
 192.168.48.18   vmdb02pph.oradb.com vmdb02pph
 
 SQL> show parameter instance_name
 
 NAME                                 TYPE                              VALUE
 ------------------------------------ ------------------- ------------------
 instance_name                        string              LM5330A
 SQL> show parameter listener
 
 NAME                                 TYPE                VALUE
 ------------------------------------ ------------------- -------------------
 local_listener                       string              local_lsnr_lm5330a
 remote_listener                      string              remote_lsnr_lm5330
 
 SQL> show parameter instance_name
 
 NAME                                 TYPE                VALUE
 ------------------------------------ ------------------- ------------------
 instance_name                        string              LM5330B
 SQL> show parameter listener
 
 NAME                                 TYPE                VALUE
 ------------------------------------ ------------------- ------------------
 local_listener                       string              local_lsnr_lm5330b
 remote_listener                      string              remote_lsnr_lm5330

2、客户端tnsnames.ora配置 
 #For NAT client 
 LM5330TKO =
   (DESCRIPTION =
     (ADDRESS = (PROTOCOL = TCP)(HOST = 10.255.48.15)(PORT = 1314))
     (ADDRESS = (PROTOCOL = TCP)(HOST = 10.255.48.16)(PORT = 1314))
     (LOAD_BALANCE = yes)
     (CONNECT_DATA =
       (SERVER = DEDICATED)
       (SERVICE_NAME = LM5330)
        (FAILOVER_MODE =                    
         (TYPE = SELECT)                                                           
         (METHOD = BASIC)                            
         (RETRIES = 180)                                                           
         (DELAY = 5) 
          ) 
     )
   )
 
 #For non NAT client
 LM5330VIP =
   (DESCRIPTION =
     (ADDRESS = (PROTOCOL = TCP)(HOST = 10.200.48.15)(PORT = 1314))
     (ADDRESS = (PROTOCOL = TCP)(HOST = 10.200.48.16)(PORT = 1314))
     (LOAD_BALANCE = yes)
     (CONNECT_DATA =
       (SERVER = DEDICATED)
       (SERVICE_NAME = LM5330)
        (FAILOVER_MODE =                    
         (TYPE = SELECT)                                                           
         (METHOD = BASIC)                            
         (RETRIES = 180)                                                           
         (DELAY = 5) 
          ) 
     )
   )

3、客户端hosts文件配置
 c:\windows\system32\drivers\etc\hosts
 10.200.48.15    vmdb01pvip.oradb.com        vmdb01pvip
 10.200.48.16    vmdb02pvip.oradb.com        vmdb02pvip

三、ORA-12545错误信息

C:\Users\Administrator>sqlplus system/system@lm5330tko
 
 SQL*Plus: Release 10.2.0.3.0 - Production on 星期二 11月 27 16:01:23 2012
 
 Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.
 
 ERROR:
 ORA-12545: Connect failed because target host or object does not exist
 
 对于客户端发起的连接请求,有时候可以成功建立连接,有时候无法建立连接并伴随给出ORA-12545错误
 
 oracle@bo2dbp:~> oerr ora 12545
 12545, 00000, "Connect failed because target host or object does not exist"
 // *Cause: The address specified is not valid, or the program being 
 // connected to does not exist.
 // *Action: Ensure the ADDRESS parameters have been entered correctly; the
 // most likely incorrect parameter is the node name.  Ensure that the 
 // executable for the server exists (perhaps "oracle" is missing.)
 // If the protocol is TCP/IP, edit the TNSNAMES.ORA file to change the
 // host name to a numeric IP address and try again.
 
 从ORA-12545错误描述来看,主要是由于解析的问题产生该类错误

四、分析与解决

1、启用sqlnet trace
 为客户端的sqlnet.ora文件添加下列项来启用sqlnet trace,trace后,建议移除这些选项以避免额外的网络开销
 Trace_level_client=16
 Trace_directory_client=<path_to_the_trace_directory> # use the full path to the trace directory
 Trace_unique_client=on
 Trace_timestamp_client=on
 Diag_adr_enabled=off
 
 #Author : Robinson
 #Blog : http://blog.csdn.net/robinson_0612
 
 下面是trace文件提供的信息,关于如何做 oracle net trace 请参考:http://blog.csdn.net/robinson_0612/article/details/8254720
 文件名:cli_10648_1.trc
 
 [27-11-2012 17:59:01:694] nsc2addr: (ADDRESS=(PROTOCOL=TCP)(HOST=vmdb02pvip)(PORT=1314))
 [27-11-2012 17:59:01:694] nttbnd2addr: entry
 [27-11-2012 17:59:01:694] snlinGetAddrInfo: entry
 [27-11-2012 17:59:01:694] snlinGetAddrInfo: Invalid IP address string vmdb02pvip
 [27-11-2012 17:59:01:694] snlinFreeAddrInfo: entry
 [27-11-2012 17:59:01:694] snlinFreeAddrInfo: exit
 [27-11-2012 17:59:01:694] snlinGetAddrInfo: exit
 [27-11-2012 17:59:01:694] nttbnd2addr: looking up IP addr for host: vmdb02pvip
 [27-11-2012 17:59:01:694] snlinGetAddrInfo: entry
 [27-11-2012 17:59:04:259] snlinGetAddrInfo: Name resolution failed for vmdb02pvip
 [27-11-2012 17:59:04:259] snlinFreeAddrInfo: entry
 [27-11-2012 17:59:04:259] snlinFreeAddrInfo: exit
 [27-11-2012 17:59:04:259] snlinGetAddrInfo: exit
 [27-11-2012 17:59:04:259] nttbnd2addr: *** hostname lookup failure! ***
 [27-11-2012 17:59:04:259] nttbnd2addr: exit
 [27-11-2012 17:59:04:259] nserror: entry
 [27-11-2012 17:59:04:259] nserror: nsres: id=0, op=77, ns=12545, ns2=12560; nt[0]=515, nt[1]=1004, nt[2]=0; 
    ora[0]=0, ora[1]=0, ora[2]=0
 [27-11-2012 17:59:04:259] nsc2addr: error exit

2、故障分析
 #分析上面的trace信息,trace信息中指出了无法解析主机vmdb02pvip
 #客户端显示的12545错误的发生正式由于这个vmdb02pvip无法解析。
 #那有时候可以连接成功是怎么一回事呢? 
 #当客户端发起连接请求,由于客户端配置了load balance,因此客户端会随机从ADDRESS列表中挑选一个IP进行连接。
 #假定当前选择从NAT IP 10.255.48.15发出连接请求,10.255.48.15会被自动映射到相应的virtual ip,即10.200.48.15。
 #当10.200.48.15上的监听器捕获到该请求,服务器端的load balance生效,pmon进程根据监听信息发现当前节点并不繁忙,直接建立连接。
 #上面的描述就是可以成功建立连接的情形。
 
 #当服务器的监听发现其余的节点空闲时,而当前节点繁忙,监听器将实现转发。
 #也就是说服务器端返回一个新的IP地址给客户端,说我比较忙,你连接到10.200.48.16吧。
 #于是客户端根据所提供的IP地址重新发起连接请求,而此时客户端到10.200.48.16根本不可达,即使将ADDRESS列表改为该IP。
 #所以上面的分析是产生ORA-12545的原因。

3、解决
 #下面通过tracert命令来检查节点是否可达

 C:\Users\Administrator>tracert vmdb02pvip
 
 Tracing route to vmdb02pvip [10.200.48.16]
 over a maximum of 30 hops:
 
   1     1 ms     1 ms     1 ms  192.168.9.254
   2    <1 ms    <1 ms    <1 ms  10.97.5.1
   3     *        *        *     Request timed out.   #超时,节点不可达
   4     *        *        *     Request timed out.
   5     *        *        *     Request timed out.
   6     *        *        *     Request timed out.

 #考虑将客户端hosts文件作下列更改,即将映射关系的IP直接改为NAT的IP    
 c:\windows\system32\drivers\etc\hosts
 #10.200.48.15    vmdb01pvip.oradb.com        vmdb01pvip
 #10.200.48.16    vmdb02pvip.oradb.com        vmdb02pvip
 
 10.255.48.15    vmdb01pvip.oradb.com        vmdb01pvip
 10.255.48.16    vmdb02pvip.oradb.com        vmdb02pvip

 #再次tracert
 C:\Users\Administrator>tracert vmdb02pvip
 
 Tracing route to v2048db02pvip [10.255.48.16]
 over a maximum of 30 hops:
 
   1    <1 ms     1 ms     1 ms  192.168.9.254
   2    <1 ms    <1 ms    <1 ms  10.97.5.1
   3     8 ms     5 ms     4 ms  10.100.34.4
   4     6 ms     4 ms     4 ms  vmdb02pvip [10.255.48.16]
   5     8 ms     4 ms     4 ms  vmdb02pvip [10.255.48.16]
 
 Trace complete.
 
 #再次建立连接时,不再出现ORA-12545 
 C:\Users\Administrator>netstat
 
 Active Connections
 
   Proto  Local Address          Foreign Address        State
   TCP    192.168.9.12:59541     vmdb01pvip:1314     ESTABLISHED
   TCP    192.168.9.12:59558     vmdb02pvip:1314     ESTABLISHED

 其次也可以通过禁用服务器端的load balance来避免该错误,当然此方法不被推荐。
  alter system set remote_listener='' sid='*';

 也可以通过配置CMAN(Oracle Connection Manager)来解决该问题    

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

发表于

我来说两句

0 条评论
登录 后参与评论

相关文章

来自专栏xingoo, 一个梦想做发明家的程序员

Oracle修改监听IP地址

oracle 11g断网安装时,没有检测net的功能,所以安装完后,netstat -an 发现自动监听的是127.0.0.1:1521,这样安装完成后,其他的...

20980
来自专栏乐沙弥的世界

TNS-01201: Listener cannot find executablen 错误

    最近在启动监听器的时候收到了TNS-01201: Listener cannot find executable...的错误提示。这个错误还真是一个一直...

11720
来自专栏杨建荣的学习笔记

一条细小的报警短信的处理(r6笔记第96天)

最近偶尔会收到一封报警短信,提示内容大体如下, xxxx,trc_directory (TNS-1190),log_directory(TNS-1190),Pl...

38480
来自专栏xingoo, 一个梦想做发明家的程序员

Oracle 监听器无法启动(TNS-12537,TNS-12560,TNS-00507)

Oracle启动监听报错,提示 连接中断 [oracle@localhost ~]$ lsnrctl start LSNRCTL for Linux: Ver...

316100
来自专栏有困难要上,没有困难创造困难也要上!

Linux下启动Oracle服务

1K20
来自专栏乐沙弥的世界

使用 SQLNET.EXPIRE_TIME 清除僵死连接

    数据库连接的客户端异常断开后,其占有的相应并没有被释放,如从v$session视图中依旧可以看到对应的session处于inactive,且对应的服务器...

32620
来自专栏乐沙弥的世界

ORA-00119,ORA-00132 错误处理

最近系统启动时,收到了ORA-00119以及ORA-00132的错误,该错误实际上跟LISTENER有关,通常的处理办法是将spfile转储为pfile然后从...

13120
来自专栏数据库新发现

通过SQLNET.ora文件限制Ip地址访问

链接:http://www.eygle.com/archives/2008/01/sqlnetora_ip_limit.html

19930
来自专栏乐沙弥的世界

记一次离奇的TNS-12545 TNS-12560 TNS-00515

      最近reportDB监听无法随系统自启动,现象比较怪异。因为该服务器上的另一个实例的监听可以正常启动,这个不能自启动实例的监听手动启动又是正常的。因...

14420
来自专栏杨建荣的学习笔记

一条执行4秒的sql语句导致的系统问题(r3笔记第10天)

一般来说一条sql语句执行个4秒钟是可以接受的,没有什么问题,但是如果应该执行1秒,却执行了4秒,问题就挺大的了。 今天查看数据库负载,发现在中午12:00 ...

38180

扫码关注云+社区

领取腾讯云代金券