Notes on a database login hang caused by an Oracle bug

Today's note covers a case where logging in to the database hung because of an Oracle bug: ORA-07445: exception encountered: core dump [kglic0()+788] [SIGSEGV] [ADDR:0xE49200000018] [PC:0x10123F4D4] [Address not mapped to object] []

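At eight o'clock today, another business department at the customer reported that the 106 database was unavailable: they could not log in, and the login attempt returned an error.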

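By this point I had already confirmed that the listener configuration was fine, and lsnrctl status showed the listener in a normal state the whole time.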

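I connected to the database with PL/SQL Developer and the connection worked. Checking the SCAN IP showed that it was currently on node 2, so logins through it were fine, and logging in to node 2 from PL/SQL Developer also worked without any problem. However, running conn aqjc/12345 hung completely. The alert log on node 2 contained nothing useful.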

Logging in to node 1:

sqlplus / as sysdba

At this point the session hung completely.

Checking the alert log on node 1:

Wed Feb 07 09:12:29 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:13:29 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:14:59 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:15:59 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:17:40 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:18:40 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:20:10 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:21:10 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:22:41 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:23:41 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:25:21 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:26:21 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:27:52 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:28:51 2018

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_ckpt_558022.trc (incident=1218553):

ORA-00445: background process "PZ98" did not start after 120 seconds

Wed Feb 07 09:28:52 2018

PMON failed to acquire latch, see PMON dump

These errors started at about 21:02:00 last night:

ORA-00445: background process "m001" did not start after 120 seconds

Incident details in: /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/incident/incdir_1218589/nsbdzxdb1_mmon_488078_i1218589.trc

Tue Feb 06 21:02:59 2018

Dumping diagnostic data in directory=[cdmp_20180206210259], requested by (instance=1, osid=488078 (MMON)), summary=[incident=1218589].

Tue Feb 06 21:03:39 2018

PMON failed to acquire latch, see PMON dump

Tue Feb 06 21:04:56 2018

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_cjq0_385442.trc (incident=1218781):

ORA-00445: background process "J000" did not start after 120 seconds

Incident details in: /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/incident/incdir_1218781/nsbdzxdb1_cjq0_385442_i1218781.trc

Tue Feb 06 21:04:58 2018

Dumping diagnostic data in directory=[cdmp_20180206210458], requested by (instance=1, osid=385442 (CJQ0)), summary=[incident=1218781].

kkjcre1p: unable to spawn jobq slave process

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_cjq0_385442.trc:

Tue Feb 06 21:05:09 2018

PMON failed to acquire latch, see PMON dump

Tue Feb 06 21:06:09 2018

PMON failed to acquire latch, see PMON dump

Tue Feb 06 21:06:59 2018

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_cjq0_385442.trc (incident=1218782):

ORA-00445: background process "J000" did not start after 120 seconds

Incident details in: /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/incident/incdir_1218782/nsbdzxdb1_cjq0_385442_i1218782.trc

kkjcre1p: unable to spawn jobq slave process

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_cjq0_385442.trc:

Tue Feb 06 21:07:02 2018

Dumping diagnostic data in directory=[cdmp_20180206210702], requested by (instance=1, osid=385442 (CJQ0)), summary=[incident=1218782].

Tue Feb 06 21:07:40 2018

PMON failed to acquire latch, see PMON dump

Tue Feb 06 21:08:40 2018

PMON failed to acquire latch, see PMON dump

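From this it looks as though the background processes are in trouble: new background processes fail to start, and PMON is unable to bring them back up.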
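To see when each symptom first appeared and how often it repeats, the alert log can be summarized with a small script. The following is only a minimal sketch: it assumes the plain-text alert log format shown in the excerpts above (a timestamp line such as "Wed Feb 07 09:12:29 2018" followed by message lines), and the names `PATTERNS`, `parse_ts` and `summarize` are made up for this illustration.

```python
import re
from datetime import datetime

# Timestamp lines in the excerpts look like "Wed Feb 07 09:12:29 2018".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"

# Message patterns copied from the alert-log excerpts in this post.
PATTERNS = {
    "pmon_latch": re.compile(r"PMON failed to acquire latch"),
    "ora_00445": re.compile(r"ORA-00445: background process"),
}


def parse_ts(line):
    """Return a datetime if the line is a timestamp line, otherwise None."""
    try:
        return datetime.strptime(line.strip(), TS_FORMAT)
    except ValueError:
        return None


def summarize(lines):
    """Count each pattern and record the earliest timestamp it appears under."""
    summary = {name: {"count": 0, "first": None} for name in PATTERNS}
    current = None  # most recent timestamp line seen so far
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            current = ts
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                entry = summary[name]
                entry["count"] += 1
                if current is not None and (
                    entry["first"] is None or current < entry["first"]
                ):
                    entry["first"] = current
    return summary
```

Feeding it the lines of the alert log (for example via `open(path).readlines()`, where `path` is wherever the alert log lives on your system) gives a count and a first-seen time per message type, which makes it easier to line the symptoms up against earlier errors in the log.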

Scrolling further up the log, I found a rather ugly error:

ORA-07445: exception encountered: core dump [kglic0()+788] [SIGSEGV] [ADDR:0xE49200000018] [PC:0x10123F4D4] [Address not mapped to object] []

Incident details in: /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/incident/incdir_1218805/nsbdzxdb1_m000_512814_i1218805.trc

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Tue Feb 06 19:01:00 2018

Dumping diagnostic data in directory=[cdmp_20180206190100], requested by (instance=1, osid=512814 (M000)), summary=[incident=1218805].

Tue Feb 06 19:01:01 2018

Sweep [inc][1218805]: completed

Sweep [inc2][1218805]: completed

Tue Feb 06 20:00:43 2018

Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x300000018] [PC:0x10123F4BC, kglic0()+764] [flags: 0x0, count: 1]

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_m000_520268.trc (incident=1218877):

ORA-07445: exception encountered: core dump [kglic0()+764] [SIGSEGV] [ADDR:0x300000018] [PC:0x10123F4BC] [Address not mapped to object] []

Incident details in: /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/incident/incdir_1218877/nsbdzxdb1_m000_520268_i1218877.trc

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.
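When searching My Oracle Support, the most useful key is usually the function name and signal from the ORA-07445 line. As a rough sketch, the fields can be pulled out of the alert log with a regular expression. The pattern below is written against the two ORA-07445 lines quoted above and is not guaranteed to cover every variant of the message; `ORA_07445` and `find_ora_07445` are names invented for this example.

```python
import re

# Matches lines such as:
# ORA-07445: exception encountered: core dump [kglic0()+788] [SIGSEGV] [ADDR:0xE49200000018] ...
ORA_07445 = re.compile(
    r"ORA-07445: exception encountered: core dump "
    r"\[(?P<func>\w+)\(\)\+(?P<offset>\d+)\] "
    r"\[(?P<signal>\w+)\] "
    r"\[ADDR:(?P<addr>0x[0-9A-Fa-f]+)\]"
)


def find_ora_07445(lines):
    """Return (function, offset, signal, address) for each ORA-07445 line."""
    hits = []
    for line in lines:
        match = ORA_07445.search(line)
        if match:
            hits.append(
                (
                    match.group("func"),
                    int(match.group("offset")),
                    match.group("signal"),
                    match.group("addr"),
                )
            )
    return hits
```

Run against the excerpts above, both incidents report the same function, kglic0, with SIGSEGV, so "ORA-07445 kglic0" is a reasonable search string.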

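Following this error, I looked up Note 411.1 on My Oracle Support (MOS).

The note describes one way to resolve it, but the development team could no longer log in to the database at all. With time short and the pressure on, I went straight for an approach that would work.

Searching MOS for the ORA-07445 error turned up a match.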

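The MOS entry attributes the problem to a bug. Because sqlplus could not log in, the normal sequence is to stop the listener on the affected node first; if sqlplus still hangs after that, the only option left is a kill, relying on SMON's automatic recovery at the next startup. I killed the database's SMON process and then started the database. The problem was resolved and the errors did not come back.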
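Before killing anything, it helps to be certain which PID belongs to SMON on the affected instance. Below is a small sketch that picks it out of `ps -ef` output. It assumes the usual Unix naming of Oracle background processes, `ora_smon_<SID>`, and the instance name nsbdzxdb1 taken from the trace paths above. The function name `find_smon_pid` is hypothetical, and the sketch only locates the process; the kill itself stays a deliberate manual step.

```python
def find_smon_pid(ps_output, sid):
    """Return the PID of ora_smon_<sid> found in `ps -ef` text, or None.

    Assumes the second column is the PID and the last column is the
    command name, which holds for background processes without arguments.
    """
    target = "ora_smon_" + sid
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) >= 8 and fields[-1] == target:
            return int(fields[1])
    return None
```

One way to use it is to capture the output of `ps -ef` (for instance with `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout`) and call `find_smon_pid(output, "nsbdzxdb1")`. A result of None means no such process was found and nothing should be killed.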

THAT'S ALL

BY CUI PEACE !!!!!!

  • Original link: https://kuaibao.qq.com/s/20180611G11L9R00?refer=cp_1026