This discussion is archived
0 Replies Latest reply: Dec 12, 2012 1:26 AM by CoolBond RSS

DBLINK,ORA-02050: 事务处理 10.30.100680 已回退, 某些远程数据库可能有问题;ORA-03135: 连接失去联系

CoolBond Newbie
Currently Being Moderated
------------------------------------------------------------------------------------

问题背景:有两套数据库服务器oracle11g RAC,版本是11.2.0.3,分别有两个用户user1,user2,user1通过跨库DBLINK(EASDB)去连接user2进行数据库查询。

偶尔会报错ORA-02050,定位了好几天,排除网线没插好、主机防火墙等因素了,就是不知道其他问题在哪儿.

奇怪的是反向的DBLINK,就没有类似问题

-----------------------------------------------------------------------------------
报错内容:

Mon Dec 10 09:58:21 2012
Error 3135 trapped in 2PC on transaction 10.30.100680. Cleaning up.
Error stack returned to user:
ORA-02050: 事务处理 10.30.100680 已回退, 某些远程数据库可能有问题
ORA-03135: 连接失去联系
ORA-02063: 紧接着 line (起自 EASDB)
Mon Dec 10 09:58:21 2012
DISTRIB TRAN BPPFRDB.59c536a9.10.30.100680
is local tran 10.30.100680 (hex=0a.1e.18948)
insert pending collecting tran, scn=13074744378110 (hex=be4.337e82fe)
Mon Dec 10 10:44:58 2012
DISTRIB TRAN BPPFRDB.59c536a9.31.13.2318
is local tran 31.13.2318 (hex=1f.0d.90e))
delete pending collecting tran, scn=13074744375475 (hex=be4.337e78b3)
Mon Dec 10 11:00:39 2012
DISTRIB TRAN BPPFRDB.59c536a9.10.30.100680
is local tran 10.30.100680 (hex=0a.1e.18948))
delete pending collecting tran, scn=13074744378110 (hex=be4.337e82fe)
----------------------------------------------------------------------

DBLINK创建脚本:

drop public database link EASDB;
create public database link EASDB connect to USER2 identified by "password"
using '(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP)(HOST = 132.97.122.1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 132.97.122.2)(PORT = 1521))(LOAD_BALANCE = yes)
(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = bppfedb)(FAILOVER_MODE =
(TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))))';

IP没有使用RAC的vip,而是物理IP

---------------------------------------------------------------------

日志收集

1. reco后台进程的trace
bppfrdb1_reco_18133.trc如下:
*** 2012-12-07 18:25:11.899
DISTRIB TRAN BPPFRDB.59c536a9.49.11.975
is local tran 49.11.975 (hex=31.0b.3cf))
delete pending collecting tran, scn=13074738503199 (hex=be4.3324de1f)
DISTRIB TRAN BPPFRDB.59c536a9.31.13.2318

*** 2012-12-10 10:44:58.406
is local tran 31.13.2318 (hex=1f.0d.90e))
delete pending collecting tran, scn=13074744375475 (hex=be4.337e78b3)
DISTRIB TRAN BPPFRDB.59c536a9.10.30.100680

*** 2012-12-10 11:00:39.084
is local tran 10.30.100680 (hex=0a.1e.18948))
delete pending collecting tran, scn=13074744378110 (hex=be4.337e82fe)
2. alert 日志
alert_bppfrdb1.log如下:
Mon Dec 10 07:00:40 2012
Archived Log entry 3865 added for thread 1 sequence 1842 ID 0x9a632295 dest 1:
Mon Dec 10 07:59:53 2012
Time drift detected. Please check VKTM trace file for more details.
Mon Dec 10 09:58:21 2012
Error 3135 trapped in 2PC on transaction 10.30.100680. Cleaning up.
Error stack returned to user:
ORA-02050: 事务处理 10.30.100680 已回退, 某些远程数据库可能有问题
ORA-03135: 连接失去联系
ORA-02063: 紧接着 line (起自 EASDB)
Mon Dec 10 09:58:21 2012
DISTRIB TRAN BPPFRDB.59c536a9.10.30.100680
is local tran 10.30.100680 (hex=0a.1e.18948)
insert pending collecting tran, scn=13074744378110 (hex=be4.337e82fe)
Mon Dec 10 10:44:58 2012
DISTRIB TRAN BPPFRDB.59c536a9.31.13.2318
is local tran 31.13.2318 (hex=1f.0d.90e))
delete pending collecting tran, scn=13074744375475 (hex=be4.337e78b3)
Mon Dec 10 11:00:39 2012
DISTRIB TRAN BPPFRDB.59c536a9.10.30.100680
is local tran 10.30.100680 (hex=0a.1e.18948))
delete pending collecting tran, scn=13074744378110 (hex=be4.337e82fe)


***********************************************************************

Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.2.0.3.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
Time: 10-DEC-2012 11:41:59
Tracing not turned on.
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12560
nt main err code: 505

TNS-00505: Operation timed out
nt secondary err code: 110
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=132.97.122.113)(PORT=47823))
Mon Dec 10 11:41:59 2012
***********************************************************************

Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.2.0.3.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
Time: 10-DEC-2012 11:41:59
Tracing not turned on.
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12560
nt main err code: 505

TNS-00505: Operation timed out
nt secondary err code: 110
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=132.97.122.113)(PORT=47824))
Mon Dec 10 11:41:59 2012


***********************************************************************
3. ping测试网络日志

ping的输出信息如下:
[oracle@bppfrdb1 jc]$ ping -s 1024 -c 30 132.97.122.1
PING 132.97.122.1 (132.97.122.1) 1024(1052) bytes of data.
1032 bytes from 132.97.122.1: icmp_seq=1 ttl=64 time=0.473 ms
1032 bytes from 132.97.122.1: icmp_seq=2 ttl=64 time=0.543 ms
1032 bytes from 132.97.122.1: icmp_seq=3 ttl=64 time=0.622 ms
1032 bytes from 132.97.122.1: icmp_seq=4 ttl=64 time=0.538 ms
1032 bytes from 132.97.122.1: icmp_seq=5 ttl=64 time=0.570 ms
1032 bytes from 132.97.122.1: icmp_seq=6 ttl=64 time=0.544 ms
1032 bytes from 132.97.122.1: icmp_seq=7 ttl=64 time=0.511 ms
1032 bytes from 132.97.122.1: icmp_seq=8 ttl=64 time=0.550 ms
1032 bytes from 132.97.122.1: icmp_seq=9 ttl=64 time=0.469 ms
1032 bytes from 132.97.122.1: icmp_seq=10 ttl=64 time=0.574 ms
1032 bytes from 132.97.122.1: icmp_seq=11 ttl=64 time=0.568 ms
1032 bytes from 132.97.122.1: icmp_seq=12 ttl=64 time=0.632 ms
1032 bytes from 132.97.122.1: icmp_seq=13 ttl=64 time=0.522 ms
1032 bytes from 132.97.122.1: icmp_seq=14 ttl=64 time=0.420 ms
1032 bytes from 132.97.122.1: icmp_seq=15 ttl=64 time=0.474 ms
1032 bytes from 132.97.122.1: icmp_seq=16 ttl=64 time=0.506 ms
1032 bytes from 132.97.122.1: icmp_seq=17 ttl=64 time=0.573 ms
1032 bytes from 132.97.122.1: icmp_seq=18 ttl=64 time=0.582 ms
1032 bytes from 132.97.122.1: icmp_seq=19 ttl=64 time=0.496 ms
1032 bytes from 132.97.122.1: icmp_seq=20 ttl=64 time=0.520 ms
1032 bytes from 132.97.122.1: icmp_seq=21 ttl=64 time=0.517 ms
1032 bytes from 132.97.122.1: icmp_seq=22 ttl=64 time=0.609 ms
1032 bytes from 132.97.122.1: icmp_seq=23 ttl=64 time=0.505 ms
1032 bytes from 132.97.122.1: icmp_seq=24 ttl=64 time=0.527 ms
1032 bytes from 132.97.122.1: icmp_seq=25 ttl=64 time=0.532 ms
1032 bytes from 132.97.122.1: icmp_seq=26 ttl=64 time=0.613 ms
1032 bytes from 132.97.122.1: icmp_seq=27 ttl=64 time=0.537 ms
1032 bytes from 132.97.122.1: icmp_seq=28 ttl=64 time=0.500 ms
1032 bytes from 132.97.122.1: icmp_seq=29 ttl=64 time=0.565 ms
1032 bytes from 132.97.122.1: icmp_seq=30 ttl=64 time=0.519 ms

--- 132.97.122.1 ping statistics ---
30 packets transmitted, 30 received, 0% packet loss, time 29005ms
rtt min/avg/max/mdev = 0.420/0.537/0.632/0.048 ms

4. traceroute信息如下:
[oracle@bppfrdb1 jc]$ traceroute 132.97.122.1 -p 1521

traceroute to 132.97.122.1 (132.97.122.1), 30 hops max, 60 byte packets
1 132.97.122.1 (132.97.122.1) 0.803 ms 0.785 ms 0.772 ms



traceroute to 132.97.122.1 (132.97.122.1), 30 hops max, 60 byte packets
1 132.97.122.1 (132.97.122.1) 0.560 ms 0.543 ms 0.533 ms


traceroute to 132.97.122.1 (132.97.122.1), 30 hops max, 60 byte packets
1 132.97.122.1 (132.97.122.1) 0.462 ms 0.456 ms *


traceroute to 132.97.122.1 (132.97.122.1), 30 hops max, 60 byte packets
1 132.97.122.1 (132.97.122.1) 0.740 ms 0.741 ms 0.733 ms


traceroute to 132.97.122.1 (132.97.122.1), 30 hops max, 60 byte packets
1 132.97.122.1 (132.97.122.1) 0.350 ms * *


traceroute to 132.97.122.1 (132.97.122.1), 30 hops max, 60 byte packets
1 132.97.122.1 (132.97.122.1) 0.602 ms 0.587 ms 0.580 ms



traceroute to 132.97.122.1 (132.97.122.1), 30 hops max, 60 byte packets
1 132.97.122.1 (132.97.122.1) 0.757 ms 0.745 ms *


其他说明:
这个报错不是每次查询都出现,时有时无,一些查询会失败;

有达人说,是分布式事务,二阶段提交失败,但是没搞懂,为啥会提交失败?提交失败也是一个问题吧


谢谢各位了

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points