yangtingkun
===========================================================
ORA-7445(evaopn2)错误
===========================================================

alert文件中发现这个ORA-7445错误,而且错误可以重现。


Oracle 9204 for Solaris下执行下面的SQL报错:

SQL> SELECT COUNT(*)
2 FROM
3 (
4 SELECT A.ID,C.PLAT_NAME PROVINCENAME, A.PROJECT_NAME,
5 DECODE(A.REPORT_TYPE,'1','BUYER','2','ORG','3','OTHER') REPORT_TYPE,
6 A.LINKER_NAME,A.MOBILE,
7 SUM(B.MONEY) MONEY,A.MONEY_REMARK
8 FROM AUD_PROJECT A LEFT JOIN PLT_PLAT D
9 ON D.ID = A.CITY_ID
10 INNER JOIN AUD_PROJECT_ITEM B
11 ON A.ID = B.PROJECT_ID
12 INNER JOIN PLT_PLAT C
13 ON A.PLAT_ID = C.ID
14 GROUP BY A.ID,
15 C.PLAT_NAME,
16 A.PROJECT_NAME,
17 A.REPORT_TYPE,
18 A.LINKER_NAME,
19 A.MOBILE,
20 A.MONEY_REMARK
21 );
SELECT COUNT(*)
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

对应的alert文件中错误如下:

Errors in file /u1/oracle/admin/repdb01/udump/repdb01_ora_20739.trc:
ORA-07445: exception encountered: core dump [00000001009AE204] [SIGSEGV] [Address not mapped to object] [0x000000000] [] []

trace文件中的详细信息如下:

bash-2.03$ more /u1/oracle/admin/repdb01/udump/repdb01_ora_20739.trc
/u1/oracle/admin/repdb01/udump/repdb01_ora_20739.trc
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
ORACLE_HOME = /data/oracle/product/920
System name: SunOS
Node name: newreport
Release: 5.8
Version: Generic_117350-26
Machine: sun4u
Instance name: repdb01
Redo thread mounted by this instance: 1
Oracle process number: 18
Unix process pid: 20739, image: oracle@newreport (TNS V1-V3)

*** 2008-06-05 15:22:02.785
*** SESSION ID:(26.294) 2008-06-05 15:22:02.750
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x0, PC: [0x1009ae204, 00000001009AE204]
*** 2008-06-05 15:22:02.788
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [00000001009AE204] [SIGSEGV] [Address not mapped to object] [0x000000000] [] []
Current SQL statement for this session:
SELECT COUNT(*)
FROM
(
SELECT A.ID,C.PLAT_NAME PROVINCENAME, A.PROJECT_NAME,
DECODE(A.REPORT_TYPE,'1','BUYER','2','ORG','3','OTHER') REPORT_TYPE,
A.LINKER_NAME,A.MOBILE,
SUM(B.MONEY) MONEY,A.MONEY_REMARK
FROM AUD_PROJECT A LEFT JOIN PLT_PLAT D
ON D.ID = A.CITY_ID
INNER JOIN AUD_PROJECT_ITEM B
ON A.ID = B.PROJECT_ID
INNER JOIN PLT_PLAT C
ON A.PLAT_ID = C.ID
GROUP BY A.ID,
C.PLAT_NAME,
A.PROJECT_NAME,
A.REPORT_TYPE,
A.LINKER_NAME,
A.MOBILE,
A.MONEY_REMARK
)
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+328 CALL ksedst()+0 00000000B ? 000000000 ?
000000000 ? 00000004A ?
FFFFFFFF7FFF9778 ?
1031D56C8 ?
ssexhd()+604 CALL ksedmp()+0 000000000 ? 000103400 ?
0001035D9 ? 000102C00 ?
1035D9000 ? 1035D9C28 ?
sigacthandler()+44 PTR_CALL 0000000000000000 1035E1000 ?
FFFFFFFF7FFFA710 ?
000000000 ? 000000001 ?
1035DEDD8 ? 00000000B ?
evaopn2()+196 PTR_CALL 0000000000000000 00000000B ?
FFFFFFFF7FFFA710 ?
FFFFFFFF7FFFA430 ?
00000000B ?
14D61000D0DE76BD ?
5E9E0F7A459B8AAE ?
qergsRowP()+1136 CALL evaopn2()+0 6AD0F99F0 ? 00000000C ?
1035E1000 ? 000000001 ?
080000002 ?
FFFFFFFF7CF69070 ?

对于这个错误函数evaopn2,查询了METALINK文档,发现文档Doc ID: Note:310099.1汇总了关于这个7445的已知的40多个bug

经过一个个的排除,发现当前的问题和这些bug都不一样。根据METALINKbug描述,这个错误函数相关的绝大部分bug和函数索引、星型连接、树型查询、分区表等问题有关。

看来从METALINK无法获取什么帮助了,只能自己来诊断,好在这个问题是可重现的,只要通过修改SQL,检查错误发生与否,就可以判断问题大概是由于什么引起的,经过简单的测试,发现问题和SQL标准连接写法、GROUP BYDECODE函数有关,去掉三个条件中的任意一个,问题不再重现:

SQL> SELECT COUNT(*)
2 FROM
3 (
4 SELECT A.ID,C.PLAT_NAME PROVINCENAME, A.PROJECT_NAME,
5 DECODE(A.REPORT_TYPE,'1','BUYER','2','ORG','3','OTHER') REPORT_TYPE,
6 A.LINKER_NAME,A.MOBILE,
7 SUM(B.MONEY) MONEY,A.MONEY_REMARK
8 FROM AUD_PROJECT A, PLT_PLAT D, AUD_PROJECT_ITEM B, PLT_PLAT C
9 WHERE D.ID(+) = A.CITY_ID
10 AND A.ID = B.PROJECT_ID
11 AND A.PLAT_ID = C.ID
12 GROUP BY A.ID,
13 C.PLAT_NAME,
14 A.PROJECT_NAME,
15 A.DATA_TYPE,
16 A.REPORT_TYPE,
17 A.LINKER_NAME,
18 A.MOBILE,
19 A.MONEY_REMARK
20 );

COUNT(*)
----------
4

SQL> SELECT COUNT(*)
2 FROM
3 (
4 SELECT A.ID,C.PLAT_NAME PROVINCENAME, A.PROJECT_NAME,
5 A.LINKER_NAME,A.MOBILE,
6 SUM(B.MONEY) MONEY, A.MONEY_REMARK
7 FROM AUD_PROJECT A LEFT JOIN PLT_PLAT D
8 ON D.ID = A.CITY_ID
9 INNER JOIN AUD_PROJECT_ITEM B
10 ON A.ID = B.PROJECT_ID
11 INNER JOIN PLT_PLAT C
12 ON A.PLAT_ID = C.ID
13 GROUP BY A.ID,
14 C.PLAT_NAME,
15 A.PROJECT_NAME,
16 A.REPORT_TYPE,
17 A.LINKER_NAME,
18 A.MOBILE,
19 A.MONEY_REMARK
20 );

COUNT(*)
----------
4

SQL> SELECT COUNT(*)
2 FROM
3 (
4 SELECT A.ID,C.PLAT_NAME PROVINCENAME, A.PROJECT_NAME,
5 DECODE(A.REPORT_TYPE,'1','BUYER','2','ORG','3','OTHER') REPORT_TYPE,
6 A.LINKER_NAME,A.MOBILE,
7 A.MONEY_REMARK
8 FROM AUD_PROJECT A LEFT JOIN PLT_PLAT D
9 ON D.ID = A.CITY_ID
10 INNER JOIN AUD_PROJECT_ITEM B
11 ON A.ID = B.PROJECT_ID
12 INNER JOIN PLT_PLAT C
13 ON A.PLAT_ID = C.ID
14 );

COUNT(*)
----------
19

上面三个SQL分别将SQL标准关联写法改为Oracle的默认写法、去掉了DECODE语句、去掉了SUMGROUP BY语句,无法哪种修改方法,Oracle都可以正常执行。

至于避免这个错误的最简单方法,莫过于使用Oracle的默认连接方式,而不要使用标准连接写法。虽然Oracle支持标准SQL连接写法,但是这种方式较少使用,因此出错的可能性会比较大,而且配合其他Oracle特有的功能,比如DECODE函数,就更容易出现异常了。

下面进一步精简SQL,看看问题到底是什么引起的:

SQL> SELECT COUNT(*)
2 FROM
3 (
4 SELECT A.ID,A.PROJECT_NAME,
5 DECODE(A.REPORT_TYPE,'1','BUYER','2','ORG','3','OTHER') REPORT_TYPE,
6 SUM(B.MONEY) MONEY
7 FROM AUD_PROJECT A LEFT JOIN PLT_PLAT D
8 ON D.ID = A.CITY_ID
9 INNER JOIN AUD_PROJECT_ITEM B
10 ON A.ID = B.PROJECT_ID
11 GROUP BY A.ID,
12 A.PROJECT_NAME,
13 A.REPORT_TYPE,
14 A.MOBILE
15 );
SELECT COUNT(*)
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

4张关联表减少到3个,问题依旧:

SQL> SELECT COUNT(*)
2 FROM
3 (
4 SELECT A.ID,A.PROJECT_NAME,
5 DECODE(A.REPORT_TYPE,'1','BUYER','2','ORG','3','OTHER') REPORT_TYPE,
6 SUM(B.MONEY) MONEY
7 FROM AUD_PROJECT A INNER JOIN AUD_PROJECT_ITEM B
8 ON A.ID = B.PROJECT_ID
9 GROUP BY A.ID,
10 A.PROJECT_NAME,
11 A.REPORT_TYPE,
12 A.MOBILE
13 );

COUNT(*)
----------
4

SQL> SELECT COUNT(*)
2 FROM
3 (
4 SELECT A.ID,A.PROJECT_NAME,
5 DECODE(A.REPORT_TYPE,'1','BUYER','2','ORG','3','OTHER') REPORT_TYPE
6 FROM AUD_PROJECT A LEFT JOIN PLT_PLAT D
7 ON D.ID = A.CITY_ID
8 GROUP BY A.ID,
9 A.PROJECT_NAME,
10 A.REPORT_TYPE,
11 A.MOBILE
12 );

COUNT(*)
----------
5

去掉外连接或者将3张表改为两张表,问题消失。不过现在很难说是由于缺少了一张关联表导致问题消失,还是缺少了SUM聚集操作而使得问题消失:

SQL> SELECT COUNT(*)
2 FROM
3 (
4 SELECT A.ID,A.PROJECT_NAME,
5 DECODE(A.REPORT_TYPE,'1','BUYER','2','ORG','3','OTHER') REPORT_TYPE,
6 SUM(a.id) MONEY
7 FROM AUD_PROJECT A LEFT JOIN PLT_PLAT D
8 ON D.ID = A.CITY_ID
9 GROUP BY A.ID,
10 A.PROJECT_NAME,
11 A.REPORT_TYPE,
12 A.MOBILE
13 );
SELECT COUNT(*)
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel


SQL> CONN SHENJI2
Enter password:
Connected.
SQL> SELECT COUNT(*)
2 FROM
3 (
4 SELECT A.ID,A.PROJECT_NAME,
5 DECODE(A.REPORT_TYPE,'1','BUYER','2','ORG','3','OTHER') REPORT_TYPE,
6 COUNT(A.ID) MONEY
7 FROM AUD_PROJECT A LEFT JOIN PLT_PLAT D
8 ON D.ID = A.CITY_ID
9 GROUP BY A.ID,
10 A.PROJECT_NAME,
11 A.REPORT_TYPE,
12 A.MOBILE
13 );
SELECT COUNT(*)
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

现在问题基本明确了,两张表外关联的标准写法加上DECODE语句和GROUP BY聚集操作公共构造了这个bug的产生。

根据上面的分析,应该在本地环境也可以成功的构造出这个问题:

SQL> CREATE TABLE T_7445_F (ID NUMBER, TYPE NUMBER);

表已创建。

SQL> CREATE TABLE T_7445_C (ID NUMBER);

表已创建。

SQL> INSERT INTO T_7445_F VALUES (1, 1);

已创建 1 行。

SQL> INSERT INTO T_7445_F VALUES (2, 2);

已创建 1 行。

SQL> INSERT INTO T_7445_C VALUES (1);

已创建 1 行。

SQL> INSERT INTO T_7445_C VALUES (2);

已创建 1 行。

SQL> COMMIT;

提交完成。

SQL> SELECT A.ID, DECODE(TYPE, 1, 'A', 2, 'B', 'C') TYPE, COUNT(*)
2 FROM T_7445_F A LEFT JOIN T_7445_C B
3 ON A.ID = B.ID
4 GROUP BY A.ID, TYPE
5 ;

ID T COUNT(*)
---------- - ----------
1 A 1
2 B 1

奇怪,基本的条件已经都具备了,为什么错误还是没有出现呢,莫非与平台有关,再次观察出错的SQL语句,发现外部包含了一层COUNT(*),莫非这也是构造错误的关键因素之一:

SQL> SELECT COUNT(*)
2 FROM
3 (
4 SELECT A.ID, DECODE(TYPE, 1, 'A', 2, 'B', 'C') TYPE, COUNT(*)
5 FROM T_7445_F A LEFT JOIN T_7445_C B
6 ON A.ID = B.ID
7 GROUP BY A.ID, TYPE
8 )
9 ;
SELECT COUNT(*)
*
1 行出现错误:
ORA-03113:
通信通道的文件结束

果然,外面的COUNT(*)也是构成错误的基本条件之一。

yangtingkun 发表于:2008.06.05 23:50 ::分类: ( ORACLE , Bug ) ::阅读:(183次) :: 评论 (0)

发表评论
标题

在此添加评论
表情符号: smile laughing tongue angry crying sad wassat wink

称呼

邮箱地址(可选)

个人主页(可选)

 authimage


切换风格
新闻聚合
博客日历
文章归档...
最新发表...
最新评论...
最多阅读文章...
最多评论文章...
博客统计...
Blog信息
网站链接...