发表于: 2008.04.16 23:53
分类: ORACLE
出处: http://yangtingkun.itpub.net/post/468/460018
---------------------------------------------------------------
前一段用SQL实现了一个复杂的功能,但是留下了一个问题一直没有真正的解决,那就是如何同时为表中的多条记录分别实现构造结果集。
利用MODEL语句来实现结果集构造。
问题源自:一个复杂问题的求解过程(一):http://yangtingkun.itpub.net/post/468/456641
如何为表中多条记录分别实现构造结果集(一):http://yangtingkun.itpub.net/post/468/459848
如何为表中多条记录分别实现构造结果集(二):http://yangtingkun.itpub.net/post/468/459938
上面一篇文章给出了利用自定义类型的SQL写法,那么完全使用SQL功能真的不能实现吗,这篇文章给出一个完全使用SQL实现的方法。
由于MODEL语句是10g新增语句,可能很多人对这个语句不熟,这里就不直接给出最终结果,而是分步解释一下。不过MODEL语法很复杂,功能很强大,这个SQL实现也基本上没有用到MODEL的主要功能,而是借用了MODEL语句中的一个小功能,因此这里不会对MODEL语句做多少解释,而主要解释这里用到的功能。
SQL> SELECT ID, VALUE, POWER, RESULT
2 FROM T
3 MODEL
4 PARTITION BY (ID, VALUE, POWER)
5 DIMENSION BY (0 RESULT)
6 MEASURES (0 V)
7 RULES
8 (
9 V[
10 FOR RESULT IN
11 (
12 SELECT ROWNUM FROM DUAL
13 CONNECT BY LEVEL <= (SELECT MAX(POWER) FROM T)
14 )
15 ] = 1
16 )
17 ORDER BY 1, 4;
ID VALUE POWER RESULT
---------- ---------- ---------- ----------
1 2 3 0
1 2 3 1
1 2 3 2
1 2 3 3
1 2 3 4
2 1 2 0
2 1 2 1
2 1 2 2
2 1 2 3
2 1 2 4
3 4 4 0
3 4 4 1
3 4 4 2
3 4 4 3
3 4 4 4
4 5 2 0
4 5 2 1
4 5 2 2
4 5 2 3
4 5 2 4
5 4 3 0
5 4 3 1
5 4 3 2
5 4 3 3
5 4 3 4
已选择25行。
说是分步介绍,其实也只分了两步,上面这个SQL基本上实现了整个SQL的核心部分。
首先介绍一下语法,PARTITION BY语句列出数据的分区列,这个分区的概念和分析函数中的PARTITION BY语句类似。
DIMENSION BY语句指定维度列,通过这个列来控制和设置需要修改的列。
MEASURES BY语句指定定制新增的列。
这个例子中主要是利用DIMENSION BY语句的构造记录功能,对ID, VALUE和POWER都不需要进行任何的处理,因此将这3列放到PARTITION BY语句中。
由于需要DIMENSION BY来构造结果集,因此构造一个RESULT列。
MEASURES BY语句指定的列在本例子中没有用处,只是为了实现的语法需要,因此随意构造一个V列。
这里用到了FOR LOOP IN SUBQUERY的方式构造维度列的值,由于没有办法为每个ID分别生成构造结果集,因此这里只能表中最大的POWER来构造结果集,这对于大部分记录都会存在重复数据的问题,不过好在这里的重复记录和第一篇文章中的不同,这里重复记录是成线形增长的,因此不会对性能带来太大的负担。
有了上面这个结果集,要得到期望的最终结果就很容易了,只需要在这个的基础上过滤掉重复记录就可以了:
SQL> WITH T1
2 AS
3 (
4 SELECT ID, VALUE, POWER, RESULT
5 FROM T
6 MODEL
7 PARTITION BY (ID, VALUE, POWER)
8 DIMENSION BY (0 RESULT)
9 MEASURES (0 V)
10 RULES
11 (
12 V[
13 FOR RESULT IN
14 (
15 SELECT ROWNUM FROM DUAL
16 CONNECT BY LEVEL <= (SELECT MAX(POWER) FROM T)
17 )
18 ] = 1
19 )
20 )
21 SELECT ID, VALUE, POWER, RESULT * VALUE RESULT
22 FROM T1
23 WHERE RESULT < POWER
24 ORDER BY 1, 4;
ID VALUE POWER RESULT
---------- ---------- ---------- ----------
1 2 3 0
1 2 3 2
1 2 3 4
2 1 2 0
2 1 2 1
3 4 4 0
3 4 4 4
3 4 4 8
3 4 4 12
4 5 2 0
4 5 2 5
5 4 3 0
5 4 3 4
5 4 3 8
已选择14行。
下面看看这条语句的性能能否满足要求:
SQL> INSERT INTO T SELECT 5 + ID, VALUE, POWER FROM T;
已创建5行。
SQL> INSERT INTO T SELECT 10 + ID, VALUE, POWER FROM T;
已创建10行。
SQL> INSERT INTO T SELECT 20 + ID, VALUE, POWER FROM T;
已创建20行。
SQL> SET TIMING ON
SQL> SET AUTOT TRACE
SQL> WITH T1
2 AS
3 (
4 SELECT ID, VALUE, POWER, RESULT
5 FROM T
6 MODEL
7 PARTITION BY (ID, VALUE, POWER)
8 DIMENSION BY (0 RESULT)
9 MEASURES (0 V)
10 RULES
11 (
12 V[
13 FOR RESULT IN
14 (
15 SELECT ROWNUM FROM DUAL
16 CONNECT BY LEVEL <= (SELECT MAX(POWER) FROM T)
17 )
18 ] = 1
19 )
20 )
21 SELECT ID, VALUE, POWER, RESULT * VALUE RESULT
22 FROM T1
23 WHERE RESULT < POWER
24 ORDER BY 1, 4;
已选择112行。
已用时间: 00: 00: 00.01
执行计划
----------------------------------------------------------
Plan hash value: 3959148217
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 260 | 4 (25)| 00:00:01 |
| 1 | SORT ORDER BY | | 5 | 260 | 4 (25)| 00:00:01 |
|* 2 | VIEW | | 5 | 260 | 3 (0)| 00:00:01 |
| 3 | SQL MODEL ORDERED | | 5 | 45 | | |
| 4 | TABLE ACCESS FULL | T | 5 | 45 | 3 (0)| 00:00:01 |
| 5 | BUFFER SORT | | 1 | | | |
| 6 | COUNT | | | | | |
| 7 | CONNECT BY WITHOUT FILTERING| | | | | |
| 8 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("RESULT"<"POWER")
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
14 consistent gets
0 physical reads
0 redo size
2675 bytes sent via SQL*Net to client
462 bytes received via SQL*Net from client
9 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
112 rows processed
将数据量增大一倍:
SQL> SET AUTOT OFF
SQL> SET TIMING OFF
SQL> INSERT INTO T SELECT 40 + ID, VALUE, POWER FROM T;
已创建40行。
SQL> SET TIMING ON
SQL> SET AUTOT TRACE
SQL> WITH T1
2 AS
3 (
4 SELECT ID, VALUE, POWER, RESULT
5 FROM T
6 MODEL
7 PARTITION BY (ID, VALUE, POWER)
8 DIMENSION BY (0 RESULT)
9 MEASURES (0 V)
10 RULES
11 (
12 V[
13 FOR RESULT IN
14 (
15 SELECT ROWNUM FROM DUAL
16 CONNECT BY LEVEL <= (SELECT MAX(POWER) FROM T)
17 )
18 ] = 1
19 )
20 )
21 SELECT ID, VALUE, POWER, RESULT * VALUE RESULT
22 FROM T1
23 WHERE RESULT < POWER
24 ORDER BY 1, 4;
已选择224行。
已用时间: 00: 00: 00.01
执行计划
----------------------------------------------------------
Plan hash value: 3959148217
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 260 | 4 (25)| 00:00:01 |
| 1 | SORT ORDER BY | | 5 | 260 | 4 (25)| 00:00:01 |
|* 2 | VIEW | | 5 | 260 | 3 (0)| 00:00:01 |
| 3 | SQL MODEL ORDERED | | 5 | 45 | | |
| 4 | TABLE ACCESS FULL | T | 5 | 45 | 3 (0)| 00:00:01 |
| 5 | BUFFER SORT | | 1 | | | |
| 6 | COUNT | | | | | |
| 7 | CONNECT BY WITHOUT FILTERING| | | | | |
| 8 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("RESULT"<"POWER")
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
14 consistent gets
0 physical reads
0 redo size
4756 bytes sent via SQL*Net to client
539 bytes received via SQL*Net from client
16 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
224 rows processed
从执行时间上看,也没有任何变化,下面将数据量增加到1000:
SQL> SET AUTOT OFF
SQL> SET TIMING OFF
SQL> INSERT INTO T SELECT 80 + ID, VALUE, POWER FROM T;
已创建80行。
SQL> INSERT INTO T SELECT 160 + ID, VALUE, POWER FROM T;
已创建160行。
SQL> INSERT INTO T SELECT 320 + ID, VALUE, POWER FROM T;
已创建320行。
SQL> INSERT INTO T SELECT 640 + ID, VALUE, POWER FROM T;
已创建640行。
SQL> SET TIMING ON
SQL> SET AUTOT TRACE
SQL> WITH T1
2 AS
3 (
4 SELECT ID, VALUE, POWER, RESULT
5 FROM T
6 MODEL
7 PARTITION BY (ID, VALUE, POWER)
8 DIMENSION BY (0 RESULT)
9 MEASURES (0 V)
10 RULES
11 (
12 V[
13 FOR RESULT IN
14 (
15 SELECT ROWNUM FROM DUAL
16 CONNECT BY LEVEL <= (SELECT MAX(POWER) FROM T)
17 )
18 ] = 1
19 )
20 )
21 SELECT ID, VALUE, POWER, RESULT * VALUE RESULT
22 FROM T1
23 WHERE RESULT < POWER
24 ORDER BY 1, 4;
已选择3584行。
已用时间: 00: 00: 00.09
执行计划
----------------------------------------------------------
Plan hash value: 3959148217
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 260 | 4 (25)| 00:00:01 |
| 1 | SORT ORDER BY | | 5 | 260 | 4 (25)| 00:00:01 |
|* 2 | VIEW | | 5 | 260 | 3 (0)| 00:00:01 |
| 3 | SQL MODEL ORDERED | | 5 | 45 | | |
| 4 | TABLE ACCESS FULL | T | 5 | 45 | 3 (0)| 00:00:01 |
| 5 | BUFFER SORT | | 1 | | | |
| 6 | COUNT | | | | | |
| 7 | CONNECT BY WITHOUT FILTERING| | | | | |
| 8 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("RESULT"<"POWER")
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
14 consistent gets
0 physical reads
0 redo size
70133 bytes sent via SQL*Net to client
3003 bytes received via SQL*Net from client
240 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
3584 rows processed
即使表中数据量达到了1000,这个SQL的性能也完全可以满足要求。
虽然对比三种方法,似乎这种方式速度最快,但是这种方法会产生不定数量的重复记录,在上面的这个例子中,所有记录的POWER值比较接近,因此重复记录不是很多,一旦表中个别记录所需构造结果远远大于平均水平,就可能产生大量的重复记录,从而影响这个SQL的性能。
举个简单的例子:
SQL> UPDATE T SET POWER = 1000 WHERE ID = 1;
已更新 1 行。
已用时间: 00: 00: 00.03
SQL> WITH T1
2 AS
3 (
4 SELECT ID, VALUE, POWER, RESULT
5 FROM T
6 MODEL
7 PARTITION BY (ID, VALUE, POWER)
8 DIMENSION BY (0 RESULT)
9 MEASURES (0 V)
10 RULES
11 (
12 V[
13 FOR RESULT IN
14 (
15 SELECT ROWNUM FROM DUAL
16 CONNECT BY LEVEL <= (SELECT MAX(POWER) FROM T)
17 )
18 ] = 1
19 )
20 )
21 SELECT ID, VALUE, POWER, RESULT * VALUE RESULT
22 FROM T1
23 WHERE RESULT < POWER
24 ORDER BY 1, 4;
已选择4581行。
已用时间: 00: 00: 44.26
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
14 consistent gets
0 physical reads
0 redo size
87549 bytes sent via SQL*Net to client
3740 bytes received via SQL*Net from client
307 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
4581 rows processed
而不产生重复记录的第二种方法则不存在这个问题:
SQL> WITH T1 AS
2 (
3 SELECT ID, VALUE, POWER,
4 CAST (MULTISET(
5 SELECT VALUE * (LEVEL - 1)
6 FROM DUAL CONNECT BY ROWNUM <= POWER
7 ) AS T_TYPE) RESULT
8 FROM T
9 )
10 SELECT ID, VALUE, POWER, T.COLUMN_VALUE RESULT
11 FROM T1 A, TABLE(SELECT RESULT FROM T1 WHERE T1.ID = A.ID) T
12 ;
已选择4581行。
已用时间: 00: 00: 00.34
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
16703 consistent gets
0 physical reads
0 redo size
87540 bytes sent via SQL*Net to client
3740 bytes received via SQL*Net from client
307 SQL*Net roundtrips to/from client
1280 sorts (memory)
0 sorts (disk)
4581 rows processed
因此选择何种方法需要根据具体情况进行判断,第三种方式值得一提的是这个方式是完全利用SQL功能来实现的。











