The Oracle Instructor

Uwe Hesse about Oracle Core Technology

Automatic DOP in 11gR2

with one comment

We have a probably very needful new feature introduced in 11g Release 2, related to parallel query: Automatically determined Degree of Parallelism (DOP). In earlier versions of the Oracle Database, we had to determine the DOP more or less manually, either with a parallel hint or by setting a parallel degree with alter table:

select /*+ parallel (sales,2) */ * from sales;

or

alter table sales parallel 2;

There was an automatic computation of the DOP available, derived from the simple formula CPU_COUNT * PARALLEL_THREADS_PER_CPU. That is what’s done internally if we would have said

select /*+ parallel (sales) */ * from sales;

or

alter table sales parallel;

The drawback with these approaches was always, that we could hardly be sure, whether the DOP is appropriate or not for the table, the statement and the hardware, we are running on. It was mostly a case of try & error. Especially problematic was the alter table approach, as this leads to the parallelization of each and every following select on those tables, even if totally inappropriate. A popular pitfall is the creation of tables with a parallel clause on OLTP-systems, because those tables inherit the parallel degree of their creation, which leads to parallel query for every statement accessing the tables afterwards – most likely not desirable for OLTP. In so far, parallelization was quite dumb (from a system-internal perspective) in versions before 11gR2. Now to the new feature:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

SQL> select bytes/1024/1024 as mb from user_segments where segment_name='SALES';

 MB
----------
 563

SQL> select degree from user_tables where table_name='SALES';

DEGREE
----------------------------------------
 1

For my tiny machine, this table is huge. I allow automatic determination of the DOP with the following new dynamic parameter:

SQL> alter session set parallel_degree_policy=auto;

Session altered.

SQL> set autotrace on explain

SQL> select sum(amount_sold) from sales

SUM(AMOUNT_SOLD)
----------------
 1571293299

Execution Plan
----------------------------------------------------------
Plan hash value: 3130505568

-----------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |     5 | 10798   (1)| 00:02:10 |
|   1 |  SORT AGGREGATE        |          |     1 |     5 |            |          |
|   2 |   PX COORDINATOR       |          |       |       |            |          |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |     5 |            |          |
|   4 |     SORT AGGREGATE     |          |     1 |     5 |            |          |
|   5 |      PX BLOCK ITERATOR |          |    14M|    70M| 10798   (1)| 00:02:10 |
|   6 |       TABLE ACCESS FULL| SALES    |    14M|    70M| 10798   (1)| 00:02:10 |
-----------------------------------------------------------------------------------

Note
-----
 - automatic DOP: Computed Degree of Parallelism is 2 because of degree limit

I have got a moderate DOP, most likely appropriate for my relatively weak hardware, but still speeding up the query on the relatively big table. I could always override the automatic DOP determination by specifying a parallel hint as in earlier versions. Also, the parameter defaults to manual, so unless we change it, automatic parallelization will not take place. In order to demonstrate the quite intelligent computation of the DOP, compared to ealier versions, i will access another, much smaller table in the same session:

SQL> select count(*) from customers;

 COUNT(*)
----------
 30501

Execution Plan
----------------------------------------------------------
Plan hash value: 296924608

------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |   218   (0)| 00:00:03 |
|   1 |  SORT AGGREGATE    |           |     1 |            |          |
|   2 |   TABLE ACCESS FULL| CUSTOMERS | 30501 |   218   (0)| 00:00:03 |
------------------------------------------------------------------------

Note
-----
 - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

As you can see, in this case, the system does not think that parallelization is appropriate for the select. So it is much smarter than the old force parallel query:

SQL> alter session set parallel_degree_policy=manual;

Session altered.

SQL> alter session force parallel query;

Session altered.

SQL>  select count(*) from customers;

 COUNT(*)
----------
 30501

Execution Plan
----------------------------------------------------------
Plan hash value: 1221513835

----------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows  | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |     1 |   121   (0)| 00:00:02 |
|   1 |  SORT AGGREGATE        |           |     1 |            |          |
|   2 |   PX COORDINATOR       |           |       |            |          |
|   3 |    PX SEND QC (RANDOM) | :TQ10000  |     1 |            |          |
|   4 |     SORT AGGREGATE     |           |     1 |            |          |
|   5 |      PX BLOCK ITERATOR |           | 30501 |   121   (0)| 00:00:02 |
|   6 |       TABLE ACCESS FULL| CUSTOMERS | 30501 |   121   (0)| 00:00:02 |
----------------------------------------------------------------------------

Also, we have a remedy now against inappropriate parallel degrees on tables:

SQL> alter session enable parallel query -- the default, no force;
Session altered.
SQL> alter table customers parallel -- would cause parallel query before 11gR2;
Table altered.
SQL> set autotrace on explain
SQL> select count(*) from customers;
 COUNT(*)
----------
 30501
Execution Plan
----------------------------------------------------------
Plan hash value: 296924608
------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |   218   (0)| 00:00:03 |
|   1 |  SORT AGGREGATE    |           |     1 |            |          |
|   2 |   TABLE ACCESS FULL| CUSTOMERS | 30501 |   218   (0)| 00:00:03 |
------------------------------------------------------------------------
Note
-----
 - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

Written by Uwe Hesse

November 24, 2009 at 19:22

Is count(col) better than count(*)?

with 9 comments

One question, that surfaces sometimes in my courses (especially in Performance Tuning) is, whether there is a benefit in avoiding count(*) in favor of count(col). The short answer is: No.

Dealing with Oracle DBAs and Developers has tought me, though, that these guys usually are not satisfied with the short answer; they require some kind of proof – which is a good thing, in my view. So here we go:

SQL> select count(*) from sales;

 COUNT(*)
----------
 14701488

Elapsed: 00:00:33.30
SQL> select count(cust_id) from sales;

COUNT(CUST_ID)
--------------
 14701488

Elapsed: 00:00:04.06
SQL> select count(*) from sales;

 COUNT(*)
----------
 14701488

Elapsed: 00:00:34.49
SQL> select count(cust_id) from sales;

COUNT(CUST_ID)
--------------
 14701488

Elapsed: 00:00:04.20

I think the above observation is responsible for the appearing of the myth that count(col) is superior. In fact, right now it is faster (about 10 times!) as the count(*). I did the two selects twice to show that caching has not much to say here. Unfortunately, the second select with count(col) is faster, but not necessarily correct! Watch it:

SQL> update sales set cust_id=null where rownum<2;
1 row updated.
Elapsed: 00:00:00.23
SQL> commit;
Commit complete.
Elapsed: 00:00:00.01
SQL> select count(*) from sales;
 COUNT(*)
----------
 14701488
Elapsed: 00:00:34.84
SQL> select count(cust_id) from sales;
COUNT(CUST_ID)
--------------
 14701487
Elapsed: 00:00:03.73

The count(cust_id) is still faster by far – but it shows a wrong result, should you be interested in the number of rows of the table  You probably now see already the point I am after: There is an index on the cust_id column that is used for count(cust_id), but not for count(*). If the column counted does not contain any NULL values, the result is identical, but the runtime is faster. The origin of the myth! The point is: If you would declare the indexed column as NOT NULL, the optimizer would know that it can use the index for the count(*):

SQL> update sales set cust_id=1  where rownum<2;
1 row updated.
Elapsed: 00:00:00.10
SQL> commit;
Commit complete.
Elapsed: 00:00:00.10
SQL> alter table sales modify (cust_id NOT NULL);
Table altered.
Elapsed: 00:00:38.72
SQL> set autotrace on explain
SQL> select count(*) from sales;
 COUNT(*)
----------
 14701488
Elapsed: 00:00:03.81
Execution Plan
----------------------------------------------------------
Plan hash value: 2989616816
-----------------------------------------------------------------------------------
| Id  | Operation             | Name              | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                   |     1 |  8499   (1)| 00:01:42 |
|   1 |  SORT AGGREGATE       |                   |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| SALES_CUST_ID_IDX |    14M|  8499   (1)| 00:01:42 |
-----------------------------------------------------------------------------------
SQL> alter table sales modify (cust_id NULL);
Table altered.
Elapsed: 00:00:00.27
SQL> select count(*) from sales;
 COUNT(*)
----------
 14701488
Elapsed: 00:00:36.00
Execution Plan
----------------------------------------------------------
Plan hash value: 1047182207
--------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 | 19398   (1)| 00:03:53 |
|   1 |  SORT AGGREGATE    |       |     1 |            |          |
|   2 |   TABLE ACCESS FULL| SALES |    14M| 19398   (1)| 00:03:53 |
--------------------------------------------------------------------

As a general rule of thumb, you should always declare columns NOT NULL if you know that NULL values can’t be in that columns, in order to make the optimizer aware of that important information also. By the way, if you do count(1) instead of count(*), the outcome is the same:

SQL>  alter table sales modify (cust_id NOT NULL);
Table altered.
SQL> select count(1) from sales;
 COUNT(1)
----------
 14701488
Elapsed: 00:00:03.12
Execution Plan
----------------------------------------------------------
Plan hash value: 2989616816
-----------------------------------------------------------------------------------
| Id  | Operation             | Name              | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                   |     1 |  8499   (1)| 00:01:42 |
|   1 |  SORT AGGREGATE       |                   |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| SALES_CUST_ID_IDX |    14M|  8499   (1)| 00:01:42 |
-----------------------------------------------------------------------------------

Written by Uwe Hesse

October 29, 2009 at 16:34

Posted in TOI

Tagged with

Dropping a table during SELECT

with 7 comments

In my last course, I had one student asking the question: “What happens if we do select on a table and someone else drops it during it?” Of course, you cannot drop a table while an open transaction uses the table. But what for a select: will it go through? Although this question seems to be a little academic – which I responded immediately – there was no obvious answer, I was aware of. So I did setup a little test case for it.

I like to answer questions with practical examples if possible. The drop table has no effect at all on the select statement for small tables, of course, if done after the select started, because the select is just too fast to notice the drop. So my test table is more than 4 Gig in size and has got 117611904 rows in it. Surprisingly, there is a difference in the outcome, depending whether you drop the table with purge or not.

SQL> select count(*) from sales;

 COUNT(*)
----------
 117611904

Elapsed: 00:00:55.74

I have two sessions connected. In the first session, I do select. In the second session, I drop the table. Without dropping, the select needs about 1 minute to complete as shown above. Now the drop table without purge (basically just renaming the table in the data dictionary, enabling flashback to before drop since 10g):

SQL> select count(*) from sales;
 
 COUNT(*)
----------
 117611904
 
Elapsed: 00:01:01.25
SQL> drop table sales;

Table dropped.

The drop table statement completed much faster than the select above. Apparently, the select keeps running, fetching the rows from the “dropped” table in the recyclebin.

Now I really drop the table:

SQL> flashback table sales to before drop;
Flashback complete.

SQL> select count(*) from sales;
select count(*) from sales
 *
ERROR at line 1:
ORA-08103: object no longer exists

Elapsed: 00:00:05.17

SQL>  drop table sales purge;

Table dropped.

This interrupts the select statement in the first session. Well, I doubt that there is much practical implication of this, but I found it interesting in spite of. Thanks to Mr. Heinz Schmitz for raising that question :-)

Written by Uwe Hesse

October 27, 2009 at 14:53

Posted in TOI

Tagged with

“Total Recall”: Brief introduction into Flashback Data Archive

with 6 comments

With Oracle Database 11g, we have a new Option available, called Total Recall. This option extends the possibility to do Flashback Query, introduced in 9i already.

We can now designate particular tables for being able to track back all the changes on them even after years. This option addresses especially legal requirements to keep history for certain data for several years. With Total Recall resp. Flashback Data Archive, this can be achieved quite comfortable and efficient. In order to demonstrate that, I create a demo user, giving him the DBA role (kind of “quick & dirty” of course, but makes the demo easier):

SQL> grant dba to quaid identified by quaid;

Grant succeeded.

SQL> connect quaid/quaid
Connected.

SQL> create tablespace flatbs
 datafile '/u01/app/oracle/oradata/orcl/flatbs01.dbf' size 50m;

Tablespace created.

This tablespace is just an ordinary tablespace, but I am going to use it to hold the Flashback Archive, created now:

SQL> create flashback archive fla_10y
 tablespace flatbs retention 10 year;

Flashback archive created.

If I designate a table to fla_10y, I will be able to do Flashback Query on that table even after 10 years – provided I have the space to hold that history in the tablespace(s), associated to that Flashback Archive. Following creates two ordinary tables with ordinary rows in them:

SQL> create table norecall
 (id number, name varchar2(50));

Table created.

SQL> create table totalrecall
 (id number, name varchar2(50));

Table created.

SQL> insert into norecall values (1,'QUAID');

1 row created.

SQL> insert into  totalrecall values (1,'QUAID');

1 row created.

SQL> commit;

Commit complete.

I want to be able to keep the history of the second table for 10 years:

SQL> alter table totalrecall flashback archive fla_10y;

Table altered.

SQL> select to_char(systimestamp,'yyyy-mm-dd:hh24:mi:ss')
 from dual;

TO_CHAR(SYSTIMESTAM
-------------------
2009-10-21:10:55:47

I will do DML on the two tables now. As you probably know, this leads to the creation of before images in the undo tablespace, which we can use to look into the past of the tables – but most likely not after 10 years, because the before images in the undo tablespace are getting overwritten somewhen.

SQL> delete from norecall;

1 row deleted.

SQL> commit;

Commit complete.

SQL> select * from norecall as of timestamp
 to_timestamp('2009-10-21:10:55:47','yyyy-mm-dd:hh24:mi:ss');

 ID NAME
---------- --------------------------------------------------
 1 QUAID

SQL>  delete from  totalrecall;

1 row deleted.

SQL> commit;

Commit complete.

The select above is a Flashback Query, as it is possible since 9i. I will now make sure, that there are no before images left in the undo tablespace by creating a new one and dropping the old one.

SQL> create undo tablespace undonew datafile
 '/u01/app/oracle/oradata/orcl/undonew01.dbf' size 50m;
Tablespace created.
SQL> alter system set undo_tablespace='UNDONEW';
System altered.
SQL> select * from norecall as of timestamp
 to_timestamp('2009-10-21:10:55:47','yyyy-mm-dd:hh24:mi:ss');
 ID NAME
---------- --------------------------------------------------
 1 QUAID

As long as the old undo tablespace is still there, it may still get used for a conventional Flashback Query. But no longer after the drop tablespace. The before images of the history-tracked table, though, are saved into the Flashback Archive (getting compressed during the transfer), which would be the same, if the content of the old undo tablespace gets overwritten by new before images. That’s why I still can do Flashback Query with totalrecall:

SQL> connect / as sysdba
Connected.
SQL> shutdown immediate

SQL> startup

SQL> drop tablespace undotbs1 including contents and datafiles;

Tablespace dropped.
SQL> connect quaid/quaid
Connected.

SQL> select * from norecall as of timestamp
 to_timestamp('2009-10-21:10:55:47','yyyy-mm-dd:hh24:mi:ss');  
select * from norecall as of timestamp
 *
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number  with name "" too small

SQL> select * from totalrecall as of timestamp
 to_timestamp('2009-10-21:10:55:47','yyyy-mm-dd:hh24:mi:ss');   

 ID NAME
---------- --------------------------------------------------
 1 QUAID

I will insert a new row into the ordinary table to demonstrate further differences between it and the history tracked table:

SQL> insert into norecall values (2,'John Doe');
1 row created.
SQL> commit;
Commit complete.
SQL> select to_char(sysdate,'yyyy-mm-dd:hh24:mi:ss') from dual;
TO_CHAR(SYSDATE,'YY
-------------------
2009-10-23:02:14:28

Upto this point of the story, the shown features are the same in 11g release 1 already available. Following are new features of 11g release 2: We are now able to do DDL statements on the history-tracked tables and can still do Flashback Query!

SQL> alter table norecall drop column name;

Table altered.

SQL> select * from norecall  as of timestamp
 2  to_timestamp('2009-10-23:02:14:28','yyyy-mm-dd:hh24:mi:ss');
select * from norecall  as of timestamp
 *
ERROR at line 1:
ORA-01466: unable to read data - table definition has changed

SQL> select to_char(sysdate,'yyyy-mm-dd:hh24:mi:ss') from dual;

TO_CHAR(SYSDATE,'YY
-------------------
2009-10-23:02:17:06

SQL> truncate table norecall;

Table truncated.

SQL> select * from norecall  as of timestamp
 2  to_timestamp('2009-10-23:02:17:06','yyyy-mm-dd:hh24:mi:ss');
select * from norecall  as of timestamp
 *
ERROR at line 1:
ORA-01466: unable to read data - table definition has changed

Above demonstrated that DDL on an ordinary table makes Flashback Query behind the DDL impossible. That was nothing new. New is:

SQL> alter table totalrecall drop column name;

Table altered.

SQL> select * from totalrecall as of timestamp
 to_timestamp('2009-10-21:10:55:47','yyyy-mm-dd:hh24:mi:ss');  

 ID NAME
---------- --------------------------------------------------
 1 QUAID

SQL> truncate table totalrecall;

Table truncated.

SQL> select * from totalrecall as of timestamp
 to_timestamp('2009-10-21:10:55:47','yyyy-mm-dd:hh24:mi:ss');   

 ID NAME
---------- --------------------------------------------------
 1 QUAID

With 11g release 2, we can do Flashback Query behind DDL on tables in the Flashback Archive! In 11g release 1, DDL on history-tracked tables was prohibited, which was a show-stopper for many customers.  One DDL, we still can’t get through is of course:

SQL> drop table totalrecall;
drop table totalrecall
 *
ERROR at line 1:
ORA-55610: Invalid DDL statement on history-tracked table

Written by Uwe Hesse

October 23, 2009 at 15:35

Posted in TOI

Tagged with , ,

Real-Time SQL Monitoring with Database Control

with 4 comments

Another Oracle Database 11g Release 2 “Enterprise Manager New Feature”  is Real-Time SQL Monitoring. On the command line, that was already possible with 11g Release 1, where the views V$SQL_MONITOR and V$SQL_PLAN_MONITOR have been introduced. Now with the new Release, we can do that with the GUI. I have prepared a little demonstration for that by creating a larger sales table (560 M), so that the select statement takes more than 30 seconds on my tiny system. That gives me enough time to watch it real-time and capture the screen:

First we go to the Performance page, Top Activity as usual to spot our High Load statements. Notice the additional card named SQL Monitoring

RT_SQL1The statement did run several times before, so there are already statistics from previous runs on the screen. Now while the statement is running, we can leave the refresh rate to 15 seconds or refresh manually:

RT_SQL2The statement is shown with a duration of 20 seconds and still running. After completion, the screen looks like that:

RT_SQL3

Now isn’t that a nice feature to observe the most critical and long running statements in Real-Time?

P.S.: As I just got aware (thanks to Niall Litchfield) this was already possible with Database Control in Release 1 – just from a less obvious place in that tool. So just take it as a 11g New Feature, not as 11g R2 New Feature :-)

Written by Uwe Hesse

October 1, 2009 at 15:15

Posted in TOI

Tagged with