Index statistics on complete refreshed materialized views. What is going on?

A few days ago I discovered some (to my feeling) buggy behavior during a COMPLETE refresh of a -partitioned- materialized view. It was an unfortunate situation, because it was discovered the day after going into production. It was not noticed during the weeks of development and pre-production execution. Seems like we have some improvement to do there. It had to do with the index statistics on GLOBAL indexes of a partitioned materialized view. The first time our ETL ran with some new materialized views implemented on them, the ETL did its job and COMPLETE refreshed a  new MV. So far, so good, that was the intention. However, countries started complaining about huge performance problems with some specific reports and another ETL part (only run on Production, which is one of those things I did not know and we need to improve a next time we test a new implementation) was taking hours and hours where it normally would take a few minutes.

First discovery

We soon found out the ETL and reports problems all had to do with this new materialized view, which was an ‘exact’ replacement of a manually refreshed aggregate table with the exact same indexes and columns. The statistics had also been gathered on the MV. I looked at the execution plans of the badly behaving ETL first and saw that Oracle was using a GLOBAL index defined on the new MV. The cardinality estimates in the execution plan were 1, however, the number of records to be retrieved were in the millions. So it had to be  a statistic issue. I checked and saw that all the indexes were last analyzed somewhere during the normal ETL and to be exact: they were analyzed as part of the COMPLETE refresh. At first I wondered: who did an analyze of that table’s indexes during the ETL, is it defined in the ETL somewhere? It took me a few moments to realize that the way we do the COMPLETE refresh is a so called ‘non-atomic’ one:

begin
  dbms_mview.REFRESH('SOME_JOIN_MV','COMPLETE',atomic_refresh=>FALSE);
end;
/

This effectively does a COMPLETE refresh by doing a TRUNCATE, followed by a direct path insert and an efficient rebuild of all the indexes. And it was this rebuild of the indexes that lead to the analyze of those indexes: Oracle 10g and above automatically (re-)analyzes the new or rebuilt indexes on a table when this table already has statistics gathered. So OK, I found out where the analyze came from. But why was the cardinality so wrong? I checked and to my surprise the num_rows statistic was set to zero (not NULL) for the GLOBAL indexes only. The partitioned indexes however where analyzed fine: their statistics were accurate. So I went testing this behavior (after first correcting the statistics on those indexes in Production to have performance return to normal) and I saw what Oracle did (among others by taking a SQLtrace during the COMPLETE refresh). And this was the situation on HP-UX Oracle 10.2.0.4 64bit:

  1. Oracle did a TRUNCATE of the MV
  2. Oracle set the indexes to unusable AND set the statistics of the GLOBAL indexes to zero!
  3. Oracle did the Direct Path Insert
  4. Oracle rebuilt the indexes –> automatic analyze of the LOCAL partitioned indexes, but not of the GLOBAL ones

So after the refresh the statistics on the GLOBAL indexes were zero and the LAST ANALYZE was from the moment after the TRUNCATE, but the LOCAL indexes statistics were fine: they seemed to have gathered normally during the rebuild. To me that sounds like a bug. I cannot explain why this behavior would be like that. And that was the cause of the performance problems: with index statistics set to zero, Oracle considered these GLOBAL indexes as extremely cheap and used them in about every query. But that was completely misleading information. I adjusted the materialized view refresh stored procedure by adding an explicit dbms_stats.gather_index_stats on those GLOBAL indexes and everything went back to normal.

But then…

 …I wanted to wrote this post and started retesting this on my Windows XP 64bit 10.2.0.4. The same version (I may be off by a minor patchlevel, I need to check) and the same kind of MV (not the same one, but also a join MV). To my surprise during test Oracle did a good job at the COMPLETE refresh for the GLOBAL indexes: now these were gathered also and to be most specific: as one of the last steps of the refresh. Seems like the thing that was forgotten on the HP-UX version. But now another thing seems wrong: the TABLE level statistics of the LOCAL partitioned indexes are not gathered. The example I prepared was:

DROP TABLE some_fact;
DROP TABLE some_domain;

CREATE TABLE some_fact AS SELECT * FROM all_objects;
CREATE TABLE some_domain AS SELECT * FROM all_tables;

CREATE UNIQUE INDEX pk_some_domain ON some_domain (owner,table_name);

CREATE MATERIALIZED VIEW some_join_mv
PARTITION BY LIST (object_type)
(   PARTITION part_table VALUES ('TABLE'),  
    PARTITION part_index VALUES ('INDEX'),  
    PARTITION part_proc VALUES ('PROCEDURE','PACKAGE','FUNCTION'),  
    PARTITION part_remaining VALUES (DEFAULT))
BUILD IMMEDIATE REFRESH
COMPLETE ON DEMAND AS
SELECT f.object_name
     , f.object_type
     , f.owner
     , f.status
     , f.last_ddl_time
     , d.avg_row_len
     , d.blocks
     , d.empty_blocks
     , d.partitioned
     , d.tablespace_name
  FROM some_fact f, some_domain d
 WHERE f.object_name = d.table_name(+) AND f.owner = d.owner(+);

BEGIN
   DBMS_STATS.gather_table_stats (ownname               => USER
                                , tabname               => 'SOME_JOIN_MV'
                                , estimate_percent      => 100
                                , granularity           => 'ALL'
                                , method_opt            => 'FOR ALL COLUMNS SIZE 1'
                                , CASCADE               => TRUE
                                 );
END;
/

CREATE  INDEX idx_some_join_mv_p1 ON some_join_mv (object_name, object_type, owner) GLOBAL;
CREATE  INDEX idx_some_join_mv_type_local ON some_join_mv (object_type) LOCAL;

SELECT   index_name
       , table_name
       , partition_name
       , leaf_blocks
       , distinct_keys
       , clustering_factor
       , num_rows
       , last_analyzed
    FROM all_ind_statistics
   WHERE table_name = 'SOME_JOIN_MV'
ORDER BY index_name, partition_name NULLS FIRST;

Results were (the TABLE level statistics of the LOCAL partitioned index in Red):

index_name table_name partition_name leaf_blocks distinct_keys clustering_factor num_rows last_analyzed
IDX_SOME_JOIN_MV_P1 SOME_JOIN_MV   589 50675 23994 77427 14-3-2009 21:13:36
IDX_SOME_JOIN_MV_TYPE_LOCAL SOME_JOIN_MV   242 29 1061 77427 14-3-2009 21:13:36
IDX_SOME_JOIN_MV_TYPE_LOCAL SOME_JOIN_MV PART_INDEX 7 1 20 2804 14-3-2009 21:13:36
IDX_SOME_JOIN_MV_TYPE_LOCAL SOME_JOIN_MV PART_PROC 4 3 27 1234 14-3-2009 21:13:36
IDX_SOME_JOIN_MV_TYPE_LOCAL SOME_JOIN_MV PART_REMAINING 218 24 971 68115 14-3-2009 21:13:36
IDX_SOME_JOIN_MV_TYPE_LOCAL SOME_JOIN_MV PART_TABLE 13 1 43 5274 14-3-2009 21:13:36

Then I deleted almost all records in the Fact table, did a COMPLETE refresh and see what the index statistics were after that:

DELETE FROM some_fact
      WHERE ROWNUM <= 77000;
COMMIT ;

BEGIN
   dbms_mview.REFRESH ('SOME_JOIN_MV', 'COMPLETE', atomic_refresh => FALSE);
END;
/

See the inaccuracy:

index_name table_name partition_name leaf_blocks distinct_keys clustering_factor num_rows last_analyzed
IDX_SOME_JOIN_MV_P1 SOME_JOIN_MV   3 421 18 427 14-3-2009 21:20:50
IDX_SOME_JOIN_MV_TYPE_LOCAL SOME_JOIN_MV   242 29 1061 77427 14-3-2009 21:13:36
IDX_SOME_JOIN_MV_TYPE_LOCAL SOME_JOIN_MV PART_INDEX 1 1 1 2 14-3-2009 21:20:50
IDX_SOME_JOIN_MV_TYPE_LOCAL SOME_JOIN_MV PART_PROC 1 3 1 8 14-3-2009 21:20:50
IDX_SOME_JOIN_MV_TYPE_LOCAL SOME_JOIN_MV PART_REMAINING 1 4 1 18 14-3-2009 21:20:50
IDX_SOME_JOIN_MV_TYPE_LOCAL SOME_JOIN_MV PART_TABLE 1 1 4 399 14-3-2009 21:20:50

All indexes have accurate statistics, but the TABLE level statistics on IDX_SOME_JOIN_MV_TYPE_LOCAL are still the old ones.

So I witnessed two strange things in INDEX statistics with COMPLETE refreshed materialized views. Can anyone tell me the reasons…? At least I learned to pay carefull attention to the statistics when dealing with materialized views refreshes, that is for sure!

About these ads

3 thoughts on “Index statistics on complete refreshed materialized views. What is going on?

  1. Did you compare behaviour with an atomic refresh?
    The atomic refresh will do a delete rather than a truncate.
    Was the non-atomic refresh option chosen deliberately, e.g. for performance?

    • Hi Dominic, thanks for the comment!

      Yes, it is certainly intentional to do the TRUNCATE version (atomic=>FALSE), because of the huge amount of records in the table (73 million currently), the existence of about 15 indexes on the table and the fact that the ETL runs during the night and the TRUNCATE is accepted by everyone (acidental atomic COMPLETE refreshes during tests of other MV’s wre dramatic: extremely long refresh times so if you can do the TRUNCATE version, I would advise it, but be aware of other side effects). I wanted to do it with a FAST refresh, but as it turned out one of the underlying MV definitions has a construction in it that lead to data errors during testing. We still have to find out exactly what happens. Since a COMPLETE refresh (with atomic=>FALSE) was still by far superior to the previous implementation with the manual “refresh” of a table using an ETL tool, we choose the COMPLETE refresh to at least gain about 2-3 hours. As an atomic refresh is purely DML (indeed with a DELETE instead of a TRUNCATE), I expected that there would be no statistics calculation during the refresh and I just tested it: the statistics are completely untouched with an atomic refresh as expected. That is also certainly something to take into account: when there are huge dataloads happening or huge deletes/updates, you may want to gather stats after the atomic refresh to keep them in sync with the real data.

  2. FYI – Oracle acknowledged this as a bug (#5632128). I find your post very helpful though in my initial research into a similar issue we faced here in my company.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s