Dave Abercrombie, Principal
		Database Architect, Convio 
©
		2008
Northern California Oracle Users' Group (NoCOUG) Summer Conference 2008, August 21
		2008
Introduced in version 10g, Oracle's Automatic Workload Repository (AWR) provides diagnostic information for performance and scalability studies, automatically recording a rich variety of database performance statistics.
What's the best way to leverage this wealth of data? While you can run Oracle-supplied AWR reports, or use Oracle features such as the Automatic Database Diagnostic Monitor (ADDM), each Oracle database presents its own unique tuning challenges. In this paper, you'll learn how to work directly with AWR tables, using customized queries to improve insight into your own particular scalability issues.
Topics include:
Important AWR tables, their contents, how to join them, and their quirks and limitations.
Sample queries that can be easily adapted to focus on your own unique set of problems.
Estimating the "Average Active Session" metric.
Simple statistical techniques to find spikes and other types of anomalous behavior.
A comparison of techniques used for historical scalability studies with those used for real-time performance crisis resolution..
Use of DBMS_APPLICATION_INFO and JDBC end-to-end metrics.
Useful tips on configuring AWR.
This paper also applies some industrial and quality engineering approaches recently described by Robyn Sands to the use of AWR tables. These ideas are also combined with use of the DB time metric championed by Kyle Hailey. I show below the outline for this paper, and a Microsoft PowerPoint version is also available.
The Automatic Workload Repository (AWR) takes "snapshots" of database performance statistics periodically, and records these data for later review. It is covered by Oracle's Diagnostic Pack License.
These AWR data are available in about 80 "tables" whose names begin with
		"DBA_HIST". Oracle uses the AWR data internally for its self-tuning feature
		called Automatic Database Diagnostic Monitor (ADDM),
		which includes a variety of reports and other tools. The ADDM will not be
		discussed further in this paper. Instead, this paper will focus on interactive
		use of custom queries using the AWR DBA_HIST tables.
The AWR records performance data from several perspectives, some of which are shown below. This paper primarily focuses on the SQL perspective, however the techniques presented here can be easily adapted to other perspectives.
AWR is not used for real-time performance monitoring like the V$ tables. Instead, it is used for historical analysis of performance. AWR
		complements, but does not replace, real-time monitoring. Example uses include
		the following:
Databases that you rarely, if ever, review
DBAs are often
			 responsible for more databases than they can watch closely at all times.
			 Consultants are often called into to diagnose and fix a database they have
			 never seen. A company might acquire another company's databases, suddenly
			 becoming responsible for them. A DBA might not have instant access to a
			 suffering database. In all these cases, AWR can provide vital data that is
			 otherwise not available.
Find and drill down into recent load spikes
A real-time
			 monitor might detect a database problem in time to alert DBAs, but diagnoses
			 might take a while. AWR automatically preserves vital performance data to
			 support thorough diagnosis over a period of days.
Prioritize developer and DBA resources
Many database
			 applications have a backlog of performance problems, all requiring limited
			 developer and DBA resources to fix. AWR can provide vital objective and
			 quantitative data to help prioritize such fixes, ensuring the most effective
			 use of these limited human resources.
Load testing
It is often possible to organize load testing
			 projects so that each test run fits within a set of AWR snapshots, with each
			 snapshot containing only one test run. When this approach is used, the detailed
			 performance data gathered automatically by AWR provides vital insight that is
			 simply not visible when using only application-level metrics. AWR can also be
			 used to assess comparability of the various test runs: it can tell you, for
			 example, if a given SQL statement executed the same number of times in each
			 test.
Capacity planning
The long-term view supported by AWR can
			 help assess overall database capacity. Some applications can use multiple
			 databases, and AWR can show which of these databases have the most remaining
			 capacity, thereby informing decisions about where to put new data or
			 customers.
Real-time monitors might be inadequate
It is not uncommon
			 for real-time database monitors to be poor, missing, or unreliable. Or perhaps
			 monitoring takes place only at the application level. AWR can make up for these
			 deficiencies by providing a complete record of the full range of Oracle
			 internal diagnostics.
An investigation into a database performance problem might involve a series of goals similar to the ones outlined below. Each of these goals calls for a slightly different set of AWR tables or query design. Although this outline focuses on an SQL perspective, these goals can be adapted to other perspectives, such as an alternative focus on segments. This paper presents example AWR queries for each of these different goals.
Find time ranges of load spikes
Often, the first step in
			 an investigation is to determine the specific time ranges of the problem. This
			 is usually the case even if external, application-level monitors have alerted
			 you to the problem. It is especially true if you are on a "fishing expedition",
			 with no knowledge of any particular problem or time range. One of the best
			 tools for this goal is to examine the history of the Average Active Session (AAS) metric.
Find specific problem SQLs
Once the time range of the
			 performance problem has been determined, the next step is to determine the
			 specific SQL statements associated with the problem. Several techniques are
			 available as shown below. These techniques can also be used as the starting
			 point of an investigation.
Active Session History (ASH) and its AWR version
ASH
				  provides a wealth of detail about active sessions and their problems. It
				  records session-level details that are not always preserved by AWR. However,
				  ASH does not include many of the performance statistics that are preserved by
				  AWR, and so these two Oracle features are complementary. This paper does not go
				  into detail about ASH.
Sort by aggregated statistics
Since AWR stores
				  performance statistics aggregated per snapshot for each SQL statement, a very
				  natural and obvious technique is to simply sum these up, sort them, then look
				  for statements with the highest sums. Usually, this is done with some
				  consideration of time range, such as "since the deployment of the last code
				  version" or "since addition of a new dataset", etc.
Non-uniform statistics
Many of the most interesting
				  and severe problems with database performance involve queries that normally
				  execute well, but occasionally create problems. For example, a query might
				  usually consume only a second or two of DB time per hour, then suddenly
				  take over the CPUs and cause loss of application functionality. If these spikes
				  are brief, they might not be visible through a review of gross aggregate data.
				  So a method of finding non-uniformities is needed. One such method, the use of
				  statistical "variance," is described below. 
Characterize a problem SQLs behavior over time
It is
			 often extraordinarily useful to examine a SQL statement's behavior over time.
			 Factors that can be easily seen include:
One reason that these time-related factors are so very useful is that they can tie internal database diagnostics to the applictaion itself. For example, a statement might execute only at night, which can provide clues to its source within the aplplication.
By default, AWR will take snapshots once per hour, at the "top" of each hour. Also, by default, AWR will retain only a week's worth of snapshots.
In my experience, the hourly interval is appropriate. However, I much prefer to retain a full month of data. Most of us have workweeks that are very busy and filled with crises, so being able to save AWR data for more than a week is very important. Also, some trends, or other changes, are much easier to spot when a full month of data are available.
Of course, storage needs increase along with snapshot frequency or length of retention. Storage needs will also probably vary with level of activity and application behavior. Oracle claims that about 300 megabytes are needed for a one-week retention of hourly snapshots; more space will be needed for longer retion periods.
The default AWR settings are modified using Oracle supplied package
		method
		DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS().
		Example syntax is shown below. Both retention and interval arguments expect units of minutes. See the Oracle documentation for
		more details.
DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS( retention IN NUMBER DEFAULT NULL, interval IN NUMBER DEFAULT NULL, topnsql IN NUMBER DEFAULT NULL, dbid IN NUMBER DEFAULT NULL);
The "Average Active Session" (AAS) metric is an extraordinarily simple and useful measure of the overall health of an Oracle database. This metric has been championed by Kyle Hailey, and some of his material can be found at the links below in the reference section.
The AAS metric can be defined as "DB time" divided by "elapsed time".
In turn, DB time is defined as the sum of time spent by all sessions both on the CPU and stuck in non-idle wait states. In other words, DB time can be thought of as the sum of time spent by all active sessions.
For example, let's say a database had four sessions that were active for the duration of a one-minute observation. Each session contributes one minute to the sum of DB time in the numerator, giving a total DB time value of four minutes. In this example, the "elapsed time" denominator is one minute. Doing the division gives an AAS of four. In this trivial example, it is easy to see how an AAS metric of four relates to having four active sessions. In this example, there was an average of four active sessions.
If we generalize the example to include more sessions that have various periods of active status, the calculation of the AAS metric still gives a sense of the average number of active sessions during the observation period. This explains the name of the metric, even though it is not explicitly defined based on counts or averages of sessions.
The AAS metric is most useful when compared to the number of CPUs available. If the AAS metric far exceeds the number of CPUs, then database performance will suffer. On the other hand, if the AAS metric is significantly less than the number of CPUs, then database performance should be fine. The threshold value of the AAS metric above which database performance suffers depends upon the application behavior and the expectations of the end users. Therefore, this threshold AAS value should be determined empirically within the context of the application. Once this threshold AAS value has been determined, this metric can serve as a very reliable, and readily available, indicator of overall database performance problems. In fact, reducing the DB time metric is the main goal of Oracle's built-in ADDM self-tuning tool.
The AAS metric can be calculated exactly, or it can be estimated, as described below. All methods, both using estimates or exact calculations, depend upon the choice the "elapsed time" denominator. The choice of the "elapsed time" denominator relates to issues of non-uniformity, as explained next.
The AAS is inherently an aggregate, or averaged, metric. Average-based metrics always hide any non-uniformity of the underlying data. Often, it is exactly this hidden non-uniformity that is most important for understanding an Oracle performance problem. Therefore, effective use of the metric requires that we understand and detect non-uniformity.
Note: several Oracle authors use the term "skew" to refer to any type of non-uniformity. The statistical definition of skew actually is limited to one particular type of non-uniformity. However, due to these existing precedents within the Oracle literature, this paper also uses the term "skew" to refer to any type of non-uniformity.
As will be shown with examples below, the choice of the "elapsed time" denominator is crucial to detection of skew. A large elapsed time (e.g., one hour) is convenient to use, since there relatively few rows to look at or plot. Also, an hourly average might be the most appropriate for many applications or studies. However, such large elapsed time intervals might mask brief spikes of vital interest. Using a small elapsed time instead (e.g., one minute) might reveal such brief spikes, but at the expense of more visual clutter, data volume, etc. The best approach, as described in detail below, is to use a large elapsed time when possible for convenience, but to also know when to focus in, using small elapsed times to identify brief spikes.
Robyn Sands recently presented a paper that outlines effective ways to detect skew (see reference section below). She suggests calculating the ratio of the variance to the mean of the observations: a "high" ratio indicates skew. This approach is easy to incorporate into estimates of AAS, and is demonstrated below.
Oracle includes DB time as one of the statistics listed in
		V$STATNAME,
		and its cumulative value is available in
		V$SYSSTAT.
		Unfortunately, the cumulative value is nearly worthless for most investigative
		purposes. Conceivably, to calculate the AAS metric, one could periodically
		record the DB time statistic in
		V$SYSSTAT,
		calculate the difference between successive observations, then divide by
		elapsed time. However, this is probably not worth the hassle, since there are
		easier methods, outlined next, that leverage Oracle's built-in AWR
		snapshotting.
A simpler exact calculation technique would be to use the
		DBA_HIST_SYSSTAT
		table, which is the AWR version of
		V$SYSSTAT.
		With this approach, Oracle is doing the periodic recording of DB time
		for us. But we are still left with the hassle of calculating the difference
		between successive observations, since this table stores cumulative, rather
		that incremental statistics. Moreover, the default AWR recording frequency, one
		hour intervals, is almost certainly too large for most AAS diagnostic purposes,
		hiding the skew that is usually of greatest interest.
An example query that calculates AAS exactly from
		DBA_HIST_SYSSTAT
		is available in the script section below as aas-exact.sql. I show here some example output
		that illustrates the relationship between DB time and AAS. Since the AWR
		snapshots are one hour intervals, we can divide the incremental DB
		time by 3600 to obtain AAS.
   SNAP_ID BEGIN_HOUR       SECONDS_PER_HOUR        AAS
---------- ---------------- ---------------- ----------
      4196 2008-07-09 06:00             3821        1.1
      4197 2008-07-09 07:00            12839        3.6
      4198 2008-07-09 08:00            76104       21.1
      4199 2008-07-09 09:00             6435        1.8
      4200 2008-07-09 10:00            15178        4.2
      4201 2008-07-09 11:00             7850        2.2
      4202 2008-07-09 12:00            11482        3.2
      4203 2008-07-09 13:00            14014        3.9
      4204 2008-07-09 14:00             8855        2.5
      4205 2008-07-09 15:00            31272        8.7
      4206 2008-07-09 16:00             4939        1.4
      4207 2008-07-09 17:00            28983        8.1
      4208 2008-07-09 18:00             4171        1.2
      4209 2008-07-09 19:00             2518         .7
      4210 2008-07-09 20:00             7044          2 
	 To clarify the method used to estimate the AAS metric from AWR data, the logic and math are explained below incrementally.
Oracle records key facts about active sessions about once per second,
		and maintains a historical buffer of its observations in a table called
		V$ACTIVE_SESSION_HISTORY.
		Each observation sample is identified by the integer column SAMPLE_ID. The count of rows in this table for a given SAMPLE_ID is essentially the count of active sessions for that observation. An
		example is shown in the query results below, where the count of active sessions
		ranges from 2 to 12 per ASH observation.
Although not essential to estimating AAS, the query below also distinguishes between sessions in a wait state from those that think they are on the CPU. This additional diagnostic detail is often helpful, but is not necessary.
column sample_time format a25 select sample_id, sample_time, sum(decode(session_state, 'ON CPU', 1, 0)) as on_cpu, sum(decode(session_state, 'WAITING', 1, 0)) as waiting, count(*) as active_sessions from v$active_session_history where -- last 15 seconds sample_time > sysdate - (0.25/1440) group by sample_id, sample_time order by sample_id ; SAMPLE_ID SAMPLE_TIME ON_CPU WAITING ACTIVE_SESSIONS ---------- ------------------------- ---------- ---------- --------------- 50667633 24-JUL-08 08.56.03.078 PM 3 9 12 50667634 24-JUL-08 08.56.04.085 PM 1 6 7 50667635 24-JUL-08 08.56.05.095 PM 0 4 4 50667636 24-JUL-08 08.56.06.105 PM 1 2 3 50667637 24-JUL-08 08.56.07.115 PM 0 3 3 50667638 24-JUL-08 08.56.08.125 PM 0 2 2 50667639 24-JUL-08 08.56.09.135 PM 3 1 4 50667640 24-JUL-08 08.56.10.155 PM 0 4 4 50667641 24-JUL-08 08.56.11.165 PM 1 2 3 50667642 24-JUL-08 08.56.12.175 PM 0 4 4 50667643 24-JUL-08 08.56.13.185 PM 1 2 3 50667644 24-JUL-08 08.56.14.195 PM 1 3 4 50667645 24-JUL-08 08.56.15.205 PM 1 4 5 50667646 24-JUL-08 08.56.16.215 PM 1 2 3 50667647 24-JUL-08 08.56.17.225 PM 0 2 2 50667648 24-JUL-08 08.56.18.235 PM 0 2 2 16 rows selected.
As an aside, be aware that the ASH definition of an active session does
		not necessarily correspond exactly to the value of
		V$SESSION.STATE
		(Shee 2004, pp. 253). Also, sometimes ASH will
		record sessions that are in an "Idle" wait state, even though we would not
		normally consider these to be "active" (I sometimes, extremely rarely, see ASH
		sessions with the "Idle" event='virtual circuit status'). However, neither of these very minor considerations impact the
		usefulness of this approach.
The "Step 1" query above provided session counts for every observation recorded by ASH in the time interval. To compute the average number of active sessions, we can turn the "Step 1" query above into an inline view subquery, then wrap it in an outer query that does the averaging. In this example, I also round the averages to a single decimal point, and associate the averages with the earliest timestamp in the subquery.
column sample_time format a19
select
   to_char(min(sub1.sample_time), 'YYYY-MM-DD HH24:MI:SS') as sample_time,
   round(avg(sub1.on_cpu),1) as cpu_avg,
   round(avg(sub1.waiting),1) as wait_avg,
   round(avg(sub1.active_sessions),1) as act_avg
from
   ( -- sub1: one row per second, the resolution of SAMPLE_TIME
     select
        sample_id,
        sample_time,
        sum(decode(session_state, 'ON CPU', 1, 0))  as on_cpu,
        sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
        count(*) as active_sessions
     from
        v$active_session_history
     where
        sample_time > sysdate - (0.25/1440)
     group by
        sample_id,
        sample_time
   ) sub1
;
SAMPLE_TIME            CPU_AVG   WAIT_AVG    ACT_AVG
------------------- ---------- ---------- ----------
2008-07-24 20:56:03         .8        3.3        4.1
1 row selected.
 
	 As an aside, the averaging shown above does not include any data from
		ASH snapshots taken when the database was idle. ASH does not record a row for
		an observation that found no active sessions (i.e., such SAMPLE_ID values are "missing" from ASH). Therefore, the averages thus calculated
		will be artificially too high for those intervals that include observations
		without active sessions (because the N=samples denominator for the avg() function is artificially too low). However, this is not a problem in
		practice, since periods of interest to us usually involve performance crises,
		during which it is unlikely that ASH will observe an idle database. Obviously,
		most periods of interest involve plenty of active sessions, typically many more
		active sessions than CPUs.
To the above "Step 3" query, I have added both variance and the ratio of variance to mean. This allows us to use the techniques championed by Robyn Sands to find
		skew (as described above). A "high" ratio indicates skew.
select
   to_char(min(sub1.sample_time), 'YYYY-MM-DD HH24:MI:SS') as sample_time,
   round(avg(sub1.on_cpu),1) as cpu_avg,
   round(avg(sub1.waiting),1) as wait_avg,
   round(avg(sub1.active_sessions),1) as act_avg,
   round(variance(sub1.active_sessions),1) as act_var,
   round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)),1) as act_var_mean
from
   ( -- sub1: one row per second, the resolution of SAMPLE_TIME
     select
        sample_id,
        sample_time,
        sum(decode(session_state, 'ON CPU', 1, 0))  as on_cpu,
        sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
        count(*) as active_sessions
     from
        v$active_session_history
     where
        sample_time > sysdate - (0.25/1440)
     group by
        sample_id,
        sample_time
   ) sub1
;
SAMPLE_TIME            CPU_AVG   WAIT_AVG    ACT_AVG    ACT_VAR ACT_VAR_MEAN
------------------- ---------- ---------- ---------- ---------- ------------
2008-07-24 20:56:03         .8        3.3        4.1        6.1          1.5
1 row selected.
 
	 To the "Step 3" query above, I made the following changes to extend this approach to multiple, sequential time intervals. This query is now identical to the final aas-per-min.sql script below.
Used round(sub1.sample_time, 'MI') as the basis of GROUP BY, 
Added a sqlplus substitution variable to specify the overall time range of
			 interest,
Removed the variance column, since it is only interesting for debugging.
column sample_minute format a20
select
   to_char(round(sub1.sample_time, 'MI'), 'YYYY-MM-DD HH24:MI:SS') as sample_minute,
   round(avg(sub1.on_cpu),1) as cpu_avg,
   round(avg(sub1.waiting),1) as wait_avg,
   round(avg(sub1.active_sessions),1) as act_avg,
   round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)),1) as act_var_mean
from
   ( -- sub1: one row per second, the resolution of SAMPLE_TIME
     select
        sample_id,
        sample_time,
        sum(decode(session_state, 'ON CPU', 1, 0))  as on_cpu,
        sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
        count(*) as active_sessions
     from
        v$active_session_history
     where
        sample_time > sysdate - (&minutes/1440)
     group by
        sample_id,
        sample_time
   ) sub1
group by
   round(sub1.sample_time, 'MI')
order by
   round(sub1.sample_time, 'MI')
;
old  18:         sample_time > sysdate - (&minutes/1440)
new  18:         sample_time > sysdate - (10/1440)
SAMPLE_MINUTE           CPU_AVG   WAIT_AVG    ACT_AVG ACT_VAR_MEAN
-------------------- ---------- ---------- ---------- ------------
2008-07-25 19:05:00         1.7        3.2        4.9           .4
2008-07-25 19:06:00           1        3.4        4.4           .3
2008-07-25 19:07:00          .7        2.9        3.5           .2
2008-07-25 19:08:00           1        3.6        4.6           .4
2008-07-25 19:09:00          .8        3.3        4.2           .3
2008-07-25 19:10:00          .8        3.4        4.2           .3
2008-07-25 19:11:00           1        2.6        3.6           .3
2008-07-25 19:12:00          .5        1.9        2.4           .3
2008-07-25 19:13:00           1         .9        1.9           .6
2008-07-25 19:14:00          .8        1.8        2.6           .8
2008-07-25 19:15:00          .6        1.4          2           .4
11 rows selected.
 
	 To the "Step 4" query above, I made the following changes to extend this
		approach to time intervals older than maintained by
		V$ACTIVE_SESSION_HISTORY.
		This query is now identical to the final aas-per-hour.sql script below.
Changed the rounding to be hourly rather than by minute,
Switched to
			 DBA_HIST_ACTIVE_SESS_HISTORY
			 to review older data.
column sample_hour format a17
select
   to_char(round(sub1.sample_time, 'HH24'), 'YYYY-MM-DD HH24:MI') as sample_hour,
   round(avg(sub1.on_cpu),1) as cpu_avg,
   round(avg(sub1.waiting),1) as wait_avg,
   round(avg(sub1.active_sessions),1) as act_avg,
   round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)),1) as act_var_mean
from
   ( -- sub1: one row per second, the resolution of SAMPLE_TIME
     select
        sample_id,
        sample_time,
        sum(decode(session_state, 'ON CPU', 1, 0))  as on_cpu,
        sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
        count(*) as active_sessions
     from
        dba_hist_active_sess_history
     where
        sample_time > sysdate - (&hours/24)
     group by
        sample_id,
        sample_time
   ) sub1
group by
   round(sub1.sample_time, 'HH24')
order by
   round(sub1.sample_time, 'HH24')
;
 
	 As an aside, Oracle seems to use a sampling method to extract some of
		the rows from
		V$ACTIVE_SESSION_HISTORY
		for longer-term storage in
		DBA_HIST_ACTIVE_SESS_HISTORY.
		Based on empirical evidence, it appears that this sampling method selects a
		subset of SAMPLE_ID values for archiving, while obtaining all ASH observations for the
		chosen SAMPLE_ID values. This is very fortuitous, since it preserves the ability to
		estimate active sessions by counting rows per SAMPLE_ID value (the basis of all the queries presented here). If Oracle had
		implemented a different subsetting method, such as selecting random rows from
		V$ACTIVE_SESSION_HISTORY,
		then the count(*) based method here would break down.
Running the aas-per-hour.sql AWR
		script showed only low and moderate values of the hourly-average AAS metric (ACT_AVG column) over the previous twelve hours (from 1.8 to 6.4). Experience
		had shown that application users would start to suffer only when the AAS metric
		was greater than about 20. So based on only this hourly-average AAS metric, it
		would seem that there had been no performance problems over the twelve hour
		period.
However, the variance/mean value (ACT_VAR_MEAN) spiked very high (95.4) at around 14:00. This indicates a large amount
		of variability over that hour, perhaps a brief but intense spike in the AAS
		metric. Without considering the variance/mean, such a spike would have gone
		unnoticed.
SQL> @aas-per-hour Enter value for hours: 12 old 18: sample_time > sysdate - (&hours/24) new 18: sample_time > sysdate - (12/24) SAMPLE_HOUR CPU_AVG WAIT_AVG ACT_AVG ACT_VAR_MEAN ----------------- ---------- ---------- ---------- ------------ 2008-04-16 07:00 1.4 .4 1.8 .6 2008-04-16 08:00 1.8 .5 2.3 1 2008-04-16 09:00 2.3 .5 2.8 1.3 2008-04-16 10:00 2.6 .6 3.2 2.3 2008-04-16 11:00 3.5 .6 4.1 2.3 2008-04-16 12:00 2.4 .6 3 1.1 2008-04-16 13:00 2.3 .6 2.9 1 2008-04-16 14:00 3.7 2.7 6.4 95.4 <== spike in variance 2008-04-16 15:00 3.1 .7 3.8 1.9 2008-04-16 16:00 2.9 .7 3.6 1.6 2008-04-16 17:00 2.3 .4 2.7 .9 2008-04-16 18:00 2.1 .6 2.7 2.6
Since the apparent peak in the AAS metric occurred at a time older than
		was still available in
		V$ACTIVE_SESSION_HISTORY,
		the AWR historical data was necessary. Running the
		aas-per-min-awr.sql AWR script showed
		that the AAS metric spiked to a very high value (130.3) at around 14:08. The
		high value of the AAS metric indicated severe performance problems at that
		time, sure to impact the application end users. This knowledge of the time of
		the transient, but severe, spike enabled investigation and resolution of the
		problem. This brief problem might not have been visible without this approach,
		but finding it here was very quick: just two simple queries.
SQL> aas-per-min-awr Enter value for minutes: 300 old 18: SAMPLE_TIME > sysdate - (&minutes/1440) new 18: SAMPLE_TIME > sysdate - (300/1440) SAMPLE_MINUTE CPU_AVG WAIT_AVG ACT_AVG ACT_VAR_MEAN -------------------- ---------- ---------- ---------- ------------ 2008-04-16 13:54:00 3 1 4 0 2008-04-16 13:55:00 3.2 .3 3.5 .7 2008-04-16 13:56:00 4.2 .3 4.5 3.4 2008-04-16 13:57:00 3.8 .8 4.7 .7 2008-04-16 13:58:00 6.3 1 7.3 1.6 2008-04-16 13:59:00 3.4 .4 3.8 .2 2008-04-16 14:00:00 8.3 .5 8.8 1.8 2008-04-16 14:01:00 10.7 2.2 12.8 .5 2008-04-16 14:02:00 3.5 .7 4.2 .5 2008-04-16 14:03:00 2.6 1.2 3.8 1.5 2008-04-16 14:04:00 3.3 1.2 4.5 1.3 2008-04-16 14:05:00 8.2 .7 8.8 2.1 2008-04-16 14:06:00 6.7 1.3 8 1.1 2008-04-16 14:07:00 4.7 3.2 7.8 3.7 2008-04-16 14:08:00 20.5 109.8 130.3 170 <== spike in AAS 2008-04-16 14:09:00 6 1.3 7.3 10.3 2008-04-16 14:10:00 2.6 .4 3 .8 2008-04-16 14:11:00 4 .3 4.3 1.1 2008-04-16 14:12:00 5.7 .8 6.5 1.6 2008-04-16 14:13:00 3 .3 3.3 .7 2008-04-16 14:14:00 1.8 .7 2.5 .6 2008-04-16 14:15:00 3.3 2 5.3 2.2 2008-04-16 14:16:00 3.6 .6 4.2 .4 2008-04-16 14:17:00 3.2 2 5.2 1.9 2008-04-16 14:18:00 1 .8 1.8 .4 2008-04-16 14:19:00 2.2 .5 2.7 .2 2008-04-16 14:20:00 5.5 .8 6.3 .5 2008-04-16 14:21:00 2 .7 2.7 .7 2008-04-16 14:22:00 2.6 0 2.6 .3 2008-04-16 14:23:00 2.8 .5 3.3 .2 2008-04-16 14:24:00 1.7 1 2.7 .9 2008-04-16 14:25:00 1 .8 1.8 .7 2008-04-16 14:26:00 1.4 1.2 2.6 .3 2008-04-16 14:27:00 4.5 .5 5 .8 2008-04-16 14:28:00 2.3 1 3.3 1.6 2008-04-16 14:29:00 3.5 .5 4 1.6 2008-04-16 14:30:00 2.7 .8 3.5 .1 2008-04-16 14:31:00 5.3 1.7 7 .9 ...
The AWR table DBA_HIST_SQLSTAT records aggregate performance statistics for each combination of SQL
		statement and execution plan. It is "snapshot" based, and is easy to join to DBA_HIST_SNAPSHOT to get time interval details. It contains an excellent variety of basic
		performance statistics such as execution rate and buffer gets, as well as time
		spent in wait events, as detailed in the "Selected
		AWR tables" section below.
Once you have a time interval of interest (load spike, new code deployment, load test, etc.), you can often find interesting or significant SQL statements by aggregating these statistics, then sorting to find the biggest contributors to resource consumption.
Running the find-expensive.sql script while sorting by elapsed time gave the results shown below. These SQL statements were the largest consumers of DB time, and would probably benefit from a closer look. It is pretty easy to tell at a glance that some of these statements were big consumers of time due to high execution rate, while some others were relatively expensive with only a very few executions.
The script is very easy to modify to include different metrics, sort orders, or time ranges.
SQL> @find-expensive.sql
Enter value for start_yyyymmdd: 2008-08-01
old  16:     begin_interval_time > to_date('&&start_YYYYMMDD','YYYY-MM-DD')
new  16:     begin_interval_time > to_date('2008-08-01','YYYY-MM-DD')
SQL_ID        SECONDS_SINCE_DATE EXECS_SINCE_DATE GETS_SINCE_DATE
------------- ------------------ ---------------- ---------------
1wc4bx0qap8ph              30617            21563       284059357
6hudrj03d3w5g              23598         20551110       472673974
6tccf6u9tf891              18731            33666       457970700
2u874gr7qz2sk              15175            29014       370715705
fpth08dw8pyr6              14553             2565        36018228
1jt5kjbg7fs5p              11812            12451      2004271887
2f75gyksy99zn              10805            21529       567776447
ccp5w0adc6xx9               5222             6167       222949142
gn26ddjqk93wc               3568        114084711       248687700
b6usrg82hwsa3               2888                2       165621244
ctaajfgak033z               2391                4        66644336
7zwhhjv42qmfq               2197           592377        31495833
96v8tzx8zb99s               2152             6167       117875813
cxjsw79bprkm4               1526           396277       137413869
f2awk3951dcxv               1500             3462        35853709
fzmzt8mf2sw16               1421              311        44067742
01bqmm3gcy9yj               1329           299778        23504806
 
	 As explained above, aggregate statistics hide underlying skew. Short spikes in resource consumption often have severe impacts on application usability, but can go unnoticed in a review of aggregated data. Many database performance problems are related to skew: insight into problems, and their solutions often require finding or recognizing skew.
An excellent way to find skew is to use the statistical measure of
		non-uniformity called variance. This statistic is usually easier to use when it
		is normalized by dividing it by the mean. This technique was previously
		described for finding load spikes, and the
		script below, high-var-sql.sql, illustrates how
		it can be adapted to the SQL performance history in DBA_HIST_SQLSTAT.
Once you understand the general technique, this use of variance can be
		easily adapted to many other contexts, such as DBA_HIST_SEG_STAT, etc.
Running high-var-sql.sql over a week's
		work of data gave the following results. Notice how SQL_ID='g3176qdxahvv9' (third from the bottom) had only a moderate amount of elapsed time, but
		a variance much higher that its mean (ratio of 383). Subsequent investigation
		revealed a significant, although transient, problem with this query that was
		adversely impacting the application, but would not have been noticed by looking
		only at aggregate performance statistics.
SQL> @high-var-sql.sql Enter value for days_back: 7 old 17: snap.BEGIN_INTERVAL_TIME > sysdate - &&days_back new 17: snap.BEGIN_INTERVAL_TIME > sysdate - 7 old 32: count(*) > ( &&days_back * 24) * 0.50 new 32: count(*) > ( 7 * 24) * 0.50 SQL_ID AVG_SECONDS_PER_HOUR VAR_OVER_MEAN CT ------------- -------------------- ------------- ---------- 72wuyy9sxdmpx 41 7 167 bgpag6tkxt34h 29 12 167 crxfkabz8atgn 14 14 167 66uc7dydx131a 16 16 167 334d2t692js2z 36 19 167 6y7mxycfs7afs 23 20 167 36vs0kyfmr0qa 17 21 129 fp10bju9zh6qn 45 22 167 fas56fsc7j9u5 10 22 167 61dyrn8rjqva2 17 22 129 4f8wgv0d5hgua 31 23 167 7wvy5xpy0c6k5 15 23 151 8v59g9tn46y3p 17 24 132 9pw7ucw4n113r 59 27 167 41n1dhb0r3dhv 32 32 120 8mqxjr571bath 35 38 117 8jp67hs2296v3 46 154 128 afdjq1cf8dpwx 34 184 150 6n3h2sgxpr78g 454 198 145 g3176qdxahvv9 42 383 92 b72dmps6rp8z8 209 1116 167 6qv7az2048hk4 3409 50219 167
The techniques above will help you find SQL statements that are
		associated with load spikes, high resource consumption, or unstable
		performance. Once you have some suspect SQL statements to investigate, it is
		often very helpful to review performance over time. By using DBA_HIST_SQLSTAT to examine the time behavior of an SQL statement, it is often easy to
		spot trends or patterns that point towards causes and solutions. This approach
		can also help identify parts ofthe application using the SQL. The
		sql-stat-hist.sql is one way to spot these
		trends, and was used for the following examples.
The SQL statement with the time behavior shown below had sustained high execution rates, as high as 44 times per second (158739 per hour). It was very efficient, at a steady four gets per execution. However, it would occasionally completely consume the CPUs, with over 45,000 seconds per hour (12.6 hours per hour, averaged over a whole hour!). This was due to concurrency-related wait event pile-ups. The data shown below was vital for resolution of this problem, and these time-series data would have been hard to obtain without AWR.
   SNAP_ID BEGIN_HOUR       EXECS_PER_HOUR GETS_PER_HOUR GETS_PER_EXEC SECONDS_PER_HOUR
---------- ---------------- -------------- ------------- ------------- ----------------
      1978 2008-04-07 20:00         140449        540639             4               11
      1979 2008-04-07 21:00         124142        477807             4               17
      1980 2008-04-07 22:00          90568        347286             4               20
      1981 2008-04-07 23:00          83287        323100             4               30
      1982 2008-04-08 00:00          57094        221166             4               49
      1983 2008-04-08 01:00          43925        170594             4                7
      1984 2008-04-08 02:00          38596        150277             4                4
      1985 2008-04-08 03:00          35710        139576             4                4
      1986 2008-04-08 04:00          29700        115429             4                4
      1987 2008-04-08 05:00          43666        170520             4                5
      1988 2008-04-08 06:00          50755        197116             4                6
      1989 2008-04-08 07:00          80371        310652             4                9
      1990 2008-04-08 08:00         111924        431470             4               11
      1991 2008-04-08 09:00         127154        489649             4               27
      1992 2008-04-08 10:00         139270        536962             4               25
      1993 2008-04-08 11:00         128697        496013             4               18
      1994 2008-04-08 12:00         158739        613554             4            45287
      1995 2008-04-08 13:00         152515        587605             4               40
      1996 2008-04-08 14:00         144389        555770             4            37589
      1997 2008-04-08 15:00         149278        575827             4               26
      1998 2008-04-08 16:00         140632        542580             4               12
      1999 2008-04-08 17:00         120113        462665             4               11
      2000 2008-04-08 18:00         121394        468684             4               12
      2001 2008-04-08 19:00         127948        493084             4               13
 
	 The SQL statement with the time behavior shown below had nightly high execution rates, but it was not executed during the day. As shown by the last column, the database seemed to be able to handle this high execution rate for this efficient query (all values well under 3600). Nevertheless, these data pointed to a flaw in the application that needed fixing.
   SNAP_ID BEGIN_HOUR       EXECS_PER_HOUR GETS_PER_HOUR GETS_PER_EXEC SECONDS_PER_HOUR
---------- ---------------- -------------- ------------- ------------- ----------------
      1811 2008-03-31 21:00          98550        893916             9               28
      1812 2008-03-31 22:00           9794         89386             9                2
      1823 2008-04-01 09:00           3038         27604             9                1
      1824 2008-04-01 10:00           4360         39362             9                1
      1825 2008-04-01 11:00           3608         32759             9                1
      1859 2008-04-02 21:00          17369        156840             9                3
      1883 2008-04-03 21:00          79566        717500             9               22
      1884 2008-04-03 22:00         207334       1871430             9               38
      1885 2008-04-03 23:00         276997       2500938             9               39
      1886 2008-04-04 00:00         258505       2329526             9               36
      1887 2008-04-04 01:00         190127       1710001             9               27
      1888 2008-04-04 02:00         188449       1695215             9               24
      1907 2008-04-04 21:00         102162        923998             9               20
      1930 2008-04-05 20:00          17437        158213             9                3
      1931 2008-04-05 21:00         196100       1768306             9               30
      1932 2008-04-05 22:00         207867       1875544             9               40
      1933 2008-04-05 23:00         230548       2079470             9               32
      1934 2008-04-06 00:00         216352       1946824             9               31
      1935 2008-04-06 01:00         207935       1871111             9               28
      1936 2008-04-06 02:00         118544       1065785             9               15
 
	 The SQL statement with the time behavior shown below had sporadically high execution rates. As shown by the last column, the database seemed to be able to handle this high execution rate for this efficient query (all values well under 3600). Nevertheless, these data pointed to a flaw in the application that needed fixing.
   SNAP_ID BEGIN_HOUR       EXECS_PER_HOUR GETS_PER_HOUR GETS_PER_EXEC SECONDS_PER_HOUR
---------- ---------------- -------------- ------------- ------------- ----------------
      1790 2008-03-31 00:00           6710         20340             3                0
      1791 2008-03-31 01:00             83           253             3                0
      1792 2008-03-31 02:00             18            54             3                0
      1793 2008-03-31 03:00             18            54             3                0
      1794 2008-03-31 04:00              1             3             3                0
      1795 2008-03-31 05:00             16            48             3                0
      1796 2008-03-31 06:00        1943358       5901783             3               85
      1797 2008-03-31 07:00           5633         17195             3                0
      1798 2008-03-31 08:00         927016       2815340             3               35
      1799 2008-03-31 09:00        5843023      17744104             3              252
      1800 2008-03-31 10:00        2929624       8896969             3              131
      1801 2008-03-31 11:00         988709       3002649             3               45
      1802 2008-03-31 12:00        1959757       5951342             3              108
      1803 2008-03-31 13:00          10767         32728             3                1
      1804 2008-03-31 14:00         997451       3028890             3               70
      1805 2008-03-31 15:00        1000944       3039948             3               49
      1806 2008-03-31 16:00           5166         15861             3                0
      1807 2008-03-31 17:00           4821         14616             3                0
      1808 2008-03-31 18:00          11639         35243             3                1
      1809 2008-03-31 19:00           8346         25421             3                1
      1810 2008-03-31 20:00           4731         14380             3                1
      1811 2008-03-31 21:00        1975147       5998626             3              160
      1812 2008-03-31 22:00          27361         83023             3                3
      1813 2008-03-31 23:00            521          1589             3                0
 
	 The SQL statement with the time behavior shown below had sporadically
		high execution rates. As shown by the last column, the database was often
		struggling with this execution rate. For example, during the hour of 2008-04-03
		10:00 it was essentially consuming more than a whole CPU all by itself (4502
		> 3600). Also, it would switch execution plans, with the plans having
		different efficiencies (primary key plan_hash_value is not shown here, but notice how the hour of 2008-04-02 23:00 has two
		rows). Again, these AWR data were critical into characterizing this SQL
		statement's behavior so that a fix could be designed.
   SNAP_ID BEGIN_HOUR       EXECS_PER_HOUR GETS_PER_HOUR GETS_PER_EXEC SECONDS_PER_HOUR
---------- ---------------- -------------- ------------- ------------- ----------------
      1848 2008-04-02 10:00        1028451       3155807             3               39
      1849 2008-04-02 11:00        1015627       3116830             3               35
      1850 2008-04-02 12:00         957525       2941788             3               34
      1851 2008-04-02 13:00           7740         23486             3                0
      1852 2008-04-02 14:00        2039987       6260065             3               86
      1853 2008-04-02 15:00        1017857       3123548             3               33
      1854 2008-04-02 16:00           3692         11286             3                0
      1855 2008-04-02 17:00           8700         26482             3                0
      1856 2008-04-02 18:00           5895         17937             3                0
      1857 2008-04-02 19:00           7296         22103             3                0
      1858 2008-04-02 20:00           2156          6526             3                0
      1859 2008-04-02 21:00           2686          8186             3                0
      1860 2008-04-02 22:00           5439         74432            14               14
      1861 2008-04-02 23:00         227644       3152747            14              848
      1861 2008-04-02 23:00             80           283             4                0
      1862 2008-04-03 00:00         792146       7807033            10             1215
      1865 2008-04-03 03:00            829          7464             9                1
      1867 2008-04-03 05:00            432          3889             9                0
      1868 2008-04-03 06:00            388          2720             7                0
      1869 2008-04-03 07:00           1273          9142             7                1
      1870 2008-04-03 08:00          28277        804514            28              190
      1871 2008-04-03 09:00         399722       5372737            13             1461
      1872 2008-04-03 10:00        1563634      17540545            11             4503
      1873 2008-04-03 11:00            232           717             3                0
 
	 The various 10g databases I have seen all contained 79 AWR "tables"
		(i.e., tables whose names begin with "DBA_HIST_"). Of course, these are not really tables, but SYS-owned views with public synonyms. Many of the underlying objects seem
		to have names starting with "WRH$_" and their segments seem to reside in the SYSAUX tablespace. However, this paper is not a detailed look at the
		underlying structure of the AWR tables.
This paper discusses only a small fraction of the approximately 79 AWR tables. The focus here is application SQL performance diagnostics, rather than topics of more interest to the DBA such as undo segments, SGA, etc.
The
		DBA_HIST_SNAPSHOT
		table defines the time interval for each AWR snapshot (SNAP_ID). Its effective primary key apparently includes these columns:
SNAP_IDDBID and INSTANCE_NUMBER (irrelevant for single, non-RAC database)Name Null? Type ----------------------------------------- -------- ---------------------------- SNAP_ID NOT NULL NUMBER DBID NOT NULL NUMBER INSTANCE_NUMBER NOT NULL NUMBER ... BEGIN_INTERVAL_TIME NOT NULL TIMESTAMP(3) END_INTERVAL_TIME NOT NULL TIMESTAMP(3) ...
The
		DBA_HIST_SQLSTAT
		table records aggregate performance statistics for each SQL statement and
		execution plan. Its effective primary key apparently includes these
		columns:
SNAP_IDSQL_IDPLAN_HASH_VALUE(DBID and INSTANCE_NUMBER, irrelevant for single, non-RAC database)It includes basic statistics such as executions, gets, and reads. as well as wait times in classes of IO, concurrency, application (in microseconds). It also include CPU time and elapsed time. This is a very comprehensive set of statistics.
Name Null? Type ----------------------------------------- -------- ---------------------------- SNAP_ID NUMBER DBID NUMBER INSTANCE_NUMBER NUMBER SQL_ID VARCHAR2(13) PLAN_HASH_VALUE NUMBER ... MODULE VARCHAR2(64) ACTION VARCHAR2(64) ... PARSING_SCHEMA_NAME VARCHAR2(30) ... EXECUTIONS_TOTAL NUMBER EXECUTIONS_DELTA NUMBER ... DISK_READS_DELTA NUMBER ... BUFFER_GETS_DELTA NUMBER ... CPU_TIME_DELTA NUMBER ... ELAPSED_TIME_DELTA NUMBER ... IOWAIT_DELTA NUMBER ... CLWAIT_DELTA NUMBER ... APWAIT_DELTA NUMBER ... CCWAIT_DELTA NUMBER ...
Most of these statistics are available in both cumulative (i.e., since
		parsing) and incremental (i.e., for the snapshot only) aggregates. The
		incremental aggregates, with names ending in DELTA, are much more useful, since they allow you to calculate sums for
		specific snapshots. In fact, the TOTAL cumulative versions can be horribly misleading, since they can actually
		decrease, presumably if it aged out then brought back into the library cache.
		The following example illustrates this severe problem with TOTAL versions:
select snap_id, to_char(begin_interval_time,'YYYY-MM-DD HH24:MI') as begin_hour, executions_total, executions_delta from dba_hist_snapshot natural join dba_hist_sqlstat where sql_id = 'gk8sdttq18sxw' order by snap_id ; SNAP_ID BEGIN_HOUR EXECS_TOTAL EXECS_DELTA ------- ---------------- ----------- ----------- 4571 2008-07-24 21:00 52647 52647 4572 2008-07-24 22:00 63756 11109 4691 2008-07-29 21:00 27602 27576 4739 2008-07-31 21:00 77292 77280 4756 2008-08-01 14:00 79548 2256 4757 2008-08-01 15:00 109722 30174 4758 2008-08-01 16:00 137217 27495 4759 2008-08-01 17:00 155265 18048 4763 2008-08-01 21:00 237432 82167 4823 2008-08-04 09:00 97036 19744 4824 2008-08-04 10:00 11232 11232 4835 2008-08-04 21:00 2016 2016
The
		DBA_HIST_SYSSTAT
		table records hourly snapshots of
		V$SYSSTAT.
		It includes almost 400 values of STAT_NAME.
It includes only cumulative data, not incremental, so you need to calculate the deltas yourself. These cumulative statistic counters get reset with an Oracle bounce, which complicates the calculation of deltas.
Its DB time values are in units of centiseconds, unlike some other AWR tables, which complicates things.
Many of its statistics can be used a basis for comparison, for example calculating the percentage of all DB time consumed by a particular query as a function of time.
Its effective primary key apparently includes these columns:
SNAP_IDSTAT_IDDBID and INSTANCE_NUMBER (irrelevant for single, non-RAC database)Name Null? Type ----------------------------------------- -------- ---------------------------- SNAP_ID NUMBER DBID NUMBER INSTANCE_NUMBER NUMBER STAT_ID NUMBER STAT_NAME VARCHAR2(64) VALUE NUMBER
The
		DBA_HIST_SEG_STAT
		table provides a very useful alternative perspective from usual SQL focus. In
		some cases, the database objects themselves must be redesigned, since SQL
		tuning can only go so far. This table can help you identify objects associated
		with the greatest resource consumption or with frequent occurrences of
		spikes.
This table includes basic statistics such as logical reads, physical reads, and block changes, as well as wait counts such as buffer busy and row locks. Both "delta" and "total" values are available: use the "delta" versions for easier aggregation within time intervals.
You should join to DBA_HIST_SEG_STAT_OBJ to get segment characteristics.
Its effective primary key apparently includes these columns:
SNAP_IDOBJ# DATAOBJ#DBID and INSTANCE_NUMBER (irrelevant for single, non-RAC database)Name Null? Type ----------------------------------------- -------- ---------------------------- SNAP_ID NUMBER DBID NUMBER INSTANCE_NUMBER NUMBER TS# NUMBER OBJ# NUMBER DATAOBJ# NUMBER ... LOGICAL_READS_DELTA NUMBER ... BUFFER_BUSY_WAITS_DELTA NUMBER ... DB_BLOCK_CHANGES_DELTA NUMBER ... PHYSICAL_READS_DELTA NUMBER ... PHYSICAL_WRITES_DELTA NUMBER ... PHYSICAL_READS_DIRECT_DELTA NUMBER ... ROW_LOCK_WAITS_DELTA NUMBER ... GC_BUFFER_BUSY_DELTA NUMBER ... SPACE_USED_TOTAL NUMBER SPACE_USED_DELTA NUMBER SPACE_ALLOCATED_TOTAL NUMBER SPACE_ALLOCATED_DELTA NUMBER ... TABLE_SCANS_DELTA NUMBER
The
		DBA_HIST_SEG_STAT_OBJ
		table contains segment level details for objects tracked by DBA_HIST_SEG_STAT. These details include name, owner, type and tablespace name. Several
		segment types are included:
Its effective primary key apparently includes these columns:
OBJ#DATAOBJ#DBID (irrelevant for single, non-RAC database)Name Null? Type ----------------------------------------- -------- ---------------------------- DBID NOT NULL NUMBER TS# NUMBER OBJ# NOT NULL NUMBER DATAOBJ# NOT NULL NUMBER OWNER NOT NULL VARCHAR2(30) OBJECT_NAME NOT NULL VARCHAR2(30) SUBOBJECT_NAME VARCHAR2(30) OBJECT_TYPE VARCHAR2(18) TABLESPACE_NAME NOT NULL VARCHAR2(30) PARTITION_TYPE VARCHAR2(8)
The
		DBA_HIST_SQLTEXT
		table contains the full text of SQL statements for (nearly all) SQL_ID values included in other AWR tables. A SQL statement can often be found
		here even when it is no longer in
		V$SQL
		and friends.
Its effective primary key apparently includes these columns:
SQL_IDDBID and INSTANCE_NUMBER (irrelevant for single, non-RAC database)Name Null? Type ----------------------------------------- -------- ---------------------------- DBID NOT NULL NUMBER SQL_ID NOT NULL VARCHAR2(13) SQL_TEXT CLOB COMMAND_TYPE NUMBER
The
		DBA_HIST_ACTIVE_SESS_HISTORY
		table contains a subset of the active session data sampled about once per
		second in
		V$ACTIVE_SESSION_HISTORY,
		and is therefore a part of ASH as well as AWR.
It is one of the few AWR tables that is not based on the AWR snapshots, since it has a much smaller time resolution. It is not uncommon to see resolution of about ten seconds. In other words, perhaps about one out of every ten ASH once-per-second samples is included in AWR.
See also the AAS example above.
column sample_hour format a16
select
   to_char(round(sub1.sample_time, 'HH24'), 'YYYY-MM-DD HH24:MI') as sample_hour,
   round(avg(sub1.on_cpu),1) as cpu_avg,
   round(avg(sub1.waiting),1) as wait_avg,
   round(avg(sub1.active_sessions),1) as act_avg,
   round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)),1) as act_var_mean
from
   ( -- sub1: one row per second, the resolution of SAMPLE_TIME
     select
        sample_id,
        sample_time,
        sum(decode(session_state, 'ON CPU', 1, 0))  as on_cpu,
        sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
        count(*) as active_sessions
     from
        dba_hist_active_sess_history
     where
        sample_time > sysdate - (&hours/24)
     group by
        sample_id,
        sample_time
   ) sub1
group by
   round(sub1.sample_time, 'HH24')
order by
   round(sub1.sample_time, 'HH24')
;
 
	 See also the AAS example above.
column sample_minute format a16
select
   to_char(round(sub1.sample_time, 'MI'), 'YYYY-MM-DD HH24:MI') as sample_minute,
   round(avg(sub1.on_cpu),1) as cpu_avg,
   round(avg(sub1.waiting),1) as wait_avg,
   round(avg(sub1.active_sessions),1) as act_avg,
   round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)),1) as act_var_mean
from
   ( -- sub1: one row per second, the resolution of SAMPLE_TIME
     select
        sample_id,
        sample_time,
        sum(decode(session_state, 'ON CPU', 1, 0))  as on_cpu,
        sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
        count(*) as active_sessions
     from
        v$active_session_history
     where
        sample_time > sysdate - (&minutes/1440)
     group by
        sample_id,
        sample_time
   ) sub1
group by
   round(sub1.sample_time, 'MI')
order by
   round(sub1.sample_time, 'MI')
;
 
	 See also the AAS example above.
column sample_minute format a16
select
   to_char(round(sub1.sample_time, 'MI'), 'YYYY-MM-DD HH24:MI') as sample_minute,
   round(avg(sub1.on_cpu),1) as cpu_avg,
   round(avg(sub1.waiting),1) as wait_avg,
   round(avg(sub1.active_sessions),1) as act_avg,
   round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)),1) as act_var_mean
from
   ( -- sub1: one row per sampled ASH observation second
     select
        sample_id,
        sample_time,
        sum(decode(session_state, 'ON CPU', 1, 0))  as on_cpu,
        sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
        count(*) as active_sessions
     from
        dba_hist_active_sess_history
     where
        sample_time > sysdate - (&minutes/1440)
     group by
        sample_id,
        sample_time
   ) sub1
group by
   round(sub1.sample_time, 'MI')
order by
   round(sub1.sample_time, 'MI')
;
 
	 column BEGIN_HOUR format a16
select
   stat_start.snap_id,
   to_char(snap.begin_interval_time,'YYYY-MM-DD HH24:MI') as begin_hour,
   -- DB time is in units of centiseconds in DBA_HIST_SYSSTAT.VALUE
   round( (stat_end.value - stat_start.value)/100 , 0) as seconds_per_hour,
   -- also assumes hourly snapshots, hence divided by 3600
   round( (stat_end.value - stat_start.value)/(100*3600) , 1) as aas
from
   dba_hist_sysstat stat_start,
   dba_hist_sysstat stat_end,
   dba_hist_snapshot snap
where
   -- assumes the snap_id at the end of the interval is 
   -- one greater than the snap_id at teh start ofthe interval
   --
   stat_end.snap_id = stat_start.snap_id + 1
and
   -- otherwise, we join stat_end and stat_start
   -- on exact matches of the remaining PK columns
   --
   ( stat_end.dbid = stat_start.dbid
     and
     stat_end.instance_number = stat_start.instance_number
     and
     stat_end.stat_name = stat_start.stat_name
   )
and
   -- filter for the statistic we are interested in
   --
   stat_end.stat_name = 'DB time'
and
   -- join stat_start to snap on FK
   -- 
   ( stat_start.snap_id = snap.snap_id
     and
     stat_start.dbid = snap.dbid
     and
     stat_start.instance_number = snap.instance_number
   )
order by
   stat_start.snap_id
; 
	 See also the aggregate example above.
This script looks at three metrics only, but it is easy to use other
		metrics stored by the DBA_HIST_SQLSTAT table. For the order-by clause, I suggest using the numeric column
		position style so that it is easy to change interactively.
-- gets most expensive queries 
-- (by time spent, change "order by" to use another metric)
-- after a specific date
select
   sub.sql_id,
   sub.seconds_since_date,
   sub.execs_since_date,
   sub.gets_since_date
from
   ( -- sub to sort before rownum
     select
        sql_id,
        round(sum(elapsed_time_delta)/1000000) as seconds_since_date,
        sum(executions_delta) as execs_since_date,
        sum(buffer_gets_delta) as gets_since_date
     from
        dba_hist_snapshot natural join dba_hist_sqlstat
     where
        begin_interval_time > to_date('&&start_YYYYMMDD','YYYY-MM-DD')
     group by
        sql_id
     order by
        2 desc
   ) sub
where
   rownum < 30
; 
	 See also the high-variance example above.
-- high-var-sql.sql
undefine days_back
select
   sub1.sql_id,
   round( avg(sub1.seconds_per_hour) ) as avg_seconds_per_hour,
   round( variance(sub1.seconds_per_hour)/avg(sub1.seconds_per_hour) ) as var_over_mean,
   count(*) as ct
from
   ( -- sub1
     select
        snap_id,
        sql_id,
        elapsed_time_delta/1000000 as seconds_per_hour
     from
        dba_hist_snapshot natural join dba_hist_sqlstat
     where
        -- look at recent history only
        begin_interval_time > sysdate - &&days_back;
     and
        -- must have executions to be interesting
        executions_delta > 0
   ) sub1
group by 
   sub1.sql_id
having 
   -- only queries that consume 10 seconds per hour on the average
   avg(sub1.seconds_per_hour) > 10
and 
   -- only queries that run 50% of the time
   -- assumes hourly snapshots too
   count(*) > ( &&days_back * 24) * 0.50
order by
   3
;
undefine days_back 
	 See also example scenarios 1, 2, 3, and 4 above.
-- gets basic DBA_HIST_SQLSTAT data for a single sql_id
-- assumes that each AWR snap is one-hour (used in names, not math)
column BEGIN_HOUR format a16
select
   snap_id,
   to_char(begin_interval_time,'YYYY-MM-DD HH24:MI') as begin_hour,
   executions_delta as execs_per_hour,
   buffer_gets_delta as gets_per_hour,
   round(buffer_gets_delta/executions_delta) as gets_per_exec,
   round(elapsed_time_delta/1000000) as seconds_per_hour
from
   dba_hist_snapshot natural join dba_hist_sqlstat
where
   begin_interval_time between to_date('&start_hour', 'YYYY-MM-DD HH24:MI')
                           and to_date('&end_hour',   'YYYY-MM-DD HH24:MI')
and
   sql_id = '&sql_id'
and
   executions_delta > 0
order by
   snap_id
;
 
	 AWR enables study of historical database performance statistics. This information complements, but does not replace, real-time monitoring. However, AWR tables provide many benefits that are not otherwise easy to obtain, in a wide variety of contexts:
AWR tables are easy to use, and encourage interactive exploration. The tables are easy to join, and their information is relevant, well organized and clearly documented.
The industrial/quality engineering concept of using variance to find skew is easy to incorporate into AWR projects. This approach illuminates anomalies that might otherwise remain unnoticed, such as very short spikes. These anomalies often point to limits of database scalability that need to be addressed.
Kyle Hailey has championed the Average Active Session metric (AAS), and some of his material can be found at the following links:
Robyn Anderson Sands, a System Design Architect with Cisco Systems, wrote "An Industrial Engineer's Approach to Managing Oracle Databases", which describes the usefulness of the ratio of variance to mean for finding skew:
Oracle Wait Interface: A Practical Guide to
		Performance Diagnostics & Tuning
By Richmond Shee, Kirtikumar
		Deshpande, K. Gopalakrishnan
Published 2004 McGraw-Hill Professional,
		2004
ISBN:007222729X