DBA
Querying the alert log via SQL
Quick tip regarding the Oracle database alert log (from 11g onwards). There is a fixed table X$DBGALERTEXT:
SQL> select message_text from X$DBGALERTEXT where rownum <= 30;

MESSAGE_TEXT
-----------------------------------------------------------------------------------------------------
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 2
Number of processor cores in the system is 2
Number of processor sockets in the system is 1
Shared memory segment for instance monitoring created
CELL communication is configured to use 0 interface(s):
CELL IP affinity details:
    NUMA status: non-NUMA system
    cellaffinity.ora status: N/A
CELL communication will use 1 IP group(s):
    Grp 0:
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
IMODE=BR
ILAT =27
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
ORACLE_HOME = /u01/app/oracle/product/11.2.0/orcl
System name:    Linux
Node name:      ODIGettingStarted
Release:        2.6.39-400.17.1.el6uek.x86_64
Version:        #1 SMP Fri Feb 22 18:16:18 PST 2013
Machine:        x86_64
Using parameter settings in client-side pfile /u01/app/oracle/admin/orcl/pfile/init.ora on machine ODIGettingStarted
System parameters with non-default values:

30 rows selected.
My personal opinion? This can be useful if you're looking to create some custom alert log monitoring. However, I still prefer to monitor my alert logs with shell scripts, since accessing this X$ table requires the instance to be up and operational. But if you don't have access to the OS, this could be useful.
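As a minimal sketch of what such custom monitoring might look like (assuming you are connected as a user that can see X$ tables, typically SYS), something along these lines would pull recent error messages. Keep in mind the Metalink note below warns that scans of this table can be CPU-heavy:

SELECT originating_timestamp,
       message_text
FROM   x$dbgalertext
WHERE  originating_timestamp > systimestamp - INTERVAL '1' DAY   -- last 24 hours only
AND    (   message_text LIKE '%ORA-%'
        OR message_text LIKE '%terminating the instance%')
ORDER  BY originating_timestamp;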
I also found the following Metalink note:
High CPU for Queries on X$DBGALERTEXT (Doc ID 2056666.1)
APPLIES TO:
Oracle Database – Enterprise Edition – Version 11.2.0.1 and later
Information in this document applies to any platform.
SYMPTOMS
- Query on X$DBGALERTEXT consumes high CPU, taking a long time to complete. For example:
FROM X$DBGALERTEXT
WHERE to_date(to_char(originating_timestamp, 'dd-mon-yyyy hh24:mi'), 'dd-mon-yyyy hh24:mi') > to_date(to_char(systimestamp - .00694, 'dd-mon-yyyy hh24:mi'), 'dd-mon-yyyy hh24:mi') /* last 10 minutes */
AND (
message_text = 'ORA-00600'
OR message_text LIKE '%fatal%'
OR message_text LIKE '%error%'
OR message_text LIKE '%ORA-%'
OR message_text LIKE '%terminating the instance%'
);
- It can also cause ORA-700 [dbgrfafr_1].
Comments in your SPFILE
Consider using the COMMENT= clause to document why a particular change was made the next time you make a change to a parameter using an SPFILE:
SQL> alter system set pga_aggregate_target=512m comment='Changed 04-JUN-2018, AWR recommendation, MR';

System altered.

SQL> select value, update_comment from v$parameter where name = 'pga_aggregate_target';

VALUE                UPDATE_COMMENT
-------------------- ---------------------------------------------------------------------------------------------------
536870912            Changed 04-JUN-2018, AWR recommendation, MR
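The comment is also recorded against the SPFILE entry itself, so (as a small sketch, assuming you are running from an SPFILE) you can read it back from V$SPPARAMETER as well as V$PARAMETER:

SELECT name, value, update_comment
FROM   v$spparameter
WHERE  name = 'pga_aggregate_target';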
Introduction to Database Normalisation
Background
There are two approaches to relational database design:
* TOP DOWN: from data modelling (e.g. an ER model) to a relational logical model for implementation.
* BOTTOM UP: normalisation of relations.
Normalisation is a formal process for deciding which attributes should be grouped together in a relation so that all anomalies are removed. The aim is to successively decompose relations into smaller, well-structured relations.
Dependencies
Functional Dependency:
The simplest kind of dependency is called functional dependency (FD). The dependencies are best explained through examples.
For example, LecturerID -> LecturerName
is a valid FD because:
For each LecturerID there is at most one LecturerName, or
LecturerName is determined by LecturerID , or
LecturerName is uniquely determined by LecturerID , or
LecturerName depends on LecturerID .
Each of the above statements is equivalent.
Full Dependency:
The FD X -> Y is a full dependency if no attribute can be removed from X without breaking the dependency.
LabDate, SubjectCode -> Tutor is a full dependency, that is, Tutor is fully dependent on both LabDate AND SubjectCode.
Partial Dependency:
The FD X -> Y is a partial dependency if an attribute can be removed from X and the dependency still holds.
LecturerID, SubjectCode -> LecturerName is a partial dependency, that is, LecturerName is partially dependent on LecturerID AND SubjectCode.
(to determine LecturerName, I only need to know LecturerID).
Transitive Dependency:
Dependencies can be transitive.
For example, if one lecturer can teach one subject and each subject only has one tutor, then we might have the dependencies:
LecturerID -> SubjectCode
SubjectCode -> Tutor
and, transitively LecturerID -> Tutor.
Normal Forms:
Functional dependencies can be used to decide whether a schema is well designed.
For example, in the following relation:
LecturerSubject (LecturerID, LecturerName, SubjectCode, SubjectName)
Anomalies?
If there is a new subject which has not been allocated a lecturer, can you record the details of this subject in the above table? (Insert Anomaly)
If an existing subject changes the name, can you do the changes to one instance only? (Update Anomaly)
If a lecturer resigns and the details are to be deleted, would there be a chance that some subjects will be removed permanently and we won’t have any track record of those subjects anymore? (Delete Anomaly)
Design errors in relations, such as the potential for certain kinds of anomalies, can be categorised. These categories of error can be successively eliminated by decomposing relations into normal forms.
The main normal forms are first (1NF), second (2NF), third (3NF), and Boyce-Codd (BCNF). Higher normal forms include fourth (4NF) and fifth (5NF). The problems addressed by 4NF and 5NF rarely occur in practice, and database designers in industry normally do not need to take a design to the highest possible normal form. We will focus on satisfying 3NF.
First Normal Form (1NF):
A relation is in 1NF if:
- There are no repeating groups
- A primary key has been defined, which uniquely identifies each row in the relation.
- All attributes are functionally dependent on all or part of the key.
- Attributes should be stored as atomic values -> each field entry can only contain one piece of data. E.g. a name field containing “Fred Smith” holds both a first name and a surname, violating 1NF.
Second Normal Form (2NF):
A relation is in 2NF if:
- The relation is in 1NF
- All non-key attributes are fully functionally dependent on the entire key (partial dependency has been removed).
Third Normal Form (3NF):
A relation is in 3NF if:
- The relation is in 2NF
- All transitive dependencies have been removed. (Transitive dependency: non-key attribute dependent on another non-key attribute.)
Example
Normalize the ORDER form below:

From the ORDER FORM (user view) we can derive ORDER relation:
Currently in UNF (Un-normalized Form)
ORDER
(Order #, Customer #, Customer Name, Customer Address, City, State, PostCode, Order Date, (Product #, Description, Quantity, Unit Price))
Note that the order form is not in 1NF because there is a repeating group
(Product#, Description….).
To convert the above relation into 1NF, the repeating group must be removed by creating a new relation based on the repeating group along with the primary key of the main relation.
1NF:
ORDER
(Order#, Customer#, Customer Name, Address, City, State, PostCode, OrderDate)
ORDER_PRODUCT
(Order#, Product#, Description, Quantity, Unit Price) —-> Note that Order# is part of the PK and is also a foreign key referencing ORDER
Anomalies:
Insertion Anomalies: cannot insert a new product until there is an order for that product.
Deletion Anomalies: if an order is deleted the whole detail of the product will also be deleted.
Update Anomalies: if the detail of a particular product needs to be updated, each order that contains that product has to be updated.
2NF (Partial dependencies):
The ORDER_PRODUCT relation is not in 2NF because not all non-key attributes are fully dependent on the entire key (the PK is the combination of Order# and Product#, but Description and Unit Price depend only on Product#, not on the whole key).
To convert the ORDER_PRODUCT relation into 2NF, a new relation must be created which consists of the part of the key on which the attributes depend (this becomes the primary key of the new relation) and all non-key attributes that are dependent on that partial key.
ORDER_PRODUCT
(Order#, Product#, Quantity)
PRODUCT
(Product#, Description, Unit_Price)
The ORDER relation is already in 2NF as there are no non-key attributes that are dependent on part of the key (ORDER has a single-attribute key, so partial dependencies are impossible).
Anomalies:
Insert Anomalies: a new customer cannot be inserted until he/she has an order.
Delete Anomalies: if an order is deleted, the whole information of the customer is also deleted.
Update Anomalies: if a customer detail is to be updated, all orders for that customer need to be updated.
3NF (Transitive Dependencies):
The ORDER relation is not in 3NF because there is a transitive dependency (non-key attribute dependent on another non-key attribute). e.g. customerName, city, Address, etc. all depend on customer#, which is currently a non-key attribute.
To convert the relation into 3NF, a new relation must be created for the non-key attributes that are dependent on another non-key attribute.
CUSTOMER
(Customer#, Customer Name, Customer Address, City, State, PostCode)
ORDER
(Order#, Customer#, Order Date) —-> Remember to always maintain FK links
Both the ORDER_PRODUCT and PRODUCT relations are already in 3NF.
ORDER_PRODUCT(Order#, Product#, Quantity)
PRODUCT (Product#, Description, Unit Price)
Example Solution: Final Relations in 3NF and BCNF
ORDER
(Order#, Customer#, OrderDate)
CUSTOMER
(Customer#, CustomerName, CustomerAddress, City, State, PostCode)
ORDER_PRODUCT
(Order#, Product#, Quantity)
PRODUCT
(Product#, Description, UnitPrice)
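As a rough DDL sketch of this final design (column names and data types are illustrative only; ORDER is a reserved word in Oracle SQL, so the table is called ORDERS here):

CREATE TABLE customer (
  customer_no    NUMBER         PRIMARY KEY,
  customer_name  VARCHAR2(100),
  customer_addr  VARCHAR2(200),
  city           VARCHAR2(50),
  state          VARCHAR2(30),
  postcode       VARCHAR2(10)
);

CREATE TABLE product (
  product_no     NUMBER         PRIMARY KEY,
  description    VARCHAR2(200),
  unit_price     NUMBER(10,2)
);

CREATE TABLE orders (
  order_no       NUMBER         PRIMARY KEY,
  customer_no    NUMBER         NOT NULL REFERENCES customer,   -- FK to CUSTOMER
  order_date     DATE
);

CREATE TABLE order_product (
  order_no       NUMBER         REFERENCES orders,              -- FK to ORDERS
  product_no     NUMBER         REFERENCES product,             -- FK to PRODUCT
  quantity       NUMBER,
  PRIMARY KEY (order_no, product_no)                            -- composite key
);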
Now an example for everyone at home to try:
Here is a suggested solution:
1NF: Identify the PK.
PATIENTHISTORY(PatientNo, name, address, suburb, date, time, drNo, drName, visitCode, description)
2NF: Remove partial dependencies. Notice that the PK is currently the combination of PatientNo, date, and time. name, address, and suburb depend only on PART OF THE KEY (PatientNo). PatientNo, date, time -> drNo, so drNo is not a partial dependency; the same goes for visitCode. We will deal with drName and description shortly.
PATIENT (PatientNo, name, address, suburb)
PATIENT_HISTORY(PatientNo, date, time, drNo, drName, visitCode, description) —> Note that PatientNo is now a foreign key as well as part of the key.
3NF: Remove transitive dependencies. Notice how drName is dependent on the non-key attribute drNo. The same principle applies to description, which is determined by visitCode.
DOCTOR (drNo, drName)
ILLNESS (visitCode, description)
PATIENT (PatientNo, name, address, suburb)
PATIENT_HISTORY (PatientNo, date, time, DrNo, visitCode) —> DrNo and visitCode are foreign keys (pointing to the doctor and illness tables, respectively).
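A quick DDL sketch of this suggested solution (again, names and types are purely illustrative), showing the three-column composite key and the two foreign keys:

CREATE TABLE doctor (
  dr_no       NUMBER        PRIMARY KEY,
  dr_name     VARCHAR2(100)
);

CREATE TABLE illness (
  visit_code  VARCHAR2(10)  PRIMARY KEY,
  description VARCHAR2(200)
);

CREATE TABLE patient (
  patient_no  NUMBER        PRIMARY KEY,
  name        VARCHAR2(100),
  address     VARCHAR2(200),
  suburb      VARCHAR2(50)
);

CREATE TABLE patient_history (
  patient_no  NUMBER        REFERENCES patient,   -- FK to PATIENT
  visit_date  DATE,
  visit_time  VARCHAR2(5),                        -- kept separate to mirror the relation; in practice one DATE column holds both
  dr_no       NUMBER        REFERENCES doctor,    -- FK to DOCTOR
  visit_code  VARCHAR2(10)  REFERENCES illness,   -- FK to ILLNESS
  PRIMARY KEY (patient_no, visit_date, visit_time)
);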
Relational Databases
We were hiring developers for a large, complicated application. I happened to mention to one of the more promising candidates that the application was fairly write-heavy and we might experience some performance concerns later. This developer, with years of experience and skills across multiple technologies that might have been of use to us, replied “you’re going to either need to shard your database or switch to NoSQL.”
That was enough to guarantee they didn’t get the job.
Before relational databases, all databases were NoSQL. That’s why we have relational databases.
Genesis of a DBA Universe
In the beginning was the disk array, and all was empty and raw, and Unix moved over the face of the platters. And the DBA said: Let there be Oracle. And there was Oracle. And the environmental variables were set and the disks were striped and mirrored and the OFA was established, and behold spindle was rent asunder from spindle. And the DBA saw that all was in spec.
And it was day and it was evening of the first day.
And the DBA said: Let there be scripts. And sql.bsq brought forth myriad crawling things upon the face of the array. And catalog.sql brought forth all manner of tables and views that swim unseen beneath the waters. And catproc.sql brought forth all the built-in programs and all the hosts of the air, that the users might be given wings and take flight over the data.
And it was day and it was evening of the second day.
And the DBA said: Let there be tablespaces. And there were tablespaces. And the network administrator looked upon the disk array and did see what the tablespaces had wrought upon the disk arrays, and he did gnash his teeth and seek a new work upon the Internet with an engine of search.
And it was day and it was evening of the third day.
And the DBA created users. Male and female he created them. And he said unto the users: Thou mayest create tables and views as thou wilt. Yea, thou mayest create even indexes upon the data. Only meddle not with the system tablespace, for it is a holy place, and on the day wherein thou treadest upon it, on that day thy create session shall surely be revoked. And the serpent crept among the users and whispered to them, saying: Thine roles shall not be revoked. Taste ye all of the system tablespace, for ye shall know of b-trees and hints and ye shall be as DBAs. And the users heeded the serpent and filled the system tablespace with crap. And the instance did crash and the client did wax wroth at the DBA. And the DBA did gnash his teeth and partake of the fruit of the vine, for behold the users were permanent employees and the DBA was but a contractor and could not revoke their create session.
And it was day and it was evening of the fourth day.
And the DBA did set default tablespaces and temporary tablespaces and did lock down all that was upon the face of the array with roles and profiles and all manner of quotas, yea even from the rollback segments even unto the archived redo logs.
And it was day and it was evening of the fifth day.
And the DBA created synonyms and links and did tune the server and apply patches upon the face of the database.
And it was day and it was evening of the sixth day.
And on the seventh day the DBA did rest from all the labors of the creation. And his pager did ring and he ceased from resting and did spend his sabbath on the telephone with Oracle support. And by the time the DBA got through to someone who knew whereof they spake behold it was day and it was evening of the eighth day.
And the DBA waxed wroth.
Approximate Query Processing in Oracle Database (12.2)
About Approximate Query Processing

Oracle provides a set of SQL functions that enable you to obtain approximate results with negligible deviation from the exact result. There are additional approximate functions that support materialized view based summary aggregation strategies. The functions that provide approximate results are as follows:
- APPROX_COUNT_DISTINCT
- APPROX_COUNT_DISTINCT_DETAIL
- APPROX_COUNT_DISTINCT_AGG
- TO_APPROX_COUNT_DISTINCT
- APPROX_MEDIAN
- APPROX_PERCENTILE
- APPROX_PERCENTILE_DETAIL
- APPROX_PERCENTILE_AGG
- TO_APPROX_PERCENTILE
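As a small, hedged illustration of the basic functions (sh.sales from the SH sample schema is just an assumed demo table):

SELECT APPROX_COUNT_DISTINCT(cust_id)                              AS approx_customers,
       APPROX_MEDIAN(amount_sold)                                  AS approx_median_sale,
       APPROX_PERCENTILE(0.95) WITHIN GROUP (ORDER BY amount_sold) AS approx_p95_sale
FROM   sh.sales;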
Running Queries Containing Exact Functions Using SQL Functions that Return Approximate Values
The interesting part is that you can use the approximate functions without changing any code. Three initialization parameters control which exact functions are treated as approximate at runtime:
approx_for_aggregation
approx_for_count_distinct
approx_for_percentile
To replace only the COUNT(DISTINCT) function in queries with the APPROX_COUNT_DISTINCT function, set the approx_for_count_distinct initialization parameter to TRUE for the current session or for the entire database.
To replace percentile functions with the corresponding functions that return approximate results, set approx_for_percentile to PERCENTILE_CONT, PERCENTILE_DISC, or ALL (replaces all percentile functions) for the current session or for the entire database.
To run all queries using the corresponding SQL functions that return approximate results instead of the specified SQL functions, set the approx_for_aggregation initialization parameter to TRUE for the current session or for the entire database.
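A minimal session-level sketch of how that looks in practice (parameter values as documented for 12.2; sh.sales is again just an assumed demo table):

-- Existing COUNT(DISTINCT ...) queries are now rewritten to APPROX_COUNT_DISTINCT
ALTER SESSION SET approx_for_count_distinct = TRUE;
SELECT COUNT(DISTINCT cust_id) FROM sh.sales;

-- Replace only PERCENTILE_DISC (use ALL to cover every percentile function)
ALTER SESSION SET approx_for_percentile = 'PERCENTILE_DISC';

-- Switch back to exact results
ALTER SESSION SET approx_for_count_distinct = FALSE;
ALTER SESSION SET approx_for_percentile = 'NONE';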
When Should Approximate Query Processing (AQP) Be Used?
If you work in a billing department, stop reading right now 🙂
1. Use AQP when you care about interactive response times
Exploratory data analytics is typically an interactive and iterative process: you form an initial hypothesis (e.g., by visualizing and looking at the data), run some queries, modify your queries based on the previous answers, and so on until you find a satisfactory explanation. You cannot be productive if you have to wait half an hour every time you run a new query to test a hypothesis.
2. Use AQP when you can make perfect decisions with imperfect answers
Exact results have no advantage over approximate ones if they both lead to the exact same conclusion/decision. This obviously depends on the quality of your approximation and the application logic.
3. Use AQP when your data is incomplete or noisy
Believe it or not, most of the data that is collected and used in the real world is extremely noisy, so the idea that processing your entire data set gives you 100% accurate answers is usually an illusion anyway. In other words, if my data is noisy to begin with, I might as well take advantage of approximation, which introduces a controlled degree of noise into my computation in exchange for significant speed-ups. This noise in your data can arise for many reasons: human error, missing values, white noise, data extraction errors, data conversion errors.
4. Use AQP when your goal is to predict something
In general, there are many cases where the output of database queries is consumed by predictive analytics or machine learning algorithms, and in those cases returning smaller samples of the original data can be a great way to gain considerable performance benefits.
Additional Reading: Oracle Database New Features Guide 12c Release 2 (12.2)
APPROX_COUNT_DISTINCT – Quick Distinct Count (12c)
Many organisations like to count things. Counting the number of distinct elements (i.e. the cardinality) of a set is a challenge when the cardinality of the set is large. I see many use cases where an approximate result for a given query would be sufficient, rather than a completely accurate result. For example, you may have a query: “How many distinct individuals visited our website last week?”. Do you need an exact number? I would like to bring your attention to a new SQL function introduced in 12c – APPROX_COUNT_DISTINCT()
The APPROX_COUNT_DISTINCT function was added, but not documented, in Oracle 11g to improve the speed of calculating the number of distinct values (NDV) when gathering statistics using the DBMS_STATS package. Oracle Database 12c (12.1.0.2) now includes the function in the documentation so we are free to use it in our applications as a supported SQL function.
Syntax
APPROX_COUNT_DISTINCT(expr)
Purpose
APPROX_COUNT_DISTINCT returns the approximate number of rows that contain distinct values of expr.
This function provides an alternative to the COUNT(DISTINCT expr) function, which returns the exact number of rows that contain distinct values of expr. APPROX_COUNT_DISTINCT processes large amounts of data significantly faster than COUNT, with negligible deviation from the exact result.
For expr, you can specify a column of any scalar data type other than BFILE, BLOB, CLOB, LONG, LONG RAW, or NCLOB.
APPROX_COUNT_DISTINCT ignores rows that contain a null value for expr. This function returns a NUMBER.
Basic Usage
In previous database versions, if we wanted to perform a count of distinct values, we would probably have done something like the following.
SELECT COUNT(DISTINCT object_name) AS obj_count FROM all_objects;

 OBJ_COUNT
----------
     47171

1 row selected.
This query gives us the exact count of the distinct values based on Oracle’s read-consistency model. As such, we see all the committed data, along with any uncommitted changes made by the current session.
In contrast, the new APPROX_COUNT_DISTINCT function does not provide exact results, but should give “negligible deviation from the exact result” according to the documentation.
SELECT APPROX_COUNT_DISTINCT(object_name) AS obj_count FROM all_objects;

 OBJ_COUNT
----------
     46788

1 row selected.
Performance
DROP TABLE t1 PURGE;

CREATE TABLE t1 AS
SELECT level AS data
FROM   dual
CONNECT BY level <= 10000;

INSERT /*+ APPEND */ INTO t1
SELECT a.data
FROM   t1 a
CROSS JOIN t1 b;
COMMIT;

EXEC DBMS_STATS.gather_table_stats(USER,'T1');
Now we have a table with a little over 100 million rows and 10,000 distinct values, and the difference in performance between the two methods becomes much more significant.
SET TIMING ON

SELECT COUNT(DISTINCT data) AS data_count FROM t1;

DATA_COUNT
----------
     10000

1 row selected.

Elapsed: 00:00:19.66

SQL> SELECT APPROX_COUNT_DISTINCT(data) AS data_count FROM t1;

DATA_COUNT
----------
     10050

1 row selected.

Elapsed: 00:00:10.46
Conceptually, for some reason this technique reminds me of a similar problem in the web development arena. Sometimes, close enough is just enough.
The “Blur Up” Technique for Loading Background Images
12 Days of Denormalization
On the first day of denormalization, my design gave to me
a really fast-running query.
On the second day of denormalization, my design gave to me
2 less tables.
On the third day of denormalization, my design gave to me
3 more indexes.
On the fourth day of denormalization, my design gave to me
4 larger disks.
On the fifth day of denormalization, my design gave to me
5 brand new reqs.
On the sixth day of denormalization, my design gave to me
6 times the locking.
On the seventh day of denormalization, my design gave to me
7 longer updates.
On the eighth day of denormalization, my design gave to me
8 more requirements.
On the ninth day of denormalization, my design gave to me
9 invalid rows.
On the tenth day of denormalization, my design gave to me
10 delays deleting.
On the eleventh day of denormalization, my design gave to me
11 questionable queries.
On the twelfth day of denormalization, my design gave to me
12 lessons learned.