Database Application Developer and DBA
ASK TOM BY TOM K Y TE
On Consistent Results,
Different Constraints,
and Unlearning
Our technologist reads as of a consistent point in time,
uses the right constraint, and revises old learning on
unindexed foreign keys.
ORACLE DATABASE
I understand read consistency—the fact hat a result set returned by a single
query is entirely consistent with respect
to some point in time—but recently I’ve
noticed something strange. A certain query
is returning inconsistent results: the rows
returned contain values from the database
as of different points in time. Why is this?
The query is a simple query against a single
table, and it calls a PL/SQL function, which
in turn runs another query.
This is an interesting question. First I’ll
make sure everyone is on the same page
and demonstrate what read consistency
means. Then I’ll demonstrate this issue and
suggest some approaches to correct it.
First, read consistency is an attribute
of every query ever executed in Oracle
Database. It ensures that every row in
a result set presented to an application
comes from the database as of the same
point in time. If a simple query such as
SELECT FROM a_really_big_table
takes one hour to complete, the first
row returned from the query will be
consistent timewise with respect to the
last row returned from the query. For
example, suppose the “really big table”
is an ACCOUNTS table and has columns
ACCOUNT# and BALANCE. Further, assume
that the data in the table looks like this
before the query is executed:
So, I have a very large table (ACCOUNTS)
and a query (SELECT FROM ACCOUNTS)
that will take an hour to read from row 1
(with ACCOUNT# 123) to the last row (with
ACCOUNT# 789). All the rows are not read
at exactly the same instant, of course, but
instead—during the course of the full table
scan—over time. The results should be some
set of results that existed in the database at
some point in time. (That seems reasonable,
doesn’t it? To get an answer that exists?)
$400
ACCOUNT#
123
456
… millions of rows here
789
BALANCE
$500
$200
….
$300
When your query—which has been
running for an hour—reaches this last
record, what should it do? Should it read
and print out $400 for ACCOUNT# 789?
According to Oracle Database, no, it should
not (and I happen to agree 100 percent with
that decision). If the query did return $400
for ACCOUNT# 789, you would have a report
that shows the account balances to be $100
more than ever existed in the bank at any
time—because the first row and the last row
in the report would have double-counted
that same money. Upon encountering
the data that has been changed since the
query began, Oracle Database will roll back
the change—rolling it back as many times
as necessary (because it could have been
changed many times in the last hour)—until
it contains the data that was committed in
the database when the query began. That is
what gets returned to your application: the
value of the last row that is consistent with
the value of the first row returned to you.
This is true regardless of the type of query
you execute.
Suppose you had issued SELECT
SUM(BALANCE) FROM ACCOUNTS instead
of selecting every row individually. You
would get only one row back from this query
(not millions), but it still would take some
measurable time to execute (to read and
sum up all of those BALANCE values). In
this case, read consistency is perhaps even
more relevant. What if during the processing
of the SELECT SUM query—after you read
the first record with a $500 BALANCE but