|
Lazy Loading
Optimizing SQL queries enhances the performance of any
data-centric application. Some of the common approaches used to
optimize SQL queries are:
SQL queries are generated and executed by the underlying ORM
engine. Therefore, Hibernate provides various hooks to optimize
SQL. The fetching strategy is one of the most important hooks, which
defines the data that to be fetched, the time of fetching the data,
and the way in which it needs to be fetched.
There are four strategies for loading an object and its associations.
Immediate fetching
Lazy fetching
Eager fetching
Batch fetching
Note: If memory tracking is enabled on a server, it accesses each
field of the object to compute its size. As a result, even lazy
fields are accessed causing the lazy fields to get loaded immediately.
Immediate fetchingIn this strategy, the associated object is fetched immediately
after the owning entity is fetched, either from the database using
a separate SQL query or from the secondary cache. This is not an
efficient strategy to use, unless the associated object is cached
in the secondary cache or when separate queries are more efficient
than a Join query. You can define this strategy by setting lazy="false" and fetch="select" for
the relationship property definition in the CFC.
<cfproperty name="art" fieldtype="one-to-many" cfc="ART" fkcolumn="ARTISTID" lazy="false" fetch="select">
With this strategy, on loading the artists object, its art object
is loaded immediately using a separate SQL query. As a result, this
strategy is extremely vulnerable to 'N+1 Select problem'.
Lazy fetchingIn this strategy, the associated object or collection is
fetched only when required. Although you need to send a new request
to the database each time you need data, this strategy controls
how much of data is loaded and when is it loaded. This helps in
reducing the database load.
When you load an entity, by default, ColdFusion ORM loads the
entity’s data but relations and any mapped collections and are not
loaded. They are loaded only when you want to load them by calling
the getter method. Therefore, the relations and collection mappings
are lazily loaded. For example, when the artist object
is loaded, all its artworks are not loaded and they are loaded only when getarts() is
called.
ColdFusion ORM provides three types of lazy loading for relationships:
lazy: This is the default lazy loading that applies to
collection mapping, one-to-many and many-to-many relationship. In
this case, when you call the accessor for the collection/relation,
the collection is fully loaded. So, when you call EntityLoad() for
a particular artist, its artworks are not loaded at that time. When
you call artist.getarts(), all the art object belonging to the artist
will get loaded. This is achieved by setting lazy="true" on
the relationship property definition in the CFC.
Example:
In
artist.cfc
<cfproperty name="art" fieldtype="one-to-many" cfc="ART" fkcolumn="artistId" lazy="true">
Extra lazy: This applies to one-to-many and many-to-many
relationships. This type of lazy loading goes one step ahead of
lazy and does not load all the associated objects when the accessor
for that relation is called. It just loads the primary keys for
those objects and keeps a proxy object for them. When you call any
method on the wrapper object, that object's data is loaded from
the database.
For example, when you call artist.getarts(),
it executes a query on the database to fetch the primary key of
the related artwork objects and creates a proxy artwork object.
So, you do not load the data for all the artwork objects in memory.
When you access a particular artwork object and invoke any method
on it, then it fires another query to the database to load that
particular artwork object. This is achieved by setting lazy="extra" on
the relationship property definition in the CFC.
Example:
In
artist.cfc
<cfproperty name="art" fieldtype="one-to-many" cfc="art" fkcolumn="artistId" lazy="extra" >
proxy: This applies to one-to-one and many-to-one relationships.
When the owner object is loaded, the related object is not loaded
from the database. ColdFusion only creates a proxy object for the
related object and when any method is invoked on the related object,
the data for the proxy object is loaded from the database and populated
in the proxy object.
For example, if the art-artist table
relation is lazy, when the art object is loaded, the artists object
is not loaded and when you call art.getartist(),
you would only get a proxy object. When you call any method on the
proxy object, query gets executed on the database to load artist
object's data. This is achieved by setting lazy="true" on
the relationship property definition in the CFC
Example:
In
ART.cfc
<cfproperty name="artist" fieldtype="many-to-one" cfc="artist" fkcolumn="artistId" lazy="true">
Important: An entity is loaded only once in the request
(in Hibernate session) and there is always only one copy of it in
the request. So, for artwork and artist relationship, which is lazy,
if the artist is already loaded, calling art.getartist() will
not create a proxy object and will return the loaded artist object.
Lazy loading can be disabled by setting lazy="false" on
the relationship property definition in the CFC.
Choosing an appropriate lazy loading option is very important
for the performance of your application. Extra lazy means more number
of trips to the database (each trip to the database is expensive)
but less data in memory whereas no lazy loading means a huge object
graph in the memory. So, you need to balance the approach based
on the application need.
Eager fetchingIn this strategy, the associated object or collection is
fetched together with the owning entity using a single SQL Join
query. This strategy reduces the number of trips to the database
and is a good optimization technique when you always access the
associated object immediately after loading the owning entity. You can
define this strategy by setting fetch="join" for
the relationship property definition in the CFC.
Batch fetchingThis strategy tells Hibernate to optimize the second SQL
select in Immediate fetching or lazy fetching to load batch of objects
or collections in a single query. This allows you to load a batch
of proxied objects or uninitilized collections that are referenced
in the current request. This is generally useful in nested tree loading.
You can specify this using "batchsize" attribute
for CFC or relationship property.
There are two ways you can tune batch fetching:
Batch fetching at CFC level: This allows batch
fetching of the proxied objects and is applied to one-to-one and
many-to-one relationship. For example, consider artwork and artist
example where there are 25 art instances loaded in the request (ORM
session). Each artwork has a reference to the artist and the relationship
is lazy. Therefore, art objects contain the proxied object for artist. If
you now iterate through all the art objects and call getartist() on
each, by default 25 SELECT statements are executed to retrieve the
proxied owners, one for each artist proxy object. This can be batched
by specifying the 'batchsize' attribute on the
artist CFC:
<cfcomponent table="artist" batchsize="10" ...>
When
you call getartist() on the first art object, it
batch fetches 10 artist objects that are proxied in the current
request.
So for 25 art objects, this type of batch fetching
makes Hibernate execute a maximum of three queries in batches of
10, 10, and 5.
Batch fetching at collections: This allows batch fetching
of value collections, one-to-many or many-to-many relationships
that are uninitialized. For example, consider artist-art one-to-many
relationship where there are 25 artists loaded and each artist has
a lazy collection of artworks. If you now iterate through the artists
and call getarts() on each, by default 25 SELECT statements
are executed, one for each artist to load its art objects. This
can be optimized by enabling batch fetching, which is done by specifying "batchsize" on
the relationship property:
Example:
In artist.cfc:
<cfproperty name="art" fieldtype="one-to-many" cfc="art" fkcolumn="artistId" lazy="true" batchsize="10">
One
important thing to understand here is that batchsize here
does not mean that 10 artworks are loaded at one time for a artist.
It actually means that 10 artwork collections (artworks for 10 artists)
are loaded together.
When you call getarts() on
the first artist, artworks for 9 other artists are also fetched
along with the one that was asked for.
The value for batchsize attribute
should be chosen based on the expected number of proxied objects
or uninitialized collections in the session.
|