Exalytics:
Loading Data
There are three primary use cases
for Exalytics/OBIEE/TimesTen:
1. Normal BI use case
2. DataMart use case
3. Operational Data Store
(ODS) use case
In terms of how TimesTen is used, (1)
is the most common case (maybe 70-80% of customers), while (2) and (3) are very
similar to each other.
For (1), only aggregates are stored
in TimesTen. These aggregates are recommended by the BI software based on the
various reports and dashboards that have been configured there, the queries
that BI generates as a result of them and the way the data is organised in the
backend database. BI generates scripts to compute these aggregates and to load
them into TimesTen.
This is all completely controlled and managed by the BI
software. We should not get involved in any recommendations relating to
data loading for this aspect (since it is all under the control of BI and is
closely coupled with how BI is set up and used). Any 'advice' for customers in
this area is the job of BI SCs, BI consulting and maybe BI support.
For (2) and (3) a subset of the
backend database DataMart / ODS is typically loaded into TimesTen. This usually
consists of one or more very large 'fact' tables plus a larger number of
significantly smaller (but still often large in absolute terms) 'dimension'
tables. The decision as to which tables need to be moved into TimesTen and what
subset of the data is needed is still really driven by BI considerations and
has to be carefully evaluated on a case by case basis. We should not get
involved in that aspect; that is firmly the province of BI SCs / consultants to
assist the customer with.
Once the customer knows the exact set of
tables etc. that they want to move into TimesTen then we can potentially
provide some advice, though again it should be in conjunction with the BI folk
to ensure we are all aligned on the best option(s) for a given use case.
For initial data loading (initial
population):
If the backend DB is non-Oracle then
potential options are:
1. Manual: create
compatible schemas and tables in TimesTen, unload data from the backend DB in a
format that ttBulkCp can read, and load the data into TimesTen using ttBulkCp.
2. Use OBIEE. This has
some limited data loading capabilities.
3. Use Oracle Data
Integrator (if the backend DB is supported). This requires a license so it is
not a free option.
4. Use Oracle
Datawarehouse Administration Console (DAC). This also requires a license so it
is not a free option.
If the backend database is Oracle
10gR2 or later then there is an additional option:
5. Use
ttImportFromOracle to perform type mapping, data compression etc. and generate
TT compatible DDL files to create the tables in TimesTen. Then use the (new)
parallel data load feature of ttImportFromOracle (as of TT 11.2.2.5.0) to load
data into the TT tables quickly and with high throughput.
For Oracle DB backends in general
(5) would be the recommendation (since it is free and flexible) but customers
may of course choose any of these options.
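For the manual route in option (1), the unload step amounts to dumping each table to a delimited text file that ttBulkCp can read. A minimal sketch of that step, with the backend DB simulated by an in-memory sqlite3 table; the comma separator and the literal NULL token are assumptions here, so check the ttBulkCp options for the exact format your version expects, and all table and file names are illustrative:

```python
# Sketch of the manual unload step: pull rows from the backend DB (simulated
# with sqlite3) and write a delimited file for a later "ttBulkCp -i" load.
import csv
import sqlite3

def unload_table(conn, table, outfile):
    """Dump all rows of `table` to a comma-separated file, one row per line."""
    cur = conn.execute(f"SELECT * FROM {table}")
    with open(outfile, "w", newline="") as f:
        writer = csv.writer(f)
        for row in cur:
            # Represent SQL NULL as a literal NULL token (an assumption; the
            # real ttBulkCp NULL representation is configurable).
            writer.writerow("NULL" if v is None else v for v in row)

# Simulated backend 'fact' table with one NULL value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (id INTEGER, region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales_fact VALUES (?,?,?)",
                 [(1, "EMEA", 100.0), (2, "APAC", None)])
unload_table(conn, "sales_fact", "sales_fact.dat")
```

The matching TimesTen tables would have been created beforehand from hand-written (or ttImportFromOracle-generated) DDL.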
For
incremental refresh loads:
Typically, once the initial
population has been done then from time to time (might be hourly but more often
daily or less frequently) there is a need to add new data to the TT datamart /
ODS. Usually this is simply adding new data but sometimes it may also require
updating existing data. Given the limited memory capacity, there will also
need to be a mechanism to periodically purge 'old' data.
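The purge itself is conceptually simple: delete everything loaded before a retention cutoff. A sketch, again simulating the database with sqlite3; table and column names are illustrative, and in TimesTen this would run as SQL or PL/SQL, ideally in batches to limit lock time:

```python
# Sketch of the periodic purge: delete rows older than a retention window so
# the in-memory datamart stays within capacity. All names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact (id INTEGER, load_date TEXT)")
conn.executemany("INSERT INTO fact VALUES (?,?)",
                 [(1, "2013-01-01"), (2, "2013-06-01"), (3, "2013-09-01")])

def purge_older_than(conn, cutoff):
    """Remove rows loaded before `cutoff` (an ISO date string)."""
    cur = conn.execute("DELETE FROM fact WHERE load_date < ?", (cutoff,))
    conn.commit()
    return cur.rowcount

removed = purge_older_than(conn, "2013-05-01")
```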
This is a more complex area and as
of today we do not have any really good solutions (but they are being worked
on). Post 11.2.2.5.0 we hope to have some better options (but that is internal
information, not for external consumption at present). The two main challenges
here are:
1. All queries that
originate from BI use table-level locking for performance reasons. Thus they
are incompatible with any concurrent DML against the tables being queried.
2. What does it mean
anyway, in business terms, to run reports and queries against data that is
changing (given that TimesTen does not support the same query consistency
models as e.g. Oracle DB)?
So today options for incremental
refresh require either:
a) Some BI application
downtime to allow the TT tables to be refreshed. This may be possible
selectively if there are different sets of tables that can be refreshed
together.
b) Some form of double
buffering whereby two copies of all relevant tables are maintained in TT. The
app runs against copy 1 while copy 2 is refreshed, then the app switches to
copy 2. Later, copy 1 can be refreshed and the app switched back to copy 1
afterwards. BI can support this but the memory requirements in TimesTen often
make it impractical.
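The double-buffering bookkeeping in (b) can be sketched as below. In a real deployment the switch might be done with synonyms/views or by repointing the BI repository at the other set of tables; this just models which copy is read and which is safe to reload, and all names are illustrative:

```python
# Sketch of option (b): two physical copies of each table; the app reads the
# active copy while the inactive one is truncated and reloaded, then we flip.
class DoubleBufferedTable:
    def __init__(self, base_name):
        self.copies = [f"{base_name}_1", f"{base_name}_2"]
        self.active = 0  # index of the copy queries should read

    def read_target(self):
        """Name of the copy the BI application should query right now."""
        return self.copies[self.active]

    def refresh_target(self):
        """Name of the inactive copy, safe to truncate and reload."""
        return self.copies[1 - self.active]

    def switch(self):
        """Flip readers over to the freshly loaded copy."""
        self.active = 1 - self.active

buf = DoubleBufferedTable("SALES_FACT")
# App queries SALES_FACT_1 while SALES_FACT_2 is being reloaded...
reload_this = buf.refresh_target()
buf.switch()  # ...then readers move to SALES_FACT_2.
```

The memory cost is obvious from the sketch: every buffered table exists twice in TimesTen, which is exactly why this is often impractical.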
The 'tools' that can be used for
this are pretty much the same as for initial loading…
For non-Oracle backend DBs:
1. Manual: unload data
from the backend DB in a format that ttBulkCp can read, and load the data into
TimesTen using ttBulkCp. If anything other than INSERT is needed then you may
need to use staging tables and PL/SQL processing in TimesTen etc.
2. Use Oracle Data
Integrator (if the backend DB is supported). This requires a license so it is
not a free option.
3. Use Oracle
Datawarehouse Administration Console (DAC).
DAC is free and supported for
customers with a license for any of the BI Applications.
If the backend database is Oracle
10gR2 or later then there is an additional option:
4. Use the
ttLoadFromOracle built-in procedure to load data directly into the TT tables,
or into staging tables for further processing using e.g. PL/SQL. The
LoadData.sql script generated by ttImportFromOracle could be used as a basis
for constructing suitable scripts.
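The staging-table pattern mentioned in (1) and (4) can be sketched as follows: the incremental extract is first loaded into a staging table (via ttBulkCp or ttLoadFromOracle), then merged into the target, updating rows that already exist and inserting the rest. In TimesTen the merge step would be PL/SQL or SQL; here sqlite3 stands in for the database and all table and column names are illustrative:

```python
# Sketch of the staging-table refresh: stage the incremental extract, then
# upsert it into the target table and clear the staging area.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE fact_stg (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO fact VALUES (?,?)", [(1, 10.0), (2, 20.0)])
# Incremental extract: row 2 changed, row 3 is new.
conn.executemany("INSERT INTO fact_stg VALUES (?,?)", [(2, 25.0), (3, 30.0)])

def apply_staged(conn):
    """Upsert staged rows into the target, then clear the staging table."""
    # Update rows that already exist in the target.
    conn.execute("""
        UPDATE fact SET amount = (SELECT s.amount FROM fact_stg s
                                  WHERE s.id = fact.id)
        WHERE id IN (SELECT id FROM fact_stg)""")
    # Insert rows that are new.
    conn.execute("""
        INSERT INTO fact SELECT * FROM fact_stg
        WHERE id NOT IN (SELECT id FROM fact)""")
    conn.execute("DELETE FROM fact_stg")
    conn.commit()

apply_staged(conn)
```

Note that this still runs into the table-locking conflict described earlier, so in practice it has to be combined with downtime or double buffering.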