WhereScape

WhereScape’s data warehouse automation software solves one of IT’s biggest problems: How do I build a data warehouse with efficiency, reduced risk, and long-term profitability?

Using automation, templates, and built-in intelligence, WhereScape increases architect and developer productivity, provides scalability with standards, and allows you to take control of your own data while fostering collaboration between IT and the business.

With comprehensive technical support spanning virtually every source, destination, and design methodology, WhereScape helps you design, develop, deploy, and operate your Data Vault (or any other data architecture) quickly and easily.

Build Raw Vault, Business Vault, and Star Schema in minutes and enjoy performance and quality improvements by reducing hand-offs, tool switching, and other pitfalls of handwritten or semi-automated development.

When a self-referencing relationship exists in one table, then a hierarchical link can be used to maintain the relationship. Hierarchical links are defined as entities within WhereScape 3D and their business keys tagged as type, Link Business Key. Naming standards are applied (e.g. tablename_hl) to differentiate from other link types. The ingestion mechanism is the same as any normal link.

Hierarchies can be resolved or navigated within downstream objects, such as the information mart, built in WhereScape RED.

WhereScape – 01 – Hierarchical Links

There are multiple options to handle this.

Option #1 as a multi-active satellite by setting the PK column “von” as a Multi-Active Key and a Multi-Active attribute. The rest of the attributes are defined as Multi-Active attributes too. A multi-active satellite with the suffix _ma will be generated.

Option #2: LivingAddress as Hub with Business keys of both primary keys (KundeId and Von) and generating a link and satellite. No link satellite necessary.

WhereScape – 02 – Multi-active satellite

The relationship between the Customer and the AssociationPartner is defined by the foreign key (FK) relationship. The AssociationPartner FK can change and be Null, valid or invalid. The KundeID is set as the Driving Key. To maintain the history and effectiveness of the relationship an effectivity satellite is added to the link. Satellite suffix: _effectivity

WhereScape – 03 – Validity in Relationships

When a table such as the Delivery Service table does not have its own primary key but holds keys of several other related tables, WhereScape can generate a non-historized link. The foreign keys are tagged as type, Link Business Key. Non-descriptive attributes, such as PosID, that are not part of the business key, can be tagged as type, Link Dependent Attribute, and WhereScape will add them to the Link table as a dependent child. Descriptive attributes can be tagged as type, Satellite Transaction, and will be implemented in a Satellite.

WhereScape – 04 – m-n Tables

If early integration is needed, views can easily be defined, and DDL generated, to split the Roadshow Order table (RS_Bestellung) and map these views to the Order and Position tables. WhereScape can also be designed to map and source a subset of the RS_Bestellung and Webshop table attributes to the Order and Position hubs and merge the data into their respective hubs via separate update scripts for each source. This allows independent, isolated and flexible loading whichever source is delivering data.

WhereScape – 05 – Early Integration

Reference tables (for example, sourced from csv files) can be profiled and structures discovered and then mapped as sources, with PKs and FKs defined, all attributes tagged as reference attribute and generated as table objects in the DW. Prefix: r_

For historized reference tables the PK attribute is tagged as a reference attribute and the rest of the attributes are tagged as reference history attributes. All reference history attributes are maintained in a separate satellite and linked to the reference table via FK relationship to maintain the historical changes of the reference table. Satellite suffix: _refhist

WhereScape – 06 – Historized Reference Table

Duplicate data arrives within the same batch timestamp.

#1 Technical DV duplicate (two identical rows).We could choose to de-duplicate using either a batch id which is different from the Business key or by using a SQL windowed function like ROW_NUMBER() to insert only the first row

#2 Two rows with same BK (different data). Source is erroneous solution

  1. Change the satellite as a multi-active satellite and load the received data as is
  2. Communicate with the team delivering the source data and let them fix it
  3. add an error reporting table using error handling
  4. Apply soft rules in the BV to fix the data

WhereScape – 07 – Duplicate Data

Rows arriving without business keys might be meaningless for the business object. One can ignore those rows and put them in the error mart. In case there is a need to maintain ‘unknown keys’ in the hubs, one could create a dummy record in the hub for the NULL keys as zero key records as used traditionally in Kimball-style dimension tables.

WhereScape – 08 – Rows without BKs

The satellites take care of any change in the historically loaded descriptive information of the hubs and links and any other data vault objects. In case there is any mistrust in the data, we use an additional ODS layer to stage the data vault and maintain the historically loaded source data. It is also possible to create a view to reconstruct the source from the data vault objects and compare it to the loaded data.

WhereScape – 09 – Changes of attributes

A CDC mechanism needs to exist to capture the changes. For example for the customer hub there is a satellite for status tracking. The history is maintained in the satellite and can be retrieved to validate the business keys over time. Satellite suffix: _tracking

WhereScape – 10 – Deletion of Business Keys

DeliveryAddress and Customer are created as hubs with their respective satellites and there is a link object between the delivery address and the hub. As the customer hub is also sourced from the delivery address, the customer hub is also populated with the customer business keys from the delivery address even if they do not, at the time, have corresponding records in the Customer table. The link is maintained properly.

WhereScape – 11 – Invalid foreign keys

If business keys are deleted between the dates of Deliveries, we need to maintain a status tracking satellite for the Orders hub. As the Orders are relevant for counting we need to add/generate the measures and transformations in the fact tables within WhereScape RED.

WhereScape – 12 – Deletion of Orders

The Product Category is a link object which maintains the relationship between the Product and the Category hubs. The Product Category dimension based on this link and the satellite attributes reflects the changes over time.

WhereScape – 13 – Change of dimensions

WhereScape supports the creation of a Dimensional Mart on top of the Data Vault.  We used the WhereScape RED tool to create dimensions for Association Partner, Customer, and Product as well as separate fact tables for Webshop and Roadshow contain all the different metrics requested.

WhereScape – 14 – Key Figures

WhereScape supports the generation of business vault objects such as PIT and Bridge tables, current views and historical views, with generated end dates and record version numbers on top of the raw vault satellites.

Business rules and data transformations can be added at any time in WhereScape RED.

The connection between Webshop Orders and Association Partner was modelled in 3D to generate the Link tables.  We also modelled a Sames-As Link table to resolve the alternate Customer ID based on credit card information.

WhereScape – 15 – Business Rules

All objects generated in the data vault or other data foundation layer is fully documented as all object details are stored in a repository. The tool generates the documentation and the graphical data lineage of tables, columns and related procedures, functions, scripts, transformation codes etc. automatically.

WhereScape – 16 – Data Lineage

All errors are maintained in the error and audit log tables stored within WhereScape’s metadata repository. Errors are displayed within the WhereScape scheduler console or can be sent as notifications – such as email/teams/slack – to notify responsible people.

WhereScape – 17 – Error Handling

 

WhereScape’s job scheduler enables full orchestration of steps needed to load and transform data through the various layers of the data vault/data warehouse. The ingestion of the load table, and the scripts to generate ODS/PSA, stage and DW objects are defined as tasks. The tasks can be prioritised, ordered and grouped by object or process type. Jobs could be configured to depend on other jobs to run. Job dependencies can be visualised and are included in the generated technical documentation.

WhereScape – 18 – Orchestration

 

WhereScape generates deployment packages to deploy objects to different metadata repositories (e.g. development, UAT and production).  A deployment could be a reload or new load or handle deletes and partial changes. DDL deltas are detected during deployment and ALTER statements generated. Versioning is managed within WhereScape and can also be integrated with existing processes and external tools such as Git.

WhereScape – 19 – Deployment

 

WhereScape comes with a built-in cross-platform, enterprise-grade batch workflow job scheduler. Running on Windows or UNIX/LINUX or a mix of both, the scheduler supports load balancing, resolves ordering through job dependencies and provides an easy to use web user interface to maintain and track workflows. Users wanting to use their own scheduler can either execute the WhereScape generated scripts directly or by triggering WhereScape jobs via REST API.

WhereScape – 20 – Scheduling

 

WhereScape is continuously developing its enablement packs to support target platforms and currently supports the following:

  • Amazon Redshift
  • Databricks / Deltalake / SparkSQL
  • Exasol
  • Google BigQuery
  • Greenplum
  • Hive
  • IBM DB2
  • Microsoft Azure Synapse + APS/PDW
  • Microsoft SQL Server + Azure SQL Database + Managed Instance
  • MySQL
  • Netezza
  • Oracle
  • PostgreSQL
  • SAP Hana
  • Singlestore
  • Snowflake
  • Teradata

 

WhereScape can be installed on premise, in a cloud environment or a hybrid. Metadata is stored in MS SQL Server/PostgreSQL/Teradata/Oracle and can run locally or on cloud services such as RDS. Depending on the target platform additional connection ODBC, native loading tools and python 3.0+ are required.

WhereScape – 22 – Installation Requirements

 

Klicke außerhalb der Vergleichs-Bar um sie zu schließen.
Vergleichen
Vergleichen ×
Vergleich starten! Zurück zur Seite