DSharp

DSharp ist ein Softwareunternehmen, das innovative Produkte und Methoden für die Entwicklung und Wartung von Low-Code-Datenplattformen entwickelt. Wir sind hier, um die Arbeit von Datenexperten zu revolutionieren.

DSharp Studio ist eine innovative No-Code-Automatisierungslösung für den Aufbau von Datenplattformen. DSharp Studio wurde entwickelt, um manuelle Entwicklungs- und Wartungsarbeiten zu minimieren, indem die Aussagekraft konzeptioneller Modelle genutzt wird. Geben Sie die Informationen an, die Sie in Ihrer Lösung benötigen, indem Sie Ihr zweckmäßiges konzeptionelles Modell entwerfen, es mit Quelldaten verbinden und einige Details anpassen, und schon sind Sie fertig. DSharp Studio wandelt dieses weniger detaillierte konzeptionelle Modell in eine detailliertere physische Datenbanklösung um, die Daten speichern, verarbeiten und veröffentlichen kann, die vollständig mit Ihrem ursprünglichen Modell übereinstimmen.

DSharp Studio enthält die vollständigen Modellierungsfunktionen von DSharp Modeler, das auch separat erhältlich ist. Darüber hinaus enthält es eine leistungsstarke Analysefunktion, die Ihnen fast die ganze Arbeit abnimmt, und vor allem den fast magischen Konvertierungsmechanismus, der Ihnen definitiv die ganze Arbeit abnimmt.

A hierarchical link is based on a self-reference in the conceptual model. The Kategorie class points to itself using the role name Oberkategorie, which results in a generated link table containing the hierarchical reference. The role name serves as the base for the parent reference column. An automatically generated state tracking satellite captures parent changes, ensuring single cardinality in the view layer, as modeled. Procedures are generated to populate these tables, keeping the model layer consistent with the conceptual model, showing only one active parent at a time.

Hierarchical Links

An attribute with an array of values is modeled as a class of its own (for example, Person’s email address class related to the Person class and containing the attribute email). The key is defined as a non-business key containing both the owner reference (the associated Person class) and the value itself (email).

This results in a structure that is a consistent implementation of the model. However, instead of multi-active satellites, this modeling approach generates a link table containing the array values. An optional state tracking satellite is also created.

MultiActiveSatellites

The conceptual model defines the cardinalities of all associations. All implementable associations are many-to-one associations.

For associations, a link table plus a state tracking satellite are automatically generated, and the generated loading mechanism guards the cardinalities defined by the conceptual model. If a relation between objects changes, state tracking will automatically be used to retire the previous relationship, maintaining the modeled cardinalities based on the driving key derived from the model structure. This requires no input from the user.

ValidityInRelationships

We model the m:n case of Something being associated with Something else in Some role at Some specific time (and variations thereof) as a Participation design pattern.

An m:n table is then an implementation of the Participation class that binds its participants together. The class is represented by a link table (either with state tracking turned on or off) which directly implements the many-to-many case. Should a need to reference a Participation class arise later, this can be implemented without link-to-link situations.

Soft delete may be applied if retired link rows need to be hidden.

MNTables

Early integration is achieved by mapping data from different sources to the same class. All sources share hubs and links, but their attribute data is loaded into source-specific satellites generated from the mappings.

Groups of sources using the same key values to represent the same entities can be defined as a Key group for a class. The Key groups define the context of specific sets of key values, thus guaranteeing hash level match where applicable, and on the other hand avoiding hash clashes where same key values are used for different entities.

EarlyIntegration

Reference tables are modeled as Description classes, referenced in the model using standard associations and generating hubs and satellites. These associations are not implemented as hash values in link tables, rather they will be hashed runtime in the view layer to provide a consistent way to navigate the view structure.

For historized content, initial loads can be generated to load all rows in order, after which the rest of the loads can be configured to only load the latest rows from the incoming data set. The column containing the version information is indicated in the mapping file.

HistorizedReferenceTable

With duplicate handling turned off, the hashing process succeeds, but insert attempts of duplicate hashes will fail, leaving the target tables unchanged. The failure is logged for subsequent handling in whatever way has been agreed.

To catch invalid duplicates, we turn on Duplicate handling for a class. After hashing, offending rows are moved to error tables, after which the remaining rows will load as expected. The duplicate rows are reported in the log.

Alternatively, to only allow non-duplicate rows to enter the hashing process, we can use the distinct keyword in the mapping file.

DuplicateData

Null hashes will automatically be moved to error tables after hashing. Composite keys having one or more empty component values will produce a hash that is considered technically valid and will therefore be loaded.

Any changes in attribute values hosted in satellites trigger the hashdiff-based insert of a new row into the satellite.

However, if the attribute is part of the key, the incoming row will be considered another instance of the class than before the change. In this case we can activate the same-as implementation for the class, to redirect references to one selected master instance. Also, depending on the situation, the class may have State tracking and Soft delete turned on, in which case the pre-change instance will be hidden in the view layer and the post-change instance will be shown.

Also see Deletion of Orders.

If a class is marked for State tracking, the load process detects the absence of previously loaded hash values and uses a state tracking satellite to mark them as Deleted (or Recreated, if they reappear). This can always be done during full loads, but also during delta loads, if certain requirements are met.

The collected state information can be ignored or used in the view layer to either keep displaying or hiding the data that has been marked as deleted by just setting the appropriate Soft delete parameter value for the class.

DeletionOfBusinessKeys

Links may be loaded with hash values that don’t exist in the referenced hub, human errors may result in satellites containing hashes not found in their parent hub or link. We have analysis functionality that reports all foreign keys logically pointing to a non-existent primary key, for all table types.

Referential integrity is not enforced on table level.

InvalidForeignKey

Deletion of any data can be tracked by turning on State tracking for a class. By turning on Soft delete, rows marked as deleted can be hidden from the view layer.

This is technically the same case as Deletion of Business keys.

We currently neither model nor generate standard facts and dimensions as such. However, a fact and a dimension can be modeled as derived classes that essentially perform the same duties. Class-to-class mappings are used to populate the derived classes with data from other classes. Business rules and filters can be used.

Hence, “Dimension classes” can be built with or without validity periods. Hierarchical structures can be flattened using the dot-notation path syntax or using SQL.

A future version will generate standard fact and dimension structures from classes tagged as such.

All numbers match. Tested using SQL queries on manually built fact/dimension structure.

Business rules are defined for derived classes, as part of a class-to-class mapping. Attributes in the derived classes are defined as rules and calculations applied on their source classes. The syntax can be a combination of model-based dot-notation paths and SQL.

The dot-notation paths are generated into SQL joins and references, so they work as platform-independent shorthand for writing SQL joins to access specific values used in the rules.

BusinessRules

Column level data lineage is available for the entire model, derived from user-defined source mappings and class-to-class mappings during the process of generating the solution. This information can be used to generate dynamic Data Flow diagrams (table level) in the UI as well as HTML documentation (column level). It is also exported to the Data Catalog database for reporting purposes.

DataLineage

Duplicate and null-hash errors while hashing and loading are caught and logged. Hash or load failures only affect the current table being populated, an error does not stop the entire load batch. Failed datatype conversions result in null values being stored in the target columns, and the original non-converted value is then stored in a separate column. By running built-in error analysis commands these errors can be detected and reported.

ErrorHandling

DSharp Studio is not needed to load the data, it only generates installable database objects.

Source system specific master procedures are generated for both hashing and loading the entire contents of the staging area for that source system. Top-level master procedures are generated for calling these source system specific procedures. Post processing like state changes, pit updates, materializations and generating derived classes are all included in the process.

Calling _HashAll and _LoadAll will load and process all contents in the staging area.

Orchestration

The deployment package is generated by selecting the classes to deploy. All database objects related to the selected classes will be included in the deployment package. The resulting installation process is nondestructive in that it never deletes any objects or columns, only adds to the existing structure. Procedures and views will be recreated.

The deployment package can be written to file and executed, or copied to the clipboard as text.

Deployment

Scheduling is done using the target platform’s mechanisms. We currently do not generate scheduling information.

For development time hashing and loading, the desired classes are selected in the UI and the appropriate hash/load etc command is run.

The current version (version 5.1, October 2024) of DSharp Studio supports the following target platforms:

Microsoft Azure SQL Database
Microsoft SQL Server 2019
Microsoft Azure Synapse Dedicated SQL pool
Microsoft Fabric Warehouse
PostgreSQL databases, version 12 or newer

Snowflake is currently in beta.

Target platform is a project level parameter and can be changed at any time.

DSharp Studio is a Windows desktop application.

Recommended minimum configuration:

RAM: 4-8 GB for application
Free disk space: 300 MB + for data
Display: 1024×768 or better
OS: 64 Bit Windows 10, Windows 11

Earlier Windows versions may work, but are not supported

Git is recommended for version control and distribution of project files between other team members. Visual Studio Code is recommended as a companion tool for git to visualize changes and handle possible conflicts.

DSharp

Willibalds Data Warehouse

Automation Tools