The data architecture framework a firm sets up today will determine its success tomorrow – and how!
In an age where most businesses are already neck-deep in digital transformation – a process accelerated by the COVID-19 pandemic – success will be determined primarily by a firm's ability to adapt, especially in how it structures its logical and physical data assets and data management resources. This includes the models, rules, policies and standards that govern the procurement, collection, storage, integration and usage of its data resources – in other words, its data architecture.
According to The Open Group Architecture Framework (TOGAF), the primary goal of setting up a sustainable data architecture is to “translate business needs into data and system requirements and to manage data and its flow through the enterprise.” This is where an integrated team of data architects, data scientists and analysts comes into play – and how far that team flourishes will determine how far the firm flourishes.
The Principles of Architecture
In determining the appropriate data architecture for an enterprise, primary importance needs to be placed on making data available and transparent. Data is meant to be used as a shared asset, and all modern data architectures need to eliminate compartmentalisation of data in different departmental silos, giving stakeholders a more complete and rounded view of the company and its functioning.
User-friendly interfaces become crucial in this regard, allowing employees not only to access the data easily but also to work with it appropriately, using tools fit for the task at hand. The data itself must be suitably curated and optimised for agility and accuracy.
Security, of course, is also a primary concern. Security and access controls must begin at the raw data itself, and a dedicated cybersecurity suite should be set up wherever necessary.
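As a minimal sketch of what “access controls beginning at the raw data itself” can mean in practice, the snippet below applies column-level masking per role before any record leaves the raw-data layer. The role names, record fields and masking policy are illustrative assumptions, not part of any particular framework.

```python
# Hypothetical sketch: column-level access control applied at the raw-data layer.
# Roles, fields and the masking convention ("***") are illustrative assumptions.

RAW_RECORD = {"customer_id": 17, "email": "a@example.com", "balance": 1200.50}

# Each role maps to the set of columns it may read; everything else is masked.
ROLE_POLICIES = {
    "analyst": {"customer_id", "balance"},
    "support": {"customer_id", "email"},
}

def read_record(record: dict, role: str) -> dict:
    """Return a view of the record with unauthorised columns masked."""
    allowed = ROLE_POLICIES.get(role, set())
    return {key: (value if key in allowed else "***")
            for key, value in record.items()}

print(read_record(RAW_RECORD, "analyst"))  # balance visible, email masked
```

Because the policy is enforced at the point of reading the raw record, every downstream consumer inherits the same controls rather than re-implementing them per application.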
Essentially, a firm's data architecture should consist of three major components: (i) outcomes – the models, definitions and data flows often referred to as data architecture artefacts; (ii) activities – the formation, deployment and fulfilment of data architecture intentions; and (iii) behaviours – the collaborations, mindsets and skills of the various roles that affect an enterprise's data architecture.
Many Frameworks, Many Attributes
According to the Data Management Body of Knowledge (DMBOK 2), “data architecture defines the blueprint for managing data assets by aligning with organizational strategy to establish strategic data requirements and designs to meet those requirements.” DMBOK 2, in turn, defines data modelling as “the process of discovering, analyzing, representing, and communicating data requirements in a precise form called the data model.”
While both data architecture and data modelling seek to bridge the gap between business goals and technology, data architecture takes a macro view, seeking to understand and support the relationships between an organization's functions, technology and data types. Data modelling takes a more focused view of specific systems or business cases.
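To make the distinction concrete, here is a minimal sketch of the “focused view” a data model takes: a logical model for a single business case (customers and orders), expressed as plain dataclasses. The entity and field names are illustrative assumptions, not taken from DMBOK 2.

```python
# A toy logical data model for one business case. Data architecture would sit
# above this, mapping how such models relate across functions and systems.
from dataclasses import dataclass

@dataclass
class Customer:
    customer_id: int
    name: str

@dataclass
class Order:
    order_id: int
    customer_id: int  # foreign-key-style reference back to Customer
    total: float

alice = Customer(customer_id=1, name="Alice")
order = Order(order_id=100, customer_id=alice.customer_id, total=49.99)
print(order)
```

The model communicates the data requirements of one system precisely; the architecture decides where such a model lives, who consumes it, and how its data flows through the enterprise.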
The three major frameworks that commonly serve as foundations for any organisation’s data architecture setups are:
- The DAMA-DMBOK 2, a framework specifically for data management providing standard definitions for data management functions, roles and delivery, along with several other principles;
- The Zachman Framework, a multi-level enterprise ontology created by John Zachman, then at IBM, in the 1980s, covering relevant architectural standards, an enterprise model, a logical data model, a physical data model and the actual databases; and,
- The Open Group Architecture Framework (TOGAF), offering a high-level framework for enterprise software development.
Irrespective of the kind of architecture chosen, it must accommodate emerging technologies such as artificial intelligence, IoT, automation and blockchain. This means the architecture must be cloud-native: offering high availability, elastic scalability and end-to-end security for data both at rest and in motion – such as real-time data streaming and micro-batch data bursts.
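The micro-batch pattern mentioned above can be sketched in a few lines: an unbounded event stream is grouped into fixed-size batches that downstream systems process as units. The batch size and the stand-in event source are illustrative assumptions; real deployments would use a streaming platform rather than an in-memory iterator.

```python
# Hedged sketch of micro-batching: group an (unbounded) event iterator
# into fixed-size batches for downstream processing.
from itertools import islice

def micro_batches(events, batch_size=3):
    """Yield successive lists of up to `batch_size` events from the stream."""
    it = iter(events)
    while batch := list(islice(it, batch_size)):
        yield batch

stream = range(8)  # stand-in for a real-time event source
print(list(micro_batches(stream)))  # → [[0, 1, 2], [3, 4, 5], [6, 7]]
```

Micro-batching trades a small amount of latency for throughput and simpler failure handling, which is why it appears alongside pure record-at-a-time streaming in many cloud-native designs.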
Both data integration and independence are key, especially where legacy applications use standard API interfaces. While these interfaces must be optimised across systems and geographies to facilitate data sharing, they must also be loosely coupled, allowing each service to perform its tasks independently of the others. Finally, there must be real-time enablement, allowing tasks such as data validation, classification, management and governance to be automated.
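The automated data validation mentioned above can be sketched as a rule-driven check that runs on every record as it arrives, rather than as a manual audit after the fact. The rule names and sample records are illustrative assumptions, assuming a simple in-memory pipeline.

```python
# Minimal sketch of automated, rule-driven data validation.
# Each rule is a (name, predicate) pair applied to every incoming record.

VALIDATION_RULES = [
    ("has_id", lambda rec: "id" in rec),
    ("positive_amount", lambda rec: rec.get("amount", 0) > 0),
]

def validate(record: dict) -> list:
    """Return the names of every rule the record violates (empty = valid)."""
    return [name for name, rule in VALIDATION_RULES if not rule(record)]

good = {"id": 1, "amount": 10.0}
bad = {"amount": -5}
print(validate(good), validate(bad))  # → [] ['has_id', 'positive_amount']
```

Keeping the rules as data rather than hard-coded branches means governance teams can extend or tighten validation without touching the pipeline code itself.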