Data management: what it is, best practices, and platforms used
Regardless of their size and type of business, companies deal every day with a growing amount and variety of data from increasingly complex sources.
The informational value of data represents enormous potential for the enterprise, but it must be extracted from a raw base, prepared and made available for analytical applications, which through machine learning and other artificial intelligence techniques are able to produce unprecedented results from a descriptive, predictive and generative perspective.
To manage data and exploit it efficiently, the know-how and experience of individuals are no longer sufficient.
Manual operations can no longer keep up with the volume and complexity of data, especially given the need to analyze it in real time.
Therefore, there is a need to structure a data management strategy that can account for all aspects of data in an increasingly data-driven and customer-centric environment.
In some ways, data management is a practical and consistent response to the corporate data culture.
For many, this is a novelty in the digital age, but we are talking about a process that made its appearance as far back as the 1960s, with the spread of the first mainframe databases, systems capable of managing data according to a predetermined hierarchy.
They were the ancestors of relational databases, which made their debut in the 1970s and spread widely from the 1980s onward. The 1990s then saw the arrival on the market of modern data warehouses, systems designed to provide the end-to-end data quality that analytical applications require.
More recent years have seen the emergence of NoSQL databases, capable of handling even unstructured data, as well as systems such as data lakes, capable of storing huge amounts of data even in raw form, without having to transform them a priori, but only when they prove useful for analytical applications.
Today, with the spread of the cloud, enterprise IT systems are becoming increasingly hybrid, and new concepts focused on data visibility are being introduced, such as data fabric and data observability. These provide a unified view of data regardless of its location, so that it can be managed in one virtual place.
Let’s look at what data management consists of, what the data manager does, and what the best practices and most significant challenges of data management in the enterprise are.
What is meant by data management
Data management consists of all activities associated with the management and governance of enterprise data.
In the first instance, data management takes on a strategic connotation to generate added value from the data that companies have.
For data to be analyzed efficiently, it needs to be managed in an organized manner, addressing a number of challenges and risks that include cybersecurity and data privacy, in accordance with the provisions of the GDPR and the main industry regulations.
The proliferation of IoT systems has recently made data acquisition even more heterogeneous, adding dedicated sensors, smart devices, and continuously streamed audio and video content.
For example, when considering customer relationships, it is no longer possible to consider only the structured data in a CRM.
To know customers better, it is necessary to make use of all the data from their interactions with business communication channels: website, e-commerce, social media, in-store displays, and all the other touchpoints included in the omnichannel strategy.
The complexity of today’s scenarios calls for rigorous data management, aimed at giving analytical applications data with the integrity and quality needed to answer business questions reliably, without errors and problems that would frustrate the efforts and investments made.
Ultimately, data management is indispensable for companies that intend to acquire, integrate, clean, govern, store and prepare, as effectively as possible, digital content from all the sources to which the business is connected.
Why it is important
Put plainly, data management can help organizations increase revenues and profits by optimizing the resources they need and making various business activities more efficient.
A judicious data management strategy ensures, both in the short and long term, practical answers to achieve business objectives by breaking down traditional data silos, thus fostering synergies between the various lines of business.
High visibility of data ensures that users and the computing devices that access it have the appropriate level of usability to meet any operational need, as well as facilitating the achievement of the data quality required by analytical applications to support business decisions.
Data management also fosters innovation by “forcing” organizations to review their data strategy, integrating established technological systems and methodologies based on modern software platforms capable of working on data in real time.
A data management strategy is also now unavoidable for regulatory reasons: it is needed to meet the requirements of the GDPR and of the other provisions that regulate the storage, processing and protection of data, along with the company's wider cybersecurity obligations.
The types of data management
It is no coincidence that when it comes to data management, the first step to be addressed very often coincides with the definition of the data architecture.
Data architecture is key to managing large volumes of data as their complexity changes over time. It provides a blueprint that covers management systems such as databases, data lakes and data warehouses, regardless of whether they are located on-premises or in cloud services.
One of the most common types of data management is database administration: the management of systems that organize data so that it is easily updatable and usable by all business applications.
Database administration involves several steps:
- Design
- Configuration
- Installation
- Update
- Maintenance
- Security
- Backup / restore
- Performance monitoring
Other activities that fall under the umbrella of data management include:
- Data governance: defines enterprise-wide shared policies and procedures to ensure data consistency in all planned operations
- Data quality: attributable to data preparation activities, which aim to clean data, eliminate errors and duplicates, as well as ensure the correct format for all application uses
- Data observability: aims to increase the efficiency of data governance and data quality through improved visibility of the data pipeline, achievable through automated monitoring solutions.
- Data integration: aims to collect and combine data from the different sources with which business systems are connected in order to make them available for analytical applications
- Data modeling: describes the relationships between data and how they are managed in various management systems and analytical applications
- Data fabric: represents an architecture that facilitates the end-to-end integration of various data pipelines and cloud environments through the use of platforms with automation features.
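As an illustration of the data quality activity listed above, here is a minimal sketch in Python that normalizes formats, drops invalid rows and removes duplicates. The `Customer` record, the `email` and `country` field names, and the "contains an @" validity rule are assumptions made for the example, not part of any specific platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    email: str
    country: str


def clean_customers(raw_rows):
    """Normalize, validate and deduplicate raw customer rows.

    raw_rows: iterable of dicts with the hypothetical keys "email" and "country".
    Returns a list of Customer records in first-seen order.
    """
    seen = set()
    cleaned = []
    for row in raw_rows:
        # Ensure a consistent format: trim whitespace and unify letter case.
        email = (row.get("email") or "").strip().lower()
        country = (row.get("country") or "").strip().upper()
        # Eliminate errors: skip rows without a plausible e-mail address
        # (a deliberately simplistic rule, assumed for this sketch).
        if "@" not in email:
            continue
        # Eliminate duplicates: keep only the first row for each e-mail.
        if email in seen:
            continue
        seen.add(email)
        cleaned.append(Customer(email=email, country=country))
    return cleaned


raw = [
    {"email": " Ada@Example.com ", "country": "it"},
    {"email": "ada@example.com", "country": "IT"},   # duplicate once normalized
    {"email": "not-an-email", "country": "FR"},      # invalid address
    {"email": "bob@example.com", "country": None},   # missing country
]
customers = clean_customers(raw)
print(customers)
```

In practice these rules would come from the data governance policies rather than being hard-coded, and a real pipeline would log rejected rows instead of silently dropping them.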
The figure of the data manager
The main responsibilities of the data manager include:
- Management of customer and personnel activity data;
- Account creation and assignment of database access permissions;
- Maintenance of databases and data management systems;
- Simplification of data collection, preparation and analysis procedures;
- Creation and review of documentation for all database changes;
- Resolution of any inconsistencies or anomalies that might distort analytical results;
- IT infrastructure configuration: software, hardware and data storage;
- Communication with executives and non-technical stakeholders on corporate data issues.
The requirements to meet this wide range of needs are many.
The data manager is a professional who must combine management skills with technical knowledge in the world of data, as well as having strong soft skills.
The data manager must ensure a data-driven business approach, spreading the culture of data within the organization. In addition to handling the technological aspects, the data manager must be able to communicate with all stakeholders, technical and non-technical, to highlight problems and facilitate their solution, prompting the various business managers to act.
In terms of education, the data manager typically holds a degree in a data-related discipline, such as computer science, statistics, or a data science specialization.
In any case, the degree is only a starting point. It must be enriched by solid experience in the field, gained with increasing responsibility in a business context, and by a concrete aptitude for problem solving.
The data manager should also not be confused with the data protection officer (DPO), a figure that companies have had to identify in order to meet the provisions of the GDPR.
The data manager specifically focuses on taking care of the entire data lifecycle in business processes, where compliance with the GDPR, although of crucial importance, is only one of the aspects to be taken into account.
Likewise, the data manager should not be confused with the IT manager or the privacy manager.
Some best practices of data management
There are various frameworks for structuring data management in the business context, such as those defined by DAMA or the Data Governance Professionals Organization.
A well-known work in this area is the DAMA-DMBOK (Data Management Body of Knowledge), a text that aims to define a standard view of the functions and methods that underlie data management.
After its first edition (2009), the reference guide was revised in a second edition, the DMBOK2 (2017). However precisely they are set out in the DMBOK and similar tools, applying best practices usually calls for specialized consultancies capable of working on both cultural and methodological aspects.
In terms of best practices, the following activities are worth considering.
- Databases whose performance matches business needs: in particular, through the implementation of scalable databases that optimize the resources consumed by analytical applications, eliminating both costly waste and performance bottlenecks.
- Fostering data automation and reuse: in this context, the data manager must choose software platforms that can automate data transformation while fostering both data quality and the reuse of data itself in multiple applications.
- Fostering the adoption of AI and ML tools: applications based on machine learning techniques make it possible to monitor data management systems in far more depth than traditional tools allow, supporting high levels of performance and a detailed, real-time understanding of what is happening on the systems themselves.
- Use of converged databases: these are modern databases with native support for all data, structured and unstructured, facilitating their use in various applications. Converged databases facilitate the integration of many emerging technologies into business processes. In addition to the aforementioned machine learning, the contribution of IoT and blockchain cannot be overlooked.
- Implementation of a discovery layer to identify data: allows analysts and data scientists to efficiently identify datasets and make them available for their business intelligence and business analytics applications
- Provision of common query layers to manage data storage: data management platforms have tools that make various storage systems interact in a way that is transparent to the end user, regardless of their physical location.
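The idea of a common query layer can be sketched in a few lines of Python: callers ask for a dataset by name, and the layer routes the request to whichever store holds it. The class names `QueryLayer`, `SqliteStore` and `InMemoryStore`, and the dataset names, are hypothetical and chosen only for illustration; commercial platforms expose much richer interfaces.

```python
import sqlite3


class InMemoryStore:
    """Hypothetical stand-in for a non-relational store (for example, an event feed)."""

    def __init__(self, datasets):
        self._datasets = datasets

    def fetch(self, dataset):
        return list(self._datasets[dataset])


class SqliteStore:
    """Stand-in for a relational store, backed by the standard sqlite3 module."""

    def __init__(self, connection):
        self._conn = connection
        self._conn.row_factory = sqlite3.Row

    def fetch(self, dataset):
        # Table names cannot be bound as SQL parameters, so the name is
        # checked against the schema before it is placed in the query.
        tables = {row["name"] for row in self._conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")}
        if dataset not in tables:
            raise KeyError(dataset)
        rows = self._conn.execute(f'SELECT * FROM "{dataset}"')
        return [dict(row) for row in rows]


class QueryLayer:
    """Routes a dataset name to the store that holds it."""

    def __init__(self):
        self._routes = {}

    def register(self, dataset, store):
        self._routes[dataset] = store

    def fetch(self, dataset):
        # The caller never needs to know which store answers the request.
        return self._routes[dataset].fetch(dataset)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.execute("INSERT INTO orders VALUES (1, 19.9)")

layer = QueryLayer()
layer.register("orders", SqliteStore(conn))
layer.register("web_events", InMemoryStore({"web_events": [{"page": "/home"}]}))

orders = layer.fetch("orders")
events = layer.fetch("web_events")
print(orders, events)
```

The design choice worth noting is that both stores expose the same `fetch` method, so adding a new storage system means writing one adapter class rather than changing every consumer.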
The platforms used
As mentioned in the introduction, data is collected from many channels: websites, social media, e-mail, sales history, etc. Data management platforms (DMPs) are the software systems used to collect and manage this data. They became best known in marketing, where they were first implemented, and have since evolved into the customer data platform (CDP).
Among the main DMPs we find:
- Microsoft Customer Insights and Marketing
- SAS Data Management
- Snowflake
- Oracle Advertising and Customer Experience
- Adobe Audience Manager
- Google Audience Center
Challenges and risks of data management
When one decides to implement a data management strategy, one often inherits fragile architectures, the result of successive layering of services and solutions, which have produced a series of data silos that make communication between various business departments problematic.
The first challenge in data management is ensuring that data sets are accurate and consistent across all management systems, with particular attention to improving visibility.
One of the most complex aspects of data management is knowing what is available and where it is available.
This issue is made even more complex by cloud computing, which allows the use of managed services offered by different providers, on different systems.
To facilitate this task, data management platforms have tools that can use metadata for cataloging.
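A minimal sketch of metadata-based cataloging, assuming a hypothetical `Catalog` class and illustrative dataset names, locations and tags: each dataset is registered with a small metadata record, and the catalog answers the question of what is available and where.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    name: str
    location: str                      # e.g. "on-premises" or a cloud provider label
    owner: str
    tags: set = field(default_factory=set)


class Catalog:
    """Hypothetical in-memory metadata catalog."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def find(self, tag=None, location=None):
        """Return the names of datasets matching all given filters, sorted."""
        matches = []
        for entry in self._entries.values():
            if tag is not None and tag not in entry.tags:
                continue
            if location is not None and entry.location != location:
                continue
            matches.append(entry.name)
        return sorted(matches)


catalog = Catalog()
catalog.register(DatasetEntry("crm_contacts", "on-premises", "sales", {"customer", "pii"}))
catalog.register(DatasetEntry("web_clickstream", "cloud-a", "marketing", {"customer", "events"}))
catalog.register(DatasetEntry("invoices", "cloud-b", "finance", {"billing"}))

customer_datasets = catalog.find(tag="customer")
cloud_a_customer = catalog.find(tag="customer", location="cloud-a")
print(customer_datasets, cloud_a_customer)
```

A production catalog would typically harvest this metadata automatically from the storage systems rather than rely on manual registration.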
Even the migration of an on-premises database to the cloud involves adopting next-generation technologies, which are not always compatible with traditional systems.
It is therefore necessary to evaluate the most suitable solutions on a case-by-case basis, rather than assuming that everything must be migrated.