The following is a revised edition.
We added a summarised version below for those who prefer the written word, made easy for you to skim and record top insights! 📝
Additional note from community moderators: We’re presenting the insights as-is and do not promote any specific tool, platform, or brand. This is to simply share raw experiences and opinions from actual voices in the analytics space to further discussions.
Prefer watching over listening? Watch the Full Episode here ⚡️
Dr. Dominik Neumann is a distinguished data professional with over five years of experience across various data roles, now serving as the data and AI strategy manager at Accenture. With a rich academic background as a research associate and hands-on industry experience in business intelligence and analytics engineering, Dominik has led teams in developing data pipelines, driving process improvements, and implementing AI-driven strategies. His blend of academic rigour and industry expertise makes him a true asset in the data world. We highly appreciate him joining the MD101 initiative and sharing his much-valued insights with us!
We’ve covered a RANGE of topics with Dominik. Dive in! 🤿
A decentralized Business Intelligence (BI) team can be pivotal for bridging gaps between tech-heavy data initiatives and business operations with low data literacy. One impactful project involved building such a team designed to enable data-driven decision-making and address the disconnect between business needs and technical delivery.
Embedded within both tech and business units, the team acted as translators, converting business requirements into technical terms and aligning both sides. This approach not only empowered end users but also streamlined data infrastructure to support a more data-centric business transformation.
To simplify complex legacy models and reduce computational load, we adopted best practices focused on incremental data loading and model consolidation. Previously, the legacy system was reloading all data nightly, which was inefficient and resource-heavy. We remodelled this to use incremental loads where feasible, reducing strain on the system while addressing error correction needs.
Additionally, we combined models where possible, archived outdated ones, and limited parallel runs to manage load across both the legacy and new platforms. This approach prioritized efficient, scalable data management to support a smoother transition and reduce operational risks.
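As a rough illustration of the shift from nightly full reloads to incremental loads, the sketch below upserts only the rows changed since the last run. It is a minimal Python sketch, not the team's actual pipeline: the `Row` fields, the in-memory `target` dict standing in for a warehouse table, and the `lookback` window for picking up late corrections are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Row:
    order_id: int          # hypothetical primary key
    updated_at: datetime   # hypothetical last-modified timestamp
    amount: float


def incremental_load(source, target, watermark, lookback=timedelta(0)):
    """Upsert rows changed since the last run and return the new watermark.

    source:    iterable of Row (stands in for the source extract)
    target:    dict keyed by order_id (stands in for the warehouse table)
    watermark: updated_at of the newest row loaded by the previous run
    lookback:  optional window that re-reads recent rows to catch corrections
    """
    cutoff = watermark - lookback
    new_watermark = watermark
    for row in source:
        if row.updated_at > cutoff:
            # Upsert: inserts new keys and overwrites corrected rows.
            target[row.order_id] = row
            new_watermark = max(new_watermark, row.updated_at)
    return new_watermark
```

Because the write is an upsert keyed on `order_id`, re-running the same batch leaves the table unchanged, which is what makes a lookback window a safe way to handle error correction.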
To ensure data quality and governance within complex, large datasets, we implemented a data quality tool tailored to our tech stack. After evaluating major vendors, we selected a smaller, highly compatible tool that worked seamlessly with our existing dbt structures.
This solution enabled alerting and issue visualization for the analytics engineering team, which was essential because the team had previously spent hours each day troubleshooting data issues. By introducing this data quality framework, we streamlined error detection, reduced manual workload, and improved overall data governance and visibility.
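The kind of check such a tool automates can be sketched in a few lines. The example below is a hypothetical Python illustration, not the selected vendor's API: it runs two checks that mirror dbt's built-in `not_null` and `unique` tests and turns failures into alert messages. The column names `order_id` and `customer_id` are assumptions.

```python
def check_not_null(rows, column):
    """Return the indexes of rows where `column` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def check_unique(rows, column):
    """Return the non-null values of `column` that appear more than once."""
    seen, dupes = set(), set()
    for row in rows:
        value = row.get(column)
        if value is None:
            continue  # nulls are reported by check_not_null instead
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes)


def run_checks(rows):
    """Run both checks and return one alert message per failed check."""
    alerts = []
    missing = check_not_null(rows, "customer_id")
    if missing:
        alerts.append(f"customer_id is null in rows {missing}")
    dupes = check_unique(rows, "order_id")
    if dupes:
        alerts.append(f"order_id has duplicate values {dupes}")
    return alerts
```

In practice the alert list would be routed to a channel the analytics engineering team already watches, so that issues surface without manual inspection.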
Real-time or near-real-time ETL processes were deprioritized due to feasibility and resource constraints. While some exploratory steps were taken, such as monitoring Order Management Systems (OMS) to understand raw and streaming data, the focus remained on practical impact and effort, particularly within the operations domain.
Given evolving priorities in data engineering, we chose to focus resources where they could make the most immediate difference, leaving real-time analytics as a potential future goal rather than a present priority.
Balancing infrastructure engineering with data integrity relies heavily on clear communication and collaboration frameworks. For an engineering manager, prioritizing effective handshakes between teams, such as data engineering and analytics engineering, was key.
This approach enabled alignment on requirements, especially those impacting end users and SLAs. By focusing on establishing the right team interactions and ensuring the right people were involved in decision-making, the team could more effectively manage trade-offs and maintain data integrity across the infrastructure.
To build scalable data solutions, engineers should focus on two key areas: maintaining curiosity and embracing emerging tools. First, staying curious by actively exploring industry updates, such as newsletters or market developments, prevents falling into a routine and enables proactive innovation. With rapid changes, particularly in AI, engineers can benefit from continuously learning how new systems might enhance their work.
Secondly, embracing new technologies, especially AI, as job enablers rather than threats is essential. Instead of clinging to past methods, engineers should leverage these advancements to improve productivity and stay relevant in the evolving tech landscape.
Mastery of foundational skills should take priority over adopting new technologies, as they enable engineers to understand and verify the outputs from digital tools. Relying solely on machine-generated results without understanding them can undermine an engineer’s expertise and reduce trust with stakeholders.
To maintain proficiency, engineers should test their knowledge through hands-on coding and self-guided problem-solving, filling any gaps that arise. By balancing foundational skills with the selective use of digital helpers, engineers can enhance their abilities and provide clear, reliable insights to the organization without becoming overly dependent on automation.
Effective mentoring of junior engineers centres on actively listening to their career goals and fostering authentic growth rather than fast-tracking titles. By understanding their personal journeys, mentors can guide juniors in aligning their daily work with their learning paths, encouraging practical application of new skills, and pursuing relevant certifications.
This approach allows juniors to showcase their progress both within and outside the organization, branding themselves as capable and adaptable professionals. The emphasis is on growth as a continuous journey, helping engineers gain real-world insights and competencies necessary for senior roles.
To assess the suitability of new data tools, it’s crucial to test them within your actual environment, gather user feedback, and ensure alignment with your team’s workflows and needs. For example, when evaluating a data monitoring tool in a previous role, we considered top industry solutions but prioritized testing directly in our data landscape rather than in a vendor-provided sandbox.
This real-environment testing helped gauge both usability and fit, allowing engineers to assess whether the tool met their needs. We ultimately chose a vendor that was still in startup mode because of its openness to collaborative testing and its extensive proof-of-concept support in our environment, which ensured the tool aligned closely with our specific requirements.
A key challenge in data engineering is balancing strategic priorities with ad-hoc requests. Much of the daily workload is spent on pre-qualifying requests and clarifying the business impact of these inquiries. To address this, a comprehensive data literacy program and a more guided data marketplace could empower end users and reduce unnecessary clarification.
Leveraging AI tools to help users refine their questions could also be transformative, ultimately streamlining decision-making and improving prioritization. For example, understanding the business impact of requests—such as distinguishing between high and low-revenue markets—would help shorten conversations and guide resource allocation more effectively.
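One way to make the business-impact comparison concrete is a simple triage score. The sketch below is purely illustrative and assumes each ad-hoc request arrives with two hypothetical inputs, the revenue of the affected market and a rough effort estimate; neither the `Request` fields nor the scoring rule come from the interview.

```python
from dataclasses import dataclass


@dataclass
class Request:
    title: str
    market_revenue: float  # hypothetical: annual revenue of the affected market
    effort_days: float     # hypothetical: rough estimate from the data team


def rank_requests(requests, min_effort=0.5):
    """Sort requests by revenue at stake per day of effort, highest first.

    min_effort floors the estimate so a zero-day guess cannot divide by zero.
    """
    def score(request):
        return request.market_revenue / max(request.effort_days, min_effort)

    return sorted(requests, key=score, reverse=True)
```

A score like this does not replace the conversation with the requester, but it gives both sides a shared starting point for why one request is picked up before another.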
The future of data engineering will see a significant increase in efficiency due to the adoption of automation and cloud-native tools. However, while these tools will streamline day-to-day tasks, the evaluation and qualification of tasks will remain a human-driven process, with digital tools offering assistance.
Ultimately, the quality assurance and final output will continue to require a collaborative effort between humans and machines. The key shift will be in the efficiency gained through embracing new tools, but the core human involvement in critical decision-making and oversight will remain essential.
While the idea of semantic and logical models in data modeling is exciting, we have not yet pursued an implementation because foundational issues need to be addressed first. However, the potential benefits of these models are clear: they could improve understandability, facilitate communication between teams, and make onboarding new team members much easier. I’m cautiously optimistic about the future of this approach, particularly if more native methods for describing and constructing data models emerge.
The future of data engineering will likely make it easier to understand the impact of data roles on key business metrics like KPIs, ROI, and ARR. Data teams, while traditionally revenue-neutral, will contribute to cost savings by enabling scaled efficiencies.
“The Future of Cost Savings is in Spending.” Investing in data roles will unlock significant improvements in revenue-to-cost ratios by reducing hidden costs like rework, manual fixes, and inefficient processes. By prioritizing data as a strategic business pillar, companies will become healthier and more efficient, ultimately leading to better overall performance.
Catalogs can simplify the work of analytics engineers, but their effectiveness depends on the business context and user literacy. For established businesses with complex processes, retroactively implementing a catalog can create significant backlogs and drain resources, because there are large gaps in documentation to backfill.
However, for younger businesses, starting with a catalog can help organize and document data from the outset. The key challenge with catalogs lies in user literacy; a catalog must be capable of answering unclear or ambiguous questions. Additionally, without proper controls, a catalog can lead to misuse, with users mistakenly modifying data or misinterpreting definitions. Careful, well-paced implementation alongside user education is crucial.
Data products can be valuable when defined and managed properly, particularly in terms of operations and business data models. However, careful handling is required, especially when applying an agile approach to data development. While incremental improvements can work for adding functionalities or data pieces, ensuring consistent data quality from the start is crucial.
It’s important not to compromise quality in early iterations. Additionally, while the concept of data products aligns with continuous development, selling the idea of concise, high-quality products to the business can be challenging, as stakeholders often demand more. Nonetheless, it remains a valuable approach for improving data management.
Data product development requires close collaboration between engineering teams and the business, as it cannot happen in isolation. Analysts, in particular, play a crucial role by being actively involved in continuous conversations about what is needed, what is problematic, and what should be prioritized in the roadmap.
They act as data shepherds, ensuring the development aligns with user needs and business goals. Analysts are key spokespeople for data users, contributing to the design and ongoing improvement of data products.
A good data product must prioritize transparency, especially regarding data lineage—understanding where the data comes from and the transformations it has undergone. Analysts need control, visibility, and transparency to build trust in the data.
Without clear visibility into the origins and transformations of the data, mistrust can arise. For a data product to add real value, everyone involved must be able to explain the source and flow of the numbers, particularly when presenting critical information to stakeholders like C-level executives.
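To show what explaining the source and flow of the numbers can look like in practice, here is a minimal Python sketch of a lineage lookup. It assumes lineage is available as a plain mapping from each dataset to its direct upstream datasets; the dataset names in the usage example are hypothetical.

```python
def upstream(lineage, node):
    """Return every dataset `node` depends on, directly or transitively."""
    seen = set()
    stack = list(lineage.get(node, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        stack.extend(lineage.get(current, []))
    return seen


def raw_sources(lineage, node):
    """Return the upstream datasets that have no parents of their own."""
    return {name for name in upstream(lineage, node) if not lineage.get(name)}


# Hypothetical lineage: each key lists its direct upstream datasets.
lineage = {
    "revenue_dashboard": ["fct_orders"],
    "fct_orders": ["stg_orders", "stg_payments"],
    "stg_orders": ["raw.oms_orders"],
    "stg_payments": ["raw.payments"],
}
```

With this mapping, `raw_sources(lineage, "revenue_dashboard")` resolves to the two raw tables, which is the answer an analyst needs when a stakeholder asks where a number on the dashboard originates.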
Data products help business data teams by providing a clear structure for responsibilities, road mapping, and communication. By defining product ownership, teams can work more efficiently, reducing the tendency for everyone to handle everything.
This clarity enhances the ability to communicate the value of data products to business stakeholders, demonstrating how they benefit the business and improve scalability. Additionally, data products increase visibility and transparency, making it easier to showcase business value and to move beyond the convenience of manual processes.
📝 Note from Editor
The above insights are summarised versions of Dr. Dominik Neumann’s actual dialogue. Feel free to refer to the transcript or play the audio/video to capture the true essence and details of his as-is insights. There’s also a lot more information and hidden bytes of wonder in the interview, so listen in for a treat!