1. What Do Data Professionals Do?
Learning Objective: By the end of this lesson, learners will be able to identify and describe the primary skill sets of data professionals, including Data Engineering, Reporting, Data Analysis, Experimentation, Machine Learning, and Visualization and Storytelling, in order to effectively manage and collaborate with data teams.
Activity Flow of Data Work
It’s all too common for someone not in a data field to be responsible for hiring, managing, or making requests of data professionals. This can quickly go sideways if they don’t have a solid grounding in the professional data ecosystem. By this we mean understanding the variety of backgrounds and skills a data professional might have and the outputs you should expect from them.
So, what do data professionals do? The first step to answering that question is understanding the activity flow of data work. This is a topic we cover in Data Literacy Fundamentals. As you can see in the image below, the data workflow begins with getting the data needed and generally ends with some sort of consumable data product used to inform decisions. In between, there is work to clean and transform the data into something useful, apply analytics techniques to the data (such as visualization, statistical analysis, modeling, and experimentation), and then publish the outputs.
All of these activities require specific skill sets to be successful. Many data professionals have a multitude of skill sets, but some specialize in only one part of the flow, such as getting the necessary data or creating machine learning models. Let’s talk in more depth about some groupings of data skill sets.
Data Skill Sets
Click on each of the tabs below to learn more about each group of data skill sets.
Data Engineering
Data engineering is a broad skill set that encompasses a variety of activities related to ingesting, cleaning, transforming, and modeling data into a usable form. This work makes downstream applications such as reporting or machine learning possible.
A simple example is tying together the data from two systems. Maybe you have a system that tracks customers and another system that tracks orders. Data engineering work is needed to take the data produced by the two systems, place it in an accessible database, and connect the data so you can know all the details about a customer and the purchases they have made from you.
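The customers-and-orders example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table names, column names, and sample rows are hypothetical, and an in-memory SQLite database stands in for whatever "accessible database" an organization actually uses.

```python
import sqlite3


def build_demo_database() -> sqlite3.Connection:
    """Load data from two hypothetical source systems into one database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name TEXT,
            state TEXT
        );
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            total REAL
        );
        """
    )
    # Rows that would normally arrive from the customer-tracking system.
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Ada", "IL"), (2, "Grace", "NY")],
    )
    # Rows that would normally arrive from the order-tracking system.
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(100, 1, 25.0), (101, 1, 40.0), (102, 2, 15.5)],
    )
    return conn


def orders_with_customer_details(conn: sqlite3.Connection) -> list:
    """Connect the two data sets on their shared customer_id column."""
    query = """
        SELECT c.name, c.state, o.order_id, o.total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.customer_id
        ORDER BY o.order_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in orders_with_customer_details(build_demo_database()):
        print(row)
```

The join on a shared identifier is the part that "connects the data"; in practice, much of the data engineering effort goes into making sure such an identifier exists and is reliable in both systems.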
It can get much more complex when you include needs around big data, real-time, web data feeds, machine learning production, and so on. To do this type of work, you will need someone with data engineering skills.
Reporting
While reporting can fall under data analysis or visualization and storytelling, it is common enough to call out in a section of its own. Reporting is a skill set that produces information that, most commonly, answers the question “What happened?” The output might be a dataset made available in a spreadsheet application or a highly curated dashboard. This is by far the most common activity for which non-data professionals seek out a data professional’s help.
A key thing to note for reporting is that the skill set to answer “What is happening?” is not the same skill set that will answer “Why is this happening?” or “What will happen next?” or even “What should happen?” Many data professionals have skill sets that span all of these spaces, but not all do.
Data Analysis
Data analysis is where a variety of techniques are applied to data in order to answer questions or mine for insight. The questions range from simple ones, such as “How many times did our Illinois customers call us last month?” (which might be answered via a report), to very complex ones, such as “Why did our Illinois customers call us so much last month?”
Accordingly, the techniques applied vary widely from one situation to the next, and this is the area where the skill sets of individual data professionals vary the most.
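To show how small the simple version of the Illinois question is, here is a minimal sketch that counts calls for one state in one month. The call log, its field names, and the dates are all made up for illustration.

```python
from datetime import date

# Hypothetical call log; a real one would come from a phone or CRM system.
CALLS = [
    {"customer_state": "IL", "called_on": date(2024, 3, 4)},
    {"customer_state": "IL", "called_on": date(2024, 3, 18)},
    {"customer_state": "NY", "called_on": date(2024, 3, 9)},
    {"customer_state": "IL", "called_on": date(2024, 2, 27)},
]


def count_calls(calls: list, state: str, year: int, month: int) -> int:
    """Count calls from customers in one state during one calendar month."""
    return sum(
        1
        for call in calls
        if call["customer_state"] == state
        and call["called_on"].year == year
        and call["called_on"].month == month
    )


if __name__ == "__main__":
    print(count_calls(CALLS, "IL", 2024, 3))
```

A count like this answers "how many" but says nothing about "why". The more complex question calls for additional data and the statistical techniques described in this section.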
A strong grounding in data literacy is especially important if data is being analyzed. It is exceedingly easy to reach the wrong conclusions or produce inaccurate information. Data professionals who specialize in data analysis will have training in statistics and be familiar with techniques such as hypothesis testing, correlation analysis, and segmentation.
Experimentation
While experimentation technically falls under the realm of data analysis, it is a specialization worth calling out. Experimentation skill sets are focused on testing hypotheses with well-designed experiments to answer questions about the effectiveness of strategies, which of several options should be chosen, and why a particular outcome is being seen. The skill to design experiments properly, so that the results are valid and can be applied to broader situations, is distinct from the ability to analyze the results and is not particularly common.
So if experimentation is a data strategy you want to lean into, you’ll want professionals with these skill sets.
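To make the "analyze the results" half of experimentation concrete, here is a minimal sketch of one common technique, a two-proportion z-test, applied to a hypothetical comparison of two website versions. The function name and the visitor and conversion counts are assumptions for illustration, and this is only one of several ways such results could be analyzed.

```python
import math


def two_proportion_z_test(
    conversions_a: int, visitors_a: int, conversions_b: int, visitors_b: int
) -> tuple:
    """Return (z statistic, two-sided p-value) for a difference in rates.

    Assumes both groups have visitors and the pooled conversion rate is
    strictly between 0 and 1; otherwise the standard error would be zero.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the hypothesis that both versions perform the same.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_error = math.sqrt(
        pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b)
    )
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: version A vs. version B of a web page.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

The arithmetic is the easy part. Deciding how visitors are assigned to each version, how many are needed, and when to stop the test is the design work this section describes, and none of it is captured in the code.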
Machine Learning
The basics of machine learning increasingly fall under the broader data analysis umbrella. However, skill sets that specialize in putting a model into production, keeping it updated, and tuning it are essential when a model is being used to inform, or even make, critical business decisions. In addition, data professionals who specialize in machine learning tend to have deep knowledge of a broader variety of methodologies and are best at choosing the right model for the right situation to optimize performance and outcomes.
Finally, individuals with skill sets centered around machine learning are sources of expertise in areas such as time-series forecasting (predicting future volumes such as cash flow or interactions), computer vision (allowing computers to interpret the visual world), or deep learning (developing virtual assistants).
There is a lot of overlap in this space, but there are also opportunities for highly specific expertise.
Visualization and Storytelling
It may seem like visualizing data and telling stories with data would fall under the reporting or data analysis skill sets. While these combinations of skill sets can often be found in the same data professional, they are different. In fact, visualization and storytelling are distinct from each other as well, but related enough that we’re putting them in the same section. Visualization skill sets involve a deep understanding of how human perception works as well as expansive knowledge of various ways to visualize data.
These skill sets are combined to produce data products that make it easy for people to consume the insights being made available. This can often involve the technical ability to create custom data products outside of common visualization tools (Tableau, Power BI, etc.).
Storytelling with data involves visualization skill sets, data analysis, and exposition. If specific insights need to be understood or acted upon, there is a need for storytelling skill sets. This is an area that can be – and often is – done by data professionals, or non-data professionals, with a minimum of technical ability. However, as with data analysis, without a solid grounding in data literacy it is very easy to go wrong unintentionally.
Picture an individual who is skilled at making convincing arguments with data but not particularly skilled in how to properly interpret data, the limits of data, or the ethics of data. Such individuals can do a great deal of damage, usually unintentionally, by convincing individuals and organizations of things that are incorrect.
Mapping Data Functions to Data Activities
Now that you have an overview of the activity flow of data work and an understanding of each group of data skill sets, it’s time to connect the two. The image below maps the importance of each data skill set, or function, to each step in the data activity workflow. Let it serve as a guide to help you hire the right data professional to achieve your organization’s goals.
Case Study: Building a Data Team
Situation Background
Meet Gabrielle, the director of operations at a mid-sized retail company called TrendyStyles. TrendyStyles has been growing rapidly and is now looking to leverage data to make better decisions and improve efficiency.
Problem
Gabrielle has been tasked with building a data team but is unsure about the specific skill sets and roles she needs to hire to address the company’s data needs.
Question
What types of data professionals should Gabrielle consider hiring to create a well-rounded data team that can address the various data needs of the company?
Decision and Outcome
After learning about the different roles and skill sets of data professionals, Gabrielle decided to hire a data engineer, a data analyst, and a data visualization expert. The data engineer focused on creating a solid data foundation, the data analyst provided valuable insights to support decision-making, and the data visualization expert ensured that the insights were easily understandable and actionable. This diverse team helped TrendyStyles make data-driven decisions, leading to increased efficiency and growth.
Knowledge Check
Which data skill set is best suited for each situation? Consider the following scenarios, and then hover over each scenario to reveal the answer.
Your company wants to implement a new feature on the website and needs to conduct A/B testing to evaluate its effectiveness.
Experimentation
The leadership team asks for a visual representation of sales performance over the last fiscal year, broken down by regions and product categories.
Visualization and Storytelling
You have a large dataset from a customer survey, and you need to understand the key segments in your customer base.
Data Analysis
The server is overloaded because of the large amounts of raw data being ingested from various sources. The system needs to be optimized.
Data Engineering
Your company’s AI model that predicts customer churn is outdated and needs to be retrained with the latest customer data.
Machine Learning
Each team needs to track their KPIs on a dashboard that updates automatically and that they can access at any time to see how they are doing relative to their goals.
Reporting
Key Takeaways
There are different types of data work and different professionals have different skill sets. Not every data professional will have skills in all the different types of work, so it’s important to understand the types of work you need in your organization.
- Data Engineering: A broad skill set that encompasses a variety of activities related to ingesting, cleaning, transforming, and modeling data into a usable form.
- Reporting: A skill set that produces information that, most commonly, answers the question “What happened?”
- Data Analysis: A skill set that uses various techniques (hypothesis testing, correlation analysis, segmentation, etc.) to answer questions such as “Why did that happen?”
- Experimentation: Technically a skill set that falls within data analysis but is a specific specialization focused on testing hypotheses with well-designed experiments.
- Machine Learning: Another area that can fall within data analysis but is a specialization in its own right. Specialists have deep knowledge of a broader variety of methodologies and are best at choosing the right model for the right situation to optimize performance and outcomes.
- Visualization and Storytelling: Visualization skill sets involve a deep understanding of how human perception works, as well as expansive knowledge of various ways to visualize data. Storytelling with data involves visualization skill sets, data analysis, and exposition.
Now that you have an overview of the various types of data skill sets you’ll encounter in data professionals, we’ll move on to how to hire a data professional, even if you yourself are not one.
Before that, though, launch the Lesson 1 quiz below to test your comprehension of the material covered in Lesson 1!