There is a lot of talk lately about data, especially big data, and how it can be used to help organizations learn more about the people connected to them: employees and customers. The term data science gets tossed around casually, now that we have the tools and computing power to trivially handle these massive, often unstructured, data sets.
Luckily, in addition to the recent influx of interest, there are many established experts in this space who are helping to guide the conversation about how people data should, and should not, be used, from both an ethical and a practical standpoint.
As an experienced researcher focused on understanding how people – employees, customers, and users – perceive and experience the environments and systems they interact with, I am a data junkie at my core. The insane amount of data available to us in the current era is both amazing and staggering, and there is a reason that the outcome of some organizational and product research is analysis paralysis rather than informed decision making.
I love rooting around in a huge data set as much as the next person who shed blood, sweat, and tears to collect and organize it, but this type of data exploration with no basis in theory often leads us to find spurious relationships and draw erroneous conclusions.
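To see how easily this happens, here is a minimal sketch in Python. It does not come from any real data set, and every name and number in it is an assumption chosen for illustration. It builds an outcome made of pure random noise for 50 hypothetical people, checks 200 equally random variables against it, and counts how many reach a correlation of 0.28, which is roughly the conventional two-tailed p < .05 cutoff for a sample of that size.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_chance_hits(n_people=50, n_variables=200, threshold=0.28, seed=0):
    """Count random variables that 'correlate' with a random outcome.

    All defaults are illustrative assumptions. The threshold of 0.28 is
    approximately the two-tailed p < .05 cutoff for 50 observations.
    """
    rng = random.Random(seed)
    # The outcome is noise: by construction, nothing truly predicts it.
    outcome = [rng.gauss(0, 1) for _ in range(n_people)]
    hits = 0
    for _ in range(n_variables):
        predictor = [rng.gauss(0, 1) for _ in range(n_people)]
        if abs(pearson_r(predictor, outcome)) >= threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    print("Variables passing the cutoff by chance:", count_chance_hits())
```

Because every variable here is noise, any hit is spurious by construction. With a cutoff of about 5 percent, we should expect roughly 10 of the 200 to pass on average, and nothing in the numbers alone distinguishes them from a real relationship. A theory about which variables should matter is what does that.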
Providing Context to Data Collection
In this era of seemingly limitless data, we need to take a step back to look at how we are deciding what questions to ask and what data to collect. Without this planning, without this basis in sound methods, we find ourselves forging exciting new paths, chock full of information, but often with little to no basis in practical application or significance. And without the proper scientific or subject matter expertise, we cannot determine if what we found is, in fact, accurate, useful, or even plausible.
Thus, when faced with copious amounts of data, we need to get back to (research) basics to make sure we stay focused on how data can help inform our understanding of organizations, systems, and their members/users, rather than get caught up in the latest and greatest analysis technique. Here are a few big questions to consider.
What is the basis of our research?
Clarity and truth should be sought out from the beginning, and this starts with defining goals and purpose. We have the greatest ability to modify our plans and our questions at this stage of the process, based on what we learn as we consult stakeholders and those who know the situation best.
As researchers, we have our own expertise, but we cannot afford to overlook the valuable insight provided by those closest to the subject matter we are trying to understand. Their input helps define our research question, data collection plan, and analytics strategy, all of which are crucial to finding meaningful insights.
What data do we have or can we gather? How do we handle the data once we have it?
Once we’ve defined what we want to know, we need to figure out how to get there. Determining how to get the data needed can have important implications for what questions we can answer and what insights we may find. If we collect new data, we have control over how, when, and from whom it is collected, but we must also spend the time, energy, and budget to do it. These resources can be tight, and we are not always able to ask employees or users more questions.
We should lean on stakeholders and those closest to the subject matter to help us understand which existing data can support our research, and to explain the context around how it was collected, from whom, when, etc. I am all about leveraging what we already have, as long as we clearly understand the limitations and caveats of the data.
We must also be sensitive to the fact that we are using people data, which requires extra care when we handle and analyze it. We oftentimes promise our participants anonymity or confidentiality, and it is our responsibility to respect those promises, especially since ignoring them hurts the trust we have established. Being careful to match our approach to the context, not just the content, of our data helps ensure we draw appropriate conclusions in a way that respects the individuals we’re hoping to better understand in the first place.
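One common way to honor a promise of anonymity when reporting results is to withhold figures for groups that are too small to hide an individual. The sketch below is a hypothetical illustration of that idea, not a prescribed method: the function name, the (group, score) record layout, and the minimum group size of 5 are all assumptions, and the right threshold depends on what participants were actually promised.

```python
from collections import defaultdict


def group_means_with_suppression(records, min_group_size=5):
    """Average scores per group, withholding groups that are too small.

    records: iterable of (group_name, score) pairs (an assumed layout).
    min_group_size: assumed reporting threshold; 5 is only an example.
    Returns a dict mapping group name to its mean, or None if suppressed.
    """
    scores_by_group = defaultdict(list)
    for group, score in records:
        scores_by_group[group].append(score)

    report = {}
    for group, scores in scores_by_group.items():
        if len(scores) < min_group_size:
            # Too few respondents: reporting a mean could expose individuals.
            report[group] = None
        else:
            report[group] = sum(scores) / len(scores)
    return report


if __name__ == "__main__":
    # Made-up survey responses on a 1-5 scale.
    responses = [("Engineering", s) for s in (4, 5, 3, 4, 4)]
    responses += [("Legal", s) for s in (2, 5)]
    print(group_means_with_suppression(responses))
```

In this made-up example, Engineering has five responses and gets a mean, while Legal has only two and is withheld. Suppression on its own is not a guarantee of anonymity, since totals and subgroup figures can sometimes be combined to recover a hidden value, so it complements careful handling and does not replace it.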
So what, and what next?
Having collected and analyzed our data, we find ourselves ready to answer the questions we set out to investigate in the first place. Sometimes, we find great insights and are equipped to inform decisions or ask more complex questions. Other times, we find ourselves without answers, but with a solid basis for going back to our stakeholders (and the drawing board) and trying again. In other cases, we find unexpected answers, but these can still be used to inform the path forward.
No matter what we find, we need to ask ourselves why it matters. What is the value? Can we take action based on what’s been learned? Are the differences significant in a practical sense? Is there anything that could be done differently based on what we know? Though it’s easy to be seduced by pretty graphs and low p-values, on their own these do not tell us whether a finding makes life better for our stakeholders.
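The gap between statistical and practical significance is easy to show with a little arithmetic. The sketch below uses invented numbers, not results from any study: two groups of 20,000 people each rate their satisfaction on a 5-point scale, with means of 3.50 and 3.53 and a shared standard deviation of 1.0. The function name and its simplifications (equal group sizes, equal spread, a normal approximation for the p-value) are assumptions made to keep the example short.

```python
import math


def compare_groups(mean_a, mean_b, sd, n_per_group):
    """Two-sample z-test plus Cohen's d from summary statistics.

    Assumes equal group sizes, a common standard deviation, and samples
    large enough for the normal approximation to be reasonable.
    Returns (z, two_sided_p_value, cohens_d).
    """
    diff = mean_b - mean_a
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = diff / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # Cohen's d: the difference expressed in standard deviation units.
    cohens_d = diff / sd
    return z, p_value, cohens_d


if __name__ == "__main__":
    z, p, d = compare_groups(mean_a=3.50, mean_b=3.53, sd=1.0, n_per_group=20000)
    print(f"z = {z:.2f}, p = {p:.4f}, Cohen's d = {d:.2f}")
```

With these inputs the p-value comes out just under 0.003, comfortably "significant" by the usual standard, yet the difference is three hundredths of a standard deviation. Whether a shift that small is worth acting on is a question for stakeholders, not for the test.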
When it comes to data, I may be trained as a researcher and statistician, but I practice as a pragmatist and skeptic in my daily work. In a world where we have so much at our fingertips, we must focus on the efforts that will lead to the most useful insights, so we can make the most positive change for our employees and users.