5 Ethical Issues Data Analysts Face in the U.S.


I used to be amazed at how websites knew that I was looking for my next “vacation to Cancun” or browsing for “cheap air tickets”, popping up options and sites that I had browsed earlier for that very information. It was almost creepy, to the point where it felt like my moves were being tracked by some unknown “powers” inside my computer. Then I realized that my every click on a website and every post on social media was being collected as “data” for future use: data through which my “click-stream behavior” could be tracked using “cookies”, and from which patterns could be drawn to gauge my behavior and that of other consumers. After all, this is an ingenious way to identify a target audience, build on the analysis and market products or services accordingly. Today, the field of “Data Analytics” has me fascinated by the opportunities and insights such “data collection” can offer, but it also has me concerned about where to draw the line and about the risk of unintended consequences.

Whilst I was attending a “talk series” by data analytics pundits, a brave soul in the audience raised the issues of “ethics” and “customer privacy” that data analysts face in the United States. What transpired was a discussion that I will try to break down in this article as “The Five Ethical Issues That Data Analysts in the United States Encounter”.

  • The Impact on Privacy

One of the major issues data analysts face is the overwhelming amount of private information they are privy to, and where to draw the line with it. A great example is Target’s pregnancy targeting, in which Target figured out a way to know whether a woman was pregnant long before she even started shopping for diapers. This was done using predictive data analysis: patterns were drawn from the buying behavior of pregnant women and then matched against other women shoppers whose purchases started showing similar signs. These “probably” pregnant women then became targets of baby ads and coupons, which eventually proved to be a winner for Target. Whilst most may argue that privacy laws were not violated in this case, as the data was inferred from pre-existing, available information, the motive behind using this data to draw an inference is questionable. “While people might know that information is being collected about them, knowing this and knowing that the data will be analyzed for various purposes are two different things. As such, it can be argued that private data is being gathered without proper informed consent and this is morally wrong.” (LaBossiere, 2012). Data analysts struggle with the questions of “how much digging around is acceptable” and what such research can unearth that might negatively impact lives. It does not help that there is a clear lack of clarity in the laws around “informed consent.”

  • Fabrication and Falsification of Data

Falsification is the process of manipulating research materials, equipment, or processes in such a way that the results of the research are no longer accurately reflected in the research record. Fabrication is the practice of inventing data or results and recording and/or reporting them in the research record. These are probably the most serious offenses a data analyst can commit, as they challenge the whole integrity of the research, so analysts constantly tread cautiously. However, Ghiselin (1989) reported that subtle forms of biases are “ubiquitous to the point that it is not considered unethical. Everybody distorts things just a little bit at least, and everybody else knows it.” The American Statistical Association (ASA) developed ethical guidelines for statistical practice (Ad Hoc Committee for statistical practice, 1983), five of which pertain directly to data analysis. First, the findings of the research should be presented honestly and openly. Second, statements that are not true should not appear in one’s report. Third, data references and inferences should be clearly stated. Fourth, one’s findings and assumptions should be completely documented. Finally, statistical measures should be applied without bias. Therefore, it is now more important than ever for data analysts to avoid data selectivity (discarding data that tends not to support their hypothesis) and to ensure a fair and transparent representation of the facts, supported by accurate research and analysis.
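To make the idea of data selectivity concrete, here is a minimal sketch in Python. The numbers are invented purely for illustration, and the helper name `mean_of_kept` is hypothetical. It shows how quietly dropping the observations that contradict a hypothesis (here, "the campaign increased sales") can change the headline figure.

```python
from statistics import mean


def mean_of_kept(observations, keep=lambda x: True):
    """Return (mean, count) of the observations that pass the `keep` filter."""
    kept = [x for x in observations if keep(x)]
    return mean(kept), len(kept)


# Hypothetical weekly sales changes (%) after a campaign.
# These values are made up for illustration only.
changes = [4.0, -3.0, 2.0, -5.0, 6.0, -2.0, 1.0, -3.0]

# Honest analysis: every observation is used.
full_mean, full_n = mean_of_kept(changes)

# Selective analysis: weeks that contradict the hypothesis are discarded.
selective_mean, selective_n = mean_of_kept(changes, keep=lambda x: x > 0)

print(f"All {full_n} weeks: mean change {full_mean:+.2f}%")
print(f"Only {selective_n} 'supporting' weeks: mean change {selective_mean:+.2f}%")
```

Reporting how many observations were kept alongside the result is one simple way to make any exclusion visible to the reader.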

  • Data Security

Whilst there are rules in place on what data can and cannot be researched, how data needs to be treated and so on, data analysts are also required to protect the data they conduct research on. Most data analysts have a legal responsibility to ensure that data is stored safely and securely. This can sometimes lead data analysts to question the extent of their job responsibility. As one data analyst rightly said, “It is imperative for me, as a data analyst within the company to make sure the financial numbers of the company are reported accurately without inflating them and making sure my numbers and analyses highlight the correct profitability/financial viability of the organization, but how can I control if this data gets hacked and then information is misrepresented?” In today’s world, the extent to which data analysts influence research is massive. An unprecedented increase in competition and the lure of profits can sometimes lead data analysts to adopt corrupt practices to gather data, for example bribing government agencies to secure data that is not publicly available. This is how secured data, otherwise not intended for public use, can be leaked and made available for further research and exploitation. Whilst the security of data may not be the primary responsibility of a data analyst, it would be ethically wrong to be callous about that data.

  • Potential Harm and Risks

In the world of data research, there is a high possibility that, whilst relationships between variables are being examined, a person’s or group’s identity may get exposed. The information in a qualitative study may have negative consequences, which can then lead to the loss of the data analyst’s job and, in extreme cases, to being sued or arrested. As Al Gore said, “Science has a culture that is inherently cautious and that is normally not a bad thing. You could even say conservative, because of the peer review process and because the scientific method prizes uncertainty and penalizes anyone who goes out on any sort of a limb that is not held in place by abundant and well-documented evidence”. As researchers, data analysts have often been threatened with dire consequences for reporting or issuing work that may expose wrongdoing by a person, party or organization. Most qualitative researchers have, at some point in their lives, been subjected to enormous pressure to tweak their findings in a bid to maintain harmony or keep issues under wraps. Whilst risk and harm can work both ways, in compromising patient or subject (person) integrity and in threats to data analysts (for example, in the form of litigation), it is nevertheless the moral duty of data analysts to report factual and accurate data, free of subjective perception.

  • Use and Misuse of Results

Data analysts have mastered the art of converting seemingly unconnected data into organized pieces that make thorough sense. They can turn volumes of unrelated data into relevant information and help organizations make better business decisions with it. However, one of the most common challenges data analysts face is getting this information into the right hands. Misuse of information takes many forms, and as more individuals gain access to technology, the chances of misuse multiply. Data analysts wrestle with whether their findings will be represented correctly. More often than not, they worry about the negative repercussions of misuse and misinterpretation of their data. After all, misuse of a data analyst’s information can create long-term risks that may eventually malign the reputation of said analyst and destroy his/her career prospects. Therefore, it is imperative that tight control is placed on “Confidentiality Protection”, so that researchers can provide more meaningful and insightful analysis of the information provided without being concerned about data misuse.


The root cause of questionable ethical practices in data gathering can be attributed to a researcher’s personal attachment to his/her theories (Birch, 1990; Chamberlain, 1965). This attachment leads researchers to focus on proving these theories correct rather than actually testing them. As hard as it may seem, addressing this ethical concern may help resolve conflicts over the other ethical issues faced by data analysts that are mentioned in this article. Intervention and advocacy to do the right thing at the right time is perhaps the most sacred code of conduct every data analyst should live by. Whether codes of proper professional conduct are made explicit or remain implicitly embedded in the practices of the group to which one belongs is not the point, even though making such norms explicit may be desirable. The point is that membership in a professional community carries with it binding collective obligations and forces us to view ethics from a shared perspective (Soltis 1990:250).

References and Sources: