The digital revolution has brought with it a way of gathering data that is more efficient than ever before. In his article, Joshua Blumenstock points out the international-development problems that can be addressed with these large amounts of data. At the same time, he discusses the pitfalls we inevitably meet when using data, and their possible consequences. Having proposed potential improvements, he sees a promising future for data science in general. From my perspective, “good intent”, “transparency”, and a “balancing act” are three integral elements of problem-solving in data science.
After the data is gathered, the job is to write code and train models on it, building the algorithmic logic that lets computers teach themselves how to analyze data. It is true that, with the help of powerful algorithms, varied situations and complicated data can be reduced to a general problem, and the most efficient suggestion can be proposed. However, the output can only be a suggestion rather than a feasible solution, because algorithms do not adapt flexibly to different situations. This is where good intent needs to play its role in data science. As Joshua Blumenstock mentions in his article, we need to “customize further” and “deepen collaboration” in order to take human factors into account. To that end, regulators need transparency, which means allowing them to access the source code. This makes it possible for regulators to identify relationships between inputs and outcomes, spot possible biases, and find routes to fixing problems[1]. It is a process that combines good intent with the rigid algorithm, and it leads to greater problem-solving efficiency.
An example that supports my point is my hometown, Taiyuan. With increasing demand from industry, daily life, and transportation, all based on fossil fuels, Taiyuan suffers from serious urban air pollution. In theory, if an algorithm could take into account all the factors that contribute to the local air pollution, its final result might suggest that Taiyuan Iron & Steel Corporation, one of the largest metallurgical enterprises in China, should shut down or move outside the city. This suggestion would certainly be effective in dealing with the local air pollution. However, it does not take “good intent” into account, and so it may cause “unanticipated effects” such as a surge in the local unemployment rate.
As Kayla Seggelke says, when facing a trade-off between efficiency and “good intent”, it can be hard to find a balance. Nevertheless, tackling problems with data science is still the most effective approach today compared with traditional ways of collecting data, such as interviews and questionnaires. Thus, we need humbler data scientists who can bring in viewpoints from outside the math.
[1] Hosanagar, K. & Jair, V. (2018). We Need Transparency in Algorithms, But Too Much Can Backfire. Harvard Business Review.