

Three Ways Researchers are Using Data Science for Good
Posted by Paxtyn Merten, October 23, 2018

Data experts have long identified marginalization and narrow-minded problem solving as some of the biggest challenges facing data science. When large technology enterprises seek solutions only to the problems they face within their own companies and communities, they exacerbate inequalities.
But companies, nonprofits, and individuals across the globe are making it a priority to address problems such as inequality, poverty, and poor health, and they’re using data science to do it. Here we explore three ways data science is being applied to societal problems.
Detecting autism early with machine learning
Early identification of autism and other behavioral disorders can help people affected live higher-quality lives. That’s why researchers and developers created Cognoa, a digital health company that uses machine learning to detect these disorders.
The company’s app, Cognoa for Child Development, allows parents to provide data about their child’s behavior from home. The FDA-approved diagnostic app then identifies and reports the child’s risk for developmental delay and autism in minutes.
The app is available for free on Google Play and the App Store. However, parents can use the service only if their employer offers Cognoa. It has been used by 300,000 families, according to its description in the App Store.
The app was developed through five years of clinical research at Harvard and Stanford’s medical schools. The company optimized its predictive system by combining the results of multiple machine learning algorithms, which were trained on different media to identify autism. Its researchers are now looking into ways to simultaneously screen for multiple conditions.
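To make the idea of combining several models concrete, here is a minimal sketch of one common approach, a weighted average of each model's predicted probability (sometimes called soft voting). It is not Cognoa's actual method, which the article does not describe in detail. The function names, the weights, the 0.5 threshold, and the two example inputs (a questionnaire model and a video model) are all hypothetical.

```python
from typing import Optional, Sequence


def combine_risk_scores(
    scores: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Return the weighted average of per-model risk probabilities."""
    if not scores:
        raise ValueError("at least one model score is required")
    if weights is None:
        weights = [1.0] * len(scores)  # give every model an equal say
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


def needs_follow_up(combined_score: float, threshold: float = 0.5) -> bool:
    """Flag a case for follow-up when the combined risk reaches the threshold."""
    return combined_score >= threshold


# Hypothetical outputs from two models trained on different inputs,
# for example a parent questionnaire and a home video (made-up numbers).
questionnaire_score = 0.75
video_score = 0.25

# Assumption for illustration: the questionnaire model is trusted three times as much.
combined = combine_risk_scores([questionnaire_score, video_score], weights=[3, 1])
print(f"combined risk: {combined:.3f}, follow up: {needs_follow_up(combined)}")
```

Averaging probabilities rather than taking a majority vote keeps information about how confident each model is, which matters when only two or three models are being combined.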
Combating poverty in Sub-Saharan Africa
A number of groups are finding ways to use data to combat poverty, particularly in rural areas in Africa.
DrivenData hosts international challenges online to encourage participants to create statistical models that address social issues. In one case study, participants promoted mobile money usage in rural Tanzania, which provides more income stability in those populations.

The research and design process that teams used during the DrivenData case study in Tanzania.
Though much of Tanzania’s population has cell phones, mobile money was slow to catch on. Project participants found that was because Tanzanians didn’t trust digital currencies. So they worked with qualitative and quantitative data to promote relationships between those users and digital financing agents.
In a similar vein, Fenix International provides combined solutions for people in Africa to build credit and access solar energy. The company says its machine learning techniques allow individuals to access solar energy that was previously cost prohibitive.
Fenix provides solar panels and batteries to people in poverty through payment plans. The company then mines data from those devices. It also draws on demographic information, repayment patterns, weather and climate data, and satellite data.
Data analysts use that data to build models that predict repayment and develop credit histories for individuals using their service. The majority of those users don’t have any previous documented financial history. Fenix’s credit profiles can allow impoverished families to access credit for loans and utilities, eventually improving their quality of life.
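As a rough illustration of what a repayment model can look like, the sketch below fits a small logistic regression with plain gradient descent. It is not Fenix's model: the two features (share of payments made on time and normalized daily device usage), the toy training rows, and every function name are assumptions made up for this example.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def repayment_probability(
    features: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Predicted probability that a customer repays, under the fitted model."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic(
    rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.5,
    epochs: int = 5000,
) -> Tuple[List[float], float]:
    """Fit logistic regression weights with batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = repayment_probability(x, weights, bias) - y
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        # Step against the average gradient of the log loss.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy, made-up training data: (on-time payment rate, normalized device usage).
rows = [
    (0.90, 0.8), (0.80, 0.6), (0.95, 0.7), (0.70, 0.9),  # repaid
    (0.30, 0.2), (0.20, 0.4), (0.40, 0.1), (0.10, 0.3),  # defaulted
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_logistic(rows, labels)
reliable = repayment_probability((0.9, 0.8), weights, bias)
risky = repayment_probability((0.2, 0.2), weights, bias)
print(f"reliable payer: {reliable:.2f}, risky payer: {risky:.2f}")
```

A real credit model would use far more data, held-out validation, and checks for bias against particular groups. The point here is only that observed payment behavior can stand in for a documented financial history when producing a score.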
Watch ourselves: Working toward ethical and bias-free data work
Instead of working on external problems, some data-science-for-good efforts are introspective. They work to ensure data-dependent systems aren’t working against society or any particular group of people.
Laura Noren, the director of research at Obsidian Security and a professor of ethics for data science, noticed students in her classes were adept at finding ethical gaps in their peers’ work but missed the mark when evaluating their own. So, she developed a 90-minute interactive workshop to help researchers uncover their internal biases and learn how to address those biases in their work.
The workshop reviews how research ethics can benefit data science and looks at real-world examples. Attendees then learn to develop strategies for intervening and for taking ethical approaches that minimize negative societal impacts. Noren will bring this workshop to ODSC West, too.
Others working in the same realm try to promote inclusionary design in data-driven work. Frances Haugen, the director of data product at Gigster smart development service, recently pushed Pinterest to give users the option to filter searches by skin tone — and succeeded.
She regularly works to build AI tools aimed at inclusion, and she promotes best practices to limit bias in machine learning. If developers don’t keep inclusion in mind, “objective” artificial intelligence can end up perpetuating the same biases and systematic discrimination as people would. Those committing time and money to these goals are central to the “data for good” movement.
Speakers at ODSC West Oct. 31-Nov. 3 who will deliver lectures and lead workshops on the topic of data science for good include:
- Ford Garberson, a senior data scientist at Cognoa, will present on the company’s diagnostic app.
- Peter Bull, the co-founder of DrivenData, will walk through the Tanzania mobile money case study in detail.
- Brianna Schuyler, the director of data science at Fenix International, will talk about how her company creates credit scores through providing solar energy to people in Africa.
- Laura Noren, the director of research at Obsidian Security, will lead a workshop to reveal data scientists’ biases.
- Frances Haugen, the director of data product at Gigster smart development service, will discuss how to design with an eye towards inclusion.
- David Smith, a cloud developer advocate at Microsoft, will lead a workshop and discussion on the projects AI for Earth has been working on, using artificial intelligence for biodiversity protection, precision agriculture, and conservation.
- Ahna Girshick, a senior computational research scientist at Ancestry DNA, will give a lecture about the genetic network of 10 million people Ancestry has built and how it has used machine learning to help users find their genetic communities and discover more about their origins.
- Sydeaka Watson, a senior data scientist at AT&T’s Chief Data Office, will give a talk about racial bias in police interactions and how segmentation analysis can help researchers study it.