Learning data science and doing it are two different things. At school, stats professors teach us how to curve-fit the “perfect” machine learning model but do not teach us how to be practical, how to manage a project, and how to listen to clients’ needs.
Formal education can sometimes fail us, and fail us terribly.
When I talk to more seasoned data scientists, their most common complaint about junior data practitioners is that they lack the ability to work in an industry setting.
A huge gap exists between knowing data science (DS) and actually doing it. In this post, I’d like to share 4 non-tech success ingredients that I’ve learned from managing a program evaluation project.
As a PhD candidate, I’ve received 5+ years of training in research design and data science. A few years back, I had a statistical consulting engagement with an educational company.
This company trains underprivileged students in social etiquette and professional development. It operates like an NGO and relies heavily on public donations and government grants. Despite its excellent programming, the company finds itself in a financial predicament because it can’t quantify its social impact. Its grant applications have been rejected several times by companies and foundations.
As a side note, it is always difficult to quantify a concept as intangible as social impact. Program evaluators need to be data savvy and also have strong domain knowledge.
My job is to come up with a valid research design to quantify their good work.
From scratch. Well, sort of.
This is easier said than done. When I first started, I was surprised to learn how unprepared and disorganized the data team was. The lack of preparation makes it impossible to redesign their workflow. To name just one example: although they are an educational training provider, they don’t even have a Program Objective or Learning Outcome Statements (LOS).
For those who are not so familiar with the field, these two elements are the fundamental ingredients for program evaluation. Without LOS, there is no way we can prove whether the program has achieved its goal or not.
Ingredient 1: Understand business questions really well
First things first. I sit down with the leadership and work out what their specific objectives and goals should be. To quantify the results, I develop a new set of metrics spanning three dimensions: Knowledge, Attitude, and Behavior.
- Knowledge. What is professional conduct? One of the key LOS is to educate students on what professionalism is.
- Attitude. What is the right attitude in professional settings? This is consistent with the LOS.
- Behavior. How do you behave in a professional way? This is also consistent with the LOS.
The process takes about 1–1.5 weeks.
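As a rough illustration of how the three dimensions could become numbers, here is a minimal sketch that averages survey answers per dimension. The question IDs (`q1` to `q6`), the 1-5 answer scale, the item-to-dimension mapping, and the `score_response` helper are all hypothetical; the actual survey instrument is not described here.

```python
from statistics import mean

# Hypothetical mapping of survey items to the three dimensions.
DIMENSIONS = {
    "knowledge": ["q1", "q2"],
    "attitude": ["q3", "q4"],
    "behavior": ["q5", "q6"],
}

def score_response(response):
    """Average the 1-5 answers that belong to each dimension."""
    return {
        dim: mean(response[item] for item in items)
        for dim, items in DIMENSIONS.items()
    }

# One invented student response, for illustration only.
example = {"q1": 4, "q2": 5, "q3": 3, "q4": 4, "q5": 2, "q6": 3}
scores = score_response(example)
print(scores)
```

Keeping one score per dimension makes it easy to report each Learning Outcome Statement separately instead of hiding everything in a single total.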
With these goals in mind, the next step is to collect data. I ask for every piece of data they have, including any surveys they have run. I also request access to their database and to the rejection letters from their unsuccessful grant applications.
This is a critical step. Rejection letters explain why the applications failed and point us toward what to do next. Knowing what has already been done helps us understand what else should be done. There is no need to reinvent the wheel.
Unsurprisingly, the most common reason for rejection is the lack of quantifiable results. The company has provided descriptive statistics but no inferential ones. Descriptive statistics tell us how students in the group perform; inferential statistics tell us whether the program makes a difference.
Ingredient 2: A simpler solution is a better solution
Junior data scientists often fall into a common pitfall: preferring fancy, sophisticated solutions that show off what they can do over simple solutions that solve the problem.
A 10-layer deep learning model over a simple regression?
Simplicity is crucial in the program evaluation field. We want to impress our stakeholders and foundations with our great work. Confusing them with less interpretable models is the last thing I’d do.
With this caveat in mind, I adopt a simple quasi-A/B test design and use a few common ways of analyzing and presenting the results.
- Mean Scores. Compare the averages between the treated and control groups.
- Two-Sample T-Test. Confirm that the treatment and control groups are comparable.
- Maturation Check. As a robustness check, I rule out a maturation effect, since the scores of participants in the control group stay the same over the duration of the experiment.
- Paired T-Test. Compare participants’ scores before and after the program. The analysis shows a modest but statistically significant increase after participation.
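The four checks above can be sketched in a few lines of Python using only the standard library. Every score below is invented for illustration and is not the project’s data. The helper names `welch_t` and `paired_t` are hypothetical, and running the two-sample comparison on pre-program scores is an assumption about how comparability was checked. The sketch reports t statistics only; p-values need the t distribution, for example via `scipy.stats.ttest_ind` and `scipy.stats.ttest_rel`.

```python
from math import sqrt
from statistics import mean, stdev, variance

def welch_t(a, b):
    """Two-sample (Welch) t statistic for two independent groups."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def paired_t(before, after):
    """Paired t statistic for the same participants measured twice."""
    diffs = [y - x for x, y in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical survey scores (0-100); each position is one student.
treat_pre = [62, 58, 65, 70, 61, 59, 66, 63]
treat_post = [68, 63, 69, 76, 66, 62, 73, 67]
ctrl_pre = [61, 60, 64, 69, 62, 58, 67, 62]
ctrl_post = [62, 59, 65, 68, 63, 58, 66, 64]

# Mean scores: average gain in each group.
mean_gain_treat = mean(treat_post) - mean(treat_pre)
mean_gain_ctrl = mean(ctrl_post) - mean(ctrl_pre)

# Two-sample test: are the groups comparable before the program?
baseline_t = welch_t(treat_pre, ctrl_pre)

# Control-group pre/post check: did scores drift without the program?
maturation_t = paired_t(ctrl_pre, ctrl_post)

# Paired test: did the treated students improve?
program_t = paired_t(treat_pre, treat_post)

print(f"gain (treated) = {mean_gain_treat:.2f}, gain (control) = {mean_gain_ctrl:.2f}")
print(f"baseline t = {baseline_t:.2f}")
print(f"control pre/post t = {maturation_t:.2f}")
print(f"treated pre/post t = {program_t:.2f}")
```

With 8 pairs (7 degrees of freedom), the two-sided 5% critical value for a paired t statistic is about 2.365, so in this made-up data only the treated group’s change would count as significant. The whole analysis fits on one screen, which is much of the appeal when the audience is a grant committee.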
Ingredient 3: Communication is King
Throughout the project, the most challenging part is turning vague business questions into testable DS questions. This is really not easy. I sit down with the team several times and bounce around ideas about what is testable and what isn’t. The process lasts a couple of weeks, and we reluctantly have to give up some ideas and focus on the few that work.
During the process of brainstorming, my non-tech colleagues help me better understand the corporate structure and unique needs. In return, I help explain what is feasible to test statistically. Great ideas bounce around and the conversation lasts for a long time.
As the data guy on the team, I bring my ears more often than my mouth, listening to and understanding their needs and concerns before speaking up.
As the saying goes, 20% of successful project management comes from good models and 80% comes from the non-tech parts.
Ingredient 4: Time Management
This is the last non-tech component, but an essential one. To keep things going, we are on a tight schedule and have to deliver workable results within a window of about two months. Maximum.
Delivering perfect results in a year’s time?
Sure, but the company will go bankrupt.
Within this short time frame, I redesign the entire workflow to make it more practical, covering participant recruitment, model selection, data analysis, robustness tests, and drafting the program report.
- Sampling. Instead of random sampling, I suggest reaching out to potential schools and picking one that agrees to participate. As a side note, program evaluation does not depend heavily on random sampling, as long as we are careful about how far the results generalize.
- Case Selection. Within the chosen school, we select two classes that resemble each other in multiple ways, such as gender ratio, racial composition, and annual family income. One is the control group, and the other is the treatment group.
- Model Selection. No fancy model, just a two-sample T-Test and a paired T-Test.
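One hedged way to check that two candidate classes really "resemble each other" is to compute a standardized mean difference for each covariate. The sketch below is an assumption about how such a check could look, not the procedure used in the project. The class rosters, the covariate names (`female`, `income_k`), and the `standardized_mean_diff` helper are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_diff(a, b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(a) + variance(b)) / 2)
    return (mean(a) - mean(b)) / pooled_sd

# Invented rosters: 1 = female, income in thousands per year.
class_a = {
    "female": [1, 0, 1, 1, 0, 1, 0, 0],
    "income_k": [32, 41, 28, 36, 45, 30, 38, 34],
}
class_b = {
    "female": [0, 1, 1, 0, 1, 0, 1, 0],
    "income_k": [35, 39, 30, 33, 44, 29, 40, 36],
}

# One balance figure per covariate; values near 0 mean similar classes.
balance = {
    name: standardized_mean_diff(class_a[name], class_b[name])
    for name in class_a
}

for name, smd in balance.items():
    print(f"{name}: standardized mean difference = {smd:.3f}")
```

A commonly cited rule of thumb treats an absolute standardized difference below 0.1 as well balanced. Because the measure is unit-free, a binary covariate like gender and a continuous one like income can be read on the same scale.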
Conclusion and Takeaways
Overall, I’m pretty happy with the results and enjoyed every second of the project. Learning the business is fun, talking to various stakeholders is even better, and making a real-life impact is the best part!
Because of this work, the company is able to quantify its impact and raise more money than expected within a few months.
Lesson 1: Understand business questions really well
Lesson 2: A simpler solution is a better solution
Lesson 3: Communication is King
Lesson 4: Good Time Management