Below is an interview with Travis Oliphant of Quansight, a platform designed to connect the open source coder community and companies all in the name of open data.
1) Tell me about your background. What brought you to where you are today?
I studied math and electrical engineering in college at Brigham Young University (BYU) and pursued a Ph.D. in Biomedical Engineering at the Mayo Clinic. While at the Mayo Clinic, I started releasing open-source libraries in 1998 that would become the foundation of SciPy in 2001.
During my Ph.D., I discovered a passion for open source as well as an appreciation for the evolving nature of volunteer markets that ultimately led me to pursue entrepreneurship. After receiving my degree, I became a professor at my alma mater, BYU. As a professor, I taught courses in electromagnetism, probability theory, signal/image processing, and inverse problems.
I moved to Texas in 2007 to join Enthought, whose founder had previously helped release SciPy and organize the SciPy conference. I gained experience running a consulting and training business but was unable to put as much effort into NumPy and SciPy as I would have liked, which ultimately led me to leave Enthought and start Continuum Analytics in 2012.
As Continuum Analytics’ CEO, I had the goal of building a venture-backed product company bootstrapped by a consulting / training business. The series A led to the growth of our product business around Anaconda. In 2018, I stepped down as CEO and we brought in a more experienced leader to more fully develop Anaconda as an enterprise software company.
At the end of 2017, I left Anaconda to form Quansight, which continues where Continuum Analytics left off: providing consulting services around open-source data-science libraries in the PyData ecosystem and working to create and nurture new companies that will spin out of our efforts.
2) Discuss Quansight, its mission, objectives, successes so far, and expectations for 2019.
Quansight builds and connects companies and communities of open source developers to help solve challenging problems with data. We help companies get the most out of their open source investments while helping them apply the latest open-source innovations in their organizations.
We also created Quansight Labs, a group of community-oriented developers dedicated to maintaining and innovating around the quantitative stack in Python. Additionally, we mentor and train data scientists and developers while incubating new technologies and new organizations that may become independent companies someday. We are following the pattern we established at Continuum Analytics to both help open source communities and launch new companies. We consider Anaconda to be our first “spin-out” company.
3) Why did you choose to go open source?
I have been writing open source software since 1997; it is part of my DNA at this point. At Quansight, it is our mission to help companies use open source effectively.
4) How could other organizations benefit from offering more open source products or services?
It is important for companies to establish an effective open source strategy. The best developers demand it and will want to engage with the broader community instead of only working on your proprietary stack. Your proprietary code will benefit from community knowledge and community-maintained infrastructure, and you will spend less money developing unnecessary competing solutions.
There are several ways to interact poorly with the open source community and hurt either your brand or your architecture. For example, using open source without engaging with the community or understanding your dependencies is a recipe for future headaches.
Engaging in open source will help you in recruiting and retention as well, as it provides a substrate for innovation in your company — but only if the product management and product engineering are also adapted to use open-source effectively.
5) What are some recent use cases of Quansight?
We helped a technology company use JupyterLab, Numba, pandas, and Altair to show off their product in a common data science workflow. We helped another organization use Dask to scale their pandas-based workflow to much larger data sizes. Recently, we also helped an organization optimize their pandas codebase and start to take advantage of PyTorch.
6) Where do you see the open culture moving forward? Do you think more people will come together to use it, or will groups become more isolated and keep their data to themselves?
Open culture continues its vibrancy in software as it becomes the de facto standard for software infrastructure. In the data world, we are just starting the process of understanding how an open culture will inform practices. For public datasets, it will be clear that open data will be the standard, and the cloud providers will use that to compete for mind-share by providing as much data as they can, tied to their computing infrastructure. This will cause some otherwise proprietary datasets to become public.
However, there will be a strong motivation to keep data relevant to a particular business private — providing access in some cases over APIs and in other cases only using the data to provide a better product.
There will be some pressure from the large cloud providers to at least provide the appearance of “open data,” particularly around AI models, as software becomes more about models on data in a machine learning framework than imperative code.
7) Bonus question: If you could only have 5 members for a data science team, what would your ideal group be?
- Machine Learning specialist
- Data Scientist
- Data Engineer
- DevOps (Developer with SysAdmin skills)