How to Fix Data Leakage – Your Model’s Greatest Enemy
At ODSC London 2018, Yuriy Guts of DataRobot gave a talk on data leakage, including potential sources of the problem and how it can be remedied. Data leakage – also sometimes referred to as data snooping – is a phenomenon in machine learning that occurs when a model is... Read more
Going to the Bank: Using Deep Learning For Banking and the Financial Industry
At ODSC London 2018, Pavel Shkadzko explained to the audience how Gini GmbH, where he works as a semantics engineer, uses deep learning to automate information extraction from financial documents, such as invoices. By applying deep learning to tasks historically handled by optical character recognition and clever regular expression... Read more
Digital Transformation and Agile Decision-Making in Commerce
Dr. Kanishka Bhattacharya spoke at ODSC London 2018 on how data science is changing commerce and enabling large companies to become more “agile” in their decision process. Dr. Bhattacharya, who works as a senior director at Publicis.Sapient, explained that large corporations with ‘traditional’ business models are failing to meet... Read more
Why Consumers Should Trust Companies with Their Data
Hugo Pinto is an asthmatic. He’s aware of the environmental triggers that can induce an asthma attack, but he wasn’t satisfied with the option that faces most asthmatics: wait until an attack happens, and treat the symptoms once it does... Read more
Layer-wise Relevance Propagation Means More Interpretable Deep Learning
Wojciech Samek is head of machine learning at the Fraunhofer Heinrich Hertz Institute. At ODSC Europe 2018, he spoke about an active area of research in deep learning: interpretability, and specifically layer-wise relevance propagation. Samek launched his lecture with the following preface on the rising importance of interpretability of deep learning models:... Read more
How to Play Fantasy Sports Strategically (and Win)
Daily fantasy sports is a multibillion-dollar industry with millions of annual users. The Imperial College Business School’s Martin Haugh created a framework to beat those users by modeling what they’ll do and constructing a team based on it. Haugh presented his research on how to play fantasy sports strategically... Read more
Mail Processing with Deep Learning: A Case Study
Businesses increasingly delegate simple, boring, and repetitive tasks to artificial intelligence. In a case study, Alexandre Hubert — lead data scientist of software company Dataiku’s U.K. operations — worked on a team of three to automate mail processing with deep learning. At ODSC Europe 2018, Hubert detailed how his team... Read more
Olivier Blais of Moov AI on His Experience as a Speaker at ODSC West 2018
I am back from the Open Data Science Conference (ODSC West) in California. What a blast! Not only was I able to present my talk on the democratization of AI, but I also learned a lot of very interesting stuff! I am honestly impressed by the projects and technologies presented... Read more
Handling Missing Data in Python/Pandas
Key Takeaways: It’s important to describe missing data and the challenges it poses. You need to clarify the confusing terminology that adds to the field’s complexity. You should take the time to review methods for handling missing data. You need to learn how to apply robust multiple imputation... Read more
Thomas Wiecki of Quantopian on ‘Minding the Gap’ Between Statistics and Machine Learning at ODSC Europe 2018
Key Takeaways: It’s important for data scientists to understand the so-called “gap” between statistics and machine learning, and how there actually is a lot of commonality between the two; it’s just a matter of how you look at things. PyMC3 is a very useful probabilistic programming framework for Python.... Read more