Building Data Teams. Business Data Science Use Cases

Agenda: OTTO (mail order company) 2003-2007; XING (social network) 2007-2009; Unique Digital (online marketing agency) 2011-2012; Jimdo (web-pages to the people!) 2012-

OTTO - Catalogue Targeting: SENN* Neural Networks. *SENN: Software Environment for Neural Networks (http://www.siemens.com/innovation/apps/pof_microsite/_pof-fall-2011/_html_en/neural-networks.html)

OTTO - Catalogue Targeting: Sample, Explore, Modify, Model, Assess

OTTO - Catalogue Targeting

OTTO - Lessons Learned: Standard processes, transparency and reproducibility might matter more than top model performance

XING - Recsys

XING - Recsys: Tag Graph. Source: Data-Driven Ontologies for Recommender Engines in Social Networks, I. Bax, J. Moldvay, 2009

XING - Recsys: Tag Graph Clustering
[Figure: tag graph with weighted edges]
Tags: SQL, Data Analytics, BI, Data Mining, Python, SAS, Multivariate Analysis, R, Predictive Analytics, Java, Data Science, Hive, Statistical Modelling, HBase, Machine Learning, Big Data, MapReduce, Spark
Edge weights shown: 4.1, 2.2, 3.3, 3.5
Summed edge weights: 2.2 + 4.1 = 6.3 and 3.5 + 3.3 = 6.8
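
The slide only shows four edge weights and two sums (6.3 and 6.8). One plausible reading, and it is an assumption rather than something the slides state, is that a tag is attached to the neighbouring cluster whose edges carry the larger total weight. A minimal sketch of that reading, with hypothetical tag and cluster names:

```python
from collections import defaultdict


def best_cluster(edges, cluster_of):
    """Sum a tag's edge weights per neighbouring cluster; return the heaviest cluster."""
    totals = defaultdict(float)
    for neighbour, weight in edges.items():
        totals[cluster_of[neighbour]] += weight
    winner = max(totals, key=totals.get)
    return winner, dict(totals)


# Weights are the four numbers on the slide. Which tags they connect is NOT
# recoverable from the transcript, so the names below are placeholders.
edges = {"tag_a": 2.2, "tag_b": 4.1, "tag_c": 3.5, "tag_d": 3.3}
cluster_of = {
    "tag_a": "cluster_1",
    "tag_b": "cluster_1",
    "tag_c": "cluster_2",
    "tag_d": "cluster_2",
}

winner, totals = best_cluster(edges, cluster_of)
print(winner)  # cluster_2, because 3.5 + 3.3 exceeds 2.2 + 4.1
```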

XING - Recsys

XING - Lessons Learned: A/B-test recsys performance (against random and top-10 baselines); evaluate and tune your recsys according to how it is integrated (top 3 or top 10); implementing the first production recsys was guerrilla warfare
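
Why the integration matters (top 3 versus top 10) can be illustrated with a small offline sketch. Everything here is hypothetical: the metric (hit rate at k), the users, the items and the rankings are invented for illustration and are not XING data or the evaluation the talk describes.

```python
def hit_rate_at_k(recommendations, held_out, k):
    """Share of users whose held-out item appears in their top-k recommendations."""
    hits = sum(
        1 for user, item in held_out.items() if item in recommendations[user][:k]
    )
    return hits / len(held_out)


# One held-out item per user (hypothetical).
held_out = {"u1": "i5", "u2": "i1", "u3": "i11", "u4": "i8"}

# Baseline: the same top-10 list for everybody.
popular = [f"i{n}" for n in range(10)]
baseline = {user: popular for user in held_out}

# Personalised rankings from a hypothetical recommender.
personalised = {
    "u1": ["i5", "i2", "i3", "i0", "i1", "i4", "i6", "i7", "i8", "i9"],
    "u2": ["i4", "i6", "i7", "i1", "i0", "i2", "i3", "i5", "i8", "i9"],
    "u3": ["i11", "i10", "i0", "i1", "i2", "i3", "i4", "i5", "i6", "i7"],
    "u4": ["i0", "i1", "i2", "i3", "i4", "i5", "i6", "i7", "i9", "i10"],
}

for k in (3, 10):
    print(k, hit_rate_at_k(personalised, held_out, k), hit_rate_at_k(baseline, held_out, k))
# 3 0.5 0.25
# 10 0.75 0.75
```

In this toy data the recommender beats the top-10 baseline when only three slots are shown and ties with it when ten are shown, which is the kind of difference the lesson points at.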

Unique Digital (Online Marketing)

Unique Digital (Online Marketing): Independent variables (Display Ad View, Google Ad Click, Facebook Ad Click, Affiliate Click) -> Model (logreg) -> Dependent variable P(Sale=1)
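
For reference, the standard form of a logistic regression over these four inputs is given below. The slide names only the variables and "logreg"; the coefficient symbols and the variable subscripts are notation introduced here, and no fitted values are implied.

```latex
P(\text{Sale}=1 \mid x) =
  \frac{1}{1 + \exp\!\big(-(\beta_0
    + \beta_1 x_{\text{display}}
    + \beta_2 x_{\text{google}}
    + \beta_3 x_{\text{facebook}}
    + \beta_4 x_{\text{affiliate}})\big)}
```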

Unique Digital (Online Marketing)
Starting value: P(Sale=1) = 0.001
Display -> Model -> P(Sale=1) = 0.002, so Display += 0.001
Display, Affiliate -> Model -> P(Sale=1) = 0.008, so Affiliate += 0.006
Display, Affiliate, Search -> Model -> P(Sale=1) = 0.02, so Search += 0.012
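
The numbers on this slide are consistent with crediting each channel with the increase in predicted sale probability it brings when it is added to the touchpoints (0.001 to 0.002 to 0.008 to 0.02). A minimal sketch of that arithmetic follows; the function name is hypothetical and the probabilities are taken from the slide, not recomputed from a model.

```python
def incremental_contributions(start, steps):
    """Credit each channel with the rise in P(Sale=1) after it is added."""
    contributions = {}
    previous = start
    for channel, probability in steps:
        # Round to suppress floating-point noise in the differences.
        contributions[channel] = round(probability - previous, 6)
        previous = probability
    return contributions


# (channel added, model output once that channel is included), as on the slide.
steps = [("Display", 0.002), ("Affiliate", 0.008), ("Search", 0.02)]
contributions = incremental_contributions(0.001, steps)
print(contributions)
# {'Display': 0.001, 'Affiliate': 0.006, 'Search': 0.012}
```

Note that this kind of credit depends on the order in which channels are added, so a different touchpoint order can give a different split.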

Unique Digital - Lessons Learned: Went from local MySQL, to AWS RDS, to AWS S3 and EMR; all modelling done in R

Jimdo - Design Recommender

Jimdo - Design Recommender: dists = cosine_similarity(df_train) (adapted from: http://docs.yhathq.com/scienceops/deploying-models/examples/python/deploy-a-beer-recommender.html)

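Jimdo - Design Recommender
    # Assumes dists is a pandas DataFrame of cosine similarities,
    # with the same product labels on rows and columns.
    def get_top_2(products):
        # Sum each row's similarity to the given products
        p = dists[products].sum(axis=1)
        p = p.sort_values(ascending=False)
        # Drop the input products and keep the two best remaining ones
        return p.index[~p.index.isin(products)][:2]
    print(get_top_2(["334", "de_de"]))
Output shown on the slide: Index([u'283', u'278'], dtype='object')
adapted from: http://docs.yhathq.com/scienceops/deploying-models/examples/python/deploy-a-beer-recommender.html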
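
The idea on this slide (sum the cosine similarities to the products a user already has, then return the best products not yet chosen) can be tried without pandas or scikit-learn. The sketch below is a dependency-free illustration only: the design names and feature vectors are hypothetical, and the function takes the vectors and n as arguments rather than reading a global dists as on the slide.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def get_top_n(vectors, products, n=2):
    """Rank the remaining items by summed similarity to the given products."""
    scores = {
        name: sum(cosine(vec, vectors[p]) for p in products)
        for name, vec in vectors.items()
        if name not in products
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical training data: one row per design, one column per user.
vectors = {
    "design_a": [1, 1, 0, 0],
    "design_b": [1, 1, 1, 0],
    "design_c": [0, 0, 1, 1],
    "design_d": [0, 0, 0, 1],
}

top = get_top_n(vectors, ["design_a", "design_b"])
print(top)  # ['design_c', 'design_d']
```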

Jimdo - No Data Princelings

Jimdo - Self Service DWH

Jimdo - Lessons Learned: Data driven through A/B testing & self-service SQL; serve the business (recsys might come later)

Building Data Teams: Finding Data Scientist Unicorns. Source: http://www.forbes.com/sites/danwoods/2012/03/08/hilary-mason-what-is-a-data-scientist/ & http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

Building Data Teams Source: http://www.oreilly.com/data/free/files/analyzing-the-analyzers.pdf

Building Data Teams: DATA TEAM = Hacking & Engineering + Statistics & Analytics + Business Analysis & Communication

Building Data Teams: Recruiting fairs (OTTO); create positions & projects for talent (XING & Jimdo); internships (Unique Digital & Jimdo); PhDs & academics! (Jimdo)

What's next? - Street Fighting DS* *Source: http://de.slideshare.net/pskomoroch/street-fighting-data-science-12072010

What's next? Soft Skills matter! Source: http://data-informed.com/soft-skills-matter-data-science/

What's next?

Contact: twitter: @jmoldvay; email: janos@jimdo.com; blog: blabladata.com