CMU SCS Mining Large Social Networks: Patterns and Anomalies
|
|
- Crystal Harvey
- 5 years ago
- Views:
Transcription
1 Mining Large Social Networks: Patterns and Anomalies Christos Faloutsos CMU
2 Thank you The Department of Informatics Happy 20-th! Prof. Yannis Manolopoulos Prof. Kostas Tsichlas Mrs. Nina Daltsidou C. Faloutsos (CMU) 2
3 International-caliber friends among AUTH alumni Prof. Evimaria Terzi (U. Boston) Prof. Kyriakos Mouratidis (SMU) Dr. Michalis Vlachos (IBM) C. Faloutsos (CMU) 3
4 Outline Introduction Motivation Problem#1: Patterns in graphs Problem#2: Tools Problem#3: Scalability Conclusions C. Faloutsos (CMU) 4
5 Graphs - why should we care? $10s of BILLIONS revenue >500M users Food Web [Martinez 91] Internet Map [lumeta.com] C. Faloutsos (CMU) 5
6 Graphs - why should we care? IR: bi-partite graphs (doc-terms) D T 1 web: hyper-text graph D N T M... and more: C. Faloutsos (CMU) 6
7 Graphs - why should we care? web-log ( blog ) news propagation computer network security: /ip traffic and anomaly detection... [subject-verb-object: graph] Graph == relational table with 2 columns (src, dst) BIG DATA big graphs C. Faloutsos (CMU) 7
8 Outline Introduction Motivation Problem#1: Patterns in graphs Static graphs Weighted graphs Time evolving graphs Problem#2: Tools Problem#3: Scalability Conclusions C. Faloutsos (CMU) 8
9 Problem #1 - network and graph mining What does the Internet look like? What does FaceBook look like? What is normal / abnormal? which patterns/laws hold? C. Faloutsos (CMU) 9
10 Graph mining Are real graphs random? C. Faloutsos (CMU) 10
11 Laws and patterns Are real graphs random? A: NO!! Diameter in- and out- degree distributions other (surprising) patterns So, let s look at the data C. Faloutsos (CMU) 11
12 Solution# S.1 Power law in the degree distribution [SIGCOMM99] log(degree) ibm.com internet domains att.com log(rank) C. Faloutsos (CMU) 12
13 Solution# S.1 Power law in the degree distribution [SIGCOMM99] log(degree) ibm.com internet domains att.com log(rank) C. Faloutsos (CMU) 13
14 But: How about graphs from other domains? C. Faloutsos (CMU) 14
15 More power laws: web hit counts [w/ A. Montgomery] Web Site Traffic Count (log scale) Zipf ``ebay users sites in-degree (log scale) C. Faloutsos (CMU) 15
16 And numerous more Who-trusts-whom (epinions.com) Income [Pareto] distribution Duration of downloads [Bestavros+] Duration of UNIX jobs ( mice and elephants ) Size of files of a user Black swans C. Faloutsos (CMU) 16
17 Outline Introduction Motivation Problem#1: Patterns in graphs Static graphs degree, diameter, eigen, Triangles Time evolving graphs Problem#2: Tools C. Faloutsos (CMU) 17
18 Solution# S.3: Triangle Laws Real social networks have a lot of triangles C. Faloutsos (CMU) 18
19 Solution# S.3: Triangle Laws Real social networks have a lot of triangles Friends of friends are friends Any patterns? C. Faloutsos (CMU) 19
20 Triangle Law: #S.3 [Tsourakakis ICDM 2008] Reuters SN Epinions X-axis: degree Y-axis: mean # triangles n friends -> ~n 1.6 triangles C. Faloutsos (CMU) 20
21 Triangle counting for large graphs? Anomalous nodes in Twitter(~ 3 billion edges) [U Kang, Brendan Meeder, +, PAKDD 11] C. Faloutsos (CMU) 21
22 Triangle counting for large graphs? Anomalous nodes in Twitter(~ 3 billion edges) [U Kang, Brendan Meeder, +, PAKDD 11] C. Faloutsos (CMU) 22
23 Triangle counting for large graphs? Anomalous nodes in Twitter(~ 3 billion edges) [U Kang, Brendan Meeder, +, PAKDD 11] C. Faloutsos (CMU) 23
24 Outline Introduction Motivation Problem#1: Patterns in graphs Static graphs Time evolving graphs Problem#2: Tools C. Faloutsos (CMU) 24
25 Problem: Time evolution with Jure Leskovec (CMU -> Stanford) and Jon Kleinberg (Cornell CMU) C. Faloutsos (CMU) 25
26 T.1 Evolution of the Diameter Prior work on Power Law graphs hints at slowly growing diameter: diameter ~ O(log N) diameter ~ O(log log N) What is happening in real data? C. Faloutsos (CMU) 26
27 T.1 Evolution of the Diameter Prior work on Power Law graphs hints at slowly growing diameter: diameter ~ O(log N) diameter ~ O(log log N) What is happening in real data? Diameter shrinks over time C. Faloutsos (CMU) 27
28 T.1 Diameter Patents Patent citation network 25 years of 2.9 M nodes 16.5 M edges diameter time [years] C. Faloutsos (CMU) 28
29 Outline Introduction Motivation Problem#1: Patterns in graphs Problem#2: Tools Belief Propagation Problem#3: Scalability Conclusions C. Faloutsos (CMU) 29
30 E-bay Fraud detection w/ Polo Chau & Shashank Pandit, CMU [www 07] C. Faloutsos (CMU) 30
31 E-bay Fraud detection C. Faloutsos (CMU) 31
32 E-bay Fraud detection C. Faloutsos (CMU) 32
33 E-bay Fraud detection - NetProbe C. Faloutsos (CMU) 33
34 Popular press And less desirable attention: from Belgium police ( copy of your code? ) C. Faloutsos (CMU) 34
35 Outline Introduction Motivation Problem#1: Patterns in graphs Problem#2: Tools Problem#3: Scalability -PEGASUS Conclusions C. Faloutsos (CMU) 35
36 Scalability Google: > 450,000 processors in clusters of ~2000 processors each [Barroso, Dean, Hölzle, Web Search for a Planet: The Google Cluster Architecture IEEE Micro 2003] Yahoo: 5Pb of data [Fayyad, KDD 07] Problem: machine failures, on a daily basis How to parallelize data mining tasks, then? A: map/reduce hadoop (open-source clone) C. Faloutsos (CMU) 36
37 Outline Introduction Motivation Problem#1: Patterns in graphs Problem#2: Tools Problem#3: Scalability PEGASUS Radius plot Conclusions C. Faloutsos (CMU) 37
38 HADI for diameter estimation Radius Plots for Mining Tera-byte Scale Graphs U Kang, Charalampos Tsourakakis, Ana Paula Appel, Christos Faloutsos, Jure Leskovec, SDM 10 Naively: diameter needs O(N**2) space and up to O(N**3) time prohibitive (N~1B) Our HADI: linear on E (~10B) Near-linear scalability wrt # machines Several optimizations -> 5x faster C. Faloutsos (CMU) 38
39 Count???? 19+ [Barabasi+] ~1999, ~1M nodes Radius C. Faloutsos (CMU) 39
40 Count?????? 19+ [Barabasi+] ~1999, ~1M nodes Radius YahooWeb graph (120Gb, 1.4B nodes, 6.6 B edges) Largest publicly available graph ever studied. C. Faloutsos (CMU) 40
41 Count 14 (dir.)???? ~7 (undir.) 19+? [Barabasi+] Radius YahooWeb graph (120Gb, 1.4B nodes, 6.6 B edges) Largest publicly available graph ever studied. C. Faloutsos (CMU) 41
42 Count 14 (dir.)???? ~7 (undir.) 19+? [Barabasi+] YahooWeb graph (120Gb, 1.4B nodes, 6.6 B edges) 7 degrees of separation (!) Diameter: shrunk C. Faloutsos (CMU) Radius 42
43 Count???? ~7 (undir.) Radius YahooWeb graph (120Gb, 1.4B nodes, 6.6 B edges) Q: Shape? C. Faloutsos (CMU) 43
44 YahooWeb graph (120Gb, 1.4B nodes, 6.6 B edges) effective diameter: surprisingly small. Multi-modality (?!) C. Faloutsos (CMU) 44
45 Radius Plot of GCC of YahooWeb. C. Faloutsos (CMU) 45
46 YahooWeb graph (120Gb, 1.4B nodes, 6.6 B edges) effective diameter: surprisingly small. Multi-modality: probably mixture of cores. C. Faloutsos (CMU) 46
47 Conjecture: EN DE ~7 BR YahooWeb graph (120Gb, 1.4B nodes, 6.6 B edges) effective diameter: surprisingly small. Multi-modality: probably mixture of cores. C. Faloutsos (CMU) 47
48 Conjecture: ~7 YahooWeb graph (120Gb, 1.4B nodes, 6.6 B edges) effective diameter: surprisingly small. Multi-modality: probably mixture of cores. C. Faloutsos (CMU) 48
49 Outline Introduction Motivation Problem#1: Patterns in graphs Problem#2: Tools Problem#3: Scalability Conclusions C. Faloutsos (CMU) 49
50 OVERALL CONCLUSIONS low level: Several new patterns (shrinking diameters, triangle-laws, etc) New tools: Fraud detection (belief propagation) Scalability: PEGASUS / hadoop C. Faloutsos (CMU) 50
51 OVERALL CONCLUSIONS medium-level BIG DATA: Large datasets reveal patterns/outliers that are invisible otherwise C. Faloutsos (CMU) 51
52 Project info Chau, Polo Koutra, Danae Prakash, Aditya Akoglu, Leman Kang, U McGlohon, Mary Tong, Hanghang Thanks to: NSF IIS , IIS , CTA-INARC; Yahoo (M45), LLNL, IBM, SPRINT, C. Faloutsos (CMU) 52 Google, INTEL, HP, ilab
53 Thank you for the honor! Congratulations for 20-th anniversary and C. Faloutsos (CMU) 53
54 High-level conclusion: Collaborations Sociology + CS (triangles) Civil engineering + CS (sensor placement) fmri/medical + graphs (medical db s) C. Faloutsos (CMU) 54
55 Never stop learning GHRASKW AEI DIDASKOMENOS Socrates Plato Aristotle C. Faloutsos (CMU) 55
CMU SCS Mining Billion-node Graphs: Patterns and Tools
Mining Billion-node Graphs: Patterns and Tools Christos Faloutsos CMU Thank you! Monica Rogati C. Faloutsos (CMU) 2 Our goal: Open source system for mining huge graphs: PEGASUS project (PEta GrAph mining
More informationCMU SCS Large Graph Mining
Large Graph Mining Christos Faloutsos CMU Thank you! Ed Kao Lori Tsoulas Joan Meehan-Dion C. Faloutsos (CMU) 2 Our goal: Open source system for mining huge graphs: PEGASUS project (PEta GrAph mining System)
More informationCMU SCS Large Graph Mining Patterns, Tools and Cascade analysis
Large Graph Mining Patterns, Tools and Cascade analysis Christos Faloutsos CMU Thank you! Brian Gallagher Jan Winfield C. Faloutsos (CMU) 2 Roadmap Introduction Motivation Why big data Why (big) graphs?
More informationCMU SCS Mining Billion-Node Graphs: Patterns and influence propagation
Mining Billion-Node Graphs: Patterns and influence propagation Christos Faloutsos CMU Thank you! Sinan Aral Foster Provost Arun Sundararajan Shirley Lau C. Faloutsos (CMU) 2 Our goal: Open source system
More informationAnalytics Building Blocks
http://poloclub.gatech.edu/cse6242 CSE6242 / CX4242: Data & Visual Analytics Analytics Building Blocks Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech Partly based
More informationAnalytics Building Blocks
http://poloclub.gatech.edu/cse6242 CSE6242 / CX4242: Data & Visual Analytics Analytics Building Blocks Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech Partly based
More informationJure Leskovec, Includes joint work with Jaewon Yang, Manuel Gomez Rodriguez, Andreas Krause, Lars Backstrom and Jon Kleinberg.
Jure Leskovec, Stanford University Includes joint work with Jaewon Yang, Manuel Gomez Rodriguez, Andreas Krause, Lars Backstrom and Jon Kleinberg. Information reaches us by personal influence in our social
More informationSunnie Chung. Cleveland State University
Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:
More informationCo-evolution of Social and Affiliation Networks
Co-evolution of Social and Affiliation Networks Elena Zheleva Dept. of Computer Science University of Maryland College Park, MD 20742, USA elena@cs.umd.edu Hossam Sharara Dept. of Computer Science University
More informationKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 19 1 Acknowledgement The following discussion is based on the paper Mining Big Data: Current Status, and Forecast to the Future by Fan and Bifet and online presentation
More informationE-guide Hadoop Big Data Platforms Buyer s Guide part 1
Hadoop Big Data Platforms Buyer s Guide part 1 Your expert guide to Hadoop big data platforms for managing big data David Loshin, Knowledge Integrity Inc. Companies of all sizes can use Hadoop, as vendors
More informationCS 6604: Data Mining Large Networks and Time- series. B. Aditya Prakash Lecture #6: Hadoop and Graph Analysis
CS 0: Data Mining Large Networks and Time- series B. Aditya Prakash Lecture #: Hadoop and Graph Analysis What to do if data is really large? Peta- bytes (exabytes, ze?abytes.. ) Google processed PB of
More informationJure Leskovec, Computer Science, Stanford University
Jure Leskovec, Computer Science, Stanford University Includes joint work with Jaewon Yang, Manuel Gomez- Rodriguez, Andreas Krause, Lars Backstrom and Jon Kleinberg. Information reaches us by personal
More informationFinal Report: Local Structure and Evolution for Cascade Prediction
Final Report: Local Structure and Evolution for Cascade Prediction Jake Lussier (lussier1@stanford.edu), Jacob Bank (jbank@stanford.edu) ABSTRACT Information cascades in large social networks are complex
More informationBig Data: How it is Generated and its Importance
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727 PP 01-05 www.iosrjournals.org Big Data: How it is Generated and its Importance Asst. Prof. Mrs. Mugdha Ghotkar 1, Ms.
More informationFrom Information to Insight: The Big Value of Big Data. Faire Ann Co Marketing Manager, Information Management Software, ASEAN
From Information to Insight: The Big Value of Big Data Faire Ann Co Marketing Manager, Information Management Software, ASEAN The World is Changing and Becoming More INSTRUMENTED INTERCONNECTED INTELLIGENT
More informationInferring Social Ties across Heterogeneous Networks
Inferring Social Ties across Heterogeneous Networks CS 6001 Complex Network Structures HARISH ANANDAN Introduction Social Ties Information carrying connections between people It can be: Strong, weak or
More informationSpark, Hadoop, and Friends
Spark, Hadoop, and Friends (and the Zeppelin Notebook) Douglas Eadline Jan 4, 2017 NJIT Presenter Douglas Eadline deadline@basement-supercomputing.com @thedeadline HPC/Hadoop Consultant/Writer http://www.basement-supercomputing.com
More informationFinal Report: Local Structure and Evolution for Cascade Prediction
Final Report: Local Structure and Evolution for Cascade Prediction Jake Lussier (lussier1@stanford.edu), Jacob Bank (jbank@stanford.edu) December 10, 2011 Abstract Information cascades in large social
More informationCOMP9321 Web Application Engineering
COMP9321 Web Application Engineering Semester 1, 2017 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2457
More informationGot Hadoop? Whitepaper: Hadoop and EXASOL - a perfect combination for processing, storing and analyzing big data volumes
Got Hadoop? Whitepaper: Hadoop and EXASOL - a perfect combination for processing, storing and analyzing big data volumes Contents Introduction...3 Hadoop s humble beginnings...4 The benefits of Hadoop...5
More informationTransforming Big Data into Unlimited Knowledge
Transforming Big Data into Unlimited Knowledge Timos Sellis, RMIT University timos.sellis@rmit.edu.au 2 Big Data What is it? Most commonly accepted definition, by Gartner (the 3 Vs) Big data is high-volume,
More informationTransforming Big Data into Unlimited Knowledge
Transforming Big Data into Unlimited Knowledge Timos Sellis, RMIT University timos.sellis@rmit.edu.au Big Data What is it? Most commonly accepted definition, by Gartner (the 3 Vs) Big data is high-volume,
More informationNouvelle Génération de l infrastructure Data Warehouse et d Analyses
Nouvelle Génération de l infrastructure Data Warehouse et d Analyses November 2011 André Münger andre.muenger@emc.com +41 79 708 85 99 1 Agenda BIG Data Challenges Greenplum Overview Use Cases Summary
More informationHadoop and Analytics at CERN IT CERN IT-DB
Hadoop and Analytics at CERN IT CERN IT-DB 1 Hadoop Use cases Parallel processing of large amounts of data Perform analytics on a large scale Dealing with complex data: structured, semi-structured, unstructured
More informationWill This Paper Increase Your h-index? Scientific Impact Prediction
Will This Paper Increase Your h-index? Scientific Impact Prediction Yuxiao Dong, Reid A. Johnson, Nitesh V. Chawla Interdisciplinary Center for Network Science and Applications University of Notre Dame
More informationBIG DATA PROCESSING A DEEP DIVE IN HADOOP/SPARK & AZURE SQL DW
BIG DATA PROCESSING A DEEP DIVE IN HADOOP/SPARK & AZURE SQL DW TOPICS COVERED 1 2 Fundamentals of Big Data Platforms Major Big Data Tools Scaling Up vs. Out SCALE UP (SMP) SCALE OUT (MPP) + (n) Upgrade
More informationOnline Social Networks
Online Social Networks Daniel Huttenlocher John P. and Rilla Neafsey Professor of Computing, Information Science and Business Social Network Models [Zachary77] Individuals and relationships between them
More informationSmarter Analytics for Big Data
Smarter Analytics for Big Data Anjul Bhambhri IBM Vice President, Big Data February 27, 2011 The World is Changing and Becoming More INSTRUMENTED INTERCONNECTED INTELLIGENT The resulting explosion of information
More informationRealising Value from Data
Realising Value from Data Togetherwith Open Source Drives Innovation & Adoption in Big Data BCS Open Source SIG London 1 May 2013 Timings 6:00-6:30pm. Register / Refreshments 6:30-8:00pm, Presentation
More informationLearning Based Admission Control. Jaideep Dhok MS by Research (CSE) Search and Information Extraction Lab IIIT Hyderabad
Learning Based Admission Control and Task Assignment for MapReduce Jaideep Dhok MS by Research (CSE) Search and Information Extraction Lab IIIT Hyderabad Outline Brief overview of MapReduce MapReduce as
More informationCan Cascades be Predicted?
Can Cascades be Predicted? Rediet Abebe and Thibaut Horel September 22, 2014 1 Introduction In this presentation, we discuss the paper Can Cascades be Predicted? by Cheng, Adamic, Dow, Kleinberg, and Leskovec,
More informationTop 5 Challenges for Hadoop MapReduce in the Enterprise. Whitepaper - May /9/11
Top 5 Challenges for Hadoop MapReduce in the Enterprise Whitepaper - May 2011 http://platform.com/mapreduce 2 5/9/11 Table of Contents Introduction... 2 Current Market Conditions and Drivers. Customer
More informationMining Heterogeneous Urban Data at Multiple Granularity Layers
Mining Heterogeneous Urban Data at Multiple Granularity Layers Antonio Attanasio Supervisor: Prof. Silvia Chiusano Co-supervisor: Prof. Tania Cerquitelli collection Urban data analytics Added value urban
More informationTheses. Elena Baralis, Tania Cerquitelli, Silvia Chiusano, Paolo Garza Luca Cagliero, Luigi Grimaudo Daniele Apiletti, Giulia Bruno, Alessandro Fiori
Theses Elena Baralis, Tania Cerquitelli, Silvia Chiusano, Paolo Garza Luca Cagliero, Luigi Grimaudo Daniele Apiletti, Giulia Bruno, Alessandro Fiori Turin, January 2015 General information Duration: 6
More informationIBM SPSS & Apache Spark
IBM SPSS & Apache Spark Making Big Data analytics easier and more accessible ramiro.rego@es.ibm.com @foreswearer 1 2016 IBM Corporation Modeler y Spark. Integration Infrastructure overview Spark, Hadoop
More informationred red red red red red red red red red red red red red red red red red red red CYS Rithu P Ravi CYS Saumya K
red red red red red red red red red red red red red red red red red red red red CYS14011 - Rithu P Ravi CYS14012 - Saumya K Why and What HADOOP?... Apache Hadoop is an open-source software framework A
More informationFinal Project - Social and Information Network Analysis
Final Project - Social and Information Network Analysis Factors and Variables Affecting Social Media Reviews I. Introduction Humberto Moreira Rajesh Balwani Subramanyan V Dronamraju Dec 11, 2011 Problem
More informationExperiences in the Use of Big Data for Official Statistics
Think Big - Data innovation in Latin America Santiago, Chile 6 th March 2017 Experiences in the Use of Big Data for Official Statistics Antonino Virgillito Istat Introduction The use of Big Data sources
More informationCS4491/CS 7265 BIG DATA ANALYTICS
CS4491/CS 7265 BIG DATA ANALYTICS BIG DATA * The contents are adapted from Dr. Jeongkyu Lee@UB Mingon Kang, Ph.D Computer Science, Kennesaw State University Era of Big Data 2011: Amount of Digital Information
More informationBig Data in the Future Workforce
Big Data in the Future Workforce Prof Dr Abdullah G ani. SMIEEE, FASc Table of Contents What is data size What is Big Data and its characteristics Where the big data comes from? What processes involved
More informationJure Leskovec Stanford University
Jure Leskovec Stanford University Includes joint work with Jon Kleinberg (Cornell), Lars Backstrom (Cornell), Andreas Krause (Caltech), Carlos Guestrin (CMU),Christos Faloutsos (CMU) 2 Intersection of
More informationA Quantified Approach for Analyzing the User Rating Behaviour in Social Media
A Quantified Approach for Analyzing the User Rating Behaviour in Social Media P. Surya 1, Dr. B. Umadevi 2 1 Research Scholar, 2 Assistant Professor & Head, P.G & Research Department of Computer Science,
More informationData Informatics. Seon Ho Kim, Ph.D.
Data Informatics Seon Ho Kim, Ph.D. seonkim@usc.edu What is Big Data? What is Big Data? Big Data is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics
More informationReaction Paper Regarding the Flow of Influence and Social Meaning Across Social Media Networks
Reaction Paper Regarding the Flow of Influence and Social Meaning Across Social Media Networks Mahalia Miller Daniel Wiesenthal October 6, 2010 1 Introduction One topic of current interest is how language
More informationOperational Hadoop and the Lambda Architecture for Streaming Data
Operational Hadoop and the Lambda Architecture for Streaming Data 2015 MapR Technologies 2015 MapR Technologies 1 Topics From Batch to Operational Workloads on Hadoop Streaming Data Environments The Lambda
More informationModeling the Temporal Dynamics of Social Rating Networks using Bidirectional Effects of Social Relations and Rating Patterns
Modeling the Temporal Dynamics of Social Rating Networks using Bidirectional Effects of Social Relations and Rating Patterns Mohsen Jamali School of Computing Science Simon Fraser University Burnaby, BC,
More informationChina Center of Excellence
1 China Center of Excellence Project Guardian ebay is a global company, projects within ebay normally require efforts and synergies from teams located in different cities of different countries. This is
More informationBusiness Intelligence, 4e (Sharda/Delen/Turban) Chapter 1 An Overview of Business Intelligence, Analytics, and Data Science
Business Intelligence, 4e (Sharda/Delen/Turban) Chapter 1 An Overview of Business Intelligence, Analytics, and Data Science 1) Computerized support is only used for organizational decisions that are responses
More informationText Mining. Theory and Applications Anurag Nagar
Text Mining Theory and Applications Anurag Nagar Topics Introduction What is Text Mining Features of Text Document Representation Vector Space Model Document Similarities Document Classification and Clustering
More informationSOCIAL MEDIA MINING. Behavior Analytics
SOCIAL MEDIA MINING Behavior Analytics Dear instructors/users of these slides: Please feel free to include these slides in your own material, or modify them as you see fit. If you decide to incorporate
More informationBasics of Big Data Analytics
Basics of Big Data Analytics BRETT AMIDAN JEFFERY DAGLE Pacific Northwest National Laboratory NASPI Presentation (October 23, 2014) November 3, 2014 b.amidan@pnnl.gov 1 What is Big Data? Any collection
More information2012 SNIA Analytics and Big Data Summit. Insert Your Company Name. All Rights Reserved.
A Working Definition of Big Data Data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Wikipedia, 4/26/2011
More informationThe Importance of Secure Analytics & AI
The Importance of 2014 Secure Analytics & AI Rick Hutley Program Director, Professor of Practice, Data Science, University of the Pacific Email: rhutley@pacific.edu Vice President of IoT, Cisco Systems
More informationA STUDY ON SNA: MEASURE AVERAGE DEGREE AND AVERAGE WEIGHTED DEGREE OF KNOWLEDGE DIFFUSION IN GEPHI
A STUDY ON SNA: MEASURE AVERAGE DEGREE AND AVERAGE WEIGHTED DEGREE OF KNOWLEDGE DIFFUSION IN GEPHI Ayyappan.G 1, Dr.C.Nalini 2, Dr.A.Kumaravel 3 Research Scholar, Department of CSE, Bharath University,Chennai
More informationIs Machine Learning the future of the Business Intelligence?
Is Machine Learning the future of the Business Intelligence Fernando IAFRATE : Sr Manager of the BI domain Fernando.iafrate@disney.com Tel : 33 (0)1 64 74 59 81 Mobile : 33 (0)6 81 97 14 26 What is Business
More informationUnderstanding the Behavior of In-Memory Computing Workloads. Rui Hou Institute of Computing Technology,CAS July 10, 2014
Understanding the Behavior of In-Memory Computing Workloads Rui Hou Institute of Computing Technology,CAS July 10, 2014 Outline Background Methodology Results and Analysis Summary 2 Background The era
More informationData Analytics and CERN IT Hadoop Service. CERN openlab Technical Workshop CERN, December 2016 Luca Canali, IT-DB
Data Analytics and CERN IT Hadoop Service CERN openlab Technical Workshop CERN, December 2016 Luca Canali, IT-DB 1 Data Analytics at Scale The Challenge When you cannot fit your workload in a desktop Data
More informationELEC 6910Q Analytics and Systems for Social Media and Big Data Applications
ELEC 6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 1 Intro. to Analytics, Big Data, and this Course Prof. James She james.she@ust.hk 1 About me Assistant Professor, ECE,
More informationLinked Big Data Graph Database and Analytics
E6893 Big Data Analytics Lecture 6: Linked Big Data Graph Database and Analytics Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science October 11, 2018 E6893 Big
More informationRelational Data Mining and Web Mining
Relational Data Mining and Web Mining Prof. Dr. Daning Hu Department of Informatics University of Zurich Nov 20th, 2012 Outline Introduction: Big Data Relational Data Mining Web Mining Ref Book: Web Intelligence,
More informationINDUSTRIAL VISUALIZATION
1 INDUSTRIAL VISUALIZATION U of T, Info Vis Oct 31, 2016 From CBS News Show 60 Minutes: http://www.cbsnews.com/news/new-search-engine-exposes-the-dark-web/ From Wired.com Contents 2015 Uncharted Software
More informationIntro to Big Data and Hadoop
Intro to Big and Hadoop Portions copyright 2001 SAS Institute Inc., Cary, NC, USA. All Rights Reserved. Reproduced with permission of SAS Institute Inc., Cary, NC, USA. SAS Institute Inc. makes no warranties
More informationData Analytics on a Yelp Data Set. Maitreyi Tata. B.Tech., Gitam University, India, 2015 A REPORT
Data Analytics on a Yelp Data Set by Maitreyi Tata B.Tech., Gitam University, India, 2015 A REPORT submitted in partial fulfillment of the requirements for the degree MASTER OF SCIENCE Department of Computer
More informationThe ABCs of Big Data Analytics, Bandwidth and Content
White Paper The ABCs of Big Data Analytics, Bandwidth and Content Richard Treadway and Ingo Fuchs, NetApp March 2012 WP-7147 EXECUTIVE SUMMARY Enterprises are entering a new era of scale, where the amount
More informationOptimizing Outcomes in a Connected World: Turning information into insights
Optimizing Outcomes in a Connected World: Turning information into insights Michael Eden Management Brand Executive Central & Eastern Europe Vilnius 18 October 2011 2011 IBM Corporation IBM celebrates
More informationFinal Report Evaluating Social Networks as a Medium of Propagation for Real-Time/Location-Based News
Final Report Evaluating Social Networks as a Medium of Propagation for Real-Time/Location-Based News Mehmet Ozan Kabak, Group Number: 45 December 10, 2012 Abstract This work is concerned with modeling
More informationCommon Customer Use Cases in FSI
Common Customer Use Cases in FSI 1 Marketing Optimization 2014 2014 MapR MapR Technologies Technologies 2 Fortune 100 Financial Services Company 104M CARD MEMBERS 3 Financial Services: Recommendation Engine
More informationBusiness Network Analytics
Business Network Analytics Sep, 2017 Daning Hu Department of Informatics University of Zurich Business Intelligence Research Group F Schweitzer et al. Science 2009 Research Methods and Goals What Why How
More informationThe Structure of Comment Networks
The Structure of Comment Networks Vahed Qazvinian Department of EECS University of Michigan Ann Arbor, MI vahed@umich.edu Jafar Adibi Center for Advanced Research PricewaterhouseCoopers LLP San Jose, CA
More informationWhat about streaming data?
What about streaming data? 1 The Stream Model Data enters at a rapid rate from one or more input ports Such data are called stream tuples The system cannot store the entire (infinite) stream Distribution
More informationABOUT THIS TRAINING: This Hadoop training will also prepare you for the Big Data Certification of Cloudera- CCP and CCA.
ABOUT THIS TRAINING: The world of Hadoop and Big Data" can be intimidating - hundreds of different technologies with cryptic names form the Hadoop ecosystem. This comprehensive training has been designed
More informationMapR: Converged Data Pla3orm and Quick Start Solu;ons. Robin Fong Regional Director South East Asia
MapR: Converged Data Pla3orm and Quick Start Solu;ons Robin Fong Regional Director South East Asia Who is MapR? MapR is the creator of the top ranked Hadoop NoSQL SQL-on-Hadoop Real Database time streaming
More informationBig Data The Big Story
Big Data The Big Story Jean-Pierre Dijcks Big Data Product Mangement 1 Agenda What is Big Data? Architecting Big Data Building Big Data Solutions Oracle Big Data Appliance and Big Data Connectors Customer
More informationGE Intelligent Platforms. Proficy Historian HD
GE Intelligent Platforms Proficy Historian HD The Industrial Big Data Historian Industrial machines have always issued early warnings, but in an inconsistent way and in a language that people could not
More informationCONNECTING SOCIAL MEDIA TO ECOMMERCE USING MICROBLOGGING AND ARTIFICIAL NEURAL NETWORK
CONNECTING SOCIAL MEDIA TO ECOMMERCE USING MICROBLOGGING AND ARTIFICIAL NEURAL NETWORK Ms.S.P.VidhyaPriya 1,B.Gokhila 2, T.Santhiya 3, K.Saranya 4 1 M.E.,Assistant Professor-CSE, Kathir College Of Engineering,
More informationBig Data and Automotive IT. Michael Cafarella University of Michigan September 11, 2013
Big Data and Automotive IT Michael Cafarella University of Michigan September 11, 2013 A Banner Year for Big Data 2 Big Data Who knows what it means anymore? Big Data Who knows what it means anymore? Associated
More informationWho s Here Today? B2B Social Media: Why?
Who s Here Today? Agenda B2B Social Media Going Beyond LinkedIn B2B: Why Use Social Media? Best Practices Facebook LinkedIn Twitter Analytics B2B Social Media: Why? B2B Social Media: Why? B2B Social Media:
More informationComputational Challenges for Real-Time Marketing with Large Datasets
Computational Challenges for Real-Time Marketing with Large Datasets Alan Montgomery Associate Professor Carnegie Mellon University Tepper School of Business e-mail: alan.montgomery@cmu.edu web: http://www.andrew.cmu.edu/user/alm3
More informationAdvanced Topics in Computer Science 2. Networks, Crowds and Markets. Fall,
Advanced Topics in Computer Science 2 Networks, Crowds and Markets Fall, 2012-2013 Prof. Klara Kedem klara@cs.bgu.ac.il Office 110/37 Office hours by email appointment The textbook Chapters from the book
More informationBig Data Trends to Watch
Big Data Trends to Watch Bill Peterson NetApp September, 2012 1 Bill Peterson @thebillp What I hope to accomplish today... ...and avoid this. What is Big Data? Big Data refers to datasets whose volume,
More informationBehavioral Data Mining. Lecture 22: Network Algorithms II Diffusion and Meme Tracking
Behavioral Data Mining Lecture 22: Network Algorithms II Diffusion and Meme Tracking Once upon a time Once upon a time Approx 1 MB Travels ~ 1 hour, B/W = 3k b/s News Today, the total news available from
More informationCognitive Data Warehouse and Analytics
Cognitive Data Warehouse and Analytics Hemant R. Suri, Sr. Offering Manager, Hybrid Data Warehouses, IBM (twitter @hemantrsuri or feel free to reach out to me via LinkedIN!) Over 90% of the world s data
More informationSO YOU WANT TO HIRE A DIGITAL MARKETING AGENCY...
SO YOU WANT TO HIRE A DIGITAL MARKETING AGENCY... LET S TALK GOALS. To build a successful partnership with a digital marketing agency, you need to know what you want. Having a clear picture of what your
More informationTransforming Business Needs into Business Value. Path to Agility May 2013
Transforming Business Needs into Business Value Path to Agility May 2013 Agile Transformation Professional services career Large scale projects Application development & Integration Project management
More informationA scalable Approach to Sizeindependent
Research Dublin Dept. of Computer Science Rutgers A scalable Approach to Sizeindependent Network Similarity Michele Berlingerio Tina Eliassi- Rad Danai Koutra Christos Faloutsos WIN, September 28 th -
More informationData Mining Applications with R
Data Mining Applications with R Yanchang Zhao Senior Data Miner, RDataMining.com, Australia Associate Professor, Yonghua Cen Nanjing University of Science and Technology, China AMSTERDAM BOSTON HEIDELBERG
More informationSAP Big Data. Markus Tempel SAP Big Data and Cloud Analytics Services
SAP Big Data Markus Tempel SAP Big Data and Cloud Analytics Services Is that Big Data? 2015 SAP AG or an SAP affiliate company. All rights reserved. 2 What if you could turn new signals from Big Data into
More informationData Science Strategies for ReaI-Time Analytics
http://www.mif.pg.gda.pl/homepages/jasiu/stud/eim/pdf/22-ind-zastosowania.pdf Data Science Strategies for ReaI-Time Analytics Kirk Borne @KirkDBorne Principal Data Scientist Booz Allen Hamilton http://www.boozallen.com/datascience
More informationSocial Prediction: Can we predict users action and emotions?
Social Prediction: Can we predict users action and emotions? Jie Tang Knowledge Engineering Group Department of Computer Science and Technology Tsinghua University Email: jietang@tsinghua.edu.cn 1 Motivation
More informationIntroduction to Stream Processing
Introduction to Processing Guido Schmutz DOAG Big Data 2018 20.9.2018 @gschmutz BASEL BERN BRUGG DÜSSELDORF HAMBURG KOPENHAGEN LAUSANNE guidoschmutz.wordpress.com FRANKFURT A.M. FREIBURG I.BR. GENF MÜNCHEN
More informationEMC IT Big Data Analytics Journey. Mahmoud Ghanem Sr. Systems Engineer
EMC IT Big Data Analytics Journey Mahmoud Ghanem Sr. Systems Engineer Agenda 1 2 3 4 5 Introduction To Big Data EMC IT Big Data Journey Marketing Science Lab Use Case Technical Benefits Lessons Learned
More informationResearch on Prescriptive Analytics
Research on Prescriptive Analytics Nov. 2014 Hanmin Jung Head of Dept. of Computer Intelligence Research, KISTI Self-introduction (Google Scholar) 2 2 Self-introduction (Academic Activities) * NABD2015
More informationFrom Big Data to Fast Data. Sina Sheikholeslami
From Big Data to Fast Data Sina Sheikholeslami s.sheikholeslami@digikala.com CEIT GradTalks, Tehran Polytechnic May 29 2017 Overview The War on Big Data Definition The Early Days State-of-the-art Big Data
More informationSocial Influence Analysis in Social Networking Big Data: Opportunities and Challenges
ACCEPTED FROM OPEN CALL Social Influence Analysis in Social Networking Big Data: Opportunities and Challenges Sancheng Peng, Guojun Wang, and Dongqing Xie Abstract Social influence analysis has become
More informationBIG DATA TRANSFORMS BUSINESS. Copyright 2013 EMC Corporation. All rights reserved.
BIG DATA TRANSFORMS BUSINESS 1 Big Data = Structured+Unstructured Data Internet Of Things Non-Enterprise Information Structured Information In Relational Databases Managed & Unmanaged Unstructured Information
More informationData Analytics with HPC L01. Introduction What are data analytics, big data, data science,?
Data Analytics with HPC L01 Introduction What are data analytics, big data, data science,? 1 Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0
More informationIncentivized Sharing in Social Networks
Incentivized Sharing in Social Networks Joseph J. Pfeiffer III Department of Computer Science Purdue University, West Lafayette, IN, USA jpfeiffer@purdue.edu Elena Zheleva LivingSocial Washington, D.C.,
More informationGetting Big Value from Big Data
Getting Big Value from Big Data Expanding Information Architectures To Support Today s Data Research Perspective Sponsored by Aligning Business and IT To Improve Performance Ventana Research 2603 Camino
More informationA Survey on Influence Analysis in Social Networks
A Survey on Influence Analysis in Social Networks Jessie Yin 1 Introduction With growing popularity of Web 2.0, recent years have witnessed wide-spread adoption of rich social media applications such as
More information