Hadoop fundamentals. Big Data Consulting. Robert Gibbon
- Ellen Fleming
- 5 years ago
1 Hadoop fundamentals Big Data Consulting Robert Gibbon
2 Rob Gibbon Industries Belgium Focus on designing, deploying & integrating web scale solutions with Hadoop Deliveries for clients in telco, financial services, media & many more
3
4 The information age The economic third wave has badly hit many blue chip organisations Manufacturing and retail are in rapid decline in Europe and the US Tech, connectivity and information are restructuring our societies New trading platforms are empowering small businesses
5
6 Innovation Mass production hates innovation Innovation means change: a huge cost with little benefit for production-line economies Continuous improvement Knowledge services need to innovate to differentiate Change in a virtual world can be cheap and yield huge rewards Continuous reinvention
7 The rover bicycle, 1885
8 Big data vis-à-vis innovation In a free market like the web, innovation can open up new opportunities Consumer access to grid computing tech is a recent innovation Grid computing opens up new opportunities that would otherwise not be viable Ideal for ventures architected around the long-tail economic model
9 The future - thingternet The internet of things is with us Billions of connected devices, even digital tattoos
10 Big data vis-à-vis the internet of things Billions of connected devices create a huge amount of data Until big data tech, the Internet of Things was nearly impossible to monetize
11 The internet of things is a wild west Many new, unsolved challenges Privacy Governance Civil liberties New challenges = new opportunities
12 Artificial Intelligence Driverless transport Algotrading platforms Personal assistants Carers and companions
13 let's get back to hadoop
14 Hadoop in a nutshell Free, Open Source Software For processing Big Data By connecting standard rackmount servers together To work together as one
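The idea of standard servers working together as one can be illustrated with the MapReduce pattern that Hadoop popularised. What follows is a minimal single-process sketch in plain Python, not Hadoop code: the "nodes" are ordinary lists, the data is invented for illustration, and the names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical.

```python
from collections import defaultdict


def map_phase(chunk):
    """Each node turns its own chunk of lines into (word, 1) pairs."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_outputs):
    """Group the values for each key, as a framework would between phases."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values for each key to get the final counts."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # On a real cluster the map calls would run in parallel, one per node.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))


# Two "nodes", each holding part of the data set (illustrative data only).
node_chunks = [
    ["big data on standard servers"],
    ["standard servers working as one", "big clusters"],
]
counts = word_count(node_chunks)
```

Each node only ever sees its own chunk; the combined answer appears after the grouped values are reduced, which is the sense in which the machines work as one.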
15 what can you do with hadoop?
16 Storage Pure online data storage, with no other processing Low cost per GB for petascale online storage Option to directly query and analyse the data is available if required
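The cost-per-GB point depends on how much of the raw disk is actually usable. A rough sizing sketch, assuming HDFS with its default replication factor of 3 and ignoring operating system and temporary-space overhead; the function names and the cluster dimensions are hypothetical, chosen only to show the arithmetic.

```python
def usable_capacity_tb(nodes, disks_per_node, disk_tb, replication=3):
    """Usable capacity in TB when every block is stored `replication` times."""
    raw_tb = nodes * disks_per_node * disk_tb
    return raw_tb / replication


def cost_per_usable_gb(cluster_cost, usable_tb):
    """Hardware cost divided by usable gigabytes (1 TB taken as 1000 GB)."""
    return cluster_cost / (usable_tb * 1000)


# Hypothetical cluster: 100 servers, 12 disks each, 4 TB per disk.
usable = usable_capacity_tb(nodes=100, disks_per_node=12, disk_tb=4)
```

With these assumed figures, 4800 TB of raw disk gives 1600 TB of usable storage, so any per-GB cost should be quoted against the usable figure rather than the raw one.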
17 Search Example: huge, constantly changing catalogue of products like eBay and Amazon SolrCloud: an advanced search engine serving terabytes of content from Hadoop
18 Messaging A distributed message queue running alongside a Hadoop cluster: Apache Kafka Elastically scalable Messages are persisted and replicated for durability TBs of messages per broker with predictable performance
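The "persisted and replicated for durability" point can be sketched with a toy in-memory log. This is not the Kafka client API: `ToyLog` and its methods are hypothetical, CRC32 stands in for a real partitioner, and replicas are simply placed on consecutive brokers.

```python
import zlib


class ToyLog:
    """Toy partitioned log that copies every message to several brokers."""

    def __init__(self, num_partitions, brokers, replication=2):
        self.num_partitions = num_partitions
        self.brokers = brokers
        self.replication = replication
        # storage[broker][partition] -> ordered list of messages
        self.storage = {broker: {} for broker in brokers}

    def partition_for(self, key):
        # The same key always maps to the same partition, preserving its order.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def replicas_for(self, partition):
        # Assumption: replicas live on consecutive brokers, wrapping around.
        count = len(self.brokers)
        return [self.brokers[(partition + i) % count]
                for i in range(self.replication)]

    def append(self, key, message):
        partition = self.partition_for(key)
        for broker in self.replicas_for(partition):
            self.storage[broker].setdefault(partition, []).append(message)
        return partition

    def read(self, partition, exclude=()):
        # Serve from the first replica that is not marked as failed.
        for broker in self.replicas_for(partition):
            if broker not in exclude:
                return list(self.storage[broker].get(partition, []))
        raise RuntimeError("no replica available")


log = ToyLog(num_partitions=3, brokers=["broker-a", "broker-b", "broker-c"])
p = log.append("user-42", "page_view")
log.append("user-42", "click")

# Pretend the first replica has failed; the second still holds every message.
failed = {log.replicas_for(p)[0]}
surviving = log.read(p, exclude=failed)
```

Adding brokers and partitions spreads the load, which is the intuition behind "elastically scalable"; a real deployment also handles leader election and disk persistence, which this sketch leaves out.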
19 Targeting Personalised content for users Generates and consumes a huge amount of log data for reporting and for predictive analysis Predictive analysis is compute intensive Can be TBs of data per day
20 Self-service Business Intelligence Enterprise Data Lake paradigm A very popular emerging use case Business users directly access raw datasets using specialised discovery tools built on top of Hadoop: Datameer, Platfora and others
21 Data warehousing Migration of Enterprise Data Warehouse to Hadoop Big cost savings versus traditional vendors like Oracle and Teradata
22 Machine learning Predictive analytics with e.g. Spark MLlib Automatically predict component failures for proactive intervention
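One simple way to picture failure prediction is trend extrapolation: fit a line to a sensor reading and estimate when it will cross a failure threshold. The sketch below uses ordinary least squares in plain Python as a stand-in; it is not Spark MLlib code, and the function names, readings and threshold are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until_threshold(hours, readings, threshold):
    """Estimate the hours left before the fitted trend reaches `threshold`.

    Returns None when the trend is flat or falling (no crossing predicted).
    """
    slope, intercept = fit_line(hours, readings)
    if slope <= 0:
        return None
    crossing_hour = (threshold - intercept) / slope
    return max(0.0, crossing_hour - hours[-1])


# Illustrative data: a component temperature sampled once an hour.
hours = [0, 1, 2, 3, 4]
temperatures = [50, 52, 54, 56, 58]
remaining = hours_until_threshold(hours, temperatures, threshold=70)
```

A production model would typically use many features and a trained classifier or regressor across the whole fleet; the point here is only that a prediction gives operators lead time for proactive intervention.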
23 Big Database Low latency, high throughput, high concurrency, high volume Financial trading Realtime ad auctions Volumes of 200 billion transactions per day reliably served in real time
24 Device management Analysis and response to threats detected by SPI module on remote switch Automated systems management: shut down heating & lighting Monitor driver propensity to break the speed limit: offer lower insurance premiums to good drivers
25 hadoop - mature?
26 Choice of vendors
27 Solid operational management
28 Apache Impala vs. commercial DBMS
29 Free grid computing
30 Free scale-out database & data warehouse
31 Growing commercial ecosystem
32 Secure and available Solid data protection controls Solid information security measures MasterCard run Hadoop to PCI DSS compliance
33 thanks for listening be.linkedin.com/in/robertgibbon