Hadoop fundamentals. Big Data Consulting. Robert Gibbon

1 Hadoop fundamentals Big Data Consulting Robert Gibbon

2 Rob Gibbon Industries Belgium
- Focus on designing, deploying & integrating web-scale solutions with Hadoop
- Deliveries for clients in telco, financial services, media & many more

4 The information age
- The economic third wave has badly hit many blue chip organisations
- Manufacturing and retail are in rapid decline in Europe and the US
- Tech, connectivity and information are restructuring our societies
- New trading platforms are empowering small businesses

6 Innovation
- Mass production hates innovation
- Innovation means change: a huge cost with little benefit for production-line economies
- Continuous improvement
- Knowledge services need to innovate to differentiate
- Change in a virtual world can be cheap and yield huge rewards
- Continuous reinvention

7 The Rover bicycle, 1885

8 Big data viz. innovation
- In a free market like the web, innovation can open up new opportunities
- Consumer access to grid computing tech is a recent innovation
- Grid computing opens up new opportunities that would otherwise not be viable
- Ideal for ventures architected around the long-tail economic model

9 The future - thingternet
- The internet of things is with us
- Billions of connected devices, even digital tattoos

10 Big data viz. internet of things
- Billions of connected devices create a huge amount of data
- Until big data tech arrived, the Internet of Things was nearly impossible to monetize

11 The internet of things is a wild west
- Many new, unsolved challenges
- Privacy
- Governance
- Civil liberties
- New challenges = new opportunities

12 Artificial Intelligence
- Driverless transport
- Algotrading platforms
- Personal assistants
- Carers and companions

13 let's get back to hadoop

14 Hadoop in a nutshell
- Free, Open Source Software
- For processing Big Data
- By connecting standard rackmount servers together
- To work together as one
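
The idea of many servers working together as one can be sketched with a toy map/shuffle/reduce word count. This is a minimal illustration in plain Python, not Hadoop's own API: the function names and the sample `chunks` are assumptions made for the example, and each string merely stands in for a block of data held on a different server.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit (word, 1) pairs for one chunk of text, as a single worker would."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Group all emitted values by key, across every worker's output."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped counts for each word."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # On a real cluster the map calls would run in parallel on separate nodes;
    # here they run one after another in a single process.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))


# Hypothetical input: each string stands in for a block stored on a different server.
chunks = ["big data big cluster", "data on many servers", "big servers"]
counts = word_count(chunks)
```

Because each chunk is mapped independently, adding servers lets more chunks be processed at once, which is the property the slide describes.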

15 what can you do with hadoop?

16 Storage
- Pure online data storage, with no other processing
- Low cost per GB for petascale online storage
- Option to directly query and analyse the data is available if required

17 Search
- Example: a huge, constantly changing catalogue of products, like those of eBay and Amazon
- SolrCloud: an advanced search engine serving terabytes of content from Hadoop

18 Messaging
- A distributed message queue backed by a Hadoop cluster: Apache Kafka
- Elastically scalable
- Messages are persisted and replicated for durability
- TBs of messages per broker with predictable performance
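
What "persisted and replicated for durability" buys you can be shown with a toy append-only log copied to several brokers. This is a sketch under stated assumptions, not the Kafka client API: `Broker`, `ReplicatedTopic` and the message names are hypothetical, and everything lives in memory rather than on disk.

```python
class Broker:
    """Hypothetical broker that keeps an append-only list of messages."""

    def __init__(self, name):
        self.name = name
        self.log = []


class ReplicatedTopic:
    """Toy topic that writes every message to all of its brokers."""

    def __init__(self, brokers):
        self.brokers = brokers

    def publish(self, message):
        """Append the message to every replica and return its offset."""
        offset = len(self.brokers[0].log)
        for broker in self.brokers:
            broker.log.append(message)
        return offset

    def read(self, offset, failed=()):
        """Read from the first broker that is not marked as failed."""
        for broker in self.brokers:
            if broker.name not in failed:
                return broker.log[offset]
        raise RuntimeError("no replica available")


topic = ReplicatedTopic([Broker("b1"), Broker("b2"), Broker("b3")])
first = topic.publish("order-created")
second = topic.publish("order-paid")

# Even with two of the three brokers down, the message can still be read.
survivor_copy = topic.read(second, failed={"b1", "b2"})
```

A real system replicates asynchronously and elects leaders per partition; the sketch only shows why a message survives the loss of individual brokers.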

19 Targeting
- Personalised content for users
- Generates and consumes a huge amount of log data for reporting and for predictive analysis
- Predictive analysis is compute intensive
- Can be TBs of data per day

20 Self-service Business Intelligence
- Enterprise Data Lake paradigm
- A very popular emerging use case
- Business users directly access raw datasets using specialised discovery tools built on top of Hadoop: Datameer, Platfora and others

21 Data warehousing
- Migration of the Enterprise Data Warehouse to Hadoop
- Big cost savings versus traditional vendors like Oracle and Teradata

22 Machine learning
- Predictive analytics with e.g. Spark MLlib
- Automatically predict component failures for proactive intervention

23 Big Database
- Low latency, high throughput, high concurrency, high volume
- Financial trading
- Realtime ad auctions
- Volumes of 200 billion transactions per day reliably served in real time

24 Device management
- Analysis of and response to threats detected by an SPI module on a remote switch
- Automated systems management: shut down heating & lighting
- Monitor driver propensity to break the speed limit: offer lower insurance premiums to good drivers
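
The driver-monitoring case above can be sketched as a small aggregation over speed readings from a vehicle. The function names, the 5% threshold and the 10% discount are hypothetical values chosen for illustration; the slides do not specify how the insurer would score drivers.

```python
def speeding_ratio(readings):
    """Fraction of (speed, limit) readings where the speed exceeds the limit."""
    if not readings:
        return 0.0
    over = sum(1 for speed, limit in readings if speed > limit)
    return over / len(readings)


def premium_discount(ratio, threshold=0.05, discount=0.10):
    """Return the discount for a driver; threshold and discount are assumed values."""
    return discount if ratio <= threshold else 0.0


# Hypothetical telemetry: (observed speed, posted limit) in km/h.
careful_driver = [(48, 50), (30, 30), (68, 70), (49, 50)]
hasty_driver = [(48, 50), (52, 50), (30, 30), (70, 70)]

careful_ratio = speeding_ratio(careful_driver)
hasty_ratio = speeding_ratio(hasty_driver)
careful_discount = premium_discount(careful_ratio)
hasty_discount = premium_discount(hasty_ratio)
```

At the scale the deck describes, the same per-driver aggregation would run over readings from many devices at once, which is where a cluster becomes useful.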

25 hadoop - mature?

26 Choice of vendors

27 Solid operational management

28 Apache Impala vs. commercial DBMS

29 Free grid computing

30 Free scale-out database & data warehouse

31 Growing commercial ecosystem

32 Secure and available
- Solid data protection controls
- Solid information security measures
- MasterCard run Hadoop to PCI DSS compliance

33 thanks for listening
be.linkedin.com/in/robertgibbon