Combine Microservices Framework for Flexible, Scalable, High Availability Big Data Analytics

1 Combine Microservices Framework for Flexible, Scalable, High Availability Big Data Analytics
Dan Widdis, Principal Operations Research Analyst
May 10, 2016
Approved for public release; distribution is unlimited.

2 Monolithic vs. Microservice-based

Traditional big data architecture:
- Single, monolithic system
- Little flexibility
- Difficult to scale
- Slow update cycle

Microservice-based big data architecture:
- Small services, lightweight communications
- Low coupling, high cohesion, and automation
- Core components facilitate coordination of many contributing services
- Additional services easily added from multiple vendors, regardless of programming language
- Leverages cloud-based environment for scalability
- Enables frequent low-impact/low-risk updates

3
- Avoids a single monolithic application and technology lock-in
- Allows you to rapidly develop and deploy software products that solve your needs
- Integrates with the client infrastructure, commercial cloud, or Govt. cloud
- Provides an Architectural Pattern for distributed and scalable services
  - Core Services: High Availability (HA) services managing the Combine infrastructure
  - Participant Services: Good Citizens implementing the Big Data service contract, some being VIPs with HA
  - Observer Services: External / Remote services interfacing with Participants

4 Framework Characteristics
- Componentization via services
- Organized around business capabilities
- Products, not projects
  - Microservices can be written/deployed in two weeks
  - Encourages ownership of services
- Smart endpoints, dumb pipes
  - Simple REST* and JSON* communications
  - Receive a request, apply logic, provide a response
- Decentralized governance
  - Teams/vendors free to use the best language/approach/tool

* REST = REpresentational State Transfer, defines the architecture
* JSON = JavaScript Object Notation, lightweight communications
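The "receive a request, apply logic, provide a response" pattern can be sketched as a single function that takes a JSON request body and returns a JSON response body. This is only an illustration, not Combine's actual service contract: the function name `handle_request`, the `values` request field, and the response fields are all assumptions made for the example, and the HTTP transport (the "dumb pipe") is left out.

```python
import json


def handle_request(raw_request: str) -> str:
    """Receive a JSON request, apply logic, and return a JSON response.

    The request/response field names are hypothetical, chosen for this sketch.
    """
    try:
        request = json.loads(raw_request)        # receive a request
        values = request["values"]               # hypothetical request field
        if not isinstance(values, list) or not values:
            raise ValueError("values must be a non-empty list")
        result = {                               # apply logic
            "status": "ok",
            "count": len(values),
            "mean": sum(values) / len(values),
        }
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON, a missing field, or non-numeric data all become
        # an error response instead of crashing the endpoint.
        result = {"status": "error", "message": str(exc)}
    return json.dumps(result)                    # provide a response
```

Because the endpoint only exchanges JSON text, a caller written in any language can use it, which is the point of keeping the logic in the endpoints and the pipes simple.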

5 Framework Characteristics
- Design for failure
  - Resilient to failure of any individual service
- Infrastructure automation
  - DevOps: collaboration of development and operational teams
  - Continuous Delivery approach to automated product delivery
  - Tools: Jenkins, Docker, Vagrant, Puppet
- Decentralized data management
  - Polyglot Persistence
  - Transactionless coordination between services
- Compartmentalized Scalability
  - Scaling of one service does not alter the system as a whole
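One common way to stay resilient to the failure of an individual service is to run several replicas and fail over between them. The sketch below assumes that approach; the slides do not say how Combine implements it, and `call_with_failover`, `AllReplicasFailed`, and the two stand-in replicas are hypothetical names used only for illustration.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no replica of a service could answer."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], payload: str) -> str:
    """Try each replica in order and return the first successful answer."""
    errors = []
    for replica in replicas:
        try:
            return replica(payload)
        except Exception as exc:  # a real caller would catch narrower errors
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed")


# Hypothetical stand-ins for two deployed instances of the same service.
def broken_replica(payload: str) -> str:
    raise ConnectionError("replica unreachable")


def healthy_replica(payload: str) -> str:
    return payload.upper()
```

Calling `call_with_failover([broken_replica, healthy_replica], "ping")` skips the failed instance and returns the healthy one's answer, so the caller never sees the individual failure.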

6 Uses modern, open-source distributed computing capabilities (Technology As A Service)
- Apache Mesos: resource management
- Apache Hadoop Distributed File System (HDFS)
- Apache Spark: analytics
- Apache Kafka: event log
- Titan: graph database
- Marathon, Docker: container management

Example Combine Framework Stack

7 Core Services and Management
- Registration: Metadata collection and service inspection required before a service can join the system
- Discovery: Dynamic service lookup based on capability and load
- Authentication: Facilitates authentication through pluggable authentication providers
- Health Monitoring: Real-time health status of individual services deployed to the system
- Performance Monitoring: Real-time profiling and performance metrics
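Registration, discovery, and health monitoring can be pictured together as a small in-memory registry: a service must supply metadata before it joins, reports its load and health, and is looked up by capability with the least-loaded healthy instance preferred. This is a minimal sketch under those assumptions, not Combine's API; `ServiceRecord`, `Registry`, the 0.0 to 1.0 load scale, and the service names in the usage note are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ServiceRecord:
    name: str
    capabilities: List[str]
    load: float = 0.0      # hypothetical scale: 0.0 (idle) to 1.0 (saturated)
    healthy: bool = True


class Registry:
    def __init__(self) -> None:
        self._services: Dict[str, ServiceRecord] = {}

    def register(self, record: ServiceRecord) -> None:
        """Registration: metadata is required before a service can join."""
        if not record.name or not record.capabilities:
            raise ValueError("name and at least one capability are required")
        self._services[record.name] = record

    def report(self, name: str, load: float, healthy: bool) -> None:
        """Health monitoring: a service reports its current load and status."""
        record = self._services[name]
        record.load = load
        record.healthy = healthy

    def discover(self, capability: str) -> Optional[ServiceRecord]:
        """Discovery: least-loaded healthy service offering the capability."""
        candidates = [
            s for s in self._services.values()
            if s.healthy and capability in s.capabilities
        ]
        return min(candidates, key=lambda s: s.load, default=None)
```

For example, after registering two services that both offer `"analytics"` and reporting loads of 0.9 and 0.2, `discover("analytics")` returns the one at 0.2; if that one is later reported unhealthy, the lookup falls back to the other.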

8 High Availability Technology
- Distributed infrastructures that can be deployed to a Cloud Environment
- Resources used by leading companies (Netflix, Amazon)
- Focusing on distributed computing capabilities that allow for deployment to infrastructures ranging from commodity level hardware, to enterprise level servers, to massively scaling cloud systems

Swagger, Chaos Monkey

9 Benefits to Product Owner
- Frequent delivery of working software
- Smaller, focused teams
- Improved performance and user experience
- Protects sources and need-to-know

Benefits to Operations Team
- Efficient deployment process
- Easily support multiple product and service teams
- Easier to identify and isolate problems
- Service ownership incentivizes preventing reoccurring problems
- Integrates with cloud service containers for monitoring and analytics

10 Benefits to Development Team
- Easier code maintenance and enhancements
- Leverage existing skills, platforms, languages
- Enables independent development teams
- Simplifies tracking dependencies
- Easier to scale bottlenecks

Benefits to Quality Assurance Team
- Containerized structure focuses tests
- Less testing downtime
- Shorter testing cycles and fewer testers required

11 Research and Development - Technology Platforms
- Big Data/Cloud
- Data Collections Technologies
- Distributed Processing and Cluster Testing
- Provenance Data Pedigree
- Cognitive Computing
- Identity Management
- Immersive Environments
- Personalization Services
- Computer Vision
- Context Aware technologies

12 For more information, visit or contact:
Matt Hoffman, Technical Solutions Architect
(814)