Persistent Systems SanGeniX Solution on HANA

1 Persistent Systems SanGeniX Solution on HANA
Aarti Desai, Ph.D., Senior Domain Specialist

2 Persistent at a Glance
Experience: 350 customers; product releases in last 5 years; 14 of the top 20 technology companies in the world are our customers; 23 years in business.
Growth: publicly listed (BSE, NSE); $237.8M FY13 revenue; Rs 1, crore FY13 revenue; 23% CAGR.
The Practice: U.S., APAC and EMEA regions; employees; ISO 9001:2008, ISO 27001:2005, ISO 14001:2004, OHSAS 18001:2007.
Named a Leader in the IAOP 2013 Global Outsourcing 100 service providers list.
2013 Computerworld Honors Laureate.
ComputerWorld
Leading vendor in Global OPD & Specialty Application Development & Management (ADM).
Top 3 player in Smartphone Application Development.
Forrester Research, 2011
Leading Player in Software/ISV R&D & Consumer Software Segments. Ranked Highly in Cloud & Enterprise Mobility Segments.
IAOP, Global Services Media, Zinnov Management Consulting

3 With a Global Footprint and more than 6600 employees.
Map legend: Delivery Center, Sales Office.
Locations: Chicago, Toronto, Scotland, Netherlands, Victoria, Seattle, San Jose, Dallas, Ohio, Quebec, Boston, London, France, Pune, Goa, Nagpur, Hyderabad, Malaysia, Singapore, Tokyo, Bangalore.

4 Recognized R&D Center by Department of Scientific and Industrial Research (DSIR), Govt. of India

5 Four Focus Areas
Partners in Innovation
Systems Biology
Next Generation Sequencing (NGS) Data Analysis
Lab Informatics
Diabetes Research

6 NGS Data Analysis Group
Key Activities
Developing a comprehensive, affordable and user-friendly data analysis suite.
Performing analysis of NGS data in collaboration with leading research institutes in India to validate and fine-tune the data analysis pipelines.
Establishing thought leadership by publishing in peer-reviewed journals, magazines and public forums.
Performed de novo genome assembly and genetic variation analysis of Rubella virus for submission to a regulatory body.
Key Collaborators
DBT-funded NGS analysis suite development project
A 3-year project funded by DBT to develop SanGeniX, an end-to-end NGS data analysis suite.

7 SanGeniX NGS data analysis suite
Web-based user interface
Grid and cloud enabled
Support for de novo genome assembly
Pre-defined and custom workflows
Advanced visualization
Statistical analysis

8 Key Challenges of NGS data analysis
Time intensive: Analysis of sequencing data for moderate to large genomes (> 250 MB) takes anywhere between one and three weeks to complete, depending on the analysis type.
Computationally intensive: Analysis of sequencing data for moderate to large genomes (> 250 MB) requires hardware with more than 256 GB of RAM.
Resource intensive: Analysis of NGS data is a multi-step process and requires manual intervention at every step to evaluate data quality and initiate the next step.
Data interpretation: Results from NGS analysis contain millions of rows, and getting to the crux of the analysis takes a large amount of time and effort.

9 SAP HANA: an In-Memory Data Platform for real-time business
One platform to do it all: One platform to handle everything from data management to schematic search to statistics and predictive analysis. Statistical analysis methods such as self-organizing maps (SOM) and hierarchical clustering are already available in HANA and are widely used in genomic data analysis.
Data compression: HANA compresses the data, and the compression is retained even when the data is stored on disk. Genomics data is extremely voluminous, and HANA compression can dramatically improve TCO for end customers.
Next-generation DB platform: As a next-generation DB platform, HANA has dramatically improved the processing throughput that can be achieved.
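To make the hierarchical clustering mentioned on this slide concrete, here is a minimal sketch in plain Python. It is not HANA's implementation and does not call any HANA API; the function names, the average-linkage choice and the sample expression profiles are all assumptions made for illustration.

```python
from math import dist


def average_linkage(a, b, points):
    """Mean Euclidean distance between all pairs drawn from clusters a and b."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def hierarchical_clusters(points, k):
    """Agglomerative clustering: merge the two closest clusters until k remain.

    Returns clusters as lists of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    while len(clusters) > k:
        best = None  # (distance, index of first cluster, index of second cluster)
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_linkage(clusters[x], clusters[y], points)
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical expression profiles (two conditions per gene), for illustration only.
profiles = [(1.0, 1.1), (0.9, 1.0), (5.0, 5.2), (5.1, 4.9), (9.0, 0.5)]
print(hierarchical_clusters(profiles, 3))  # [[0, 1], [2, 3], [4]]
```

The pairwise search is quadratic per merge, which is fine for a toy input but is exactly the kind of work one would want to push into an in-memory engine for real genomic data sets.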

10 SanGeniX Architecture
Integration with third-party tools, and plugins for custom tools/algorithms.
Scalable for deployment on grid cluster or cloud.
Analysis workflow, reporting and charting interface.
Scalable data management structure.
Architecture diagram labels: Desktop App; Genomics workbench; Workflow; Reporting; Charting UIs; Scalability; Cloud com.; Hadoop; Cluster; Workflow engine; Services; Management tool; Customization; Data filter; Statistical tool; API and Plugin; Add-ons; 3rd Party S/W; File system; External data source; Postgres Database; Data Management.

11 SanGeniX Home Page

12 SanGeniX Predefined Workflow

13 SanGeniX Custom Workflow

14 SanGeniX Running an analysis

15 SanGeniX Results Dashboard

16 Summary
Business/IT/Scientific Challenges
NGS data analysis is time consuming. It requires anywhere between 1 and 3 weeks for data analysis.
NGS data analysis is labor intensive and requires analyst involvement at multiple steps.
NGS data analysis is computationally intensive.
Technology
SanGeniX platform
SAP HANA platform
R analytical package
Value Drivers and Process Innovation
Predefined workflows to minimize user intervention.
Custom workflows to accommodate lab-specific analysis.
HANA infrastructure for a multi-fold increase in data processing speed.
Visualization dashboard to simplify data interpretation.
Increase the number of projects that an organization can undertake.
Improve productivity and reduce the time spent on data management.
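The idea of a predefined workflow that minimizes user intervention can be sketched as a chain of steps, each followed by an automated quality check, so an analyst is only called in when a check fails. This is a hedged illustration, not the SanGeniX workflow engine: the `Step` class, the step functions and the thresholds (minimum base quality 20, at least half the reads retained) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    name: str
    run: Callable[[Dict], Dict]          # transforms the data and returns it
    quality_ok: Callable[[Dict], bool]   # automated check replacing a manual review


def run_workflow(steps: List[Step], data: Dict) -> Dict:
    """Run steps in order; stop and flag for review at the first failed check."""
    completed = []
    for step in steps:
        data = step.run(data)
        if not step.quality_ok(data):
            return {"status": "needs_review", "failed_step": step.name,
                    "completed": completed, "data": data}
        completed.append(step.name)
    return {"status": "done", "failed_step": None,
            "completed": completed, "data": data}


def quality_filter(data: Dict) -> Dict:
    """Keep reads whose lowest base quality is at least 20 (hypothetical cutoff)."""
    kept = [r for r in data["reads"] if min(r["quals"]) >= 20]
    return {**data, "reads": kept, "retained": len(kept) / len(data["reads"])}


def gc_content(data: Dict) -> Dict:
    """Compute the GC fraction over all retained reads."""
    bases = "".join(r["seq"] for r in data["reads"])
    return {**data, "gc": sum(bases.count(b) for b in "GC") / len(bases)}


# A hypothetical two-step predefined workflow.
PREDEFINED = [
    Step("quality_filter", quality_filter, lambda d: d["retained"] >= 0.5),
    Step("gc_content", gc_content, lambda d: 0.0 <= d["gc"] <= 1.0),
]
```

Calling `run_workflow(PREDEFINED, {"reads": [...]})` returns a status of `"done"` when every check passes, or `"needs_review"` with the name of the failing step, which is the only point at which a person has to look at the data. A custom workflow in this sketch is simply a different list of `Step` objects.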

17 HANA Competency at Persistent
Persistent Overview
Experience: 350 customers; product releases in last 5 years; 14 of the top 20 technology companies in the world are our customers; 23 years in business.
ShareInsights BigData Analytics Platform on HANA
A dynamic platform to analyse data and share the insights for events happening around us.
T20 Cricket League dashboard reveals interesting insights on team/player performance against fan conversations.
Marquee project was for the Satyamev Jayate show, to analyse sentiments of people per episode.
Persistent HANA offering
HANA App factory for our customers.
BWA on HANA.
Connectors for various data sources into HANA.
Analytics via BO/VI.
"Persistent is doing good work in understanding on how HANA applications are built, with a real customer, a pharma giant" (Seeker of HANA Solutions at SAP)
SanGeniX on HANA
The Persistent LABS team is building a Next Generation Sequencing (NGS) data analysis application on HANA.
The application involves running the Burrows-Wheeler algorithm on HANA.
The ALN step, which used to take hours/days to complete, now takes only minutes.
Dealer Enablement Mobile Solution on HANA
Mobility solution for distributors and suppliers of a pharma company to place and track their drug orders.
The app supports real-time reporting of drug delivery cycles and statistical analysis of distributor performance and the order execution process, using rapid data processing on HANA.
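The slide names the Burrows-Wheeler algorithm without showing it. As a rough illustration of the underlying idea, the sketch below builds a Burrows-Wheeler transform of a tiny reference and finds exact occurrences of a read by backward search. It is a toy in plain Python, not the SanGeniX or HANA implementation: all function names are hypothetical, the suffix array is built naively, and real aligners also handle mismatches and gaps, which this sketch does not.

```python
def bwt_index(reference):
    """Build suffix array, BWT string and lookup tables for a small reference."""
    text = reference + "$"  # sentinel; sorts before A, C, G, T
    # Naive suffix array: acceptable for a toy input, far too slow for genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # first[c]: number of characters in text that sort strictly before c.
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, bwt, first, occ


def find_exact(read, index):
    """Return sorted start positions of exact matches of read in the reference."""
    sa, bwt, first, occ = index
    lo, hi = 0, len(bwt)  # half-open range of suffix-array rows still matching
    for c in reversed(read):  # backward search: extend the match to the left
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


index = bwt_index("GATTACAGATTACA")
print(find_exact("ATTA", index))  # [1, 8]
```

Each character of the read costs two table lookups regardless of reference length, which is why this family of methods suits alignment of very large numbers of short reads once the index is held in memory.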

18 Thanks!
Persistent SAP Team: Sidharth Sujir, Chintamani Deshmukh, Pramod Pagare