Turning unstructured data into business insight
INNOVATION WHITE PAPER
Barrie Hadfield, founder and CTO of Workshare
October 10th, 2018


Introduction

Over the last year, Workshare has partnered with six New York-based firms to put a machine learning layer into their outbound email flow. The purpose? To look for risky sharing patterns or strange user behavior. Data loss events are most frequently caused by users acting inside a business, whether through unintentional sharing of content with inappropriate people or deliberate theft of data for personal gain. This makes monitoring activity related to content hugely important.

What we have uncovered in the process is that it is valuable to monitor and track the content being created and shared across a business for a huge variety of reasons. Data security is just one of them. It's possible to gain insight into other business issues that directly impact client relationships, productivity and, crucially, profitability, all by drawing correlations between the content shared by a firm.

In a structured database, like a CRM or a finance system, it's simple to perform a query on the data. However, it's not so easy to join the dots between these systems to create greater insight. Much of the data is unstructured, the content is isolated, and it is therefore opaque.

Our aim now is to extend the machine learning layer already used for data security to provide law firms with different ways to query unstructured data. By using the content held within disparate platforms, we'd look to identify trends that reveal business insight. Firms will then be able to use this information for decision making, business improvement and competitive advantage.

Applying machine learning

Currently, Workshare is using machine learning to look at the content of files shared over email and to monitor elements such as who files are being sent from and to. By establishing what normal sharing behavior looks like, the system can then flag anomalies. These anomalies can be investigated to see whether they pose an actual threat to security.

Examining the contents of both inbound and outbound mail flow captures a huge amount of additional data. By extending our machine learning system to process all email (historical, inbound, outbound and internal) as well as all document systems (DMS, file stores, precedent libraries), we can rationalize unstructured data into structured information. Workshare has the potential to provide additional insight that goes well beyond the scope of risk management. We are now looking to create more high-value business insight.

For example: One might assume that in a law firm, the partner has the best relationship with a key account. By monitoring who actually exchanges the most content and communicates most frequently with an important client across different platforms, including email, the DMS or CRM, there is the potential to uncover who truly has the deepest knowledge, expertise and relationship. Mining historical data, perhaps over a long period of time, to track communications that may have been forgotten by, or never known to, the partner can help when planning to tackle a new matter. The correct team can be found to best serve the client and the best approach devised to prepare for the project. Law firms can use a scenario like this to take the guesswork out of customer relationship management.

Taking insight from content

Having an application that can group hundreds of types of documents and interrogate them for new information has huge potential for law firms.

For example: Taking data from a billing system and querying it alongside data from communications platforms can help firms understand and predict which matters are most likely to be profitable. With the issue of fixed fees, it's important to understand the effort vs. the reward on matters. With machine learning tracking communications on the most recent or similar matter for a client, queried with information taken from the billing system, it's possible to predict profitability on a new project. Law firms can understand which clients are most lucrative and where to invest time, or where efficiencies need to be made on other accounts to improve margin.
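To make the billing-plus-communications idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Workshare's method: the record types, field names and sample figures are all hypothetical, and a simple cost-per-message average stands in for whatever machine learning model would be used in practice. It joins billing records to per-matter message counts for a client, then uses the historical cost per message to estimate the margin on a new fixed-fee matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BillingRecord:
    """Hypothetical row from a billing system for one past matter."""
    matter_id: str
    client: str
    hours_recorded: float  # time recorded against the matter
    hourly_cost: float     # assumed average hourly cost of the team


@dataclass(frozen=True)
class CommunicationSummary:
    """Hypothetical per-matter message count derived from mail flow."""
    matter_id: str
    messages_exchanged: int


def cost_per_message(client, billing, comms):
    """Join both sources on matter_id and return cost per message for a client.

    Returns None when the client has no matter present in both sources.
    """
    messages_by_matter = {c.matter_id: c.messages_exchanged for c in comms}
    total_cost = 0.0
    total_messages = 0
    for record in billing:
        if record.client == client and record.matter_id in messages_by_matter:
            total_cost += record.hours_recorded * record.hourly_cost
            total_messages += messages_by_matter[record.matter_id]
    if total_messages == 0:
        return None
    return total_cost / total_messages


def predict_margin(client, proposed_fee, expected_messages, billing, comms):
    """Naive estimate: proposed fixed fee minus the expected cost of effort."""
    rate = cost_per_message(client, billing, comms)
    if rate is None:
        return None
    return proposed_fee - rate * expected_messages


# Invented sample data, for illustration only.
billing = [
    BillingRecord("M-1", "Acme", 30.0, 200.0),
    BillingRecord("M-2", "Acme", 50.0, 200.0),
    BillingRecord("M-3", "Globex", 10.0, 200.0),
]
comms = [
    CommunicationSummary("M-1", 100),
    CommunicationSummary("M-2", 300),
    CommunicationSummary("M-3", 40),
]

estimate = predict_margin("Acme", 9000.0, 150, billing, comms)
```

Message volume is used here only as a rough proxy for effort. A real system would presumably draw on richer signals, such as who was involved and how documents changed, before making any prediction.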

Techniques in practice

Machine learning and traditional data extraction techniques enable Workshare to extract the core business objects, such as a Person (and role); Client; Matter; Document (per file type); and Team from a wide range of data sources.

[Figure: Machine Learning Platform. Diagram labels: Existing Corporate Data Silos; DMS; Exchange; SMTP Flow; Billing; Ethical Walls; CRM; Data Ingestion; Crawlers/Monitors; Raw Data; Data Enrichment Engine; Fingerprinting; Change Extraction; Relationship Identification; Fuzzy Matching; Business Object Identification; Entity Extraction; Database.]

Making this information readily available to a data scientist or downstream system allows the derivation of communication patterns, document deviation, change chronology, and even trends around risk and profitability, all of which can be used to drive better working practices, client relationships, mitigation of risk or greater profitability.

We envisage a system that provides a data scientist with an API or BI system to allow them to query business objects. Then, opaque data from seemingly disparate sources is transformed into valuable information and business insight.
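As an illustration of how fuzzy matching might support business object identification, the sketch below merges name mentions from different silos into Person objects. It is a hedged example only: the `Person` class, the `resolve_people` function, the 0.85 threshold and the sample names are all hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for whatever matching technique the platform actually uses.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Person:
    """Hypothetical Person business object merged from several silos."""
    canonical_name: str
    aliases: set = field(default_factory=set)
    sources: set = field(default_factory=set)


def normalize(name: str) -> str:
    """Lower-case, drop punctuation and reorder 'Last, First' forms."""
    name = name.strip()
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio between two names after normalization (0.0 to 1.0)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def resolve_people(mentions, threshold=0.85):
    """Greedily merge (name, source) mentions into Person objects.

    A mention joins the first existing Person whose canonical name is
    similar enough; otherwise it starts a new Person.
    """
    people = []
    for name, source in mentions:
        match = next(
            (p for p in people if similarity(name, p.canonical_name) >= threshold),
            None,
        )
        if match is None:
            match = Person(canonical_name=normalize(name))
            people.append(match)
        match.aliases.add(name)
        match.sources.add(source)
    return people


# Invented mentions, for illustration only.
mentions = [
    ("Jane Smith", "CRM"),
    ("Smith, Jane", "Exchange"),
    ("jane smyth", "DMS"),
    ("Omar Khan", "Billing"),
]
people = resolve_people(mentions)
```

The greedy, first-match strategy keeps the sketch short. It is order-dependent and would merge different people who share similar names, so a production system would need additional evidence, such as email address or role, before treating two mentions as the same Person.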

Gains from insight acquired

The types of insight Workshare is aiming for, which a content analysis and extraction system like this can provide, are:

Profitability

Law firms need to be as efficient as possible and reduce the amount of write-offs on each matter. The queries run to gauge profitability might include:

What volume of content is shared with a client?
How much time is being written off per matter?
When are bills disputed?
Which teams are most frequently involved in billing queries?
How long does it take a client to pay their invoice?

Productivity

Using machine learning to speed up repetitive processes, we could support lawyers in finding similar documents with similar changes, or in finding documents that deviate from the norm, so it's easy to identify non-standard clauses or agreements.

Planning

Before starting a new matter, teams could look for precedent within their firm to give them a game plan that will maximize efficiency and productivity on the project. A modelled profile could be created based on similar and related past matters. Comparison across data points would give insight into how best to manage the new project based on similarities with historic cases.

Patterns of communication

How frequently are clients contacted by email, via the DMS or by other methods? What is the cost of the communication based on who interacts with the client most frequently? Is it possible to engage other members of the team in the communication to make it more cost-efficient? How long does it take different people to respond to their clients? These factors can be brought into account reviews to improve processes and drive profitability on matters.

Profiling team efficiency

This means monitoring who is working on different tasks on a matter and understanding the time they put in vs. their hourly rate. Can anything be learnt from the most efficient members of a team that can be applied more widely across the firm? Best practice and best delivery methods can be uncovered for wider productivity gains.

Points of weakness

Firms may also want to understand where efficiency most often breaks down and where help is needed the most. So, where are teams enlisting help during a matter? Looking at support requests, or tickets raised, helps in understanding which platforms aren't serving teams during busy periods or where support needs to be made more robust.
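Two of the communication questions above, who interacts with a client most often and how long different people take to respond, can be sketched with plain Python. This is a minimal illustration under assumptions: the `Message` record, its fields and the sample data are hypothetical, and a real mail flow would need thread reconstruction and identity resolution first.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Message:
    """Hypothetical message record extracted from mail flow."""
    thread_id: str
    sender: str
    recipient: str
    sent_at: datetime


def contact_counts(messages, client):
    """Count the messages each firm member exchanged with the client."""
    counts = Counter()
    for m in messages:
        if m.sender == client:
            counts[m.recipient] += 1
        elif m.recipient == client:
            counts[m.sender] += 1
    return counts


def response_times(messages, client):
    """Collect, per firm member, the delay between a client message and
    the first reply sent to the client in the same thread."""
    waiting = {}  # thread_id -> time of the oldest unanswered client message
    delays = {}   # firm member -> list of timedelta values
    for m in sorted(messages, key=lambda msg: msg.sent_at):
        if m.sender == client:
            waiting.setdefault(m.thread_id, m.sent_at)
        elif m.recipient == client and m.thread_id in waiting:
            delay = m.sent_at - waiting.pop(m.thread_id)
            delays.setdefault(m.sender, []).append(delay)
    return delays


def average_response(delays):
    """Average the collected delays for each firm member."""
    return {who: sum(ds, timedelta()) / len(ds) for who, ds in delays.items()}


# Invented sample data, for illustration only.
client = "client@example.com"
t0 = datetime(2018, 10, 1, 9, 0)
messages = [
    Message("T1", client, "partner", t0),
    Message("T1", "associate", client, t0 + timedelta(hours=1)),
    Message("T2", client, "associate", t0 + timedelta(hours=2)),
    Message("T2", "associate", client, t0 + timedelta(hours=5)),
    Message("T3", client, "partner", t0 + timedelta(hours=6)),
    Message("T2", "associate", client, t0 + timedelta(hours=7)),
    Message("T3", "partner", client, t0 + timedelta(hours=30)),
]

busiest = contact_counts(messages, client).most_common(1)
averages = average_response(response_times(messages, client))
```

Combining these counts and delays with hourly rates from the billing system would be one way to approach the cost-of-communication question, though that step is left out here.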

Clear vision of the future from opaque data

If a firm had a system that made it easy for data scientists to reach into unstructured data and content, and correlate this with structured data, the possibilities would be limitless.

In a structured database, like a finance system or DMS, you can perform any query you like. These systems aren't joined together though, leaving true insight hard to reach. Our strategy is to provide machine learning to law firms, so they can establish direct correlations between their structured and unstructured data, linking systems through the content that's shared or stored within them.

[Figure: Workshare ML Data API. Diagram labels: Custom Applications; Tableau; Python; R; Machine Learning tools; Analytics and Machine Learning; Custom Reports; Database; Machine Learning Platform.]

These APIs could be used to feed into other reporting systems and products to surface yet more information related to key personnel, clients or accounting.

The core hypotheses are:

Firms leveraging data science are hindered because the majority of their data is unstructured and distributed across many systems.
There is value in the content and data in documents and systems. This value can be exploited to increase competitive advantage and reduce business risk.
By applying machine learning, trends and patterns can be seen in business objects.

Firms can use the content they share to go beyond looking at where risk in sharing might lie, to solving wider business issues, overcoming new challenges and gaining competitive advantage.

About Workshare

Workshare is dedicated to helping professionals compare, protect and share their content. Since 1999, Workshare has developed and released intelligent technology for business services firms. Now, more than two million professionals use Workshare around the world.

About Barrie Hadfield

Barrie was one of the original founders of Workshare and remains the firm's CTO. During his tenure, Workshare has gained more than 14,000 enterprise customers in 70 countries. He brings to the business more than 20 years of experience conceiving, producing, and innovating in software.

Contact us

web: barrie.hadfield@workshare.com
phone: +44 (0)
address: 20 Fashion Street, London, E1 6PX, UK