Understanding Return On Investment (ROI) and Total Cost of Ownership (TCO) for Machine Translation Dion Wiggins Chief Executive Officer dion.wiggins@asiaonline.net
There will always be someone who says that they can do it cheaper but at what cost?
Just Add Water. Upload Data. If it was really this easy, don't you think custom MT success stories would be everywhere? Anyone who tells you MT is easy is either trying to fool you or is incompetent. Time, Effort, Skill and Investment are Mandatory.
An understanding of Total Cost of Ownership (TCO) is essential. TCO is a means for businesses to assess both the direct and indirect costs and benefits related to any purchase. The intention is to arrive at a final figure that reflects the effective cost of purchase, all things considered. There is no specific formula for TCO; it is unique to each organization and requirement. TCO should include ALL costs over the entire lifetime of the investment, not just the initial costs: initial costs, recurring costs, replacement costs and end-of-life costs.
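Since there is no single TCO formula, the following is only a minimal sketch of how the four cost types named above (initial, recurring, replacement, end of life) could be added up over a lifetime. The `CostItem` fields, the function name and all figures are hypothetical illustrations, not Asia Online numbers.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    """One line of a TCO worksheet (hypothetical structure)."""
    name: str
    initial: float = 0.0             # one-time cost at purchase or setup
    recurring_per_year: float = 0.0  # fees, salaries, maintenance per year
    replacement: float = 0.0         # expected replacement spend over the lifetime
    end_of_life: float = 0.0         # decommissioning and similar costs


def total_cost_of_ownership(items, lifetime_years):
    """Sum all cost types for every item over the whole lifetime."""
    total = 0.0
    for item in items:
        total += (
            item.initial
            + item.recurring_per_year * lifetime_years
            + item.replacement
            + item.end_of_life
        )
    return total


# Hypothetical example figures, for illustration only.
items = [
    CostItem("servers", initial=10_000, recurring_per_year=1_200,
             replacement=4_000, end_of_life=500),
    CostItem("mt_service", recurring_per_year=6_000),
]
tco = total_cost_of_ownership(items, lifetime_years=3)
print(f"TCO over 3 years: {tco:,.0f}")
```

Each organization would replace the items with its own cost categories; the point is only that recurring costs scale with the lifetime while the other cost types are counted once.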
Human Resources Technical Resources Operational Resources
Computer Hardware and Programs
  Network Infrastructure Costs
    Hardware and Software (switches, routers, firewall, storage)
    Bandwidth (recurring service fees, excess bandwidth fees, data transfer fees)
    Storage Costs
    Setup Costs
  Server Hardware and Software Costs
    Training Servers
    Translation Servers
    Backup Servers
    File Servers
    Database Servers
    Translation Management System Servers
    Setup Costs
  Workstation Hardware and Software
  Translation Management System (TMS)
  Content Management System (CMS)
  External Adaptors and Connectors
Installation and Integration of Hardware and Software
Purchasing Research
Warranties and Licenses
Annual Software Maintenance Fees
License Tracking (Compliance)
Specialist External Consultants
Risks: susceptibility to vulnerabilities, availability of upgrades, patches and future licensing policies, etc.
General Operational Expenses
  Infrastructure (floor space for equipment and personnel)
  Electricity (for related equipment, cooling, backup power)
  Testing Costs
  Pilot Costs
  Migration Costs
  Downtime, outage and failure expenses
  Diminished performance (i.e. users having to wait, diminished moneymaking ability)
  Security (including breaches, loss of reputation, recovery and prevention)
  Backup and Recovery Process
  Technology Training
  Audit (internal and external)
  Insurance
  Corporate Management Time
  Legal costs for contracts
Long Term Expenses
  Replacement
  Future upgrade or scalability expenses
  Depreciation of hardware and software investments
  Decommissioning
Machine Translation Costs
  Customizing Machine Translation
    Initial Customization Costs (setup, one time, recurring)
      Service Fees
      CPU Costs
      Storage Costs
    Improvement Costs (setup, one time, recurring)
      Service Fees
      CPU Costs
      Storage Costs
  Translation Costs
    Costs of translation
      Per word
      CPU costs
      Recurring service fees
      Licensing costs
      Storage costs (near term and long term)
    Editing and Proofing Costs (see Human Resources Costs)
Data Related Costs
  Data purchase and licensing costs (e.g. purchasing translation memories from TAUS)
  Data export and conversion costs
  Data preparation costs
  Data cleaning costs
  Data alignment costs
  Data gathering costs
Software Development Costs
  Integration Costs
  Custom Workflow Development
  Project Management Costs
Human Resources Costs
  Salaries and Benefits
    Information Technology Personnel
      Programmers / Software Developers
      Network Engineers
      Natural Language Processing / Computational Linguistics personnel
      Outsourcing Costs (for all of the above)
    Translation Personnel
      Translators
      Editors
      Proof Readers
      Quality Assurance
      Project Managers
      Machine Translation Engineers
  Training
    Training Course Costs
    Certification Costs
    Training and Ramp Up Delay Costs
Recruitment
  Recruitment Costs
  Recruitment Delay Costs
  Testing and Qualification Verification Costs
Management Personnel
Language Studio Secure Cloud: Software-as-a-Service (SaaS)
  Load On Demand: Asia Online secure data center; Asia Online hardware; managed by Asia Online; servers shared across many customers.
  Dedicated Servers: Asia Online secure data center; Asia Online hardware; managed by Asia Online; servers dedicated to you only.
Language Studio On-Site: In Your Office / Data Center
  Secure Managed SaaS: your data center; your hardware; managed remotely via VPN by Asia Online.
  Licensed Software: your data center; your hardware; managed by your staff with no outside access.
Language Studio Secure Private Cloud
  Asia Online also offers all of the above deployment models in a tailored-to-fit Secure Private Cloud option. Asia Online systems architects will design a configuration to meet your organization's specific security, scale and functional requirements. Secure Private Cloud offerings are available in a wide range of geographic locations.
Language Studio Secure Cloud - Hosted SaaS
  Computer Hardware and Programs
    Workstation Hardware
    Translation Management System (client / server)
  Machine Translation Costs
    Customization Costs (initial, improvement)
    Translation Costs (words, human resources)
    Data Related Costs (export, validation)
  Human Resources Costs
    Salaries and Benefits
      Information Technology Personnel
      Translation Personnel
      Management Personnel
    Training
    Recruitment

Language Studio On-Site - Licensed Software
  Computer Hardware and Programs
    Network Infrastructure Costs
    Server Hardware and Software Costs
    Workstation Hardware
    Translation Management System (client / server)
    Software Maintenance Fees
  Machine Translation Costs
    Customization Costs (initial, improvement)
    Translation Costs (words, human resources)
    Data Related Costs (export, validation)
  Human Resources Costs
    Salaries and Benefits
      Information Technology Personnel
      Translation Personnel
      Management Personnel
    Training
    Recruitment
  General Operational Expenses
    Infrastructure (floor space for equipment and personnel)
    Electricity (for related equipment, cooling, backup power)
    Backup and Recovery Process
    Technology Training
    Depreciation
It is important to understand all the costs associated with MT, not just the basic costs.
[Chart: costs by month, months 1-10. Total Costs for 10 Months Service: EUR 82,357]
ROI is a performance measure used to evaluate the efficiency of an investment or to compare the efficiency of a number of different investments. To calculate ROI, the benefit (return) of an investment, net of its cost, is divided by the cost of the investment; the result is expressed as a percentage or a ratio. The Return On Investment formula:
ROI = (Gain from Investment - Cost of Investment) / Cost of Investment
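The ROI formula on this slide can be written as a small function. This is a minimal sketch; the function name and the example figures are hypothetical and chosen only to show the arithmetic.

```python
def roi(gain_from_investment, cost_of_investment):
    """Return ROI as a ratio: (gain - cost) / cost."""
    if cost_of_investment <= 0:
        raise ValueError("cost_of_investment must be positive")
    return (gain_from_investment - cost_of_investment) / cost_of_investment


# Hypothetical example: an investment of 100,000 that returns 150,000.
example = roi(gain_from_investment=150_000, cost_of_investment=100_000)
print(f"ROI: {example:.0%}")  # expressed as a percentage
```

A result of 0 means the investment only paid for itself; a negative result means it lost money.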
Productivity Gains: The increase in productivity for all human tasks within the translation production process and workflow. Profit Margins: The increase in profit margin for each project and progressive improvements in profit margin as custom machine translation engines mature.
There are also a number of other, less obvious ROI benefits, such as greater consistency in translations, which increases customer satisfaction. Other business benefits relate more to sales and ongoing business deal flow. These include:
New Deals: New deals that would not be possible without machine translation, such as:
  New deals where machine translation was a component.
  New deals that are machine translation only.
  New deals where time was a critical factor and that would not have been possible with a human-only approach.
Competitive Deals: Time, quality and price are all factors that impact the competitiveness of a project bid:
  Deals where you could offer a more competitive bid than your competitors due to the use of machine translation.
  Deals that would have been lost to a competitor without the advantages, such as speed, terminology accuracy and writing style consistency, that good machine translation offers.
  Deals where the ability to be more flexible on pricing during negotiations, due to higher profit margins, helps to win the project.
TEP: Translate. Human translators can typically translate 2,000-3,000 words per day to deliver a first-pass translation. A single Language Studio custom engine instance translates at approximately 3,000 words per minute and can run 24 hours a day, 7 days a week. Our fastest customer is translating 2.4 billion words per day. What used to be the most expensive and slowest part of translation can be the lowest cost and fastest.
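The throughput figures on this slide can be put side by side with a little arithmetic. The sketch below uses only the rates quoted above (3,000 words per minute per engine instance, 2,000-3,000 words per day per human translator); the variable names are illustrative.

```python
ENGINE_WORDS_PER_MINUTE = 3_000      # single custom engine instance (from the slide)
HUMAN_WORDS_PER_DAY = (2_000, 3_000)  # typical first-pass range (from the slide)

# One engine instance running 24 hours a day.
engine_words_per_day = ENGINE_WORDS_PER_MINUTE * 60 * 24

# How many human translators' daily first-pass output one instance matches,
# at the low and high end of the human range.
translator_equivalents = tuple(
    engine_words_per_day / human_rate for human_rate in HUMAN_WORDS_PER_DAY
)

print(f"Engine instance: {engine_words_per_day:,} words per day")
print(f"Equivalent translators: {translator_equivalents[1]:,.0f} to "
      f"{translator_equivalents[0]:,.0f}")
```

This compares raw first-pass volume only; it says nothing about the editing and proofing effort that follows.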
TEP: Edit. High-quality MT that has been customized by experts can deliver output that requires little or no editing. Generic or lower-quality MT will require more editing and can sometimes be more expensive and time-consuming than human only. The types of error made by MT differ from those made by human translators. Many Language Studio users report output where greater than 50% of content requires no editing at all. Common Mistake: When content needs to be retranslated, not passing it through the editing stage. Editors should be making minor corrections only, not retranslating. Costs for this should be included in ROI calculations.
TEP: Proof. Proofing with high-quality machine translation is a very different experience from proofreading lower-quality machine translation output. As terminology in a mature engine is more consistent than with human translators, there is less work to perform in order to bring the quality to a final publication level. While the error patterns made by machine translation are different from those made by humans, by this stage in the overall translation process the majority of the errors should have been eliminated. This process should be as efficient as, or even more efficient than, a human-only translation process.
Automated Metrics             Language Studio   Google    Bing   Systran
BLEU Case Sensitive                     71.86    40.21   35.71     26.67
BLEU Case Insensitive                   76.71    41.27   36.73     27.52
F-Measure Case Sensitive                86.52    69.98   68.26     60.67
F-Measure Case Insensitive              88.93    70.62   69.06     61.06
TER Case Sensitive                      21.75    44.08   46.36     55.30
TER Case Insensitive                    19.93    43.81   46.00     55.18

Human Review                            Language Studio   Google
Perfect, Exact Match to Reference                   369        8
Perfect, But Different to Reference                 310      270
Perfect and Better than Reference                    30        -
Minor Edits Required (1-2 Edits)                    230      111
Medium Edits Required (3-4 Edits)                    61      291
Bad, Needs Retranslation                              -      320
Total                                             1,000    1,000

Google: 61% needed considerable work (medium edits or retranslation).
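The human review counts above can be turned into shares per category. The sketch below is illustrative only: it copies the counts from the slide, treats a "-" entry as zero (an assumption), and groups "medium edits" and "needs retranslation" as "considerable work", which matches the 61% figure quoted for Google.

```python
# Human review counts from the slide; "-" is assumed to mean 0.
review = {
    "Language Studio": {"exact": 369, "different": 310, "better": 30,
                        "minor": 230, "medium": 61, "bad": 0},
    "Google": {"exact": 8, "different": 270, "better": 0,
               "minor": 111, "medium": 291, "bad": 320},
}

PERFECT = ("exact", "different", "better")
CONSIDERABLE_WORK = ("medium", "bad")


def share(counts, categories):
    """Fraction of reviewed segments that fall in the given categories."""
    return sum(counts[c] for c in categories) / sum(counts.values())


for system, counts in review.items():
    print(f"{system}: perfect {share(counts, PERFECT):.1%}, "
          f"considerable work {share(counts, CONSIDERABLE_WORK):.1%}")
```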
Word Costs                            Human   Language Studio      Google
                                                   MT + Human  MT + Human
Translation                          0.1700            0.0052     0.00014
Post Editing                         0.0500            0.0500      0.1300
Proof Reading                        0.0300            0.0250      0.0300

Project Size (Words): 1,000,000

Total Costs                           Human   Language Studio      Google
Translation                         170,000             5,200         140
Post Editing                         50,000            50,000      88,400
Re-Translation                            -                 -      54,400
Re-Translation Post Editing               -                 -      16,000
Proof Reading                        30,000            25,000      30,000
TOTAL COSTS                         250,000            80,200     188,940

Notes: Rate will vary depending on package purchased. Typical package between US$ 0.005-0.0002 per word. Re-translation covers only content that needed to be 100% retranslated.

Savings Analysis                          Language Studio      Google
Human vs. MT Value                                169,800      61,060
Human vs. MT %                                        68%         24%
Language Studio vs. Google Value                  108,740
Language Studio vs. Google %                          43%

Words to Break Even with Language Studio = 191,402
Total Savings for Full Word Pack Utilization = US$ 1,103,700
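The totals in the cost comparison can be reproduced from the per-word rates. The sketch below is a reconstruction under stated assumptions, not the author's worksheet: it assumes 320,000 of the 1,000,000 Google words needed full retranslation (the 320-in-1,000 rate from the human review), that those words are retranslated and post-edited at the human rates, and that Google post editing applies to the remaining 680,000 words. These assumptions reproduce the figures on the slide.

```python
PROJECT_WORDS = 1_000_000
RETRANSLATED_WORDS = 320_000  # assumption: 320 of 1,000 reviewed segments, scaled to words

# Per-word rates from the slide.
HUMAN = {"translation": 0.17, "post_editing": 0.05, "proofing": 0.03}
LANGUAGE_STUDIO = {"translation": 0.0052, "post_editing": 0.05, "proofing": 0.025}
GOOGLE = {"translation": 0.00014, "post_editing": 0.13, "proofing": 0.03}

human_total = round(PROJECT_WORDS * sum(HUMAN.values()), 2)
language_studio_total = round(PROJECT_WORDS * sum(LANGUAGE_STUDIO.values()), 2)

google_total = round(
    PROJECT_WORDS * GOOGLE["translation"]
    + (PROJECT_WORDS - RETRANSLATED_WORDS) * GOOGLE["post_editing"]
    + RETRANSLATED_WORDS * HUMAN["translation"]   # re-translation by a human
    + RETRANSLATED_WORDS * HUMAN["post_editing"]  # post editing of re-translated text
    + PROJECT_WORDS * GOOGLE["proofing"],
    2,
)

savings_ls_vs_human = round(human_total - language_studio_total, 2)
savings_google_vs_human = round(human_total - google_total, 2)
savings_ls_vs_google = round(google_total - language_studio_total, 2)

print(human_total, language_studio_total, google_total)
print(savings_ls_vs_human, savings_google_vs_human, savings_ls_vs_google)
```

Note that the 43% "Language Studio vs. Google" figure on the slide corresponds to the 108,740 difference divided by the 250,000 human-only cost.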
MT + Human can deliver higher quality output than human only translation processes Cost reductions Faster deliverables Higher customer satisfaction Happier post editors
Directly linked to time to improve quality. Faster improvement delivers faster ROI. Many customers recover costs on their first project. Projects progressively become lower cost.
Common Mistake: Measuring the quality of the initial engine and determining costs from this engine. Many issues in data and formatting can only be seen once an engine has been customized. One of the most important metrics is how quickly a custom engine can improve. This differs between vendors, engines, language pairs and domains. Many of the most significant issues can be addressed very quickly. Metrics should begin once an initial correction/diagnostic phase has been completed and corrective action taken.
Language Studio uses the Clean Data SMT approach with expert human guided granular sub-domains.
[Figure: customization hierarchy for an example engine. Language Pair: EN-ES. Top-Level Domain: Automotive. Clients: Honda, Toyota. Products: Cars, Motorbikes. Engines/Sub-Domains: User Manuals, Engineering, Service Manuals, Marketing, Service Reports. Customization level and typical productivity gain: Generic ????; Domain < 20-40%; Client 50%+; Product 90%+; Target Audience / Purpose 150-300%+. Markers show the Google/Bing quality level and the typical competitor quality level.]
Generic MT from Google, Bing, etc. offers unknown productivity gains and sometimes a productivity loss due to lack of control.
Competitors offer < 20-40% productivity gains due to a domain-centric, dirty data SMT customization model.
Language Studio targets productivity gains of 150-300%+ with a granular sub-domain, clean data SMT approach. It provides complete control of writing style and terminology and is mapped to the target audience, reducing editing effort.
Post Editing Cost: MT learns from post editing feedback and the quality of translation constantly improves. The cost of post editing progressively reduces as MT quality increases after each engine learning iteration.
Post Editing Effort Reduces Over Time: The post editing and cleanup effort gets easier as the MT engine improves. Initial efforts should focus on error analysis and correction of a representative sample data set. Each successive project should get easier and more efficient.
Job Duration and Human Resources: MT with the same number of physical human resources can reduce the time required to complete the job (job duration) vs. human only. MT + human post editing reduces overall project duration by multiples compared with a human-only approach.
[Charts: Cost Per Word, Quality and Job Duration. Axes: Engine Learning Iteration (1-6), Human Resources. Series and labels: Raw MT Quality, Publication Quality Target, Post Editing Effort, Post Editing (Human Translation), MT Post Editing, MT + Human Post Editing, Human Translation + Human Post Editing.]
Skilled Language Studio Linguists understand the complexities of customizing high-quality custom machine translation engines, so that if you don't want to, you don't have to. Language Studio Linguists will listen to your issues, help to identify the cause and apply a correction on your behalf. Users can focus on translation and, when required, the identification of issues. Skilled Language Studio Linguists perform the more complex task of resolving the issues raised by clients. Your account manager will assign a linguist to work with you to quickly resolve each issue.
Language Studio Linguists develop a specific roadmap for improvement for each custom engine. This ensures the fastest possible development path to quality. You can start from any level of data. We will develop based on the following:
  Your quality goals
  Amount of data available in the foundation engine
  Amount of data that you can provide
Quality expectations are set from the outset. Asia Online performs a majority of the tasks based on your guidance. Many are fully automated.
[Workflow diagram with three phases: Initial Customization, Incremental Improvement, and Evaluation & Release. Steps shown: Select Language Pair and Top Level Domain; Create Sub-Domain Engine Project; Pre-Customization Consultation with Client; Upload Data; Clean Data; Data Manufacturing; Client Data Validation; Training; Fine Tune with Rules; Evaluate Quality; Release to Diagnostic Review; Create New Project Version; Machine Translate / Post Edit; Linguist Review; Create Training / Quality Improvement Plan; Release to Production. Decision points: Acceptable Quality? (Yes / No); More Client Data Needed? (Yes / No). Key: Client Role; Asia Online Specialist; Asia Online Automated.]
German -> Slovenian technical engineering project of ~1 million words.
Encouraged by TAUS and other DIY MT advocates, the client tried to build their own DIY MT engine:
  Hired a computational linguist.
  Spent 6 months trying to build a quality system.
  Result: Inconsistent and unusable results. Google was better quality (although also unusable). Considerable time, effort and cost spent.
Built an Expert Customized Engine with Asia Online:
  Worked with Language Studio Linguists, who created a custom engine plan to deliver high-quality output and address issues.
  Result: In a very short period had an engine that was delivering high-quality output. Some output was as good as their human translators were producing. Cost and time were greatly reduced compared to a human-only or DIY approach.
http://www.asiaonline.net/en/resources/casestudies/iolar1.aspx
Language Studio Workshop & Round-Table
20-22 February 2014 (Thursday, Friday, Saturday), Bangkok, Thailand
Language Studio Advanced MT Customization Workshop
  Empower your organization to control quality, terminology & writing style
  Become a Certified Language Studio Linguist
  Learn how to fully customize a machine translation engine
  Advanced customization techniques
  Hands-on training with Language Studio tools
Machine Translation in Business Round-Table
  Discuss with industry peers hot topics relating to machine translation
  Best practices in post editing and customization
  Return On Investment and much more
http://www.asiaonline.net/en/resources/events