Nexus 2.2: Overview. Dr Nicholas Marchetti Product Manager Dr Adrian Fowkes Senior Scientist


1 Nexus 2.2: Overview Dr Nicholas Marchetti Product Manager Dr Adrian Fowkes Senior Scientist

2 Derek Nexus 6.0 Update Dr Nicholas Marchetti Product Manager

3 New Features and Enhancements: Derek Nexus 6.0
- An update to the current knowledge base, including the addition of new alerts. Endpoints of focus have been Mutagenicity in-vivo, Mutagenicity in-vitro, Carcinogenicity, Chromosome damage and Skin sensitisation.
- Negative Predictions: now available for the Skin Sensitisation endpoint. References for nearest neighbours are now included in this feature.
- New endpoints of Glucocorticoid Receptor Agonism and Androgen Receptor Modulation have been included.

4 NEGATIVE PREDICTIONS

5 Why do we need negative predictions?
- Previously, a lack of alerts firing would always lead to this:
- For endpoints that are well developed, we wanted to provide a stronger prediction and provide more information for expert review:

6 Negative Predictions Workflow in Derek
Query Compound -> Match alert or example in Derek KB?
- Yes: Prediction for/against toxicity
- No: Compare structural features to publicly available data -> Make negative predictions, provide information for expert assessment

7 Confidence in Negative Predictions
- Bacterial Mutagenicity in-vitro and Skin Sensitisation will not provide a "Nothing to Report".
- In the absence of an alert, we compare the structure to an external dataset, to assess if there could be any other cause for concern:
  - No misclassified or unclassified features: highly confident negative prediction
  - Contains misclassified features, or contains unclassified features: slightly lower confidence, as some features may be a cause for concern

8 References We are now providing references for nearest neighbours containing misclassified features for both mutagenicity and skin sensitisation.

9 Skin sensitisation: Lhasa Skin Sensitisation Negative Prediction Dataset
- A fragment library is generated from an internal skin sensitisation dataset of > 2500 chemicals.
- The dataset consists of human, mouse and guinea pig data.
- The overall experimental call is derived using a hierarchical approach: Human, then Standard animal, then Non-standard animal, then Other animal (positive data only).
- Data sources and assays listed, in hierarchy order: BgVV; Basketter; LLNA (OECD Guidelines); GPMT (OECD Guidelines); Non-standard LLNA; Non-radioactive LLNA; Buehler (closed patch) test; Freund's complete adjuvant test; Freund's complete adjuvant test (modified); Split adjuvant test; Single injection adjuvant test; Single injection adjuvant test (modified); Maurer optimisation test; Optimisation test; Open epicutaneous test; Closed epicutaneous test; Draize test; Draize test (altered); Mouse ear swelling test.

10 Hierarchy of Skin Sensitisation Assays in dataset
Chemical of interest:
1. Is there any human data? Yes: call assigned as human result. No: go to 2.
2. Is there any standard assay data? Yes: conservative call between standard assay results. No: go to 3.
3. Is there any non-standard assay data? Yes: conservative call between non-standard assay results. No: go to 4.
4. Is there positive data from other assays? Yes: call assigned as positive. No: no call assigned.
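The hierarchy on this slide can be sketched as a small function. This is a minimal illustration, not Lhasa's implementation: the function and parameter names are hypothetical, a single human result per chemical is assumed, and "conservative call" is assumed to mean "positive if any result in that tier is positive".

```python
from typing import Optional, Sequence


def conservative(calls: Sequence[str]) -> str:
    """Assumed meaning of 'conservative call': positive if any result is positive."""
    return "positive" if "positive" in calls else "negative"


def overall_call(human: Optional[str] = None,
                 standard: Sequence[str] = (),
                 non_standard: Sequence[str] = (),
                 other: Sequence[str] = ()) -> Optional[str]:
    """Walk the tiers in order; the first tier that has data decides the call."""
    if human is not None:
        return human                       # human data overrides animal data
    if standard:
        return conservative(standard)      # e.g. LLNA / GPMT results
    if non_standard:
        return conservative(non_standard)
    if "positive" in other:
        return "positive"                  # other assays count only when positive
    return None                            # no call assigned


print(overall_call(standard=["negative", "positive"]))
print(overall_call(other=["negative"]))
```

Returning `None` for the final branch keeps "no call assigned" distinct from a negative call, which matters because negative data from the lowest tier is not used.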

11 No misclassified or unclassified features
This type of prediction is given for compounds where all features in the molecule are found in accurately classified compounds from the data set.
Query Compound -> Search datasets (Derek KB) -> Non-sensitiser

12 Contains misclassified features
- Misclassified features are those that have been derived from a non-alerting positive compound in the data set.
- To get this type of prediction, your query compound has a feature in common with a non-alerting positive compound.
- A non-alerting positive compound is experimentally positive in a particular assay (e.g. Ames), but is not covered by an alert in the Derek Knowledgebase.
- Derek does not have a mechanistic explanation as to why the data set example is positive, therefore expert assessment is required.

13 Misclassified features: Skin Sensitisation Workflow
Query Compound -> No Alert Fired (Derek KB) -> Search Lhasa Skin Sensitisation Negative Prediction Dataset -> Highlighted feature found in Sensitiser -> Expert review: Is shared feature the cause of sensitisation?

14 Non-sensitiser (contains misclassified features)

15 Contains unclassified features
- Unclassified features are those that have not been found in the data set.
- To get this type of prediction, your query compound has fired no alerts, but does contain a structural fragment that is not covered in the respective data sets.
- Although Derek has found no mechanistic reason for this compound to be positive, the unclassified feature could be of concern, therefore expert assessment is required.

16 Unclassified features: Skin Sensitisation Workflow
Query Compound -> No Alert Fired (Derek KB) -> Search Lhasa Skin Sensitisation Negative Prediction Dataset -> Features not found -> Unclassified Features
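The three kinds of negative prediction described on slides 11 to 16 can be sketched with plain set operations. This is a hedged illustration only: it covers the branch where no alert has fired, it treats structural features as opaque strings, and all names (`classify_negative`, `dataset_features`, `misclassified_features`) are hypothetical. How Derek labels a compound that has both misclassified and unclassified features is not stated in the slides, so the combined label here is an assumption.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class NegativePrediction:
    label: str
    misclassified: Set[str] = field(default_factory=set)
    unclassified: Set[str] = field(default_factory=set)


def classify_negative(query_features: Set[str],
                      dataset_features: Set[str],
                      misclassified_features: Set[str]) -> NegativePrediction:
    """Classify a non-alerting query compound against a fragment library.

    dataset_features: every fragment seen in the reference data set.
    misclassified_features: fragments derived from non-alerting positives.
    """
    known = dataset_features | misclassified_features
    misclassified = query_features & misclassified_features
    unclassified = query_features - known   # fragments never seen in the data

    notes = []
    if misclassified:
        notes.append("misclassified")
    if unclassified:
        notes.append("unclassified")

    label = "Non-sensitiser"
    if notes:
        # Combined wording for both cases at once is an assumption.
        label += " (contains " + " and ".join(notes) + " features)"
    return NegativePrediction(label, misclassified, unclassified)


library = {"amide", "ester", "thiol"}
from_non_alerting_positives = {"thiol"}
print(classify_negative({"amide", "thiol"}, library, from_non_alerting_positives).label)
```

Returning the offending fragments alongside the label mirrors the slides' point that these predictions exist to direct expert review to specific features.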

17 Negative Predictions Performance: Skin Sensitisation, 5-fold cross validation
How often are non-sensitisers correctly predicted? Derek no alert: 73.9%. Derek with skin negative predictions: 77.3%.
[Bar chart: negative predictivity (%), prevalence = 51%, for non-alerting compounds, non-sensitisers, non-sensitisers with misclassified features and non-sensitisers with unclassified features; a value of 52.1% is also shown.]
How often does each type of negative prediction occur?
[Pie chart: non-sensitiser, non-sensitiser with misclassified features, non-sensitiser with unclassified features; values shown are 12%, 8% and 80%.]
Misclassified and unclassified features occur infrequently and represent areas of increased uncertainty, which may require further scrutiny.

18 SKIN SENSITISATION

19 Derek KB: What's new?
- 10 new Skin Sensitisation alerts, 7 of which were built using member donated data
- 12 alerts modified, e.g. expanding/refining the scope of alerts

20 Structural alerts: Performance
Ongoing development of the skin sensitisation endpoint:
[Table: rows Derek 2014*, Derek 2015*, Derek 2018*; columns Derek KB, Acc, Se, Sp, PP, NP, No. of alerts]
Acc = Accuracy, Se = Sensitivity, Sp = Specificity, PP = Positive Predictivity, NP = Negative Predictivity
*Analysis based on an in-house dataset of 1267 sensitisers and 1282 non-sensitisers, based on a conservative combination of results from the LLNA and/or guinea pig assays.
Alert performance has improved over recent years, due to:
- Continued analysis of the available public data
- Extraction of (anonymous) knowledge from proprietary data shared by members, e.g. Bristol-Myers Squibb
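For reference, the five metrics abbreviated on this slide follow the usual confusion-matrix definitions. The sketch below uses those standard definitions; the function name is hypothetical and the counts in the usage line are toy numbers chosen for illustration, not values from the Derek validation.

```python
def alert_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: sensitisers that did / did not fire an alert.
    tn/fp: non-sensitisers that did not / did fire an alert.
    """
    return {
        "Acc": (tp + tn) / (tp + fp + tn + fn),  # accuracy
        "Se": tp / (tp + fn),                    # sensitivity
        "Sp": tn / (tn + fp),                    # specificity
        "PP": tp / (tp + fp),                    # positive predictivity
        "NP": tn / (tn + fn),                    # negative predictivity
    }


# Toy counts only, for illustration.
for name, value in alert_metrics(tp=8, fp=4, tn=6, fn=2).items():
    print(f"{name}: {value:.2f}")
```

Positive and negative predictivity depend on the prevalence of sensitisers in the test set, which is why the negative predictions slide quotes prevalence next to its figures.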

21 Alert example
Alert: Imine or alpha-beta unsaturated imine
The alert was made more specific by:
1. Narrowing the scope to exclude ketimines/tertiary imines
2. but still including alpha,beta-unsaturated imines, which can react through Michael addition
This reduced the number of false positives by 83% when tested against the members' data, and by 86% when tested against public data.
Example structures: 1. No alert fires. 2. Alert fires.

22 GENOTOXICITY

23 Derek KB: What's new?
- 18 Mutagenicity in-vitro alerts modified, e.g. expanding/refining the scope of alerts
- 12 new Mutagenicity in-vitro alerts, 10 of which were built using member donated data
- 6 Chromosome damage alerts modified, 4 of which were modified using member donated data
- Extended 12 Mutagenicity in-vitro alerts to also apply to Mutagenicity in-vivo, using newly publicly available transgenic rodent mutation assay data

24 Alert example
Alert 746: Arylboronic acid or derivative
The alert was made more specific by:
1. Narrowing the scope to exclude aryl boronic acids with bulky para substituents
2. The alert will also no longer fire if there is a fused non-aromatic ring at the para position
This reduced the number of false positives by 27% when tested against members' data, and by 6% when tested against public data.
Example structure: No alert fires.

25 CARCINOGENICITY

26 Derek KB: What's new?
2 new carcinogenicity alerts:
- Aniline or precursor: this alert was validated against 3 public datasets, giving an average (mean) positive predictivity of 91%
- Uracil, thymine or precursor: this alert was validated against 3 public datasets, giving an average (mean) positive predictivity of 100%

27 REPRODUCTIVE TOXICITY

28 Derek KB: What's new?
- 1 new Teratogenicity alert: 17-Hydroxyprogesterone derivative
- 2 new endpoints relating to Teratogenicity: Glucocorticoid receptor agonism and Androgen receptor modulation
- These endpoints, based on molecular initiating events, were designed to provide better coverage for the Teratogenicity model

29 Meteor Nexus 3.0 Update Dr Nicholas Marchetti Product Manager

30 What's new in Meteor Nexus 3.1.0?
New Biotransformation:
- 566 Hydrolysis of Cyclic Peptides
Modified Biotransformations:
- 563 N-Glucuronidation of Amides and Related Compounds
- 41 Conjugation of Hydrazines, Hydrazides and Related Compounds with Pyruvic Acid (Biotransformations 41 and 42 have been merged)
- 43 Conjugation of Hydrazines, Hydrazides and Related Compounds with alpha-Ketoglutaric Acid (Biotransformations 43 and 44 have been merged)
- 245 Oxidative N-Dealkylation
- 371 Epoxidation of 1,1-Disubstituted Haloalkanes

31 Metabolism Dataset
Metabolic data collected from the following journals:
- Drug Metabolism and Disposition
- Xenobiotica
- Biochemical Pharmacology
- Journal of Pharmacology and Experimental Therapeutics
- Chemical Research in Toxicology
- Journal of Medicinal Chemistry
- Journal of Agricultural and Food Chemistry
Totals: 2,608 papers; 18,379 reactions

32 Sarah Nexus 3.0 Update Dr Adrian Fowkes Senior Scientist

33 Sarah Nexus
1. Improvements to predictions
   a) Structure standardisation
   b) Sarah Nexus training set
2. Improvements to interpretability
   a) Additional information for example compounds
   b) Strain profile information
   c) Additional compounds for analysis
3. Improvements to model building
   a) Additional curation options

34 Structure Standardisation Additional structure standardisation rules have been implemented into Sarah Nexus

35 Structure Standardisation
Structure standardisation is beneficial for two main reasons:
1. Appropriate curation of structures ensures that the activity of compounds is accurately reflected during model building. Before any standardisation, 21% of compounds with CAS numbers in the training set have at least 2 structure representations.
2. It ensures that the same prediction is produced however the user draws a query structure.
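A statistic like the 21% figure above could be computed by grouping structure records by CAS number and counting the groups with more than one distinct representation. The sketch below is an assumption about how such a count might be made, not a description of Lhasa's curation pipeline: the function name and record layout are hypothetical, and plain string comparison stands in for real structure comparison.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def multi_representation_fraction(records: Iterable[Tuple[str, str]]) -> float:
    """Fraction of CAS numbers that map to two or more distinct structure strings.

    records: (cas_number, structure_string) pairs, e.g. SMILES as published.
    """
    by_cas = defaultdict(set)
    for cas, structure in records:
        by_cas[cas].add(structure)
    if not by_cas:
        return 0.0
    multi = sum(1 for structures in by_cas.values() if len(structures) >= 2)
    return multi / len(by_cas)


records = [
    # Nitrobenzene drawn with a charge-separated and a pentavalent nitro group.
    ("98-95-3", "c1ccccc1[N+](=O)[O-]"),
    ("98-95-3", "c1ccccc1N(=O)=O"),
    # Benzene, one representation only.
    ("71-43-2", "c1ccccc1"),
]
print(multi_representation_fraction(records))
```

The two nitrobenzene strings describe the same compound, which is the kind of duplication that standardisation rules are meant to collapse before model building.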

36 Sarah Nexus Training Set
[Table: counts per data source; columns Data Source, Conflicted, Negative, Positive, Equivocal, Unreliable]
Data sources:
- Acid Halide Mutagenicity Dataset
- Bursi Mutagenicity Dataset
- CGX Mutagenicity Dataset
- Derek Nexus Example Compounds
- FDA CFSAN Mutagenicity Dataset
- Feng Mutagenicity Dataset
- Hansen Mutagenicity Dataset
- Helma Mutagenicity Dataset
- ISSSTY Mutagenicity Dataset
- Marketed Pharmaceuticals Dataset
- Member Data
- Vitic Nexus NTP Table
- Vitic Nexus Summary Call Table
Totals compared: Sarah Model 2.0 Training Set and Previous Training Set
The training set is larger owing to the amount of data donated by member organisations to improve performance, and to curation of the public literature.

37 Sarah Nexus - Validation
[Bar chart: Sarah Nexus vs proprietary data. Performance metrics (%): BA, Sens, Spec, Cov. Series: Sarah Model, Sarah Model 2.0.]
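The chart's axis labels can be read as balanced accuracy (BA), sensitivity, specificity and coverage. The sketch below shows one common way to compute them when some compounds receive no definite call. It rests on assumptions not stated in the slides: coverage is taken as the fraction of compounds given a positive or negative prediction, the other metrics are computed over covered compounds only, and `None` is a hypothetical stand-in for any prediction that is neither positive nor negative.

```python
from typing import Iterable, Optional, Tuple


def validation_metrics(pairs: Iterable[Tuple[str, Optional[str]]]) -> dict:
    """pairs: (experimental, predicted); predicted is None when no call is made.

    Assumes at least one covered positive and one covered negative compound.
    """
    pairs = list(pairs)
    covered = [(exp, pred) for exp, pred in pairs if pred is not None]
    tp = sum(1 for e, p in covered if e == "positive" and p == "positive")
    fn = sum(1 for e, p in covered if e == "positive" and p == "negative")
    tn = sum(1 for e, p in covered if e == "negative" and p == "negative")
    fp = sum(1 for e, p in covered if e == "negative" and p == "positive")
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "BA": (sens + spec) / 2,           # balanced accuracy
        "Sens": sens,
        "Spec": spec,
        "Cov": len(covered) / len(pairs),  # assumed definition of coverage
    }


results = [
    ("positive", "positive"),
    ("positive", "negative"),
    ("negative", "negative"),
    ("negative", "negative"),
    ("negative", None),
]
print(validation_metrics(results))
```

Balanced accuracy averages sensitivity and specificity, so it is less affected than plain accuracy when a test set has unequal numbers of positive and negative compounds.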

38 Sarah Nexus
1. Improvements to predictions
2. Improvements to interpretability
   a) Additional information for example compounds
   b) Strain profile information
   c) Additional compounds for analysis

39 Additional Information for Compounds Hypotheses Training set examples

40 Additional Information for Compounds
- Toggle between published and standardised structure
- Examine data source and follow up references

41 Strain Profile Information
- Detailed strain data is available for many compounds in the Sarah Nexus training set.
- Sarah Nexus allows users to explore supporting Ames strain profiles for both hypotheses and individual structures.
- Benefits: reduced uncertainty and better decision making.

42 Additional Information for Compounds

43 Addition Of Detailed Strain Data
- Overall strain data for the hypothesis
- Strain data for the individual example

44 Additional Information for Compounds

45 Additional Compounds For Review Compounds whose activity was not resolved for inclusion into the training set are now available for review in the Nexus interface. Compounds in this panel have access to the new features in Nexus. For example, viewing strain profiles and references.

46 Sarah Nexus
1. Improvements to predictions
2. Improvements to interpretability
3. Improvements to model building
   a) Additional curation options

47 Model Building
Sarah Nexus allows for the creation of new models using its SOHN methodology:
- Supplement the Sarah Nexus training set with additional data
- Build models from new data sets
New options have been implemented into the model building workflow to support the model building process.

48 Model Building Add meta data for new compounds which can be viewed in the compound information panel during the review of predictions

49 Model Building Structure standardisation rules developed by Lhasa Limited can be applied to imported datasets

50 Model Building Dataset curation is further supported by options to handle the experimental activities present in the model dataset and the imported dataset
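To make the idea of handling experimental activities across two datasets concrete, here is a hypothetical sketch of merging an imported dataset into a model dataset. It does not reproduce the options offered in Sarah Nexus: the function, the `on_conflict` values and the dictionary layout are all assumptions for illustration. The "conflicted" label is borrowed from the activity categories listed for the training set.

```python
from typing import Dict


def merge_activities(model: Dict[str, str],
                     imported: Dict[str, str],
                     on_conflict: str = "conflicted") -> Dict[str, str]:
    """Merge activity calls keyed by (standardised) structure.

    on_conflict (hypothetical policy names):
      "keep_model"    - the model dataset's call wins
      "keep_imported" - the imported dataset's call wins
      "conflicted"    - the compound is marked as conflicted
    """
    if on_conflict not in {"keep_model", "keep_imported", "conflicted"}:
        raise ValueError(f"unknown policy: {on_conflict}")

    merged = dict(model)
    for structure, activity in imported.items():
        existing = merged.get(structure)
        if existing is None or existing == activity:
            merged[structure] = activity       # new compound or agreeing call
        elif on_conflict == "keep_imported":
            merged[structure] = activity
        elif on_conflict == "conflicted":
            merged[structure] = "conflicted"
        # "keep_model": leave the existing call untouched
    return merged


model_set = {"c1ccccc1": "negative", "CCO": "negative"}
imported_set = {"CCO": "positive", "CC=O": "positive"}
print(merge_activities(model_set, imported_set))
```

Keying both datasets on the same standardised structure string is what makes the comparison meaningful, which is why standardisation of imported data comes before activity handling.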

51 Conclusions
The new features implemented in Sarah Nexus 3.0 further support its use as a statistical system for ICH M7:
- Expansion of the training set to support Sarah Model 2.0
- Improved structure standardisation rules to improve consistent representation of structures and their experimental activity
- Increased interpretability to support expert review: strain information, additional compounds for review, compound meta data

52 Questions? Lhasa Limited Granary Wharf House, 2 Canal Wharf Leeds, LS11 5PS +44(0) info@lhasalimited.org Registered Charity (290866) Company Registration Number