Predictive models in radiotherapy: statistical models vs Machine Learning

1 Predictive models in radiotherapy: statistical models vs Machine Learning. Tiziana Rancati, Programma Prostata, Fondazione IRCCS Istituto Nazionale dei Tumori

2 Modeling: Established ideas → Mathematical extension (model) → Predictions. To overlay known principles on data, allowing hypotheses while still supporting predictions. MECHANISTIC MODELS: mechanistic models are based on an understanding of the behavior of a system's components. EMPIRICAL/STATISTICAL/PHENOMENOLOGICAL MODELS: empirical models are based on direct observation, measurement and extensive data records.

3 MECHANISTIC MODELS. Assumptions: I. radiation → DNA (cluster) damage; II. DNA cluster damage → chromosome aberrations; III. chromosome aberrations → cell death. Adjustable parameters: a. average number of CLs per Gy and per cell; b. fragment (distance-dependent) mis-rejoining, or un-rejoining, probability. [Figure labels: X, γ; p; He ions]

4 REPAIR OF DAMAGE: ROLE OF ATM

5 RADIOSENSITIVITY: ROLE OF ATM (Foray 2016)

6 TEST of pATM nucleo-shuttling + γH2AX (Granzotto 2016). γH2AX and pATM foci are required for a better description of radiosensitivity. The residual γH2AX foci reflect DSB repair deficiency, and the number of early pATM foci reflects DSB recognition.

7 Courtesy of Arjen van der Schaaf, ESTRO Vienna 2014

8 NTCP: DOSIMETRY; concomitant non-oncologic drugs; other treatment-related variables; patient features (phenotype/genotype)

9 Radio-induced toxicity is a multi-factorial problem: DOSE; INTRINSIC RESPONSE OF HEALTHY TISSUES TO CELLULAR DAMAGE

10 NTCP as a STATISTICAL MODEL Courtesy of Arjen van der Schaaf, ESTRO Vienna 2014, modified

11 The resulting multivariable NTCP models are based increasingly on available data and less on existing biological knowledge. Despite their ability to describe the present data well, many of these models may later turn out to be inconsistent with subsequent data.

12 Van der Schaaf et al, IJROBP, editorial, 2015

13 STATISTICAL (NTCP) MODELLING: DOSIMETRIC DESCRIPTOR (which one?); PATIENT DATA ("omics"); Mathematical frame (statistical modelling vs machine learning)

15 DOSIMETRIC DESCRIPTOR (which one?)

16 DOSIMETRIC DESCRIPTOR (which one?) DVH? DVH cutoffs? EUD? DSH? DSH cutoffs? EUD? Dose-map? Spatial descriptors? 3D dose distributions? Whole Organ? Substructures? Planned dose? Accumulated dose?

17 Courtesy of Oscar Acosta

18 Knowledge vs clinical utility!

19 STATISTICAL (NTCP) MODELLING: DOSIMETRIC DESCRIPTOR (which one?); PATIENT DATA ("omics"); Mathematical frame (statistical modelling vs machine learning)

20 PATIENT DATA ("omics"), PRECISION MEDICINE. Omics is the study of particular types of information (such as, for example, genomics), typically on a complete or massive scale (10^6), which requires high-level technologies to be determined. However, genomics is but one sibling in this family of information and associated technologies. The information may be focused on classes of molecules (e.g., proteins in proteomics, metabolites in metabolomics) or systems (e.g., microbes in microbiomics).

21

22

23 MULTI-OMIC MODELING MULTI-OMIC DREAM

24 BIG DATA: HORIZONTAL & VERTICAL. Statistics quality: 1. Reproducibility; 2. Size: BIG DATA vs SMALL DATA; 3. Multicenter studies: data sharing; 4. Multicenter studies: standardization/harmonization

25 STATISTICAL (NTCP) MODELLING: DOSIMETRIC DESCRIPTOR (which one?); PATIENT DATA ("omics"); Mathematical frame (statistical modelling vs machine learning)

26 Mathematical frame (statistical modelling vs machine learning). Statistical Modelling is the formalization of relationships between variables in the form of mathematical equations (Palorini RO 2015). Machine Learning relies on algorithms that can learn from data without relying on rules-based programming.

27 Statistical Modelling: sigmoid-shaped models (semi-phenomenological models). Why a sigmoid-shaped dose-response relationship? Cell survival experiments; first experience with patients.

28 CLASSICAL NTCP MODELS, usually referred to as radiobiological models: all models predict an increase in NTCP when increasing the dose and the irradiated volume; they describe the dose-NTCP relationship through a sigmoid-shaped curve; the dose-volume relationship differs for each organ considered. Parametrization: Position: dose required to obtain a certain level of response (50%: D50). Steepness: normalized dose-response gradient γ, i.e. the increase in response for a 1% increase in dose at the D50 level.
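
As an illustration of the position/steepness parametrization, a minimal sketch follows. It assumes a log-logistic dose-response curve and a generalized EUD as the dose-volume reduction; neither choice is stated on the slide, and the function names and the parameter values (D50 = 70 Gy, γ50 = 2, n = 0.1) are hypothetical.

```python
def geud(doses, volumes, n):
    """Generalized EUD of a differential DVH (assumed dosimetric descriptor).

    doses: bin doses in Gy; volumes: fractional volumes summing to 1;
    n: volume-effect parameter (small n: serial-like organ, n = 1: mean dose).
    """
    return sum(v * d ** (1.0 / n) for d, v in zip(doses, volumes)) ** n


def ntcp_loglogistic(dose, d50, gamma50):
    """Sigmoid NTCP: 0.5 at d50, normalized slope gamma50 at d50."""
    if dose <= 0:
        return 0.0  # no dose, no radio-induced toxicity
    return 1.0 / (1.0 + (d50 / dose) ** (4.0 * gamma50))


# Hypothetical two-bin DVH: half of the organ at 40 Gy, half at 75 Gy.
eud = geud([40.0, 75.0], [0.5, 0.5], n=0.1)
print(f"gEUD = {eud:.1f} Gy, NTCP = {ntcp_loglogistic(eud, 70.0, 2.0):.3f}")
```

With this parametrization a 1% dose increase at D50 raises the predicted response by roughly γ50 percentage points, which is the steepness definition given on the slide.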

29 Modifiers of the dose-response relationship: given the same dose levels, subgroups of patients have greater (or lower) probabilities of toxicity events. Dosimetric factors; clinical factors; genetic factors.

30 Including risk factors into NTCP models: inserting a dose-modifying factor. The idea is that patients harboring clinical/genetic/treatment-related risk factors exhibit their toxicity at lower doses: they have a lower D50 (dose-modifying features). The volume parameter n and the slope parameter are fitted in the whole population, and multiple D50s are added: one for patients without risk factors and one for patients exhibiting each risk factor (Peeters IJROBP 2006).
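
The dose-modifying idea can be sketched as follows. The sketch assumes a probit (LKB-type) curve with one slope parameter m shared by the whole population and a separate D50 per subgroup; the group labels and all numerical values are hypothetical and are not taken from Peeters 2006.

```python
import math


def lkb_ntcp(eud, d50, m):
    """Probit dose-response: Phi((EUD - D50) / (m * D50))."""
    t = (eud - d50) / (m * d50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


# Hypothetical fit: slope shared by everyone, one D50 per subgroup.
M_SHARED = 0.15
D50_BY_GROUP = {
    "without_risk_factor": 80.0,  # Gy
    "with_risk_factor": 70.0,     # Gy, lower D50: toxicity appears at lower doses
}


def ntcp_with_risk_factor(eud, group):
    """NTCP for a patient of the given subgroup at the given EUD (Gy)."""
    return lkb_ntcp(eud, D50_BY_GROUP[group], M_SHARED)


for group in D50_BY_GROUP:
    print(group, round(ntcp_with_risk_factor(65.0, group), 3))
```

At the same EUD the subgroup with the lower D50 gets the higher predicted risk, which is how the risk factor shifts the curve without changing its shape.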

31

32 Including risk factors into NTCP models: using logistic regression. $\mathrm{NTCP}_i = \frac{1}{1 + e^{-\mathrm{logit}(k)_i}}$, with $\mathrm{logit}(k)_i = b_0 + b_1 x_{1,i} + b_2 x_{2,i} + \dots + b_n x_{n,i}$. These values are not forced to be dichotomous YES/NO values. Logistic regression can host multiple continuous variables in a natural way, e.g. biomarker levels or multiple dosimetric metrics.
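
A minimal sketch of how a fitted logistic NTCP model is evaluated for one patient. The coefficients and the choice of features (a mean dose in Gy, a continuous biomarker level, a yes/no clinical factor) are hypothetical and only illustrate that continuous and dichotomous variables enter the same linear predictor.

```python
import math


def ntcp_logistic(intercept, coefs, features):
    """NTCP = 1 / (1 + exp(-logit)), logit = b0 + sum_j b_j * x_j."""
    if len(coefs) != len(features):
        raise ValueError("one coefficient per feature is required")
    logit = intercept + sum(b * x for b, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical coefficients: b0, then mean dose (per Gy), biomarker, yes/no factor.
B0 = -6.0
COEFS = [0.08, 0.5, 0.7]

patient = [50.0, 1.2, 1.0]  # mean dose 50 Gy, biomarker level 1.2, factor present
print(round(ntcp_logistic(B0, COEFS, patient), 3))
```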

33 Multicomponent prediction model for acute dysphagia in lung cancer patients (Dehing-Oberije RO 2010)

34 Machine Learning. Machine learning is the subfield of computer science that gives "computers the ability to learn without being explicitly programmed". It explores the study and construction of algorithms that can learn from and make predictions on data by building a model from sample inputs.

35

36 Some Machine Learning Algorithms: Decision Tree; Support Vector Machine; Artificial Neural Network; Naive Bayes; K-Means; Random Forest; Dimensionality Reduction Algorithms; Gradient Boosting algorithms

37 Machine Learning: 1. Goal: learning from data of all sorts. 2. No rigid pre-assumptions about the problem and data distributions in general. 3. More liberal in the techniques and approaches. 4. Generalization is pursued empirically through training, validation and test datasets. 5. Redundancy in features (variables) is okay, and often helpful. 6. Does not promote data reduction prior to learning; promotes a culture of abundance: the more data, the better. 7. Has been faced with solving more complex problems in learning, reasoning, perception, knowledge representation, ...
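
Point 4 can be sketched as a simple random partition of the cohort. The function name, the 60/20/20 proportions and the fixed seed are assumptions for illustration; in practice the split would usually also be stratified by toxicity outcome, which this sketch does not do.

```python
import random


def train_val_test_split(items, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then cut into disjoint train / validation / test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_frac))
    n_val = int(round(len(shuffled) * val_frac))
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


patient_ids = list(range(100))  # hypothetical cohort of 100 patients
train, val, test = train_val_test_split(patient_ids)
print(len(train), len(val), len(test))
```

The model is fitted on the training set, tuned on the validation set, and the test set is touched only once for the final performance estimate.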

38 Machine Learning: no rigid pre-assumptions about the problem and data distributions in general. Focus on Artificial Neural Networks (ANNs). PROs of ANNs: ability to model extremely complex functions and huge amounts of data; no assumption on the relationship among data. Every relationship is allowed; no assumption is made on the analysed problem. The goal is the best possible discrimination of data. There is no guarantee of obtaining a non-decreasing curve.

39 Selected ANNs (best AUC in training/test/verify sets) for prediction of late fecal incontinence in prostate cancer (Carrara, submitted). Need to develop a method for assessing the reliability of the ANN response over the entire range of possible input variables.

40 Need to look into the black box, to understand which relationship was established between toxicity and dosimetric variables/patient features. In the domain of NTCP modelling we know that there are some assumptions that have to be respected. The minimum set consists of: the probability of toxicity cannot decrease with increasing dose; patients receiving 0 Gy cannot have radio-induced toxicity; a more intensive treatment (e.g. larger RT volumes) cannot result in less toxicity.
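
One way to probe a black-box model against the first two constraints is to sweep the dose input while holding the other inputs fixed. The sketch below is an assumption about how such a check could look: `predict` stands for any fitted model wrapped as a dose-to-probability function, and the tolerance for the 0 Gy prediction is arbitrary. The two toy models are invented for the demonstration.

```python
import math


def check_ntcp_constraints(predict, doses, zero_dose_tol=0.01, eps=1e-9):
    """Probe a model along the dose axis.

    predict: callable mapping a dose in Gy to a predicted toxicity probability.
    Returns which of the two dose constraints hold on the probed grid.
    """
    probs = [predict(d) for d in sorted(doses)]
    non_decreasing = all(b >= a - eps for a, b in zip(probs, probs[1:]))
    zero_dose_ok = predict(0.0) <= zero_dose_tol
    return {"non_decreasing": non_decreasing, "zero_dose_ok": zero_dose_ok}


def well_behaved(dose):
    """Toy sigmoid model: monotone, near zero at 0 Gy."""
    return 1.0 / (1.0 + math.exp(-(dose - 60.0) / 5.0))


def badly_behaved(dose):
    """Toy oscillating model: violates both constraints."""
    return 0.5 + 0.4 * math.sin(dose / 10.0)


dose_grid = [float(d) for d in range(0, 81, 5)]
print(check_ntcp_constraints(well_behaved, dose_grid))
print(check_ntcp_constraints(badly_behaved, dose_grid))
```

The same sweep could be applied to an irradiated-volume input to test the third constraint.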

41 Machine Learning: does not promote data reduction prior to learning; promotes a culture of abundance: the more data, the better. POPULATION SIZE? 10^6? With too few patients the model doesn't generalize well from training data to unseen data. This is known as overfitting, and it's a common problem in machine learning and data science.
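
Overfitting on a small cohort can be illustrated with a toy example. The sketch below assumes synthetic data (one "dose" feature in [0, 1], labels flipped at random with 30% probability) and a 1-nearest-neighbour classifier, chosen only because it memorizes its training set; none of this comes from the slides.

```python
import random


def make_data(n, rng, noise=0.3):
    """Synthetic cohort: label is 1 when dose > 0.5, then flipped with prob. `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Return the label of the closest training point (pure memorization)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def accuracy(train, data):
    return sum(predict_1nn(train, x) == y for x, y in data) / len(data)


rng = random.Random(0)
train_set = make_data(50, rng)    # small training cohort
test_set = make_data(1000, rng)   # unseen patients
train_acc = accuracy(train_set, train_set)
test_acc = accuracy(train_set, test_set)
print(f"training accuracy {train_acc:.2f}, test accuracy {test_acc:.2f}")
```

The classifier reproduces every training label, noise included, so its training accuracy says nothing about how it performs on unseen patients.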

42 Machine Learning: INTERPRETABILITY. [Figure: predictive accuracy vs interpretability for Logistic regression, Decision Tree, Random Forest, ANN] Knowledge vs clinical utility!

43 Machine Learning: USE FOR FEATURE SELECTION. Gene expression profile in blood lymphocytes: GENE EXPRESSION ANALYSIS (gene profiles) → UNSUPERVISED CLUSTERING. UNSUPERVISED: obtained from "unlabeled" data (categorization into toxicity/no toxicity groups was not included in the analysis determining the clustering).
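
A minimal sketch of unsupervised clustering, in which toxicity labels never enter the computation. It uses k-means with a deterministic farthest-first initialization as an assumed example; the slide does not say which clustering algorithm was applied to the gene expression data, and the toy profiles below are invented.

```python
def kmeans(points, k, n_iter=50):
    """Cluster equal-length tuples into k groups; returns one label per point."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Farthest-first initialization: deterministic, no random seed needed.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))

    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Hypothetical two-gene expression profiles for six patients.
profiles = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
cluster_labels = kmeans(profiles, 2)
print(cluster_labels)
```

Only after the clusters are formed would they be compared with the observed toxicity outcome.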

44 De Santis, unpublished. Clusters A, B, C: 208 genes determined a 3-class clustering.

45 Clustering A vs B OR C No difference between clusters B & C

46 We often have overestimated expectations of machine learning results, because finding patterns is hard and often not enough training data is available.

47 Predictive models in radiotherapy: statistical models vs Machine Learning. Multivariable NTCP models are based on available data. In most cases the weak point of the process is related to the DATA (observations from clinical trials) rather than to the capabilities of the statistical tools: dimension of the dataset; rate of toxicity; definition of endpoints; accuracy of reporting of explanatory features; restricted variability of features (e.g. doses in the dataset); choice of the dosimetric descriptor.