

Clinical Evidence and Bioinformatics Characterization of Potential Hepatitis C Virus Resistance Pathways for Sofosbuvir

Eric F. Donaldson, Patrick R. Harrington, Julian J. O'Rear, and Lisa K. Naeger

Sofosbuvir (Sovaldi, SOF) is a nucleotide analog prodrug that targets the hepatitis C virus (HCV) nonstructural protein 5B (NS5B) polymerase and inhibits viral replication. High sustained virological response rates are achieved when SOF is used in combination with ribavirin with or without pegylated interferon in subjects with chronic HCV infection. Potential mechanisms of HCV resistance to SOF and other nucleos(t)ide analog NS5B polymerase inhibitors are not well understood. SOF was the first U.S. Food and Drug Administration (FDA)-approved antiviral drug for which genotypic resistance analyses were based almost entirely on next-generation sequencing (NGS), an emerging technology that lacks a standard data analysis pipeline. The FDA Division of Antiviral Products developed an NGS analysis pipeline and performed independent analyses of NGS data from five SOF clinical trials. Additionally, structural bioinformatics approaches were used to characterize potential resistance-associated substitutions. Using protocols we developed, independent analyses of the NGS data reproduced results that were comparable to those reported by Gilead Sciences, Inc. Low-frequency, treatment-emergent substitutions occurring at conserved NS5B amino acid positions in subjects who experienced virological failure were also noted and further evaluated. The NS5B substitutions, L159F (sometimes in combination with L320F or C316N) and V321A, emerged in 2.2%-4.4% of subjects who failed SOF treatment across clinical trials. Moreover, baseline polymorphisms at position 316 were potentially associated with reduced response rates in HCV genotype 1b subjects. Analyses of these variants modeled in NS5B crystal structures indicated that all four substitutions could feasibly affect SOF anti-HCV activity.
Conclusion: SOF has a high barrier to resistance; however, low-frequency NS5B substitutions associated with treatment failure were identified that may contribute to resistance of this important drug for chronic HCV infection. (HEPATOLOGY 2015;61:56-65)

Abbreviations: DAVP, Division of Antiviral Products; FDA, U.S. Food and Drug Administration; GSI, Gilead Sciences, Inc.; GT, genotype; HCV, hepatitis C virus; LT, liver transplantation; NDA, new drug application; NGS, next-generation nucleotide sequencing; NS5B, nonstructural protein 5B; Peg-IFN-α, pegylated interferon-alpha; PVD75, probabilistic variant detection reduced from a default value of 90 to 75 to increase the number of variant calls; QbVD, quality-based variant detection; RBV, ribavirin; SOF, sofosbuvir; SVR, sustained virological response; SVR12, sustained virological response at 12 weeks after end of treatment; VF, virological failure; WT, wild type.

From the Division of Antiviral Products, U.S. Food and Drug Administration, Silver Spring, MD.

Received May 12, 2014; accepted August 11,

HEPATOLOGY, Vol. 61, No. 1, 2015 DONALDSON ET AL.

Sofosbuvir (SOF) is a nucleotide analog prodrug, recently approved by the U.S. Food and Drug Administration (FDA), that targets the hepatitis C virus (HCV) nonstructural protein 5B (NS5B) polymerase and inhibits HCV replication. When used in combination with ribavirin (RBV) with or without pegylated interferon-alpha (Peg-IFN-α), SOF treatment resulted in high sustained virological response (SVR) rates 12 weeks after end of treatment (SVR12) in subjects with chronic HCV genotype 1, 2, 3, or 4 infection. [1,2] The approval of SOF was based on data obtained in four pivotal phase III clinical trials: P (FISSION), GS-US (POSITRON), GS-US (FUSION), and GS-US (NEUTRINO), for which drug resistance analyses were performed. In addition, resistance data were assessed from a phase IIb liver pretransplant study, P , which evaluated the efficacy of SOF/RBV in a subgroup of subjects with hepatocellular carcinoma meeting the Milan criteria before receiving a liver transplant (LT) to prevent HCV recurrence.

The FDA Division of Antiviral Products (DAVP) conducts independent analyses of resistance data in review of new drug applications (NDAs) for antiviral drugs. Historically, genotypic resistance data from pivotal trials of antiviral drugs have been generated using Sanger population sequencing, for which data analysis is relatively standardized and straightforward. The sponsor of the SOF NDA, Gilead Sciences, Inc. (GSI), utilized next-generation sequencing (NGS) for genotypic resistance analyses and included raw NGS data as requested in the SOF NDA, representing the first time NGS was used for genotypic resistance analyses of pivotal trials to support an antiviral drug NDA.

In contrast to Sanger population sequencing, NGS is an emerging technology that presents many potential data analysis and data integrity issues that must be considered when conducting a regulatory review. Multiple NGS platforms are currently available (e.g., 454, Illumina, Ion Torrent, and PacBio), and these technologies continue to evolve even as newer technologies emerge. Each platform employs different sequencing chemistries that contribute to differences in assay performance, including read lengths achieved and base calling accuracy. Furthermore, there are currently no standardized analysis pipelines for analyzing NGS data, and more than 200 algorithms have been used to assemble small reads, with each algorithm employing unique filtering, trimming strategies, and alignment parameters. Finally, many NGS analysis pipelines use proprietary scripts and programs that are not available in the public domain. All of these aspects of NGS analysis present new challenges in the regulatory review of antiviral drugs. In this report, we describe the methods used by DAVP to process and analyze NGS data submitted as part of the SOF resistance analysis data set.
We present the results of our independent analyses that identified treatment-emergent substitutions or polymorphisms in NS5B, including L159F, C316N, and V321A, that were associated with virological failure (VF) in some SOF-treated subjects. The potential for these substitutions to contribute to resistance was further supported by structural bioinformatics analysis of these substitutions in the context of crystal structures of the NS5B polymerase. The analyses described in this report represent an important milestone in the regulatory review of antiviral drugs as the use of powerful NGS and other bioinformatics technologies for clinical resistance analyses of antiviral drugs becomes more common.

Materials and Methods

NGS Data Analysis Pipeline. CLC Genomics Workbench (Schrödinger, Germany) was used to evaluate each sequence run, trim and filter reads before mapping, and map reads to the appropriate HCV reference sequence. Two independent variant detection algorithms were used to call variants from each mapping, and variant tables were exported from the CLC Genomics Workbench and combined in Excel to generate frequency tables and resistance summary tables.

NGS Analysis Parameters and Overview of Data Analysis. Fastq sequence files containing nucleotides and quality scores for all bases sequenced by the Illumina platform for each subject and time point were provided by portable hard drive. The sequences were uploaded with the CLC Genomics interface, using the Illumina-specific criteria. Failed reads were removed, read names were discarded, and quality scores were calculated using the National Center for Biotechnology Information/Sanger (Illumina Pipeline 1.8) option. The fastq files were segregated by genotype (GT) and subtype, and the NS5B genes for HCV GT1a (H77: gi ), GT1b (Con1: gi ), GT2a (JFH1: gi ), GT2b (HC-J8CF: gi ), and GT3a (S52: gi ) were imported and annotated as coding sequences for use as reference sequences.
The individual sequence reads from each fastq file were trimmed using the default parameters for CLC Genomics Workbench. The sequence reads from each fastq file were aligned to the appropriate reference sequence to generate an NS5B consensus sequence and mapping for each time point, and changes at the amino acid level were analyzed for each sample. The mappings were assessed to determine the depth and uniformity of coverage at each nucleotide position and to evaluate read directionality (ratio of forward to reverse reads) to identify regions of bias.

Address reprint requests to: Eric F. Donaldson, Ph.D., Division of Antiviral Products, U.S. Food and Drug Administration, New Hampshire Avenue, WO22, Office 6336, Silver Spring, MD Eric.Donaldson@fda.hhs.gov; fax:

Published This article is a U.S. Government work and is in the public domain in the USA. View this article online at wileyonlinelibrary.com. DOI /hep

Potential conflict of interest: Nothing to report.
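The mapping checks described in the methods (depth and uniformity of coverage, and the ratio of forward to reverse reads) can be illustrated with a minimal Python sketch. This is not the CLC Genomics Workbench implementation. The function names and the `min_depth` and `max_ratio` thresholds are hypothetical values chosen for illustration only; the report does not state the cutoffs that were used.

```python
from statistics import mean


def coverage_summary(depths, min_depth=1000):
    """Summarize per-position read depth for one mapping.

    depths: list of read depths, one per nucleotide position (1-based order).
    min_depth: hypothetical cutoff below which a position is reported as low.
    """
    low = [pos for pos, d in enumerate(depths, start=1) if d < min_depth]
    return {
        "mean": mean(depths),
        "min": min(depths),
        "max": max(depths),
        "low_positions": low,
    }


def strand_ratio(forward, reverse):
    """Ratio of forward to reverse reads; None when there are no reverse reads."""
    return forward / reverse if reverse else None


def biased_positions(strand_counts, max_ratio=4.0):
    """Return positions whose forward/reverse ratio is outside [1/max_ratio, max_ratio].

    strand_counts: dict mapping position -> (forward_reads, reverse_reads).
    max_ratio: hypothetical tolerance for strand imbalance.
    """
    flagged = []
    for pos, (fwd, rev) in sorted(strand_counts.items()):
        ratio = strand_ratio(fwd, rev)
        if ratio is None or ratio > max_ratio or ratio < 1 / max_ratio:
            flagged.append(pos)
    return flagged
```

Calling `coverage_summary` on the depth vector of a mapping and `biased_positions` on its per-position strand counts gives a quick list of positions that may deserve manual inspection before variants are interpreted.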

DONALDSON ET AL. HEPATOLOGY, January 2015

NGS Analysis Pipeline Output. Two algorithms were used to call variants from each mapping based on independent criteria, and variant tables were generated for each sequence run and variant detection method. The two independent variant detection systems were:

1. Probabilistic variant detection (PVD75), which uses a probabilistic model (combining a Bayesian model and a maximum likelihood approach to calculate prior and error probabilities). Parameters are calculated on the mapped reads without considering the reference sequence. The variant probability parameter was reduced from a default value of 90 to 75 to increase the number of variant calls.
2. Quality-based variant detection (QbVD), which is based on the neighborhood quality standard algorithm and uses a combination of quality filters and user-specified thresholds for coverage and frequency to call variants covered by aligned reads.

Frequency tables were generated by exporting variant tables for both variant detection methods for each mapping and reformatting the data to reflect variation at the amino acid level. The variant tables were combined by genotype/subtype and study, filtered to remove synonymous substitutions, and reformatted to be directly comparable to the frequency tables submitted by GSI. Three filtering thresholds were tested to identify emergent substitutions from the frequency tables, and the SUBS1012 threshold was used for the analysis. SUBS1012 identified substitutions that were absent at baseline (<0.01 frequency), but present at a frequency of 0.10 or greater at later time points. SUBS1012 was used to perform resistance analysis because it identified treatment-emergent substitutions that were not detected at baseline.

NGS Data Comparison. Amino acid substitutions identified by the three algorithms (the GSI algorithm and the QbVD and PVD75 methods used by us) were compared, and major differences were noted.

Structural Bioinformatics Modeling of HCV NS5B Amino Acids.
Structural comparisons were generated using PyMol (Schrödinger, LLC, New York, NY) to model the X-ray crystal structures of NS5B for GT1a with the C316 residue and GT1b with the N316 residue using structures with the following Protein Data Bank accession numbers: 3HKW (GT1a-H77) and 3HHK (GT1b-Con1). [3] The two structures were loaded into PyMol and superimposed upon one another to show the positions of residues that were near the NS5B active site. The HCV GT2a (strain JFH-1) NS5B polymerase structure was used to assess interactions with primer-template RNA. [4]

Results

Comparison of NGS Analysis Approaches and Results. GSI used an NGS analysis pipeline that included internally developed software to process and align NGS sequencing data with a multistep method that included PyroMap, [5] and this analysis approach has been described elsewhere. [6] We conducted an independent assessment of the raw NGS data. The primary difference from GSI's analysis approach was that we used an HCV NS5B reference sequence specific to the HCV genotype and subtype with which each subject was infected to compare all samples (baseline and time of failure) for a given subject. Second, two different variant detection methods were used to identify potential resistance-associated substitutions, compared with one used by GSI. Despite the different analysis approaches, the overall results from the different analysis pipelines were similar (Fig. 1).

Overview of Resistance Assessments. Of the 982 subjects treated with SOF/RBV or SOF/Peg-IFN/RBV in phase III studies, 224 failed treatment, and a total of 676 NGS files from these subjects were analyzed (Table 1). In addition, 14 samples from 5 on-treatment failures in study P were evaluated by NGS. In total, 690 fastq files were analyzed (Table 1). Resistance analyses were conducted to determine whether novel treatment-emergent amino acid substitutions in NS5B (predominant or low frequency) were associated with treatment failure in the phase III SOF trials.
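The SUBS1012 rule described in Materials and Methods (absent at baseline at a frequency below 0.01, present at 0.10 or greater at a later time point) can be written as a short filter. The sketch below is illustrative only and is not the Excel workflow used for the analysis. The function name, the dictionary layout, and the example frequencies in the usage note are assumptions.

```python
BASELINE_MAX = 0.01  # "absent at baseline" means a frequency below 0.01
EMERGENT_MIN = 0.10  # must reach a frequency of 0.10 or greater later


def emergent_substitutions(baseline, later):
    """Return substitutions that satisfy the SUBS1012-style rule.

    baseline, later: dicts mapping a substitution name (e.g. "L159F")
    to its frequency (0 to 1) at baseline and at a later time point.
    A substitution missing from the baseline table is treated as frequency 0.
    """
    emergent = []
    for substitution, freq in later.items():
        absent_at_baseline = baseline.get(substitution, 0.0) < BASELINE_MAX
        if absent_at_baseline and freq >= EMERGENT_MIN:
            emergent.append(substitution)
    return sorted(emergent)
```

With made-up frequencies such as `baseline = {"K250R": 0.45}` and `later = {"K250R": 0.50, "L159F": 0.12, "V321A": 0.05}`, only L159F passes: K250R is already present at baseline, and V321A does not reach 0.10.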
Potential substitutions of interest were identified for further characterization based on the following criteria:

1. Substitutions occurring in multiple subjects at sites that are highly conserved (greater than 95% identity among isolates)
2. Substitutions that occur in a single direction at polymorphic sites (i.e., L159F without observations of F159L)
3. Novel substitutions that are rarely or never observed at a polymorphic or conserved position
4. Polymorphisms that are present at baseline and associated with treatment failure

Fig. 1. Comparison of different analysis pipeline results from treatment-emergent analysis of P . Treatment-emergent amino acid substitutions detected by GSI (black bars) or by two detection methods employed by DAVP, including PVD75 (gray bars) and QbVD (white bars), generally showed good overlap, indicating that the analysis approaches reproduced comparable results. Positions where GSI did not detect the substitutions were polymorphic sites that were present at baseline and time of failure. Position 341 was determined to be a polymerase chain reaction artifact.

Table 1. Number of Subjects and NGS Samples Analyzed by DAVP for Five Clinical Trials
Clinical Trial, Phase, HCV GT, No. of Subjects, No. of NGS Files
P (FISSION) III 1a 1 1b 1 2a 2 3a 74 Total
GS-US (POSITRON) III 2a 2 2b 3 2j 1 3a 35 Total
GS-US (FUSION) III 1a 2 1b 2 2a 2 2b 5 3a 65 3b 1 Total
GS-US (NEUTRINO) III 1a 18 1b Total
Phase III totals
P (Liver Pre-LT)* IIb 1a 3 1b 1 2b 1 Total 5 14
Phase IIb totals 5 14
Overall totals
*Only subjects experiencing virological breakthrough or partial responders were analyzed by NGS.

Treatment-Emergent Resistance Analysis of Four Pivotal Phase III Clinical Trials. Based on their genotypic and phenotypic analyses, GSI concluded that there was no evidence of SOF resistance among subjects who failed to achieve SVR12 in the pivotal phase III trials. Their genotypic analyses focused primarily on whether HCV populations in these subjects carried the NS5B S282T substitution, a prototypical nucleos(t)ide analog NS5B polymerase inhibitor resistance-associated substitution previously shown to affect SOF anti-HCV activity in cell culture. [7] Although S282T was previously detected in an HCV genotype 2b subject who failed 12 weeks of SOF monotherapy in a phase II trial, it was not detected at baseline or subsequent to treatment failure in any VF subjects analyzed in the phase III trials.
GSI reported 95 other NS5B substitutions that each emerged in >2 subjects, but concluded that these substitutions were not associated with resistance, because they occurred at polymorphic sites (defined as <99% conserved) or represented changes toward the wild-type (WT) reference sequence, and were not associated with a >2-fold decrease in susceptibility to SOF or RBV in cell-culture phenotype assays. Based on the DAVP resistance criteria, there were no clear treatment-emergent substitution patterns in subjects with GT1 and GT2 HCV, which may have been partly the result of the low number of GT1 and GT2 VF subjects for analysis. For HCV GT3a, 76 treatment-emergent substitutions were detected in 2 or more subjects; however, eliminating substitutions that occurred at clearly polymorphic sites narrowed the list to 16 treatment-emergent substitutions (Table 2). Treatment-emergent substitutions L159F (n = 6) and V321A (n = 5) were detected in postbaseline samples from 11 GT3a-infected subjects. According to results submitted by GSI, no detectable reduction in phenotypic susceptibility to SOF was observed for subject isolates with L159F or V321A substitutions. Nevertheless, L159F [8] and V321I [9] have been reported to be associated with resistance to nucleos(t)ide analog NS5B polymerase inhibitors.

Table 2. Treatment-Emergent Substitutions Detected by NGS That Arose in Subjects Infected with HCV GT3
Subjects, POS, No. Subj, SUBS10(1/2), Ratio, Cons (%), Sub, Baseline, Conclusion
GT3a subjects from all phase III trials
V116I/I116V/T116I I/V Equal BL vs. Fail Not likely
T542A/A542T A Equal BL vs. Fail Not likely
R517K/K517R K/R Equal BL vs. Fail Not likely
K535N/N535K K/N/R Equal BL vs. Fail Not likely
11 6 V11I I/V Equal BL vs. Fail Not likely
L159F 100 Not detected at BL RAS
K250R/R250K K/R Equal BL vs. Fail Not likely
K114R/R114K K/R Equal BL vs. Fail Not likely
V321A 100 Not detected at BL RAS
K106R/R106K Equal BL vs. Fail Not likely
G113S/S113G S/G Equal BL vs. Fail Not likely
A156P/P156S P Equal BL vs. Fail Not likely
R520K/V520I I/T Equal BL vs. Fail Not likely
M147V M/V Equal BL vs. Fail Not likely
V438I I Equal BL vs. Fail Not likely
R519K K/R Equal BL vs. Fail Not likely
POS: NS5B position. No. Subj: number of subjects that had HCV GT3a with the substitution. SUBS10(1/2): substitutions detected, if present, at 0.10 or greater frequency at failure time point, but not detected at baseline. Ratio: ratio of different substitutions at a position in the order of most to least. Cons: percent conservation reported by GSI based on occurrences in their internal database. Sub: substitutions known to occur at that position. Baseline: comparison of subjects who had the substitution at baseline. Equal BL vs. Fail: present at baseline at equivalent frequencies for subjects that achieved SVR12 and those who failed. RAS: resistance-associated substitution as determined using DAVP criteria (see Results section).

Resistance Analysis of Liver Pretransplant Study P . GSI submitted baseline Sanger population sequencing data for all 61 subjects in this trial, and samples near the time of failure for 16 treatment failures, including 3 subjects who experienced virological breakthrough on-treatment, 2 subjects who experienced a slow, partial virological response on-treatment, and 11 subjects who experienced post-treatment relapse. In addition, NGS data were provided for the 5 on-treatment failures.
Analysis of the resistance data indicated that there were baseline polymorphisms and treatment-emergent substitutions potentially associated with SOF treatment failure (Table 3). An N316 polymorphism was detected at baseline in isolates from 6 subjects, 4 of whom also had an F159 polymorphism detected at baseline. All 6 of these subjects failed treatment. In contrast, these polymorphisms were not detected in subjects who achieved SVR12 (n = 44). In addition, L159F emerged in 1 VF subject, as detected by Sanger population sequencing (Table 3).

Of the 5 on-treatment failures, 1 subject infected with HCV GT1b (subject A) had F159 and N316 polymorphisms detected at baseline, and these substitutions were still detected at follow-up (Table 4). This subject experienced virological breakthrough at week 8. The other 2 subjects who experienced virological breakthrough (subject B, GT2b, at week 12 and subject D, GT1a, at week 16) had treatment-emergent L159F detected in a subset of sequences (2.1% and 9.5%, respectively). The L159F substitution was also detected by Sanger population sequencing in subject D. Interestingly, in subject D, there was evidence of a 9-fold reduction in the frequency of this substitution by NGS, and it was no longer detected by Sanger sequencing, between 1 and 2 weeks after treatment ceased and LT (Table 4). This observation indicates that the relative abundance of viral populations carrying L159F can decline rapidly in the absence of drug pressure. HCV from one of the partial responders (subject C) had emergent S282R and L320F at week 8, before stopping treatment at week 12. Of note, mutations encoding S282R and L320F substitutions were not present on the same sequence reads, indicating that these substitutions were on distinct HCV genomes in the quasispecies. The other partial responder did not have detectable substitutions at previously identified treatment-emergent sites.
In total, the assessment of NGS data from all trials identified three substitutions of interest, including L159F (n = 12), V321A (n = 5), and S282R (n = 1; Tables 2-4). In addition, N316 was identified as a baseline polymorphism (n = 6; 4 also had L159F) that potentially reduced the efficacy of SOF in subjects infected with HCV GT1b (Table 4). To determine how these substitutions might affect SOF activity, these amino acid positions were studied on the crystal structure of HCV NS5B and analyzed in the context of the NS5B polymerase active site.

L159 Interacts With S282, Which Is a Known SOF Resistance-Associated Position. We displayed the SOF resistance-associated substitutions on published X-ray crystal structures of the NS5B polymerase [3,4] to determine whether these amino acid replacements could affect the catalytic activity of the enzyme. Structural analysis of S282 and L159 revealed that these amino acids are within 4 angstroms (Å) of one another and could potentially interact (Fig. 2A,B).

Table 3. Baseline and Treatment Emergent Substitutions From Study P Based on Sanger Population Sequencing
GT, POS, No. Subj, SSUBS, Ratio, Cons (%), Sub, Baseline, Conclusion
1a Baseline 77 2 K77R/K77T K inconclusive Not likely
S130N N/S inconclusive Not likely
K212R K/R inconclusive Not likely
T213S N/T inconclusive Not likely
S231N/S231A N/S inconclusive Not likely
T235M inconclusive Not likely
Q330P/Q330Q/P inconclusive Not likely
P461L L/P inconclusive Not likely
R517R/K/R517K K/R inconclusive Not likely
I520F/I520T I/T inconclusive Not likely
Emergent L159F BL in TF RAS
1b Baseline 5 2 T5S/T5T/S BL in 1 TF Not likely
90 3 K90K/M/K90M/K90A inconclusive Not likely
T130N/S/T130N S/T inconclusive Not likely
D135N inconclusive Not likely
L159F F/L BL in TF RAP
T181T/S/T181K/T181N inconclusive Not likely
S190A A/S inconclusive Not likely
N206N/D/N206N/K K/N BL in 1 TF Not likely
K212R/K212K/R K/R inconclusive Not likely
V235T inconclusive Not likely
C316N C/N BL in TF RAP
E357A inconclusive Not likely
Q461P/Q461L inconclusive Not likely
S506A inconclusive Not likely
R517K inconclusive Not likely
Emergent 77 1 T77T/A BL in 1 TF Not likely
E440G/E440E/G inconclusive Not likely
2b Baseline 49 2 A49M inconclusive Not likely
E330G inconclusive Not likely
Emergent D213N inconclusive Not likely
GT: HCV genotype and subtype. POS: NS5B position. No. Subj: number of subjects that had the substitution. SSUBS: substitutions detected by Sanger population sequencing. Ratio: ratio of different substitutions at a position in the order of most to least. Cons: percent conservation reported by GSI based on occurrences in their internal database. Sub: substitutions known to occur at that position. Baseline: comparison of subjects who had the substitution at baseline. BL in TF: only observed at baseline in subjects who failed treatment. BL in 1 TF: detected at baseline in 1 subject who failed treatment. Inconclusive: detected at baseline at equivalent frequencies for subjects that achieved SVR12, subjects who failed, and subjects who dropped out or for whom results were not yet available. RAS: resistance-associated substitution as determined using DAVP criteria (see Results section). RAP: resistance-associated polymorphism present at baseline in subjects who failed treatment.

C316 and V321 Interact With the Catalytic Triad. Structural analysis of positions C316 and V321 showed that these amino acids are in close proximity to the catalytic triad of the HCV NS5B polymerase (D220, D318, and D319). V321 is positioned on the same beta sheet as D319, and C316 is proximal to, and interacts with, D319 and D318. Given their close proximity to the catalytic triad, changes at these amino acid positions are predicted to alter the conformation of the active site (Fig. 2B).

Potential Impact of N316 on Efficacy of SOF in Subjects Infected With HCV GT1b. In the SOF phase III trials, subjects infected with HCV GT1a exhibited an overall SVR12 rate of 92% (206 of 225) and HCV GT1b-infected subjects had an overall SVR12 rate of 82% (54 of 66). However, in previous direct-acting antiviral drug development programs, it has been commonly observed that HCV GT1b-infected subjects have better response rates than HCV GT1a-infected subjects. To determine whether viral genetic factors contributed to overall response rates, structural analyses were performed to determine whether amino acid differences between these subtypes in the NS5B polymerase protein sequence correlated with the difference in efficacy. A structural comparison was made by modeling the X-ray crystal structures of NS5B for GT1a with the C316 residue and GT1b with the N316 residue (Fig. 2C,D). [3] The catalytic triad residues (D220, D318, and D319) were highlighted, and all residues within 8 Å of the catalytic triad were analyzed.
Only one difference was noted between these genotypes within 8 Å of the catalytic site and that was at position 316, where C316 was present in the GT1a structure and N316 was present in the GT1b structure (Fig. 2C,D). Importantly, C316 is highly conserved in GT1a (99.89%), but polymorphic in GT1b (81.83%) based on frequencies provided by GSI.
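The distance criteria used in the structural analysis (residues within 8 Å of the catalytic triad, and S282 and L159 lying within 4 Å of one another) amount to a neighbor search over atomic coordinates. The sketch below shows that idea in plain Python under stated assumptions: it is not the PyMol procedure used for the report, it uses one representative coordinate per residue rather than all atoms, it treats the cutoff as inclusive, and the function name and the coordinates in the usage note are invented for illustration rather than taken from 3HKW or 3HHK.

```python
import math


def residues_within(atoms, targets, cutoff):
    """Return labels of residues lying within `cutoff` angstroms of any target.

    atoms: list of (residue_label, (x, y, z)) tuples, one point per residue.
    targets: set of residue labels that define the site (e.g. the catalytic triad).
    cutoff: distance threshold in angstroms (treated as inclusive here).
    """
    target_points = [xyz for label, xyz in atoms if label in targets]
    near = set()
    for label, xyz in atoms:
        if label in targets:
            continue  # do not report the site residues themselves
        if any(math.dist(xyz, t) <= cutoff for t in target_points):
            near.add(label)
    return sorted(near)
```

For example, with placeholder points `[("D318", (0, 0, 0)), ("D319", (3, 0, 0)), ("C316", (0, 4, 0)), ("X999", (0, 20, 0))]` and targets `{"D318", "D319"}`, a cutoff of 8.0 returns only C316, because the hypothetical residue X999 is 20 Å away. Running the same search on two superimposed structures and comparing the residue identities at each returned position is one way to look for subtype differences near the active site.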

Of note, N316 found in the HCV GT1b polymerase structure is a larger, more bulky amino acid with two additional functional groups in its side chain. The larger amino acid is predicted to interfere with the ability of SOF to enter the active site by blocking the space required to accommodate the additional 2'-Me and 2'-F groups of SOF (Fig. 2C,D).

Fig. 2. The enzymatic pocket of the HCV NS5B polymerase. (A) Cartoon view of chain A of the HCV GT2a NS5B polymerase showing the rear view of the right-handed notation, colored by palm (red), fingers (green), and thumb (blue) domains with the catalytic triad residues (yellow), primer-template RNA (magenta and beige), and substitutions of interest (cyan and orange). (B) Zoomed-in look at (A) showing the active site and key residues. The RNA primer-template is shown to highlight that it does not interact with S282. A distance of 4 Å (dashed black line) occurs between S282 and L159. (C) The catalytic triad (yellow) with C316 present (cyan spheres). (D) The catalytic triad with N316 superimposed (orange spheres) on C316. Orange spheres indicate that N316 is more proximal to D318 and D319 than C316.

Table 4. Baseline and Treatment Emergent Amino Acid Substitutions from Study P On-Treatment VFs, Detected by NGS
SUBJ, GT, RESPONSE, WK, VISIT, SUBST, TCOV, VCOV, AAFREQ
Baseline
A 1b VBT W8 BL L159F
FU4 L159F
BL C316N
FU4 C316N
Emergent
B 2b VBT W12 W12 L159F
C 1a Partial W12 W8 S282R
a Partial W12 W8 L320F
D 1a VBT W16 PTXW1 L159F
a VBT W16 PTXW2 L159F
Shown are amino acid substitutions/polymorphisms at key NS5B positions of interest (159, 282, 316, and 320). Four of the five on-treatment failures had substitutions at these positions. Baseline: amino acids detected at baseline and at time of failure. Emergent: amino acid substitutions detected at time closest to failure, but not detected at baseline. SUBJ: random identifiers used to distinguish between subjects. GT: HCV genotype and subtype. RESPONSE: VBT, viral breakthrough; Partial, partial response that led to failure. WK: time point closest to time of failure. VISIT: BL, baseline; FU, post-treatment follow-up week; PTXW, post-transplant week. SUBST: the position and substitutions of interest. TCOV: total coverage at the codon. VCOV: number of NGS reads with the codon that encodes the substitution. AAFREQ: frequency of the substitution.
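The Table 4 columns suggest a simple relationship, assumed here rather than stated in the report: AAFREQ is the variant coverage (VCOV) divided by the total coverage at the codon (TCOV). The sketch below also expresses a change between two visits as a fold reduction, as was done for L159F in subject D. The function names are hypothetical, and the read counts in the usage note are invented, because the numeric cells of Table 4 did not survive in this transcription.

```python
def aa_frequency(vcov, tcov):
    """Frequency of a substitution: reads encoding it divided by total reads at the codon."""
    if tcov <= 0:
        raise ValueError("total coverage must be positive")
    return vcov / tcov


def fold_reduction(freq_before, freq_after):
    """How many times lower the later frequency is than the earlier one."""
    if freq_after <= 0:
        raise ValueError("later frequency must be positive to compute a fold change")
    return freq_before / freq_after
```

As an illustration with made-up counts, 950 variant reads out of 10,000 total give a frequency of 0.095 (9.5%), and a drop from a frequency of 0.09 to 0.01 between two visits corresponds to a 9-fold reduction.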

Among subjects in Study GS-US (NEUTRINO) infected with HCV GT1b, no subjects developed substitutions at position C316 over the course of treatment. However, 6 GT1b subjects had N316 (n = 4) or H316 (n = 2) at baseline. Of the 4 subjects with baseline N316, 2 achieved SVR12, 1 experienced viral relapse, and 1 was lost to follow-up. Of the 2 subjects with H316, 1 achieved SVR12 and 1 discontinued at week 2 of treatment. These small numbers of subjects and the SVR rate of the background regimen made it difficult to determine whether this substitution had an effect on efficacy in NEUTRINO. Among GT1b subjects with C316 in NS5B at baseline, the SVR12 rate was 85% (51 of 60). An examination of each of the GT1b subjects with N316 or H316 showed that some of the subjects who experienced VF also had confounding baseline factors, such as non-CC IL28B genotypes, high viral loads, and cirrhosis, that could have contributed to failure. In total, C316N/H/F was associated with SOF failure in 7 subjects infected with HCV GT1b in studies P and GS-US . In addition, C316F emerged in 1 subject infected with HCV GT1a who experienced relapse.

Discussion

SOF was the first antiviral drug approved by the FDA with resistance data from pivotal trials generated by NGS. Given the emerging nature of this technology, and that there are currently no standardized NGS analysis pipelines, DAVP felt it was necessary to independently assess these data. More than 200 algorithms are available to assemble short sequence reads generated by NGS, and results can vary depending upon the algorithm used. [10] This is particularly problematic when the goal is to identify resistance pathways that may be infrequent, but potentially important for public health.
In the future, we anticipate the development of a standardized NGS analysis pipeline that will provide a reproducible data analysis route sufficient for generating consistent and robust results from NGS data. For this report, we developed an NGS analysis pipeline and generated NGS summary data in the same format submitted by GSI. For calling variants at the nucleotide level, we employed two variant detection methods, PVD75 and QbVD, which allowed for direct comparison to the variants detected by GSI and provided a second algorithm for cases where there was disagreement with a variant (or its frequency). Direct comparison of the NGS data showed that the two approaches generated comparable results (Fig. 1). However, the analysis pipelines used very different default filtering and mapping criteria, and this was apparent when comparing frequency table values, such as total coverage, variant coverage, and amino acid frequency. Despite these differences, the general trends observed were very similar, although interpretation of the results differed.

Of the 240 subjects who failed treatment with SOF in the clinical trials analyzed, the majority experienced relapse and had no clear resistance-associated substitutions. However, low-frequency, treatment-emergent substitutions that occurred at conserved amino acid positions in the virus of subjects who failed treatment were detected by both NGS analysis approaches. In general, GSI concluded that no clinically relevant treatment-emergent substitutions were identified in these trials, a conclusion that was based, in large part, on the lack of a clear reduction in susceptibility to SOF (>2-fold) in cell-culture phenotypic assays. However, our interpretation of phenotypic data is that a cell-culture-based phenotypic assay cannot rule out emergent substitutions that are associated with resistance in clinical trials, because phenotype assays do not always reliably predict clinical resistance.
For example, clinically relevant shifts in phenotypic susceptibility can be too small to detect in cell-culture systems, a substitution may play a role in replicative fitness or have an effect on drug activity only when present in combination with other substitutions,14 and population phenotype assays often cannot detect less-fit, drug-resistant viral populations that are present in a mixture with WT viral populations.15

Assessment of the NGS data from all trials identified three substitutions of interest: L159F (n = 12), V321A (n = 5), and S282R (n = 1; Tables 2 and 4). In addition, N316 was identified as a baseline polymorphism that potentially reduced the efficacy of SOF in subjects infected with HCV GT1b (n = 6; 4 also had L159F; Table 4; Fig. 2). Structural bioinformatics was used to study these substitutions in the context of the NS5B active site and make predictions about how these substitutions might reduce SOF activity.

Substitutions at L159F and V321A Emerged After Treatment With SOF. L159F emerged in 6 subjects infected with HCV GT3a; was present at baseline in 4 subjects infected with HCV GT1b, all of whom failed in the LT study; and emerged in 2 subjects (GT1a and GT2b) who experienced breakthrough in that trial. Structural analysis showed that L159 is present in the catalytic pocket of the NS5B polymerase, where it potentially interacts with S282 (Fig. 2b), but is too far
away (>7 Å) to interact with the RNA primer template.4 Given that no cocrystal structure with SOF binding to an HCV NS5B polymerase is publicly available, it was not possible to determine how these substitutions affect SOF binding. However, given that SOF inhibition is affected by S282T and that L159 potentially interacts with S282, substitutions such as L159F presumably alter amino acid side-chain interactions, which in turn alter the position of the S282 side chain, leading to reduced inhibition by SOF. Additional evidence that L159F has an effect on SOF activity was shown in a recent study.8 This study reported that L159F emerged in combination with L320F in a subject who failed treatment with the nucleoside analog NS5B polymerase inhibitor, mericitabine, and this combination of substitutions affected both mericitabine and SOF anti-HCV activity in cell culture.8 Of note, in 1 subject (subject D), the NGS data showed that viral populations carrying L159F were rapidly displaced by L159-expressing viruses (Table 3), indicating that some HCV viruses bearing L159F are less fit. Importantly, L159F was not detected in this subject by Sanger population sequencing at the later time point. This observation not only demonstrates the increased sensitivity that NGS provides over traditional Sanger sequencing, but also indicates that posttreatment resistance analyses may underestimate the number of subjects with treatment-emergent substitutions associated with VF.

V321A emerged in 5 subjects infected with HCV GT3a, and a V321I substitution has been shown to affect the activity of other nucleotide analog NS5B inhibitors.9 Moreover, position 321 is proximal to the catalytic triad of the NS5B polymerase (D220, D318, and D319), and changes at this position could prevent access of SOF to the site of catalysis (Fig. 2).

Substitutions at S282 Reduce SOF Efficacy.
Although S282T was not detected in any subjects in the trials analyzed here, it did emerge in 1 subject infected with HCV GT2a in a phase II clinical trial. In addition, S282R was observed in 1 subject (subject C; Table 4) infected with HCV GT1a who failed treatment in Study P. L320F was also observed in samples from this subject, but the two substitutions were present in different HCV genomes.

Role of N316 Polymorphism in SOF Efficacy. N316 was detected at baseline in 6 of 13 subjects (4 of whom also had L159F) infected with HCV GT1b who failed treatment in study P, but was not detected in subjects who achieved SVR12. In the phase III clinical trials, 6 GT1b subjects had N316 (n = 4) or H316 (n = 2) at baseline, but these numbers were too small to draw conclusions on efficacy. In total, C316N/H/F was associated with SOF failure in 7 subjects infected with HCV GT1b in studies P and GS-US. In addition, C316F emerged in 1 subject infected with HCV GT1a who experienced relapse. Of note, position C316 is highly conserved in HCV GT1a, and substitutions C316N and C316H are less likely to occur in this genotype because each requires two nucleotide changes in the codon.

Sequence and structural analyses indicated that the only residue within 8 Å of the NS5B polymerase catalytic site that varied between HCV GT1a and GT1b was position 316. The predominant amino acid at that position is C for both genotypes; however, 12% of the time, naturally occurring HCV GT1b has N present at this position. Although we are not aware of a publicly available crystal structure showing an HCV NS5B polymerase in complex with SOF, nucleoside analogs are known to directly interact with the active site to facilitate incorporation into the nascent RNA template, leading to chain termination.
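The codon argument above (C316N and C316H each needing two nucleotide changes from a cysteine codon) can be checked against the standard genetic code. The following is a minimal sketch; the function name `min_nucleotide_changes` is ours, and it considers only the codons for the four residues discussed, without reference to the actual codon usage at position 316 in any isolate.

```python
from itertools import product

# Standard genetic code codons (DNA alphabet) for the residues discussed.
CODONS = {
    "C": ["TGT", "TGC"],  # cysteine
    "N": ["AAT", "AAC"],  # asparagine
    "H": ["CAT", "CAC"],  # histidine
    "F": ["TTT", "TTC"],  # phenylalanine
}


def min_nucleotide_changes(from_aa, to_aa):
    """Smallest number of nucleotide differences between any pair of codons."""
    return min(
        sum(a != b for a, b in zip(src, dst))
        for src, dst in product(CODONS[from_aa], CODONS[to_aa])
    )


for target in "NHF":
    print(f"C316{target}: {min_nucleotide_changes('C', target)}")
# C316N: 2
# C316H: 2
# C316F: 1
```

Under the standard code, only C316F is reachable from a cysteine codon by a single nucleotide change, which is at least consistent with C316F being the substitution at this position that emerged in the GT1a subject.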
Structural assessments predict that N316 or additional substitutions that add bulk (i.e., histidine and phenylalanine) in the NS5B of HCV GT1a may alter the ability of SOF to interact with the active site and incorporate into the nascent viral RNA strand (Fig. 2D). It is also important to note that C316N has been shown to affect the activity of certain NS5B-palm-targeting nonnucleoside polymerase inhibitors.16

In this article, we described our independent assessment of NGS resistance data from five clinical trials supporting the FDA approval of SOF. SOF has a high barrier to resistance; however, we observed that low-frequency NS5B substitutions (L159F, V321A, C316N, and S282R) were associated with SOF failure in a subset of subjects. We used structural bioinformatics to show how these substitutions could feasibly affect the interaction of SOF with the NS5B polymerase active site. Viruses carrying these substitutions, although relatively infrequent, may reduce the effectiveness of SOF and affect future treatment options. Other SOF clinical studies, for example, retreatment with SOF-containing regimens, are needed to fully understand the clinical impact of these potential resistance-associated substitutions.

Acknowledgment: The authors acknowledge (by agency in alphabetical order): Debra Birnkrant, Ed Cox, John Jenkins, Jeff Murray, Dave Roeder, Dana Schuly, and the CDER document room staff for help in obtaining the NGS data sets and for support
throughout this project; Chuck Cooper, David Epstein, and Lilliam Rosario for help and support in developing the NGS pipeline; Brian Fitzgerald, Fu-Jyh Luo, Michael Mikailov, and Lohit Valleru for help with implementing and accessing CLC Genomics Workbench on the high-performance computing system; the FDA SOF review team for its excellent work in approving this drug, and, in particular, Poonam Mishra who provided editorial support for the manuscript; the Gilead Sciences, Inc., development team for providing the NGS data in a format that could be used by the division and for providing feedback on these procedures; the investigators and study subjects who organized, conducted, or participated in SOF clinical trials; and Vahan Simonyan for technical support with analysis of NGS data.

References

1. Jacobson IM, Gordon SC, Kowdley KV, Yoshida EM, Rodriguez-Torres M, Sulkowski MS, et al. Sofosbuvir for hepatitis C genotype 2 or 3 in patients without treatment options. N Engl J Med 2013;368:
2. Lawitz EJ, Mangia A, Wyles D, Rodriguez-Torres M, Hassanein T, Gordon SC, et al. Sofosbuvir for previously untreated chronic hepatitis C infection. N Engl J Med 2013;368:
3. Nyanguile O, Devogelaere B, Vijgen L, Van den Broeck W, Pauwels F, Cummings MD, et al. 1a/1b subtype profiling of nonnucleoside polymerase inhibitors of hepatitis C virus. J Virol 2010;84:
4. Mosley RT, Edwards TE, Murakami E, Lam AM, Grice RL, Du J, et al. Structure of hepatitis C virus polymerase in complex with primer-template RNA. J Virol 2012;86:
5. PyroMap. Available at:
6. Svarovskaia ES, Dvory-Sobol H, Hebner C, Doehle B, Gontcharova V, Martin R, et al. No resistance detected in four phase 3 clinical studies in HCV genotype 1-6 of sofosbuvir + ribavirin with or without peginterferon.
64th Annual Meeting of the American Association for the Study of Liver Diseases, Washington, DC, November 1-5,
7. Lam AM, Espiritu C, Bansal S, Micolochick Steuer HM, Niu C, Zennou V, et al. Genotype and subtype profiling of PSI-7977 as a nucleotide inhibitor of hepatitis C virus. Antimicrob Agents Chemother 2012;56:
8. Tong X, Le Pogam S, Li L, Haines K, Piso K, Baronas V, et al. In vivo emergence of a novel mutant L159F/L320F in the NS5B polymerase confers low-level resistance to the HCV polymerase inhibitors mericitabine and sofosbuvir. J Infect Dis 2014;209:
9. Lam AM, Espiritu C, Bansal S, Micolochick Steuer HM, Zennou V, Otto MJ, et al. Hepatitis C virus nucleotide inhibitors PSI and PSI exhibit a novel mechanism of resistance requiring multiple mutations within replicon RNA. J Virol 2011;85:
10. Fonseca NA, Rung J, Brazma A, Marioni JC. Tools for mapping high-throughput sequencing data. Bioinformatics 2012;28:
11. Lin PF, Samanta H, Rose RE, Patick AK, Trimble J, Bechtold CM, et al. Genotypic and phenotypic analysis of human immunodeficiency virus type 1 isolates from patients on prolonged stavudine therapy. J Infect Dis 1994;170:
12. Larder BA, Chesebro B, Richman DD. Susceptibilities of zidovudine-susceptible and -resistant human immunodeficiency virus isolates to antiviral agents determined by using a quantitative plaque reduction assay. Antimicrob Agents Chemother 1990;34:
13. Komatsu TE, Pikis A, Naeger LK, Harrington PR. Resistance of human cytomegalovirus to ganciclovir/valganciclovir: a comprehensive review of putative resistance pathways. Antiviral Res 2014;101:
14. Sun JH, O'Boyle DR, Zhang Y, Wang C, Nower P, Valera L, et al. Impact of a baseline polymorphism on the emergence of resistance to the hepatitis C virus nonstructural protein 5A replication complex inhibitor, BMS. HEPATOLOGY 2012;55:
15. Verbinnen T, Jacobs T, Vijgen L, Ceulemans H, Neyts J, Fanning G, et al. Replication capacity of minority variants in viral populations can affect the assessment of resistance in HCV chimeric replicon phenotyping assays. J Antimicrob Chemother 2012;67:
16. Hang JQ, Yang Y, Harris SF, Leveque V, Whittington HJ, Rajyaguru S, et al. Slow binding inhibition and mechanism of resistance of nonnucleoside polymerase inhibitors of hepatitis C virus. J Biol Chem 2009;284:

Author names in bold designate shared co-first authorship.