OBP's MARJORIE SHAPIRO ON ANALYTICAL SIMILARITY IN BIOSIMILAR APPROVAL

In a session on biosimilar regulatory issues at the 2016 PDA/FDA Joint Regulatory Conference in September, Office of Biotechnology Products Team Leader Marjorie Shapiro provided an "FDA Perspective on Analytical Similarities with Biosimilar Approval." She discussed: the difference between the 351(a) and (k) pathways; understanding the protein and the relevance of the A-Mab QbD case study; the first three biosimilar approvals as case studies; and clarifications of IND expectations, use of statistics, criticality risk ranking, and protein content analysis.

I am going to talk specifically on the analytical similarity side of the biosimilar development program. [OBP's Emanuela Lacana] talked a lot about the differences between the 351(a) and (k) pathways, and I am just going to touch on that a little bit. I am going to talk about how important it is to understand the protein that you want to develop as a biosimilar. I am going to use what I call the A-Mab case study to pick up on the A-Mab QbD and continued process verification studies that were put together by industry working groups. Then I am going to talk about the three real case studies of the three approved biosimilars we have. And I always feel I need to give some clarifications, because sometimes it seems that sponsors misinterpret our advice. So I like to clarify things when possible. And then I will summarize at the end.

The Difference Between the 351(a) and (k) Pathways

The biosimilar development pathway is distinct from the novel biological entity pathway in that the foundation is the analytical similarity and then the clinical PK and PD. If you are highly similar analytically and you are highly similar in your clinical PK study, then you have a lower risk of clinical differences, and this allows for the abbreviated development pathway.

Now if you are developing a novel product, you think a lot about the quality target product profile (QTPP) and what your critical quality attributes (CQAs) will be. For a biosimilar product, you approach the two a little bit differently. For novel product development, the QTPP forms the basis of the design for development, and it can change during the course of product development. You might start out with I.V. administration and switch to subcutaneous. You might start with a liquid in a vial and then want to go to prefilled syringes. You might have some general ideas of what your critical quality attributes are going to be, but as you develop your product and your manufacturing process, you get a better idea of what they are, and hopefully by the time of the BLA you have a good understanding of that.

For a biosimilar product the QTPP is already defined by the reference product. You don't have to figure out whether you want a vial, a liquid, or a lyophilized product. You already know what the dosing regimen is and what the content of the container should be.

You may not know the critical quality attributes that were decided upon by the sponsor and the agency in the BLA. But you should have an idea, from publicly available information and from your own studies of the reference product, what some of the critical quality attributes are going to be, and maybe what you should focus on in your manufacturing process development.

A very important difference between the novel product development pathway and the biosimilar pathway is that for the novel product there is some R&D work and IND-enabling studies, which include maybe formulation studies, and you may have a good bioassay in place. But your full characterization studies continue during phase 1, 2, 3 in clinical development. You might not have a full understanding of the product until you are ready to submit a BLA. For a biosimilar product, you have to know that before you even come to the agency for the BPD [Biosimilar Product Development] Type 2 or 3 meeting. You have to know that you can make a product that is going to be highly similar. So you have to have all your methods in place. You have to analyze the reference product. You have to have a good expression construct and system. You have to do that way earlier than you would for a novel product development pathway.

The scientific considerations guidance document talks about the stepwise approach that should start with extensive structural and functional characterization. Again, this forms the foundation of the biosimilar development program. The stepwise approach allows you to sort of sit back and say, okay, this is what we know at this stage. What is the residual uncertainty? How do we address that with the next step in our development program?

The question should drive the study design. When you are designing a study, you want to evaluate and understand the question that you are trying to answer. What is the residual uncertainty? What analytical differences have been observed, and how do you evaluate the potential impact of those differences to determine if they are going to be clinically meaningful or not? And what will the data tell you, and will they answer the question? Finally, you need to provide a sound rationale for the methods, assays, design and analysis of the studies. And you need to understand your tools, including the limitations of those tools.

Then the quality considerations guidance document focuses on the types of analytical studies that may be relevant to assess similarity. The general principles lay out the importance of extensive analytical, physicochemical and biological characterization methods, including orthogonal methods, an assessment of product impurities, an understanding of the expression system, and the identification of the lots used for the various analyses for the biosimilarity determination. This includes the reference product lots as well as the biosimilar lots. It also encourages advances in manufacturing science and quality-by-design approaches, which hopefully may facilitate fingerprint-like analysis.

Understanding the Protein and the A-Mab Case Study

It is important to understand the protein that you want to develop as a proposed biosimilar. You need to understand what is important for the biological activity of the protein. If there are multiple mechanisms of action and multiple indications, do you understand which mechanism of action is important for each of the indications and which critical quality attributes are important for that mechanism of action?

And you need to understand the impact of potential post-translational modifications. For example, there are known cases where the oxidation of methionine or the deamidation of an asparagine residue in some proteins may impact function or immunogenicity. But oxidation and deamidation may not have that effect in every protein. And then you need to understand how the combinations of quality attributes interact to impact the clinical performance of your protein.

Insights from the A-Mab Case Study

So here is the A-Mab case study: This is a structural model of IgG that is based on years of research on human, mouse and rat IgG molecules. It is useful because we know which residues are important for binding Fc gamma receptors. We know which residues bind complement. We know which residues make it a certain type of IgG1 allotype, or an IgG1 vs. IgG4 isotype. We know where the carbohydrate is attached. And obviously we know about the Fab region, which does the antigen binding, and the Fc region, which has the effector functions.

Again, typical antibody post-translational modifications include pyroglutamic acid at the N terminus of a heavy chain or light chain and C-terminal lysine. If we just take half of the antibody molecule right now, you have a pyroglutamic acid in the heavy chain, and that can be either on or off in that position. Similarly for the C-terminal lysine. Then in this model we have three deamidation sites, two oxidation sites, two glycation sites and a variety of glycan structures. And what that means is that each one of those can be on or off at that site.

So if you take just half the antibody, there are about 9600 different combinations using these figures. If you look at the whole molecule, there are about 10^8 variants in any given lot of material. So obviously for one individual molecule we are not necessarily going to be able to know which of these post-translational modifications is present. So it is important to understand them for each lot, then on a lot-to-lot basis, and then to understand which of these post-translational modifications may be more or less important for being highly similar.

If we think about charge variants in methionine oxidation, pyroglutamic acid at the N terminus and C-terminal lysines usually contribute to the main charge differences in antibodies. But there is a lot of literature that says they don't impact function or in vivo behavior. So we would tolerate larger differences in these post-translational modifications than perhaps in others.
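The arithmetic behind the two figures quoted above can be sketched as follows. The talk gives only the totals (about 9600 per half-molecule, about 10^8 per whole molecule); the per-site state counts in `ASSUMED_STATES` are an assumption chosen to reproduce 9600 and are not stated in the talk.

```python
def count_combinations(states_per_site):
    """Multiply the number of possible states at each independent site."""
    total = 1
    for n in states_per_site:
        total *= n
    return total

# ASSUMPTION: one illustrative breakdown that multiplies out to the quoted 9600.
# The talk itself does not give these per-site numbers.
ASSUMED_STATES = {
    "N-terminal pyroglutamate": 2,
    "deamidation": 6,
    "oxidation": 4,
    "glycation": 4,
    "neutral glycan structures": 5,
    "sialylation": 5,
    "C-terminal lysine": 2,
}

half_molecule = count_combinations(ASSUMED_STATES.values())

# A whole IgG is two half-molecules; treating them as independent squares the count.
whole_molecule = half_molecule ** 2

print(half_molecule)   # 9600
print(whole_molecule)  # 92160000, i.e. on the order of 10^8
```

The point of the exercise is the order of magnitude: the count grows multiplicatively with each additional site, which is why attributes are assessed per lot as distributions rather than molecule by molecule.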

Similarly, the asparagine in the PENNY peptide in the C terminus is usually the one that has the highest levels of deamidation in antibodies, and there is literature to suggest that these typical levels of deamidation don't impact function. But there is also literature that shows that deamidation in a heavy chain or light chain CDR can impact binding. So for something like this, we would be more concerned about differences between the biosimilar and the reference product than we would be, say, for deamidation of the PENNY peptide.

Similarly, for oxidation: Typically, in an antibody the most highly oxidized peptides are the ones in the C terminus that potentially could impact binding to the neonatal Fc receptor (FcRn). But again the literature suggests that levels much higher than you typically see for a therapeutic monoclonal antibody would be needed to impact binding to FcRn. Theoretically, though, you could have an oxidized amino acid in a CDR that could impact function.

Finally, glycation: there is literature to suggest that even levels higher than what you typically see in a therapeutic monoclonal antibody don't impact function.

The good news is that all of these post-translational modifications happen to endogenous antibodies. So if your therapeutic monoclonal antibody has some of these, you are not introducing something to a patient that the patient has not seen before.

The one thing to think about, though, is that if you are delivering your product subcutaneously, it is possible that large charge differences, say greater than a pI unit, can impact the PK because of how tightly the subcutaneous tissue is packed in human skin. This is a residual uncertainty if you have a large charge difference. But that residual uncertainty is something that the PK study would clearly address.

So, more important than some of the post-translational modifications, we would then look at the functional aspects of the molecule: antigen binding and effector function. Clearly we want to see binding to antigen. If the binding to antigen triggers a biological signal, we want to see biological assays that address that and show the biosimilar product is highly similar to the reference product. We want to understand binding to complement, to the Fc gamma receptors and to the neonatal Fc receptor. These sorts of biochemical assays should all be performed and hopefully will demonstrate that the biosimilar is highly similar to the reference product.

Then we know that the glycan is very important and that certain glycan structures can impact effector function and PK. Here is the standard saccharide that is attached to asparagine 297 in an antibody, and it shows each of the monosaccharide units.

We know, for example, that galactose is associated with complement activity. So if complement activity is important for your molecule, then you should be highly similar in the levels of the galactosylated forms G0, G1 and G2.

We know that fucose can inhibit binding to the FcγRIII receptor, which results in a decrease in ADCC activity, so we would want sponsors to assess levels of afucosylated proteins; that's generally afucosylated G0, G1, G2 and some high mannose structures.

We know that the bisecting N-acetylglucosamine can inhibit the addition of fucose, which would result in increased ADCC activity. And although most therapeutic monoclonal antibodies don't have very high levels of sialic acid, there's literature that says when it is present at high levels it can also inhibit binding to the FcγRIII receptor. So if this were a unique situation for your antibody in an expression system and you had high sialic acid, we would want that to be highly similar.

Then again, incompletely processed high mannose glycans can be cleared faster and impact the PK, and they also lack fucose, which could impact ADCC. So your high mannose glycan forms should be highly similar.

Experience with the First Three Biosimilars Approved

Here are some real case studies that are taken from Advisory Committee slides that were presented for the [first three biosimilar products approved by FDA]. These links [provided above at the end of the IPQ narrative] will take you to the Advisory Committee web page where you can get all the details if you are interested.

Filgrastim

The first product is biosimilar filgrastim. G-CSF (filgrastim) has a pretty simple structure: it is a non-glycosylated protein. It is only 18.8 kDa and it is manufactured in E. coli. It is really purified to homogeneity. It is amenable to extensive analytical characterization. And there is good knowledge of the structure-function relationship. There is knowledge of chemical modifications (methionine oxidation can reduce potency), and we understand the critical role of binding to the G-CSF receptor.

This is a summary of the list of the quality attributes that were used to assess the analytical similarity between the biosimilar product EP2006, otherwise known as Zarxio, and US-Neupogen and EU-Neupogen. The three-way analytical bridge was needed because the clinical studies used EU-Neupogen. You can see what the conclusions were for each of these quality attributes: the biosimilar product was highly similar to the reference product, as well as to EU-Neupogen. In addition, the three products had highly similar stability profiles.

The biologic activity was assessed by our most rigorous statistical method, which is statistical equivalence. This is a graph of the biological activity showing the biosimilar product in the squares, US-Neupogen in the triangles and EU-Neupogen in the circles. These are the results of the statistical equivalence test. There was equivalence between the biosimilar and US-Neupogen, between the biosimilar and EU-Neupogen, and between EU-Neupogen and US-Neupogen. So this supported the three-way analytical bridge, which allowed the use of the clinical data collected with EU-Neupogen. All three products were highly similar by this statistical equivalence test.
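The talk does not spell out how the equivalence test is computed. One common formulation, assumed here, is a two one-sided tests approach: two products are declared equivalent for an attribute when the 90% confidence interval for the difference in lot means lies entirely inside a margin of ±1.5 reference standard deviations. The margin multiplier, the normal-quantile approximation (a t-quantile would be more appropriate for small lot counts), and all lot values below are assumptions of this sketch, not FDA's or the sponsor's actual analysis.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def equivalence_test(test_lots, ref_lots, margin_sd=1.5, alpha=0.05):
    """Check whether the (1 - 2*alpha) CI of the mean difference sits inside the margin.

    ASSUMPTIONS: margin = margin_sd * SD of the reference lots; the CI uses a
    normal quantile as a large-sample approximation.
    """
    diff = mean(test_lots) - mean(ref_lots)
    sd_ref = stdev(ref_lots)
    margin = margin_sd * sd_ref
    se = sqrt(stdev(test_lots) ** 2 / len(test_lots) + sd_ref ** 2 / len(ref_lots))
    z = NormalDist().inv_cdf(1 - alpha)
    lower, upper = diff - z * se, diff + z * se
    return {
        "ci": (lower, upper),
        "margin": margin,
        "equivalent": -margin < lower and upper < margin,
    }

# Hypothetical relative-potency values (% of standard), one number per lot.
biosimilar = [98, 101, 100, 99, 102, 100, 97, 103, 100, 101]
us_ref = [100, 102, 99, 101, 98, 103, 100, 97, 101, 99]
eu_ref = [99, 101, 100, 102, 98, 100, 103, 97, 100, 101]

# A three-way bridge means all three pairwise comparisons are examined.
pairs = {
    "biosimilar vs US": (biosimilar, us_ref),
    "biosimilar vs EU": (biosimilar, eu_ref),
    "EU vs US": (eu_ref, us_ref),
}
for name, (a, b) in pairs.items():
    print(name, equivalence_test(a, b)["equivalent"])
```

Because the margin scales with the reference product's own lot-to-lot variability, the number of reference lots analyzed directly affects how well that margin is estimated.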

Infliximab

The next product was a biosimilar infliximab. The advisory committee was in February. Again you can see that the quality attributes that were assessed were in the same categories we used for filgrastim. But because it is a different type of molecule, there were molecule-specific assays. Infliximab and adalimumab (although we do not have any approved adalimumab biosimilars at this time) are some of the more complex molecules to demonstrate similarity to because of the number of indications that they have.

If you look at the top row across, it is approved for RA [rheumatoid arthritis], ankylosing spondylitis, psoriatic arthritis, plaque psoriasis, and two inflammatory bowel disease indications, Crohn's disease and ulcerative colitis. For the first four indications, there is high confidence among medical specialists and people who understand these products that it works primarily by blocking the binding of soluble and membrane-bound TNF to its receptors. This is likely an important mechanism of action for the inflammatory bowel disease indications as well, but there are a number of other mechanisms of action, likely or at least plausible, that could potentially play a role in the inflammatory bowel diseases. The likely mechanisms of action include things like reverse signaling via membrane-bound TNF and potential Fc effector function mechanisms. So there were a lot more biological assays that needed to be assessed for this program.

For the TNF antagonists, binding to TNFα and in vitro neutralization of TNFα are considered the highly critical quality attributes that we would want to assess by statistical equivalence. In this case the biosimilar product and the US and EU products all met the tests for statistical equivalence. So for these important mechanisms of action, the products were highly similar.

There were some questions about Fc effector function. Using an assay with peripheral blood mononuclear cells, which is maybe less sensitive than using cell lines but maybe more relevant to the patient situation, we used the quality range approach. You can see that the biosimilar product was well within the quality range of both the US and the EU material.

However, when using sort of a worst-case scenario assay, with an NK cell line and a cell line that was engineered to overexpress membrane-bound TNF, most of the lots were within the quality range, but a couple fell outside. However, FDA has determined that because of assay variability, method variability, and lot-to-lot variability, it would be acceptable if 90% of the lots were within the quality range.
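The 90% criterion described above can be sketched as a simple check. The talk does not say how the quality range itself is set; this sketch assumes the range is the reference-lot mean plus or minus a multiple `k` of the reference standard deviation, with `k = 3` chosen only for illustration. All lot values are hypothetical.

```python
from statistics import mean, stdev

def quality_range(ref_lots, k=3.0):
    """ASSUMPTION: range = reference mean +/- k * reference SD."""
    m, s = mean(ref_lots), stdev(ref_lots)
    return m - k * s, m + k * s

def fraction_within(test_lots, low, high):
    """Fraction of test lots whose value falls inside [low, high]."""
    inside = sum(1 for x in test_lots if low <= x <= high)
    return inside / len(test_lots)

def passes_quality_range(test_lots, ref_lots, k=3.0, required=0.90):
    """True when at least `required` of the test lots fall inside the range."""
    low, high = quality_range(ref_lots, k)
    return fraction_within(test_lots, low, high) >= required

# Hypothetical ADCC activity values (% of standard), one number per lot.
reference = [100, 95, 105, 98, 102, 101, 99, 97, 103, 100]
nine_of_ten_inside = [99, 101, 96, 104, 100, 98, 102, 97, 103, 115]
eight_of_ten_inside = [99, 101, 96, 104, 100, 98, 102, 97, 80, 115]

print(quality_range(reference))
print(passes_quality_range(nine_of_ten_inside, reference))
print(passes_quality_range(eight_of_ten_inside, reference))
```

With ten lots, the 90% threshold tolerates exactly one lot outside the range; with fewer lots, a single excursion fails the check, which is one reason the number of lots matters.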

So this was the case with the more sensitive worst-case scenario assay: 90% of the lots fell within the quality range. So we were able to conclude that this biosimilar product was highly similar to the reference product.

Etanercept

The most recent product was approved August 30. It is a biosimilar etanercept. The advisory committee was in July. Again, this is the etanercept structure. It is TNF receptor 2, the extracellular portion of that, fused to the Fc region of an antibody. It has a complex structure: it has 3 N-linked and 10 O-linked glycans, and it has 13 intra-chain disulfide bonds. I am going to spend a lot of time talking about that.

First, here are the methods used to evaluate analytical similarity. Just like the previous two products, we had studies to assess primary structure, content, higher order structure, and biologic activity: very similar approaches, just with methods appropriate for this product.

This is the TNF receptor portion of the molecule. You can see that it has a complex structure of disulfide bonds. What is known in the literature is that etanercept contains a certain amount of misfolded protein due to wrongly bridged disulfide bonds, and that misfolded protein has reduced potency. So this became an important quality attribute for us to evaluate.

So if you look at the fourth row down, there is a disulfide bond between cysteines 74 and 88. The disulfide bonds in that area are among the ones that can be improperly bridged. This sponsor was Sandoz. They identified a peptide, the T7 peptide, that contains this area, and they did some studies with it. I will come back to that in two slides, but first I want to show you that the reason this was a concern for us is that the biosimilar product on average had lower levels of this misfolded protein, which is identified as a hydrophobic variant by reverse phase chromatography. So it had lower levels relative to the US-licensed product as well as the EU product.

By measuring the percentage of T7 peptide in their lots and in reference product lots, and also the bioactivity of those same lots, Sandoz could show an inverse correlation between the bioactivity and the percentage of the peptide. The higher the level of misfolded peptide, the lower the bioactivity.

We had several discussions with the sponsor about this. We asked them to explore whether, in vivo, the wrongly bridged variants could refold properly. Unfortunately, the bioanalytical assay for the PK samples was not designed in a way that could help us answer that question. So they did some in vitro studies.

Most disulfide bonds are structural and important for holding together the 3D conformation of the protein. But there is a growing body of literature about allosteric disulfide bonds that are very dynamic in vivo and change depending on where the protein is. So they can impact function: you can have a protein that is maybe not active, and then the disulfide bond changes at the appropriate time and now it becomes active. This is important for virus entry for HIV, Ebola and other viruses. There is some literature that suggests TNF receptor proteins themselves have allosteric disulfide bonds. We know, for example, that IgG2 molecules have IgG2 isomers, and IgG4 molecules can form half antibodies and re-form as bispecifics in vivo. These are all examples of allosteric disulfide bonds.

So if you look at the examples in the top red box, Sandoz took a couple of process intermediates that had higher levels of this T7 peptide before it is purified out. And they also took examples of the reference product, both the US reference product and the EU version of the product. You can see that those lots had higher levels of the T7 peptide relative to their purified drug substance or their drug product. And they had lower potency. So they used in vitro redox conditions which were designed to mimic in vivo conditions. You can see they were able to restore potency and reduce the level of the T7 peptide.

So based on this information and this correlation they have here, they were able to develop a computed potency model based on the level of T7 peptide and what they would expect to happen in vivo. So this sort of summarizes the methods to assess biologic activity.
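The inverse correlation and the idea of a computed potency can be illustrated with a simple least-squares line. This is only a sketch: the talk does not describe the form of the sponsor's model, so the linear fit, the function names, and every data value below are hypothetical.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# HYPOTHETICAL data: % wrongly bridged T7 peptide and relative potency (%) per lot.
t7_percent = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
potency = [101.0, 97.0, 94.0, 89.0, 86.0, 81.0]

slope, intercept = fit_line(t7_percent, potency)
r = pearson_r(t7_percent, potency)

def computed_potency(t7_level):
    """Potency predicted by the fitted line for a given T7 level (illustrative only)."""
    return intercept + slope * t7_level

print(f"slope={slope:.2f}, r={r:.3f}")
print(f"predicted potency at 2.5% T7: {computed_potency(2.5):.1f}")
```

A negative slope with a correlation coefficient near -1 is what "inverse correlation" means here: lots with more of the misfolded variant show proportionally lower activity.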

TNF-α binding met statistical equivalence without having to do the redox in vitro studies. The TNF-α neutralization reporter gene assay (RGA) did not meet statistical equivalence before the redox conditions, although all the lots were within the quality range. But after the redox conditions, the material met statistical equivalence. It is not clear to us why the TNF-α neutralization reporter gene assay was the most sensitive to this, although nothing else was assessed by statistical equivalence. But everything else fell within the quality ranges. And so we were able to make the determination that the biosimilar was highly similar to the reference product.

So here are some clarifications:

Issues Warranting Clarification

At the Clinical Stage

We have defined what it means to be not similar, similar, highly similar, and highly similar with fingerprint-like similarity in the FDA draft guidance on clinical pharmacology. But one thing I want to clarify is that the statutory requirement to be highly similar applies to the BLA. You don't necessarily have to meet that statutory requirement when you submit your IND. So you do not need to be highly similar to initiate clinical studies. You could submit your IND and we might think you are looking highly similar, but you have not made enough lots yet to convince us. Or maybe if we see some differences, say in high mannose, the PK study might address that residual uncertainty. So it is hard to be highly similar when you initiate your IND unless you have already manufactured 20 lots of everything.

But we want you to submit your analytical similarity data with your BPD Type 2 or 3 meeting packages, as well as with the initial IND and any IND amendment at any time during development, because that allows us to communicate to our clinical colleagues how we think your development program is going, and that will help them to assess the extent of additional clinical studies you might need. Again, the purpose of the development continuum is to determine the extent of additional clinical studies.

Statistics

So this is the advice on statistics we have been giving for a little bit over a year: Again, the statistical analysis is for the BLA submission. It is not needed to initiate clinical studies, although sponsors seem to misunderstand that. We want you to perform a risk assessment of the quality attributes, the same as you would for a novel product. You should consider the criticality risk ranking with regard to the potential impact on activity, PK/PD, safety, and immunogenicity. That is not anything different than what you would do for a novel product.

Our concern is that some sponsors are worried that if they rank ten things, twenty things as highly critical, then they will all have to be assessed by equivalence. And that is not the case. We are only likely to look at things like the potency assays and biological activity by equivalence, and obviously not even all of them. Some of them would be assessed by the quality range. We want you to ask for advice for your individual development programs. And as Manu said, we hope to have the guidance out, but we do not know when yet.

Criticality Risk Ranking

The criticality risk-ranking approach is the same for novel products and biosimilars. Many attributes should be ranked as critical or highly critical. They won't all be tested using the most rigorous statistical method. In general, assays that assess mechanisms of action should be evaluated using Tier 1 statistical equivalence.

Because protein content was analyzed by statistical equivalence for the first biosimilar, filgrastim, a lot of sponsors automatically thought that it would be a Tier 1 attribute, but it is not. We would consider it for statistical equivalence if the product has a narrow therapeutic index, when the dosing is on the linear portion of the response curve, although we might be rethinking that a little bit. But we certainly would not include it for statistical equivalence for something like antibodies or enzymes, which are dosed to saturation.

Protein Content

Protein content has been much more complicated than I think we thought it would be. We want you to use the correct extinction coefficient. Start by determining the theoretical value. Confirm it experimentally for the biosimilar product, the reference product, and the non-US comparator. Content can be controlled by manufacturing parameters. And an incorrect determination of protein content can impact your dosing and PK.
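As a sketch of what "determining the theoretical value" can look like, the widely used sequence-based estimate of the molar extinction coefficient at 280 nm sums contributions from tryptophan, tyrosine and cystine residues, and the Beer-Lambert law then converts an absorbance reading into a concentration. The talk does not name a specific method, so treating this estimate as the intended one is an assumption; the residue counts and absorbance reading below are hypothetical, and only the 18.8 kDa mass comes from the talk.

```python
# Per-residue contributions at 280 nm in M^-1 cm^-1 (commonly used literature values).
E280_TRP = 5500
E280_TYR = 1490
E280_CYSTINE = 125

def theoretical_e280(n_trp, n_tyr, n_cystine):
    """Sequence-based estimate of the molar extinction coefficient at 280 nm."""
    return n_trp * E280_TRP + n_tyr * E280_TYR + n_cystine * E280_CYSTINE

def concentration_mg_per_ml(a280, e280, mw_da, path_cm=1.0):
    """Beer-Lambert: molar concentration = A / (e * l); multiplying by MW gives g/L = mg/mL."""
    return a280 / (e280 * path_cm) * mw_da

# HYPOTHETICAL residue counts and absorbance for a small protein of about 18.8 kDa.
e280 = theoretical_e280(n_trp=2, n_tyr=3, n_cystine=2)
content = concentration_mg_per_ml(a280=0.836, e280=e280, mw_da=18800)

# Using a coefficient that is 5% too high understates the content by about 5%.
content_wrong = concentration_mg_per_ml(a280=0.836, e280=e280 * 1.05, mw_da=18800)

print(e280)  # 15720
print(round(content, 3), round(content_wrong, 3))
```

The last two lines show why the coefficient matters: the same absorbance reading yields a different stated content, and therefore a different dose, depending on which coefficient is applied.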

Winning the Marathon

It is a marathon. It is a lot of work. We don't want you to quit, but sometimes it feels like you are asking us to pull you across the finish line. Because across a variety of sponsors and a variety of products at different points in development, we see several packages that look like there are analytical differences that could impact PK or mechanism of action, and the package is silent on it. The sponsor does not say anything about it. So we sit there and scratch our heads. Are they hoping we won't notice? Or are they hoping we will figure out for them why it is OK?

So the pitfalls here are that you might have poor clone selection or inadequate process development, or you might have just analyzed an insufficient number of biosimilar or reference product lots. There are a variety of reasons for this, but you really need to think carefully about it.

How different can a product be and still be analytically highly similar? It is not a simple question. Is the attribute or product variant in both the biosimilar and the reference product? Or is it something only in the biosimilar product? If it's present in both, then what is the level of the variant? Again, for something like the C-terminal lysine or an N-terminal pyroglutamic acid, we are not going to care as much about it as we would if it were a deamidation in the CDR that reduces binding. For other product-related variants, such as charge or glycan structures, it depends again on whether the variant impacts the mechanism of action (MOA), PK, PD or immunogenicity, or whether there is residual uncertainty.

So to win the marathon, start by analyzing several lots of reference product, develop your analytical methods, develop your expression system, develop your manufacturing process (in particular, the production bioreactor process), and expect it to be an iterative process.

So it is not just a panel of methods and results. It is the biosimilar product lots and the number of lots. It is the US-licensed reference product and the non-US comparator lots. It is understanding the process to produce a product with consistent quality attributes. And then it is the timing of analytical data submission during clinical development, and it is the totality of the evidence.

I would like to thank the first three people on this list [OBP's Tere Gutierrez, Kurt Brorson, and Peter Adams], who reviewed the first three BLAs that were approved, and these were their slides.