Introduction
The requirements for dissolution profile comparison have evolved since the adoption of the ‘Guideline on the investigation of bioequivalence’ (CPMP/EWP/QWP/1401/98 Rev. 1/ Corr **), as reflected in the EMA Clinical Pharmacology and Pharmacokinetics Q & As 3.9 and 3.11 (available at ‘Clinical pharmacology and pharmacokinetics: questions and answers’).
Appendix I of the ‘Guideline on the investigation of bioequivalence’ described the state-of-the-art on dissolution profile comparison at the time of its adoption. The f2 similarity factor was recommended to evaluate dissolution profile similarity and the conditions for its calculation were defined. When the f2 similarity factor was not suitable (which was generally due to an excessive variability in the percentage dissolved at any of the sampling times selected for the dissolution profile characterisation in any of the formulations under comparison), Appendix I recommended alternative methods.
The possibility to choose these or other additional methods can cause result-driven reporting, which may increase the probability of incorrectly concluding on dissolution similarity. Therefore, identification of the most adequate methodology for dissolution profile comparison was addressed in Q & A 3.9, considering that the acceptance criterion defined in Appendix I of the Guideline was: “The similarity acceptance limits should be pre-defined and justified and not be greater than a 10% difference”.
Furthermore, the most frequently used additional methods were the 90% confidence region of the Mahalanobis Distance (MD) of the percentage dissolved at the different sampling times that characterise the dissolution profiles and the bootstrap 90% confidence interval of the f2 similarity factor. The 90% confidence region of the Multivariate Statistical Distance of the parameters of a dissolution model, e.g., the Weibull function, was not considered adequate due to the difficulty of defining the acceptance limits for the model parameters that would result in a correct characterisation of a 10% difference in the percentage dissolved.
Q & A 3.9 recommended “any approach based upon confidence intervals for f2” and that “similarity could then be declared if the confidence interval for f2 were entirely above 50”. It also stated: “The properties of the f2 sampling distribution do not allow the derivation of exact confidence intervals to adequately quantify the uncertainty of the f2 estimate. To address this, bootstrap methodology could be used to derive confidence intervals for f2 based on quantiles of re-sampling distributions, and this approach could actually be considered the preferred method over f2 and MD”.
Q & A 3.11 was published subsequently to clarify some of the issues that had not been addressed previously and to avoid some problems observed in regulatory submissions.
For example, the criteria to conclude on the low or high variability of the dissolution profiles, based on the relative standard deviation (RSD) or coefficient of variation (CV) of the percentage dissolved at the sampling times, were updated according to the ‘Reflection paper on the dissolution specification for generic solid oral immediate release products with systemic action’ (EMA/CHMP/CVMP/QWP/336031/2017), where high variability of the dissolution results has been exemplified as greater than 20% RSD at time-points up to 10 minutes and greater than 10% RSD in the later phase for a sample size of 12. This change was implemented because a tendency to avoid early sampling times (e.g., at 5 minutes) had been observed, which might cause an incomplete characterisation of the profile, and to avoid excessive variability in more than one sampling time (e.g., at 5 and 10 minutes).
Q & A 3.11 also reinforced that all sampling times pre-defined in the dissolution study protocol, up to the sampling time where one of the products reaches > 85% dissolved, should be considered in the f2 calculation. This was necessary because the unjustified exclusion of some sampling time data had been observed when more than 3 valid sampling times were available, and because more than one sampling time with percentage dissolved exceeding 85% was also frequently included.
The need to conduct these comparative dissolution studies according to a predefined protocol was also highlighted since pooling of previous data obtained for other purposes and that comply with the desired conclusion is not rare.
More importantly, Q & A 3.11 defined the bootstrap methodology to be employed, since Q & A 3.9 had not defined the confidence level required for the confidence interval and any approach based on confidence intervals for f2 was considered appropriate.
Q & A 3.11 also addressed the expected reporting of study results and the need to confirm the correctness of the calculations of the software packages employed since the use of software packages that provide incorrect results had been observed.
Finally, the ‘Reflection paper on statistical methodology for the comparative assessment of quality attributes in drug development’ (EMA/CHMP/138502/2017) highlights that the ‘Guideline on the investigation of bioequivalence’ introduces dissolution similarity assessment as ‘Bioequivalence surrogate inference’. This implies that inferential statistical methodology would ideally be applied, e.g., to infer a similarity-in-dissolution claim from the tablet sample to the whole tablet population (all tablets ever produced by a given manufacturing process). Q & A 3.9 noted that, from a statistical point of view, “in cases when f2 is considered suitable, i.e. can be used as outlined in Appendix 1 of the CHMP guideline on the investigation of bioequivalence [CPMP/EWP/QWP/1401/98 Rev. 1/ Corr **], guideline-compliant evaluation of dissolution similarity does not involve confidence interval estimation to decide upon similarity. The recommended decision criterion is based only upon the derived numerical value for f2 (point estimate ≥ 50). This means that the uncertainty related to the f2 sampling distribution is not accounted for and f2 employed on its own does not have any inferential element”. Therefore, it was concluded that bootstrap methodology could be used to derive confidence intervals for f2 based on quantiles of re-sampling distributions, and that this approach could actually be considered the preferred method over f2.
However, it is also essential to consider that use of the point estimate of the f2 similarity factor is widely applied and accepted worldwide in spite of its deficiencies (e.g., ‘ICH M9 Guideline on biopharmaceutics classification system-based biowaivers’ (EMA/CHMP/ICH/493213/2018)).
The present Q & A 3.13 summarises the current expectations for comparing dissolution profiles as a surrogate for bioequivalence (e.g., when variations are supported by in vitro dissolution data or additional strengths are biowaived), especially when variability is large in the dissolution dataset.
Sampling times for characterisation of the dissolution profiles
Dissolution profile similarity testing and any conclusions drawn from the results (e.g., justification for a biowaiver) can be considered valid only if the dissolution profile has been satisfactorily characterised using a sufficient number of time points. Sampling time points should be sufficient to obtain meaningful dissolution profiles, and at least every 15 minutes. More frequent sampling during the period of greatest change in the dissolution profile is recommended. For oral immediate release formulations with intestinal absorption, a sampling time for comparison at 15 min is essential to know if complete dissolution is reached before gastric emptying. For rapidly dissolving products, where complete dissolution is within 30 minutes, generation of an adequate profile by sampling at 5- or 10-minute intervals may be necessary. At least three time points are required: the first time point before 15 minutes, the second at 15 minutes and the third when the release is close to 85% (e.g., 5, 10, 15, 20 and 30 min). This sampling schedule should be extended as necessary if the dissolution rate is slower (e.g., 5, 10, 15, 20, 30, 45, 60 min). For modified release products, the advice given in the relevant guidance should be followed.
Similarity of dissolution profiles
For immediate release oral dosage forms with systemic action, where more than 85% of the drug is dissolved within 15 minutes, dissolution profiles may be accepted as similar without further mathematical evaluation.
The f2 similarity factor is calculated as follows:
$$ f_2 = 50 \cdot \log_{10}\left( \frac{100}{\sqrt{1 + \frac{1}{n}\sum_{t=1}^{n}\left[R(t) - T(t)\right]^2}} \right) $$
In this equation, f2 is the similarity factor, n is the number of time points, R(t) is the mean percent reference drug dissolved at time t after initiation of the study, and T(t) is the mean percent test drug dissolved at time t after initiation of the study. For both the reference and test formulations, percent dissolution should be determined.
The evaluation of the similarity factor is based on the following conditions:
- a minimum of three time points (zero excluded),
- the time points should be the same for the two formulations,
- at least twelve individual values for every time point for each formulation,
- not more than one mean value of > 85% dissolved for any of the formulations, and
- the RSD or CV of any product should be less than 20% at early time points (up to 10 minutes) and less than 10% at other time points.
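The variability condition in the last item can be checked mechanically. The following is a minimal sketch, assuming the individual results are held as a units-by-time-points array; the function name `high_variability_times` and its parameters are hypothetical, and the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator is meant.

```python
import numpy as np

def high_variability_times(times_min, values, early_cutoff_min=10.0,
                           early_limit=20.0, late_limit=10.0):
    """Return the sampling times at which the RSD (%) is not below its limit.

    times_min : sampling times in minutes, one per column of `values`
    values    : individual percent dissolved, shape (units, time points)
    Assumes every mean percent dissolved is greater than zero.
    """
    times = np.asarray(times_min, dtype=float)
    data = np.asarray(values, dtype=float)
    mean = data.mean(axis=0)
    sd = data.std(axis=0, ddof=1)          # sample SD (assumption)
    rsd = 100.0 * sd / mean
    # 20% limit up to 10 minutes, 10% limit afterwards
    limit = np.where(times <= early_cutoff_min, early_limit, late_limit)
    return [float(t) for t, r, lim in zip(times, rsd, limit) if r >= lim]

# Illustrative data only: 12 units, 4 sampling times
times = [5, 10, 15, 20]
data = np.array([[10, 38, 50, 84], [30, 42, 70, 86]] * 6, dtype=float)
flagged = high_variability_times(times, data)
```

An empty list would indicate that the variability condition for the f2 point estimate is met for that product; the check would be run separately for the test and the reference product.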
All sampling times that were pre-defined in the dissolution study protocol until the sampling time where one of the products reaches > 85% dissolved should be used in the f2 calculation.
An f2 value between 50 and 100 suggests that the two dissolution profiles are similar. The results should be reported rounded to the nearest integer, without decimal places. The similarity acceptance limit of 50 corresponds to a mean difference of 10% in percentage dissolved over all sampling times used. In addition, the dissolution variability of the test and reference product data should also be similar; however, a lower variability of the test product may be acceptable.
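As an illustration of the calculation described above, the following is a minimal sketch of the f2 point estimate from mean profiles. The function names are hypothetical, and the truncation rule is an interpretation: the first sampling time at which either product exceeds 85% is taken as the last time point included.

```python
import numpy as np

def n_time_points_for_f2(ref_mean, test_mean, threshold=85.0):
    """Number of leading time points to use: all pre-defined times up to and
    including the first one where either mean exceeds the threshold."""
    ref = np.asarray(ref_mean, dtype=float)
    test = np.asarray(test_mean, dtype=float)
    above = (ref > threshold) | (test > threshold)
    if above.any():
        return int(np.argmax(above)) + 1   # index of first True, inclusive
    return len(ref)

def f2_similarity(ref_mean, test_mean):
    """f2 point estimate from mean percent dissolved (time zero excluded)."""
    ref = np.asarray(ref_mean, dtype=float)
    test = np.asarray(test_mean, dtype=float)
    if ref.shape != test.shape:
        raise ValueError("time points must be the same for both formulations")
    k = n_time_points_for_f2(ref, test)
    if k < 3:
        raise ValueError("at least three time points are required")
    diff = ref[:k] - test[:k]
    return float(50.0 * np.log10(100.0 / np.sqrt(1.0 + np.mean(diff ** 2))))

# Illustrative mean profiles with a constant 10% difference
ref = [20, 40, 60, 80, 90, 95]
test = [10, 30, 50, 70, 80, 88]
value = f2_similarity(ref, test)   # approximately 49.9 before rounding
```

With a constant difference of 10% at every time point used, the estimate sits at the acceptance limit, which is consistent with the statement that the limit of 50 corresponds to a mean difference of 10%.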
Bootstrap methodology
A two-sided 90% confidence interval of f2 is the recommended methodology for dissolution comparison in those cases where the variability in the data of any product is greater than 20% at sampling times in the first 10 minutes or greater than 10% at sampling times after 10 minutes. In these cases, the f2 point estimate could be highly influenced by the experimental dissolution data and might not represent the population (true) f2.
The bootstrap analysis should be performed using at least 5,000 samples and the number of samples should be reported. The calculation of the 90% confidence interval should be conducted using any of the percentile methods described by Hyndman and Fan [1]. All sampling times pre-defined in the dissolution study protocol until the sampling time where one of the products reaches >85% dissolved should be considered in the bootstrap dissolution samples.
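The estimation of the f2 value on each bootstrapped dissolution sample should be performed using the Expected-f2 (f2,EXP) formula [2]. The following equation summarises the mathematical description of the Expected-f2:
$$ f_{2,EXP} = 50 \cdot \log_{10}\left( \frac{100}{\sqrt{1 + \frac{1}{n}\left( \sum_{t=1}^{n}\left[R(t) - T(t)\right]^2 + \frac{1}{N}\sum_{t=1}^{n}\left[S_R^2(t) + S_T^2(t)\right] \right)}} \right) $$
In this equation, n is the number of time points, R(t) and T(t) are the mean percent dissolved of the reference and test products at time t, S_R^2(t) and S_T^2(t) are the corresponding sample variances of the individual values at time t, and N is the number of individual units tested per formulation.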
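The bootstrap procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not a validated implementation: the function names are hypothetical, whole profiles (units) are resampled with replacement independently for each product, the expected f2 is written as it is commonly given following [2] (mean squared difference plus the variance term), and the input arrays are assumed to be already restricted to the sampling times to be used. The `method` argument of `numpy.quantile` (NumPy 1.22 or later) selects among the Hyndman and Fan definitions; `"linear"` corresponds to type 7.

```python
import numpy as np

def expected_f2(ref, test):
    """Expected f2 from individual data, shape (units, time points).

    Uses 100 - 25*log10(1 + D), which is algebraically the same as
    50*log10(100 / sqrt(1 + D)). Dividing each variance by its own number
    of units reduces to (1/N)*(S_R^2 + S_T^2) when both products have N units.
    """
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    diff_sq = (ref.mean(axis=0) - test.mean(axis=0)) ** 2
    var_term = (ref.var(axis=0, ddof=1) / ref.shape[0]
                + test.var(axis=0, ddof=1) / test.shape[0])
    return float(100.0 - 25.0 * np.log10(1.0 + np.mean(diff_sq + var_term)))

def bootstrap_f2_ci(ref, test, n_boot=5000, seed=12345, method="linear"):
    """Two-sided 90% percentile interval of the expected f2."""
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    rng = np.random.default_rng(seed)      # seed kept for reproducibility
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # resample whole dissolution profiles (rows) with replacement
        r = ref[rng.integers(0, ref.shape[0], ref.shape[0])]
        t = test[rng.integers(0, test.shape[0], test.shape[0])]
        stats[b] = expected_f2(r, t)
    lower, upper = np.quantile(stats, [0.05, 0.95], method=method)
    return float(lower), float(upper)
```

Under this sketch, similarity would be concluded when the returned lower limit is at least 50. The number of bootstrap samples, the seed and the quantile method are exactly the options that the protocol and report are expected to state.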
Dissolution study protocol
The dissolution study protocol should indicate the study objectives, pre-specify the batches to be compared, the dissolution test conditions (apparatus, media composition, and agitation rate), media de-aeration, sample filtration and analytical methodology, sampling approach, sampling times, and include the full description of the methodology employed for dissolution profile comparison (e.g., f2, or the bootstrap 90% CI of the expected f2 calculated with a percentile method, including the software to be used, the number of bootstrap samples, seeds, etc.).
Reporting of study results
When reporting dissolution profile comparisons, the applicant should provide individual results of the percentage dissolved at the different sampling times pre-defined in the protocol as well as mean percentage dissolved with its variability (CV(%)) in order to allow the replication of the calculations.
In addition, the applicant should discuss the basis for the similarity conclusion:
- dissolution is > 85% in 15 min for oral products with systemic action,
- f2 similarity factor calculation ≥ 50, or
- 90% confidence interval of f2,EXP based on bootstrapping. Similarity in dissolution profiles will be concluded when the lower limit of the 90% confidence interval for the f2,EXP is ≥ 50.
The results should be reported rounded to the nearest integer, without decimal places.
Software packages
Specific software packages for the calculation of the 90% confidence interval of f2,EXP or in-house platforms could be used. In any case, the software should be adequately documented. In the case of specific software packages, the selected options (e.g., whole vectors, one profile, number of bootstrap samples, seed number (if available)) should be described. In the case of in-house platforms, the code of the platform should be provided, and it should be demonstrated that the employed software is able to calculate the 90% confidence interval of f2,EXP correctly. See [3] for examples of datasets and results.
References
[1] R.J. Hyndman, Y. Fan. Sample quantiles in statistical packages. The American Statistician. 50 (1996) 361–365. https://doi.org/10.1080/00031305.1996.10473566
[2] M.-C. Ma, B.B.C. Wang, J.-P. Liu, Y. Tsong. Assessment of similarity between dissolution profiles. J Biopharm Stat. 10 (2000) 229–249. https://doi.org/10.1081/BIP-100101024
[3] L. Noce, L. Gwaza, V. Mangas-Sanjuan, A. Garcia-Arieta. Comparison of free software platforms for the calculation of the 90% confidence interval of f2 similarity factor by bootstrap analysis. Eur J Pharm Sci. 146 (2020) 105259. https://doi.org/10.1016/j.ejps.2020.105259