Abstract

The approval and granting of marketing authorization for a putative biosimilar is based on strong comparability studies with its biological reference product. This is due to the complexity of the structure and nature of the manufacturing process of biological drugs. Hence, a rigorous analytical workflow for chemical characterization and clinical trials to evaluate the efficacy and safety is required to demonstrate their high similarities to the reference drug. This work is focused on the comparison of the originator of filgrastim with three of its biosimilars by evaluating their structural similarity and biological activity. Qualiquantitative analyses were performed by MALDI-TOF/TOF-MS and RP-HPLC-UV. An innovative functional assay using zebrafish as the animal model was developed to evaluate the biological activities of the drugs. The different analyses performed in this study highlighted the structural similarity of biosimilar drugs and their originator. This result was further confirmed by a similar in vivo biological activity.

1. Introduction

Biosimilar drugs have been on the market for some decades now. However, the use of biosimilars in Italy and the entire European Union remains controversial. To be authorized for the market, a biosimilar must guarantee quality, safety, and efficacy similar to those of its reference drug [1].

It is known that biosimilars have structures that are closely homologous to the original drug. However, the complexity of the protein structure and the manufacturing process selected (i.e., host cell, production system, and purification) could lead to minor differences in the physicochemical properties of the biological drugs. These differences should not affect their biological activity, efficacy, and safety. Therefore, comparability studies, based on a thorough physicochemical and functional characterization, are mandatory to demonstrate the similarity between biosimilars and their originator [1].

To address this, a robust analytical workflow should be used to determine the primary, secondary, and tertiary structures of the protein and to identify posttranslational modifications (e.g., glycosylation pattern). It is also necessary to assess possible product-related impurities and to evaluate the stability of the biodrug using its degradation profile. Lastly, functional assays should be performed to test biological activity as well as clinical efficacy and safety [2, 3].

This information is mandatory for the development of biosimilars and the granting of their marketing authorization [4, 5].

Biosimilars available on the market can be categorized based on chemical and therapeutical criteria. The most common representative drugs belong to the growth factor family, vaccines, and monoclonal antibodies. For this study, the focus is on a group of the hematopoietic growth factor called the granulocyte colony stimulating factor (G-CSF).

G-CSF is a member of the family of regulatory proteins known as cytokines. It stimulates differentiation, survival, and migration of granulocytes and activates neutrophils in vitro and in vivo [6, 7]. Native G-CSF is a 22,000 Da glycoprotein composed of a single polypeptide chain of 207 amino acids, with glycosylation at Thr-166 (uniprot_P09919) [8]. The native G-CSF is encoded by a gene on chromosome 17, from which two isoforms are derived by alternative splicing. Both isoforms contain a signal peptide of 29 amino acids (1–29) followed by a peptide chain of 178 amino acid residues for isoform A and 175 amino acid residues for isoform B. The difference is accounted for by three additional residues (Val-Ser-Gln) inserted after Leu65 in isoform A. Isoform B, which is more acidic, elicits a higher biological activity and greater stability than the longer isoform A. Because of these properties, its amino acid sequence was chosen as the template for the commercially produced G-CSF filgrastim [9].

Filgrastim is produced by recombinant DNA technology using bacteria, particularly Escherichia coli, as host cells. It is primarily used to reduce the effect and duration of neutropenia due to different etiologies [10]. This protein consists of 175 amino acids with a molecular weight of approximately 18,800 Da [11]. The sequence of filgrastim is identical to isoform B of G-CSF isolated from human cells, except for two modifications: (1) the presence of a methionine in the N-terminal position of the recombinant protein instead of an alanine (r-met Hu G-CSF) and (2) the absence of glycosylation [9, 12], which is due to the lack of posttranslational modification machinery in the E. coli expression system. Despite these differences, filgrastim preserves the biological activity of isoform B of human G-CSF [13].

The molecule contains a free cysteine (Cys) at position 18 and two intramolecular disulfide bonds Cys37-Cys43 and Cys65-Cys75 that are essential to preserve a properly folded tertiary structure of the rhG-CSF molecule [14, 15].

Filgrastim was first sold in the market in 1991. Since the expiration of its patent, more than fifteen biosimilars have become commercially available worldwide. Although several comparability studies on their structure and function have already been reported in the literature [16–20], this study is the first one, to the best of our knowledge, to compare all four drugs distributed in Italy from both the structural and the functional point of view. This is important to properly inform end users (i.e., patients and medical staff) about biosimilar drugs as an alternative, rather than preventing them from using them due to lack of trust.

In this paper, we compare the originator of filgrastim Granulokine® (Italian brand name of Neupogen®, GRA) with its biosimilars supplied on the Italian market, namely, Nivestim™ (NIV), Tevagrastim® (TEV), and Zarzio® (ZAR). We particularly focused on the in-depth characterization of the peptide sequences of all drugs to unravel possible posttranslational modifications that could lead to changes in amino acid composition and structure. By using reversed-phase high-performance liquid chromatography (RP-HPLC) and mass spectrometry, we were able to confirm the structural similarity of the four drugs and to identify the presence of one disulfide bridge, which was found to be essential for the biological activity of these proteins.

Moreover, in order to provide information about their functional activity, we performed an innovative functional analysis using zebrafish (Danio rerio) embryos to compare the effect of the drugs on immune system cell activation. Recently, zebrafish has become a very interesting animal model in different fields, including pharmacology and toxicology [21–28]. Aside from small molecule discovery and screening, it is now used for the characterization of complex biological drugs, such as biosimilars [29, 30]. Zebrafish is a less expensive and more manageable organism than conventional animal models, such as the mouse or rat. It allows fast and reproducible tests to be conducted for high-throughput screening [31, 32].

Zebrafish is widely used as an animal model to study vertebrate hematopoiesis in vivo. Genetic and molecular pathways are highly conserved between zebrafish and mammals. The morphology and function of their blood cells are also very similar. Monocytes/macrophages and neutrophil granulocytes, together with erythrocytes, are the first blood cells to enter the bloodstream at around 24 hours postfertilization (hpf) [33, 34]. The white blood cells of the innate immune system are able to respond quickly to proinflammatory stimuli, reaching the site of the stimulus in a few minutes. Because the embryos are transparent and adaptive immunity is absent until the third day of development, microinjection of exogenous substances into the embryos is widely used as a validated protocol for in vivo studies of the innate immune response (neutrophils and macrophages) [30, 35–37].

Filgrastim, GRA, and its biosimilars are analogues of zebrafish G-CSF, which promotes the proliferation and differentiation of granulocytes. G-CSF and its receptor G-CSFR have been identified and well characterized in zebrafish, revealing conservation of gene and protein sequences among different species, including human and mouse. Moreover, it was demonstrated that the functional G-CSF/G-CSFR mechanism of action is highly conserved [38]. Zebrafish embryos may thus represent a suitable in vivo animal model to study the effects of exogenously administered G-CSF.

The results obtained indicate similar functional and structural properties of all tested compounds supporting the assessment of biosimilarity among them.

2. Materials and Methods

2.1. Materials

Salts and solvents with the highest degree of purity available on the market have been used for chemical analyses. Sodium dihydrogen phosphate buffer (NaH2PO4), ammonium bicarbonate (NH4HCO3), acetonitrile LC-MS CHROMASOLV®, trifluoroacetic acid (TFA), N-methyl N-(trimethyl-silyl) trifluoroacetamide (MSTFA), glutamic acid, 1,4-dithiothreitol (DTT), iodoacetamide (IAA), calibration kits for MALDI-TOF-MS analysis (ProteoMass™ Peptide and Protein MALDI-MS calibration kit), sinapinic acid (SA), α-cyano-4-hydroxycinnamic acid (CHCA), Endoproteinase Glu-C from Staphylococcus aureus V8 and chymotrypsin sequencing grade from bovine pancreas from Sigma-Aldrich (Roche_cod.n.11420399001 and 11418467001, respectively), Zip-Tips® SCX, and Zip-Tips® C18 were purchased from Sigma Italia (Milan, Italy).

The wild-type AB zebrafish strain (cat. #1175) was purchased from the EZRC—European Zebrafish Resource Center, Institute of Toxicology and Genetics (Eggenstein-Leopoldshafen, Germany).

2.2. Drugs Tested

All drugs were obtained in their commercially available forms as an injection solution in prefilled syringes (one batch for each compound), as reported in Table 1. Filgrastim reference standard (Y0001173_ European Pharmacopoeia Reference Standard) was purchased from Sigma Italia (Milan, Italy).

2.3. pH Value Determination

The pH of the pharmaceutical preparation GRA and its biosimilars NIV, TEV, and ZAR was measured on three different sampling batches at 25°C by using the pH meter Basic 20 (Crison). The high quality sensor probe 50 28 (Crison), a specific electrode for pH measurement in small sample volumes, possesses the technical specifications required by the European Pharmacopoeia [12]. The pH value is expected in a range between 3.8 and 4.2 [39], and it must not differ by more than 0.05 pH units from the value corresponding to the sample in analysis [12].

2.4. Characterization of Intact Proteins
2.4.1. Qualitative and Quantitative RP-HPLC-UV

Fifteen microliters of each biological drug were analyzed, on three different samplings of each batch, by RP-HPLC-UV using a Dionex™ UltiMate™ 3000 Thermo Fisher Scientific S.p.A (Milan, Italy) equipped with a LPG-3400SD quaternary analytical pump, a WPS-3000SL analytical autosampler, a VWD-3100 UV-Vis detector, a TCC-3000SD thermostatted column compartment, and an AFC-3000 automatic fraction collector. Chromatographic separation was performed using a Thermo Scientific Hypersil GOLD C8 column (100 mm × 3 mm ID, particle size 3 μm). The specific liquid chromatographic (LC) parameters were as follows: mobile phase (A) water : acetonitrile : TFA (90.00 : 10.00 : 0.10 v/v/v) and (B) acetonitrile : water : TFA (70.00 : 30.00 : 0.10 v/v/v); the mobile phase flow rate was 0.5 mL/min; gradient program: 25% B for 1 min, from 25% to 95% B in 25 min, 95% B for 1 min, then from 95% to 25% B in 0.5 min, and reequilibration at 25% B for 5 min. All analyses were performed at 30°C. The detection wavelength (λ) was set at 215 nm.
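As an illustration only (it is not part of the instrument method), the gradient program described above can be written as a small timetable. The sketch below assumes linear interpolation within each segment and ignores system dwell volume; the names SEGMENTS and percent_b are hypothetical.

```python
# Gradient segments as (duration_min, start_%B, end_%B), transcribed from the text.
SEGMENTS = [
    (1.0, 25.0, 25.0),   # initial hold at 25% B
    (25.0, 25.0, 95.0),  # linear ramp from 25% to 95% B
    (1.0, 95.0, 95.0),   # hold at 95% B
    (0.5, 95.0, 25.0),   # return to 25% B
    (5.0, 25.0, 25.0),   # reequilibration at 25% B
]


def percent_b(t_min):
    """Return the programmed %B at time t_min (minutes from injection)."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    start = 0.0
    for duration, b_start, b_end in SEGMENTS:
        end = start + duration
        if t_min <= end:
            fraction = (t_min - start) / duration
            return b_start + fraction * (b_end - b_start)
        start = end
    raise ValueError("time is beyond the end of the gradient program")


# Total programmed run time in minutes (sum of all segment durations).
total_run_time = sum(duration for duration, _, _ in SEGMENTS)
```

Under these assumptions, the programmed run lasts 32.5 min in total, and the ramp passes through 60% B at 13.5 min.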

LC method optimization was carried out using the filgrastim reference standard at concentration of 340.00 μg/ml (data not shown).

For the quantitation of filgrastim, a calibration curve was done using 5 different dilutions of the filgrastim reference standard (340.00 μg/ml, 170.00 μg/ml, 85.00 μg/ml, 42.50 μg/ml, and 21.25 μg/ml). Fifteen microliters of each standard dilution was analyzed by RP-HPLC-UV as reported above.

The calibration curve was plotted as peak area versus filgrastim reference standard concentration. GRA, NIV, TEV, and ZAR concentrations were calculated by determining their peak areas and interpolating them on the calibration curve (Electronic Supplementary Material Figure S1).
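The quantitation step can be sketched as an ordinary least-squares fit of peak area against standard concentration, followed by back-calculation of the sample concentration. This is a minimal illustration, not the software used in the study: the five concentrations are those given above, whereas the peak areas are synthetic placeholders (perfectly linear by construction) and the function names are hypothetical.

```python
# Standard concentrations (µg/ml) taken from the text.
STANDARDS_UG_PER_ML = [340.00, 170.00, 85.00, 42.50, 21.25]

# Placeholder peak areas, NOT measured data: generated from an arbitrary
# straight line so that the example is self-contained.
PLACEHOLDER_AREAS = [0.5 * c + 1.0 for c in STANDARDS_UG_PER_ML]


def fit_line(xs, ys):
    """Least-squares slope, intercept, and R^2 for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def concentration_from_area(area, slope, intercept):
    """Back-calculate a concentration (µg/ml) from a measured peak area."""
    return (area - intercept) / slope


slope, intercept, r_squared = fit_line(STANDARDS_UG_PER_ML, PLACEHOLDER_AREAS)
```

With real data, the measured areas of the standards would replace PLACEHOLDER_AREAS, and each sample area would be passed to concentration_from_area.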

2.4.2. MALDI-TOF/MS

Before mass spectrometry analyses, proteins were desalted and concentrated by using ZipTip®SCX, according to the manufacturer’s protocol.

An equal volume of purified proteins was mixed with an equal volume of the appropriate matrix, SA (10 mg/ml, in acetonitrile : water : TFA 70.00 : 30.00 : 0.10, v/v/v). One microliter of the mixture was loaded on the MALDI plate and allowed to dry at room temperature (rt).

Experiments were carried out on three different samplings for each batch on an AB Sciex 5800 MALDI TOF/TOF-MS, equipped with a nitrogen laser (λ = 337 nm). Protein samples were analyzed using the midlinear mode, setting the laser intensity at 5,000 μJ with a pulse laser of 400 Hz and the detector voltage multiplier at 0.68, and recording the spectra in a mass range from 4,000 to 45,000 Da with a focus mass of 18,800 Da.

2.4.3. GC-MS Analysis

An aliquot (2 µl) of the peak eluted at 1.1 min in ZAR samples was collected during three independent RP-HPLC qualitative analyses and was analyzed by GC-MS using the method previously reported by Gianoncelli et al. [29] with the capillary column HP-5MS (30 m, 0.25 mm ID, 0.25 μm film thickness; J&W Scientific, Folsom, CA, USA).

2.5. Enzymatic Digestion with Endoproteinase Glu-C

The whole procedure was carried out in a sterilized laminar flow hood, using powder-free gloves, in order to reduce keratin contamination.

The enzymatic digestion was conducted in both reducing and nonreducing conditions, on three different samplings for each batch.

2.5.1. Reducing Condition

Five micrograms of the filgrastim standard (corresponding to 8.30 μl, 0.27 nmol) was diluted with 16.70 μl of 0.02 M NaH2PO4 with pH 7.80. This is necessary to maintain the pH value between 7.50 and 8.00, which is the optimum value for the enzyme activity. The solution containing the protein was reduced with 2 μl of 0.1 M DTT solution and was kept at 56°C for 1 h. In order to alkylate the thiol residues, 2 μl of 0.027 M IAA solution was added, and the mix was kept at rt for 20 minutes. The solution was digested with 1 μl of an aqueous solution of Glu-C (0.50 μg/μl, enzyme : substrate ratio of 1 : 10 (w/w)), and it was kept overnight (o/n) at 37°C. Subsequently, GRA, NIV, TEV, and ZAR were digested using the same protocol.

2.5.2. Nonreducing Condition

Ten micrograms of filgrastim standard (corresponding to 16.70 µl, 0.53 nmol) was digested with 2 μl of aqueous solution of Glu-C (0.50 μg/μl, enzyme : substrate ratio of 1 : 10 (w/w)) and diluted with 50 μl of NaH2PO4 (0.02 M pH 7.80), which is necessary to maintain the pH value of 7.50, the optimum value for the enzyme activity. The solution was kept o/n at 37°C. Subsequently, GRA, NIV, TEV, and ZAR were digested using the same protocol.
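The amounts stated for the two Glu-C digestions can be cross-checked with a short calculation. The sketch below assumes the approximate molecular weight of 18,800 Da given in the Introduction; the function names are hypothetical.

```python
# Approximate molecular weight of filgrastim (Da), as stated in the Introduction.
FILGRASTIM_MW_DA = 18_800.0


def nmol_from_ug(mass_ug, mw_da=FILGRASTIM_MW_DA):
    """Convert a protein mass in µg to nmol (µg / (g/mol) = µmol; x1000 = nmol)."""
    return mass_ug / mw_da * 1000.0


def substrate_per_enzyme(substrate_ug, enzyme_volume_ul, enzyme_conc_ug_per_ul):
    """Return x in an enzyme : substrate ratio of 1 : x (w/w)."""
    enzyme_ug = enzyme_volume_ul * enzyme_conc_ug_per_ul
    return substrate_ug / enzyme_ug


# Reducing condition: 5 µg protein, 1 µl of Glu-C at 0.50 µg/µl.
reducing_nmol = nmol_from_ug(5.0)
reducing_ratio = substrate_per_enzyme(5.0, 1.0, 0.50)

# Nonreducing condition: 10 µg protein, 2 µl of Glu-C at 0.50 µg/µl.
nonreducing_nmol = nmol_from_ug(10.0)
nonreducing_ratio = substrate_per_enzyme(10.0, 2.0, 0.50)
```

Both conditions give approximately 0.27 and 0.53 nmol of protein, respectively, at the same 1 : 10 (w/w) enzyme : substrate ratio.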

2.6. Enzymatic Digestion with Chymotrypsin

Five micrograms of each previously digested drug (corresponding to 13.50 µl, 0.27 nmol) were collected and digested with 2.50 μl of chymotrypsin solution (in 1 mM HCl) (0.1 μg/μl), at an enzyme : substrate ratio of 1 : 200 (w/w) diluted in 1 : 10 NH4HCO3 (0.05 M, pH 8). The solution was kept o/n at 37°C. Each experiment was conducted on three different samplings for each batch.

2.7. Peptide Mapping
2.7.1. RP-HPLC-UV

Twenty microliters of each drug digested with Glu-C were separated on a Waters XSELECT CSH C18 column (150 mm × 2.1 mm ID, particle size 3.5 μm); the mobile phase composition was (A) water : acetonitrile : TFA (95.00 : 5.00 : 0.05 v/v/v) and (B) acetonitrile : water : TFA (95.00 : 5.00 : 0.05 v/v/v); the mobile phase flow rate was 0.2 ml/min. The gradient program was as follows: from 10 to 50% B in 40 min, then to 90% B in 5 min and held for 2 min, then to 10% B in 3 min, and reequilibration at 10% B for 15 min. All analyses were performed on three different samplings for each batch at 30°C; the detection wavelength was set at 215 nm.

LC method optimization was carried out using filgrastim reference standard digested with Glu-C (data not shown).

2.7.2. MALDI-TOF/TOF-MS

Before mass spectrometry analyses, peptides were desalted and concentrated by using ZipTip®C18, according to the manufacturer’s protocol.

An equal volume of purified peptides was mixed with an equal volume of the appropriate matrix, CHCA (10.00 mg/ml, in acetonitrile : water : TFA 70.00 : 30.00 : 0.10, v/v/v). One microliter of the mixture was loaded on the MALDI plate and allowed to dry at rt.

Experiments were carried out on three different samplings for each batch on an AB Sciex 5800 MALDI TOF/TOF-MS, equipped with a nitrogen laser (λ = 337 nm). Peptide samples were analyzed using the reflector mode, setting the laser intensity at 3,500 μJ with a pulse laser of 400 Hz and the detector voltage multiplier at 0.52, and recording the spectra in a mass range from 700 to 2,600 Da with a focus mass of 1,600 Da. All mass spectra resulted from the accumulation of at least 1,500 laser shots using a random search pattern.

The most intense peptides were subjected to tandem mass spectrometry, using the MS/MS mode with the laser intensity set at 5,000 μJ, a pulse laser of 1,000 Hz, and the detector voltage multiplier at 0.60. All mass spectra resulted from the accumulation of at least 17,500 laser shots using a random search pattern.

MALDI-TOF/TOF mass spectrometer was calibrated using ProteoMass™ Peptide and Protein MALDI-MS Calibration Kit.

2.8. Fish Maintenance and Egg Collection

All zebrafish embryos were handled according to national and international guidelines, following protocols approved by the local Committee (OPBA protocol nr 211B5.24) and authorized by the Ministry of Health (authorization nr 393/2017-PR). Healthy adult wild-type zebrafish (AB strain) were used for egg production. Fish were maintained under standard laboratory conditions as described [40], at 28°C on a constant 14 h light/10 h dark cycle. Immediately after spawning, fertilized eggs were harvested, washed, and placed in 10 cm Ø Petri dishes in fish water. The developing embryos were incubated at 28°C and maintained in 0.003% (w/v) 1-phenyl-2-thiourea to prevent pigmentation.

2.9. In Silico Analysis

The human G-CSF protein information collected in the UniProt database (uniprot_P09919) was used to obtain the human G-CSF receptor (CSF3R) entry, in the “Protein Interaction” section [8]. Ensembl full-length sequences of the CSF3R transcript and protein were retrieved and used to search the zebrafish assembly with BLAST. The Ensembl genome sequence database at http://www.ensembl.org supplies the most up-to-date genome assemblies, which are GRCh38.p12 (Db version 92.38) for human and GRCz11 (Db version 92.11) for zebrafish [41]. One full-length zebrafish csf3r transcript, encoding one Csf3r protein, was identified. A comparative analysis of gene organization between human and zebrafish G-CSF receptor genes was carried out, employing information supplied by the Ensembl database [41]. Synteny analysis was performed using the Synteny database and the Genomicus genome browser [42, 43]. The protein sequences of human and zebrafish G-CSF receptor were employed to perform multiple sequence alignment with the Clustal Omega program [44].

2.10. Leukocytes Quantification

The reference standard of filgrastim, as well as the originator GRA and its biosimilars TEV, NIV, and ZAR, was diluted in 0.05% (w/v) phenol red solution (Sigma-Aldrich) to the final concentration of 250 ng/μl. At 48 hours postfertilization (hpf), 1 nl of each dilution was injected into the otic cavity of dechorionated zebrafish embryos. As a negative control, 0.05% (w/v) phenol red solution without the pharmaceutical compounds was used. Escherichia coli JM109 bacteria in 0.05% (w/v) phenol red solution were used as the positive control [37]. Embryos were incubated at 28°C for 2 h after injection and then fixed in 4% (w/v) paraformaldehyde in PBS overnight at 4°C. Fifty embryos for each injected compound were used to perform whole-mount in situ hybridization (WISH), according to the Thisse protocol [45]. The probes pu1, lplastin, and mpx were selected to identify early zebrafish myeloid progenitors, monocytes/macrophages, and neutrophil granulocytes, respectively [46]. Embryos were mounted in agarose-coated dishes, and images were taken with a Leica MZ16 F stereomicroscope equipped with a DFC 480 digital camera and LAS Leica Imaging software (Leica, Wetzlar, Germany). Leukocyte quantification was performed using ImageJ 1.45s image analysis software. Quantifications are expressed as the mean ± standard deviation of three independent experiments.

2.11. Statistical Analysis

Statistical analyses were done using GraphPad Prism software, version 6.01 (La Jolla, CA, USA). One-way ANOVA followed by Dunnett’s test was performed to identify statistically significant differences among the different groups of data, considering a p value < 0.05 as the threshold for a significant difference.

3. Results

3.1. pH Values

The measured pH values are the following: GRA 3.95 ± 0.03, NIV 3.90 ± 0.02, TEV 4.12 ± 0.02, and ZAR 4.28 ± 0.02. All values were within the range of variability due to the presence of acetic acid and sodium hydroxide (sodium acetate) as excipients in the pharmaceutical formulation (Table 1). As claimed by the manufacturer, ZAR had a different composition of the excipients; i.e., the sodium acetate was replaced by glutamic acid. However, this difference did not result in significant changes in the pH value.

3.2. Qualitative Analyses of Intact Proteins by RP-HPLC-UV and MALDI-TOF-MS

In order to evaluate the purity of the protein present in the formulations, as well as the possible presence/absence of degradation products or aggregates, we performed RP-HPLC-UV analyses on the pharmaceutical preparations under study.

The results obtained from RP-HPLC were also confirmed by MALDI-TOF-MS, an analytical technique that allows the measurement of intact proteins with molecular weights higher than 100,000 Da.

3.2.1. RP-HPLC-UV Qualitative Analysis

RP-HPLC-UV analyses demonstrated the presence of a single molecular species (presumably filgrastim) in the originator GRA as well as in its biosimilar products NIV, TEV, and ZAR with no sign of product-related impurities.

The overlaid chromatograms of the drugs under study are reported in Figure 1. Only one major peak is present, eluting at 17.500 min, which is the same retention time as that of the filgrastim reference standard (data not shown). A zoom of the chromatograms highlighted a small variation in the retention time of each individual peak, probably due to instrumental error. However, this difference was still within the acceptable range of variability. The filgrastim protein present in the originator GRA and in its biosimilar products NIV, TEV, and ZAR eluted at 17.567, 17.505, 17.572, and 17.466 min, respectively.

Interestingly, only in the chromatogram of ZAR (Figure 1), it was possible to detect a second minor peak, which eluted at 1.100 min, close to the solvent front. This peak was recovered during RP-HPLC-UV analysis by a fraction collector and analyzed by GC-MS (Electronic Supplementary Material Figure S2A, S2B, S3A, and S3B). It was identified as the glutamic acid used as excipient only in ZAR pharmaceutical formulation (Table 1).

3.2.2. MALDI-TOF-MS Qualitative Analysis

MALDI-TOF-MS analysis was performed to determine the molecular weight of the proteins contained in the pharmaceutical formulations. As shown in Figure 2, the mass spectra of the samples were similar and characterized by 3 peaks with the m/z value of about 9,400, 18,800, and 37,700, respectively.

While the main peak with an m/z value of approximately 18,800 represented the single charge of filgrastim, the m/z value of about 9,400 represented the double charge of filgrastim and the m/z value of about 37,700 could represent the single charge of the filgrastim dimer.

The major peak had an average mass value in the range of 18,831–18,842 Da, which was in agreement with the molecular weight of the filgrastim standard reference (18,799 Da). This result suggested that the main component of these preparations is indeed filgrastim. No peaks corresponding to contaminants or impurities were observed in the mass profile.
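The assignment of the three signals can be illustrated with a simple m/z calculation for protonated species. This is a sketch under stated assumptions: the lower bound of the observed mass range (18,831 Da) is treated as an approximate neutral mass, and the helper name mz is hypothetical.

```python
PROTON_DA = 1.00728  # mass of a proton (Da)


def mz(neutral_mass_da, n_molecules=1, charge=1):
    """m/z of a protonated ion made of n_molecules copies carrying `charge` protons."""
    return (n_molecules * neutral_mass_da + charge * PROTON_DA) / charge


# Assumption: lower bound of the observed mass range, used as approximate neutral mass.
OBSERVED_MASS_DA = 18_831.0

singly_charged = mz(OBSERVED_MASS_DA, n_molecules=1, charge=1)  # [M+H]+
doubly_charged = mz(OBSERVED_MASS_DA, n_molecules=1, charge=2)  # [M+2H]2+
dimer = mz(OBSERVED_MASS_DA, n_molecules=2, charge=1)           # [2M+H]+
```

Rounded to the nearest hundred, these values are 18,800, 9,400, and 37,700, in line with the three signals described above.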

3.3. Quantitative Analysis of Intact Proteins by RP-HPLC-UV

In order to quantify the protein amount in the different pharmaceutical preparations, a calibration curve was prepared using the filgrastim reference standard at different concentrations (Electronic Supplementary Material Figure S1). The linearity of the calibration curve was assessed over a range of 21.25–340.00 µg/ml. The method proved to be linear, with R² = 0.9999.

The quantitative analyses of GRA, NIV, TEV, and ZAR showed that their respective average concentrations were 579.9305 ppm (with a recovery of −3.345%), 592.2905 ppm (with a recovery of −1.285%), 596.1610 ppm (with a recovery of −0.640%), and 570.4979 ppm (with a recovery of −4.917%) (Electronic Supplementary Material Figure S4 and Table S1). Results obtained for each pharmaceutical preparation correspond to the claims on the label by the individual manufacturers.
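The percentages reported above are numerically consistent with a percent deviation from a nominal concentration of 600 ppm. That nominal value is an assumption inferred from the reported numbers, not a figure stated in this section; the sketch below simply reproduces the arithmetic.

```python
# Assumption: nominal (label) concentration inferred from the reported percentages.
NOMINAL_PPM = 600.0

# Measured average concentrations (ppm) as reported in the text.
MEASURED_PPM = {
    "GRA": 579.9305,
    "NIV": 592.2905,
    "TEV": 596.1610,
    "ZAR": 570.4979,
}


def percent_deviation(measured_ppm, nominal_ppm=NOMINAL_PPM):
    """Signed deviation of the measured value from the nominal value, in percent."""
    return (measured_ppm - nominal_ppm) / nominal_ppm * 100.0


deviations = {drug: percent_deviation(ppm) for drug, ppm in MEASURED_PPM.items()}
```

Under this assumption, the computed deviations round to −3.345%, −1.285%, −0.640%, and −4.917% for GRA, NIV, TEV, and ZAR, respectively.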

3.4. Structural Analyses of Drugs: Peptide Mapping

To better characterize the chemical structure of filgrastim, the originator and biosimilar drugs were studied using a proteomic approach. Samples were digested by Endoproteinase Glu-C followed by chymotrypsin, in order to obtain peptides of optimum length for MALDI-TOF/TOF-MS analysis. Both digestions were performed in nonreducing conditions to verify the presence of disulfide bridges, which are essential for the biological activity of this protein. In addition, Glu-C digestion was conducted in reducing conditions; comparing the two conditions helped to demonstrate the presence of the disulfide bridge. Peptides resulting from both digestions were purified by Zip-Tip C18 and analyzed by MALDI-TOF/TOF mass spectrometry and RP-HPLC-UV.

3.4.1. RP-HPLC-UV Qualitative Analysis

Figure 3 shows the RP-HPLC-UV chromatograms of the drug samples previously digested in the nonreducing condition. As shown in panel A, the chromatographic profiles of peptides were identical for each drug.

The data were then compared to the peptides obtained from digestion of the filgrastim reference standard (data not shown). Although all digested proteins had a similar profile, the intensity of some individual signals was slightly different; this is better illustrated by the overlaid chromatograms (Figure 3(b)). The apparent discrepancy was probably due to the enzymatic digestion process which, although performed under the same experimental conditions, may proceed with different cleavage efficiencies, thereby generating different concentrations of peptide products. In general, proteases do not always cleave the protein completely, even under identical experimental conditions; therefore, one or more missed cleavages must be taken into account when analyzing the peaks resulting from mass spectrometry and liquid chromatography analyses.

3.4.2. MALDI-TOF/TOF-MS Qualitative Analysis

In Figure 4, we report the comparison of MALDI-TOF/TOF MS analysis of filgrastim after proteolytic digestion in nonreducing (Figure 4(a)) and reducing (Figure 4(b)) conditions. In both cases, the mass range comprised between m/z 1,522 and m/z 1,660 was zoomed to better highlight possible differences in the spectrum. As shown in Figure 4(a′), we were able to identify one peak with the m/z value of 1,532 that corresponded to the peptide “35KLCATYKLCHPEE47,” in which C37 and C43 are linked by an intramolecular disulfide bond that is necessary for the activity of the compound (Figure 5) [47].

To confirm this result, we performed protein digestion under reducing conditions to break the disulfide bond. As expected, the peak with m/z of 1,532 was not detectable in the relative mass spectrum (Figure 4(b′)), while we were able to identify two new peaks (m/z values of 1,534 and 1,648), which are consistent with the theoretical molecular weight of the peptide “35KLCATYKLCHPEE47” with the disulfide bond reduced (+2 Da) and with both cysteines alkylated (+57 Da for each Cys), respectively.
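The three m/z values discussed above can be reproduced from the peptide sequence. The following sketch assumes singly protonated ions and standard monoisotopic residue masses; the variable and function names are hypothetical, and the calculation is an illustration rather than the in silico digestion tool used in the study.

```python
# Standard monoisotopic residue masses (Da) for the amino acids in this peptide.
RESIDUE_MASS = {
    "A": 71.03711, "C": 103.00919, "E": 129.04259, "H": 137.05891,
    "K": 128.09496, "L": 113.08406, "P": 97.05276, "T": 101.04768,
    "Y": 163.06333,
}
WATER = 18.01056            # H2O added for the free N- and C-termini
PROTON = 1.00728            # charge carrier for [M+H]+
HYDROGEN = 1.00783          # one hydrogen atom; two are lost per disulfide bond
CARBAMIDOMETHYL = 57.02146  # mass added to each Cys by iodoacetamide

PEPTIDE = "KLCATYKLCHPEE"   # residues 35-47 of filgrastim, as given in the text


def mh_reduced(sequence):
    """[M+H]+ of the linear peptide with free (reduced) cysteine thiols."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + PROTON


reduced = mh_reduced(PEPTIDE)                                  # free thiols
disulfide = reduced - 2 * HYDROGEN                             # Cys37-Cys43 bridge intact
alkylated = reduced + PEPTIDE.count("C") * CARBAMIDOMETHYL     # both Cys alkylated
```

The calculated values are approximately 1,532.72 (disulfide-bonded), 1,534.74 (reduced), and 1,648.78 (reduced and alkylated), which correspond to the nominal m/z values of 1,532, 1,534, and 1,648 cited above.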

These data indicate that the analytical approach employed in this study can be used to verify the integrity of chemical bonds within the filgrastim molecule.

A similar pattern of results was obtained when the MALDI-TOF/TOF MS analysis was applied to the originator GRA and its biosimilar products NIV, TEV, and ZAR.

The relative data are reported in Electronic Supplementary Material Table S2. The experimentally determined peptide mass values matched the theoretical peptide masses of filgrastim obtained from the in silico digestion [48]. As expected, the small peptides consisting of few amino acids were lost during sample purification or suppressed due to interference by matrix ions in the low m/z range and thus were not detected.

In peptide maps, peak patterns of GRA and its biosimilars (NIV, TEV, and ZAR) were comparable, with no additional or missing peptides detected, indicating the identical primary structure. The tandem mass analysis allowed us to confirm the sequence, amino acid for amino acid, of each peptide corresponding to the peak found during analysis in MALDI-TOF/TOF-MS (data not shown).

Overall, these results showed that the masses of the peptides generated by the Glu-C/chymotrypsin digestion of the originator GRA and its biosimilar products NIV, TEV, and ZAR were identical and superimposable to the theoretical and experimental mass of filgrastim. In addition, disulfide bonds were preserved in all drugs as shown by the data obtained after digestion in nonreducing conditions.

3.5. In Silico and In Vivo Analysis on Zebrafish Embryo

In order to evaluate if zebrafish could be used as an animal model for filgrastim functional analysis, we first assessed the similarity between human and zebrafish G-CSF receptors by in silico analysis. We then investigated the effects of filgrastim on the activation of the innate immune response in vivo.

The information about human G-CSF protein (P09919) supplied by the UniProt database [8] allowed us to identify, in the “Protein Interaction” section, the full length of the human G-CSF receptor (CSF3R) entry: Q99062. This was used to search the Ensembl database for CSF3R 836 amino acids full length protein sequence, with accession number of ENSP00000362198 [41]. The human CSF3R protein is encoded by the 3,373 bp transcript ENST00000373106.5, which is the product of the CSF3R gene ENSG00000119535 located on human chromosome 1. The Ensembl human G-CSF receptor protein and transcript sequences correspond to the NCBI database RefSeq: NP_000751 and RefSeq: NM_000760, respectively [49]. The human CSF3R sequence was used to BLAST search the zebrafish GRCz11 Ensembl genome assembly [41]. We identified one full-length zebrafish Csf3r transcript ENSDART00000063986.6 (RefSeq: NM_001113377), which is the product of csf3r gene ENSDARG00000045959, located on zebrafish chromosome 16. The 2,687 bp zebrafish transcript encoded for the 810 amino acids Csf3r protein ENSDARP00000063985 (RefSeq: NP_001106848.1).

The gene organization of the human and zebrafish G-CSF receptor genes appeared very similar, each consisting of 16 exons, 15 of which are coding, and 15 introns. It is well established that a conserved colocalization of gene clusters among different species often corresponds to a conserved protein function [50]. A synteny analysis was performed between human chromosome 1 and zebrafish chromosome 16, in the genomic region that contains the G-CSFR gene. By using the Genomicus genome browser [43], we found two paralogs and five ortholog genes in the G-CSFR syntenic region, two of which maintained the same orientation and three of which were oriented in the opposite direction. The analysis was repeated by using the Synteny database [42], which allowed the evaluation of a more extended genomic region. By analyzing a 100-gene window, one more ortholog gene associated with the G-CSFR gene was highlighted, while by enlarging the window size to 200 genes, 74 gene pairs were found.

The amino acid sequences of the human and zebrafish G-CSF receptors were aligned with Clustal Omega [44]. The human CSF3R amino acid sequence covered 93% of the zebrafish Csf3r sequence, with 29% identity and an overall similarity of 43%. Using the information collected in the UniProt database [8], we identified in the human G-CSF receptor sequence (UniProt: Q99062) the most important domains and the amino acids subject to posttranslational modifications. The eight cysteines located at positions 26, 46, 52, 101, 131, 142, 248, and 295 of the human CSF3R protein were conserved in the zebrafish Csf3r sequence at positions 26, 46, 52, 100, 129, 141, 234, and 284, respectively. These amino acids are essential for forming the four disulfide bonds necessary for correct protein folding and function. Six of the eight N-glycosylation sites in the human G-CSF receptor sequence, at positions 51, 93, 128, 134, 579, and 610, were conserved in the zebrafish Csf3r sequence at positions 51, 92, 126, 132, 552, and 589, respectively. The WSXWS motif at positions 318–322, necessary for proper protein folding and receptor binding, was also present in the zebrafish Csf3r sequence at positions 306–310. Moreover, the box 1 motif at positions 658–666, required for JAK interaction and activation, was well conserved at positions 638–646 of the zebrafish Csf3r sequence. The data obtained by computational analysis strongly suggested that the zebrafish G-CSF receptor could recognize and bind human recombinant G-CSF protein.
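As an illustration of the two kinds of check described above (percent identity over an alignment and the search for a WSXWS motif), a minimal Python sketch follows. It is not the Clustal Omega procedure used in the study. The function names and the short toy sequences are hypothetical, and identity is computed here over gap-free aligned columns only, which is an assumption, since tools differ in the denominator they use.

```python
import re
from typing import List, Tuple


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def find_wsxws(sequence: str) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) positions of each W-S-x-W-S match."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"WS.WS", sequence)]


# Hypothetical toy sequences, NOT the real CSF3R / Csf3r sequences.
toy_a = "MKC-WSEWSLT"
toy_b = "MRCAWSDWS-T"

identity = percent_identity(toy_a, toy_b)
motif_hits = find_wsxws(toy_a.replace("-", ""))
print(round(identity, 1), motif_hits)
```

Applied to the real sequences, `find_wsxws` would be expected to report the motif positions quoted in the text, but that has not been verified here.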

To perform an in vivo functional analysis of the filgrastim reference standard, the originator GRA, and its biosimilars TEV, NIV, and ZAR, we first conducted preliminary experiments to establish the optimal drug concentration for use in zebrafish embryos. Based on the data available in the literature, several increasing doses of the filgrastim reference standard were selected for administration to the embryos. In one study on the effect of gamma radiation, the dose of filgrastim (Neupogen®) injected subcutaneously in mice was 300 µg/kg [51]. In another preclinical study, filgrastim (Neupogen®) was administered subcutaneously to neutropenic and nonneutropenic rats at doses of 10, 20, 30, 100, and 500 µg/kg [52]. In human patients, filgrastim is used in clinical practice at 1–10 µg/kg/day, depending on the pathology to be treated and its severity, as reported in the manufacturers' leaflets. Based on the collected data, the doses selected for administration to the zebrafish embryos were 1, 5, 10, 100, and 500 µg/kg, a unit that is equivalent to pg/mg. Since the average weight of a 48 hpf embryo is 0.5 mg, the final doses of filgrastim reference standard administered were 0.5, 2.5, 5, 50, and 250 pg/embryo.
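The unit conversion behind these per-embryo doses can be sketched in a few lines of Python. This is only a worked check of the arithmetic stated above; the function and constant names are illustrative.

```python
# Convert a body-weight-normalized dose (µg/kg) into an absolute dose per embryo (pg).
# 1 µg/kg is the same as 1 pg/mg, so the per-embryo dose is the dose in pg/mg
# multiplied by the embryo mass in mg.

EMBRYO_MASS_MG = 0.5  # average weight of a 48 hpf embryo, as stated in the text


def dose_per_embryo_pg(dose_ug_per_kg: float,
                       embryo_mass_mg: float = EMBRYO_MASS_MG) -> float:
    # 1e6 pg per µg divided by 1e6 mg per kg gives a conversion factor of 1
    dose_pg_per_mg = dose_ug_per_kg * 1e6 / 1e6
    return dose_pg_per_mg * embryo_mass_mg


doses_ug_per_kg = [1, 5, 10, 100, 500]
per_embryo_pg = [dose_per_embryo_pg(d) for d in doses_ug_per_kg]
print(per_embryo_pg)  # [0.5, 2.5, 5.0, 50.0, 250.0]
```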

Following a well-established protocol [36, 37], the filgrastim reference standard was diluted in 0.05% phenol red solution to the selected concentrations and injected into the otic capsule of healthy zebrafish embryos at 48 hpf. Negative control embryos were injected with the 0.05% phenol red solution without the drug, while Escherichia coli JM109 in 0.05% (w/v) phenol red solution was used as the positive control. Twenty-five embryos were injected for each experimental point. Treated embryos were incubated at 28°C for 2 h after injection to let the drugs act, and WISH was then performed. The mpx probe was selected to identify neutrophil granulocytes in the preliminary experiments [46]. Concentration curves demonstrated a concentration-dependent increase in the number of neutrophils at the site of injection. Moreover, none of the tested doses proved lethal or harmful to the embryos (data not shown). Based on the results of the preliminary experiments, we chose the highest dose of 250 ng/μl for use in subsequent experiments.

We then tested whether the originator GRA and its biosimilars TEV, NIV, and ZAR could stimulate granulocytes in zebrafish embryos with a potency similar to that of the filgrastim reference standard and to one another. Following the same protocol used in the preliminary experiments, all the samples were injected into the otic capsule of healthy zebrafish embryos at 48 hpf. The probes pu1, lplastin, and mpx were selected to identify early zebrafish myeloid progenitors, monocytes/macrophages, and neutrophil granulocytes, respectively [46]. Results are reported in Figure 6, in which the amount of each of the three leukocyte populations is shown as the percentage of the embryo area marked by the corresponding probe. Myeloid progenitors, monocytes/macrophages, and neutrophil granulocytes were all significantly attracted to the site where E. coli (positive control, black column) was injected, compared with embryos injected with the negative control (white column). The percentage of leukocytes in the positive controls was 2.95-, 1.71-, and 2.19-fold higher than in the negative controls for the pu1, lplastin, and mpx probes, respectively. A statistically significant increase in the number of myeloid progenitors, macrophages, and neutrophils relative to the respective negative controls was likewise observed in embryos treated with each of the tested drugs (gray columns). The amount of the three leukocyte populations in treated embryos was comparable to that observed in the respective positive control embryos. In embryos treated with the filgrastim reference standard, GRA, TEV, NIV, or ZAR, pu1-positive cells were about 2.5-fold higher than in negative controls. The percentage of the lplastin-positive area in treated embryos was about 1.5-fold higher than in negative controls, while mpx-positive neutrophils were about 2-fold higher than in negative controls in embryos treated with all the tested drugs.
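The fold changes quoted above are ratios of the mean probe-positive area in a treated (or positive control) group to that in the negative control group. A minimal sketch of that calculation follows; it assumes the ratio is taken between group means, which the text does not state explicitly, and the numeric values are placeholders invented for illustration, not measurements from this study.

```python
from statistics import mean
from typing import Sequence


def fold_change(treated_area_pct: Sequence[float],
                control_area_pct: Sequence[float]) -> float:
    """Ratio of the mean probe-positive area (%) in treated vs. control embryos."""
    control_mean = mean(control_area_pct)
    if control_mean == 0:
        raise ValueError("control mean is zero; fold change is undefined")
    return mean(treated_area_pct) / control_mean


# Placeholder values for illustration only (NOT data from the study).
negative_control = [1.0, 3.0]
treated = [3.0, 5.0]

example_fold = fold_change(treated, negative_control)
print(example_fold)  # 2.0
```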

These data showed that the analyzed pharmaceutical compounds can efficiently activate the innate immune response in vivo, with a potency of positive immunomodulatory action similar to that of the filgrastim reference standard and to one another.

4. Discussion and Conclusion

Here we show the results of a comparative study of the biotechnological drug GRA and three of its biosimilars, NIV, TEV, and ZAR. To the best of our knowledge, this is the first time that all four drugs distributed in Italy have been compared at the same time and in the same way from both a structural and a functional point of view. Quantitative and qualitative chemical evaluation was carried out by recognized techniques such as RP-HPLC-UV and MALDI-TOF-MS, while biological activity was studied in vivo using zebrafish embryos as an innovative experimental animal model.

Quantitative analyses showed that the concentration of filgrastim in all analyzed drug preparations matched the label claims made by the corresponding manufacturers. The qualitative analysis performed by RP-HPLC-UV demonstrated that all four drugs were characterized by a single main peak. The only difference was found in ZAR, which showed an additional peak that was identified by GC-MS as glutamic acid.

However, it should be emphasized that molecular weight determination, even when associated with a known DNA sequence inserted in a known cell system, is not sufficient to identify a molecular species. Coding errors as well as posttranscriptional modifications may indeed alter the primary, secondary, or tertiary protein structure. We therefore used MALDI-TOF/TOF-MS to demonstrate that all four drugs had the same amino acid sequence and that, in nonreducing conditions, both disulfide bridges were conserved.

Finally, by functional in vivo analysis using the zebrafish embryo, we showed that there is no difference in their biological activities. We have confirmed in vivo that TEV, NIV, and ZAR are similar to their originator GRA in terms of efficiency in activating the innate immune response, with similar positive immunomodulatory action.

Today’s complex biologic formats and the ever-increasing regulatory demands necessitate accurate and robust analytical methods to characterize a molecule. Thus, the use of the most effective tools to assess a molecule’s physicochemical properties, especially in the case of biosimilars and innovators, will ensure that the end product is on a par with the original, safe, and efficacious.

The robust comparative analytical and functional data reported in this study support the biosimilarity of the biosimilar drugs Nivestim™, Tevagrastim®, and Zarzio® to the reference product Granulokine®. Such data are important for increasing the knowledge of end users (i.e., patients and medical staff), so that the use of biosimilars is not restricted by a lack of trust.

Abbreviations

CHCA: α-Cyano-4-hydroxycinnamic acid
Cys: Cysteine
DTT: 1,4-Dithiothreitol
G-CSF: Granulocyte colony-stimulating factor
GRA: Granulokine®
hpf: Hours postfertilization
IAA: Iodoacetamide
MALDI TOF-MS: Matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry
NIV: Nivestim™
RP-HPLC: Reverse-phase high-performance liquid chromatography
SA: Sinapinic acid
TEV: Tevagrastim®
ZAR: Zarzio®

Data Availability

All data generated or analyzed during this study are included in this published article and its additional files.

Ethical Approval

Zebrafish were maintained and used in accordance with the Italian and European rules on animal use following protocols approved by the local Committee (OPBA protocol nr 211B5.24) and authorized by the Ministry of Health (authorization number 393/2017-PR).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Authors’ Contributions

AG, MB, SV, and MG conducted the experiments and analyzed the data. SB and AM helped with the experiments and were major contributors in writing the manuscript. SS and MM conceived the design and supervised the entire research process. All authors read and approved the final manuscript. Alessandra Gianoncelli, Michela Bertuzzi, and Michela Guarienti contributed equally to this work.

Acknowledgments

Special thanks are due to Prof. Piefranco Spano (1940–2018), who always enthusiastically supported research in this area, and to Dr. Malem Flores, who proofread this manuscript. Special thanks also go to the “XXII. International Mass Spectrometry Conference.” This work was supported by a Research Institutional Grant from the University of Brescia and by Agenzia Italiana del Farmaco AIFA (Project PHARM-Q).

Supplementary Materials

Figure S1: calibration curve and correlation coefficient obtained from analysis of the filgrastim reference standard. Five different dilutions of the filgrastim reference standard (340.00 μg/ml, 170.00 μg/ml, 85.00 μg/ml, 42.50 μg/ml, and 21.25 μg/ml) were analyzed by RP-HPLC-UV. The calibration curve was plotted as the area under the peak versus the concentration of the filgrastim reference standard.

Figure S2: GC-MS spectrum of the second, minor peak found only in the ZAR formulation and recovered during RP-HPLC analysis. A shows the chromatogram, and B shows the related mass spectrum.

Figure S3: peak identification by comparison of the experimental mass spectrum against the mass spectra in a specific library. A shows the mass spectrum of the unknown peak, and B shows the matching library mass spectrum, which corresponds to glutamic acid.

Figure S4: quantitative RP-HPLC-UV analyses of Granulokine® (GRA), Tevagrastim® (TEV), Nivestim™ (NIV), and Zarzio® (ZAR). Each drug was injected twice; the chromatogram shows the first analysis in blue and the second in black.

Table S1: data from the duplicate quantitative analyses of Granulokine® (GRA), Tevagrastim® (TEV), Nivestim™ (NIV), and Zarzio® (ZAR), with the relative average and recovery calculations. Each drug was injected twice; the values of the first analysis are shown in blue and those of the second in black.

Table S2: matrix-assisted laser desorption/ionization time-of-flight/time-of-flight mass spectrometry (MALDI TOF/TOF-MS) positive ion spectrum of Granulokine® (GRA) and its biosimilars, Tevagrastim® (TEV), Nivestim™ (NIV), and Zarzio® (ZAR), after endoproteinase Glu-C (in the nonreducing condition) and chymotrypsin digestion and purification with Zip-Tip C18.