Abstract

Azoospermia is a severe problem that prevents couples from having their own children through natural pregnancy. In nonobstructive azoospermia (NOA), microdissection testicular sperm extraction (micro-TESE) is required to collect sperm and, at 40%–60%, the sperm retrieval success rate is not very high. Previous studies identified no single clinical finding or investigation that could accurately predict the outcome of sperm retrieval. It would be very valuable to have a factor for predicting the possibility of sperm retrieval in patients with NOA before performing micro-TESE. We retrospectively obtained data from the medical records of 430 patients who underwent micro-TESE from 2011 to 2020. Parameters extracted were age, height, body weight, body mass index, luteinizing hormone, follicle-stimulating hormone, PRL, total testosterone, E2, T/E2, sperm retrieval, G-band, AZF, medical history, Rt testis, and Lt testis. Prediction One, which does not require coding, was used to create the AI prediction model for sperm retrieval. Prediction One makes the best prediction model using an artificial neural network with internal cross-validation. Prediction One also evaluates the “importance of variables” using a method based on permutation feature importance. The AUC for the AI model was 0.7246, which is acceptable. In addition, among the variables, T/E2 ratios contributed most to predicting whether sperm retrieval was possible or not. However, the difference in T/E2 between successful and unsuccessful sperm retrieval was not statistically significant. In addition, our analysis of data from 20 patients who underwent micro-TESE in 2021 found that in 85%, the actual result matched the result predicted using our novel AI model. We created an AI model for predicting sperm retrieval in patients with NOA before undergoing micro-TESE. In addition, we found that T/E2 ratios contributed most to predicting the possibility of sperm retrieval in NOA using machine learning.

1. Introduction

Almost all couples who have normal sexual intercourse and do not use contraception achieve pregnancy within 1 year. However, a certain percentage are unable to have a child due to infertility, which affects about one in six couples. Infertility can be a problem for both women and men, affecting both genders equally [1]. Azoospermia is a severe type of male infertility that prevents couples from having their own children through natural pregnancy. Azoospermia is defined as the absence of any sperm under a microscope in centrifuged semen, confirmed on at least two occasions [2]. Azoospermia was reportedly observed in ∼16% of infertile Japanese men [3].

There are two patterns of azoospermia. The first is obstructive azoospermia (OA) and the second is nonobstructive azoospermia (NOA). In OA, spermatogenesis is normal, but sperm fail to be delivered into the ejaculate because of ductal obstruction. NOA is defined as minimal or no production of fully developed spermatozoa in the seminiferous tubules. It has been estimated that NOA accounts for 60% of azoospermia [4]. It is conventionally classified according to where it arises, as pretesticular or testicular. Pretesticular NOA is due to hypothalamic–pituitary disorders in which follicle-stimulating hormone (FSH) and luteinizing hormone (LH) are not produced by the pituitary, whereas testicular NOA is a disorder of spermatogenesis. There are few cases of pretesticular NOA but many of testicular NOA. In this manuscript, NOA means testicular NOA.

Testicular sperm extraction (TESE) is required for obtaining sperm from patients with azoospermia. Conventional TESE (cTESE) is performed to obtain sperm from patients with OA and NOA, while microdissection testicular sperm extraction (micro-TESE) is the gold standard for obtaining sperm from patients with NOA. Micro-TESE involves the use of optical magnification to target specific seminiferous tubules that contain mature sperm [5].

Schlegel reported that the rate of successful sperm retrieval by micro-TESE was 50%–60% in patients with NOA [5], while Amer et al. [6] reported that micro-TESE had a higher sperm retrieval rate than cTESE (47% versus 30%). Yumura et al. [3] noted a sperm retrieval rate of 34.0% when a total of 695 micro-TESEs were performed during 1 year at 24 hospitals in Japan. Thus, reported sperm retrieval rates have not been very high so far.

Abdel Raheem et al. [7] stated that there was no single clinical finding or investigation that can accurately predict the outcome of sperm retrieval. Although clinical (patient age, smoking, testicular volume, and cryptorchidism) and hormone parameters (FSH and Inhibin B) have been previously investigated as potential predictors of sperm retrieval, the evidence for them has been conflicting and a specific biochemical marker predicting the possibility of sperm retrieval has yet to be established [8–19].

It is very valuable to know the possibility of sperm retrieval for patients with NOA before performing micro-TESE. In this regard, Zeadna et al. [20] evaluated the performance of machine-learning models in predicting successful sperm retrieval in patients with NOA, but their study was limited by the size of its population, which included only 119 patients.

Artificial intelligence (AI) is used in many areas of medicine including radiology, ophthalmology, pathology, and oncology, and we previously reported the use of an AI-based algorithm for predicting Johnsen scores to evaluate spermatogenesis in the testis [21].

In this study, we attempted to make a model for predicting the possibility of sperm retrieval in patients with NOA before performing micro-TESE, using machine learning.

2. Materials and Methods

2.1. Study Population

We retrospectively obtained data from the medical records of 430 patients who underwent micro-TESE from 1 January 2011 to 31 December 2020. We also obtained data for 20 patients who underwent micro-TESE from 1 January 2021 to 31 December 2021. Azoospermia was defined as a semen sample evaluated as having no sperm according to the WHO criteria on two different occasions [2]. In addition, a centrifuged semen sample had to have no sperm under ×400 magnification (Olympus, Inc., Tokyo, Japan). We did not use the WHO 2021 criteria because our retrospective data did not go beyond 2020. Therefore, we used the WHO 2010 criteria for this research [2].

The medical history of patients who underwent micro-TESE was recorded, which included left varicocele, cryptorchidism, inguinal hernia, torsion of spermatic cord, orchitis, cancer treatment, spinal injury, spina bifida, and antisperm antibodies.

All patients underwent a medical evaluation of secondary sexual characteristics, clinical testicular volume, and varicoceles, as well as sonography of the testis and varicoceles. Varicoceles were classified according to the Dubin and Amelar varicocele grading system [22].

We measured serum hormonal levels of FSH, LH (luteinizing hormone), PRL (prolactin), total testosterone, and E2 (estradiol). In addition, the T (testosterone)/E2 (estradiol) ratio was calculated using T in ng/dL and E2 in pg/mL. A karyotype analysis was performed in 407 patients and a Y chromosomal microdeletion analysis in 211 patients.
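The ratio calculation described above can be illustrated with a minimal sketch. It assumes that testosterone may be recorded in ng/mL and therefore needs converting to ng/dL (1 ng/mL = 100 ng/dL) before being divided by E2 in pg/mL. The function names and the example values are hypothetical and are not taken from the study data.

```python
def ng_ml_to_ng_dl(value_ng_ml: float) -> float:
    """Convert a concentration from ng/mL to ng/dL (1 dL = 100 mL)."""
    return value_ng_ml * 100.0


def te2_ratio(testosterone_ng_dl: float, estradiol_pg_ml: float) -> float:
    """Return the T/E2 ratio with T in ng/dL and E2 in pg/mL."""
    if estradiol_pg_ml <= 0:
        raise ValueError("E2 must be positive to form a ratio")
    return testosterone_ng_dl / estradiol_pg_ml


# Hypothetical patient: T = 4.0 ng/mL (400 ng/dL), E2 = 25 pg/mL.
ratio = te2_ratio(ng_ml_to_ng_dl(4.0), 25.0)
print(ratio)  # 16.0
```

Keeping the unit conversion in its own function makes the unit assumption explicit, since mixing ng/mL and ng/dL would shift the ratio by a factor of 100.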

The Ethics Committee of Toho University Omori Medical Center has waived informed consent for this study. The study protocol was approved by the Ethics Committee of Toho University Omori Medical Center. All methods were performed in accordance with the relevant guidelines and regulations as well as with the Declaration of Helsinki. The presented study design was accepted by the Ethics Committee on the condition that a document declaring an opt-out policy, by which any potential patients and/or their relatives could refuse inclusion in this study, was uploaded to the website of the Toho University Omori Medical Center.

2.2. Surgical Technique

Micro-TESE was performed under local anesthesia by groups of three surgeons, as described by Schlegel with some modifications [5]. The three surgeons were board certified by the Japan Society for Reproductive Medicine.

A 3.0-cm transverse incision is made on the scrotal skin and, wrapped in the tunica vaginalis, the testis is delivered outside the scrotum. The tunica vaginalis is grasped with curved mosquito forceps and opened widely with curved ophthalmic scissors. At the level of the minor axis of the testis, the tunica albuginea is opened widely with a #11 blade. The bleeding from the tunica albuginea is stopped with curved mosquito forceps. The surgeon’s nondominant hand holds the everted testicle while the dominant hand, with microtweezers, carefully and systematically dissects the seminiferous tubules under high-powered 15–25× optical magnification. An assistant irrigates the field with saline for optimal visualization. Bipolar electrocautery is used for meticulous hemostasis. Using microtweezers, the surgeon looks for prominent seminiferous tubules for collection. Prominent seminiferous tubules are large and tortuous and show white turbidity. After dissection is completed, the tunica albuginea is carefully closed with a running suture and interrupted suture. The tunica vaginalis is closed, the testicle is delivered back into the scrotum, and the wound is closed.

Testis tissue samples were collected in four or five 1.5-mL microtubes with 200 μL HEPES buffer solution (P + HEPES Medium, Nakamedical, Inc., Tokyo, Japan) and were mechanically dispersed using ophthalmic scissors. The testis tissue samples in the 1.5-mL microtubes were placed in a 35-mm Petri dish containing HEPES buffer solution and an embryologist searched for sperm under an inverted microscope at ×400 magnification in the operating room. If no sperm was found in the immediate search in the operating room, total testis tissue was spread in 5–10 Petri dishes containing HEPES buffer solution and was searched for sperm under an inverted microscope at ×400 magnification by three embryologists in the embryo culture room.

2.3. Database

Age, height, body weight, body mass index (BMI), LH, FSH, PRL, total testosterone, E2, T/E2, sperm retrieval, G-band, AZF, medical history, Rt testis, and Lt testis were extracted from patient records, and Excel (Microsoft Corporation, Redmond, Washington, USA) sheets were created from the data. A total of 16 variables were extracted—consisting of numeric variables (age, height, body weight, BMI, LH, FSH, PRL, total testosterone, E2, T/E2, Rt testis, and Lt testis) and sperm retrieval, G-band, AZF, and medical history as binary variables.

2.4. Statistical Analysis

Statistical analysis was performed using IBM SPSS statistics software (Version 27) (IBM, Armonk, New York, USA). Quantitative variables are shown as averages. Quantitative variable data were investigated using the unpaired t-test. The homoscedasticity of data was confirmed by the Levene test. Qualitative variable data were investigated using the χ2-test. was considered statistically significant.

2.5. Creation of Machine Learning Prediction Model Requiring No Coding Using Prediction One

Prediction One (https://predictionone.sony.biz; Sony Network Communications Inc., Tokyo, Japan) was used to make the prediction model for sperm retrieval. Prediction One is software only available in Japanese. It generates feature vectors from datasets using standard preprocessing methods, such as one-hot encoding for categorical variables and normalization for numerical variables. A gradient-boosting tree and a neural network are used as supervised machine learning models, each trained with hyperparameter tuning. An ensemble model of both trained models was constructed. Missing values are automatically handled by common machine learning techniques, such as a gradient-boosting tree. The area under the curve (AUC) was calculated using internal validation to evaluate the accuracy of the AI model. Prediction One makes the best predictive model using an artificial neural network with 5-fold cross-validation. Prediction One also evaluates the “importance of variables” using a method based on permutation feature importance. This method was used to calculate the difference in the model output when a single variable was removed. The value of the difference in the model output indicated how much the model depended on the variables. The value of the difference was computed for each covariate and then averaged over those in the dataset [23].
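The idea behind permutation feature importance can be sketched generically as follows. This is not Prediction One's implementation, whose details are not disclosed; it is a minimal illustration in which one column at a time is shuffled across patients and the resulting drop in accuracy is averaged over repeats. The toy data, the stand-in classifier `toy_model`, and all function names are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when a single column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            # Rebuild every row with only this column replaced.
            permuted = [row[:col] + [v] + row[col + 1:]
                        for row, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data (hypothetical): column 0 determines the label, column 1 is noise.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [1 if row[0] > 0.5 else 0 for row in rows]


def toy_model(row):
    """Stand-in classifier that only looks at column 0."""
    return 1 if row[0] > 0.5 else 0


importances = permutation_importance(toy_model, rows, labels)
print(importances)
```

In this toy setting the shuffled noise column leaves accuracy unchanged, while shuffling the informative column lowers it substantially, which is the behavior that a variable ranking of this kind relies on.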

Prediction One read in the data of the 430 patients who underwent micro-TESE and automatically divided them into internal training and cross-validation datasets, in more or less equal halves. Prediction One automatically adjusted and optimized the variables to make them easy to process statistically and mathematically, and selected an appropriate algorithm with ensemble learning. Missing values were handled automatically, and Prediction One made the best prediction model using an artificial neural network with internal cross-validation. The details are trade secrets and cannot be provided.
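Because the partitioning details are not disclosed, the following is only a generic sketch of internal validation, assuming a plain shuffled 5-fold split of the 430 records and an AUC computed as the probability that a patient with successful retrieval receives a higher score than one without. The function names and the four example scores are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def auc(scores, labels):
    """AUC as the chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


folds = k_fold_indices(430, k=5)
print([len(fold) for fold in folds])  # [86, 86, 86, 86, 86]

# Each fold serves once as the validation set; the other four are for training.
for held_out in range(5):
    validation = folds[held_out]
    training = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
    assert len(training) + len(validation) == 430

# Hypothetical scores: three of the four positive/negative pairs are ordered correctly.
print(auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))  # 0.75
```

The pairwise definition of AUC is quadratic in the number of patients, which is acceptable at this sample size; a rank-based formula would be the usual choice for larger datasets.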

The data from the 20 patients who underwent micro-TESE from 1 January to 31 December 2021 were used as an external validation dataset.

3. Results

The mean age of the 430 male patients who underwent micro-TESE targeted by this study was 36.78 ± 7.58 years. Sperm retrieval was successfully achieved in 151 (35.1%) patients.

Table 1 shows the clinical characteristics of patients with successful (+) and unsuccessful (−) sperm retrieval by micro-TESE. Statistically significant differences between successful and unsuccessful sperm retrieval were observed for age, LH, FSH, total testosterone, Rt testis, Lt testis, and Johnsen score count. Patients in the successful sperm retrieval group were older (mean age 39.80 ± 9.50 years), had lower levels of LH (mean LH 8.53 ± 5.80 mIU/mL) and FSH (mean FSH 18.05 ± 14.88 mIU/mL), a higher total testosterone level (mean total testosterone 4.43 ± 2.56 ng/mL), bigger Rt testis (mean Rt testis 12.11 ± 5.06 mL), bigger Lt testis (mean Lt testis 12.06 ± 5.03 mL), and a higher Johnsen score count (mean JSC 6.58 ± 2.45).

However, no statistically significant differences were observed for height, body weight, BMI, PRL, E2, and T/E2 between successful and unsuccessful sperm retrieval.

We have provided these data for the 430 patients who underwent micro-TESE in the supplement.

Table 2 shows differences between sperm retrieval (+) and (−) for surgical site, G-band chromosome test, azoospermia factor (AZF) analysis, and medical history (none, Lt varicocele, cryptorchidism, inguinal hernia, torsion of spermatic cord, orchitis, cancer treatment, spinal injury, spina bifida, antisperm antibody (+), and others) when micro-TESE was performed. Statistically significant differences were observed in medical history between successful and unsuccessful sperm retrieval. However, differences in surgical site, G-band chromosome test, and AZF analysis between successful and unsuccessful sperm retrieval were not statistically significant.

We have provided these data for the 430 patients who underwent micro-TESE in the supplement.

Figure 1 shows the AI prediction model for sperm retrieval in micro-TESE, generated using the Prediction One software. In the accuracy evaluation of the AI prediction model, the AUC was 0.7246 (72.46%). In addition, in a ranking of contribution of variables from 1st to 5th, “T/E2” ranked 1st.

The confusion matrix presented in Figure 2 uses a threshold of 0.47. At this threshold, the values for accuracy, precision, and recall are 72.09%, 64.76%, and 45.03%, respectively, and the F-value (F1 score) is 53.13%.
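The relationship between these four metrics can be checked with a short sketch. The cell counts used here (68 true positives, 37 false positives, 83 false negatives, 242 true negatives) are an assumption: they were back-calculated from the reported percentages together with the 151 successful retrievals among 430 patients, and were not read from Figure 2.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)  # share of predicted successes that were real
    recall = tp / (tp + fn)     # share of real successes that were predicted
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Assumed counts, back-calculated from the reported percentages (not from Figure 2).
accuracy, precision, recall, f1 = classification_metrics(tp=68, fp=37, fn=83, tn=242)
for name, value in [("accuracy", accuracy), ("precision", precision),
                    ("recall", recall), ("F1", f1)]:
    print(f"{name}: {100 * value:.2f}%")
```

Under these assumed counts the computed values agree with the reported ones to within rounding. The low recall relative to precision indicates that, at a threshold of 0.47, the model misses more than half of the patients in whom sperm were actually retrieved.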

Table 3 presents a ranking of variables contributing to sperm retrieval (+) and (−). Rankings are shown from 1st to 10th. The variable contributing to sperm (+) and (−) ranked 1st is “T/E2 ratio.” “Age” and “medical history” also rank highly for contribution to sperm (+) and (−). “FSH” ranks highly for contribution to sperm (−), while “G-band” ranks highly for contribution to sperm (+).

In 2021, we performed micro-TESE in 20 patients and investigated differences between actual results for sperm retrieval and those predicted using our AI model when the threshold was 0.47. Table 4 presents the data. Surprisingly, the actual and predicted results matched in 17 of the 20 patients (85.0%). In the remaining three patients, the AI model predicted that sperm could not be retrieved, but sperm were retrieved in the actual micro-TESE. The patient indicated as no. 1 had cryptozoospermia or azoospermia due to rapid weight loss. We performed micro-TESE to collect sperm in all cases, even when FSH was normal.
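How a fixed threshold turns a predicted probability into a yes/no prediction, and how agreement with the surgical outcome is then counted, can be sketched as follows. The five probabilities and outcomes are hypothetical placeholders rather than values from Table 4, and treating a probability exactly equal to the threshold as a positive prediction is an assumption.

```python
THRESHOLD = 0.47  # threshold reported for the confusion matrix


def predict_retrieval(probability, threshold=THRESHOLD):
    """Predict successful retrieval when the model probability reaches the threshold."""
    return probability >= threshold


def agreement_rate(predicted, actual):
    """Fraction of patients whose predicted and actual outcomes coincide."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)


# Hypothetical model outputs and surgical outcomes for five patients.
probabilities = [0.81, 0.52, 0.30, 0.12, 0.44]
actual = [True, True, False, False, True]

predicted = [predict_retrieval(p) for p in probabilities]
print(agreement_rate(predicted, actual))  # 0.8
```

In this toy example the single mismatch is a patient scored just below the threshold in whom sperm were nevertheless retrieved, the same direction of error as the three mismatches described above.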

4. Discussion

Requiring no coding, the AI approach that we adopted has the potential to accelerate access to deep learning for clinicians and clinical researchers [24]. The ability to create an AI model with no need for coding is a big advantage for doctors. We previously developed an AI-assisted image classifier that provides scores for histological testis images of patients with azoospermia using Google Cloud AutoML Vision. With it, testis images could be classified at 82.6% accuracy [21].

The successful sperm retrieval rate is reportedly between 40% and 60% in men with NOA [5]. Since the rate is not very high, it is important to have predictive criteria for micro-TESE outcomes in order to better counsel patients before performing micro-TESE. Previous large sample size studies were consistent in demonstrating that sperm retrieval can only be predicted with limited diagnostic accuracy even when using machine learning models [25]. To investigate this further, Zeadna et al. [20] presented a “machine learning perspective” on predicting the presence of sperm before surgery in NOA patients. Their main conclusion was that a model with FSH, LH, testosterone, semen volume, age, BMI, ethnicity, and testicular volume as predictors could predict sperm retrieval with moderate accuracy (AUC 0.8).

However, there were some problems in their study. First, the number of patients in the study population was relatively small (n = 119). Second, regarding criteria for NOA, 7.6% of the patients showed normal spermatogenesis in the testicular histology examination. Third, 25% of patients had a normal FSH level, and the mean testicular volume was 13 mL. Finally, the sperm retrieval rates were much higher (65%) than those found in the literature for TESE [26]. Therefore, it is necessary to strictly define the study population when creating a predictive model.

Although more than 20 years have passed since the first use of micro-TESE was described, an ideal model for predicting the presence or absence of sperm before micro-TESE remains elusive. In this study, our objective was to create an AI model to predict the presence or absence of sperm before micro-TESE using AI predictive modeling software with no need for coding. We successfully created the model using data from 430 patients who had undergone micro-TESE. Among the variables used, T/E2 ratio was found to be a critical factor for the presence or absence of sperm before micro-TESE.

It has been reported that the local balance between estrogen and androgen may be important for maintaining spermatogenesis [27]. In this regard, male patients with severe infertility, particularly those with NOA, have low serum T/E2 ratios. A cutoff point of 10 has been proposed as the lower limit of the ratio for adults [28]. In addition, it has been suggested that, when used together, T/E2 ratios and seminal testosterone levels may serve as good indicators for predicting the success of surgically retrieving sperm from the testes of patients with NOA [29]. Furthermore, Shiraishi et al. [30] recently reported that compared to patients with OA (n = 18), in those with NOA (n = 72), serum testosterone and T/E2 ratios were significantly decreased and E2 levels were significantly increased. They also observed that levels of aromatase, an enzyme responsible for the aromatization of androgens into estrogens, were increased in patients with NOA.

These researchers also performed univariate and multivariate analyses to predict sperm retrieval by micro-TESE in patients with NOA. Univariate analysis revealed no significant difference in T/E2 ratios between the sperm retrieval group (n = 22) and no sperm retrieval group (n = 50). In the multivariate analysis, while there was no statistically significant difference in T/E2 ratios, there was a slight difference in terms of odds ratio (odds ratio: 4.78, 95% confidence interval: 0.24–93.8) compared with LH (odds ratio: 0.91, 95% confidence interval: 0.71–1.18), FSH (odds ratio: 0.98, 95% confidence interval: 0.90–1.07), testosterone (odds ratio: 0.96, 95% confidence interval: 0.89–1.03), and estradiol (odds ratio: 1.36, 95% confidence interval: 0.72–2.55) [30].

Although Shiraishi et al. [30] did not mention the odds ratio for T/E2 ratios, we have found that they have the potential to be a characteristic variable for predicting sperm retrieval by micro-TESE in patients with NOA and therefore included T/E2 ratios in the variables for creating the AI prediction model. Surprisingly, we found that, among the variables, T/E2 ratios contributed most to predicting the possibility of sperm retrieval.

However, the reason for the lack of a statistically significant difference in T/E2 between successful and unsuccessful sperm retrieval by micro-TESE is unknown. So far, various clinical and hormone parameters have been investigated as potential predictors of successful sperm retrieval but the evidence for clinical and biochemical markers is conflicting [15]. One study found that a decreased seminal testosterone/estradiol ratio could be a good indicator for identifying absence of sperm production in NOA patients [29].

A limitation of our study is that the internal mechanism of the AI analysis system is a “black box,” so it cannot be readily understood [31]. We also consider the skill of surgeons and embryologists to be a limitation for making a highly accurate AI model like the one described in our study. Prediction One found that T/E2 ratios contributed most to predicting the possibility of sperm retrieval, but we cannot ignore these limitations completely.

Based on our results, we need to further explore a value of T/E2 ratio that would help to determine whether sperm could or could not be recovered by performing micro-TESE in patients with NOA. T/E2 ratios could be a useful clinical predictor of sperm retrieval in NOA.

In our analysis of data from 20 patients who underwent micro-TESE in 2021, which were not used for training the AI model, we found that in 85% of the cases, the actual and predicted results matched. We now use the AI model to make predictions and share the information with the embryologist before micro-TESE is performed.

In conclusion, we created an AI model for predicting sperm retrieval in patients with NOA who are to undergo micro-TESE. The AUC of our AI model was 0.7246, which is acceptable. In addition, we found that, among the variables, T/E2 ratios contributed most to predicting the possibility of sperm retrieval. T/E2 ratios have the potential to be a clinical predictor of sperm retrieval in NOA.

Data Availability

H.K. has ownership for the data used with Prediction One. The data collected during this study are patient data obtained with the Ethics Committee’s approval and can be shared in other research. All data generated or analyzed during this study are included in this published article.

Disclosure

A preprint has previously been published [32].

Conflicts of Interest

All authors have no competing interests to declare.

Authors’ Contributions

All authors contributed to the conception and design of the study. H.K. collected the data to build an AI model from clinical records, provided training in the automated machine learning models, and drafted the manuscript.

Acknowledgments

H.K. has received support in the form of a grant-in-aid for Scientific Research (C) from the Japan Society for the Promotion of Science (JSPS) (JSPS KAKENHI grant number JP22K09486).