Validation of novel Japanese indication criteria and biomarkers among living donor liver transplantation recipients with hepatocellular carcinoma: a single-center retrospective study

Aim: To validate the novel Japanese indication criteria for liver transplantation (LT) for hepatocellular carcinoma (HCC), i.e., the 5-5-500 criteria (nodule size ≤ 5 cm in diameter, nodule number ≤ 5, and alpha-fetoprotein (AFP) value ≤ 500 ng/mL) and the Japanese double eligibility criteria (DEC) (patients meeting the Milan or the 5-5-500 criteria), in the University of Tokyo cohort. The usefulness of biomarkers in predicting the recurrence of HCC was also evaluated.

In Japan, living-donor liver transplantation (LDLT) has been the mainstay for patients with end-stage liver disease and HCC because of the severe scarcity of deceased donors. While the Milan criteria have long been the gold standard, several center-oriented expanded criteria have been reported [3,6,8]. We proposed the Tokyo criteria in 2007: five or fewer tumors with a maximum diameter of 5 cm or less, without distant metastasis or vascular invasion [3]. Similarly, Kyoto and Kyushu advocated their own expanded criteria, which included des-gamma-carboxy prothrombin (DCP) as a biological marker [6,8]. These expanded criteria, however, had not been approved by the government, and patients beyond the Milan criteria but within a center's expanded criteria had to undergo LDLT as private practice, which led us to establish government-approved expanded criteria. Most recently, the 5-5-500 criteria (nodule size ≤ 5 cm in diameter, nodule number ≤ 5, and AFP value ≤ 500 ng/mL) were established by our colleagues on the basis of a retrospective analysis of the Japanese Liver Transplant Registry data [24]. These expanded criteria were approved as the new national selection criteria for liver transplant candidates with HCC and came into effect in August 2019. The double eligibility criteria (DEC), Milan + 5-5-500, have now been adopted as the new indication criteria for Japanese patients with HCC.
The aim of the present study was to validate the Japanese DEC and the 5-5-500 criteria in our single-center cohort. The usefulness of biological markers (AFP, AFP-L3, DCP, neutrophil-to-lymphocyte ratio [NLR], and platelet-to-lymphocyte ratio [PLR]) in predicting the recurrence of HCC after LT was also evaluated.
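The relationship among the three sets of criteria can be made concrete with a short sketch. This is an illustration only, not the authors' software: the `TumorProfile` class and all function names are hypothetical, the Milan rule is the commonly cited definition (a single nodule ≤ 5 cm, or up to three nodules each ≤ 3 cm), which the text above does not spell out, and exclusions such as vascular invasion or extrahepatic spread are deliberately omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TumorProfile:
    """Hypothetical container for pretransplant tumor findings."""
    diameters_cm: List[float]  # diameter of each viable nodule, in cm
    afp_ng_ml: float           # serum AFP, in ng/mL


def within_milan(p: TumorProfile) -> bool:
    # Assumed standard Milan rule: one nodule <= 5 cm,
    # or up to three nodules each <= 3 cm.
    n = len(p.diameters_cm)
    if n == 1:
        return p.diameters_cm[0] <= 5.0
    return n <= 3 and all(d <= 3.0 for d in p.diameters_cm)


def within_5_5_500(p: TumorProfile) -> bool:
    # Size <= 5 cm, number <= 5, AFP <= 500 ng/mL, as defined in the text.
    return (
        len(p.diameters_cm) <= 5
        and all(d <= 5.0 for d in p.diameters_cm)
        and p.afp_ng_ml <= 500.0
    )


def within_dec(p: TumorProfile) -> bool:
    # Double eligibility: meeting either set of criteria is sufficient.
    return within_milan(p) or within_5_5_500(p)


def venn_region(p: TumorProfile) -> str:
    """Label the region of the Milan / 5-5-500 Venn diagram."""
    milan, five = within_milan(p), within_5_5_500(p)
    if milan and five:
        return "within both"
    if milan:
        return "within Milan, beyond 5-5-500"
    if five:
        return "within 5-5-500, beyond Milan"
    return "beyond DEC"
```

For example, a patient with two small nodules but an AFP of 800 ng/mL falls within the Milan criteria and beyond the 5-5-500 criteria, and is therefore still eligible under the DEC.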

Patients
From January 1996 until the end of 2019, a total of 563 adult patients underwent LDLT at the University of Tokyo Hospital. Among them, 153 patients were treated for HCC and were the subjects of the present study. Preoperative diagnosis of HCC was based on dynamic multi-detector computed tomography (MDCT) performed within a month before LT in all cases. Lesions presenting with the typical radiological characteristics of classical HCC, that is, arterial phase enhancement and low density during the portal phase, were diagnosed as HCC and were counted and measured. In patients who had undergone pretransplant locoregional treatments, only the size of the viable lesion was measured, on the basis of MDCT before LDLT and on the basis of pathological findings after LDLT. Essentially, we used the Milan criteria as the standard indication for LT for HCC; however, we allowed the expanded criteria, i.e., the Tokyo criteria, in a private practice setting as mentioned above. Exceptionally, six patients exceeding the Tokyo criteria underwent LDLT in the early period. We did not use biomarkers such as AFP and DCP in patient selection.

Donor selection and postoperative management
Until 2015, the ratio of the estimated graft volume to the recipient standard liver volume (SLV) had to exceed 40% for LT at our institution. Since 2016, we have lowered this graft volume threshold to 35% of the recipient SLV. The left liver was the first choice for the graft if it satisfied the lower limit. Otherwise, right liver procurement was indicated if the estimated right liver graft volume was less than 70% of the donor's total liver volume, and a right lateral sector graft was used in selected cases. Details of donor evaluation and graft selection are described elsewhere [25]. The basic immunosuppression regimen comprised tacrolimus and steroids for all recipients, and the dose of each drug was gradually tapered over 6 months after LDLT. Our detailed postoperative recipient management, including the immunosuppression protocol, has been described elsewhere [26]. We do not modify immunosuppression for HCC recipients and do not use mTOR inhibitors or adjuvant chemotherapy. All patients were followed up at our department after LT according to the following protocol: monthly measurements of AFP and DCP, abdominal ultrasound every 3 months, and contrast-enhanced dynamic MDCT every 6 months. Recurrence was defined as the emergence of radiological findings on MDCT or magnetic resonance imaging compatible with typical HCC.
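The graft selection rule described at the start of this subsection can be sketched as a simple decision function. This is a simplified illustration, not the institutional protocol (which is detailed in [25]): the function names and return labels are hypothetical, strict inequalities are assumed at both thresholds, and it is assumed that a right liver graft must also satisfy the recipient-side lower limit. Right lateral sector grafts and other selected cases are left to individual assessment.

```python
def slv_threshold(year: int) -> float:
    """Minimum graft-volume-to-SLV ratio: 40% until 2015, 35% from 2016."""
    return 0.40 if year <= 2015 else 0.35


def choose_graft(left_ml: float, right_ml: float, donor_total_ml: float,
                 recipient_slv_ml: float, year: int) -> str:
    """Return a hypothetical label for the preferred graft type."""
    threshold = slv_threshold(year)

    # The left liver is the first choice when it is large enough
    # for the recipient (strict inequality is an assumption).
    if left_ml / recipient_slv_ml > threshold:
        return "left liver"

    # Otherwise a right liver graft is considered when it is less than
    # 70% of the donor's total liver volume. Requiring it to meet the
    # recipient threshold as well is an assumption of this sketch.
    if (right_ml / donor_total_ml < 0.70
            and right_ml / recipient_slv_ml > threshold):
        return "right liver"

    # Right lateral sector grafts and other options are case by case.
    return "individual assessment"
```

Under these assumptions, a 380 mL left liver for a recipient with an SLV of 1000 mL would not have qualified in 2010 (38% versus the 40% limit) but would qualify from 2016 onward (38% versus 35%).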

Statistical analysis
Categorical variables were expressed as number (%) and continuous variables as median with range. NLR and PLR were calculated by dividing the number of neutrophils or platelets, respectively, by the number of lymphocytes. Patient overall survival and recurrence rates were estimated using the Kaplan-Meier method and compared with the log-rank test. Receiver operating characteristic (ROC) curve analysis and the Youden index were used to define the optimal cut-off values of AFP, AFP-L3, DCP, NLR, and PLR for detecting recurrence. Univariate and multivariate analyses were performed using a Cox proportional hazards model to identify the predictors of recurrence. Factors with a P value less than 0.05 in the univariate Cox model were considered potential risk factors and were further analyzed in a multivariate Cox model. The hazard ratio (HR) and 95% confidence interval (CI) were calculated for each variable. Although seventeen variables listed in the table were examined as potential risk factors, AFP-L3 was excluded from the multivariate analysis because of the amount of missing data (AFP-L3 was not measured in 19 patients). The variables "beyond the Milan criteria," "beyond the 5-5-500 criteria," and "beyond the Japanese DEC" were also excluded from the multivariate analysis because they were considered not independent factors but composite factors strongly related to tumor number, size, and AFP value. All statistical calculations were performed using JMP Pro 15 (SAS Institute Inc., Cary, NC, USA). P values less than 0.05 were considered statistically significant.
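As an illustration of the ratio calculations and of how a Youden-index cut-off is chosen, a minimal sketch follows. It is not the JMP procedure used in the study: the function names are hypothetical, the example values are invented toy data rather than patient data, and a patient is assumed to be test-positive when the marker value is at or above the candidate cut-off.

```python
def nlr(neutrophils: float, lymphocytes: float) -> float:
    """Neutrophil-to-lymphocyte ratio."""
    return neutrophils / lymphocytes


def plr(platelets: float, lymphocytes: float) -> float:
    """Platelet-to-lymphocyte ratio."""
    return platelets / lymphocytes


def youden_cutoff(values, recurred):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    values   -- marker value for each patient
    recurred -- True if the patient had a recurrence
    """
    n_pos = sum(1 for r in recurred if r)
    n_neg = len(recurred) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcomes must be present")

    best_cutoff, best_j = None, -1.0
    # Every observed value is tried as a candidate cut-off.
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, r in zip(values, recurred)
                       if r and v >= cutoff)
        true_neg = sum(1 for v, r in zip(values, recurred)
                       if not r and v < cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy data (invented for illustration only).
toy_afp = [5, 12, 30, 45, 80, 150, 400, 900]
toy_recurred = [False, False, False, False, True, False, True, True]
cutoff, j = youden_cutoff(toy_afp, toy_recurred)
```

With these toy values the cut-off of 80 gives a sensitivity of 3/3 and a specificity of 4/5, so the index is approximately 0.8, the largest among the candidates.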

Validation of the 5-5-500 criteria and the Japanese DEC
The relationship among the Milan criteria, the 5-5-500 criteria, and the Japanese DEC is presented in a Venn diagram [Supplementary Figure 1]. The numbers of patients and of patients with recurrence meeting each set of criteria are summarized in Supplementary Table 1. The recurrence rate was lowest in patients meeting the 5-5-500 criteria (6.5%), followed by the Milan criteria (6.9%) and then the Japanese DEC (8.2%). All criteria achieved the target recurrence rate of below 10%. Looking at each area of the Venn diagram, the recurrence rate was highest (42.9%) in patients within the Milan but beyond the 5-5-500 criteria and in patients beyond the Japanese DEC. Meanwhile, the recurrence rate in patients within the 5-5-500 but beyond the Milan criteria was lower (20%) than in these two groups. In terms of patient numbers, the 5-5-500 criteria included eight more patients than the conventional Milan criteria (6.1% increase), and the Japanese DEC included 15 more (11.5% increase). The overall survival and recurrence rate curves in patients meeting each set of criteria are presented in Figure 1. The 5-year overall survival rates of all patients, patients meeting the Japanese DEC, the 5-5-500 criteria, and the Milan criteria were 76.9%, 77.9%, 79.0%, and 76.2%, respectively, and the corresponding 5-year recurrence rates were 10.9%, 9.2%, 7.4%, and 7.6%. There were no significant differences in either the 5-year recurrence or the 5-year survival rates among the criteria.

Usefulness of biomarkers in predicting the recurrence of HCC
The results of the ROC curve analysis for the biomarkers are presented in Figure 2. Among the five biomarkers, the area under the curve (AUC) of AFP was the highest (0.852). The sensitivity of AFP was also the highest (86.7%). Meanwhile, the false-positive rate (1 - specificity) of AFP-L3 was the lowest (8.3%). Patient recurrence rate curves stratified by each biomarker using the cut-off value obtained from the ROC curve analysis are presented in Supplementary Figure 2. Although the recurrence rate curves were well stratified by AFP, AFP-L3, and DCP (P < 0.0001), significant results were not obtained with NLR and PLR (P = 0.076 and P = 0.263, respectively).

Factors associated with HCC recurrence
Risk factors associated with HCC recurrence were evaluated with univariate and multivariate analyses. Univariate analysis revealed that being beyond the Milan criteria, beyond the 5-5-500 criteria, and beyond the Japanese DEC were all significant predictors [Table 2]. Among these three, being beyond the 5-5-500 criteria had the highest hazard ratio (7.99) and the smallest P value (0.0005). Apart from these three criteria-related factors, a high AFP value (≥ 60 ng/mL), a high AFP-L3 value (≥ 35%), a high DCP value (≥ 130 mAU/mL), and a large tumor size (≥ 2.0 cm) were all identified as significant predictors by univariate analysis. Among the five biomarkers evaluated, a high AFP value had the highest hazard ratio (11.50) and the smallest P value (< 0.0001). Multivariate analysis revealed that high AFP and high DCP values were independent significant predictors.

DISCUSSION
The results of the present study suggest that the 5-5-500 criteria and the Japanese DEC are appropriate and acceptable, since the 5-year recurrence rates in patients meeting these criteria were both below 10% in our cohort. Compared with the conventional Milan criteria, the 5-5-500 criteria and the Japanese DEC could increase the number of eligible LDLT candidates by 6.1% and 11.5%, respectively. As for the usefulness of biomarkers in predicting the recurrence of HCC, AFP seems to be the most reliable. Although there were some missing data, AFP-L3 also seems promising.
In Japan, the national insurance system had restricted LDLT to patients within the Milan criteria until recently, although some centers have been performing LDLT in private practice under center-oriented expanded criteria that achieved a 5-year patient survival of over 80% and a 5-year recurrence rate of 10% [27,28]. Consequently, a few patients had given up the chance of LDLT for financial reasons despite having a potential live donor, and there have been strong demands to expand insurance coverage to those beyond the Milan criteria. When establishing government-approved expanded criteria, a 5-year recurrence rate of less than 10% and a 5-year survival rate of over 70% seem reasonable and socially acceptable targets in the setting of LDLT for HCC [29]; these were achieved in the benchmark study by Mazzaferro et al. [1]. In line with this recommendation, the 5-5-500 criteria were established on the basis of a retrospective analysis of the Japanese nationwide survey data, with the intent of enabling the maximal enrollment of candidates while securing a 5-year recurrence rate below 10% and a 5-year survival rate over 70% [24]. Because excluding patients within the Milan but beyond the 5-5-500 criteria seems neither socially acceptable nor rational, and considering the worldwide prevalence and acceptance of the Milan criteria, the Japanese DEC, Milan + 5-5-500, have now been adopted as the new indication criteria in Japan.
In the present study, the 5-year recurrence and survival rates in patients meeting the 5-5-500 criteria and the Japanese DEC were better than the socially accepted targets mentioned above [Figure 1]. In addition, the number of LDLT candidates increased considerably under these criteria [Supplementary Figure 1 and Supplementary Table 1]. The survival and recurrence outcomes were similar to those in our previous national report [24], although the increase in LDLT candidates was somewhat more modest in the present study. Univariate analysis revealed that being beyond the 5-5-500 criteria and being beyond the Japanese DEC were both significant predictors of recurrence [Table 2]. Meanwhile, the recurrence rate was higher in patients beyond the Japanese DEC [Supplementary Figure 1 and Supplementary Table 1]. On the basis of these findings, we consider the Japanese DEC to be appropriate selection criteria that maximize the number of LDLT candidates while securing acceptable outcomes. The major concern is that the recurrence rate was considerably high (42.9%) in patients within the Milan but beyond the 5-5-500 criteria in the present study [Supplementary Table 1]. The exclusion of patients within the Milan criteria, however, does not seem socially acceptable at present. In addition, when the Japanese DEC were adopted, the 5-year recurrence and survival rates as a whole still fell within the targets.
Among the five biomarkers, AFP seems to be the most reliable, with the highest AUC value [Figure 2]. The usefulness of AFP in predicting recurrence after LT has been investigated by many researchers [12-14,30-32], and AFP is incorporated in some selection [7,10,12,33] and prognostic models [12-14]. The AFP model, developed by the Liver Transplantation French Study Group, combines serum AFP level, tumor size, and tumor number [12]. Another well-known prognostic model is the RETREAT score, which incorporates microvascular invasion, tumor diameter, and tumor number in addition to the AFP value as prognostic variables [13]. Another prognostic model, the TRAIN score, incorporates the AFP slope, defined as [(final AFP) - (initial AFP)]/time [14]. The cut-off value of AFP differs from study to study, ranging from 15 ng/mL to 1000 ng/mL [7,10,12,13,30-33]. The AFP cut-off value of 60 ng/mL used in the present study is relatively low compared with those used in other studies; however, it was shown to be useful in predicting recurrence [Supplementary Figure 2]. The present results, as well as the previous reports, justify the use of pretransplant AFP values in the expanded indication criteria of LT for HCC patients.
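The AFP slope quoted above for the TRAIN score can be written out explicitly. The symbols are introduced here for illustration, and reading "time" as the interval between the initial and final AFP measurements is an assumption; the unit of time is not stated in the text above.

```latex
\[
  \text{AFP slope}
  = \frac{\mathrm{AFP}_{\text{final}} - \mathrm{AFP}_{\text{initial}}}{\Delta t},
  \qquad
  \Delta t = t_{\text{final}} - t_{\text{initial}}
\]
```

A positive slope therefore indicates an AFP value that rose between the two measurements, and a negative slope indicates one that fell.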
AFP-L3, a reliable marker for the diagnosis of HCC [20], proved to be a promising marker for recurrence after LT, since the specificity of AFP-L3 was the highest [Figure 2] and the patient recurrence rate curves were well stratified by AFP-L3 [Supplementary Figure 2]. However, few studies [34] have investigated the usefulness of AFP-L3 in predicting HCC recurrence after LT. Highly sensitive AFP-L3 became available around 2010 in Japan, which enabled the measurement of AFP-L3 even in patients with total AFP levels below 20 ng/mL [20,35-37]. Highly sensitive AFP-L3 is reported to be 5-10 times more sensitive than conventional AFP-L3 [37]. Along with these studies in non-transplant HCC patients, the present results warrant further investigation and validation of the usefulness and efficacy of AFP-L3 in predicting HCC recurrence after LT.
The AUC value of DCP was the third highest [Figure 2], and multivariate analysis revealed that DCP was one of the independent risk factors for recurrence [Table 2] in this cohort. Although DCP has not been commonly used in the West [38], some have argued that DCP is more predictive than AFP [8,39], and indeed, DCP is incorporated in the extended indication criteria of LT at two major centers in Japan [6,8]. A new prognostic model using only the serum levels of AFP and DCP, the MoRAL score, was developed in Korea [15] and was shown to be more effective than the Milan criteria in predicting recurrence after LT. While DCP is criticized for not being a routine laboratory test in the West and for its dependence on vitamin K status and warfarin administration in clinical settings, reports from Asia as well as the present study warrant further study of DCP in predicting HCC recurrence after LT.
NLR and PLR are indicators of inflammatory status previously reported as prognostic markers for the recurrence of various cancers, including HCC [16,17]. As the usefulness of NLR has been shown in both DDLT and LDLT settings [40,41], NLR is incorporated in some prognostic models [14,42]. Although the usefulness of PLR has also been reported since 2012, supporting evidence is still limited [40,41]. NLR and PLR were not as useful as AFP, AFP-L3, and DCP in predicting HCC recurrence after LT in the present study [Figure 2, Table 2, Supplementary Figure 2]. One of the drawbacks of these inflammatory markers may be the variable nature of neutrophil, platelet, and lymphocyte counts. This is all the more so in cirrhotic patients, who suffer from portal hypertension, splenomegaly, and consequently pancytopenia. As for other biomarkers, we could not evaluate the usefulness of FDG-PET, one of the promising biomarkers reported previously [18,19], because FDG-PET was not routinely performed at our institute.
Our analysis has several weaknesses related to its retrospective design and the limited number of patients included. Both the present cohort and the national cohort used to establish the 5-5-500 criteria span a long period and include a considerable number of cases from nearly 20 years ago. Because advances in imaging modalities, anti-viral treatments, and immunosuppression regimens may have changed the management of LT considerably over the last two decades, it seems mandatory to validate the criteria in a recent cohort or in a prospective study. Although the usefulness of tumor downstaging before LT has been reported recently [43], there was unfortunately no case of intentional downstaging in the present cohort. In Japan, where the national insurance system restricts the indication of LT for HCC to patients with decompensated cirrhosis, HCC patients with compensated cirrhosis are usually recommended locoregional treatments and are referred for LT when they develop decompensated cirrhosis not amenable to locoregional treatments. The downstaging strategy for those beyond the selection criteria and the expansion of the indication criteria are two opposite ways to broaden the indication of LT, and they should be compared and discussed in future studies.
In conclusion, the present study suggests that both the 5-5-500 criteria and the Japanese DEC are appropriate for patients with HCC in LDLT. AFP, including AFP-L3, was demonstrated to be a reliable biomarker and could reasonably be incorporated into the expanded selection criteria. Further validation with more recent cases and a prospective study is warranted.

Availability of data and materials
The data used in the present study were submitted to the journal.