Novel high-throughput applications for NAFLD diagnostics and biomarker discovery

Nonalcoholic fatty liver disease (NAFLD) is the most common chronic liver condition worldwide, driven by the global rise in obesity, which has become an insidious healthcare epidemic. While nonalcoholic fatty liver is recognized as a multi-system disease, benign and pernicious in its unfolding, nonalcoholic steatohepatitis is the more severe form, which can progress to cirrhosis and on to hepatocellular carcinoma. Unfortunately, liver biopsy, although beset by many limitations, remains the only accurate diagnostic tool and sets the benchmark for a plethora of non-invasive biomarkers, which have so far proved limited in their reliability and uptake. As a result, we need better diagnostic and prognostic tools to identify and stratify patients at risk of disease progression in order to enhance treatment and monitoring strategies. In this review, we explore the performance, as well as the pros and cons, of three novel technologies that have the potential to become the next generation in NAFLD diagnostic testing. To harness these technologies, however, more work needs to be done to refine and validate the technology features under review; we also suggest ways in which personalized medicine could be mobilized to discover the next generation of non-invasive diagnostics.


NAFLD definition and epidemiology
Nonalcoholic fatty liver disease (NAFLD) is a clinicopathologic syndrome encompassing several clinical entities. These range from nonalcoholic fatty liver (NAFL), the non-progressive phenotype of the disease characterized by the accumulation of lipids in hepatocytes with no cellular damage, to the progressive phenotype known as nonalcoholic steatohepatitis (NASH), which involves hepatocyte injury (ballooning), inflammatory infiltrates and fibrogenesis, and carries an elevated risk of cirrhosis and liver cancer.
NAFLD is associated with obesity, hypertension, type 2 diabetes mellitus (T2DM), insulin resistance (IR) and hyperlipidemia, or with a combination of several of these conditions, which together define metabolic syndrome (MS). NAFLD is recognized as the most common cause of chronic liver disease worldwide [1]. In 2015, the global obesity pandemic was estimated to affect 107.7 million children and 603.7 million adults. Data from 195 countries established that the prevalence of obesity doubled between 1980 and 2015 in more than 70 countries [2], and obesity has been shown to be the main contributing factor in the global NAFLD burden [3,4]. Mathematical modeling of these epidemics estimates that about 1.12 billion and 100.9 million people are affected by obesity [5] and NAFLD [6], respectively. Without intervention, NAFLD will represent the leading cause of liver transplantation by 2025 [7].
NAFLD is commonly silent, with no clinical manifestations or specific symptoms; thus, its diagnosis is often based on exclusion criteria. Although NAFL or NASH can be strongly suspected based on clinical and imaging features (such as abnormal lab tests or the presence of steatosis), liver biopsy remains the gold standard for the definitive diagnosis of NAFLD [Figure 1]. However, it is an invasive technique with costs and patient risks attached. It is also subject to bias due to sampling variability and observer interpretation, and thus cannot be relied upon as an appropriate procedure for screening or prognosis [8]. Moreover, the identification of subjects at risk of developing NAFL and NASH with fibrosis is imprecise and, although significant work is ongoing, accurate and precise non-invasive markers for the diagnosis of NASH, and of fibrosis in NAFLD, are still to be identified [9].
In this review, we describe three modern technologies and the status of their current application and contribution to the discovery of novel biomarkers for the diagnosis of NASH and liver fibrosis in NAFLD.

Brief description of the current non-invasive diagnostic tools
The majority of individuals with NAFLD are asymptomatic or paucisymptomatic (e.g., asthenia or upper-quadrant abdominal pain). The need for generalized screening and surveillance tools is debatable due to the high direct and indirect costs of diagnostic tests and the low predictive value of surrogate markers (e.g., transaminases and others, alone or combined in diagnostic scores). However, it is desirable to correctly identify those subjects at risk (i.e., age > 50 years, T2DM or MS) with NASH alone or with NAFL/NASH, particularly when associated with advanced fibrosis. Patients with NAFL are thought to be at low risk of adverse consequences and progression to cirrhosis/hepatocellular carcinoma (HCC) or other harmful outcomes such as cardiovascular disease and malignancy. In contrast, the presence of NASH increases the risks of liver-related and possibly non-liver-related outcomes compared with NAFL alone. The risk of liver-related mortality in NAFLD grows exponentially as the stage of fibrosis increases [10]. However, some studies have highlighted the development of HCC in patients with NAFL even in the absence of fibrosis [11,12].
Serology markers of hepatic synthesis (e.g., total bilirubin, albumin, prothrombin time NS creatinine), platelet count (predictive of portal hypertension), together with increases in alanine aminotransferase (ALT) and γ-glutamyltranspeptidase and an AST/ALT ratio (e.g., < 1), are all biochemical markers that are initially useful in the diagnosis of NASH, but none of them is specific. Furthermore, serological markers have been combined in several diagnostic algorithms [Figure 1]. For example, the predictive model HAIR combines hypertension, increased ALT and IR for the diagnosis of NASH [13]. The performance of predictive models based on surrogate markers of NASH and liver fibrosis has been extensively reviewed in the literature [14-16].
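A composite score of this kind reduces to counting how many risk components are present. The sketch below is a minimal illustration in Python: the text names the three HAIR components but not their cut-offs, so the ALT and IR thresholds, the `Patient` record and the function name are placeholder assumptions rather than the published definition [13].

```python
from dataclasses import dataclass

# Placeholder cut-offs (assumptions for illustration, not taken from the text).
ALT_CUTOFF_U_PER_L = 40.0
IR_INDEX_CUTOFF = 5.0


@dataclass
class Patient:
    hypertensive: bool   # clinical diagnosis of hypertension
    alt_u_per_l: float   # serum ALT, U/L
    ir_index: float      # any insulin-resistance index; the scale is hypothetical


def hair_like_score(p: Patient) -> int:
    """Count how many of the three components are present (0 to 3)."""
    return (
        int(p.hypertensive)
        + int(p.alt_u_per_l > ALT_CUTOFF_U_PER_L)
        + int(p.ir_index > IR_INDEX_CUTOFF)
    )


# Synthetic patients, for demonstration only.
cohort = [
    Patient(hypertensive=True, alt_u_per_l=55.0, ir_index=6.2),
    Patient(hypertensive=True, alt_u_per_l=30.0, ir_index=2.0),
    Patient(hypertensive=False, alt_u_per_l=30.0, ir_index=2.0),
]
for patient in cohort:
    print(patient, "->", hair_like_score(patient))
```

Keeping the cut-offs as named constants makes it explicit that the diagnostic behavior of such a score depends entirely on thresholds that must be taken from, and validated against, the original study.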
Abdominal ultrasound (US) remains the simplest and most widely used method for detecting hepatic steatosis [Figure 1]. Quantification of liver fat can be achieved with the FibroScan CAP modification (Controlled Attenuation Parameter), a non-invasive measurement that is proportional to the attenuation of the ultrasound beam through the liver parenchyma. It can also be used in conjunction with magnetic resonance imaging (MRI), which is now able to quantify limited amounts of intra-hepatocyte triglycerides (TG) and can sample large parenchymal volumes.
However, US has limited sensitivity and does not reliably detect steatosis when it is < 20% or in individuals with high BMI (> 40 kg/m²). Despite observer dependency, both MRI and US robustly diagnose moderate and severe steatosis when high costs do not limit their use. These techniques allow us to identify and easily quantify intrahepatic TG, the predominant lipids that accumulate in hepatic steatosis. However, TG are not hepatotoxic per se; indeed, they have a protective role, buffering against toxic fat by storing it in a neutral form. Besides, it has been reported that there is no difference in TG content between NAFL and NASH [17]. These data call into question the real utility of TG quantification and the clinically relevant information that can be extrapolated from it. Conversely, other reactive lipids, such as lysophosphatidylcholine, ceramides and cholesterol, have harmful effects [17,18]. However, none of these toxic lipids can be identified or quantified by any of the previously mentioned techniques.
Concerning non-invasive techniques for the quantification of fibrosis, one of the most commonly used is transient hepatic elastography [Figure 1]. The diagnostic accuracy of this technique has been widely validated in patients with chronic HCV hepatitis but not in patients with NAFLD. Potential limitations consist of reduced sensitivity in mild forms of fibrosis and technical difficulties in detecting and interpreting data in the presence of high BMI and/or thoracic fold thickness. Even when using an XL probe, the failure rate remains high (35%). Alternatively, magnetic resonance elastography provides a highly accurate measurement of fibrosis, inflammation and steatosis; however, its application in clinical practice is limited by its scarce availability (the instrumentation is only available in academic centers) and high cost [19].
Although histology represents the only reliable diagnostic method, high associated costs and non-negligible patient risks limit its use on a large scale. Due to all these limitations, there is still an unmet clinical need to distinguish individuals presenting an early form of the disease from patients at the highest risk of clinical complications. Thus, new alternatives are urgently required.

Novel technologies for biomarkers discovery applied to NAFLD
The search for new non-invasive applications has been initiated due to the limitations of liver biopsy. At the same time, no single- or multiple-parameter blood test, even when enhanced by diagnostic algorithms, has shown sufficient negative or positive predictive value to replace liver biopsy. The identification of novel candidate biomarkers is of utmost importance to overcome the current limitations in obtaining reliable non-invasive NAFLD diagnostic tests.
Fortunately, the development of modern high-throughput technologies such as transcriptomics, proteomics, metabolomics and now "glycomics" has favored the development of novel candidates during the discovery phase, enabling the rapid measurement of thousands of molecules in one run. Below, we detail what are arguably three of the most promising technologies to date, with the potential to transform non-invasive liver diagnostics over the course of the decade.
The first concerns a proteomics technology known as SOMAscan® (SomaLogic). It is the first platform able to overcome the difficulties in developing high-throughput assays for candidate protein biomarkers. The detection system is based on SOMAmers (Slow Off-rate Modified Aptamers), which are DNA-modified aptamers with high affinity and specificity for their target analytes [20,21]. Through an iterative selection and amplification process known as Systematic Evolution of Ligands by Exponential Enrichment (SELEX), SOMAmers are selected from large randomized nucleic acid libraries to target the intramolecular signatures of proteins in their native folded conformations. Moreover, owing to the use of unique modified nucleotides, SOMAmers are more resistant to nuclease activity than conventional aptamers and show a higher affinity than antibodies. Currently, the platform enables the quantification of more than 1,100 proteins simultaneously in a highly multiplexed assay using only 65 µL of sample. The sensitivity of the test is generally comparable to that of sandwich ELISA (e.g., a median lower limit of quantitation of 100 fM and a limit of detection of 40 fM). Samples from a wide variety of sources are amenable to analysis: serum, CSF, cell/tumor extracts, synovial fluid, etc.
Regarding applications in the field of chronic liver disorders, Wood et al. [22] tested the multiplexed proteomic assay (1,129 proteins), identifying candidates to be included in a multi-component classifier of fatty liver. The study included a training set of 443 patients for variable identification (discovery) and a validation set of 134 patients. In the reported multivariate analysis, the following eight proteins were associated with steatosis: Aminoacylase-1 (ACY1), Sex hormone-binding globulin (SHBG), Cathepsin Z (CTSZ), Hepatocyte growth factor receptor (MET), Gelsolin (GSN), Galectin-3 binding protein (LGALS3BP), Neural cell adhesion molecule L1-like protein (CHL1) and Antithrombin III (SERPINC1). The diagnostic evaluation provided an AUROC of 0.86 (0.79-0.92) for the proteomic component alone and 0.91 (0.871-0.957) when combined in the multi-component classifier.
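For readers less familiar with the metric, the AUROC values quoted above can be read as the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control. The following Python sketch computes that quantity directly from pairwise comparisons; the function name and the scores are synthetic assumptions for illustration and are unrelated to the data of Wood et al. [22].

```python
def auroc(case_scores, control_scores):
    """AUROC as the fraction of (case, control) pairs ranked correctly.

    Ties count as half a correct pair, which is equivalent to the
    Mann-Whitney U statistic divided by the number of pairs.
    """
    correct = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                correct += 1.0
            elif case == control:
                correct += 0.5
    return correct / (len(case_scores) * len(control_scores))


# Synthetic classifier outputs (higher score = more likely steatosis).
steatosis_scores = [0.9, 0.8, 0.7, 0.4]
control_scores = [0.6, 0.3, 0.2, 0.1]

print(auroc(steatosis_scores, control_scores))  # 15 of 16 pairs correct -> 0.9375
```

The pairwise form is quadratic in sample size, which is acceptable for cohorts of a few hundred patients; rank-based implementations give the same value more efficiently.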
In a separate study, Lai et al. [23] recently evaluated the SOMAscan® platform for the diagnosis and monitoring of NASH. The preliminary study involved the evaluation of 60 serum samples from NAFLD subjects: 20 with steatosis; 20 with NASH and minimal/moderate (F0/F1/F2) fibrosis; and 20 with NASH and advanced fibrosis (F3/F4). Of the 1,310 targeted proteins, they reported 88 proteins useful to categorize steatosis and NASH. Utilizing prediction analysis, the top 10 candidates were combined in two different panels. A five-protein signature [thrombospondin-2, Bcl-2-related protein A1, Collectin-11, Methyltransferase N6AMT1 (N6AMT1) and growth/differentiation factor 15] was able to differentiate NASH from steatosis. In addition, a six-protein signature [SELE (E-selectin), insulin growth factor binding protein-7, insulin growth factor binding protein-5, N-acetyl-D-glucosamine kinase, Decorin and interleukin-1 receptor type 2] was able to discriminate NASH subjects with minimal to moderate fibrosis from NASH subjects with moderate to advanced fibrosis. The diagnostic accuracy presented AUROCs between 0.93 and 1. Moreover, Povero et al. [24] recently described the application of SOMAscan® to the proteomic characterization of circulating extracellular vesicles (EVs). The data demonstrated that the quantity and protein constituents of circulating EVs provided strong evidence for the utility of EV protein-based liquid biopsies for NAFLD/NASH diagnosis. While convincing, further validation using a larger number of patient samples would be required to confirm the efficacy of this type of liquid biopsy.
Alternatively, a specific metabolomics method has provided persuasive results in favor of further elaboration. Surface-Enhanced Raman Spectroscopy (SERS) is based on the collection of light scattered inelastically from a sample upon illumination with a low-power laser, in the presence of nanostructured metal surfaces such as metal nanoparticles [25]. SERS spectra can be easily obtained from aqueous samples and carry molecular information in terms of the chemical composition of the sample, making the technique suitable for bioanalytical applications. Moreover, recent developments in photonics have enabled cost-effective, compact and portable SERS analyzers capable of rapid analyses within seconds, pushing their potential from the bench to the bedside [26]. In a "label-free" approach, SERS spectra contain information about the species that freely adsorb on the nanostructured metal surface (i.e., the SERS substrate), driven by their affinity for the surface itself [27]. Label-free SERS spectra of biofluids contain information mainly due to low-molecular-weight metabolites. Thus, SERS spectra can be considered as partial metabolic fingerprints, reflecting the metabolic profile of a patient. Such SERS "fingerprints" can be exploited for diagnostic applications in a sort of "untargeted metabolomics" approach [26,27].
In relation to liver disease diagnosis, this type of label-free SERS modality has been applied, for example, in the non-invasive identification of hepatocellular carcinoma through fingerprint metabolomics [28]. In a unique study, SERS technology was utilized to analyze the plasma from a morbidly obese NAFLD cohort [29]. Our data demonstrated that the concentrations of specific metabolites, e.g., uric acid and hypoxanthine, changed in relation to the stage of the disease. For example, the uric acid/hypoxanthine ratio was able to discriminate between NAFL and NASH in females. The investigation was a pilot study, based on the rapid spectroscopic analysis of a few microliters of plasma using portable and compact instrumentation. The confidence intervals indicated that all figures of merit for the classification model were between 60% and 90%, with specificity being slightly lower than sensitivity and an AUROC of about 0.8. These promising results suggest that it would be worth performing an extensive validation of this label-free SERS-based method for classifying NAFLD on a larger cohort that includes male subjects. Similar exploratory studies using label-free SERS serum metabolomics have been performed to distinguish patients with HCC from those with compensated and decompensated liver cirrhosis [28,30].
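To make the figures of merit mentioned above concrete, the sketch below shows how sensitivity and specificity follow once a cut-off is placed on a two-band intensity ratio. It is a hedged illustration only: the band intensities, the cut-off, the function name and the assumed direction of the effect (a higher ratio taken to indicate NASH) are all hypothetical and are not taken from the pilot study [29].

```python
def ratio_classifier_metrics(samples, cutoff):
    """Sensitivity and specificity of a single-ratio classifier.

    samples: iterable of (uric_acid_band, hypoxanthine_band, is_nash).
    A sample is called NASH when uric_acid_band / hypoxanthine_band > cutoff
    (the direction of the effect is an assumption made for this example).
    """
    tp = fp = tn = fn = 0
    for uric_acid, hypoxanthine, is_nash in samples:
        predicted_nash = (uric_acid / hypoxanthine) > cutoff
        if predicted_nash and is_nash:
            tp += 1
        elif predicted_nash and not is_nash:
            fp += 1
        elif not predicted_nash and is_nash:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)   # true NASH correctly flagged
    specificity = tn / (tn + fp)   # true NAFL correctly cleared
    return sensitivity, specificity


# Synthetic band intensities in arbitrary units (not real SERS data).
samples = [
    (3.0, 1.0, True), (2.4, 1.2, True), (1.5, 1.0, True),     # NASH
    (1.0, 1.0, False), (1.2, 1.0, False), (2.2, 1.0, False),  # NAFL
]
sensitivity, specificity = ratio_classifier_metrics(samples, cutoff=1.8)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Sweeping the cut-off over all observed ratios and plotting sensitivity against 1 - specificity yields the ROC curve from which an AUROC is obtained.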
The third approach refers to an "omics" that has received greater attention over the last two decades. Today, glycomics is recognized as a powerful toolbox utilizing sophisticated technologies and diagnostic biomarkers for a plethora of disease applications. N-glycans are synthesized in the liver and plasma-B cells and play significant roles in the structure, function, regulation, signaling, mediation and binding of cellular and protein interactions. We know therefore that glycosylation, and aberrant forms thereof including the formation and abundance of sugars, are closely bound with the etiologies and pathogenesis of liver disease [31]. In recent history, liver glycomics has emerged to provide novel analytical tools and clinical applications for the typing and quantitation of N-glycans for the prediction of NAFLD [32], NASH [33] and HCC [34,35].
Unfortunately, today, the quantification of glycans and other analytes through mass spectrometry methods is operationally complicated, expensive and not widely available. Notably, in one study, it cost $605 for one multi-omics test [36] , which would far exceed the price expectations of clinical laboratories in the public and private sectors. On the other hand, it is now possible to apply cost-effective glycomics for liver disease diagnosis on affordable capillary electrophoresis (CE) systems used routinely in clinical laboratories for high-throughput screening of rare cancers and blood diseases.
For example, the Glyco Liver Profile, as highlighted elsewhere [31] , provides a simple method to cleave, label and then separate N-glycans from serum by automated CE, thus offering an all-in-one panel of glycan indices that report separate results for inflammation, fibrosis staging, compensated cirrhosis and HCC risk.
In principle, this approach measures an increase of core-fucosylated agalactosylated biantennary glycan (e.g., NGA2F) released from IgG in relation to a decrease of galactosylated nonfucosylated biantennary glycan (e.g., NA2) cleaved from liver-synthesized glycoproteins. In previous studies, this approach proved promising for NAFLD and NASH, with AUCs of 0.74 [37], 0.66-0.75 [34] and 0.72 [33], where the degree of NASH-related fibrosis was independent of steatosis severity and lobular inflammation [37]. This methodology could have a significant role to play in the stratification and monitoring of NAFL patients with significant fibrosis, generating a prognostic indicator of HCC risk [31], and thus requires further evaluation in clinical practice. Figure 2 schematically shows the technologies explored.
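An index that rises when one glycan increases relative to another is naturally expressed as a log ratio of relative peak abundances. The Python sketch below illustrates the idea under stated assumptions: the text does not give the exact formula used by the Glyco Liver Profile, so the base-10 log ratio, the function name, the `other` peak and all peak heights are hypothetical.

```python
import math


def glycan_log_ratio(peak_heights, numerator="NGA2F", denominator="NA2"):
    """Log10 ratio of two glycan peaks after normalizing to total signal.

    peak_heights: dict mapping peak name -> electropherogram peak height.
    Normalization to the total makes profiles comparable between runs; it
    cancels in the ratio itself but is kept to mirror relative-abundance
    reporting.
    """
    total = sum(peak_heights.values())
    relative = {name: height / total for name, height in peak_heights.items()}
    return math.log10(relative[numerator] / relative[denominator])


# Synthetic profiles in arbitrary units (hypothetical, for illustration).
profile_low = {"NGA2F": 120.0, "NA2": 480.0, "other": 400.0}
profile_high = {"NGA2F": 240.0, "NA2": 240.0, "other": 520.0}

print(round(glycan_log_ratio(profile_low), 3))   # log10(0.25), about -0.602
print(round(glycan_log_ratio(profile_high), 3))  # log10(1.0) = 0.0
```

A log scale treats a doubling and a halving of the ratio symmetrically, which is convenient when a single cut-off is later applied to the index.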

Strengths and limitations
In this review, we discuss the use of three novel high-throughput methods for the discovery of biomarkers in chronic liver disease. There are several pros and cons to be mentioned. One of the most notable advantages is the possibility to perform proteomics and metabolomics in a more user-friendly and low-cost format when compared with more traditional methods (LC-MS or protein arrays, and LC-MS/GC-MS, respectively). Additionally, the case of glycomics technology, which has been simplified for commercial use on routine capillary electrophoresis systems (i.e., Glyco Liver Profile), demonstrates how scientific translation can bring new inventions to the market in a user-friendly and cost-effective format [31]. In addition, applications of the SOMAscan platform are becoming available for use in clinical practice [22] [Table 1].
Moreover, these methodologies may be suitable for a number of candidate biomarkers present in blood, urine, serum and tissue samples in order to target and extrapolate multiple analytes from single sample extractions. Additionally, these types of technologies are scalable at different levels. For example, the SOMAscan® platform collects thousands of readings from large cohorts of patients with different pathologies, generating virtual proteomes for disease prediction and diagnosis and providing the capability to read high-throughput and multiplex biomarker signatures for clinical diagnosis and prognosis. This "scalability factor" facilitates the possibility to utilize computational power and machine learning, pattern recognition and data analytics to develop and simplify disease pathology interpretation from complex datasets.
Figure 2. Schematic representation of underlying principles in the technologies. A: In SOMAscan® technology, SOMAmer reagents (single-stranded DNA with modified nucleotides), containing three tags (a fluorophore, a photocleavable linker and biotin), bind to streptavidin beads in the assay wells. Then, the sample (i.e., plasma) is added and SOMAmer-plasma protein complexes are formed. Unbound proteins are washed away, and complexes are tagged with biotin; the addition of polyanionic competitors (negatively charged) breaks up the unspecific complexes. The suspension is then exposed to UV light, which causes the breakdown of photocleavable linkers, releasing all SOMAmer complexes. Streptavidin beads are then used to capture the biotinylated SOMAmer-protein complexes, and SOMAmer reagents are removed from beads with denaturing buffer. SOMAmer reagents are hybridized to complementary sequences on a microarray chip and quantified by fluorescence. Figure adapted from Rohloff et al. [21]; B: in SERS label-free technology, the biological sample is loaded on the SERS surface (a surface containing silver or gold nanoparticles). After 25-30 min of sample drying, the SERS spectra are collected during 10 s with a Raman spectrometer (equipped with a laser at 785 nm, the optimal wavelength for blood fraction analysis) connected to optical microscopy; C: in the Glyco Liver Profile technology, protein samples are denatured and treated with buffers containing specific enzymes to reach glycosylation sites. Glycans are released by the enzymatic treatment and labeled for further visualization. Then, the samples are loaded onto the fully automated V8 Nexus machine for separation by capillary electrophoresis (glycans are separated based on the difference in their mass to charge ratio). Data analysis is performed on the electropherograms of four key N-glycans, and liver profile indices are obtained.
To overcome the immediate limitations, significant improvements would need to be made to the consistency in the acquisition of quantitative data. For example, the technologies discussed here measure relative concentrations using external controls [38,39]. Without internal controls and standard curves, there would be a degree of uncertainty in relation to measurements within the linear dynamic range due to batch or plate effects outside of standard parameters. In the case of SERS technology, improvements could be made to the spectral band assignment for reliable interpretation of biomedical SERS. For example, standard operating protocols and a unified library of reference biomaterials would facilitate repeatability and automation in data acquisition and analysis [39]. Current advancements in automated microfluidics, lab-on-a-chip devices and machine learning integrated with SERS would expand the potential and development of applications to create a diagnostic and therapeutic toolbox for clinicians working with complex pathologies.
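The batch or plate effects mentioned above are commonly handled by rescaling each plate against a shared control sample. The sketch below illustrates one simple variant, median scaling against a calibrator run on every plate, in Python; it is a generic illustration under assumed names and synthetic values, not the normalization procedure of any platform discussed in this review.

```python
from statistics import median


def plate_scale_factors(calibrator_signals):
    """Per-plate scale factors from replicate readings of one calibrator.

    calibrator_signals: dict mapping plate id -> list of calibrator signals.
    Each plate is scaled so that its calibrator median matches the median
    of all plate medians (the reference level).
    """
    plate_medians = {plate: median(values) for plate, values in calibrator_signals.items()}
    reference = median(plate_medians.values())
    return {plate: reference / m for plate, m in plate_medians.items()}


def normalize(signal, plate, factors):
    """Rescale a raw signal measured on a given plate."""
    return signal * factors[plate]


# Synthetic calibrator replicates in relative fluorescence units.
calibrators = {
    "plate_A": [100.0, 102.0, 98.0],
    "plate_B": [120.0, 118.0, 122.0],
    "plate_C": [110.0, 111.0, 109.0],
}
factors = plate_scale_factors(calibrators)
print(factors)
print(normalize(120.0, "plate_B", factors))  # close to the 110.0 reference
```

Scaling of this kind removes multiplicative plate offsets but still yields relative values; absolute quantitation would additionally require standard curves, as noted in the text.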

CONCLUSIONS AND FUTURE DIRECTIONS
NAFLD is a multi-system disease underpinned by alternating networks of genes, proteins, metabolites, glycans and hormones, which are also affected by personalized health conditions, dietary habits and environmental factors. The currently available diagnostic methods are inefficient in depicting the real ongoing situation in the liver, mainly because they all provide a "static picture" of a moment in time. By no means is this useful in predicting the progression/regression of NAFL/NASH. Thus, the real challenge for the future is to develop techniques that can provide a "dynamic movie" for understanding the ongoing disease pathogenesis.
In the search for non-invasive discriminators of NAFL/NASH and the associated fibrosis stage with optimum histological accuracy, no single biomarker is emerging as the optimum solution. On the other hand, thanks to the application of high-throughput technologies in genomics, proteomics and metabolomics (glycomics, lipidomics, etc.) which generate an enormous quantity of data, panels of biomarkers or disease marker signatures are emerging as part of the next generation in diagnostic tools. Nevertheless, the clinical utility and cost-effectiveness of these new tests will need to be proven against available biomarkers and scanning modalities, although it is encouraging that these specialist tests are becoming available on routine chemistry systems and POC platforms.
In conclusion, there is a need to rethink the approach by accepting the use of a more efficient, complex and heterogeneous strategy by integrating all the available resources and methodologies. To tackle this ambitious issue, we need a way to consolidate and analyze reliable and complex datasets, while taking advantage of the "connectivity" and "integration" of systems and data powered by artificial intelligence, to delineate, target and measure NAFL/NASH disease for diagnostic and prognostic objectives. Following this logic, patients would be stratified according to their metabolic, genomic, proteomic, lipidomic and microbiotic changes, providing a more precise and "personalized" picture of patient health captured comprehensively as a moving target through time. If we accept this challenge, perhaps healthcare systems could be mobilized to connect IVD, big pharma, laboratory infrastructures and patients through primary care -including access to untapped longitudinal patient datasets -so that technology translation and therapy development move symbiotically in an accountable framework with the broader goal of creating predictable disease pathways and personalized treatment strategies.