## INTRODUCTION

Keratoconus is a corneal disorder in which the central or paracentral cornea undergoes progressive steepening, causing irregular astigmatism, corneal scarring, and an associated deterioration of vision. Its incidence has been estimated at 1:2,000 in the general population, with a prevalence of 54:100,000 in the USA and 229:100,000 in some Asian countries.^{1,2}

Early grades of the disease are still a diagnostic challenge for clinicians because the patient does not show symptoms or clear clinical evidence of the disease. The distance corrected visual acuity is normal or might be minimally diminished due to the development of higher-order aberrations. The clinical diagnosis of these cases can only be confirmed by corneal topography as they may show a diversity of atypical topography changes.^{2,3}

Identifying and distinguishing such abnormal early cases is a major concern for refractive surgeons because these corneas constitute a main risk factor for ectasia after laser-assisted *in situ* keratomileusis.^{4} The distinction is difficult because no clear threshold has been defined between the earliest grades of keratoconus and a normal cornea, owing to the wide variability of each corneal parameter. However, recent investigations suggest that different corneal parameters change together when keratoconus develops, even without noticeable changes in best corrected vision.^{5-7} Because visual symptoms are absent in the early grades of the disease, most such cases are identified during preoperative screening for corneal refractive surgery.

Therefore, for both diagnostic and therapeutic purposes, early keratoconus cases with normal visual acuity offer a unique opportunity to analyze the changes in clinical profile that distinguish them from the normal cornea, as well as the further evolution of that profile toward visual deterioration. Pattern recognition analysis of early keratoconus with normal vision makes it possible to better recognize not only the early changes that characterize the development of keratoconus but also the patterns that make its evolution toward visual loss, with the consequent therapeutic indications, more likely.

The aim of this study is to use, for the first time, pattern recognition analysis to ascertain the clinical profile of eyes with early keratoconus but normal levels of vision, taking into consideration multiple clinical variables that are usually available in most regular ophthalmology offices.

## MATERIALS AND METHODS

### Subjects

The investigation was designed as a multicenter observational comparative study. It included a total of 995 eyes divided into two groups. The keratoconus group comprised 625 eyes of 451 subjects diagnosed with early keratoconus and normal visual acuity (spectacle CDVA ≥ 0.9, 0.05 LogMAR), with a mean age of 36.07 ± 10.54 (13–70) years. The control group consisted of 370 healthy eyes of 210 subjects with a mean age of 34.21 ± 7.18 (21–54) years. Of the total, 421 (42.3%) were women and 574 (57.7%) were men. No statistically significant difference in age was observed between the study groups (p > 0.05).

Cases with any other ocular comorbidity that could affect any of the study variables were excluded, as were cases that had undergone ocular surgical procedures. Contact lens use was discontinued 4 weeks before the date of the corneal topography study.

The cases were recruited from five ophthalmology centers in Spain: Vissum Instituto Oftalmologico de Alicante, Instituto of Ophthalmobiologia Aplicada (IOBA), Novovision Clinic, Barraquer Ophthalmology Center, and University of Navarra.

The inclusion criteria for the keratoconus group were: subjects diagnosed with early keratoconus,^{8} CDVA ≥ 0.9 on the decimal scale, or 0.05 LogMAR, no slit-lamp findings, no or doubtful scissoring on retinoscopy, and presence of irregular topography defined as asymmetric bowtie (AB), inferior steepening (IS), skewed radial axis (SRAX), or AB/SRAX patterns.

The cases in the control group were selected at random from subjects that were evaluated for corneal refractive surgery. Data from these cases were taken in the presurgery appointments.

This observational study was conducted in accordance with the ethical standards stated in the Declaration of Helsinki and approved by the institutional ethical board committee of each one of the centers participating in this investigation with informed consent obtained.

The data used for this investigation correspond to the official database “Iberia” of keratoconus cases created for the purpose of the multicenter study of keratoconus (National Network for Clinical Research in Ophthalmology RETICS-OFTARED, sponsored by Instituto Carlos III).

### Clinical Examination Protocol

A complete and uniform ophthalmologic examination was performed in all cases following the same study protocol. The examination included LogMAR uncorrected distance visual acuity (UDVA), LogMAR spectacle CDVA, manifest refraction (sphere and cylinder), slit-lamp biomicroscopy, Goldmann tonometry, fundus evaluation, ultrasonic pachymetry (DGH 500 US pachymeter, DGH Technology, Inc.), and corneal topographic analysis. Topographic data were collected at the five centers using three different corneal topography systems: the CMS 100 Topometer (G. Rodenstock Instrument GmbH, Ottobrunn, Germany), CSO (CSO, Firenze, Italy), and the Orbscan IIz system (Bausch & Lomb, Rochester, New York, USA). The first two devices are Placido-based systems, and the Orbscan IIz is a combined scanning-slit and Placido disk topography system. Although the agreement between these specific devices has not been reported, Orbscan and Placido-based devices have been shown to provide similar accuracy and precision on calibrated spherical test surfaces.^{9} The following topographic data were evaluated and recorded with the three corneal topographic devices:

*Simulated keratometry (SIM K):* The simulation of the readings that would be obtained with a keratometer, i.e., the mean sagittal curvature from the 4th to the 8th Placido ring. Simulated keratometry values are available for the principal meridians: SIM K1 and SIM K2.

*Keratometry (meridians at 3 mm):* Corneal dioptric power in the flattest meridian for the 3 mm central zone (K1 3 mm), corneal dioptric power in the steepest meridian for the 3 mm central zone (K2 3 mm), and mean corneal power in the 3 mm zone (mean K 3 mm).

*Asphericity:* Mean asphericity in a 4.5 mm diameter corneal area and mean asphericity in an 8 mm diameter corneal area.

*Astigmatism:* Mean astigmatism in a 3 mm diameter corneal area and mean astigmatism in a 5 mm diameter corneal area.

*Anterior corneal surface aberrometry:* Total root mean square (RMS), RMS error of primary astigmatism, primary spherical aberration, primary coma aberration, and RMS error for 3rd to 6th order Zernike polynomials, as well as spherical-like, coma-like, and high-order aberrations, were acquired at 6 mm.

*Corneal indexes:* The inferior-superior index (I-S), apical gradient curvature (AGC), apical curvature (AK), and surface asymmetry index (SAI) were obtained. I-S quantifies the curvature asymmetry between the two sides of the horizontal meridian; for this index, a diagnosis of keratoconus is considered when the difference between the two sides is greater than 1.8 D. AGC represents the average variation of corneal power per unit of length, taking the apical power as reference; values of AGC between 1.5 and 2 D/mm are considered suspect, and values higher than 2 D/mm are considered diagnostic of keratoconus. AK represents the power of the cornea at its apex; values between 48 and 50 D are considered suspect for keratoconus, and values higher than 50 D are considered diagnostic of keratoconus. The SAI is the average difference between corresponding points 180° apart on 128 equally spaced meridians. A spherical surface corresponds to SAI = 0, so the higher the SAI, the higher the asymmetry.
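The cut-off values quoted above for I-S, AGC, and AK can be sketched as a small screening helper. This is only an illustration: the function names, the label strings, and the treatment of values lying exactly on a boundary are assumptions, since the text gives only "between" and "higher than", and the example eye is hypothetical.

```python
def grade_by_cutoffs(value, suspect_from, keratoconus_above):
    """Grade one index value against a suspect band and a diagnostic cut-off.

    Assumption: values exactly on a boundary are graded as 'suspect'.
    """
    if value > keratoconus_above:
        return "keratoconus"
    if value >= suspect_from:
        return "suspect"
    return "below cut-off"


def screen_indexes(i_s, agc, ak):
    """Apply the quoted cut-offs to I-S (D), AGC (D/mm), and AK (D)."""
    return {
        # I-S: diagnostic above 1.8 D; no suspect band is given in the text.
        "I-S": "keratoconus" if i_s > 1.8 else "below cut-off",
        # AGC: 1.5 to 2 D/mm suspect, above 2 D/mm diagnostic.
        "AGC": grade_by_cutoffs(agc, 1.5, 2.0),
        # AK: 48 to 50 D suspect, above 50 D diagnostic.
        "AK": grade_by_cutoffs(ak, 48.0, 50.0),
    }


# Hypothetical eye, values chosen only to exercise each branch.
result = screen_indexes(i_s=2.1, agc=1.7, ak=47.2)
print(result)
```

Each index is graded independently here; the study itself combines many variables through PCA and discriminant analysis rather than relying on single cut-offs.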

### Study of Corneal Biomechanics

Corneal biomechanics were evaluated in a group of patients using the Ocular Response Analyzer (Reichert Inc, Depew, NY). This device provided the study parameters corneal hysteresis and corneal resistance factor, both associated with the resistance of the cornea to deformation.^{10}

### Statistical Analysis

Because of the large number of variables involved in the clinical examination of keratoconus (visual, topographic, aberrometric, etc.), the pattern recognition analysis was performed by combining two statistical methods: principal component analysis (PCA) and discriminant analysis.

For this purpose, variables that were related to each other were grouped into factors or components. From the factors obtained, the study groups (normal *vs*. keratoconus grade I) were differentiated by performing a discriminant analysis. A discriminant function was obtained that assigns each patient to the corresponding group. The Statistical Package for the Social Sciences for Windows software package (version 23.0; SPSS, Inc.) was used for the statistical analysis. Principal component analysis belongs to a group of techniques that take high-dimensional data and use the dependencies among the variables to represent the data in a more tractable, lower-dimensional form without losing too much information. It is one of the simplest and most robust ways of performing such dimensionality reduction.

In order to summarize the p-dimensional vectors, they have to be projected down into a q-dimensional subspace. The summary will be the projection of the original vectors on to q directions, the principal components, which span the subspace.

There are several equivalent ways of deriving the principal components mathematically. The simplest is to find the projections that maximize the variance. The first principal component is the direction in space along which projections have the largest variance. The second principal component is the direction that maximizes the variance among all directions orthogonal to the first. The kth component is the variance-maximizing direction orthogonal to the previous k-1 components. There are p principal components in all.

Rather than maximizing the variance, it could be more plausible to look for the projection with the smallest average (mean-squared) distance between the original vectors and their projections on to the principal components; this turns out to be equivalent to maximizing the variance.
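The equivalence between maximizing the projected variance and minimizing the mean-squared distance can be checked numerically. The sketch below is a minimal illustration, not the SPSS procedure used in the study: it finds only the first principal component, uses power iteration on the sample covariance matrix (an assumption chosen to keep the example dependency-free), and runs on a small hypothetical two-variable data set.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, variance, centered data, covariance matrix).

    rows: list of observations, each a list of p numeric values.
    The direction is found by power iteration on the covariance matrix.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    w = [1.0] * p  # arbitrary non-zero starting direction
    for _ in range(iterations):
        w = [sum(cov[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    # Variance of the projections onto the unit vector w.
    variance = sum(w[a] * cov[a][b] * w[b] for a in range(p) for b in range(p))
    return w, variance, centered, cov


# Hypothetical data: two strongly correlated variables.
rows = [[2.0, 1.9], [3.0, 3.2], [4.0, 3.8], [5.0, 5.1], [6.0, 6.0]]
w, variance, centered, cov = first_principal_component(rows)

total_variance = sum(cov[j][j] for j in range(len(cov)))

# Mean-squared distance between each centered vector and its projection on w.
residual = sum(
    sum((c[j] - sum(c[k] * w[k] for k in range(len(w))) * w[j]) ** 2
        for j in range(len(w)))
    for c in centered
) / (len(rows) - 1)

print("direction:", w)
print("projected variance:", variance)
print("residual:", residual, "= total - projected:", total_variance - variance)
```

Because the residual equals the total variance minus the projected variance, the direction that maximizes one necessarily minimizes the other.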

The main factors obtained contain most of the observed variance and avoid including redundant information. For this to happen, the variables have to be correlated with each other, indicating that they can be expressed as a linear combination. The larger the variance incorporated into a component, the greater the amount of information it contains. The mathematical process by which the major components of a sample are chosen starts from a correlation matrix. By applying the corresponding factor analysis to this correlation matrix, the factor matrix is extracted. Each column of this last matrix represents a factor, and the rows correspond to the observed variables.

First, the Bartlett sphericity test was applied, and its result indicates that a factor model is suitable (χ^{2} = 18,586.060; p < 0.001). Moreover, the Kaiser-Meyer-Olkin (KMO) measure indicates that the sample is adequate for this factor analysis (KMO = 0.811).
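For reference, the two suitability checks are usually written as follows. This is the standard textbook form, not something stated by the authors. The symbols are assumptions introduced here: n is the number of complete observations, p the number of variables, R the correlation matrix, r_ij the pairwise correlations, and a_ij the partial correlations.

```latex
% Bartlett's test of sphericity (null hypothesis: R is the identity matrix)
\chi^{2} = -\left[(n - 1) - \frac{2p + 5}{6}\right]\ln\lvert R\rvert,
\qquad \text{df} = \frac{p(p - 1)}{2}

% Kaiser-Meyer-Olkin measure of sampling adequacy
\text{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                  {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}}
```

A KMO value close to 1 means the partial correlations are small relative to the raw correlations, which is the situation in which a factor model is expected to work well.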

The rotation method used in this PCA was the varimax method. A varimax rotation is an orthogonal rotation, meaning that it results in uncorrelated components. Compared with some other types of rotations, a varimax rotation tends to maximize the variance of a column of the factor pattern matrix (as opposed to a row of the matrix). To show the eigenvalues associated with a component or factor in descending order *vs* the number of the component or factor, a scree plot is used. Scree plots can be used for the PCA and factor analysis to visually assess which components or factors explain most of the variability in the data. The ideal pattern in a scree plot is a steep curve, followed by a bend, and then a flat or horizontal line.

The Mann-Whitney U test was applied to determine whether there were statistically significant differences between the study groups in each of the factors. In all cases, differences were considered statistically significant when the p value was less than 0.05.

Receiver operating characteristic (ROC) curves were constructed in order to determine which factors or components discriminate better between the groups.
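As an illustration of what the area under a ROC curve measures, the AUC can be computed directly as the probability that a randomly chosen keratoconus eye has a higher factor score than a randomly chosen control eye. This is the Mann-Whitney U statistic divided by the product of the group sizes. The sketch below assumes that higher scores indicate disease. The function name and the factor scores are hypothetical, not study data.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair.
    """
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical factor scores for five keratoconus and five control eyes.
keratoconus_scores = [1.2, 0.8, 0.3, 1.9, -0.1]
control_scores = [-0.5, 0.1, -1.1, 0.4, -0.7]

auc = auc_from_scores(keratoconus_scores, control_scores)
print(auc)  # 22 of the 25 pairs are ranked correctly -> 0.88
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.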

Finally, to complete the pattern recognition identification, a discriminant analysis was performed to identify the characteristics that differentiate the study groups and to create a function able to distinguish the members of either group as accurately as possible. Discriminant analysis is a multivariate technique that takes advantage of the relationships among large numbers of independent variables to maximize the capacity of discrimination. Its aim is to find a linear combination of independent variables that allows the study groups to be discriminated. Once such a combination is found, it may be used to classify new cases. The stepwise inclusion method based on Wilks’ lambda was used in this study, with classification by Snedecor's F criterion. According to the canonical correlation (η = 0.624), which is moderately high, the obtained function discriminates acceptably, and Wilks’ lambda (Λ = 0.611, p < 0.001) indicates that the function is significant.
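The two reported statistics can be cross-checked against each other. Assuming the usual two-group case with a single discriminant function, Wilks' lambda (written Λ here) and the canonical correlation η satisfy Λ = 1 − η². This relation is standard but is not stated by the authors. With the reported value of η, it reproduces the reported Wilks' lambda:

```latex
\Lambda = 1 - \eta^{2} = 1 - (0.624)^{2} = 1 - 0.389376 = 0.610624 \approx 0.611
```

In other words, about 39% of the variance of the discriminant scores is explained by group membership.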

The numbers of keratoconus and normal cases that were confirmed by the new pattern recognition profile were used to calculate the sensitivity and specificity of this newly created diagnostic approach. The sensitivity of the pattern recognition test refers to its ability to correctly identify patients with keratoconus (true positives/[true positives + false negatives]). The specificity refers to its ability to correctly identify patients without keratoconus (true negatives/[true negatives + false positives]).

## RESULTS

### Principal Component Analysis

Of the 995 eyes collected retrospectively, 318 eyes (184 from the control group and 134 from the keratoconus group) had data for all the variables collected for the model, which is an essential requirement for applying the PCA. The following variables were excluded from the analysis because too few cases in both groups had them recorded: position of the corneal irregularity, corneal hysteresis, corneal resistance factor, and corneal thickness at different positions (central, nasal, temporal, superior, and inferior). In addition, variables with communalities below 0.6 were excluded during the procedure. The communality expresses the part of the variability of each variable that can be explained by the combination of factors or components extracted in the analysis. A value above 0.6 means that more than 60% of the variance of the item is explained by the set of factors.
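With orthogonal factors, such as those produced by a varimax rotation, the communality of a variable is the sum of its squared loadings on the retained factors. The sketch below illustrates the 0.6 exclusion rule. The variable names and loadings are hypothetical and are not taken from Table 1.

```python
def communality(loadings):
    """Sum of squared loadings of one variable across the retained factors."""
    return sum(value * value for value in loadings)


# Hypothetical loadings of two variables on five orthogonal factors.
hypothetical_loadings = {
    "variable_a": [0.85, 0.20, 0.10, 0.05, 0.10],
    "variable_b": [0.40, 0.30, 0.35, 0.20, 0.10],
}

# Keep only variables whose communality reaches the 0.6 threshold.
kept = [name for name, loadings in hypothetical_loadings.items()
        if communality(loadings) >= 0.6]

for name, loadings in hypothetical_loadings.items():
    print(name, round(communality(loadings), 4))
print("kept:", kept)
```

Here `variable_a` (communality about 0.785) is retained, while `variable_b` (about 0.4225) would be dropped.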

Applying the PCA with varimax rotation generated a total of five factors, which together explain 85.51% of the total variability. This value can be considered high enough to regard five as a sufficient number of factors. The scree plot of the components (Graph 1) is commonly used as a graphical check on the number of components to retain.

##### Table 1

##### Table 2

According to this criterion, all the components located before the sedimentation zone of the plot are retained. This zone, in which the components no longer present steep slopes, begins at the fifth component or factor.

Table 1 shows the results for each factor. The coefficients give the correlation of each variable with the factor and range from −1 to 1. Values from −1 to 0 imply an inverse relationship, that is, an increase in the variable is associated with a decrease in the factor. Values from 0 to 1 imply a positive relationship, in which an increase in the variable is associated with an increase in the factor.

Factor 1. This component explains 39.80% of the total variance, notably occupying the first place compared with other components presented. This component comprises all attributes characterizing the overall nonorthogonal shape irregularity of the anterior corneal surface.

Factor 2. This factor contains a total of six variables, which are generated from the radius of curvature of the cornea providing clinicians with the necessary information about the orthogonal components of shape. This component explains 16.40% of the total variance.

Factor 3. This factor includes variables that provide information about the terms related to uncorrected dioptric power. This component explains 10.91% of the total variance.

Factor 4. Three variables comprise this factor: CDVA, LogMAR CDVA, and lines of CDVA. All these variables represent the classification of keratoconus according to the terms related to corrected vision. This component explains 9.71% of the total variance.

Factor 5. The variables that belong to this component refer to the terms related to orientation of orthogonal shape. This explains 8.71% of the total variance.
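The share of variance retained by the five factors can be followed cumulatively from the percentages reported above. The snippet is a simple arithmetic check. The per-factor values are copied from the text, and the variable names are illustrative.

```python
# Percentage of total variance explained by each factor, as reported above.
explained = {
    "Factor 1": 39.80,
    "Factor 2": 16.40,
    "Factor 3": 10.91,
    "Factor 4": 9.71,
    "Factor 5": 8.71,
}

cumulative = []
running = 0.0
for name, percent in explained.items():
    running += percent
    cumulative.append((name, round(running, 2)))

for name, value in cumulative:
    print(f"{name}: cumulative {value:.2f}%")

total = cumulative[-1][1]
```

The rounded per-factor values add up to 85.53%, marginally above the reported total of 85.51%. The difference is presumably due to rounding of the individual percentages.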

### Receiver Operating Characteristic Curves

Receiver operating characteristic curves were constructed for the resulting factors in order to determine the discriminating capacity of each. The curves are presented in Graph 2. Factors 1 and 3 presented the highest areas under the curve (AUC) and therefore the greatest ability to discriminate between the study groups. Factor 1 had the greatest AUC, 0.90. The test was not significant for factors 4 and 5 (Table 2); therefore, these factors are not capable of discriminating between the groups.

### Discriminant Analysis

The standardized coefficients indicate, like the ROC curves, that factors 1 and 3 have the greatest discriminating capacity. These factors have the highest function coefficients, 0.924 and 0.521, respectively. This confirms that the variables of corneal shape and profile and the variables that refer to the visual and refractive condition are those that best characterize and differentiate the groups studied. Each case can be classified in two ways, according to the value obtained in each of these discriminant functions. The classification functions (Formula A) assign each case to the group that obtains the highest score. From the classification functions, Fisher's discriminant function can be obtained (Formula B), which classifies each case depending on whether the result is positive or negative. If the result is negative, the individual is classified in the control group, and if the result is positive, the individual is classified in the keratoconus group.

Table 3 shows the classification results. From a total of 318 cases, 275 were correctly classified across the control and keratoconus groups, representing 86.50%. In the control group, 179 of the 184 cases analyzed were correctly classified by this system. In the keratoconus group, 96 of 134 cases were correctly classified. Accordingly, the sensitivity of the new pattern recognition approach was 71.6%, while the specificity was 97.3%.
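The reported sensitivity, specificity, and overall accuracy follow directly from the classification counts. The sketch below recomputes them with the definitions given in the Statistical Analysis section. The function name is illustrative, and the false negative and false positive counts are derived by subtraction from the group totals.

```python
def diagnostic_summary(tp, fn, tn, fp):
    """Return sensitivity, specificity, and accuracy as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts from the classification results reported in the text.
keratoconus_total, keratoconus_correct = 134, 96
control_total, control_correct = 184, 179

sensitivity, specificity, accuracy = diagnostic_summary(
    tp=keratoconus_correct,
    fn=keratoconus_total - keratoconus_correct,  # 38 keratoconus eyes missed
    tn=control_correct,
    fp=control_total - control_correct,          # 5 controls misclassified
)

print(f"sensitivity {sensitivity:.1f}%")  # 71.6%
print(f"specificity {specificity:.1f}%")  # 97.3%
print(f"accuracy    {accuracy:.1f}%")     # 86.5%
```

The asymmetry between the two figures shows that the function rarely flags a normal cornea but misses roughly one in four early keratoconus eyes.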

## DISCUSSION

In this work, we report, for the first time in a large number of cases, the use of statistical methods to recognize the pattern that best identifies early keratoconus with normal vision and differentiates it from normal corneas, based on clinical data usually available in a normal, up-to-date ophthalmology office. Pattern recognition analysis has become an important tool for identifying disease and, recently in ophthalmology, in areas such as intraocular lens calculation. The advantage of pattern recognition is that it reduces the variability created by less relevant variables, making the process dependent only on those variables that are strong enough to support the model. Pattern recognition analysis requires a high volume of demographic data from the disease or problem group. Thus, this type of approach may be beneficial for a disease like keratoconus, which is considered to be a rare disease.^{1} However, although it is still considered a rare disease, it is diagnosed far more frequently today than before, owing to the progress of corneal diagnostic imaging technology and, especially, to its routine application to candidates for refractive surgery who have no visual symptoms and normal vision but harbor the initial stages of the disease. Such initial cases are considered a major contraindication for corneal refractive surgery. The differentiation between normal and abnormal patterns of corneal topography is today well supported by the definition of different corneal topography and corneal aberrometry indexes and by corneal volumetric and densitometry data, among others.^{11} However, there is still a gray zone between borderline cases of keratoconus and normal corneas predisposed to the development of keratoconus. 
In such cases, the study of corneal epithelial thickness has been shown to identify abnormal corneas with normal topographies in the apparently normal contralateral eyes of patients with confirmed keratoconus in the other eye.^{12} However, the main issue in patients with initial keratoconus is their predisposition to develop visual loss, which may eventually evolve into severe visual disability. In this study, we set out to identify the differences between a large number of confirmed cases of keratoconus with normal CDVA and normal patients, also with normal vision, across a large number of variables analyzed by discriminating biostatistical methods not used so far in the analysis of keratoconus. The aim of the present study is to identify, by means of PCA, a number of variables that allow us to accurately differentiate between normal and keratoconic grade I corneas with normal levels of CDVA.

##### Table 3

Principal component analysis is applicable and appropriate when you have obtained measures on a number of observed variables and wish to develop a smaller number of artificial variables (called principal components) that will account for most of the variance in the observed studied factors. A principal component can be defined as a linear combination of optimally weighted observed variables. The principal components may then be used as predictor or criterion variables in subsequent analyses. The aim of performing a PCA is to investigate the possible existence of a set of functions that enable a more efficient method to differentiate and distinguish early forms of keratoconus.

This analysis has been previously used in different research areas. Specifically, in studies related to keratoconus it has been used to analyze the visual impact generated in patients with this condition,^{13} to classify keratoconus eyes into subgroups that have similar high-order aberrations,^{14} or to explain and find genetic associations in patients with keratoconus.^{15}

Our study focuses on the detection of the early stages of keratoconus. These stages were originally graded with a classification based on the degree of visual impairment, namely the level of corrected vision.^{8} In this grading system, grade I represents the mild form of the disease, where there is a thin line between normal and pathological corneas. This grade is represented by patients with minimal visual limitation (CDVA ≥ 0.90). Early detection of keratoconus is a key point in the prevention of iatrogenic ectasia in patients undergoing corneal refractive surgery. Thus, several investigations have focused on developing procedures and techniques to improve the detection of the early corneal changes that could trigger biomechanical alterations with severe consequences for the quality of life of those patients. Some of these investigations performed specific statistical analyses, such as ROC curves, in order to assess the level of discrimination between normal and pathological patients and to provide cut-off values that describe the disease for each of the parameters studied. Multiple variables have been studied, such as corneal elevation data, pachymetry, and aberrometry, among others. In the study conducted by Nilforoushan et al. using the Pentacam (Oculus) and Orbscan IIz (Bausch & Lomb), the authors concluded that posterior elevation was one of the most powerful tools for the detection of subclinical keratoconus, even though the ROC analysis in their study reached an AUC of 0.830.^{16} In another study, Smadja et al^{17} compared a reference spherical surface with a reference aspherotoric surface, aiming to improve the capacity to discriminate between normal eyes and eyes with keratoconus fruste. They reached a sensitivity and specificity of 82.00% and 80.00%, respectively. 
In addition, Prakash et al^{18} have proposed cut-off values of pachymetry in keratoconic patients using Orbscan IIz system. Their results suggest that a cornea with minimal pachymetry <461 μm or a difference between the minimum pachymetry and corneal central thickness greater than 27 μm has 97.50% risk of being suspected keratoconus or clinical keratoconus, and therefore, only 2.50% are likely to be normal.

Other diagnostic methods have relied on geometric modeling to study the volume and the surfaces of the cornea. In a recent study conducted by our research group, the difference between normal and mild keratoconic patients was analyzed using three-dimensional computer modeling.^{19} In that study, one of the variables under analysis, the posterior apex deviation, showed a high capability to discriminate between the study groups, with an AUC of 0.89.

In the current study, we performed a singular statistical analysis based on the multifactorial evaluation of 27 clinical variables normally available at any up-to-date ophthalmology office, grouped into five different factors, in order to differentiate between normal and keratoconic grade I eyes. The five factors, each grouping variables that correlated with one another, were able to characterize and differentiate between the study groups. Additionally, three of these factors (factors 1, 2, and 3) showed a high discriminating power between the groups under analysis. Moreover, factor 1 was the best indicator for differentiating between normal and keratoconic corneas, reaching an AUC of 0.90. In our study, factor 1 was mainly represented by variables related to the shape of the cornea and the aberrometric coefficients. Thus, the highly discriminating capability of factor 1 is consistent with what has been published in the scientific literature regarding the importance of aberrometric analysis in the early diagnosis of keratoconic patients.^{20}

In the present investigation, coefficients of the discriminant function were also developed in order to determine in which of the study groups a patient should be included, depending on the values of the clinical variables under analysis. Thus, when the formulas presented are applied, a patient can be classified as normal or as having keratoconus grade I depending on the result obtained. As previously mentioned, this approach allowed us to correctly distinguish between normal and pathological corneas in more than 86% of the cases. Although it is clearly not practical to apply all of these mathematical formulas during daily clinical examinations, these algorithms may be included in the software of diagnostic instruments in order to increase accuracy and reliability in refractive surgery screening and in the follow-up of patients with corneal ectasia.

Principal component analysis provides a very good approach for characterizing keratoconus patients and differentiating them from the normal population. Nevertheless, this kind of analysis has some limitations. The main disadvantage is that the statistical functions require a large sample (statisticians recommend 10 times the number of variables under study) and complete data for every case; otherwise, the calculations cannot provide an adequate result. Another drawback of this study is that we did not evaluate the posterior surface of the cases under analysis, which might indeed have increased the sensitivity and specificity of the diagnostic test. The outcomes of this analysis in terms of diagnostic sensitivity and specificity might be considered modest when compared with other methods. However, the advantage of the method described here is that it is applicable in any normally equipped ophthalmology office and is not restricted to a particular software or corneal topography technology.

In summary, pattern recognition analysis represents a powerful tool for characterizing normal subjects and differentiating them from patients with early morphological changes and without significant visual impairment. The factors related to the shape of the cornea and the aberrometric coefficients are the ones that discriminate best between normal and keratoconic patients. Finally, even though several mathematical calculations are needed in order to obtain a proper result, these factors may be included in diagnostic topography tools in order to increase the sensitivity and specificity of the diagnosis in corneal refractive surgery candidates and in the follow-up of patients with corneal ectatic disorders. The evolution of the pattern profile found here to be associated with normal vision in early keratoconus may also be related to the progression in the severity of the disease, and further investigations may reveal as yet unknown associations related to the prognosis of visual loss.

We have demonstrated in the present report, based on multicenter data, that pattern recognition analysis using examination variables usually available in a normal ophthalmology office differentiated normal corneas from early keratoconic corneas without significant visual impairment, with adequate diagnostic sensitivity and even better specificity. The advantage of this method may be its availability to practicing ophthalmologists who use regular office equipment and, furthermore, that it is not limited to a specific commercial software or corneal topography technology. Further investigation of this diagnostic approach is required to characterize not only the early stages of keratoconus but also its progression along the natural clinical evolution of the disease toward visual loss.