Application of topological and physicochemical descriptors: QSAR study of phenylamino-acridine derivatives

In the present investigation the applicability of various topological parameters is tested for the QSAR study of phenylamino-acridine derivatives. For the modeling of the DNA binding affinity of phenylamino-acridine derivatives, regression analysis shows that even in mono-parametric correlations the topological and physicochemical parameters give significant regression coefficients. Furthermore, combining topological and physicochemical parameters with indicator parameters yields a marked improvement in the statistics. The results are critically discussed on the basis of regression data and cross-validation parameters.


Introduction
Quantitative structure-activity relationship and quantitative structure-property relationship (QSAR/QSPR) studies are indubitably of great importance in modern chemistry and biochemistry. To obtain a significant correlation, it is essential that appropriate descriptors are employed, whether they are theoretical, empirical or derived from readily available experimental characteristics of structures. [22][23] In such a generalization, most biological activities are dominated by molecular size, which is well characterized by most of the physicochemical properties. 24 Ideally, the activities and properties are connected by some known mathematical function, F:

Biological activity = F(structure)

In the present study, topological and physicochemical descriptors are used as the structural parameters. Biological activity can be any measure such as log 1/C, Ki, IC50, ED50, EC50, logK and Km.
The relationship or function is more often than not a mathematical expression derived by statistical or related techniques. In the present study the multiple linear regression (MLR) technique is used. The parameters describing structural and physicochemical properties are used as independent variables and the biological activities are the dependent variables.
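As an illustration of the MLR setup described above, the following is a minimal sketch of an ordinary least-squares fit with NumPy. The function name `fit_mlr`, the demo descriptor values and the demo activities are hypothetical and are not data from this study. The standard error of estimate is computed with the usual definition, sqrt(SS_res / (n - p)), where p counts the fitted coefficients including the intercept; this is assumed, not confirmed, to match the definition of Se used here.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    Returns (coefficients with intercept first, multiple correlation
    coefficient r, standard error of estimate Se).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # ordinary least squares
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    n, p = A.shape
    r = np.sqrt(1.0 - ss_res / ss_tot)
    se = np.sqrt(ss_res / (n - p))
    return coef, r, se

# Made-up values for illustration only (NOT taken from Tables 1-3):
# column 0 plays the role of a continuous descriptor, column 1 of a 0/1 indicator.
X_demo = [[1.0, 0], [2.0, 1], [3.0, 0], [4.0, 1], [5.0, 0], [6.0, 1]]
y_demo = [2.1, 3.4, 3.9, 5.6, 6.1, 7.3]
coef_demo, r_demo, se_demo = fit_mlr(X_demo, y_demo)
print(coef_demo, r_demo, se_demo)
```

The descriptors enter as columns of `X`, so mono-, bi- and triparametric models differ only in how many columns are passed.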
In the present investigation a QSAR study is performed on a set of 21 phenylamino-acridine derivatives. [27][28][29][30][31][32][33][34] Acridine derivatives are among the oldest classes of bioactive compounds, widely used as antibacterial and antiprotozoal agents. Some work in these areas continues, but recent research has mainly focused on their use as anticancer drugs, because of the ability of the acridine chromophore to intercalate into DNA and to inhibit topoisomerase enzymes. 35 These compounds have also been used as chemotherapeutic agents against cancer cells. 36 In the field of antitumor DNA-binding agents, this class of acridine derivatives plays an important role, both in the number of active compounds and in the importance of DNA binding affinity. 37 The DNA binding affinity (logK) is modeled here using distance-based topological indices, namely the Wiener (W) 38, Szeged (Sz) 39,40 and Balaban (J) 41 indices and the Randic connectivity index (χ) 42, which is related to the branching of molecules, along with physicochemical properties such as molar refractivity (MR), molar volume (MV) and parachor (Pc), and indicator parameters that account for the substitution effect at various positions.
The results obtained are better (as discussed in the Results and Discussion section) than those of a previous QSAR study by Hansch and coworkers 43 in their review of the applicability of hydrophobic parameters in QSAR.

Results and Discussion
The phenylamino-acridine derivatives (Figure 1), their DNA binding affinity and indicator parameters are given in Table 1 (see p. 141).
Various widely used topological indices tested in the present study are recorded in Table 2 (see p. 142). Table 3 (see p. 143) contains the tested physicochemical properties. The details of the calculation of these indices and the source of the physicochemical properties are given in the Experimental Section of this paper.
In proposing QSAR models for the DNA binding affinity we have used the maximum $R^2$ method. 44 Initially we used Pogliani's quality factor Q 45,46 to investigate the predictive power of the various parameters, and finally we used the cross-validation parameters to confirm our findings. The comparison of the various models by cross-validation parameters is shown in Table 5 (see p. 145). The inter-correlation of the various parameters and their correlation with the biological activity (logK) are presented as a correlation matrix in Table 4 (see p. 144). As the correlation matrix shows, DNA binding affinity correlates well with all four topological indices as well as with the physicochemical parameters MR, MV and Pc. The univariate correlations of the Wiener index (W) (r = 0.8118), Randic connectivity index (χ) (r = 0.832), Balaban index (J) (r = 0.84) and Szeged index (Sz) (r = 0.824) exhibit the significance of topological indices in the QSAR study of phenylamino-acridine derivatives. The good correlation with physicochemical properties such as MR (r = 0.886), MV (r = 0.862) and Pc (r = 0.869) likewise expresses the importance of structural and volumetric features in the theoretical modeling of DNA binding affinity (logK).
This also shows the dependence of activity on the structural features of the molecule and justifies the numerical representation of molecular structure in the form of topological indices.
For the QSAR study of the same series we tested bivariate combinations of the parameters. The results obtained from the bivariate combinations are encouraging, and the better models are shown below with their statistics.

$$\log K = 2.2577 \times 10^{-4}\,(\pm 7.3634 \times 10^{-5})\,W + 0.3477\,(\pm 0.0942)\,I_{DS} + 5.528$$

When earlier work 43 is repeated using an entirely new set of parameters, the new results should show either better statistics or statistics of equivalent quality. In the present study the models (Eq 1 to 6) have competitive statistics with fewer and entirely different parameters than the model proposed by Hansch and co-workers 43 in their review study on positive hydrophobic parameters.
The remaining biparametric models (Eq 7 to 10), containing combinations of MR, MV, Pc and the indicator parameters $I_{DS}$ and $I_3$, give slightly better statistics, with a lower standard error of estimation (Se) and a higher Q value, than the model proposed by Hansch and co-workers 43 (n = 21, r = 0.92951, Se = 0.134, Q = 6.94). The models proposed in the present study are also more parsimonious, because they contain fewer parameters than the model proposed by Hansch and co-workers. 43 Eq 1 to 10 also exhibit the applicability of topological and physicochemical parameters to the QSAR study of phenylamino-acridine derivatives.
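The quality factor quoted above can be checked by hand. The sketch below assumes Pogliani's usual definition, Q = r / Se, which reproduces the value quoted for the model of Hansch and co-workers; the function name `quality_factor` is illustrative only.

```python
def quality_factor(r, se):
    """Pogliani's quality factor, assumed here to be Q = r / Se."""
    if se <= 0:
        raise ValueError("standard error of estimate must be positive")
    return r / se

# Statistics quoted in the text for the model of Hansch and co-workers.
q_hansch = quality_factor(0.92951, 0.134)
print(round(q_hansch, 2))  # 6.94
```

Because Q rewards a high r and penalizes a high Se at the same time, it gives a single number for ranking models fitted to the same data set.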
For detailed SAR studies we tested trivariate combinations, which resulted in an excellent improvement in the statistics. The best triparametric model was obtained from the combination of MR, $I_{DS}$ and $I_3$:

$$\log K = 0.029\,(\pm 0.0054)\,MR + 0.2085\,(\pm 0.0861)\,I_{DS} + 0.1132\,(\pm 0.0761)\,I_3 + 2.5501 \qquad (11)$$

n = 21, r = 0.947, Se = 0.1164, Q = 8.14

The statistics obtained from Eq 11 demonstrate the role of the parameter MR in the modeling of DNA binding affinity. MR is a physicochemical parameter that combines the effects of the size and polarizability of the substituents. It characterizes the deformability of the molecular electron distribution. The equation shows a direct relationship between MR and logK, i.e., an increase in molar refractivity enhances the DNA binding of phenylamino-acridine derivatives. The equation also expresses the significance of the indicator parameter $I_{DS}$ (which accounts for the presence of di-substitution) in DNA binding, i.e., di-substitution in phenylamino-acridine derivatives enhances their DNA binding affinity. Similarly, the positive regression coefficient of the indicator parameter $I_3$, which accounts for substitution at the 3rd position, shows a direct relationship between substitution at the 3rd position and DNA binding affinity. The presence of the indicator parameter $I_{DS}$ in most of the models and the large magnitude of its coefficient in Eq 11 demonstrate the dominating role of di-substitution in the DNA binding affinity of phenylamino-acridine derivatives, as compared with the other indicator parameter and the topological parameters. The DNA binding affinities estimated from Eq 11 are shown in Table 6 (see p. 146) and are presented graphically in Figure 2 (see p. 147).
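To make the use of Eq 11 concrete, the sketch below evaluates the triparametric model with the coefficients quoted above. The function name `predict_logk_eq11` and the input MR = 100 are hypothetical choices for illustration; they do not correspond to any compound in Table 1.

```python
def predict_logk_eq11(mr, i_ds, i_3):
    """Estimate logK from Eq 11.

    mr   : molar refractivity (MR)
    i_ds : 1 if the compound is di-substituted, else 0
    i_3  : 1 if the compound is substituted at the 3rd position, else 0
    """
    return 0.029 * mr + 0.2085 * i_ds + 0.1132 * i_3 + 2.5501

# Hypothetical input (MR = 100 is NOT a value from this study).
mono = predict_logk_eq11(100.0, 0, 0)
di = predict_logk_eq11(100.0, 1, 0)
print(mono, di, di - mono)
```

At fixed MR, switching the di-substitution indicator from 0 to 1 raises the estimated logK by the coefficient of the indicator, 0.2085.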
As opposed to traditional regression methods, cross-validation estimates the trustworthiness of a model by predicting data. The method uses a few cross-validation parameters: PRESS (predicted residual sum of squares), SSY (sum of the squares of the response values), $r^2_{cv}$ (overall predictive ability) and adjusted $r^2$. PRESS is an important cross-validation parameter because it is a good approximation of the real predictive error of a model. A PRESS value smaller than SSY indicates that the model predicts better than chance and can be considered statistically significant. In the present case all the proposed models have PRESS << SSY, demonstrating that they are better than chance and statistically significant.
Furthermore, the ratio PRESS/SSY is used to estimate the confidence interval of the DNA binding affinity. For a QSAR model to be dependable, PRESS/SSY should be smaller than 0.4. In our case the ratio PRESS/SSY ranges from 0.11 to 0.29, indicating that all the proposed models are reliable QSAR models.
An indication of the performance of a model is obtained from $r^2_{cv}$ (the overall predictive ability). In our case, the highest $r^2_{cv}$ is found for the model expressed by Eq 11, indicating that it has the best predictive power among the proposed models.
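The cross-validation parameters discussed above can be sketched as follows. The exact procedure is not stated in the text, so this example makes three labeled assumptions: leave-one-out cross-validation, SSY taken as the sum of squared deviations of the responses from their mean, and $r^2_{cv}$ = 1 - PRESS/SSY. The function name and the demo data are hypothetical.

```python
import numpy as np

def loo_cross_validation(X, y):
    """Leave-one-out PRESS, SSY and r2_cv for a linear model with intercept.

    Assumptions (not confirmed by the source): SSY is mean-centred and
    r2_cv = 1 - PRESS / SSY.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])   # intercept + descriptors
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i                # leave compound i out
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += float((y[i] - A[i] @ coef) ** 2)
    ssy = float(np.sum((y - y.mean()) ** 2))
    return press, ssy, 1.0 - press / ssy

# Made-up data for illustration only (NOT from this study).
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y_demo = [2.2, 2.9, 4.1, 5.2, 5.8, 7.1, 8.0]
press, ssy, r2cv = loo_cross_validation(x_demo, y_demo)
print(press / ssy, r2cv)
```

Under these assumptions the ratio PRESS/SSY and $r^2_{cv}$ carry the same information, so the threshold PRESS/SSY < 0.4 corresponds to $r^2_{cv}$ > 0.6.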

Conclusions
On the basis of the results discussed above, it can be concluded that the DNA binding affinity of phenylamino-acridine derivatives is structure-specific in nature and that most of the topological and structural parameters, such as W, $^1\chi$, Sz, MV, Pc and MR, are applicable to the modeling of the DNA binding affinity of the phenylamino-acridine derivatives. It can also be concluded that disubstitution and substitution at the 3rd position play a significant role in the DNA binding affinity, and that the models proposed in the present investigation are better than the previously proposed models. 43

Experimental Section
Biological activity (logK) - DNA binding affinity, expressed as logK, was taken from the literature. 43

Topological indices - All the topological indices used were calculated from the hydrogen-suppressed molecular graphs. Although their calculation is discussed extensively in the literature, we give below the expressions used.

Wiener index (W) 38 - The Wiener index W = W(G) of a graph G is defined as half the sum of the elements of the distance matrix:

$$W = W(G) = \frac{1}{2} \sum_{i} \sum_{j} (D)_{ij}$$

where $(D)_{ij}$ is the ij-th element of the distance matrix, which denotes the shortest graph-theoretical distance between sites i and j of G.
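The definition of the Wiener index as half the sum of the distance-matrix elements can be sketched as below. The adjacency-list representation, the function names and the n-butane example are illustrative choices, not the software actually used for Table 2. Distances are obtained by breadth-first search on the hydrogen-suppressed graph.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path lengths (in bonds) from source to every vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def wiener_index(adj):
    """Half the sum of all entries of the distance matrix."""
    total = sum(d for s in adj for d in bfs_distances(adj, s).values())
    return total // 2   # each pair (i, j) is counted twice

# Hydrogen-suppressed graph of n-butane: C0-C1-C2-C3.
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(wiener_index(butane))  # 10
```

For n-butane the pairwise distances are 1, 2, 3, 1, 2 and 1, which sum to 10.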

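Balaban index (J) 41 - The Balaban index J = J(G) of G is calculated as:

$$J = J(G) = \frac{M}{\mu + 1} \sum_{\text{bonds}} (d_i d_j)^{-0.5} \qquad (14)$$

where M is the number of bonds in G, µ is the cyclomatic number of G, and $d_i$ (i = 1, 2, 3, ..., N; N is the number of vertices in G) is the distance sum of vertex i.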
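The quantities M, µ and the distance sums defined above combine into the Balaban index as sketched below. The sketch assumes the standard form J = M/(µ + 1) Σ over bonds of (d_i d_j)^(-1/2) and, for a connected graph, µ = M - N + 1. Function names and the example graphs are illustrative only.

```python
from collections import deque
from math import sqrt

def distance_sums(adj):
    """Distance sum d_i of every vertex (row sums of the distance matrix)."""
    sums = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        sums[source] = sum(dist.values())
    return sums

def balaban_index(adj):
    """J = M / (mu + 1) * sum over bonds of (d_i * d_j) ** -0.5."""
    edges = {(min(u, v), max(u, v)) for u in adj for v in adj[u]}
    m = len(edges)                  # number of bonds M
    mu = m - len(adj) + 1           # cyclomatic number (connected graph assumed)
    d = distance_sums(adj)
    return m / (mu + 1) * sum(1.0 / sqrt(d[u] * d[v]) for u, v in edges)

# Hydrogen-suppressed graphs of propane and cyclohexane.
propane = {0: [1], 1: [0, 2], 2: [1]}
cyclohexane = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(balaban_index(propane), balaban_index(cyclohexane))
```

For cyclohexane every vertex has distance sum 9, M = 6 and µ = 1, so J = (6/2) × 6 × (1/9) = 2.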
The cyclomatic number µ = µ(G) of a cyclic graph G is equal to the minimum number of edges that must be removed from G in order to transform it into the related acyclic graph. For a monocyclic graph µ = 1; otherwise it is calculated by means of the following expression:

$$\mu = M - N + 1$$

Szeged index (Sz) 39,40 - The Szeged index, Sz = Sz(G), is calculated according to the following expression:

$$Sz = Sz(G) = \sum_{\text{edges}} n_u n_v$$

where $n_u$ is the number of vertices lying closer to one end of the edge e = uv; the meaning of $n_v$ is analogous. Vertices equidistant from both ends of the edge e = uv are not taken into account.
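The Szeged index defined above can be sketched as follows: for each edge uv, count the vertices strictly closer to u and those strictly closer to v, skipping equidistant ones, and sum the products. Function names and example graphs are illustrative choices and not the software used in this study.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path lengths (in bonds) from source to every vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def szeged_index(adj):
    """Sum over edges uv of n_u * n_v; equidistant vertices are ignored."""
    dist = {s: bfs_distances(adj, s) for s in adj}
    edges = {(min(u, v), max(u, v)) for u in adj for v in adj[u]}
    total = 0
    for u, v in edges:
        n_u = sum(1 for w in adj if dist[u][w] < dist[v][w])
        n_v = sum(1 for w in adj if dist[v][w] < dist[u][w])
        total += n_u * n_v
    return total

butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cyclopropane = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(szeged_index(butane), szeged_index(cyclopropane))  # 10 3
```

For the acyclic n-butane graph the Szeged index equals the Wiener index (10). In cyclopropane the third vertex is equidistant from both ends of every edge and is therefore not counted, giving 3 × (1 × 1) = 3.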

Figure 1. Parent structure of phenylamino-acridine derivatives used in the present study.