Principal properties (PPs) as solvent descriptors for multivariate optimization in organic synthesis: specific PPs for ethers

Principal properties (PPs) for solvents were extended to 113 solvents by the addition of ten ethers. Specific “PPs” for ethers, suitable for solvent optimization in Grignard reactions, were also derived and their physico-chemical significance is discussed.


Introduction
The application of multivariate strategies provides great advantages in optimizing desired properties (yield, regio- or stereoisomeric ratios, etc.) by simultaneous variation of experimental conditions, which may be continuous (temperature, time, concentrations, etc.) or discrete (solvent, catalyst, etc.) variables; the latter are parameterized by the so-called principal properties (PPs). Intrinsic properties suitable for experimental design need to be orthogonal to each other. 4,5 First- and second-generation principal properties for heteroaromatic moieties, based on aromaticity 6 and on 3D-GRID structural parameters 7 respectively, have been reported by our group.
A typical example of the potential of PP-based optimization is provided by a multivariate experimental design, based on the PPs of Lewis acid catalysts and solvents, which gave a better understanding of how these parameters affect the isomeric distribution in the reaction of phenylhydrazones derived from unsymmetrical ketones (the well-known Fischer indole synthesis). Multivariate optimization achieved not only regiospecific synthesis of single indole regioisomers, 8 but also almost quantitative yields in a single-step (one-pot) reaction. 9 It is striking that this result was achieved for a reaction such as the Fischer indole synthesis, 10 known for more than a century and to which an entire book has been devoted. 11
In addition to statistical orthogonality, PPs derived from multivariate characterization by principal component analysis (PCA) have the advantage of being less influenced by measurement errors and system-specific variations than single descriptors. Moreover, reliable PPs can be obtained from original data matrices even with missing data.
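The text does not say how missing values are handled. One widely used PCA algorithm that tolerates missing entries is NIPALS, which simply skips them in each regression step. The following is a minimal sketch under that assumption, not the authors' procedure; the function name `nipals_pca` and the demo data are hypothetical.

```python
import numpy as np

def nipals_pca(X, n_components, tol=1e-9, max_iter=1000):
    """PCA by NIPALS on autoscaled data; NaN entries are skipped."""
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    W = observed.astype(float)              # 1 where a value is present
    # Autoscale each column using only its observed values.
    mean = np.nanmean(X, axis=0)
    std = np.nanstd(X, axis=0, ddof=1)
    E = np.where(observed, (X - mean) / std, 0.0)
    n, k = X.shape
    T = np.zeros((n, n_components))         # scores, one row per object
    P = np.zeros((n_components, k))         # loadings, one column per variable
    for a in range(n_components):
        # Start from the column with the largest sum of squares.
        t = E[:, np.argmax((E ** 2).sum(axis=0))].copy()
        for _ in range(max_iter):
            # Regress each column on t, using observed rows only.
            p = (E.T @ t) / (W.T @ t ** 2)
            p /= np.linalg.norm(p)
            # Regress each row on p, using observed columns only.
            t_new = (E @ p) / (W @ p ** 2)
            converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
            t = t_new
            if converged:
                break
        T[:, a] = t
        P[a] = p
        # Deflate; missing positions stay at zero so they never contribute.
        E = np.where(observed, E - np.outer(t, p), 0.0)
    return T, P

# Synthetic demo data (not the solvent descriptors from the paper).
rng = np.random.default_rng(0)
latent = rng.normal(size=(30, 2))
X_demo = latent @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(30, 5))
X_missing = X_demo.copy()
X_missing[3, 1] = np.nan
X_missing[10, 4] = np.nan
T_demo, P_demo = nipals_pca(X_missing, 2)
```

With complete data this reduces to ordinary NIPALS, whose components agree with those from a singular value decomposition up to sign.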
However, as PPs are derived by an empirical statistical method such as PCA, carried out on a data matrix with a given number of objects and variables, such descriptors need updating as new properties (variables) become available for "old" and "new" objects. In particular, the solvent PPs first derived by Carlson and co-workers from eight parameters for 82 solvents 12 have been extended by the same author 1 to 103 solvents and supplemented with a ninth variable (water solubility).
For specific organic syntheses such as Grignard reactions, however, the choice of solvent is chemically limited to a single class, i.e., ethers.
In this context, we extend the solvent set by addition of ten solvents, all ethers, to the 103 considered by Carlson 1 in order to derive PPs from a larger data set containing 113 solvents.
Furthermore, we also consider an "ether" class model in order to derive "ether PPs" which might be more informative in characterizing the above solvents specifically for the Grignard reaction.
Figure 1 shows that, as expected, the PPs for solvents 1-103 derived from the 113-solvent model closely resemble those of the 103-solvent model reported in ref. 1, with a few exceptions (differences above 0.5 are discussed below). Differences in both t1 and t2 found for triglyme (54) and tetraethylene glycol (83) can be ascribed to the addition of new values for 54 and the correction of logP for 83. Significant differences in t1 for ethanol (10) and in t2 for N,N-dimethylacetamide (32) are due to corrections of descriptor values, while those in t1 for 4-methyl-1,3-dioxol-2-one (21) and diglyme (55) result from the insertion of new values. Differences in t2 for chlorobenzene (61) and in t1 for piperidine (69) are probably due to printing errors in Table 15A.1 of ref. 1, as the plot in the same book and the PPs in ref. 12 are consistent with our values.
Figure 2, the PCA p1-p2 loadings plot (see Equation 1 in the Experimental Section), elucidates the information content of the descriptors and provides guidance for interpreting the physico-chemical meaning of the solvent PPs. The first PC, exhibiting high p1 values for descriptors such as dielectric constant (3), dipole moment (4), E_T (6), and water solubility (9), and the lowest p1 value for logP (8), can be related to solvent polarity. […] 1 increase on increasing the molecular weight of hydrocarbons (e.g., 103, 82, 102), of primary alcohols (e.g., 4, 10, 14, 15) and of ethers (e.g., 71, 98, 75, 105).
The PPs in Table 1 can be conveniently adopted as solvent descriptors in the multivariate optimization of reactions in which a wide range of solvents may be used. However, severe limitations in solvent selection may arise for chemical reasons, e.g., the Grignard reaction can be carried out only in ethers. In this case, where only a small portion of the experimental space can be investigated, it appears appropriate to derive PPs from a specific "class" model including only chemically similar solvents. Therefore PCA was carried out on a data matrix including 24 objects (ethers 54-56, 63-65, 67, 70, 71, 73, 75, 93, 96, 98, 104-113) and seven variables (descriptors 1-3, 5-8). Descriptors 4 and 9 were excluded from the analysis because these data are lacking for many solvents (see Table 1). PCA provided a 3-PC model explaining 95.1% of the variance (56.5% first PC, 31.6% second PC and 7.0% third PC). Ether PPs (t1, t2 and t3) derived from this model are reported in Table 2 and plotted in Figure 3.
The first PP (t1 in Table 2), exhibiting an excellent correlation (R² = 0.93) with the second PP derived from the general model (t2 in Table 1), can be related to the ether molecular weight, with the lowest value for diethyl ether (71) and very high values for 2-ethoxynaphthalene (107) and diphenyl ether (70). This correlation is not surprising: when the model is restricted to a class of chemically similar solvents, molecular weight is expected to become the first systematic variation evidenced by PCA.
The interpretation of the second and third PPs is not straightforward. The p1-p2 loadings plot (Figure 4a) shows a grouping of descriptors 1, 2, 5 and 7 and a clear differentiation of descriptors 3 and 6 (low p2) from 8 (high p2), resembling the trend already observed in the general model (Figure 2), where variable 8 was discriminated from 3 and 6 by the first component. However, the correlation between t2 for the 24 ethers in the ether model (Table 2) and t1 in the overall model (Table 1) is very poor (R² = 0.53). This can reasonably be explained by considering that the first component in the general model accounted for "interclass solvent polarity", which can be roughly represented by the dielectric constant, a bulk property measuring non-specific solvation effects opposite to those of logP since, in general, highly polar solvents are not very lipophilic. Accordingly, high t2 values are exhibited in Figure 3a by symmetrical ethers with a C4 or C5 chain (75, 105, 106) and low t2 values by glymes (54-56).
The third PC is required to explain descriptor 3, whose information content, rather than being similar to that of descriptor 6 (see Figure 4b), is closer to that of descriptor 8, and to differentiate descriptor 2 (boiling point) from descriptor 1 (melting point). In fact, by considering a more homogeneous class, specific solvation effects such as hydrogen bonding and dipole-dipole interactions may be evidenced. Accordingly, in Figure 3b the third PC is required to differentiate solvents with very low t3 values such as dioxane (67), dimethoxymethane (112) and diethoxyethane (113), which have both low ε and low logP values (i.e., high water affinity in spite of their low "bulk" polarity).
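The R² values quoted in this discussion are presumably squared correlation coefficients between score vectors from the ether model and the general model. A minimal sketch of such a comparison follows. The score arrays are synthetic placeholders, not the values of Tables 1 and 2, and the helper name `r_squared` is hypothetical; only the variance percentages are taken from the text.

```python
import numpy as np

def r_squared(x, y):
    """Squared Pearson correlation between two score vectors."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

# Synthetic stand-ins for t2 (general model) and t1 (ether model), 24 ethers.
rng = np.random.default_rng(1)
t2_general = rng.normal(size=24)
t1_ether = 1.5 * t2_general + rng.normal(scale=0.3, size=24)
r2 = r_squared(t1_ether, t2_general)

# Variance explained by the three PCs of the ether model, as quoted in the text.
explained_percent = [56.5, 31.6, 7.0]
total_percent = round(sum(explained_percent), 1)   # matches the quoted 95.1
```

Because the sign of a PCA score vector is arbitrary, R² is a convenient measure here: flipping the sign of either vector leaves it unchanged.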

Conclusions
Solvent PPs (t1 and t2) were extended to 113 solvents, and specific PPs for ethers (t1, t2 and t3), suitable for solvent optimization in the Grignard reaction, were also derived. The score t1 in the ether model, related to the molecular weight, exhibits an excellent correlation with t2 in the overall model, while t2 and t3 account for specific solvent effects.

ISSN 1424-6376
© ARKAT USA, Inc

Experimental Section
General Procedures. The data set used for PCA 13 was a table (matrix) in which 113 solvents were characterized by nine physico-chemical properties. The variables were autoscaled, i.e., multiplied by appropriate weights (the reciprocal of the variable standard deviation) to give them unit variance (i.e., the same importance). PCA was carried out using the SIMCA software package 14 on a data matrix containing elements x_ik (113 x 9 for the overall model and 24 x 7 for the ether model), where the index k is used for the physico-chemical properties (variables) and the index i for the solvents (objects). The autoscaled matrix elements were then fitted to the model given by Equation (1), where the number A of significant cross terms (components) and the parameters p_ak and t_ia are calculated by minimizing the residuals e_ik after subtracting x̄_k (the mean value of the experimental quantities x_ik for variable k). The deviations from the model are expressed by the residuals e_ik. The number of significant components (A) was determined using the cross-validation technique. 15
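Equation (1) itself does not appear in the text as extracted. From the definitions given in this section (column means x̄_k, scores t_ia, loadings p_ak, residuals e_ik, and A components), it presumably has the usual PCA form. The following is a reconstruction under that assumption, not a copy of the original equation:

```latex
x_{ik} = \bar{x}_k + \sum_{a=1}^{A} t_{ia}\, p_{ak} + e_{ik} \qquad (1)
```

Here i runs over the solvents (objects), k over the physico-chemical properties (variables), and a over the components.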

Figure 1. Correlation between PPs for solvents 1-103 derived in the present work and those in ref. 1.
In Equation (1), the means x̄_k and the loadings p_ak depend only on the physico-chemical properties (variables), and the scores t_ia only on the solvents (objects).
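The autoscaling and decomposition described in the Experimental Section can be sketched as follows. This is an illustration only: the analysis in the paper was run in SIMCA, whereas the sketch uses a singular value decomposition from NumPy, and the data matrix is a synthetic placeholder with the shape of the ether model (24 objects, 7 variables). The function names are hypothetical.

```python
import numpy as np

def autoscale(X):
    """Center each column and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    return (X - mean) / std

def pca_scores_loadings(X, n_components):
    """Return autoscaled data, scores, loadings and explained variance fractions."""
    Z = autoscale(X)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]   # scores t_ia: one row per object
    P = Vt[:n_components]                        # loadings p_ak: one column per variable
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return Z, T, P, explained

# Synthetic placeholder matrix, not the descriptor values of the paper.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(24, 7))
Z, T, P, explained = pca_scores_loadings(X_demo, 3)

# Residuals e_ik of the autoscaled data after the 3-component model.
E = Z - T @ P
```

The shapes make the point of the sentence above concrete: `T` has one row per solvent and `P` has one column per descriptor, so the scores characterize objects and the loadings characterize variables.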

Table 1 reports the data set considered to derive PPs in ref. 1, supplemented with corrected or new descriptors for the original 103 solvents and expanded by including descriptors for ten new solvents (all ethers). PCA of this data matrix, comprising 113 solvents and 9 descriptors (cf. Table 1), provided a two-principal-component (PC) model explaining 69.4% of the variance. The scores of this model, the new PPs for all 113 solvents, are also recorded in Table 1.