Determination of equilibrium constants

Equilibrium constants are determined in order to quantify chemical equilibria. When an equilibrium constant is expressed as a concentration quotient,
 * $$K=\frac{{[S]} ^\sigma {[T]}^\tau ... } {{[A]}^\alpha {[B]}^\beta ...}$$

it is implied that the activity quotient is constant. For this assumption to be valid, equilibrium constants should be determined in a medium of relatively high ionic strength. Where this is not possible, consideration should be given to possible activity variation.

The equilibrium expression above is a function of the concentrations [A], [B] etc. of the chemical species in equilibrium. The equilibrium constant value can be determined if any one of these concentrations can be measured. The general procedure is that the concentration in question is measured for a series of solutions with known analytical concentrations of the reactants. Typically, a titration is performed with one or more reactants in the titration vessel and one or more reactants in the burette. Knowing the analytical concentrations of reactants initially in the reaction vessel and in the burette, all analytical concentrations can be derived as a function of the volume (or mass) of titrant added.

The equilibrium constants may be derived by best-fitting of the experimental data with a chemical model of the equilibrium system.

Experimental methods
There are four main experimental methods. For less commonly used methods see Rossotti and Rossotti.

Potentiometric measurements
A free concentration [A] or activity {A} is measured by means of an ion selective electrode such as the glass electrode. If the electrode is calibrated using activity standards it is assumed that the Nernst equation applies in the form
 * E=E0+RT/nF ln{A}

where E0 is the standard electrode potential. When buffer solutions of known pH are used for calibration the meter reading will be pH.
 * pH=nF/(RT ln 10) (E0-E)

At 298 K, 1 pH unit corresponds to approximately 59 mV.
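The figure of about 59 mV per pH unit can be checked numerically. The following is a minimal sketch for an ideal electrode; the function names and the default temperature are illustrative choices, not part of any standard procedure.

```python
import math

R = 8.314462618   # molar gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def nernst_slope_mV(temperature_K, n=1):
    """Ideal electrode slope in mV per decade of activity: RT ln(10) / (nF)."""
    return 1000.0 * R * temperature_K * math.log(10.0) / (n * F)


def ph_from_potential(E_mV, E0_mV, temperature_K=298.15):
    """pH for an ideal hydrogen-ion electrode, from E = E0 + slope * log10{H+}."""
    return (E0_mV - E_mV) / nernst_slope_mV(temperature_K)
```

At 298.15 K, `nernst_slope_mV` returns about 59.16 mV. The factor ln 10 arises from converting the natural logarithm of the Nernst equation to the decadic logarithm used in the definition of pH.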

When the electrode is calibrated with solutions of known concentration, by means of a strong acid/strong base titration, for example, a modified Nernst equation is assumed.
 * E=E0+s log10[A]

where s is an empirical slope factor. A solution of known hydrogen ion concentration may be prepared by standardization of a strong acid against borax. Constant-boiling hydrochloric acid may also be used as a primary standard for hydrogen ion concentration.

Absorbance
It is assumed that the Beer-Lambert law applies.
 * $$A=\ell \sum {\epsilon c}$$

where $$\ell$$ is the optical path length, $$\epsilon$$ is a molar absorbance at unit path length and c is a concentration. More than one of the species may contribute to the absorbance. In principle absorbance may be measured at one wavelength only, but in present-day practice it is common to record complete spectra.

Fluorescence (luminescence) intensity
It is assumed that the emitted light intensity is a linear function of species’ concentrations.
 * $$I=\sum{ \phi c}$$

where $$\phi$$ is a proportionality constant.

NMR chemical shift measurements
Chemical exchange is assumed to be rapid on the NMR time-scale. An individual chemical shift $$\bar {\delta}$$ is the mole-fraction-weighted average of the shifts $$\delta$$ of nuclei in contributing species.
 * $$\bar {\delta} =\frac{\sum c_i \delta_i}{\sum c_i}$$
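The averaging under fast exchange can be illustrated with a short sketch. The function name and the example values in the usage note are hypothetical.

```python
def averaged_shift(concentrations, shifts):
    """Mole-fraction-weighted average chemical shift under fast exchange.

    concentrations: concentrations c_i of the species containing the nucleus
    shifts: chemical shifts delta_i of the nucleus in each species (ppm)
    """
    total = sum(concentrations)
    if total <= 0.0:
        raise ValueError("total concentration must be positive")
    return sum(c * d for c, d in zip(concentrations, shifts)) / total
```

For example, a nucleus present at equal concentrations in two species with shifts of 2.0 and 4.0 ppm would give an observed shift of 3.0 ppm.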

Calorimetric measurements
Simultaneous measurement of K and $$\Delta$$H for 1:1 adducts is routinely carried out using Isothermal Titration Calorimetry. Extension to more complex systems is limited by the availability of suitable software.

Range and limitations

 * 1) Potentiometry. The most widely used electrode is the glass electrode which is selective for the hydrogen ion. This is suitable for all acid-base equilibria. Log10 $$\beta$$ values between about 2 and 11 can be measured directly by potentiometric titration using a glass electrode. This enormous range is possible because of the logarithmic response of the electrode. The limitations arise because the Nernst equation breaks down at very low or very high pH. The range can be extended by using the competition method.
 * 2) Absorbance and Luminescence. An upper limit on log10 $$\beta$$ of 4 is usually quoted, corresponding to the precision of the measurements, but it also depends on how intense the effect is. Spectra of contributing species should be clearly distinct from each other.
 * 3) NMR. Limited precision of chemical shift measurements also puts an upper limit of about 4 on log10 $$\beta$$. Limited to diamagnetic systems.
 * 4) Calorimetry. Insufficient evidence is currently available.

Computational methods
It is assumed that the experimental data which have been collected comprise a set of data points. At each i'th data point, the analytical concentrations of the reactants, $$T_A(i)$$, $$T_B(i)$$ etc. are known along with a measured quantity, $$y_i$$, that depends on one or more of these analytical concentrations. A general computational procedure has four main components.
 * 1) Definition of a chemical model of the equilibria
 * 2) Calculation of the concentrations of all the chemical species in each solution
 * 3) Refinement of the equilibrium constants
 * 4) Model selection

The chemical model
The chemical model consists of a set of chemical species present in solution, both the reactants added to the reaction mixture and the complex species formed from them. Denoting the reactants by A, B ..., each complex species is specified by the stoichiometric coefficients that relate the particular combination of reactants forming them.
 * $$pA+qB...\rightleftharpoons A_pB_q...: \beta_{pq...}=\frac{[A_pB_q...]} {[A]^p[B]^q...}$$

When using general-purpose computer programs, it is usual to use cumulative association constants, as shown above. Where reactants and complexes carry ionic charges, the charges should be shown explicitly. With aqueous solutions the concentrations of proton (hydronium ion) and hydroxide ion are constrained by the self-dissociation of water.
 * $$H_2O \rightleftharpoons H^+ + OH^-: K_W'=\frac{[H^+][OH^-]}{[H_2O]}$$

With dilute solutions the concentration of water can be assumed to be constant so the equilibrium expression is written in the familiar form of the ionic product of water.
 * $$K_W=[H^+][OH^-]\,$$

When both H+ and OH− must be considered as reactants, one of them is eliminated from the model by specifying that its concentration is to be derived from the concentration of the other. Usually the concentration of the hydroxide ion is given by
 * $$[OH^-]=K_W[H^+]^{-1}\,$$

In this case the equilibrium constant for the formation of hydroxide has the stoichiometric coefficient −1 with regard to the proton and zero for the other reactants. This has important implications for all protonation equilibria in aqueous solution and for hydrolysis constants in particular.

It is quite usual to omit from the model those species whose concentrations are considered to be negligible. For example, it is usually assumed that there is no interaction between the reactants and/or complexes and the electrolyte used to maintain constant ionic strength or the buffer used to maintain constant pH. These assumptions may or may not be justified. Also, it is implicitly assumed that there are no other complex species present. When complexes are wrongly ignored a systematic error is introduced into the calculations.

Equilibrium constant values are usually estimated initially by reference to data sources.

Speciation calculations
A speciation calculation is one in which the concentrations of all the species in an equilibrium system are calculated, knowing the analytical concentrations, $$T_A$$, $$T_B$$ etc., of the reactants A, B etc. This means solving a set of non-linear equations of mass-balance
 * $$T_A=[A]+\sum p\beta_{pq...}[A]^p[B]^q...$$
 * $$T_B=[B]+\sum q\beta_{pq...}[A]^p[B]^q...$$

for the free concentrations [A], [B] etc. The concentrations of the complexes are derived from the free concentrations via the chemical model. Some authors include the free reactant terms in the sums by declaring identity (unit) $$\beta$$ constants for which the stoichiometric coefficients are 1 for the reactant concerned and zero for all other reactants:
 * $$[A] = \beta_{10...}[A],[B] = \beta_{01...}[B] ...\,$$
 * $$\beta_{10...}= \beta_{01...}... = 1\,$$

In this manner, all chemical species, including the free reactants, are treated in the same way, having been formed from the combination of reactants that is specified by the stoichiometric coefficients. The mass-balance equations assume the simpler form.
 * $$T_A=\sum p\beta_{pq...}[A]^p[B]^q...$$
 * $$T_B=\sum q\beta_{pq...}[A]^p[B]^q...$$

In a titration system the analytical concentrations of the reactants at each titration point are obtained from the initial conditions, the burette concentrations and volumes. The analytical (total) concentration of a reactant $$R$$ at the i'th titration point is given by
 * $$T_R=\frac{R_0+v_i[R]}{v_0+v_i}$$

where $$R_0$$ is the initial amount of R in the titration vessel, $$v_0$$ is the initial volume, $$[R]$$ is the concentration of R in the burette and $$v_i$$ is the volume added. The burette concentration of a reactant not present in the burette is taken to be zero.
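The dilution formula translates directly into code. This is a minimal sketch; the function name is illustrative, and consistent units are assumed (for example amounts in mmol, volumes in mL and concentrations in mol/L).

```python
def analytical_concentration(initial_amount, initial_volume, burette_conc, volume_added):
    """Total concentration T_R of reactant R after adding titrant.

    initial_amount: amount R_0 of R initially in the vessel
    initial_volume: initial volume v_0 in the vessel
    burette_conc:   concentration [R] of R in the burette (0 if absent)
    volume_added:   volume v_i of titrant added so far
    """
    return (initial_amount + volume_added * burette_conc) / (initial_volume + volume_added)
```

A reactant that is absent from the burette is simply diluted: 1 mmol in 50 mL becomes 0.01 mol/L once a further 50 mL of titrant has been added.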

In general, solving these non-linear equations presents a formidable challenge because of the huge range over which the free concentrations may vary. At the beginning, values for the free concentrations must be estimated. Then, these values are refined, usually by means of Newton-Raphson iterations. The logarithms of the free concentrations may be refined rather than the free concentrations themselves. Refinement of the logarithms of the free concentrations has the added advantage of automatically imposing a non-negativity constraint on the free concentrations. Once the free reactant concentrations have been calculated, the concentrations of the complexes are derived from them and the equilibrium constants.
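The following sketch illustrates a Newton-Raphson solution of the mass-balance equations in which the natural logarithms of the free concentrations are refined. It is a simplified illustration, not the algorithm of any particular program: the free reactants are included as species with unit constants, all analytical concentrations are assumed to be positive, the initial estimate (free equal to total) and the step limit of 2 logarithmic units are arbitrary choices, and the function names are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def speciate(log_betas, stoich, totals, max_iter=200, tol=1e-12):
    """Return the concentrations of all species, in the order given.

    log_betas: log10 of the cumulative constant of each species
    stoich:    stoichiometric coefficients of each species, one list per species
    totals:    analytical concentrations of the reactants (all > 0)
    """
    n = len(totals)
    ln_beta = [lb * math.log(10.0) for lb in log_betas]
    x = [math.log(t) for t in totals]  # x[k] = ln(free concentration of reactant k)
    for _ in range(max_iter):
        conc = [math.exp(lb + sum(s[k] * x[k] for k in range(n)))
                for lb, s in zip(ln_beta, stoich)]
        # Mass-balance residuals: calculated total minus known total
        resid = [sum(s[k] * c for s, c in zip(stoich, conc)) - totals[k]
                 for k in range(n)]
        if max(abs(r) / t for r, t in zip(resid, totals)) < tol:
            return conc
        # Derivatives of the calculated totals with respect to ln(free concentrations)
        jac = [[sum(s[k] * s[l] * c for s, c in zip(stoich, conc)) for l in range(n)]
               for k in range(n)]
        shift = solve_linear(jac, [-r for r in resid])
        x = [xi + max(-2.0, min(2.0, d)) for xi, d in zip(x, shift)]
    raise RuntimeError("speciation did not converge")
```

For a monoprotic acid with reactants A and H, the species A, H and HA would be specified as `speciate([0.0, 0.0, 4.76], [[1, 0], [0, 1], [1, 1]], [0.01, 0.01])`; the self-dissociation of water is omitted here for brevity. Because the unknowns are logarithms, the free concentrations remain positive at every iteration.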

Note that the free reactant concentrations can be regarded as implicit parameters in the equilibrium constant refinement process. In that context the values of the free concentrations are constrained by forcing the conditions of mass-balance to apply at all stages of the process.

Equilibrium constant refinement
The objective of the refinement process is to find equilibrium constant values that give the best fit to the experimental data. This is usually achieved by minimising an objective function, U, by the method of non-linear least-squares. First the residuals are defined as
 * $$r_i=y_i^{obs}-y_i^{calc}$$

Then the most general objective function is given by
 * $$U=\sum_i\sum_j r_i W_{ij} r_j\,$$

The matrix of weights, W, should be, ideally, the inverse of the variance-covariance matrix of the observations. It is rare for this to be known. However, when it is, the expectation value of $$\frac{U}{n_d-n_p}$$ is one, where $$n_d$$ and $$n_p$$ are the numbers of data points and parameters, which means that the data are fitted within experimental error. Most often only the diagonal elements are known, in which case the objective function simplifies to
 * $$U=\sum_i W_{ii}r_i^2$$

with $$W_{ij}=0$$ when j ≠ i. Unit weights, $$W_{ii} = 1$$, are often used but, in that case, the expectation value of $$\frac{U}{n_d-n_p}$$ is the variance (mean square) of the experimental errors.
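The diagonal form of the objective function can be written in a few lines. This is a minimal sketch with an illustrative function name; unit weights are used when no weights are supplied.

```python
def objective(y_obs, y_calc, weights=None):
    """Weighted sum of squared residuals, U = sum_i W_ii * r_i**2."""
    residuals = [obs - calc for obs, calc in zip(y_obs, y_calc)]
    if weights is None:
        weights = [1.0] * len(residuals)  # unit weights
    return sum(w * r * r for w, r in zip(weights, residuals))
```

With observed values 1.0 and 2.0 and calculated values 0.5 and 2.5, both residuals have magnitude 0.5, so U is 0.5 with unit weights.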

The minimization may be performed using the Gauss-Newton method. Firstly the objective function is linearised by approximating it as a first-order Taylor series expansion about an initial parameter set, p.
 * $$U=U^0+\sum_i \frac{\partial U}{\partial p_i}\delta p_i$$

The increments $$\delta p_i$$ are to be added to the corresponding initial parameters such that U is less than U0. At the minimum the derivatives $$\frac{\partial U}{\partial p_i}$$, which are simply related to the elements of the Jacobian matrix, J
 * $$J_{jk}=\frac{\partial y_j^{calc}}{\partial p_k}$$

where pk is the kth parameter of the refinement, are equal to zero. One or more equilibrium constants may be parameters of the refinement. However, the measured quantities (see above) represented by y are not expressed in terms of the equilibrium constants, but in terms of the species concentrations, which are implicit functions of these parameters. Therefore the Jacobian elements must be obtained using implicit differentiation.

The parameter increments $$\mathbf{\delta p}$$ are calculated by solving the normal equations, derived from the conditions that $$\mathbf{\frac{\partial U}{\partial p}=0}$$ at the minimum.
 * $$\mathbf{ \left(J^T W J \right)\delta p=J^T W r }$$

The increments $$\delta p$$ are added iteratively to the parameters
 * $$\mathbf{p^{n+1}=p^n +\delta p}$$

where n is an iteration number. The species concentrations and $$y^{calc}$$ values are recalculated at every data point. The iterations are continued until no significant reduction in U is achieved, that is, until a convergence criterion is satisfied. If, however, the updated parameters do not result in a decrease of the objective function, that is, if divergence occurs, the increment calculation must be modified. The simplest modification is to use a fraction, f, of the calculated increment, so-called shift-cutting.
 * $$\mathbf{p^{n+1}=p^n} +f \mathbf{\delta p}$$

In this case, the direction of the shift vector, $$\mathbf{\delta p}$$, is unchanged. With the more powerful Levenberg-Marquardt algorithm, on the other hand, the shift vector is rotated towards the direction of steepest descent, by modifying the normal equations,
 * $$\mathbf{ \left(J^T W J +\lambda I\right)\delta p=J^T W r }$$

where $$\lambda$$ is the Marquardt parameter and I is an identity matrix. Other methods of handling divergence have been proposed.
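The iteration can be illustrated with a small, self-contained sketch using unit weights. The model is a deliberately simple stand-in, not a full speciation calculation: a hypothetical 1:1 binding curve y = y_max·K·x/(1 + K·x), in which x is taken to be a known free ligand concentration and the parameters are log10 K and y_max. The tenfold adjustment of λ and the stopping rule are common but arbitrary choices.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def model(params, x):
    """Hypothetical 1:1 binding curve; params = [log10 K, y_max]."""
    log_k, y_max = params
    k = 10.0 ** log_k
    return y_max * k * x / (1.0 + k * x)


def jacobian_row(params, x):
    """Derivatives of the calculated value with respect to log10 K and y_max."""
    log_k, y_max = params
    k = 10.0 ** log_k
    d_dk = y_max * x / (1.0 + k * x) ** 2
    return [d_dk * k * math.log(10.0), k * x / (1.0 + k * x)]


def sum_sq(params, xs, ys):
    return sum((y - model(params, x)) ** 2 for x, y in zip(xs, ys))


def levenberg_marquardt(params, xs, ys, lam=1e-3, max_iter=100, tol=1e-14):
    n = len(params)
    u = sum_sq(params, xs, ys)
    for _ in range(max_iter):
        rows = [jacobian_row(params, x) for x in xs]
        resid = [y - model(params, x) for x, y in zip(xs, ys)]
        jtj = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
        jtr = [sum(r[i] * e for r, e in zip(rows, resid)) for i in range(n)]
        while True:
            # Damped normal equations: (J^T J + lambda I) dp = J^T r
            damped = [[jtj[i][j] + (lam if i == j else 0.0) for j in range(n)]
                      for i in range(n)]
            shift = solve_linear(damped, jtr)
            trial = [p + d for p, d in zip(params, shift)]
            u_trial = sum_sq(trial, xs, ys)
            if u_trial <= u:
                lam /= 10.0   # step accepted: move towards Gauss-Newton
                break
            lam *= 10.0       # step rejected: move towards steepest descent
            if lam > 1e12:
                return params, u
        reduction = u - u_trial
        params, u = trial, u_trial
        if reduction <= tol:
            break
    return params, u
```

With error-free synthetic data generated from log10 K = 3 and y_max = 1, a refinement started from `[2.5, 0.8]` should return parameters close to the generating values.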

A particular issue arises with NMR and spectrophotometric data. For the latter, the observed quantity is absorbance, A, and the Beer-Lambert law can be written as
 * $$A_i=\sum\epsilon_{pq...} c_{pq...,i}$$

It can be seen that absorbance, A, is a linear function of the molar absorptivities, $$\epsilon$$, at the path length used. In matrix notation
 * $$\mathbf{A=C\Epsilon}$$

There are two approaches to the calculation of the unknown molar absorptivities.
 * 1) The $$\epsilon$$ values are considered to be parameters of the minimization and the Jacobian is constructed on that basis. However, the $$\epsilon$$ values themselves are calculated at each step of the refinement by linear least-squares:
 * $$\mathbf{\Epsilon = \left(C^TC\right)^{-1}C^TA }$$
 * using the refined values of the equilibrium constants to obtain the speciation. The matrix $$\mathbf{\left(C^TC\right)^{-1}C^T}$$ is an example of a pseudo-inverse.
 * 2) The Beer-Lambert law is written as
 * $$\mathbf{A= C\left( \left( C^TC \right)^{-1}C^TA \right)}$$
 * Golub and Pereyra showed how the pseudo-inverse can be differentiated so that parameter increments for both molar absorptivities and equilibrium constants can be calculated by solving the normal equations.
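The linear least-squares step of the first approach can be illustrated for a single wavelength. In this sketch each row of the concentration matrix holds the species concentrations in one solution, the path length is taken as unity, and the function names are illustrative. The normal equations are solved directly; forming an explicit pseudo-inverse is not needed.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_absorptivities(conc_rows, absorbances):
    """Least-squares molar absorptivities from (C^T C) eps = C^T A.

    conc_rows:   one list of species concentrations per solution
    absorbances: measured absorbance of each solution at one wavelength
    """
    n = len(conc_rows[0])
    ctc = [[sum(row[i] * row[j] for row in conc_rows) for j in range(n)]
           for i in range(n)]
    cta = [sum(row[i] * a for row, a in zip(conc_rows, absorbances))
           for i in range(n)]
    return solve_linear(ctc, cta)
```

In a refinement, the concentration rows would be recalculated from the current equilibrium constants before each call, so that the absorptivities always correspond to the current speciation.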

Parameter errors and correlation
In the region close to the minimum of the objective function, U, the system approximates to a linear least-squares system, for which
 * $$\mathbf{p=(J^TWJ)^{-1}J^TWy^{obs}}$$

Therefore the parameter values are (approximately) linear combinations of the observed data values and the errors on the parameters, p, can be obtained by error propagation from the observations, yobs, using the linear formula. Let the variance-covariance matrix for the observations be denoted by $$\Sigma^y$$ and that of the parameters by $$\Sigma^p$$. Then,
 * $$\mathbf{\Sigma^p=(J^TWJ)^{-1}J^TW \Sigma^y W^TJ(J^TWJ)^{-1}}$$

When $$\mathbf{W=(\Sigma^y)^{-1}}$$, this simplifies to
 * $$\mathbf{\Sigma^p=(J^TWJ)^{-1}}$$

In most cases the errors on the observations are un-correlated, so that $$\Sigma^y$$ is diagonal. If so, each weight should be the reciprocal of the variance of the corresponding observation. For example, in a potentiometric titration, the weight at a titration point, k, can be given by
 * $$W_k= \frac{1}{\sigma^2_E+\left( \frac{\partial E}{\partial v} \right)^2_k\sigma^2_v} $$

where $$\sigma_E\,$$ is the error in electrode potential or pH, $$\left( \frac{\partial E}{\partial v} \right)_k$$ is the slope of the titration curve and $$\sigma_v\,$$ is the error on added volume.
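The weighting formula can be sketched as follows. The function name is illustrative, and the slope of the titration curve is assumed to have been estimated separately, for example by numerical differentiation of the readings.

```python
def potentiometric_weight(sigma_E, sigma_v, slope):
    """Weight of one titration point: 1 / (sigma_E**2 + (dE/dv)**2 * sigma_v**2).

    sigma_E: estimated error in electrode potential (or pH)
    sigma_v: estimated error in added volume
    slope:   slope dE/dv of the titration curve at this point
    """
    return 1.0 / (sigma_E ** 2 + (slope * sigma_v) ** 2)
```

Points near an end-point, where the slope is large, receive lower weights because a small volume error there produces a large error in the reading.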

When unit weights are used ($$\mathbf{W=I, p=(J^TJ)^{-1}J^Ty}$$) it is implied that the experimental errors are uncorrelated and all equal: $$\Sigma^y=\sigma^2 \mathbf{I}$$, where $$\sigma^2\,$$ is known as the variance of an observation of unit weight, and $$\mathbf{I}$$ is an identity matrix. In this case $$\sigma^2\,$$ is approximated by $$\frac{U}{n_d-n_p}$$, where U is the minimum value of the objective function and nd and np are the number of data and parameters, respectively.
 * $$\mathbf{\Sigma^p=\frac{U}{n_d-n_p}(J^TJ)^{-1}}$$

In all cases, the variance of the parameter $$p_i$$ is given by $$\Sigma^p_{ii}$$ and the covariance between parameters $$p_i$$ and $$p_j$$ is given by $$\Sigma^p_{ij}$$. Standard deviation is the square root of variance. These error estimates reflect only random errors in the measurements. The true uncertainty in the parameters is larger due to the presence of systematic errors which, by definition, cannot be quantified.

Note that even though the observations may be un-correlated, the parameters are always correlated.
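The unit-weight formula for the parameter variance-covariance matrix, and the correlation coefficient derived from it, can be illustrated for two parameters. This sketch is restricted to a two-column Jacobian so that the matrix inverse can be written out explicitly; the function names are illustrative.

```python
import math


def parameter_covariance(jacobian, u_min):
    """Sigma^p = U / (n_d - n_p) * (J^T J)^-1 for a two-parameter fit, unit weights."""
    n_d, n_p = len(jacobian), 2
    a = sum(row[0] * row[0] for row in jacobian)
    b = sum(row[0] * row[1] for row in jacobian)
    d = sum(row[1] * row[1] for row in jacobian)
    det = a * d - b * b
    s2 = u_min / (n_d - n_p)  # variance of an observation of unit weight
    return [[s2 * d / det, -s2 * b / det],
            [-s2 * b / det, s2 * a / det]]


def correlation(cov):
    """Correlation coefficient between the two parameters."""
    return cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])
```

For a straight line fitted at x = 0, 1, 2 and 3, the Jacobian rows are [1, x] and the correlation coefficient between intercept and slope is about −0.80, regardless of the size of the residuals.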

Derived constants
When cumulative constants have been refined it is often useful to derive stepwise constants from them. The general procedure is to write down the defining expressions for all the constants involved and then to equate concentrations. For example, suppose that one wishes to derive the pKa for removing one proton from a tribasic acid, LH3, such as citric acid.
 * $$L^{3-}+2H^+\leftrightharpoons LH_2^-:[LH_2^-]=\beta_{12}[L^{3-}][H^+]^2$$
 * $$L^{3-}+3H^+\leftrightharpoons LH_3:[LH_3]=\beta_{13}[L^{3-}][H^+]^3$$

The stepwise association constant for formation of LH3 is given by
 * $$LH_2^- +H^+\leftrightharpoons LH_3:[LH_3]=K[LH_2^-][H^+]$$

Substitute the expressions for the concentrations of LH3 and LH2- into this equation
 * $$\beta_{13}[L^{3-}][H^+]^3=K\beta_{12}[L^{3-}][H^+]^2[H^+]\,$$

whence
 * $$\beta_{13}=K\beta_{12}: K=\frac{\beta_{13}}{\beta_{12}} \,$$

and since $$pK_a=-\log(1/K)\,$$ its value is given by
 * $$pK_{a1}= \log \beta_{13}-\log \beta_{12}\, $$

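When calculating the error on the stepwise constant, the fact that the cumulative constants are correlated must be taken into account. By error propagation for the quotient $$K=\beta_{13}/\beta_{12}$$
 * $$\left(\frac{\sigma_K}{K}\right)^2=\left(\frac{\sigma_{\beta_{12}}}{\beta_{12}}\right)^2+\left(\frac{\sigma_{\beta_{13}}}{\beta_{13}}\right)^2-2 \frac{\sigma_{\beta_{12}}}{\beta_{12}} \frac{\sigma_{\beta_{13}}}{\beta_{13}}\rho_{12,13}\,$$

and, to first order, for decadic logarithms
 * $$\sigma_{\log K}=\frac{\sigma_K}{K \ln 10}$$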
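Working in logarithmic units, where the quotient becomes a difference, the propagation can be sketched as follows. It is assumed here that the refinement reports standard deviations on log β together with a correlation coefficient; the function name and the numerical values in the usage note are illustrative only and are not data for citric acid.

```python
import math


def stepwise_log_k(log_b12, log_b13, sd12, sd13, rho):
    """Stepwise log K = log b13 - log b12 and its standard deviation.

    sd12, sd13: standard deviations of log b12 and log b13
    rho:        correlation coefficient between log b12 and log b13
    """
    log_k = log_b13 - log_b12
    variance = sd12 ** 2 + sd13 ** 2 - 2.0 * rho * sd12 * sd13
    return log_k, math.sqrt(max(variance, 0.0))
```

With standard deviations of 0.03 and 0.04, the error on log K is 0.05 if the cumulative constants are uncorrelated but only 0.01 if they are perfectly positively correlated, which shows why the correlation term should not be neglected.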

Model selection
Once a refinement has been completed the results should be checked to verify that the chosen model is acceptable. Generally speaking, a model is acceptable when the data are fitted within experimental error, but there is no single criterion with which to make the judgement. The following should be taken into consideration.

The objective function
When the weights have been correctly derived from estimates of experimental error, the expectation value of $$\frac{U}{n_d-n_p}$$ is 1. It is therefore very useful to estimate experimental errors and derive some reasonable weights from them as this is an absolute indicator of the goodness of fit.

When unit weights are used, it is implied that all observations have the same variance. In that case $$\frac{U}{n_d-n_p}$$ is expected to be equal to that variance.

Parameter errors
The errors on the stability constants should be roughly commensurate with experimental error. For example, with pH titration data, if pH is measured to 2 decimal places the errors of log $$\beta$$ should not be much larger than 0.01.

Distribution of residuals
At the minimum in U the system can be approximated to a linear one. In the case of unit weights, the residuals are related to the observations by
 * $$\mathbf{r=y^{obs}-J \left(J^TJ \right)^{-1}J^T y^{obs}}$$

The symmetric, idempotent matrix $$\mathbf{J \left(J^TJ \right)^{-1}J^T}$$ is known in the statistics literature as the hat matrix, $$\mathbf{H}$$. Thus,
 * $$\mathbf{r=\left(I-H \right) y^{obs}}$$

and
 * $$\mathbf{M^r=\left(I-H \right) M^y \left(I-H \right)}$$

where I is an identity matrix and Mr and My are the variance-covariance matrices of the residuals and observations, respectively. This shows that even though the observations may be un-correlated, the residuals are always correlated.
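The correlation of residuals can be demonstrated numerically. The sketch below is restricted to a two-column Jacobian and to uncorrelated observations of equal variance, $$M^y=\sigma^2 I$$; both restrictions, and the function names, are assumptions made to keep the example short.

```python
def hat_matrix(jacobian):
    """H = J (J^T J)^-1 J^T for a two-column Jacobian."""
    a = sum(row[0] * row[0] for row in jacobian)
    b = sum(row[0] * row[1] for row in jacobian)
    d = sum(row[1] * row[1] for row in jacobian)
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]

    def quad(u, v):
        return sum(u[i] * inv[i][j] * v[j] for i in range(2) for j in range(2))

    return [[quad(ri, rj) for rj in jacobian] for ri in jacobian]


def residual_covariance(jacobian, sigma2):
    """M^r = (I - H) M^y (I - H) with M^y = sigma2 * I."""
    h = hat_matrix(jacobian)
    n = len(h)
    q = [[(1.0 if i == j else 0.0) - h[i][j] for j in range(n)] for i in range(n)]
    return [[sigma2 * sum(q[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

For a straight line fitted at x = 0, 1, 2 and 3 (Jacobian rows [1, x]), the trace of H equals the number of parameters, 2, and the off-diagonal elements of the residual covariance matrix are non-zero even though the observations themselves are uncorrelated.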

The diagram at the right shows the result of a refinement of the stability constants of Ni Gly+, Ni(Gly)2 and Ni(Gly)3- (GlyH=glycine). The observed values are shown as blue diamonds and the species concentrations, as a percentage of the total nickel, are superimposed. The residuals are shown in the lower box. The presence of correlation is evident in the way that sequences of residuals have the same sign. Correlation notwithstanding, the magnitudes of the residuals show some randomness. Individual residuals are mostly commensurate with experimental error (about 0.002 in pH). This is about as good a fit as can be expected.

Physical constraints
Some physical constraints are usually incorporated in the calculations. For example, all the concentrations of free reactants and species must have positive values and association constants must have positive values.

With spectrophotometric data the molar absorptivity (or emissivity) values should all be positive. Most computer programs do not impose this constraint on the calculations.

Other models
If the model is not acceptable a variety of other models should be examined in order to find the model that best fits the experimental data, within experimental error. The main difficulty is with the so-called minor species. These are species whose concentration is so low that the effect on the measured quantity is at or below the level of error in the experimental measurement. The constant for a minor species may prove impossible to determine if there is no means to increase the concentration of the species.

Implementations
Some simple systems are amenable to spreadsheet calculations. These calculations do not follow the general procedures outlined here and use SOLVER to perform the least-squares minimization.

A large number of computer programs for equilibrium constant calculation have been published. See for a bibliography. The most frequently used programs are:
 * Potentiometric data: Hyperquad, BEST, PSEQUAD
 * Spectrophotometric data: Hyperquad, SQUAD, Specfit (Specfit/32 is now available commercially)
 * NMR data: HypNMR, EQNMR