CZ3253 Practical 1: Comparative Molecular Field Analysis (CoMFA)
 

CoMFA Theory

The idea underlying Comparative Molecular Field Analysis (CoMFA) is that differences in a target property are often related to differences in the shapes of the non-covalent fields surrounding the tested molecules. To put the shape of a molecular field into a QSAR table, the magnitudes of its steric (Lennard-Jones) and electrostatic (Coulombic) fields are sampled at regular intervals throughout a defined region.

CoMFA has many adjustable parameters, but the most important is the relative alignment of the individual molecules when their fields are computed. Properly aligned molecules have a comparable conformation and a similar orientation in Cartesian space.

The QSAR is then generated by a partial least squares (PLS) analysis of the data contained in the Molecular Spreadsheet (MSS). The quality of the resulting QSAR is judged by the crossvalidated r2 (from now on referred to as q2) reported by the PLS analysis. If q2 is acceptable, the CoMFA QSAR is re-derived in final, non-crossvalidated form and is most easily examined using various graphics techniques. Otherwise, the alignment of one or more molecules can be changed, or other parameters altered, and the analysis repeated. Once an acceptable QSAR has been derived, predicting the target property value for a new molecule is straightforward.
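
To make the field-sampling idea concrete, here is a minimal Python sketch of how one row of CoMFA columns could be built for a single molecule. It is not SYBYL's implementation: the three-atom "molecule", its partial charges, the probe charge, the Lennard-Jones parameters, the 2 angstrom grid spacing and the 30 kcal/mol cap are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# Toy "molecule": (x, y, z, partial charge) per atom, in angstroms and
# elementary charges. Coordinates and charges are invented for illustration.
ATOMS = [
    (0.0, 0.0, 0.0, -0.4),
    (1.4, 0.0, 0.0, 0.2),
    (-1.4, 0.0, 0.0, 0.2),
]

# Assumed probe and field parameters (illustrative only).
PROBE_CHARGE = 1.0   # probe charge, e
EPSILON = 0.1        # Lennard-Jones well depth, kcal/mol
R_MIN = 3.4          # distance of the Lennard-Jones minimum, angstroms
COULOMB_K = 332.06   # Coulomb constant, kcal*angstrom/(mol*e^2)
CUTOFF = 30.0        # cap applied to field values, kcal/mol


def field_at(point, atoms):
    """Return (steric, electrostatic) probe energies at one grid point."""
    steric = 0.0
    electrostatic = 0.0
    for ax, ay, az, charge in atoms:
        # Clamp the distance so a grid point sitting on an atom does not
        # cause a division by zero; the cap below absorbs the huge value.
        r = max(math.dist(point, (ax, ay, az)), 0.1)
        ratio6 = (R_MIN / r) ** 6
        steric += EPSILON * (ratio6 * ratio6 - 2.0 * ratio6)
        electrostatic += COULOMB_K * PROBE_CHARGE * charge / r
    steric = min(steric, CUTOFF)
    electrostatic = max(-CUTOFF, min(electrostatic, CUTOFF))
    return steric, electrostatic


def grid_points(lower, upper, spacing):
    """Yield the points of a cubic lattice spanning [lower, upper] per axis."""
    steps = int(round((upper - lower) / spacing)) + 1
    axis = [lower + i * spacing for i in range(steps)]
    for x in axis:
        for y in axis:
            for z in axis:
                yield (x, y, z)


def comfa_row(atoms, lower=-4.0, upper=4.0, spacing=2.0):
    """Return all steric values followed by all electrostatic values."""
    steric_values = []
    electrostatic_values = []
    for point in grid_points(lower, upper, spacing):
        steric, electrostatic = field_at(point, atoms)
        steric_values.append(steric)
        electrostatic_values.append(electrostatic)
    return steric_values + electrostatic_values


row = comfa_row(ATOMS)
print(len(row), "field values for this molecule")
```

Because every molecule is sampled on the same lattice, the rows are only comparable if the molecules share a common orientation, which is why the alignment step matters so much.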

From a user's point of view, the four major phases of a CoMFA are:

  1. Preparing for CoMFA
  2. Building Spreadsheet for CoMFA
  3. CoMFA PLS Runs
  4. Examination and Use of CoMFA Results

In this practical, we will be constructing a QSAR for a set of ryanoid molecules. The inhibitory activity values are chicken KDs (nM) for displacement of 7 nM [3H]ryanodine from skeletal muscle. (W. Welch, S. Ahmad, K. Gerzon, R.A. Humerickhouse, H.R. Besch, Jr., L. Ruest, P. Deslongchamps & J.L. Sutko, Biochemistry, 1994, 33: 6074-6085)

 

Preparing for CoMFA

  1. All the ryanoids include a tetrahydropyran which can be used to align them. Retrieve tetrahydropyran from the Fragment Library.
  2. Remove all the hydrogens.
  3. Save the core structure as a mol2 file so you can retrieve it later. This is a precautionary measure because the contents of this work area will be overwritten during step 4.
  4. Align all the molecules in the database onto one of them using the molecule on the screen as the basis for the alignment.
  5. Watch the alignment as it proceeds: each molecule in turn is retrieved from the database, brought into a new molecule area and aligned on the template. Once the process is complete, you are prompted for the name of a new database to store the 18 ryanoids in their new orientations.
  6. The new alignments have all been saved, so you can now clear the screen. This will reduce the time SYBYL spends refreshing the screen between commands.

 

Building Spreadsheet for CoMFA

  1. Create a spreadsheet from the new database and save it.
  2. Read in the biological data.
  3. QSAR uses the logarithms of concentrations. Because the concentrations are expressed in nM, the values we will use are -log(concentration × 10^-9).
  4. Add the CoMFA columns.
  5. Save the table.
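
The conversion of the nM values to the log scale can be sketched in a few lines of Python. The function name and the example concentrations are hypothetical; the sketch uses the identity -log10(c × 10^-9) = 9 - log10(c).

```python
import math


def kd_nm_to_pkd(kd_nm):
    """Convert a KD given in nM to -log10(KD in mol/L)."""
    if kd_nm <= 0:
        raise ValueError("KD must be positive")
    # -log10(kd_nm * 1e-9) is the same as 9 - log10(kd_nm).
    return 9.0 - math.log10(kd_nm)


# Invented example concentrations in nM, not values from the data set.
example_kds_nm = [1.0, 10.0, 100.0]
pkd_values = [kd_nm_to_pkd(kd) for kd in example_kds_nm]
print(pkd_values)
```

On this scale a larger value means tighter binding, and a tenfold drop in KD adds one log unit.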

 

CoMFA PLS Runs

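  1. With no rows or columns selected, operations are performed on the entire spreadsheet. Perform a crossvalidated run with two objectives: to confirm the predictive ability of the model (q2), and to find the number of components that gives the maximum q2.
  2. With crossvalidation having confirmed the predictive ability, derive the final, non-crossvalidated model with that number of components for use in graphics and in numerical prediction.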
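
The difference between the crossvalidated q2 and the fitted r2 can be illustrated with a short Python sketch. To keep it self-contained, a one-descriptor least-squares line stands in for the PLS model, the data are invented, and q2 is computed as 1 - PRESS/SS with SS taken about the overall mean activity; SYBYL's exact procedure may differ in detail.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_ss(ys):
    """Sum of squared deviations of the activities from their mean."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def fitted_r2(xs, ys):
    """Conventional r2: every molecule is used to fit and to evaluate."""
    slope, intercept = fit_line(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_ss(ys)


def loo_q2(xs, ys):
    """Leave-one-out q2: each molecule is predicted by a model built without it."""
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / total_ss(ys)


# Invented descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [7.1, 7.4, 7.3, 7.9, 8.0, 8.3]

r2 = fitted_r2(descriptor, activity)
q2 = loo_q2(descriptor, activity)
print(f"r2 = {r2:.3f}, q2 = {q2:.3f}")
```

For a least-squares fit the leave-one-out q2 never exceeds the fitted r2, which is why q2 is the more conservative estimate of how the model will perform on new molecules.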

 

Examination and Use of CoMFA Results

  1. Investigate the CoMFA results.
  2. Take a molecule that is a good candidate for improvement, make some structural changes and predict its activity.
  3. Label the atoms by ID numbers.
  4. You can easily model this new compound.
  5. Calculate the charges and predict the activity of this new compound. Note that the changes you have made to the DehydroR_2 molecule are minor. This means that minimization and realignment are not necessary.
  6. Close the spreadsheet.

 

Answers to Questions

Question 1: What is the range of values for the CoMFA column?
Answer: The values range from 154 to 236.

Question 2: What is the crossvalidated r2? Is the model good enough? What is the number of components for maximum r2?
Answer: The crossvalidated r2 is about 0.522, so one would expect such a model to account for about 52% of the actual variance in activity among additional similar ryanoids. This is good enough to at least rank the activity of proposed new compounds reasonably well. The maximum crossvalidated r2 occurs at 2 components. Thus we know that the model is useful and that we should use 2 components for the final analysis.

Question 3: What is the r2? What is the percentage of contribution to the model's information from the electrostatic fields? What is the percentage of contribution to the model's information from the steric fields?
Answer: Information in the text window reports that the r2 measure of fit is 0.801. The electrostatic fields contribute 49.8% of the model's information, while the steric fields represent the other 50.2%.

Question 4: Which molecule has the highest inhibitory activity? What are its actual and predicted activities?
Answer: The molecule in row 5, DEHYDRO_R2, has the highest inhibitory activity. The actual value for the inhibitory activity is 8.22, while the predicted value is 7.85.

Question 5: What is the predicted activity of the modified compound?
Answer: The predicted value of DehydroR_2F, the modified compound, is 8.09. The predicted activity of the compound before modification is 7.85, as shown further up in the textport window (from when you picked the compound in row 5). So, replacing the hydrogen with a fluorine is expected to have a small, positive effect on activity.
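
To see what this difference means on the original concentration scale, the two predicted values can be converted back to KD in nM. The Python sketch below assumes the activities are -log10(KD in mol/L), as set up when the spreadsheet was built; the function and variable names are hypothetical.

```python
def pkd_to_kd_nm(pkd):
    """Invert pKD = -log10(KD in mol/L) and return KD in nM."""
    return 10.0 ** (9.0 - pkd)


predicted_parent = 7.85   # predicted activity before modification
predicted_fluoro = 8.09   # predicted activity of DehydroR_2F

kd_parent_nm = pkd_to_kd_nm(predicted_parent)
kd_fluoro_nm = pkd_to_kd_nm(predicted_fluoro)

# Ratio of the two predicted KDs; a value above 1 means tighter binding
# is predicted for the fluorinated compound.
fold_improvement = kd_parent_nm / kd_fluoro_nm

print(f"parent: {kd_parent_nm:.1f} nM, fluoro: {kd_fluoro_nm:.1f} nM")
print(f"predicted fold improvement: {fold_improvement:.2f}")
```

A difference of 0.24 log units corresponds to a predicted KD lower by a factor of roughly 1.7, which is modest next to the gap between the actual (8.22) and predicted (7.85) activities of the parent compound.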