HOW TO READ GEOFOLD OUTPUT

 

Timecourse

 

GeoFOLD analyzes your protein and produces a directed acyclic graph (DAG) of all topologically possible unfolding intermediates, each with calculated transition state energies. Starting from the DAG, UnfoldSim calculates the timecourse of unfolding. The native state concentration is initialized to the value of CONCENTRATION in the parameters file, and all other nodes are initialized to zero concentration. At each time step, the concentration of each node is updated based on the concentrations of all nodes connected to it. "F" plots the native state together with any states that retain at least 90% of the buried surface area of the native state. "U" is the summed concentration of all states with a buried surface area below 1000, equivalent to a typical 10-residue segment. "I" is the summed concentration of all other, intermediate states, those that are neither F nor U. The simulation ends when the protein is half unfolded (if HALFLIFE is set to 1) or when there are no further changes in concentration (if HALFLIFE is set to 0). If FOLDING is set to 1, the simulation is instead initialized with the unfolded state at non-zero concentration and all other states, including the native state, at zero concentration.
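The F, I and U curves are sums over states binned by buried surface area. The following is a minimal sketch of that binning in Python, not UnfoldSim's actual code. The function names, the data layout (pairs of buried area and concentration) and the snapshot values are all assumptions made for illustration; only the 90% and 1000 thresholds come from the description above.

```python
def classify_state(buried_area, native_area,
                   folded_fraction=0.9, unfolded_cutoff=1000.0):
    """Return 'F', 'U' or 'I' for one state (hypothetical helper)."""
    # F: at least 90% of the native buried surface area.
    if buried_area >= folded_fraction * native_area:
        return "F"
    # U: buried surface area below the cutoff of 1000.
    if buried_area < unfolded_cutoff:
        return "U"
    # I: everything that is neither F nor U.
    return "I"


def bin_concentrations(states, native_area):
    """Sum concentrations per class; states is a list of (area, conc)."""
    totals = {"F": 0.0, "I": 0.0, "U": 0.0}
    for buried_area, concentration in states:
        totals[classify_state(buried_area, native_area)] += concentration
    return totals


# Invented snapshot at one time step: (buried area, concentration).
native_area = 7000.0
snapshot = [(7000.0, 0.5), (6665.21, 0.2), (3000.0, 0.2), (400.0, 0.1)]
totals = bin_concentrations(snapshot, native_area)
```

Calling bin_concentrations once per time step would give the three curves of the Timecourse plot.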


Age Plot

This image is a contact map of the protein, colored by the order in which contacts are broken along the unfolding pathway. Red contacts are broken early in unfolding, followed by yellow, green and cyan; blue contacts are broken late in unfolding. The contacts are ordered by "age", defined as the sum of the concentrations of all states that contain the contact in question, so age is higher when the total concentration of states having that contact is higher. The Age Plot is calculated at the point where the concentration of unfolded protein (U in the Timecourse) first passes 50% (or, if FOLDING = 1, first drops below 50%). The Age Plot can be used to identify early folding intermediates (blue contacts) and early unfolding segments (red contacts), and to identify contacts that have topological dependencies.
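The age calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GeoFOLD's implementation: a state is represented as a set of residue numbers, and a state is assumed to "contain" a contact when both residues of the contact are present in it. The function names and the example values are invented.

```python
def contact_ages(states, contacts):
    """Age of each contact = summed concentration of states containing it.

    states:   list of (set_of_residue_numbers, concentration)
    contacts: list of (i, j) residue pairs
    """
    ages = {}
    for i, j in contacts:
        # Assumption: a state contains the contact if it holds both residues.
        ages[(i, j)] = sum(conc for residues, conc in states
                           if i in residues and j in residues)
    return ages


def unfolding_order(ages):
    """Contacts sorted from lowest age (broken early) to highest (late)."""
    return sorted(ages, key=ages.get)


# Invented snapshot: native state, one intermediate, fully unfolded.
states = [
    (set(range(1, 11)), 0.3),   # residues 1-10 present
    (set(range(1, 7)), 0.5),    # residues 1-6 present
    (set(), 0.2),               # nothing left folded
]
ages = contact_ages(states, [(1, 5), (2, 9)])
order = unfolding_order(ages)
```

Here contact (2, 9) survives only in the native state, so it has the lower age and would be colored toward the red end of the scale.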

 

In the case shown here, the C-terminus is the first to unfold. Of the three beta hairpin structures in this protein (clusters of contacts perpendicular to the diagonal), the third is the most enduring, remaining until the end for either topological or energetic reasons.


Pathways

 

GeoFOLD pathways depict a subset of the states in the directed acyclic graph (DAG), selected according to the flow of traffic through the graph in an UnfoldSim simulation. The pathway you see (and all other results, for that matter) depends on the settings for TEMPERATURE, OMEGA, and all energies, entropies and cutoffs. Orange nodes are structural states, with the native state at the top. Diamonds are transition states: red for pivots, black for hinges and white for breaks.

 

Clicking on a node gives specific information on that structural state or transition state. For example, clicking on an orange circle node gives this information:

 

ISEGMT       2    2    0     6665.21       98.67       0     35   0.056048164

BBBBBBBBB..........AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA    1-   9,  20-  58

 

Columns in the first line (ISEGMT) are:

2: Segment number

3: Graph depth from native

4: Number of symmetry-equivalent segments in this protein

5: Buried solvent-accessible surface area

6: Unexpressed sidechain entropy

7: Number of buried void spaces

8: Number of hydrogen bonds

9: Final concentration (at half-life or equilibrium, depending on the HALFLIFE setting)

The second line shows the residue positions present in this structural state, in this example residues 1-9 and 20-58. The string shows that the two contiguous segments are being treated (internally) as separate chains. This is done because the connecting segment (residues 10-19) is already unfolded at this stage of unfolding.
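If you copy this text out of the browser, the two lines can be read into named fields with a short script. The sketch below is not part of GeoFOLD. The field names are invented labels for the columns listed above, and the code assumes that position n in the chain string corresponds to residue n, which matches this example but may not hold for every input.

```python
from itertools import groupby

# Invented labels for columns 2-9 of an ISEGMT line, with their types.
ISEGMT_FIELDS = [
    ("segment", int), ("depth", int), ("symmetry_copies", int),
    ("buried_sasa", float), ("sidechain_entropy", float),
    ("voids", int), ("hbonds", int), ("concentration", float),
]


def parse_isegmt(line):
    """Split an ISEGMT line into a dict keyed by the labels above."""
    tokens = line.split()
    if len(tokens) != 9 or tokens[0] != "ISEGMT":
        raise ValueError("not an ISEGMT line: %r" % line)
    return {name: cast(tok)
            for (name, cast), tok in zip(ISEGMT_FIELDS, tokens[1:])}


def residue_ranges(chain_string):
    """Return [(chain_label, first, last), ...] for each run of letters.

    Assumes 1-based numbering with one character per residue;
    '.' marks residues absent from this state.
    """
    ranges, position = [], 1
    for label, run in groupby(chain_string):
        length = len(list(run))
        if label != ".":
            ranges.append((label, position, position + length - 1))
        position += length
    return ranges


record = parse_isegmt(
    "ISEGMT       2    2    0     6665.21       98.67"
    "       0     35   0.056048164")
# Same pattern as the example: 9 residues, a 10-residue gap, 39 residues.
ranges = residue_ranges("B" * 9 + "." * 10 + "A" * 39)
```

For the example above this yields chain B covering residues 1-9 and chain A covering residues 20-58.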


Clicking on a diamond node gives:

 

TSTATE      70      1      2    105        0.35   h   0.7768

 

Columns in the TSTATE line are:

2: TSTATE number

3: ISEGMT f, number of the incoming structural state

4: ISEGMT u1, number of outgoing structural state 1

5: ISEGMT u2, number of outgoing structural state 2

6: Entropy of this transition relative to the most flexible transition of this cut type

7: Cut type, b: break, p: pivot or h: hinge

8: Traffic through this TSTATE as a fraction of all traffic coming from ISEGMT f
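A TSTATE line can be read the same way. This is a sketch for illustration, not GeoFOLD code. The TState class and its attribute names are invented here; only the column order and the b/p/h codes come from the list above.

```python
from typing import NamedTuple

CUT_TYPES = {"b": "break", "p": "pivot", "h": "hinge"}


class TState(NamedTuple):
    """One transition state; attribute names are illustrative only."""
    number: int
    f: int          # incoming structural state (ISEGMT f)
    u1: int         # outgoing structural state 1
    u2: int         # outgoing structural state 2
    entropy: float  # relative to the most flexible cut of this type
    cut_type: str   # "break", "pivot" or "hinge"
    traffic: float  # fraction of all traffic leaving ISEGMT f


def parse_tstate(line):
    """Parse one TSTATE line into a TState record."""
    tokens = line.split()
    if len(tokens) != 8 or tokens[0] != "TSTATE":
        raise ValueError("not a TSTATE line: %r" % line)
    if tokens[6] not in CUT_TYPES:
        raise ValueError("unknown cut type: %r" % tokens[6])
    return TState(int(tokens[1]), int(tokens[2]), int(tokens[3]),
                  int(tokens[4]), float(tokens[5]),
                  CUT_TYPES[tokens[6]], float(tokens[7]))


ts = parse_tstate(
    "TSTATE      70      1      2    105        0.35   h   0.7768")
```

For the example line, this gives a hinge move from state 1 to states 2 and 105 that carries about 78% of the traffic leaving state 1.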

 

Bold states and thick green lines mark the unfolding pathway with the highest traffic. Other pathways with non-zero traffic are shown as smaller, dimmer nodes and thin lines. Pathways with near-zero traffic are not drawn. (Note that a simulation of unfolding under folding conditions will show only the folded state: a very boring graph.)
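One way to picture the highest-traffic pathway is as a greedy walk: at each structural state, take the transition state with the largest traffic fraction, then continue from both of its products. The sketch below shows that idea only. It is an assumption about how such a pathway could be traced, not a description of how GeoFOLD draws the graph. The tuple layout (f, u1, u2, traffic), the function name, the choice of state 1 as the native state, and every number in the example other than the first tuple are invented.

```python
def dominant_pathway(tstates, native=1):
    """Greedy trace of the highest-traffic route through the DAG.

    tstates: list of (f, u1, u2, traffic) tuples.
    Returns the chosen transition states in the order visited.
    """
    # Keep only the highest-traffic transition leaving each state.
    best = {}
    for ts in tstates:
        f, traffic = ts[0], ts[3]
        if f not in best or traffic > best[f][3]:
            best[f] = ts

    path, stack, seen = [], [native], set()
    while stack:
        state = stack.pop()
        if state in seen or state not in best:
            continue  # already expanded, or nothing leaves this state
        seen.add(state)
        chosen = best[state]
        path.append(chosen)
        # Push u2 first so that u1 is expanded first.
        stack.extend([chosen[2], chosen[1]])
    return path


# Invented example: two ways out of state 1, two ways out of state 2.
tstates = [
    (1, 2, 105, 0.7768), (1, 3, 104, 0.2232),
    (2, 4, 5, 0.9), (2, 6, 7, 0.1),
]
path = dominant_pathway(tstates)
```

In this toy graph the trace follows state 1 to states 2 and 105, then state 2 to states 4 and 5; the lower-traffic alternatives correspond to the thin lines in the plot.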

 

Interpreting the graph

The overall shape of this graph depends on the protein and the conditions of unfolding. A protein with many topological complexities, or with disulfides, will have one or more hinge (black diamond) nodes. Multiple-chain proteins will have breaks (white diamonds) where subunit chains (or segments separated by an unfolded segment) are physically separated during unfolding. Loosely bound subunits will dissociate early in unfolding, shown by white diamonds at or near the top of the graph.

 

The graph for a protein with many alternative unfolding pathways will be wide, with many options for unfolding traffic. At higher omega (lower virtual urea), closer to the melting point, the paths may reduce to one dominant pathway, because the higher desolvation energy accentuates the differences between pathways.

 

A tall graph with many short branches, such as the one shown on the right, has a single predominant pathway characterized by the "peeling" of short segments from the surface, like a ball of string.


Some unfolding pathways, such as the one to the right, have multiple long branches. The length of a branch is roughly proportional to the size of the state at the branch point. This graph splits into two large segments at depth 6, and one of the resulting children immediately splits again.

 

Also note in this image that more than one thick green line can end in one ISEGMT node. How can this happen, since a given segment can only unfold once? This happens in multimers, because symmetry-related segments are assigned the same state.

