Folding@home - Project Significance

Project Significance

Further information: Protein folding

Proteins are an essential component to many biological functions and participate in virtually all processes within biological cells. They often act as enzymes, performing biochemical reactions including cell signaling, molecular transportation, and cellular regulation. As structural elements, some proteins act as a type of skeleton for cells, and as antibodies, other proteins participate in the immune system. Before a protein can take on these roles, it must fold into a functional three-dimensional structure, a process that often occurs spontaneously and is dependent on interactions within its amino acid sequence and interactions of the amino acids with their surroundings. Protein folding is driven by the search to find the most energetically favorable conformation of the protein, i.e. its native state. Thus, understanding protein folding is critical to understanding what a protein does and how it works, and is considered a "holy grail" of computational biology. Despite folding occurring within a crowded cellular environment, it typically proceeds smoothly. However, due to a protein's chemical properties or other factors, proteins may misfold—that is, fold down the wrong pathway and end up misshapen. Unless cellular mechanisms are capable of destroying or refolding such misfolded proteins, they can subsequently aggregate and cause a variety of debilitating diseases. Laboratory experiments studying these processes can be limited in scope and atomic detail, leading scientists to use physics-based computational models that, when complementing experiments, seek to provide a more complete picture of protein folding, misfolding, and aggregation.

Due to the complexity of proteins' conformation space—the set of possible shapes a protein can take—and limitations in computational power, all-atom molecular dynamics simulations have been severely limited in the timescales which they can study. While most proteins typically fold in the order of milliseconds, before 2010 simulations could only reach nanosecond to microsecond timescales. General-purpose supercomputers have been used to simulate protein folding, but such systems are intrinsically expensive and typically shared among many research groups. Additionally, because the computations in kinetic models are serial in nature, strong scaling of traditional molecular simulations to these architectures is exceptionally difficult. Moreover, as protein folding is a stochastic process and can statistically vary over time, it is computationally challenging to use long simulations for comprehensive views of the folding process.

Protein folding does not occur in a single step. Instead, proteins spend the majority of their folding time—nearly 96% in some cases—"waiting" in various intermediate conformational states, each a local thermodynamic free energy minimum in the protein's energy landscape. Through a process known as adaptive sampling, these conformations are used by Folding@home as starting points for a set of simulation trajectories. As the simulations discover more conformations, the trajectories are restarted from them, and a Markov state model (MSM) is gradually created from this cyclic process. MSMs are discrete-time master equation models which describe a biomolecule's conformational and energy landscape as a set of distinct structures and the short transitions between them. The adaptive sampling Markov state model approach significantly increases the efficiency of simulation as it avoids computation inside the local energy minimum itself, and is amenable to distributed computing (including on GPUGRID) as it allows for the statistical aggregation of short, independent simulation trajectories. The amount of time it takes to construct a Markov state model is inversely proportional to the number of parallel simulations run, i.e. the number of processors available. In other words, it achieves linear parallelization, leading to an approximately four orders of magnitude reduction in overall serial calculation time. A completed MSM may contain tens of thousands of sample states from the protein's phase space (all the conformations a protein can take on) and the transitions between them. The model illustrates folding events and pathways (i.e. routes) and researchers can later use kinetic clustering to view a coarse-grained representation of the otherwise highly detailed model. They can use these MSMs to reveal how proteins misfold and to quantitatively compare simulations with experiments.

Between 2000 and 2010, the length of the proteins Folding@home has studied have increased by a factor of four, while its timescales for protein folding simulations have increased by six orders of magnitude. In 2002, Folding@home used Markov state models to complete approximately a million CPU days of simulations over the span of several months, and in 2011, MSMs parallelized another simulation that required an aggregate 10 million CPU hours of computation. In January 2010, Folding@home used MSMs to simulate the dynamics of the slow-folding 32-residue NTL9 protein out to 1.52 milliseconds, a timescale consistent with experimental folding rate predictions but a thousand times longer than previously achieved. The model consisted of many individual trajectories, each two orders of magnitude shorter, and provided an unprecedented level of detail into the protein's energy landscape. In 2010, Folding@home researcher Gregory Bowman was awarded the Thomas Kuhn Paradigm Shift Award from the American Chemical Society for the development of the open-source MSMBuilder software and for attaining quantitative agreement between theory and experiment. For his work, Pande was awarded the 2012 Michael and Kate Bárány Award for Young Investigators for "developing field-defining and field-changing computational methods to produce leading theoretical models for protein and RNA folding" as well as the 2006 Irving Sigal Young Investigator Award for his simulation results which "have stimulated a re-examination of the meaning of both ensemble and single-molecule measurements, making Dr. Pande’s efforts pioneering contributions to simulation methodology."

Read more about this topic:  Folding@home

Famous quotes containing the words project and/or significance:

    Although I mean it, and project the meaning
    As hard as I can into its brushed-metal surface,
    It cannot, in this deteriorating climate, pick up
    Where I leave off.
    John Ashbery (b. 1927)

    Of what significance the light of day, if it is not the reflection of an inward dawn?—to what purpose is the veil of night withdrawn, if the morning reveals nothing to the soul? It is merely garish and glaring.
    Henry David Thoreau (1817–1862)