And that's based on GROMACS and several other software packages. It still doesn't address two issues: where does the data come from, and will the outcome of mining it be fair and trustworthy?
Input data could come from the Protein Data Bank (PDB), which catalogs thousands of protein structures. The issue is deciding when a protein counts as folded. Do we have some known end structure, so that we just need to get within an acceptable Root Mean Square Deviation (RMSD) of it? The other issue is that there would need to be some sort of centralized server to collect the resulting data and build the Markov State Model.
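To make the "within an acceptable RMSD" idea concrete, here is a minimal sketch of what such a check might look like. It assumes the candidate and the known end structure list the same atoms in the same order and have already been superimposed (a real pipeline would align them first). The function names and the 2.0 Å cutoff are made up for illustration and don't come from GROMACS or any other package.

```python
import math


def rmsd(coords_a, coords_b):
    """Root Mean Square Deviation between two equal-length lists of (x, y, z) points.

    Assumes both structures are already superimposed and atom-matched.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("structures must be non-empty and have the same number of atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the matching atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def is_folded(candidate, reference, cutoff=2.0):
    """Hypothetical test: call it folded if the RMSD to the reference is within the cutoff."""
    return rmsd(candidate, reference) <= cutoff


# Toy coordinates (in angstroms): the candidate is the reference shifted by 1 along z
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
candidate = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

print(rmsd(candidate, reference))
print(is_folded(candidate, reference))
```

What cutoff is "acceptable" is exactly the open question here, so it is left as a parameter rather than baked in.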