Salesforce’s ProGen trained on 280 million amino acid sequences to learn to generate proteins
This week, a team of scientists at Salesforce published a study detailing an AI system, ProGen, that they say is capable of generating proteins in a “controllable fashion,” which could unlock new approaches to protein engineering. If their claims pan out, the work could lay the groundwork for meaningful advances in synthetic biology and materials science, a highly desirable outcome in the midst of the devastating coronavirus outbreak.
As Salesforce research scientist Ali Madani explained in a blog post, proteins are simply chains of molecules (amino acids) bonded together. There are around 20 standard amino acids, which interact with one another and locally form shapes that constitute the secondary structure. Those shapes continue to fold into a full three-dimensional shape called the tertiary structure. From there, proteins interact with other proteins or molecules and carry out a wide variety of functions, from ferrying oxygen to cells around the body to regulating blood glucose levels.
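To make the "chain of amino acids" idea concrete, here is a minimal Python sketch that treats a protein as a string over the conventional one-letter codes for the 20 standard amino acids. It is only an illustration of the sequence view of a protein, not ProGen's code or data format; the helper names and the example peptide are hypothetical.

```python
# The 20 standard amino acids, written with their conventional one-letter codes.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def is_valid_protein(sequence: str) -> bool:
    """Return True if the sequence is non-empty and uses only standard residues."""
    return bool(sequence) and set(sequence.upper()) <= STANDARD_AMINO_ACIDS


def composition(sequence: str) -> dict[str, int]:
    """Count how many times each residue appears in the chain."""
    counts: dict[str, int] = {}
    for residue in sequence.upper():
        counts[residue] = counts.get(residue, 0) + 1
    return counts


# Hypothetical 10-residue peptide, made up for illustration only.
example_peptide = "MKTAYIAKQR"

print(is_valid_protein(example_peptide))
print(composition(example_peptide))
```

Seen this way, a protein reads much like a sentence written in a 20-letter alphabet, which is the kind of sequence data the article says ProGen was trained on.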