Date: 2022-06-29

As an undergraduate at Brandeis, Berger had to take calculus a third time, and chose to take it pass/fail. After getting perfect scores all semester without studying, she remembers running to the professor in a panic after missing the final exam. Though she had already passed the class, she insisted on taking the final, imploring, “I need to prove to myself that I can do this!” She walked out with another perfect score, less than an hour into the three-hour test.

Berger declared majors first in Russian and then in psychology, but she discovered a love of coding in a sophomore class on statistical programming in Fortran and switched to the school’s new computer science major. In 1981, her junior year, she spent $800 of her own money on a 1,200-baud modem so that she could code through the night from her shared apartment, feeding her code into the mainframe. She credits her professors at Brandeis with encouraging her to consider graduate school. “Even though I got discouraged out of math,” Berger says, “I came back to it.”

Finding biology

At MIT, Berger studied computer science under the cryptography pioneer and future Turing Award winner Silvio Micali and informal mentor Peter Shor, a professor of applied mathematics and winner of this year’s Killian Award. After winning an award for her doctoral thesis, she began her postdoc at the Institute; one year in, she made the switch to biology—or, rather, began mining the field for interesting problems to work on. Daniel Kleitman, a professor of applied mathematics and Berger’s postdoc advisor, had heard Stanford biophysicist (and future Nobel laureate) Michael Levitt speak about protein folding. And just like the businessman in The Graduate who urged Dustin Hoffman’s character to pursue plastics, Kleitman “was so enthralled that he came back and said to me: ‘Proteins!’” Berger recalls. “‘That’s what you should do.’” She smiled at his movie reference, and decided she was game.

Proteins need to fold into three-dimensional shapes in order to become biologically active, and the exact shape matters, since the same protein folded in different ways can do different things. How a protein is folded affects such things as which binding sites are exposed and how it interacts with other molecules. The big debate at the time was whether or not proteins had intermediate states while folding. Berger worked on predicting the presence and functions of so-called coiled coils, strings of amino acids that are twisted together like two telephone cords within a protein; they play roles in gene expression and in stabilizing links between proteins. Having been introduced to cryptography in Shafi Goldwasser’s class on the subject and by her peers in Micali’s lab, Berger wondered whether a technique that uses the frequency of character pairs and triplets to break codes could be adapted to analyze protein sequences. Before getting a lab tech to help her, Berger spent hours upon hours comparing protein sequences by hand to look for patterns. She figured out how to predict, on the basis of those sequences, which proteins would form coiled coil structures. The resulting paper and its follow-ups have now been cited 2,000 times, and this work has allowed researchers to do things like predict, as biology professor Peter Kim has done, how an influenza virus binds to a cell membrane through a spring-loaded mechanism. Software programs Berger’s group has since developed have been used by thousands to identify coiled coils and their functions, she says.

Berger also worked with Jonathan King in the biology department and Shor to study viral capsids, the three-dimensional shells that protect viruses and help them get inside their host cells. She predicted that the formation of these structures, which are composed of repeating protein subunits, must follow local rules: “One protein is like a lock into a key that binds to the next protein, which changes its shape.” Her rules described how viral capsids could self-assemble from a given number of different proteins.

Comparing mice and men

Berger joined the MIT faculty in 1992 as an assistant professor of applied mathematics, with a joint appointment in the Laboratory for Computer Science (LCS). Soon after, she began working on the first comparison of the human and mouse genomes with her students Serafim Batzoglou ’96, MEng ’96, PhD ’00, and Lior Pachter, PhD ’99, along with future Broad Institute researcher Eric Lander. After Lander suggested looking at whole-genome comparisons, Berger, Pachter, and Batzoglou came up with an algorithm that made it possible to align the genomes of two different species. Then they worked with Lander to map the genes between human and mouse, starting by aligning large matching windows and then drilling down to smaller and smaller matches within them. In 2000, they published the first paper on comparative genomics, demonstrating that coding regions of the genome were on average 80% identical between the two species. This work, says Berger, launched the subfield of comparative genomics.

Berger and Batzoglou also helped set the stage for sequencing the human genome. Using the shotgun sequencing protocol, researchers had split up the genome into millions of random DNA fragments, which they then sequenced individually. Then they needed software tools that could look for overlaps in the sequences so they could reassemble them in the correct order. Berger and Batzoglou developed the prototype for one such tool, called the Arachne sequence assembler. After earning his PhD, Batzoglou worked for a year at the Whitehead Institute, where he and a colleague turned that prototype into a production-quality tool Lander’s lab used extensively as part of the large collaborative effort to assemble the first human genome. That work would help earn Batzoglou a spot on Technology Review’s list of Top 100 Young Technology Innovators in 2003, following in the footsteps of Berger, who made the inaugural TR100 list in 1999.
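The overlap-and-merge idea behind shotgun assembly can be illustrated with a toy sketch. This is not how Arachne works: the function names `overlap` and `greedy_assemble` and the `min_len` threshold are assumptions made for illustration, and the sketch ignores sequencing errors, repeated regions, and reverse-complement reads, all of which a production assembler has to handle.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments with the largest overlap."""
    # Drop duplicates and fragments fully contained in another fragment;
    # sorting keeps the result deterministic.
    frags = sorted(f for f in set(fragments)
                   if not any(f != g and f in g for g in fragments))
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            # No remaining overlaps: return whatever contigs are left.
            return frags
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]


# Three made-up overlapping reads from a made-up 15-base sequence.
reads = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

Greedy merging is the simplest possible strategy and can go wrong when a genome contains repeats, which is one reason real assemblers rely on additional information about the reads.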

Berger’s group extended this human-mouse comparison work to look at fruit flies and 18 species of yeast, the focus of Manolis Kellis ’99, MEng ’99, PhD ’03, who was then a student and is now a fellow MIT professor. That research led Berger and Rohit Singh, PhD ’12, to develop software called IsoRank, which aligns genome sequences from different species using a ranking algorithm similar to the PageRank algorithm used in search engines: two regions are likely to be a good match if their neighbors are alike, and their neighbors’ neighbors, and so on. This makes it feasible to integrate disparate types of data (for example, sequence alignments, protein-protein interactions, or genetic interactions) to find genes with common ancestry and function in different species. “Our idea,” says Berger, “was that taking a network view allows us to analyze different kinds of data through a common lens.”
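The neighbor-matching idea can be sketched on two small graphs. This is a simplified illustration in the spirit of that approach, not the group's software: the function name `neighbor_similarity`, the `alpha` weight, the iteration count, and the toy graphs are all assumptions. Each pair of nodes, one from each graph, receives support from the pairs formed by their neighbors, blended with any independent evidence supplied as `prior` (sequence similarity, for instance).

```python
def neighbor_similarity(g1, g2, prior=None, alpha=0.8, iters=50):
    """Score node pairs (i, j) across two graphs by propagating similarity.

    g1, g2: dicts mapping each node to a set of neighbors (undirected).
    prior:  optional dict {(i, j): score} of independent evidence;
            treated as uniform when omitted.
    """
    pairs = [(i, j) for i in g1 for j in g2]
    if prior is None:
        prior = {p: 1.0 for p in pairs}
    total = sum(prior.get(p, 0.0) for p in pairs)
    prior = {p: prior.get(p, 0.0) / total for p in pairs}
    score = dict(prior)
    for _ in range(iters):
        new = {}
        for i, j in pairs:
            # Each neighboring pair (u, v) splits its score evenly among
            # the pairs it supports, as a web page splits rank among links.
            support = sum(score[(u, v)] / (len(g1[u]) * len(g2[v]))
                          for u in g1[i] for v in g2[j])
            new[(i, j)] = alpha * support + (1 - alpha) * prior[(i, j)]
        norm = sum(new.values())
        score = {p: s / norm for p, s in new.items()}
    return score


# Two toy three-node chains: a-b-c and x-y-z.
g1 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
g2 = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
scores = neighbor_similarity(g1, g2)
print(max(scores, key=scores.get))  # ('b', 'y')
```

With no prior evidence, the two middle nodes score highest because each is supported by every pairing of their neighbors. Supplying a non-uniform `prior` is where other data types would enter the picture.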