It’s flu season and Paul Walker is leery about getting a flu shot. At 75 and with a history of heart disease, the Kannapolis retiree knows he should participate in the annual roll-up-your-sleeve ritual. But Walker remembers his first flu shot in the 1970s. Three days later, he came down with the flu and missed work for six days.
“I haven’t made up my mind this year,” he says. Walker understands the risks. He knows that flu and flu shots change from year to year. So, if he decides to get the vaccine, does he also know what type of flu he’s protected against?
“No. I don’t,” he says. “I know they are numbered: 6, 1, 12; but I don’t know what that means.”
Walker’s uncertainty has merit. Flu (short for influenza) vaccines are a gamble. A few months before flu season, scientists evaluate the data on flu-like illnesses around the country. They then guess which subtypes will prevail.
Of the two types of flu that frequently affect humans, A and B, the more dangerous is A. For the 2013 flu season, the Food and Drug Administration’s Vaccines and Related Biological Products Advisory Committee recommends protection against two A viruses, H3N2 and H1N1, and one B virus. Some 2013 vaccines will contain a second B virus component.
If the guesses prove accurate, the United States will have a mild flu season. If the guesstimates are wrong or incomplete, we could have a replay of 2002 and 2009, when novel and particularly nasty virus subtypes emerged.
During those epidemics, scientists did not have a vaccine ready until the outbreak was well underway. Unlike measles and chicken pox, flu does not yet have a universal vaccine. To defend against this massive health and economic disruption, we need accurate and rapid prediction.
The Flu Genome
Dan Janies (pronounced “Janus”) promotes a better way to understand and predict flu outbreaks. At 47, he is the Belk Distinguished Professor of Bioinformatics and Genomics at the University of North Carolina at Charlotte (UNC Charlotte). He neatly unites 21st-century biology with big data.
Janies’ university appointment adds validity to the observation made by Michael Levitt when he was recently named a recipient of the Nobel Prize in chemistry. “Biology is very complicated and computers are powerful tools,” said Levitt, a professor of structural biology at Stanford University. “The prize,” he said, “is a belated recognition of the importance of the computer in biology.” More commonly, the combination is referred to as bioinformatics.
Janies believes that the way we describe the flu is outdated. He calls it a “legacy nomenclature based on weak technology.”
That legacy goes back to 1971, when two flu molecules, hemagglutinin and neuraminidase, were first used to identify flu subtypes. Hemagglutinin (H) is responsible for attaching the virus to a host’s cell and allowing the virus to enter. The virus then hijacks the cell’s replication machinery to make new copies of itself.
Neuraminidase (N) allows the new viruses to break free of the infected cell and go on to infect more cells and hosts. H and N were the perfect targets for the flu vaccine. Both proteins are present on the surface of the flu virus, where our immune system has no trouble finding them.
Flu vaccines stimulate our immune system to develop antibodies to H and N. If we are exposed to a live flu virus, these antibodies attack the Hs and Ns on the virus’s surface and, if all goes as planned, we escape the flu.
The numbers 1 through 16 for H and 1 through 9 for N refer to the different types of hemagglutinin and neuraminidase. Each influenza virus is defined by only one type of H surface protein and one type of N. Within each subtype, there are also strains that arise from random mutations in the virus.
“I don’t think that way,” says Janies. For him, the entire flu genome, which contains six more genetic segments besides H and N, is the key to understanding and combating the disease.
“The genetic segments contain all the exquisite details—the building blocks of the virus. Knowing our enemy’s bricks and mortar will win the war against influenza,” he maintains.
Each living organism has its own unique genome. Virus, plant, bacteria, microbe, fungus and now human genomes are being studied and sequenced. Such work has led to a paradigm shift in understanding and treatment.
Cancer research is a good example. Some researchers have abandoned their focus on the organs where cancer arises—lung, pancreas, breast, skin—to instead focus on the cancer genome. They have found that different cancers share a number of genetic similarities. These scientists advocate treatment based on what genes are mutated, not the tissue involved.
For Janies, genome sequencing provides a much more accurate way of identifying a virus’s subtype. Sequencing was once an expensive, tedious, painstaking process, but its cost has dropped a hundred-thousandfold.
“The H1N1 virus that our great-grandparents experienced in 1918 is a completely different H1N1 from what emerged in 2009,” says Janies. “The H still reacts to the 1 and the N reacts to the 1 antibody, but all the rest of the genes in the genome—all the internal genes—are completely different. That’s what genomics gives you—a clearer picture of what’s there. It illustrates where the legacy nomenclatures are wanting.”
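Janies’ point, that two viruses can share a label yet differ everywhere else, can be illustrated with a small Python sketch. The eight segment names are the standard ones for influenza A; the FluGenome class, the lineage tags and the two example viruses are hypothetical placeholders invented for illustration, not real sequence data.

```python
from dataclasses import dataclass

# The eight genome segments of influenza A. HA and NA encode the H and N
# surface proteins; the other six are the "internal" genes.
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")

@dataclass
class FluGenome:
    """Hypothetical record: a subtype label plus one lineage tag per segment."""
    name: str
    h_type: int
    n_type: int
    lineage: dict  # segment name -> made-up lineage tag

    def subtype(self):
        # The legacy label looks only at the H and N types.
        return f"H{self.h_type}N{self.n_type}"

def differing_segments(a, b):
    """List the segments whose lineage tags differ between two genomes."""
    return [s for s in SEGMENTS if a.lineage[s] != b.lineage[s]]

# Placeholder tags only: same H and N tags, different internal genes.
old = FluGenome("virus_old", 1, 1, {s: "x" for s in SEGMENTS})
new = FluGenome("virus_new", 1, 1,
                {s: ("x" if s in ("HA", "NA") else "y") for s in SEGMENTS})

print(old.subtype() == new.subtype())   # same legacy label
print(differing_segments(old, new))     # the six internal segments
```

In this toy comparison the label cannot tell the two viruses apart, while a segment-by-segment comparison can.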
“In the case of Severe Acute Respiratory Syndrome (SARS) in 2003 it took two months from detection of a novel virus to the public release of the genome,” says Janies. “For H1N1 in 2009, it was two weeks.”
Enter the Supramap
It is this ability to identify a particular virus by its genome, combined with big data, that makes it possible to track that virus as it spreads. Enter the Supramap.
In 2007, Janies and his colleagues at Ohio State University, the American Museum of Natural History and the Ohio Supercomputer Center developed Supramap to track the spread and evolution of pandemic (H1N1) and avian influenza (H5N1).
At the time, Janies was an expert in computational genomics at the Wexner Medical Center at Ohio State.
“Supramap does more than put points on a map; it is tracking a pathogen’s evolution,” said Janies, the first author of the research paper on the tool. The Web-based tool combines information about the genetic sequences of pathogens with geographic information on Google Earth, allowing researchers to predict and track where infectious disease will strike and how it may mutate.
Using Supramap, they initially developed maps that illustrated the spread of drug-resistant influenza and host shifts in H1N1 and H5N1 influenza and in coronaviruses, such as SARS.
Describing the transition of the Web service to an open-source, freely available phylogenetic analysis program that other researchers can use, he said, “We package the tools in an easy-to-use Web-based application so that you don’t need a Ph.D. in evolutionary biology and computer science to understand the trajectory and transmission of a disease.
“The tool’s users can obtain a pathogen’s phylogenetic tree by submitting its genetic sequences to the system. Supramap then projects that information onto the globe, showing how diseases can mutate over space and time to infect new populations.”
In July 2012, Janies joined the faculty of the University of North Carolina at Charlotte as Belk Distinguished Professor of Bioinformatics and Genomics.
Janies, it seems, has always been the “inventive” sort. He received a B.S. in biology from the University of Michigan and a Ph.D. in zoology from the University of Florida. He worked as a postdoctoral fellow and a principal investigator at the American Museum of Natural History in New York City, where he led a team that, using off-the-shelf PC components, built one of the world’s largest computing clusters in 2001.
He was attracted to UNC Charlotte by the ability to work with outside businesses and to conduct joint research and innovation.
Tracking the Spread of Infectious Disease
Seeing is believing. On the Supramap, Janies and his colleagues have connected influenza data to the 3D visualization of Google Earth, giving influenza height, width, depth, movement and meaning. When you see it, you realize you are looking at an interactive “weather map of disease.”
“Supra, Latin for over, is a good descriptor of what the map delivers,” Janies says.
Avian flu (H5N1), for example, moves within bird populations and then jumps from birds to humans. For the H5N1 Supramap, Janies and his colleagues accumulated data on outbreaks among hosts such as ducks, chickens, wild birds and humans in China, Russia, the Middle East, Africa and Europe. Hundreds of thousands of cases were classified by strain, location and host.
Janies and his colleagues then used specialized software and Google Earth to project the latitude and longitude of similar flu strains onto the globe. If the movement of a pathogen is related to bird flyways, for example, and those routes are shifting because of something like climate change, the map can help predict where the disease might logically emerge next.
The Supramap allows any user to input raw genetic sequences of a pathogen’s strains and build an evolutionary tree based on mutations. The branches are projected onto the globe with pop-up windows to show how strains mutate over space and time and infect new hosts. That is, in essence, what Janies calls the “crystal ball.”
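The idea of turning raw sequences into a tree can be sketched in a few lines of Python. This is a toy under stated assumptions, not Supramap’s method: it groups already-aligned sequences by average-linkage clustering on the number of differing positions, a deliberately simple stand-in for real phylogenetic analysis. The strain names and sequences are invented placeholders.

```python
from itertools import combinations

def hamming(a, b):
    """Count positions where two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))

def build_tree(seqs):
    """Average-linkage clustering; returns the tree as nested tuples."""
    # Each cluster is a (subtree, member_names) pair.
    clusters = [(name, [name]) for name in sorted(seqs)]

    def dist(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(hamming(seqs[a], seqs[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters into a new internal node.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Invented placeholder sequences, not real influenza data.
toy = {
    "duck_A": "ACGTACGT",
    "chicken_B": "ACGTACGA",
    "human_C": "ACGAACTA",
    "goose_D": "TTGAACCA",
}
print(build_tree(toy))
```

Strains that differ by fewer mutations end up on neighboring branches, which is the property a map of the tree relies on.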
Disease is visualized as a “tree” whose “roots” are the common ancestor of a particular flu strain. When an ancestor gives rise to descendant strains, the tree grows higher. Intermediate ancestors and other descendants are given less altitude. Outbreaks are connected with lines reaching across the globe. Finally, date of outbreak is factored in, giving the tree a temporal dimension.
For H5N1, the “tree” grew and shrank from 1999 to 2006, as it moved over the landscape infecting new hosts.
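One way to picture this layout is the following Python sketch, which is an illustration under assumptions rather than Supramap’s actual algorithm. It assumes sampled strains sit on the ground at their outbreak locations, each ancestor is placed at the average position of its descendants and one step higher, and the branches are written as KML line segments that Google Earth can display. The tree, the coordinates and the step_m parameter are made up, the averaging ignores the date line, and outbreak dates are left out.

```python
def place(tree, locations, step_m, segments):
    """Return (lat, lon, altitude_m) for a node, recording branch lines."""
    if isinstance(tree, str):              # leaf: a sampled strain
        lat, lon = locations[tree]
        return (lat, lon, 0.0)
    children = [place(c, locations, step_m, segments) for c in tree]
    lat = sum(c[0] for c in children) / len(children)
    lon = sum(c[1] for c in children) / len(children)
    alt = max(c[2] for c in children) + step_m   # ancestor sits above
    node = (lat, lon, alt)
    segments.extend((node, c) for c in children)
    return node

def to_kml(segments):
    """Write each branch as a KML LineString (coordinates are lon,lat,alt)."""
    marks = []
    for a, b in segments:
        coords = " ".join(f"{lon},{lat},{alt}" for lat, lon, alt in (a, b))
        marks.append(
            "<Placemark><LineString>"
            "<altitudeMode>relativeToGround</altitudeMode>"
            f"<coordinates>{coords}</coordinates>"
            "</LineString></Placemark>")
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>\n'
            + "\n".join(marks) + "\n</Document></kml>")

# Made-up tree and (latitude, longitude) pairs, for illustration only.
tree = ("goose_D", ("human_C", ("chicken_B", "duck_A")))
locations = {"duck_A": (30.0, 110.0), "chicken_B": (20.0, 100.0),
             "human_C": (10.0, 100.0), "goose_D": (50.0, 60.0)}

segments = []
root = place(tree, locations, 100_000.0, segments)
print(root)
print(to_kml(segments))
```

Saving the printed KML to a file and opening it in Google Earth would draw the toy tree arching over the map, with the common ancestor at the top.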
“The idea of this evolutionary tree of the virus,” says Janies, “is to help predict where the next outbreak of the virus is likely to occur. The map gives us a whole new way of seeing the virus in action and understanding what it is—and isn’t—doing. In the meantime, we are working on mapping other diseases, such as MERS and H7N9.”
The role birds play in the origin and spread of flu is a fairly recent discovery, going back only to the 1990s, says Janies. “Influenza has many, many strains and most live in birds. Often those strains get into mammals and humans.”
One of the central questions in influenza research was which birds were the chief culprits in the spread of avian flu. The usual suspects were migrating wild birds and chickens and ducks sold and then shipped to distant locations.
With Supramap, Janies has found that domestic fowl were to blame in Indonesia, but in other regions both wild and domestic birds were responsible. In one interesting case, a smuggled eagle carrying the flu virus was caught at customs after being transported thousands of miles from Bangkok to Brussels.
Asia heads the list of places where most flus originate, but Janies is quick to point out that H1N1 began in Mexico, California and Texas.
“We are not doing a good job of observing flu around the world,” he adds. The World Health Organization operates the influenza surveillance system in partnership with national governments and places its limited number of observers in major cities. Flu in rural areas often goes undetected.
“Until a few years ago,” says Janies, “we just knew about influenza in Johannesburg and Cairo. We didn’t know anything about the rest of the continent of Africa. Without comprehensive influenza surveillance data, and the means to put it in context to inform inoculation programs, influenza prevention will struggle.”
When a pandemic breaks out, disease circles the globe, often leaping from species to species. Supramap doesn’t just track the spread of viruses; it tracks how the viruses are mutating as they jump into new hosts and encounter new medicines. Using Supramap, scientists might be able to stay ahead of the virus mutation curve and figure out when to switch medicines as the microbes adapt and develop resistance.