Part II: Logistics of GenomeTrakr
Last month in Food Genomics we asked FDA scientists Drs. Marc Allard and Eric Brown to help the readers of Food Safety Tech understand the process used by GenomeTrakr. In part two we cover some logistical and more general questions.
Greg Siragusa/Douglas Marshall: Why should a food producer or processor submit its own pathogen isolates to GenomeTrakr? Are there any legal liabilities incurred by doing so?
Eric Brown/Marc Allard: The database is available publicly for any outside laboratory to be able to rapidly compare their new WGS data to all of the data in the database. The data is all publicly available so food industry members should carefully consider the strengths and weaknesses of sharing data. The main reason for sharing data is that if any matches arise then this would be immediately known for an investigation and corrective action. With knowledge, companies can better understand their risk and exposure to occasional contamination events.
Siragusa/Marshall: Are there private third-party providers who will perform the same method of sequence analysis for private companies that GenomeTrakr uses in the FDA?
Brown/Allard: Yes, as all of the FDA methods of data collection and analysis are fully transparent and publicly available, any expert third-party provider could easily set up and reproduce the GenomeTrakr methods. Third-party support may be an excellent mechanism for food industry partners that wish to examine the pathogens they have found connected to their products but do not wish to maintain an active WGS laboratory. An internet and reference search will uncover these private third-party providers, as this is a growing market with a diversity of services provided. The FDA works closely with the Institute for Food Safety and Health (IFSH) to share information that may be valuable to their industry partners.
Siragusa/Marshall: Will the FDA perform analysis of isolates for private parties and the sequence not made publicly available?
Brown/Allard: No. While we will sequence relevant strains from many different sources, as a matter of protocol we will submit all of these data to the GenomeTrakr database. That is, currently, the FDA sequences and uploads all available genomic strain data. All data are made publicly available through the GenomeTrakr and NCBI Pathogen Detection website. The metadata describing each isolate includes only species, date, state location and a general food description, which could include the type of food (e.g., an egg) and/or the type of sample (e.g., environmental swab, surface water, sediment, etc.) as well as production date, pH, fat content and water activity. No trade or industry brand names are made publicly available, and the location is reported only to the state level to allow for anonymity of specific farm names or processing centers. An example of metadata in the GenomeTrakr database might be Salmonella from spinach in Washington State in 2015.
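As a rough illustration of how little the public record described above contains, the following Python sketch models one isolate's metadata. The class name, field names and summary format are hypothetical choices made for this example; they are not the actual GenomeTrakr or NCBI schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IsolateMetadata:
    """Hypothetical public metadata record; names are illustrative only."""
    species: str
    year: int
    state: str                          # location is no finer than state level
    food_description: str               # general description, e.g. "spinach"
    sample_type: Optional[str] = None   # e.g. "environmental swab"

    def summary(self) -> str:
        """Join the public fields into a one-line description."""
        parts = [self.species, self.state, self.food_description, str(self.year)]
        if self.sample_type:
            parts.append(self.sample_type)
        return ", ".join(parts)


# The example given in the interview: Salmonella, Washington State, spinach, 2015.
example = IsolateMetadata(
    species="Salmonella",
    year=2015,
    state="Washington",
    food_description="spinach",
)
print(example.summary())
```

Note that the record has no field for a brand, company, farm or facility name, which mirrors the anonymity described in the answer.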
Siragusa/Marshall: Is the CDC tied into GenomeTrakr and if so, how?
Brown/Allard: CDC labels their clinical WGS data as PulseNet with the data uploaded to the NCBI Pathogen Detection website. USDA FSIS also uploads the isolates that they have collected and sequenced from foods that they regulate. All of this WGS data is housed in a centralized repository at the NCBI Pathogen Detection website, where NCBI conducts rapid analysis for QA/QC. The NCBI posts a daily tree for all species that recently have been uploaded. This way all of the data collected by these federal laboratories and their state and international partners are made publicly available for direct comparison. Numerous other international and academic laboratories also provide data to the NCBI centralized database. When isolates cluster together and appear to be closely related, the FDA works with CDC and USDA FSIS through the normal channels. The great benefit of combining food, environmental and clinical isolate genomes in a common database cannot be overstated.
Siragusa/Marshall: In the event of an outbreak, is it possible to obtain WGS data from a shotgun metagenome (a microbial and organismic profile obtained by sequencing all of the DNA in a sample, not just bacterial analysis) of an enrichment, thereby bypassing isolation? (Refer to glossary; see Table 1.)
Brown/Allard: Yes, preliminary research has documented the potential to obtain WGS data from cultural enrichments, skipping full pure culture isolation and potentially saving two to five days depending on the pathogen. Having well characterized draft genomes such as those in the GenomeTrakr database will support rapid characterization from metagenomes after cultural enrichment. A future goal for the FDA is to transform and expand GenomeTrakr into metaGenomeTrakr to support either pure culture or enriched shotgun metagenomic samples.
Siragusa/Marshall: Is there any way that associated metadata tied to a strain (and hence its sequence) can be unmasked through legal action?
Brown/Allard: FDA protects confidential metadata collected during inspection just as it has always done with PFGE data. WGS data is protected at the same level as other types of subtyping information.
Siragusa/Marshall: Is the GenomeTrakr database associated with the GMI (Global Microbial Identifier)?
Brown/Allard: The GMI is a consortium of like-minded public health scientists who wish to collaborate to create a harmonized global system of DNA genome databases that is publicly available to promote a one-health approach. The GenomeTrakr is one of the databases that make up this larger effort that includes some data from members of the GMI.
Siragusa/Marshall: This column is meant to keep food safety professionals abreast of the latest knowledge, technology and uses of genomics for food safety and quality. Tell us your vision of how or which changes in technology (sequencing chemistry, bioinformatics, etc.) will be coming down the pike and how it might impact GenomeTrakr?
Brown/Allard: WGS and sequencing technology has improved constantly over the last 20 years, and there is no sign of this slowing down. Improvements continue to accrue in chemistry, equipment and software analysis. Likely future improvements will include more turnkey solutions for WGS from sample to report. This includes both DNA extraction and library preparation for sequencing, as well as data analysis pipelines (the system of analyzing the actual sequence data) that provide rapid, accurate and plain-language results. Smaller mobile WGS devices are starting to show feasibility; these would bring the lab to the samples and decrease the time to an answer (see https://nanoporetech.com/products/minion). Metagenomics approaches appear to be maturing, with technology improvements moving them out of a research phase and into direct applications. Currently, MiSeq (a commonly used workhorse nucleic acid sequencer made by Illumina) outputs read lengths on the order of 300 base pairs of nucleotides (i.e., A’s, T’s, C’s and G’s). Long-read sequencing technologies, with reads upwards of 1,500 base pairs, may make analysis much easier so that more assembled and finished genomes are available in the databases. Cloud-based data analysis pipelines may provide simple solutions, giving wider access to rapid, validated data analysis and results. FDA researchers are working on all of these aspects of improvements in WGS technology as well as expanding the network to more global partners.
Siragusa/Marshall: Sequences deposited into GenBank (as part of GenomeTrakr) are accessible to anyone anywhere. Does this essentially usher in a whole new chapter in food microbiology especially at the pre-harvest level?
Brown/Allard: Yes, having well characterized reference genomes provided by GenomeTrakr partners will support microbial ecology and metagenomics studies. Metagenomics or microbiomes describing which species are present and what they may be doing in the ecology is providing new knowledge in all aspects of the farm to fork continuum. As the costs for these services decrease, we are seeing an increase in use to answer questions that have been impossible or extremely difficult in the past.
Siragusa/Marshall: GenomeTrakr is not a project per se; rather it is a program. How is it funded and will it continue on stable fiscal footing for the foreseeable future?
Brown/Allard: GenomeTrakr started as a research project in the Office of Regulatory Science in CFSAN, but much of this data collection is no longer research. Today, and for some time in the future, WGS at the FDA is collected as fully validated regulatory data to support outbreak and compliance investigations. As such, the FDA is in the process of moving WGS into a phase of more stable regulatory support. Research and development for future applications and technology exploration will always be a part of the FDA portfolio, although typically at lower funding levels than the regulatory offices. Public health funding is generally protected as everyone wants safe food.
Siragusa/Marshall: Are there any restrictions of isolate source? For instance, can isolates from poultry flocks or even wild birds be deposited?
Brown/Allard: The GenomeTrakr and NCBI pathogen detection databases are open to the public and thus there are no restrictions as long as the minimal metadata and QA and QC metrics are met. Current GenomeTrakr WGS foodborne pathogen data includes samples from both poultry and wild birds, as well as turtles, snakes and frogs. Members interested in what is in the database can go to the NCBI Pathogen Detection website and filter on simple words like avian, bird, gull, chicken, wheat, avocado, etc. An example is as follows for a snake.
Siragusa/Marshall: If a company deposits an isolate, will it have access to the GenomeTrakr derived sequence exclusively or at least initially for some period before that information becomes public?
Brown/Allard: No, currently the FDA does not hold WGS data. All data collected by the FDA is uploaded and released publicly at the GenomeTrakr bioprojects and at the NCBI Pathogen Detection website with no delays. If companies wish to hold data then they need to look to third-party solutions for their needs. GenomeTrakr has been so successful because the information is released in real time and is globally available.
Siragusa/Marshall: Will the extensive data obtained on Salmonella be fuel to finally develop a sequence based serotyping surrogate for the genus?
Brown/Allard: Yes, the FDA already collaborates with University of Georgia investigators using a program called SeqSero that rapidly identifies serotype from the draft WGS. Other software tools are also being built to go from sequence to serotype.
Siragusa/Marshall: Currently there are a few foodborne pathogens that comprise the GenomeTrakr program’s database. How easy would it be to expand the scope of microorganisms under the GenomeTrakr umbrella? For instance, would a private company be able to use the GenomeTrakr resources to build their own private database? Will the realm ever be expanded to mycotoxigenic fungi or foodborne viruses?
Brown/Allard: Yes, it is relatively easy to take any of the data that is publicly available in GenomeTrakr and download the data to build a private database to add value or to provide additional tools to a private user group. This is already happening with GenomeTrakr data. In addition, the NCBI mechanisms are not species specific, and so any private group can build a bioproject of draft WGS for any species that they wish. To use the NCBI bioproject tools the data would have to be publicly released after a year. Similarly, the NCBI Pathogen Detection website, which does QA/QC and builds a new phylogenetic tree every time there is new data, is currently open to any human pathogen. The main constraint to building a new species database is that the people who want to build such a database would need to speak to NCBI representatives to convince them that enough data was going to be uploaded and released publicly and that there would be enough use to justify NCBI’s efforts to validate and test these new pipelines. Industry should also ask about non-pathogens such as spoilage organisms, although this may be out of the scope of NCBI. The FDA plans to expand to foodborne viruses such as hepatitis A, although fungi may have lower priority. The FDA is willing to work with industry to identify and populate mutually beneficial databases that the industry will use to improve food safety. For example, the FDA has suggested that there may be value in typing foodborne pathogens that have shown known resistance to cleaners and sterilizers, with the goal of understanding how these pathogens are avoiding the preventive controls that are put in place to keep them out of the food supply. Better characterization of the genes responsible for resistance may lead to rapid PCR tools to understand and screen for the ability of pathogens to persist in the environment.
Siragusa/Marshall: We would like to end with the following question: Why should food producers and processors embrace GenomeTrakr and become a part of it (i.e., why should a company send in its pathogen isolates)?
Brown/Allard: The FDA understands that most food producers will not likely send their strains to the FDA or make WGS data they have generated publicly available due to a concern for legal liability risks. This then highlights the importance of third-party members providing assistance to the food industry to utilize the power of WGS data for understanding their own processing facilities and supply chains. The FDA is committed to support industry adoption of these WGS methods to improve food safety, which can be accomplished by data sharing, methods validation and data interpretation and education.
Siragusa/Marshall: Thank you Drs. Allard and Brown. That was most informative and we appreciate your sharing this knowledge.
Readers, we hope this interview will be useful to your work and knowledge base. Obtaining whole genome sequences of organisms is a routine service offered by many labs worldwide for reasonable costs. At the current time, there are no standardized methods, but we are informed that an official method of analysis (OMA) may be formulated at some time in the future, similar to our other cultural tools, making WGS a tool available on a routine rather than special project basis with more comparable results. For more information on the GenomeTrakr Network of labs, visit the GenomeTrakr Network website.
As always, please contact either Greg Siragusa or Doug Marshall with comments, questions or ideas for future Food Genomics columns.
About the Interviewees
Marc W. Allard, Ph.D.
Marc Allard, Ph.D. is a senior biomedical research services officer specializing in both phylogenetic analysis and the biochemical laboratory methods that generate the genetic information in the GenomeTrakr database, which is part of the NCBI Pathogen Detection website. Allard joined the Division of Microbiology in FDA’s Office of Regulatory Science in 2008, where he uses whole genome sequencing of foodborne pathogens to identify and characterize outbreaks of bacterial strains, particularly Salmonella, E. coli, and Listeria. He obtained a B.A. from the University of Vermont, an M.S. from Texas A&M University and his Ph.D. in biology from Harvard University. Allard was the Louis Weintraub Associate Professor of Biology at George Washington University for 14 years from 1994 to 2008. He is a Fellow of the American Academy of Microbiology.
Eric W. Brown, Ph.D.
Eric W. Brown, Ph.D. currently serves as director of the Division of Microbiology in the Office of Regulatory Science. He oversees a group of 50 researchers and support scientists engaged in a multi-parameter research program to develop and apply microbiological and molecular genetic strategies for detecting, identifying, and differentiating bacterial foodborne pathogens such as Salmonella and Shiga toxin-producing E. coli. Brown received his Ph.D. in microbial genetics from The Genetics Program in the Department of Biological Sciences at The George Washington University. He has conducted research in microbial evolution and microbial ecology as a research fellow in the National Cancer Institute and the U.S. Department of Agriculture, and as a tenure-track Professor of Microbiology at Loyola University of Chicago. Brown came to the Food and Drug Administration in 1999 and has since carried out numerous experiments relating to the detection, identification, and discrimination of foodborne pathogens.