Marine dissolved organic matter (DOM), estimated at 662 × 10^12 kg carbon, is the largest inventory of reduced carbon in the ocean and a marketplace for metabolic exchange (1). It is composed of more than 100,000 different molecules (2), of which over 95% are stable (recalcitrant) for thousands of years (3). Photosynthetic organisms are the main source of DOM in the sunlit ocean, particularly of bioavailable (labile) molecules that are rapidly turned over by the microbial community (4–6). Algal blooms, important ecological phenomena of increased phytoplankton biomass, substantially contribute to the release of labile molecules (7, 8). However, it is largely unknown how algal bloom succession shapes DOM composition and how different biotic interactions affect the DOM pool. A major driver of bloom demise is lytic viral infection, which leads to the release of algal biomass to the DOM pool, a process termed the “viral shunt” (9, 10). This key ecosystem process links primary production with the DOM pool and curtails the energy transfer to higher trophic levels, hence fueling the microbial food web and affecting the biogeochemical cycling of major nutrients in the ocean. Laboratory-based experimental approaches suggest that this virus-induced DOM (vDOM) has a characteristic composition (11). Nevertheless, the chemical nature of vDOM, driven by the viral shunt during algal bloom demise, has not yet been explored in the natural environment. Moreover, the characterization of individual metabolites that are unique to the marine vDOM is still lacking.
An ecologically important host-virus model system is the cosmopolitan alga Emiliania huxleyi and its specific large double-stranded DNA virus, E. huxleyi virus (EhV). E. huxleyi frequently forms vast blooms in the ocean, which are routinely terminated by lytic viral infection (12–14). These blooms are an important biomass source for the marine food web and affect the global biogeochemical cycling of carbon and sulfur (15–19). Viral infection leads to profound remodeling of the E. huxleyi metabolic pathways to support infection, including enhanced glycolytic fluxes, elevated fatty acid synthesis, and production of specific virus-derived glycosphingolipids (20–23). Once infected cells lyse, they act as a source for the DOM pool, releasing a unique bouquet of metabolites to the ocean. In this study, we sought to examine the impact of phytoplankton bloom succession on the composition of marine DOM and, in particular, to decode the metabolic footprint of the virus-induced bloom demise of E. huxleyi.
Phytoplankton bloom succession shapes the marine DOM
To map the exometabolic landscape during bloom succession of E. huxleyi, we applied an untargeted exometabolomics approach using an in situ mesocosm setup in the coastal waters of southern Norway (Fig. 1), where blooms and viral infection of E. huxleyi naturally occur (12). Natural marine microbial communities were enclosed in four mesocosm bags and monitored daily over 24 days. All bags were supplemented with nutrients at a nitrogen-to-phosphorus ratio of 16:1 to favor the growth and induce a bloom of E. huxleyi (24).
The phytoplankton community in each bag responded with an increase in biomass compared to the surrounding fjord, as indicated by elevated chlorophyll levels (Fig. 2A). A first bloom of a mixed algal community peaked at day 10, consisting mainly of pico- and nanoeukaryotes (fig. S1). This was followed by a bloom of E. huxleyi, as indicated by an increase in calcified cells, reaching up to 8 × 10^7 cells/liter (Fig. 2B). On the basis of E. huxleyi cell abundance, we defined three phases: the mixed algal community phase with a low abundance of calcified cells (<1 × 10^6 cells/liter; days 0 to 9), the bloom phase with an increase in calcified cells (days 10 to 17), and the demise phase, showing a sharp decrease in calcified cells (days 18 to 23). Phytoplankton bloom succession occurred similarly in all four bags apart from the demise phase of E. huxleyi, during which the bags diverged from one another (Fig. 2, A and B).
To characterize the DOM composition and monitor changes that are derived from biotic interactions during bloom succession, we applied an untargeted exometabolomics approach, in which we profiled small, semi-polar, extracellular metabolites on a daily basis (Fig. 1C). Seawater filtrates were extracted using solid-phase extraction (SPE) cartridges, and their metabolite composition was analyzed by liquid chromatography–high-resolution mass spectrometry (LC-HRMS). Principal components analysis (PCA) of the extracted exometabolites (6786 mass features) revealed a clear separation of fjord water samples from bloom-associated mesocosm samples throughout the experiment (Fig. 2C). The exometabolomes of the mesocosm bags were separated along the first PC axis (34.4%), which likely represents phytoplankton community succession over time. Furthermore, changes in metabolite composition between days were more pronounced than the variability between bags [repeated-measures analysis of variance (ANOVA), P < 0.0001]. The exometabolomes changed more substantially during the mixed algal community phase and the demise phase of E. huxleyi compared to a relatively uniform exometabolic landscape during the E. huxleyi bloom phase (fig. S2).
We further investigated the DOM composition at the metabolite level to reveal the effect of bloom succession on the production and release of exometabolites as opposed to their consumption or degradation. About 26% (n = 1747) of all detected mass features were elevated during succession of the induced phytoplankton community compared to the fjord. Hierarchical cluster analysis (Fig. 3A) revealed two distinct occurrence patterns: mass features that were abundant during the mixed algal community phase (n = 518) and mass features that increased during the E. huxleyi bloom and demise phases (n = 1229). Closer inspection of six selected clusters following grouping to putative metabolites (data S1) revealed diverse dynamics and fluctuation patterns in the bloom exometabolome (Fig. 3B). Some metabolites increased rapidly within a few days (clusters I and VI), whereas others increased gradually over more than 2 weeks (cluster V), indicating several biogenic sources throughout bloom succession. Metabolite consumption and transformation either occurred within a few days (cluster I), steadily over several days (cluster II) or were not observed until the end of the experiment (cluster III), thus representing exometabolites with different turnover times. Some metabolites peaked twice, both during demise of the mixed algal community and of E. huxleyi (cluster IV), suggesting a general association with phytoplankton bloom demise. The selected clusters were further inspected for differences in chemical composition by comparing their mass and retention time distributions (fig. S3). No significant difference in their metabolite mass distribution was found (ANOVA, P = 0.1). However, metabolite polarity, as indicated by retention time, showed significant differences between clusters (ANOVA, P < 0.0001), ranging from a wide distribution (cluster V) to a more condensed distribution (cluster IV).
A major driver of the exometabolite composition in the mesocosm bags was the phytoplankton bloom succession as shown by chlorophyll levels (Pearson correlation, r = 0.62), compared to abiotic factors such as temperature (Pearson correlation, r = 0.03). Accordingly, the fjord water exometabolome showed only minor changes throughout the experiment (Fig. 2C). Phytoplankton bloom succession was shown to affect recalcitrant DOM (25). Here, facilitated by the high temporal resolution of our metabolite profiling, we show the important role of bloom succession in shaping the biologically active, labile DOM. Mapping the exometabolic landscape further indicated that the bloom phase of E. huxleyi leads to a more uniform composition as compared to the pre- and postbloom periods. The extent to which blooms orchestrate DOM composition may thus depend on the characteristics of the bloom, such as species dominance. Hence, some of the major oceanic phytoplankton blooms may be characterized by distinct metabolite footprints, which could drive the growth of specialized associated microbial communities (26). In comparison, both the mixed algal community phase and the E. huxleyi demise phase showed strong changes in the exometabolic landscape. This might be explained by a fast succession of primary producers competing for the newly available inorganic nutrients during the mixed algal community phase and by a rapid succession of microbial consumers competing for the released organic matter during the E. huxleyi demise phase. By applying an untargeted exometabolomics approach daily over several weeks, we revealed the highly dynamic changes in the exometabolic landscape that occur throughout phytoplankton bloom succession. Most of the metabolites that are released during algal blooms are rapidly transformed or consumed by microbes within hours or days, hampering their discovery (3). 
Here, we were able to shed light on this biologically active, labile DOM fraction, thus expanding our ability to track changes in marine DOM composition. Although recent developments in analytical chemistry notably improved the ability to describe DOM composition, major challenges remain to decode the richness of seawater chemodiversity and to unambiguously identify its components. Another major bottleneck is to link changes in DOM composition to specific microbial interactions that occur during microbial community succession. Of specific interest are host-virus interactions, which can lead to profound remodeling of host metabolism followed by host cell lysis, thereby having major biogeochemical consequences (27, 28). However, the consequences of virus-induced bloom demise for DOM composition are poorly understood.
The metabolic footprint of alga-virus interaction
We aimed to discover exometabolites that are specific to virus-induced E. huxleyi bloom demise. Infection dynamics of E. huxleyi by EhV varied among the bags (Fig. 4A), providing a comparative tool to identify metabolites that are enriched during viral infection. The strongest increase in extracellular EhV occurred in bag 4, followed by bag 2 and bag 1, whereas in bag 3, no viral proliferation was observed. This was also reflected in the demise of E. huxleyi, which varied between the bags, declining to the lowest cell abundance in bag 4 (Fig. 4A). To resolve the metabolic composition of the vDOM, we correlated the temporal profile of each mass feature with a gene marker of EhV [major capsid protein (mcp)]. In total, 862 mass features were positively correlated with extracellular mcp abundance in the most infected bag 4 during the bloom and the demise phase (Pearson correlation, r > 0.6). Hierarchical cluster analysis (fig. S4) and inspection of the intensity profiles from all bags revealed a subset of 65 mass features that were differential between the bags and corresponded to the different levels of viral infection. Following feature deconvolution and manual curation, these mass features were grouped into 20 putative metabolites (Fig. 4B). MS-based structural characterization revealed that 17 of the 20 newly identified vDOM metabolites are halogenated, having different combinations of chlorine, bromine, and iodine (Table 1).
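The correlation screen described above can be sketched as follows. This is a minimal illustration in Python, not the authors' code (the analyses in this study were run in R); the function names and the toy profiles are hypothetical, and only the threshold of r > 0.6 is taken from the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant profile
    return cov / math.sqrt(var_x * var_y)

def features_correlated_with_marker(feature_profiles, marker_profile, r_min=0.6):
    """Keep mass features whose temporal profile correlates with the marker.

    feature_profiles: dict mapping a feature id to its intensities per day.
    marker_profile:   marker abundance (e.g., mcp copies) on the same days.
    """
    selected = {}
    for feature_id, profile in feature_profiles.items():
        r = pearson_r(profile, marker_profile)
        if r > r_min:  # NaN compares as False, so constant features drop out
            selected[feature_id] = r
    return selected

# Toy data (hypothetical): five sampling days in one bag.
mcp = [0.0, 1.0, 4.0, 9.0, 16.0]
features = {
    "feature_a": [1.0, 2.0, 5.0, 10.0, 17.0],  # tracks the viral marker
    "feature_b": [5.0, 4.0, 3.0, 2.0, 1.0],    # declines over time
    "feature_c": [2.0, 2.0, 2.0, 2.0, 2.0],    # constant background
}
selected = features_correlated_with_marker(features, mcp)
```

In this toy case only `feature_a` passes the threshold; in the study, the correlation screen was followed by hierarchical clustering, comparison across bags, and manual curation.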
Intriguingly, nine vDOM metabolites (#3 to 8, 10, 14, and 16) contained both two to three chlorine atoms and iodine based on their isotope patterns, characteristic neutral losses, and fragments of halogen atoms (Fig. 5, A and B, and data S2). The predicted molecular formulas of these chloro-iodo metabolites (Table 1) were not found in common mass spectral and natural product libraries. These metabolites are oxygen-rich (O/C ratio of ~0.5), lack nitrogen, and show a low degree of unsaturation (double bond equivalent of ≤4). This excludes peptides, alkaloids, and aromatic metabolites and may hint toward compounds such as modified terpenes, lipids, and polyketides. While E. huxleyi can release halomethanes (29), other marine algae produce halogenated terpenes, oxylipins, and polyketides (30, 31). Automated prediction of compound classes based on the fragmentation spectra of the chloro-iodo metabolites further indicated “carbohydrates and conjugates,” “carboxylic acid derivatives,” and “carbonyl compounds” as putative compound classes.
We further investigated general differences in the occurrence of iodine-containing metabolites by screening for the iodide fragment [mass/charge ratio (m/z) 126.90] following collision-induced dissociation in MS^E mode. This revealed a general increase in the number and intensity of iodine-containing metabolites through time in the mesocosm bags as a function of viral infection (Fig. 5C and fig. S5). Moreover, it revealed additional iodine-containing metabolites in bag 4, three of which also contained chlorine (metabolites #21 to 23; Fig. 5C and data S2). Together, these data indicate a general metabolic shift toward iodination during virus-induced bloom demise.
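A screen for the iodide fragment can be illustrated with a short Python sketch. It assumes that each high-energy scan is available as a list of (m/z, intensity) pairs; the mass tolerance, the intensity cutoff, and the function name are hypothetical choices made for illustration and are not taken from the study.

```python
# Monoisotopic m/z of the iodide anion (I-), reported as 126.90 in the text.
IODIDE_MZ = 126.9050

def has_iodide_fragment(peaks, tol_ppm=20.0, min_intensity=0.0):
    """Return True if any fragment peak matches iodide within tol_ppm.

    peaks: iterable of (mz, intensity) pairs from a high-energy scan.
    tol_ppm and min_intensity are illustrative defaults, not study values.
    """
    tolerance = IODIDE_MZ * tol_ppm * 1e-6
    return any(
        abs(mz - IODIDE_MZ) <= tolerance and intensity > min_intensity
        for mz, intensity in peaks
    )

# Toy spectra (hypothetical values).
iodinated_scan = [(85.0290, 300.0), (126.9049, 1500.0), (341.9100, 800.0)]
other_scan = [(85.0290, 300.0), (126.9500, 1500.0)]
```

Counting the features whose high-energy scans pass such a check, per sample and per day, would give the kind of summary shown for the number of iodine-containing metabolites through time.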
The chlorine-iodine–containing vDOM metabolites were not detected in the fjord water and in the absence of E. huxleyi (fig. S6), further highlighting their close association with viral infection of E. huxleyi. Closer examination of their temporal profiles indicated an early appearance (day 16 in bag 4; Fig. 5D), which preceded the first detection of extracellular EhV by 1 day and the onset of bloom demise by 2 days. The abundance of chloro-iodo metabolites increased during the demise phase until day 20 and then remained constant until day 23, indicating that these metabolites are not immediately consumed, transformed, or exported. Thus, the chloro-iodo metabolites may function as sensitive biomarkers for virus-induced demise of E. huxleyi blooms in the natural environment.
Chloro-iodo metabolites as hallmarks of viral infection in the ocean
We sought to examine the significance of these chloro-iodo metabolites as metabolic signatures of viral infection in oceanic E. huxleyi blooms. Biomass samples for endometabolite analysis were collected during the NA-VICE cruise in the North Atlantic, capturing E. huxleyi blooms at different infection stages that were defined using several diagnostic markers (32). “Late infection” sites were characterized by high EhV levels, whereas at “postinfection” sites, only low levels of E. huxleyi were detected, indicating that the samples were collected after bloom demise (32). We were able to detect 7 of 12 chloro-iodo metabolites at the late infection bloom stage (cast 29), 5 of which were present in high intensities (Fig. 5E; for comparison of LC-MS data, see fig. S7 and data S2). In contrast, only trace amounts of these metabolites were detected at the two postinfection sites (casts 20 and 25; Fig. 5E). The presence of the chloro-iodo metabolites in marine particulate matter (i.e., intracellularly), in addition to their presence as dissolved metabolites in the mesocosm bags (i.e., extracellularly), suggests that their production is directly linked to virus-infected E. huxleyi cells or to closely associated microorganisms. The lower intracellular amounts of chloro-iodo metabolites at the postinfection bloom stage may be a result of host cell lysis, shunting these metabolites from the particulate to the dissolved pool. Similar to the mesocosm bags, a general increase in iodine-containing metabolites was observed at the late infection bloom stage (fig. S8). We identified six of these iodine-containing metabolites as chloro-iodo metabolites (metabolites #24 to 29; data S2), of which two were detected at low intensities in mesocosm samples (metabolites #24 and 28). The distinct halogenated metabolic signature of viral infection of E. huxleyi, which we found in mesocosm enclosures in the Norwegian Raunefjorden, can thus be found also in open ocean blooms.
Halogenation is a prominent attribute of natural products in the marine environment due to the high concentrations of halide ions in seawater, with ~500 mM chloride, ~1 mM bromide, and ~0.001 mM iodide (33). Accordingly, several thousand chlorine- and bromine-containing metabolites have been reported, compared to less than 200 iodine-containing metabolites (34). Among them are only a small number of chloro-iodo metabolites, including tryptophan derivatives isolated from a marine sponge (35) and coral-derived diterpenoids (36). In the pelagic ecosystem, the formation of metabolites that are both chlorinated and iodinated has been reported only for volatile organohalogens, such as chloroiodomethane (29, 37). While some halogenated metabolites have antioxidant and antipathogen activity (38), little is known about their ecological roles in the marine environment. All known algal halogenases belong to the class of haloperoxidases (39), which lead to the formation of halogenated metabolites following the oxidation of halides by H2O2. Recently, H2O2 was shown to play a pivotal role in regulating E. huxleyi cell death during the onset of the lytic phase of viral infection (40). Therefore, the increased levels of halogenated metabolites may be a result of enhanced haloperoxidase activity in E. huxleyi cells, acting as a possible strategy to scavenge surplus reactive oxygen species during viral infection. However, to date, no halogenases have been reported for E. huxleyi. Recently, a halogenase encoded by a marine cyanophage was found (41), illustrating how viruses can expand the metabolic capabilities of the infected host. Once part of the DOM pool, the chloro-iodo metabolites may be subjected to microbial activity, including transformation, assimilation, and remineralization processes. For example, Rhodobacteraceae, heterotrophic bacteria that are closely associated with E. huxleyi blooms, are known to encode dehalogenases (42).
Thus, it would be interesting to investigate whether Rhodobacteraceae can use the chloro-iodo metabolites as a carbon or energy source.
The importance of the viral shunt to the marine DOM pool was first highlighted more than 20 years ago (10). Since then, the role of viruses as major ecological and evolutionary drivers in the aquatic ecosystem has become increasingly apparent (43). Nevertheless, we still lack quantitative tools to assess the extent of the viral shunt, its metabolic composition, and the direct consequences for the marine microbial community. By applying an untargeted exometabolomics approach in high temporal resolution, we mapped the metabolic landscape of biologically active, labile DOM that evolves during phytoplankton bloom succession. We further resolved the metabolic footprint of lytic viral infection of E. huxleyi in the marine environment, revealing that halogenation with both chlorine and iodine is a hallmark of the E. huxleyi vDOM. Accumulation of the halogenated vDOM metabolites during bloom demise indicates their stability under environmental conditions. Consequently, they may serve as metabolic biomarkers for the quantification of the viral shunt in E. huxleyi blooms. Decoding the unique metabolic landscape produced by different microbial interactions that control cell fate in phytoplankton blooms will provide essential insights into the impact of these microscale interactions on large-scale biogeochemical processes in the marine environment.
MATERIALS AND METHODS
Chemicals and internal standards
All solvents and metabolite standards were obtained at highest purity. Hydrochloric acid [HCl, ≥32% (T), Fluka] and methanol (Chromasolv LC-MS Ultra) used for SPE were purchased from Honeywell (Seelze, Germany). For all other purposes, methanol [ultragradient high-performance liquid chromatography (HPLC)] was purchased from J. T. Baker (Norway). Acetonitrile (ULC/MS), methyl tert-butyl ether (MTBE, HPLC), hexane (HPLC), and formic acid (99%, ULC/MS) were purchased from Bio-Lab (Jerusalem, Israel). Acetone (≥99.8%, Chromasolv HPLC) was purchased from Sigma-Aldrich (Saint Louis, MO, USA). Water (HiPerSolv Chromanorm) used for SPE was purchased from VWR (Oslo, Norway). For all other purposes, water was purified by a Milli-Q system (resistivity of 18.2 megohm cm at 25°C, total organic carbon <5 parts per billion; Merck Millipore, Molsheim, France). Indole-3-acetic-2,2-d2 acid (d2-IAA; ≥98%) and N-hexanoyl-l-homoserine lactone-d3 (d3-C6-HSL; ≥99%) were used as isotopically labeled extraction standards (Santa Cruz Biotechnology, Dallas, TX, USA). Caffeine-(trimethyl-d9) (98%; Sigma-Aldrich) and l-tryptophan-d5 (98%; Cambridge Isotope Laboratories, Andover, MA, USA) were used as isotopically labeled injection standards for ultraperformance LC–HRMS (UPLC-HRMS) analysis.
Mesocosm setup and sampling
The mesocosm experiment AQUACOSM VIMS-Ehux was carried out between 23 May (day −1) and 16 June (day 23) 2018 in Raunefjorden at the Marine Biological Station Espegrend, Norway (60°16′11″N; 5°13′07″E) as previously described (44). Four light-transparent enclosure bags were filled with nonfiltered surrounding fjord water (day −1; pumped from 5 m depth) and continuously mixed by aeration (from day 0 onward). Each bag was supplemented with nutrients at a nitrogen-to-phosphorus ratio of 16:1 (1.6 μM NaNO3 and 0.1 μM KH2PO4 final concentrations) on days 0 to 5 and 14 to 17, whereas on days 6, 7, and 13, only nitrogen was added. Nutrient concentrations and temperature were measured daily (45). Water samples were collected daily (7:00 a.m.) from each bag and the surrounding fjord, which served as an environmental reference. For flow cytometry, water samples were collected in 50-ml tubes from approximately 1 m depth. For all other purposes, water samples were collected in 10- to 20-liter carboys (rinsed with <100-kDa filtered seawater) from approximately 1 m depth using a peristaltic pump at ca. 5 liters/min and prefiltered with a 200-μm nylon mesh. Samples were kept at 4°C until further processing.
Enumeration of phytoplankton cells by flow cytometry
Water samples were prefiltered using 40-μm cell strainers and immediately analyzed with an Eclipse iCyt flow cytometer (Sony Biotechnology, Champaign, IL, USA) as previously described (21). A total volume of 300 μl with a flow rate of 150 μl/min was analyzed. A threshold was applied on the forward scatter signal to reduce background noise. Four phytoplankton populations were identified by plotting the autofluorescence of chlorophyll (emission, 663 to 737 nm) versus phycoerythrin (emission, 570 to 620 nm) and side scatter: calcified E. huxleyi (high side scatter), Synechococcus (high phycoerythrin), nanoeukaryotes (high chlorophyll), and picoeukaryotes (low chlorophyll; fig. S1).
Enumeration of extracellular EhV by qPCR
Water samples (1 to 2 liters) were sequentially filtered by vacuum through hydrophilic polycarbonate filters with a pore size of first 20 μm (47 mm; Sterlitech, Kent, WA, USA), then 2 μm (Isopore, 47 mm; Merck Millipore, Cork, Ireland), and lastly 0.22 μm (Isopore, 47 mm; Merck Millipore). Filters were immediately flash-frozen in liquid nitrogen and stored at −80°C until further processing. DNA was extracted from the 0.22-μm filters using the DNeasy PowerWater kit (QIAGEN, Hilden, Germany) according to the manufacturer’s instructions. Each sample was diluted 100 times, and 1 μl was then used for quantitative polymerase chain reaction (qPCR) analysis. EhV abundance was determined by qPCR for the mcp gene (46) using the primers 5′-acgcaccctcaatgtatggaagg-3′ [mcp1Fw (47)] and 5′-rtscrgccaactcagcagtcgt-3′ (mcp94Rv). All reactions were carried out in technical triplicates. For all reactions, Platinum SYBR Green qPCR SuperMix-UDG with ROX (Invitrogen, Carlsbad, CA, USA) was used as described by the manufacturer. Reactions were performed on a QuantStudio 5 Real-Time PCR System equipped with the QuantStudio Design and Analysis Software version 1.5.1 (Applied Biosystems, Foster City, CA, USA) as follows: 50°C for 2 min, 95°C for 5 min, and 40 cycles of 95°C for 15 s and 60°C for 30 s. Results were calibrated against serial dilutions of EhV201 DNA at known concentrations, enabling exact enumeration of viruses. Samples showing multiple peaks in melting curve analysis or peaks that did not correspond to the standard curves were omitted.
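Calibration against a dilution series is commonly done with a standard curve, which can be sketched in Python as below. This is a generic illustration under stated assumptions rather than the authors' procedure: the threshold cycle (Ct) is assumed to be linear in log10 of the template copy number, the standards and Ct values are invented, and only the 100-fold sample dilution is taken from the text.

```python
import math

def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept, dilution_factor=100):
    """Invert the standard curve and correct for the sample dilution."""
    return dilution_factor * 10 ** ((ct - intercept) / slope)

def amplification_efficiency(slope):
    """Efficiency per cycle; 1.0 corresponds to perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0

# Hypothetical standards: 10^2 to 10^6 copies with ideal doubling per cycle.
ideal_slope = -1.0 / math.log10(2.0)
standards_log10 = [2.0, 3.0, 4.0, 5.0, 6.0]
standards_ct = [40.0 + ideal_slope * x for x in standards_log10]

slope, intercept = fit_standard_curve(standards_log10, standards_ct)
sample_copies = copies_from_ct(standards_ct[2], slope, intercept)
```

The returned value is in copies per reaction volume of undiluted extract; converting it to copies per liter of seawater would additionally require the elution volume and the filtered water volume, which are not modeled here.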
Sampling and extraction of DOM by SPE
To collect DOM of <0.22-μm particle size, water samples were filtered gently, acidified, and led through hydrophilic-lipophilic balance (HLB) SPE cartridges, which selectively adsorb metabolites and thereby extract a fraction of the DOM pool. Glassware and chemically resistant equipment were used whenever possible and cleaned with HCl (1 or 10%) and deconex 20 NS-x (Borer Chemie, Zuchwil, Switzerland) to reduce contaminations. On day 5, no samples were extracted for DOM analysis. Water samples were first gravity-filtered through 25-μm stainless steel filters (47 mm; Sinun Tech, Barkan, Israel), which were precleaned by thorough washing in a polarity gradient of organic solvents (water, methanol, acetone, and hexane). Filtrates were then filtered gently by vacuum (<400 mbar underpressure) through precombusted GF/A glass microfiber filters (≥5 hours at 460°C; 47 mm; GE Healthcare Whatman, Buckinghamshire, UK) and lastly through 0.22-μm precleaned (48) hydrophilic polyvinylidene difluoride (PVDF) filters (<600 mbar underpressure; 47 mm; Durapore, Merck Millipore). Filtration led to 70 to 84% reduction in the abundance of large virus-like particles (VLPs) and to the complete removal of bacteria (fig. S9, A and B). Per sample, 1 liter of filtrate was collected in a glass bottle and spiked with 5 μl of internal standard solutions containing d2-IAA (1 μg/μl in water) and d3-C6-HSL (0.2 μg/μl in methanol), except for days 1 to 2, in which no internal standards were added. The extraction efficiency was estimated at ~65% for d2-IAA and ~81% for d3-C6-HSL by comparing the average of all biological samples to the “IS matrix” samples (fig. S10 and table S1). Following addition of internal standards, the filtrates were incubated for 2 to 3 hours at 4°C in the dark and then acidified to pH 2 using 10% HCl (49). 
Metabolites were extracted using SPE cartridges (Oasis HLB, 500 mg; Waters, Milford, MA, USA) as follows: cartridges were conditioned (6 ml of methanol), equilibrated (6 ml of 0.01 N HCl), and then loaded by gravity with the acidified samples (1.5 to 2.5 hours). The cartridges were then washed (18 ml of 0.01 N HCl), dried completely using a vacuum pump, and gravity-eluted with 5 ml of methanol into 4-ml glass vials. In total, 110 biological samples were collected. Blank samples and internal standard quality control (IS QC) samples were obtained every 5 days (in total, 19 samples; table S1). Eluates were kept at −80°C, dried under a flow of nitrogen (TurboVap LV, Biotage, Uppsala, Sweden) within 1.5 months after collection, and stored at −80°C until further processing.
Enumeration of large VLPs and bacteria in seawater filtrates by flow cytometry
Extracellular large VLPs and bacteria were quantified as described previously (21, 50). Briefly, water samples were fixed with glutaraldehyde (0.5% final concentration) for 30 min at 4°C, plunged into liquid nitrogen, and then thawed. Ten microliters of fixed sample was stained with SYBR Gold (Invitrogen, Paisley, UK) prepared in tris-EDTA buffer as instructed by the manufacturer (5 μl of SYBR Gold in 50 ml of 0.22 μm filtered tris-EDTA), then incubated for 20 min at 80°C, and cooled down to room temperature. Flow cytometry analysis was performed using an Eclipse iCyt flow cytometer (Sony Biotechnology) with excitation at 488 nm and emission at 525 nm. Gates for large VLPs and bacteria were set by plotting the emission at 525 nm against the emission at 663 to 737 nm. A total volume of 30 μl with a flow rate of 10 μl/min was analyzed. A threshold was applied on the basis of the forward scatter signal to reduce background noise. The gates for large VLPs and bacteria were set by comparing to reference samples containing fixed EhV201 and bacteria from laboratory cultures (fig. S9C).
Untargeted profiling of semipolar metabolites by UPLC-HRMS
Biological samples were randomized and divided into three batches with approximately 40 samples in each batch, including blanks and IS QC samples (table S2). Randomization was performed automatically using an in-house R (51) script with the following constraints: the total number of biological samples per batch was either 36 or 37, of which 7 or 8 samples were randomly sampled from the pool of fjord samples and the remaining samples were randomly sampled from the pool of bag samples. Every experimental sampling day was represented at least once in each analytical batch. Per batch, samples were thawed, redissolved in 310 μl of methanol:water (1:1, v/v) containing l-tryptophan-d5 (2.1 μg/ml) and caffeine-(trimethyl-d9) (1.5 μg/ml) as injection standards, vortexed, sonicated for 10 min, and centrifuged at 3200g for 10 min at 4°C. The supernatants were transferred to 200-μl glass inserts in autosampler vials and directly used for LC-MS analysis. An aliquot of 1 μl was analyzed using UPLC coupled to a photodiode array detector (ACQUITY UPLC I-Class, Waters) and a quadrupole time-of-flight mass spectrometer (SYNAPT G2 HDMS, Waters), as described previously (52) with slight modifications. Briefly, metabolites were separated using an ACQUITY UPLC BEH C18 column (100 mm by 2.1 mm, 1.7 μm; Waters) attached to a VanGuard pre-column (5 mm by 2.1 mm, 1.7 μm; Waters) with a gradient of 5 to 100% acetonitrile at a flow of 0.3 ml/min and a total run time of 40 min. The mobile phases consisted of 0.1% formic acid in either acetonitrile:water (5:95, v/v, mobile phase A) or acetonitrile (mobile phase B). The chromatographic gradient was set to linear from 100 to 72% mobile phase A over 22 min and from 72 to 30% mobile phase A over 13.5 min, after which the column was first washed with 100% mobile phase B for 2 min and then returned to initial conditions (100% mobile phase A) and equilibrated for 1.5 min. The photodiode array detector was set to 200 to 600 nm. 
A divert valve (Rheodyne) excluded 0 to 1.2 min and 35 to 40 min from injection to the mass spectrometer. The electrospray ionization source was set to 140°C source and 450°C desolvation temperature, 1.5-kV capillary voltage, and 20- or 27-eV cone voltage (positive or negative ionization mode, respectively), using nitrogen as desolvation gas (800 liters/hour) and cone gas (53 liters/hour). The mass spectrometer was operated in full-scan MS^E resolution mode in two separate acquisitions, positive (25,000 at m/z 556) and negative (26,000 at m/z 554) ionization modes, over a mass range of 50 to 1600 Da, alternating with 0.1 min scan time between low- (1 eV of collision energy) and high-energy scan function (collision energy ramp of 10 to 45 eV in positive and 10 to 40 eV in negative ionization mode). LC-MS analyses were performed over 2 weeks with about 120 injections per batch in the positive and negative ionization modes, including blanks, IS QC, authentic standard mixtures, reference material, and aliquots of a pooled QC sample (tables S3 and S4). The pooled QC sample was generated by combining aliquots of 10 μl from all biological samples of the first batch. For each batch, a new aliquot was transferred to an injection vial.
Comparative analysis of untargeted metabolite profiling data
LC-MS files were converted from Waters RAW binary files to open-format “mzXML” files using the command line version of the “msconvert” utility as part of the ProteoWizard toolkit (53). Conversion parameters were set as follows: 64-bit numeric accuracy, “zlib” compression enabled, and “scanEvent” filter set to “1,” corresponding with the full-scan acquisition channel. Preprocessing of the “mzXML” files, which generates matrices with aligned LC-MS features across experimental samples with corresponding integrated peak area values, was performed using the R packages “xcms” (54) and “CAMERA” (55) obtained from the Bioconductor repository (www.bioconductor.org). More recent 3.x versions of “xcms” have dedicated functions for quality control of raw data preprocessing, for which parameters were fine-tuned (tables S5 and S6 and fig. S11). The internal standards d2-IAA and d3-C6-HSL were used to assess the outcome of the grouping step after retention time alignment (fig. S12). Feature matrices were inter- and intrabatch-corrected using an in-house R script applying the algorithm presented by van der Kloet et al. (56). Briefly, peak intensities in each batch were corrected for systematic variations in MS sensitivity by using a nonlinear curve fitted to peaks from the pooled QC samples, whereas nonsystematic interbatch fluctuations in MS sensitivities were corrected by adjusting the medians of the regression curves. PCA was applied to the batch-corrected data to have an initial overview of the sample separation and as a reference for the data normalization procedure. Two fjord samples (day 10 and day 19) were identified as outliers and omitted from further analysis. Choice of the most appropriate normalization method was based on the following experimental constraints: absence of technical replicates, potentially high variability between biological groups (the mesocosm bag samples are not true replicates), and the strong influence of background environmental fluctuations. 
The probabilistic quotient normalization algorithm (57) was chosen as it applies minimal assumptions about the data and instead relies on the empirical distribution in a reference sample set. Each peak intensity is thereby normalized by the quotient of the same peak in the reference sample (the average of the pooled QC samples). As pooled QC samples capture most of the analytical variation that remains after batch correction and some of the environmental variability, they are a natural choice as reference sample set. PCA was then reapplied to the normalized feature matrix in the positive (Fig. 2C) and the negative (fig. S13) ionization modes.
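The normalization step can be illustrated with a minimal sketch of the standard probabilistic quotient procedure, written in Python for illustration only (the analysis itself was done in R). It assumes that the reference spectrum is the feature-wise mean of the pooled QC samples and that each sample is divided by the median of its feature-wise quotients against that reference; the function name and the handling of zero intensities are hypothetical choices, not taken from the original script.

```python
from statistics import mean, median

def pqn_normalize(samples, qc_samples):
    """Probabilistic quotient normalization of feature intensity vectors.

    samples, qc_samples: lists of equal-length lists, one list per sample.
    Assumption: the reference is the feature-wise mean of the pooled QCs.
    """
    n_features = len(qc_samples[0])
    reference = [mean(qc[j] for qc in qc_samples) for j in range(n_features)]
    normalized = []
    for intensities in samples:
        # Quotients are only computed where both intensities are positive.
        quotients = [x / r for x, r in zip(intensities, reference)
                     if x > 0 and r > 0]
        dilution = median(quotients)  # most probable dilution factor
        normalized.append([x / dilution for x in intensities])
    return normalized
```

Using the median quotient rather than the total signal makes the scaling factor robust to a small number of features that change strongly between samples.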
One-way repeated-measures ANOVA was performed to test whether the DOM composition, as represented by the scores of PC axis 1 (Fig. 2C), changed significantly over time. Using the R package “nlme,” a linear mixed-effects model was fitted by setting the factor “time” as a fixed effect and the factor “bag number” as a random effect. The degree of change in DOM composition was further assessed for each phytoplankton bloom phase by fitting a linear regression model using the R function “lm” (fig. S2). Furthermore, a Pearson correlation analysis was performed to correlate PC axis 1 with water temperature and flow cytometry–based chlorophyll level using the R function “cor” while omitting missing values.
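As an illustration of the correlation and regression steps, the following Python sketch computes a Pearson coefficient on complete observation pairs only, together with a least-squares slope and intercept. It is not the R code used in the study; the function names are hypothetical, and missing values are assumed to be encoded as None.

```python
from math import sqrt

def complete_pairs(x, y):
    """Keep only observations where both values are present (not None)."""
    return [(a, b) for a, b in zip(x, y) if a is not None and b is not None]

def _sums(pairs):
    """Means and centered sums of squares/products for paired data."""
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return mean_x, mean_y, sxy, sxx, syy

def pearson_r(x, y):
    """Pearson correlation coefficient, omitting incomplete pairs."""
    _, _, sxy, sxx, syy = _sums(complete_pairs(x, y))
    return sxy / sqrt(sxx * syy)

def ols_line(x, y):
    """Least-squares slope and intercept of y on x, omitting incomplete pairs."""
    mean_x, mean_y, sxy, sxx, _ = _sums(complete_pairs(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```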
Hierarchical cluster analysis was applied to the feature matrix in the positive ionization mode. First, data were filtered by removing features in which the intensity of the fjord samples was higher than 60% of the maximum intensity for that particular feature. While being a very simplistic filtering approach, it removed most features related to environmental changes and allowed us to focus on features that are most relevant to the biological processes of interest. The feature matrix was then log-transformed and standardized per mass feature. The “heatmap.2” function from the “gplots” R package was used to generate a heatmap with row-wise scaling and clustering and with the “redblue” color panel (Fig. 3A). Columns were ordered according to the experimental day factor to preserve the temporal property of the data and observe possible trends over time. Mass features of the selected clusters I to VI underwent further manual annotation to reduce redundancy in each cluster. The peak shape of the extracted ion chromatograms (EICs) from coeluting mass features was compared using MassLynx (version 4.1, Waters), and isotopes, adducts, and apparent neutral losses (e.g., of water) were annotated. This reduced the number of mass features in each cluster to a smaller number of feature groups as follows: cluster I, from 53 to 12; cluster II, from 43 to 16; cluster III, from 124 to 67; cluster IV, from 129 to 67; cluster V, from 239 to 146; and cluster VI, from 102 to 71. For each feature group, the protonated molecule ([M + H]+) was selected or calculated (data S1), and violin plots were generated to visualize their distribution in each cluster along with the retention time distributions (fig. S3). A global one-way ANOVA was performed for the protonated molecule and retention time distributions using the function “compare_means” from the R package “ggpubr” and the method set to “kruskal.test.”
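The filtering and scaling that precede clustering can be sketched as follows, in Python for illustration only. Several details are assumptions rather than statements from the text: the 60% criterion is applied to the highest fjord intensity relative to the highest intensity across all samples, the logarithm is base 10 with an offset of 1 to avoid log(0), and standardization is a z-score per feature. The function names are hypothetical.

```python
from math import log10
from statistics import mean, stdev

def keep_feature(bag_intensities, fjord_intensities, max_fraction=0.6):
    """True if the highest fjord intensity does not exceed max_fraction
    of the feature's maximum intensity across all samples (assumed reading)."""
    overall_max = max(bag_intensities + fjord_intensities)
    return max(fjord_intensities) <= max_fraction * overall_max

def log_standardize(intensities):
    """Log-transform one feature, then scale it to mean 0 and unit SD."""
    logged = [log10(v + 1) for v in intensities]  # +1 offset is an assumption
    center, spread = mean(logged), stdev(logged)
    if spread == 0:
        # A constant feature carries no pattern; return zeros instead of dividing by 0.
        return [0.0 for _ in logged]
    return [(v - center) / spread for v in logged]
```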
Correlation analysis of metabolite profiling data with extracellular EhV abundance
To focus on the bloom and demise phases of E. huxleyi as indicated by cell enumeration data, the raw data preprocessing steps described above were reapplied to a reduced set of samples (days 10 to 23) with more sensitive peak grouping parameters (“minSamples = 8,” “minFraction = 0”; tables S5 and S6). This reiteration enabled the detection of mass features that were present in only some of the mesocosm bags and for a few days only. The feature matrices resulting from this step contained almost twice as many features as the first, global feature matrices and were used, together with the extracellular EhV abundance data, for correlation and differential analysis. Pearson’s correlation coefficient values between the temporal profile of extracellular EhV and each feature corresponding with mesocosm bag 4 were calculated using the R function “cor.test.” The “alternative” was set to “greater” to find only features that are positively correlated with extracellular EhV abundance, and P values were adjusted for multiple correlation testing using the “BY” (Benjamini-Yekutieli) method. Features were filtered according to the following thresholds: estimated correlation r > 0.6, adjusted P < 0.05, and intensity of feature in the fjord samples <60% of the maximum intensity in the mesocosm bag samples, resulting in 862 mass features. The log-transformed peak intensities were clustered using the “agnes” function in the R package “cluster” and plotted as a heatmap using the function “heatmap.2” (fig. S4). Columns were ordered according to each mesocosm bag to highlight the differences between them. A subcluster of 128 mass features showed distinct differences between bags 3 and 4. Individual intensity profiles of each feature were plotted for the mesocosm bags and the fjord across days 10 to 23. 
These plots were used for manual inspection, which allowed the selection of a subset of 65 mass features that showed differential intensities between mesocosm bags, as observed for extracellular EhV abundance. Manual curation of these mass features revealed 20 feature groups in which the isotopes, adducts, and in-source fragments were annotated (data S2). Hereinafter, these feature groups are referred to as vDOM metabolites. Last, a heatmap of the most intense feature of each metabolite (the protonated molecule or the water loss fragment) was plotted using the function “heatmap.2” and the “PuBuGn” color panel of the R package “RColorBrewer” (Fig. 4B). The columns were ordered first according to bags following their level of viral infection and then days and the rows according to retention time.
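The multiple-testing adjustment and the threshold filter applied to the correlation results can be illustrated with the Python sketch below. It assumes that one-sided P values and correlation coefficients have already been computed for every feature, and it implements the Benjamini-Yekutieli step-up adjustment. The dictionary keys ("id", "r", "p", "fjord_max", "bag_max") and the function names are hypothetical, chosen for this example only.

```python
def adjust_by(p_values):
    """Benjamini-Yekutieli adjusted P values, returned in the input order."""
    m = len(p_values)
    harmonic = sum(1.0 / k for k in range(1, m + 1))
    # Walk from the largest to the smallest P value, keeping a running minimum.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    for steps_from_top, idx in enumerate(order):
        rank = m - steps_from_top  # ascending rank of this P value
        running_min = min(running_min, harmonic * m / rank * p_values[idx])
        adjusted[idx] = running_min
    return adjusted

def select_features(features, r_min=0.6, alpha=0.05, max_fjord_fraction=0.6):
    """Return ids of features passing the r, adjusted P, and fjord filters."""
    adjusted = adjust_by([f["p"] for f in features])
    return [f["id"] for f, q in zip(features, adjusted)
            if f["r"] > r_min
            and q < alpha
            and f["fjord_max"] < max_fjord_fraction * f["bag_max"]]
```

The Benjamini-Yekutieli correction remains valid under arbitrary dependence between tests, which is a reasonable property when many mass features derive from the same metabolite.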
Structural characterization of the vDOM metabolites
For structural information of the 20 vDOM metabolites, tandem MS (MS/MS) analyses were performed in both positive and negative ionization modes for the protonated or deprotonated molecules, respectively, using a collision energy ramp of 10 to 45 eV and a scan time of 0.5 s. Analyses were performed on samples with high intensities, namely, samples from days 22 and 23 of mesocosm bag 4. In case of low signal intensity, up to 3 μl was injected, or the most intense fragment ion ([M + H – H2O]+) or adduct ion ([M – H + FA]−) was selected. Manually curated MS/MS spectra were used to annotate fragments and neutral losses and for the prediction of molecular formulas using SIRIUS 4.0.1 (58). For formula prediction, the following elements were allowed: zero to three chlorine atoms (based on the isotope pattern), zero to one iodine atom (based on the presence of m/z 126.90 in MS/MS spectra acquired in negative ionization mode), and zero-infinite nitrogen atoms (set to “0” if only odd fragments were detected). Default settings were used for C, P, H, S, and O (“0-infinite”). Mass accuracy was set to 10 parts per million (ppm). For a summary of the 20 vDOM metabolites, see Table 1, and for annotated mass spectral information, see data S2. The predicted molecular formulas of the chlorine-iodine–containing metabolites were searched against several external mass spectral and natural product databases (table S7). The MS/MS spectra were used to further predict compound classes using the CANOPUS tool in SIRIUS 4.6.0, which predicts compound classes for unknown metabolites (59). This computational approach matched the fragment spectra of (i) the two nitrogen-iodine–containing metabolites (#9 and 12) to the subclass “N-acyl amines,” and (ii) the nine chlorine-iodine–containing metabolites (#3 to 8, 10, 14, and 16) to the subclasses “carbohydrates and conjugates,” “carboxylic acid derivatives,” and “carbonyl compounds.”
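The mass accuracy criterion used for formula prediction can be expressed as a relative error in parts per million. The short Python sketch below shows this arithmetic; the function names are hypothetical, and the default tolerance of 10 ppm simply mirrors the setting stated above.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error of a measured m/z in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def within_tolerance(measured_mz, theoretical_mz, tolerance_ppm=10.0):
    """True if the absolute mass error does not exceed tolerance_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tolerance_ppm
```

Because the tolerance is relative, the accepted absolute window widens with m/z: 10 ppm corresponds to 0.002 Da at m/z 200 and to 0.005 Da at m/z 500.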
Temporal profiles of the chloro-iodo metabolites and untargeted screening for iodine-containing metabolites
Temporal intensity profiles of the chloro-iodo metabolites were generated using MassLynx and QuanLynx (version 4.1, Waters). Peak areas were extracted for the [M + H]+ or [M + H – H2O]+ adduct ions above a signal-to-noise threshold of 10 (limit of quantification). The peak areas were normalized to the extraction standard d2-IAA. Normalization to d3-C6-HSL yielded similar results (fig. S14). Analysis of blank samples from the mesocosm experiment (n = 10; table S1) ruled out that any of the metabolites originated from sample processing.
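The quantification rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the vendor software's procedure: a peak is treated as quantifiable when its signal-to-noise ratio is at least 10, and its area is then divided by the area of the internal standard in the same sample. The function name and the use of None for non-quantifiable peaks are hypothetical.

```python
def normalize_to_standard(peak_area, signal_to_noise, standard_area, min_sn=10.0):
    """Peak area relative to the internal standard, or None if not quantifiable."""
    if signal_to_noise < min_sn or standard_area <= 0:
        # Below the limit of quantification, or no usable standard signal.
        return None
    return peak_area / standard_area
```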
For an untargeted screening for iodine-containing metabolites, EICs of the iodide fragment (m/z 126.90 ± 0.03 Da) in negative ionization mode following collision-induced dissociation in MSE mode were generated using MassLynx. EIC profiles were compared for (i) samples of bags 3 and 4 at the end of the E. huxleyi demise phase (day 23; Fig. 5C) and (ii) all mesocosm samples for days 10 to 23 (fig. S5A). Sum intensities of iodine-containing metabolites per bag and day throughout the whole experiment were calculated by summing the intensities of all scans of each EIC using an intensity threshold of 1.5 × 10³ arbitrary units (AU) (fig. S5B). Structural analysis of the additional chloro-iodo metabolites #21 to 23 was based on MS/MS analyses in negative ionization mode for the deprotonated molecules as described above.
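The extraction of the iodide fragment trace and the subsequent summation can be illustrated with the Python sketch below. It assumes that each scan is available as a pair of m/z and intensity lists and that the intensity threshold is applied to each scan of the EIC before summing; both the data layout and the function names are hypothetical and do not describe MassLynx internals.

```python
def iodide_eic(scans, target_mz=126.90, tolerance=0.03):
    """One summed intensity per scan for ions within target_mz +/- tolerance.

    scans: list of (mz_values, intensities) pairs, one pair per scan.
    """
    trace = []
    for mz_values, intensities in scans:
        trace.append(sum(intensity
                         for mz, intensity in zip(mz_values, intensities)
                         if abs(mz - target_mz) <= tolerance))
    return trace

def summed_intensity(trace, threshold=1.5e3):
    """Sum of all EIC scan intensities at or above the threshold (in AU)."""
    return sum(value for value in trace if value >= threshold)
```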
Extraction and metabolite profiling of E. huxleyi blooms in the North Atlantic
Water samples of natural E. huxleyi blooms were collected during the NA-VICE cruise as described previously (32). To extract metabolites that are associated with viral infection, cast 29 was selected, which represents a late infection stage (60), in addition to casts 20 and 25, which represent a postinfection stage (32, 60). Samples (1 to 1.5 liters) were prefiltered through 200-μm mesh and collected on 0.8-μm hydrophilic polycarbonate filters (47 mm; Millipore). The filters were then flash-frozen in liquid nitrogen and stored at −80°C until processing 7 years after collection. Extraction was performed as described previously (21) with slight modifications. Briefly, filters were extracted with 3 ml of precooled (−20°C) methanol:MTBE (1:3, v/v) solution containing d2-IAA (1.67 μg/ml) and d3-C6-HSL (0.33 μg/ml) as internal standards. The samples were shaken for 30 min at 4°C and sonicated for 30 min. The samples were then supplemented with 1.5 ml of water:methanol (3:1, v/v) solution, vortexed for 1 min, and centrifuged for 10 min at 3200g and 4°C. The upper organic phase was removed (1.5 ml), and the remaining aqueous phase was reextracted with 1.5 ml of precooled MTBE to further reduce lipophilic metabolites. Last, the aqueous phase (1.5 ml) was transferred to 2-ml centrifuge tubes, centrifuged for 10 min at 28,000g and 4°C, and the supernatant was dried under a flow of nitrogen (TurboVap LV), followed by lyophilization (Gamma 2-16 LSCplus, Martin Christ, Osterode am Harz, Germany). The dried extracts were kept at −80°C until LC-MS analysis.
Untargeted profiling of semipolar metabolites by UPLC-HRMS analysis was performed as described above, using an aliquot of 3 μl of each resuspended extract. All samples were screened for the presence of the 12 vDOM chloro-iodo metabolites (#3 to 8, 10, 14, 16, and 21 to 23). Two samples with high peak intensities (cast 29, 5 and 9 m) were subjected to MS/MS analyses in negative ionization mode as described above, using an injection volume of 4 μl. Mass differences between the mesocosm and NA-VICE cruise samples were calculated for precursor and product ions (data S2). For an exemplary mass spectral comparison of chloro-iodo metabolite #7, see fig. S7. Depth profiles of five chloro-iodo metabolites (#3, 5, 7, 8, and 14; Fig. 5D) that were above limit of quantification were generated as described above and further normalized to the filtered volume.
For an untargeted screening for iodine-containing metabolites, EICs of the iodide fragment were generated for three samples (cast 29, 9 m; casts 20 and 25, 8 m; fig. S8A). For all additional peaks, potential precursor ions were selected, and MS/MS analyses were performed in negative ionization mode as described above using the 5- and 9-m samples of cast 29. Depth profiles of six additional chloro-iodo metabolites derived from the NA-VICE cruise (#24 to 29) were generated as described above using the deprotonated molecules [M – H]− (fig. S8B).