Olfactory proteins of Endoclita signifer larvae and their roles in host recognition

Endoclita signifer causes severe damage to eucalyptus plantations, and its larvae accurately transfer to and damage eucalyptus in mixed forests, suggesting that the larval olfactory system contributes to host selection. The olfactory proteins in the head and tegument of E. signifer larvae were previously identified. To clarify the relationship between olfactory protein expression in the larval head and larval development, and the functions of these proteins in the recognition of plant volatiles, we studied the head transcriptomes of two larval instars and the expression profiles of olfactory proteins in the instars and after exposure to volatiles. Eight odorant-binding proteins, six chemosensory proteins, three odorant receptors, three gustatory receptors, and 18 ionotropic receptors were identified. Half of the olfactory proteins were most highly expressed in the young (5th-instar) larval head, and EsigGOBP2, EsigGOBP4, EsigGOBP5, EsigCSP1, EsigCSP3, EsigGR1 and EsigGR3 were highly expressed and showed a specific expression pattern. In addition, EsigGR3 was significantly downregulated after exposure to o-cymene, α-phellandrene, n-butyl ether, and 4-ethylacetophenone, and EsigGR1 was significantly downregulated after exposure to n-butyl ether. These seven olfactory proteins may be important in larval olfactory recognition. Furthermore, based on the receptors that were downregulated after exposure to volatiles and the previously reported electrophysiological activity in third-instar larvae, we speculate that the ligand of EsigGR1 is n-butyl ether and that the ligands of the newly identified EsigGR3 are all four electrophysiologically active compounds, which supports a role for EsigGR3 in host recognition in third-instar larvae of E. signifer. These results provide a way to find the key plant volatiles recognized by key olfactory proteins as new targets for pest control.


Background
The ghost moth Endoclita signifer Walker (Lepidoptera, Hepialidae) is the primary wood-boring pest of eucalyptus. It first attracted attention in China in 2007 and has caused great economic losses and ecological impacts in southern China, especially in Guangxi and Guangdong [1]. E. signifer is widely distributed in Japan and Korea in eastern Asia and from central, southern and southwestern China to India, Thailand and Myanmar in southern Asia [2]. In China, E. signifer is a native pest, and its host plants include 30 families, 40 genera and 51 species [3]. Before eucalyptus was planted over large areas in Guangxi and Guangdong, E. signifer did not damage large areas of forest and was not regarded as a pest. After eucalyptus plantations were established almost everywhere in Guangxi, extensive damage by E. signifer was found in 2007. Currently, plantations in all but 17.1% of the counties in Guangxi are infested [3].
In Guangxi, E. signifer has one generation per year, rarely two. The adults (Fig. 1A) emerge from the middle of March to April, followed by mating and oviposition. The larvae hatch within one month and then live in the soil. Interestingly, the larvae move from the soil to a standing tree after the third instar, from July to August (Fig. 1B). The larvae feed on bark, bore into the interior of the wood, and weave packages of wood bits and silk to cover the entrance to the wormholes, constructing homes (Fig. 1C-E). The larvae live in their homes from July or August to January of the following year, and pupation occurs in February.
Female oviposition is dispersed; however, the larvae can specifically damage eight species of eucalyptus in mixed forests accurately, so we hypothesized larval olfactory cues contribute to host selection. E. signifer is a native polyphagous insect pest, but it universally and severely damaged eucalyptus after it was planted in southern China in 2007, which is a typical example of native insect adaptation to exotic hosts [4]. Olfactory proteins in the head and tegument of the E. signifer larval transcriptome were previously identified, and 39 olfactory proteins were found to be expressed in the head, with EsigGR1 and EsigCSP3 as the key olfactory proteins [5], establishing a basis for studying the dynamic changes in E. signifer olfactory proteins and their relationship with larval host selection.
In addition, the functions of many olfactory proteins in insects have been explored using qRT-PCR, prokaryotic expression, immunofluorescence localization, fluorescence competitive binding, molecular docking, a Xenopus oocyte expression system, single-sensillum recording, and behavioral studies. Fluorescence binding assays indicated that three H. assulta PBPs show selectivity for linear alcohols and aldehydes of different lengths, and PBP1 and PBP2 have optimal affinities to ligands containing 13-15 and 12-14 carbon atoms, respectively [11]. OBP10 might be a carrier of oviposition deterrents, favoring the spread of H. assulta eggs [12]. For olfactory receptors, nonanal is the main ligand of OR67, as demonstrated with an in vitro Xenopus oocyte expression system and single-sensillum recording [13]. HassOR40/ORco is expressed in the B neurons of short trichodea sensilla, and the active tobacco volatile ligand nerolidol attracts both sexes in adult H. assulta [14]. HassOR23/ORco is narrowly tuned to farnesene isomers in the Xenopus oocyte expression system; farnesene inhibits H. assulta and attracts its endoparasitoid [15]. HassOR31 has much higher expression in the ovipositor than in the antennae or other tissues, and the Xenopus oocyte model system, electrophysiological responses, and oviposition preference experiments suggest that HassOR31 expression in the H. assulta ovipositor helps females to determine precise egg-laying sites in host plants [16]. Overall, olfactory proteins, including odorant-binding proteins and odorant receptors, interact with plant volatiles.
Transcriptome analyses of larvae have focused on pesticides or ecological adaptability, exploring the molecular evidence and differences based on physiological and biochemical reactions, such as the exposure of Spodoptera exigua to Cry1Ca protein [17] and Apis mellifera to carbendazim [18] and the desiccation tolerance of Polypedilum vanderplanki [19]. Regarding larval olfaction, six novel OBPs and CSPs were identified in the transcriptomes of H. assulta larval antennae and mouthparts, respectively, and four novel OBPs and seven novel CSPs were identified in the same transcriptomes of Helicoverpa armigera [20]. Tissue-specific profiles of H. armigera showed that six OBPs and four CSPs were specific to larval tissue, while 15 OBPs and 13 CSPs were expressed in both larvae and adults, and the remainder were adult-specific [20]. SexiOBP13 was highly expressed in the larval head but not in other larval parts and was not detected in any adult tissue; SexiOBP13 showed high binding affinity to the S. exigua sex pheromone component Z9,E12-14:OAc. This is supported by behavioral tests, indicating that SexiOBP13 plays a role in female sex pheromone reception in S. exigua larvae [21]. Immunohistochemistry demonstrated that anti-MscuOBP8 binds specifically to MscuOBP8 and showed that MscuOBP8 is expressed in the mandibular region of Melipona scutellaris larvae, supporting the hypothesis of olfactory function in immature stages [22]. Single-sensillum recordings revealed that larval antennal sensilla of the moth Heliothis virescens respond to specific sex pheromone components; the pheromone receptors HR6 and HR13, SNMP1, and pheromone-binding protein 1 (PBP1) and PBP2 were expressed in larval antennal sensilla or cells, indicating the responsiveness of larval sensilla to female-emitted sex pheromones [23]. All of the above indicate that olfactory proteins can be identified in larval tissues and that they function to detect plant volatiles or sex pheromones.
Based on larval olfactory protein reactions with plant volatiles, their specific lifestyle and the previously identified olfactory proteins [5] of E. signifer larvae, this study examined the transcriptomes of the heads of different instar E. signifer larvae and determined the expression profiles of E. signifer larvae olfactory proteins during larval development. In addition, we identified olfactory protein functions in host selection in young larvae. The larval stage is the longest period in insects, and its olfactory system is simple. Exploring the olfactory proteins in larvae, especially the olfactory proteins that recognize plant volatiles, can provide new insight into pest control.

Insect and tissue collection
Eighteen larvae each of the 5th and 12th instars and nine larvae of the 9th instar of E. signifer were collected from a damaged eucalyptus plantation by cutting trees from December 2019 to January 2020 and from September to November 2020 at the Gaofeng forest station (N22.907°, E108.266°), Guangxi, China. Larval samples were collected and stored at −80 °C.

cDNA library construction and Illumina sequencing
Total head RNA of nine of the 5th- and 12th-instar larvae was extracted using TRIzol reagent (Ambion) and the RNeasy Plus Mini Kit (No. 74134; Qiagen, Hilden, Germany), and the quantity was measured with a NanoDrop 8000 (Thermo Fisher Scientific, Waltham, MA, USA). Three RNA samples from the 5th-instar larval heads and three RNA samples from the 12th-instar larval heads were used to construct one cDNA library of the 5th- and 12th-instar heads, respectively. cDNA library construction and Illumina sequencing of the samples were performed at MajorBio Corporation (Shanghai, China). All cDNA library preparation steps, such as mRNA sample purification, fragmentation, synthesis of first-strand cDNA, end repair, and PCR amplification, were performed according to Zhang [5]. The cDNA library was sequenced on the HiSeq2500 platform.

Assembly, functional annotation and chemosensory gene identification
Raw read acquisition and clean read assembly were performed according to Zhang [5]. The clean reads were used in TransRate (http://hibberdlab.com/transrate/) and CD-HIT (http://weizhongli-lab.org/cd-hit/) to evaluate the sequences and remove redundant and similar sequences. Then, BUSCO (Benchmarking Universal Single-Copy Orthologs, http://busco.ezlab.org) was used to assess the completeness of the transcriptome assembly using single-copy orthologous genes. The annotation of unigenes was performed using NCBI BLASTx searches in the Nr protein database, with an E-value threshold of 1e−5. GO annotation was performed by the Blast2GO pipeline. The longest ORF for each unigene was determined by the NCBI ORF Finder tool (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). Expression levels were expressed as FPKM values (fragments per kilobase per million reads) [24], which were calculated by RSEM (RNA-Seq by Expectation-Maximization) (version 1.3.1) with default parameters [25]. Based on the FPKM results, we used DESeq2 (version 1.38.0; thresholds |log2FC| ≥ 1 and padj < 0.05) to analyze the differences between groups and to identify differentially expressed genes. Chemosensory gene (OBP, CSP, OR, GR, IR, and SNMP) identification was performed using BLASTx and manually checked by tBLASTn as described in Zhang [5]. The nucleic acid sequences of all chemosensory genes that were identified from the E. signifer larval head transcriptomes are listed in Additional file 1.
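The expression unit and the differential-expression thresholds described above can be illustrated with a minimal Python sketch. It is not the RSEM or DESeq2 implementation: RSEM estimates expected fragment counts and effective transcript lengths before computing FPKM, and DESeq2 fits its own statistical model to obtain log2 fold changes and adjusted P values. The function names and the numeric inputs below are hypothetical and serve only to show the arithmetic.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    Simplified textbook formula; RSEM uses expected counts and
    effective lengths (assumption: plain counts and lengths here).
    """
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def is_differentially_expressed(log2_fold_change, padj,
                                lfc_cutoff=1.0, padj_cutoff=0.05):
    """Apply the thresholds stated in the text: |log2FC| >= 1 and padj < 0.05."""
    return abs(log2_fold_change) >= lfc_cutoff and padj < padj_cutoff


# Hypothetical unigene: 500 fragments, 1000 bp long, 25 million mapped fragments.
example_fpkm = fpkm(fragments=500, length_bp=1000,
                    total_mapped_fragments=25_000_000)

# Hypothetical (log2FC, padj) pairs standing in for DESeq2 output.
calls = [is_differentially_expressed(lfc, p)
         for lfc, p in [(1.0, 0.049), (-2.3, 0.2), (0.9, 0.001)]]
```

Under these assumed inputs, only the first pair passes both cutoffs; the second fails on the adjusted P value and the third on the fold change.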

Sequence and phylogenetic analysis
Amino acid sequence alignment was performed using the Muscle method implemented in the Mega v6.0 software package [26]. The phylogenetic tree was constructed using the neighbor-joining (NJ) method [27] with a P-distance model and a pairwise deletion of gaps performed in the Mega v6.0 software package. The reliability of the tree structure and node support was evaluated by bootstrap analysis with 1000 replicates. The phylogenetic trees were colored and arranged in FigTree (Version 1.4.2). Considering that E. signifer is a primitive Lepidoptera moth, the phylogenetic analyses of the OBPs were based on Lepidoptera PBPs and OBPs of Coleoptera Dastarcus helophoroides [28], Diptera Chrysomya megacephala [29], Lepidoptera Plutella xylostella [30], S. exigua [31,32], H. armigera [20], and E. signifer. The gene names and GenBank numbers of P. xylostella, H. armigera and Lepidoptera PBPs are listed in Additional file 2, and the other gene sequences are listed in the reference articles.
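The distance measure named above can be sketched in a few lines of Python. This is an illustration of the p-distance with pairwise deletion of gaps, not the MEGA implementation, and the two aligned fragments are invented for the example. With pairwise deletion, a column is dropped only for the pair of sequences in which one member has a gap, so other pairs can still use that column.

```python
def p_distance(seq_a, seq_b, gap_chars="-?"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or missing-data symbol are
    skipped for this pair only (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    different = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # pairwise deletion: ignore this column for this pair
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return different / compared


# Hypothetical aligned amino acid fragments (not from the study).
d = p_distance("MKV-LA", "MRVALA")
```

Here five columns are comparable and one differs, so the distance is 0.2; a neighbor-joining tree is then built from the matrix of such pairwise distances.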

Expression analysis of different instars and volatile exposure
The total RNA of nine larval heads of each of the 5th, 9th and 12th instars was extracted following the methods described above for expression analysis of different instars. Four volatiles that were active in gas chromatography-mass spectrometry (GC-MS) and gas chromatography-electroantennographic detection (GC-EAD) analyses were selected for exposure of the third-instar larvae as described by [32]. Thirty-six third-instar E. signifer larvae were placed in a 50 mL jar covered with silver paper, and a glass pipe containing a piece of Whatman filter paper soaked with 50 μL of the odorant diluted to 10 g/L in methanol was added to the jar. Controls with nine third-instar E. signifer larvae were exposed to methanol only. All larval heads were dissected after 24 h of exposure, and RNA was extracted as described above. Three independent replicates for each treatment (nine larvae) were carried out.
NanoDrop2008 and agarose gel electrophoresis were used to examine the concentration and quality of the RNA. cDNA was synthesized with the TransScript One-Step gDNA Removal and Synthesis Super Mix (No. O10306; Trans, Beijing, China). Primers for the newly identified genes were designed using Primer3 (http://bioinfo.ut.ee/primer3-0.4.0/) (Additional file 3), and the previously designed gene primers and those for the reference genes were the same as those used by Zhang [5]. PCR analysis was conducted using a Roche LightCycler 480 II (USA). Genious 2X SYBR Green Fast qPCR Mix (No ROX) (No. RK21205; ABclonal, Wuhan, China) was used for the PCR under a three-step amplification. Each PCR was conducted in a 20 µL reaction mixture containing 10 µL of Genious 2X SYBR Green Fast qPCR Mix (No ROX), 0.8 µL of each primer (10 mM), 2 µL of sample cDNA (2.5 ng of RNA), and 7.2 µL of dH2O (sterile distilled water). The qRT-PCR cycling parameters were as follows: 95 °C for 180 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 30 s, and then 65 °C to 95 °C in increments of 0.5 °C for 5 s to generate the melting curves. Each qRT-PCR for each instar and exposure was performed in three biological replicates and three technical replicates. Negative controls without either template were included in each experiment. The Roche LightCycler 480 II was used to normalize the expression based on ΔΔCq values, with EsigCSP9 in 9th-instar larval heads and EsigGR3 in α-phellandrene as control samples, and the 2^(−ΔΔCt) method was used [33]. Before the comparative analyses, we tested for normality and equality of variances; all log-transformed data followed a normal distribution with equal variances. The comparative analyses for every gene among the three stages were assessed by a one-way nested analysis of variance (ANOVA), followed by Tukey's honestly significant difference (HSD) tests implemented in SPSS Statistics 18.0. Values are presented as the means ± SE.
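The 2^(−ΔΔCt) calculation mentioned above can be written out as a short Python sketch. It assumes roughly 100% amplification efficiency for both the target and the reference gene, which is the standard assumption of the method; the function name and the Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(sample of interest) - dCt(calibrator sample)
    Assumes ~100% PCR efficiency for both genes.
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the sample than in the calibrator, with an unchanged reference gene.
fold_change = relative_expression(
    ct_target_sample=24.0, ct_reference_sample=12.0,
    ct_target_calibrator=26.0, ct_reference_calibrator=12.0,
)
```

With these assumed values, ΔΔCt is −2, so the target is expressed four times as strongly in the sample as in the calibrator; a sample identical to the calibrator gives a value of 1.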

Transcriptome sequencing and sequence assembly
On average, we generated 50 million raw reads from each cDNA library of the E. signifer larvae. The average percentages of reads with Q20 and Q30 quality scores were 98.02% and 94.21%, respectively. After trimming the adapters, removing low-quality raw sequences using Trimmomatic (http://www.usadellab.org/cms/index.php?page=trimmomatic), and merging the head sequences, followed by splicing and assembly, we obtained 62,499 transcripts, with an N50 of 1666 bp, an average length of 915 bp, and a maximal length of 63,226 bp (Table 1; Fig. 2A). BUSCO analysis showed that the completion rate was 94.00%, the single-copy rate was 91.60%, and the duplicate rate was 2.40% (Table 1).
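For readers unfamiliar with the N50 statistic reported above, the following Python sketch shows one common way to compute it: the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembled bases. The transcript lengths in the example are invented, and assembly tools may differ in small details such as tie handling.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            # This transcript pushes the cumulative length past 50%.
            return length


# Hypothetical transcript lengths in bp (not from the assembly).
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

In this toy set the total is 32 bp, and the two longest transcripts (8 + 8) already reach half of it, so the N50 is 8.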

Homology analysis and Gene Ontology annotation
For 23.00% of the transcripts, we obtained matches to entries in the Nr protein database by BLASTx with an E-value cutoff of 1e−5. We observed the most sequence matches to Eumeta japonica (9.28%), followed by Chilo suppressalis (5.81%) and Hyposmocoma kahamanoa (4.95%) (Fig. 2B). We used Gene Ontology (GO) annotations to classify the 11,566 transcripts into functional groups using BLAST2GO, with P values calculated by a hypergeometric distribution test and an E-value below 1 × 10^−5. In the E. signifer transcriptome, molecular function accounted for most of the GO annotations (37.60%), followed by cellular component (33.37%) and biological process (28.94%).
In the molecular function category, the terms binding, catalytic activity, and transporter activity were the most highly represented. In the biological process category, the terms cellular process, metabolic process, and biological regulation were the most frequent. Membrane part, cell part, and organelle were the most abundant cellular component terms (Fig. 3A). In addition, 48,699 unigenes were assembled in the E. signifer larvae head transcriptome, of which 10,338 were differentially expressed (Fig. 3B).

Olfactory proteins
We identified eight transcripts encoding putative OBPs in E. signifer, of which three were general odorant-binding proteins (GOBPs) and five were identified in the head, thorax and abdomen cuticula transcriptome (underlined in Table 2) [5]. In addition, EsigOBP9 and EsigOBP11 were more highly expressed in the 5th head, and the opposite was true according to the FPKM of unigenes (Table 2). We identified six transcripts encoding putative chemosensory proteins (CSPs), three of which were previously identified (underlined), and all were more highly expressed in the 12th head (Table 2). Three ORs were identified, and EsigOR5 was more highly expressed in the 5th head (Table 2). We identified three transcripts encoding putative gustatory receptors (GRs), among which EsigGR1 was identified previously and was more highly expressed in the 12th head, while the others were more highly expressed in the 5th head (Table 2). We identified 18 ionotropic receptors (IRs), among which EsigIR1, EsigIR93a-1, EsigIR11, EsigIR75p-6, EsigIR93a-4, EsigIR93a-5, and EsigIR12 were more highly expressed in the 12th head, while the others showed the reverse pattern. EsigIR1, EsigIR40a-1, EsigIR8, EsigIR93a-1, EsigIR25a and EsigIR6 were identified previously (underlined) (Additional file 4).

Phylogenetic analysis of OBPs and CSPs
In the phylogenetic tree of OBPs (Fig. 4), the PBP and GOBP clades labeled with red included EsigGOBP1, EsigGOBP7, EsigOBP10, PxylGOBP1, PxylGOBP2, HarmGOBP2 and all Lepidoptera PBPs. The PBP clade with a 100% support rate is labeled with a yellow circle, and the GOBP clade with a 100% support rate is labeled with a red circle. Interestingly, the support rate between the PBP clade and GOBP clade was 95%.

Expression of olfactory proteins (except IRs) among the three instars
We characterized the expression profiles of the identified olfactory proteins in the transcriptomes of E. signifer 5th-, 9th-, and 12th-instar larval heads. Except for EsigOBP4, EsigOBP11, and EsigCSP8, all of the OBPs, CSPs, ORs, and GRs were expressed in at least one head (Fig. 5). Two, EsigCSP1 and EsigGR2, were not expressed in the 5th larval heads; EsigGOBP4, EsigCSP1, EsigOR3 and EsigGR2 were not expressed in the 9th larval heads; and EsigOBP3, EsigGOBP4, EsigCSP3, and EsigOR3 were not expressed in the 12th larval heads (Fig. 5). However, EsigGOBP2, EsigGOBP5, EsigCSP3, and EsigGR1 were expressed the highest among all of the olfactory proteins (Fig. 5). Among all of the olfactory proteins, nine (52.94%) were expressed the most in the 5th larval heads, among which EsigGOBP2, EsigCSP7, EsigCSP9, and EsigGR3 had significantly different expression patterns (p < 0.05); six (35.29%) olfactory proteins were expressed the highest in the 12th larval heads, among which EsigCSP2, EsigOBP9 and EsigOBP10 were significantly different (p < 0.05); and EsigOR5 and EsigGR1 (11.77%) were the most highly expressed in the 9th larval heads and were significantly different (p < 0.05) (Fig. 5). For the different kinds of olfactory proteins, 50.00% of the EsigOBPs, 50.00% of the olfactory receptors and 60% of the EsigCSPs were expressed the highest in the 5th larvae; 50.00% of the EsigOBPs, 40% of the EsigCSPs and 16.70% of the olfactory receptors were expressed the highest in the 12th larvae; and only 33.33% of the olfactory receptors were the highest expressed among all olfactory proteins in the 9th larvae (Fig. 5).
(Fig. 5 legend: 12th, the oldest (12th) instar larval head; 9th, the ninth instar larval head; 5th, the fifth instar larval head. 18S was used as the reference gene to normalize target gene expression. Error bars represent standard errors; different lowercase letters (a, b, c) above the bars denote significant differences at p < 0.05.)
Furthermore, only 50% of the olfactory receptors and 20.00% of the EsigCSPs were expressed at their lowest levels in the 12th and 9th larvae, respectively (Fig. 5). Regarding developmental trends across instars, EsigOBP3, EsigGOBP2, EsigCSP3 and EsigGR3 expression decreased with larval instar, while that of EsigOBP10, EsigGOBP5, and EsigCSP2 increased. Moreover, EsigGR3 was expressed the most in the 5th instar heads, followed by the 9th and then the 12th instar heads, and all three differed significantly from one another (Fig. 5).

Discussion
Eight transcripts encoding putative OBPs, six CSPs, three ORs, three GRs and 18 IRs were identified in the E. signifer head transcriptome, among which there were two new OBPs, three new CSPs, two new GRs, and 12 new IRs compared with the previous head and thorax and abdomen cuticula transcriptome [5], although the number of identified olfactory genes in the two E. signifer transcriptomes was almost the same. However, compared with other larval transcriptomes, there were fewer than the 20 OBPs, 11 CSPs, 9 ORs, 11 IRs, 7 GRs and 4 SNMPs in the newly hatched D. helophoroides larval transcriptome [28] and the 13 CSPs in the Chilo auricilius larval transcriptome [34]. Compared with transcriptomes of larvae and other life stages, there were far fewer than the 127 olfactory genes in the adult antennae and caterpillar antennae and maxillary palps transcriptome [35]; 25 ORs, 26 OBPs, 19 IRs, 23 GRs and 11 SNMPs in the Chlorops oryzae larvae, pupae and adult transcriptomes [36]; 58 ORs, 20 GRs and 21 IRs in the Cydia pomonella adult antennae and neonate head [37]; 57 OBPs, CSPs, 47 ORs, 6 GRs and 17 IRs in the Spodoptera littoralis adult antennae, larval antennae and maxillary palps transcriptomes [35]; and 34 OBPs, 20 CSPs, 10 ORs, six GRs, six IRs and three SNMPs in the eggs, 1st to 5th instar larvae, pupae, and female and male adult S. exigua transcriptomes [31]. The reasons for the differences are that primitive Lepidoptera moths, such as Hepialidae, have rarely been studied, with fewer data in the Nr database, and that a small number of olfactory proteins have also been found in other larvae [20]. More importantly, the simplicity of larval olfactory systems, their long lifetime and their ease of feeding make larvae an excellent model to study olfactory signal transduction and coding pathways [38] and to provide details of the molecular mechanisms of larval olfaction, such as in Helicoverpa/Heliothis [39] and S. littoralis [40].
According to the phylogenetic tree of OBPs supporting EsigGOBP7 as the PBP of E. signifer [5], we found that EsigGOBP7 and EsigOBP10 were in a PBP/GOBP clade with 98% support, and both were in a GOBP clade with a 100% support rate; however, the sister group, the PBP clade, had a 100% support rate, which indicated that EsigOBP10 and EsigGOBP7 were the GOBPs of E. signifer.
α-Pinene treatment regulated four CSPs in C. auricilius larvae, and CSP8 had good binding affinity with α-pinene in vitro [34]. SexiOBP13 may play a role in female sex pheromone reception in S. exigua larvae [21]. Many larval binding proteins function in the recognition of volatiles or pheromones. Except for EsigOBP4 and EsigCSP8, all of the olfactory proteins studied were expressed in larval heads of E. signifer. In comparison, one larva-specific OBP was found in S. littoralis [35] larvae and 10 in Lymantria dispar larvae [41], suggesting that the expression of OBPs in larvae is common in Lepidoptera. Furthermore, S. exigua OBP2 [31] showed predominantly larval head-biased expression, and 14 S. exigua OBPs were expressed in larval heads but not in adult antennae [32], indicating the existence of larval head-specific OBPs in insects. Two P. xylostella GOBPs were abundantly expressed in the three major sensilla basiconica of the larval antenna [30]. Among all olfactory proteins and instars, first, half of the olfactory proteins, including 50.00% of the EsigOBPs, 50.00% of the olfactory receptors and 60% of the EsigCSPs, had the highest expression in the 5th E. signifer larvae; second, 35.29% of the olfactory proteins, including 50.00% of the EsigOBPs, 40% of the EsigCSPs and 16.70% of the olfactory receptors, were expressed the highest in 12th E. signifer larvae; and finally, only two olfactory receptors were expressed the highest in middle (9th) larvae, indicating the olfactory proteins were expressed the highest in young (5th) larval head, followed by old (12th) and only a few in the middle (9th) stage larvae, which is in accordance with the need of young instar to select a host; however, the high expression in older instars needs to be researched further. 
Furthermore, EsigGOBP2, EsigGOBP5 and EsigCSP3 were expressed the highest among all olfactory proteins in the three larval head stages and were also previously reported to be the most strongly expressed in the 5th stage head of E. signifer larvae [5]. EsigGR1 exhibited the highest expression in the 5th larval tissues [5] and the highest expression among all olfactory proteins in the three larval head stages. C. megacephala OBP Cmeg33593_c0 was upregulated with increasing larval instar [29], which is consistent with the expression of EsigOBP10, EsigGOBP5, and EsigCSP2 increasing with larval instar. However, the expression patterns of EsigOBP3, EsigGOBP2, EsigCSP3, and EsigGR3 were the inverse, especially that of EsigGR3, which showed an obvious trend and significant differences. EsigGOBP4 was the specific OBP of the 5th E. signifer instar larvae, and EsigCSP1 was the specific CSP in the 12th-instar larvae.
Therefore, based on these expression patterns, EsigGOBP2, EsigGOBP4, EsigGOBP5, EsigCSP1, EsigCSP3, EsigGR1 and EsigGR3 may be the key olfactory proteins in E. signifer larvae and might be pivotal in their host choices. Furthermore, within larval heads, a comparison between caterpillar antennae and maxillary palps revealed numerous organ-specific transcripts, suggesting the complementary involvement of these two organs in larval chemosensory detection [35]. Of note, while most of the genes examined were expressed in larval heads, over half of them were also detected in nonolfactory tissues, such as the egg and thorax [31]. Therefore, the expression and functions of E. signifer olfactory proteins in larval nonolfactory tissues should be explored. In S. littoralis, caterpillars express a smaller set of olfactory genes than adults, SlitOBP21 and SlitGOBP1 are adult-specific [35], and 7 of 10 OBPs and CSPs are expressed more in larvae than in adults, while 2 of 10 OBPs are expressed more in adults than in larvae [29]. Whether the expression of olfactory proteins in E. signifer adults and larvae is the same as that in S. littoralis should be further explored. We did not identify PBPs and PRs in E. signifer larval heads, but four PBPs were expressed in S. exigua larval heads, and the expression of PBPs and pheromone receptors has been reported in the larvae of many lepidopterans [21,23,30,42].
Several larval-enriched OR transcripts have been identified [37]. The E. signifer ORs and GRs were expressed differently among the three examined larval stages: EsigGR1 and EsigOR5 expression was the highest in 9th instar larval heads; EsigOR3, EsigOR4, and EsigGR3 were the highest in 5th instar heads; and EsigGR2 was the highest in 12th instar heads. In E. signifer larvae, EsigOR3 was 5th-instar specific, and EsigGR2 was 12th-instar specific. Similarly, the expression of 50 ORs has been reported in larval heads of S. exigua [32], as well as sixteen ORs in H. armigera [38] and nine ORs in C. pomonella [37]. No larval-specific ORs were found in transcriptome data for larvae of S. littoralis, Dendrolimus punctatus, and L. dispar [35,41,43]. We identified 12 new IRs, in addition to the highly conserved subtype receptors of IRs, for example, IR8a and IR25a, which were also identified in the head and tegument transcriptome of E. signifer larvae. Interestingly, EsigGR1 expression was high in 5th- and 9th-instar larvae and lower in 12th-instar larvae; EsigGR1 may function in identifying the host among young larvae.
After exposure to four gas chromatography-mass spectrometry (GC-MS) and gas chromatography-electroantennographic detection (GC-EAD) active substances, the olfactory proteins showed different expression patterns. In the n-butyl ether treatment, 54.5% and 27.2% of genes were up- and downregulated, respectively. After 4-ethylacetophenone treatment, the same proportion of genes (27.2%) was up- and downregulated. This result is also supported by the fact that excitation of an OSN with its best ligand does not necessarily result in downregulation of gene transcription of the neuron's corresponding chemosensory receptor [44] in both S. exigua adults [45] and larvae [32], which was explained as a mechanism that mediates odor sensitization [46], a phenomenon that has also been observed in S. littoralis [47]. One study found that odorants induced a fast and reversible concentration-dependent decrease in the transcription of genes corresponding to activated receptors in intact mice [48]. Interestingly, after o-cymene and α-phellandrene treatment, 54.5% and 45.5% of genes, respectively, were downregulated, without any upregulated genes, which was the same as in mice. Combining the downregulated receptors with the GC-EAD reactivity, we speculate that the ligand of EsigGR1 is n-butyl ether and that the ligands of EsigGR3 are o-cymene, α-phellandrene, n-butyl ether and 4-ethylacetophenone. Most importantly, changes in gene expression after exposure to important plant volatiles can provide a way to find the key plant volatiles for the key olfactory proteins among the large number of plant volatiles. EsigGR3 was newly identified in the head transcriptome, and the third-instar larvae recognized all tested GC-EAD active compounds, which supports its role in host recognition in third-instar larvae of E. signifer.

Conclusions
We identified 38 olfactory proteins in the heads of E. signifer larvae. Around half of the olfactory proteins were most highly expressed in the young (5th-instar) larval head. EsigGOBP2, EsigGOBP4, EsigGOBP5, EsigCSP1, EsigCSP3, EsigGR1 and EsigGR3 may be important proteins in larval olfactory recognition. In addition, based on the receptors downregulated after exposure to volatiles and the GC-EAD reactivity in third-instar larvae, we speculate that the ligand of EsigGR1 is n-butyl ether and that the ligands of EsigGR3 are all four electrophysiologically active compounds. Important plant volatiles can be targeted for pest control. The simplicity of the E. signifer larval olfactory system, which helps to filter out the important olfactory proteins, along with the larvae's long life and ease of feeding, makes E. signifer larvae a suitable in vivo model for the study of olfactory signal transduction and coding pathways.