I. Database & tools
i. The datasets include annotated genome sequences of Anopheles sinensis (China strain), RNA-seq data, and insecticide-resistance data from China collected from research articles.
ii. Tools applied in ASGDB include the JBrowse genome browser, ViroBLAST and OpenLayers 3.
II. Data storage
The data used in ASGDB are stored separately and managed in MySQL relational databases.
III. Web interface
The ASGDB interface provides direct links to eight individual pages: Home, JBrowse, Search, Download, Resistance related gene, Resistance surveillance, Contact and Tutorial. All the links are clickable icons, and a single click leads to the respective page.
How to visualize the whole draft genome sequence?
Users can search by scaffold to locate regions of the An. sinensis genome. Scaffolds are selected from the drop-down menu. Users interested in a particular region of a scaffold can also enter its start and end positions to retrieve detailed information. In the genomic view, directional rectangular frames represent genes or ncRNAs on the positive or negative strand. A single click on a frame opens an information table with details such as annotation, location, GO, KEGG and sequences. The sequence data for the selected gene can be downloaded from the same page as FASTA files. When the visible URL (accessible either via the browser address bar or the “Share” button) is shared, other people see exactly the same region of the An. sinensis genome and the same set of open tracks on their screen.
For example, to view the scaffold “KE523861.1”, users select it from the drop-down menu. The page is divided into two parts. The smaller panel on the left lists the available feature types, from top to bottom: “GC content”, “gene”, “exon”, “miRNA”, “rRNA”, “tRNA” and “reference sequence”. Each of these can be displayed as a track in the detailed view for every scaffold. The larger window on the right is the track display region. Each track can be turned off by clicking the “cross” in front of its title, which lets users hide unwanted information. Tracks can be dragged up or down to reorder them, so that datasets of interest are displayed at the top. JBrowse also provides efficient panning and zooming within a genomic region via embedded navigation buttons. With these visualization modules, users can browse and search the genome at large scale in a graphical interface.
For genes, users can also obtain detailed information (name, description, location in the genome, length and nucleic acid sequence) by clicking the gene ID.
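The shareable URL described above can be illustrated with a minimal sketch. It assumes the JBrowse 1 query-string convention, in which `loc` holds `scaffold:start..end` and `tracks` holds a comma-separated list of track labels. The base URL and the track labels below are placeholders, not the actual ASGDB address or configuration.

```python
from urllib.parse import urlencode


def jbrowse_link(base_url, scaffold, start, end, tracks):
    """Build a shareable JBrowse-style link for one region and a set of tracks.

    base_url and the track labels are hypothetical; substitute the values
    shown in the browser address bar of the real site.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    params = {
        "loc": f"{scaffold}:{start}..{end}",  # region to display
        "tracks": ",".join(tracks),           # tracks to open, top to bottom
    }
    # Keep ':' and ',' unescaped so the link stays readable.
    return f"{base_url}?{urlencode(params, safe=':,')}"


if __name__ == "__main__":
    print(jbrowse_link("http://example.org/jbrowse/index.html",
                       "KE523861.1", 1000, 5000, ["gene", "exon"]))
```

Anyone opening such a link would land on the same scaffold region with the same tracks switched on, which is the behaviour the “Share” button provides.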
How to search specific genes of interest?
There are two sub-categories in the search part: simple search and BLAST search. To browse different types of genetic factors, a simple search can be performed using the following parameters:
(1) NCBI or ASGDB accession numbers.
(2) Gene name or symbol.
(3) GO ID or GO term.
(4) KEGG ID or KEGG annotation.
Users can enter these parameters to obtain specific gene information from ASGDB, and fuzzy queries are supported. When more than one gene matches the input keyword, all the matched genes are listed in the search results. The BLAST search allows genes to be searched using ViroBLAST. Users can perform similarity searches against each type of sequence using the various BLAST search forms (BLASTn, BLASTp, BLASTx, tBLASTn and tBLASTx). The reference database used for BLAST comprises all nucleic acid and amino acid sequences of An. sinensis. Users can enter nucleotide or protein query sequences, or upload a local sequence file in FASTA format, to search against the reference database. The BLAST search tool allows users to set parameters such as the e-value cut-off, alignment view mode and scoring matrix.
Taking CYP 9J53 as an example, users can enter several types of keywords, e.g. “KFB49800.1”, “CYP 9J53” or “9J53”, as the search content. Clicking the “GeneID” button displays the detailed information for this gene.
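The fuzzy keyword matching in this example can be sketched as a substring query. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name, column names and second demo row are hypothetical; only the pairing of “KFB49800.1” with “CYP 9J53” comes from the example above. It is an illustration of the idea, not the query ASGDB actually runs.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a hypothetical two-column gene schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gene (accession TEXT, symbol TEXT)")
    conn.executemany(
        "INSERT INTO gene VALUES (?, ?)",
        [
            ("KFB49800.1", "CYP 9J53"),   # pairing taken from the tutorial example
            ("DEMO0001.1", "DEMO GENE"),  # made-up placeholder row
        ],
    )
    return conn


def fuzzy_search(conn, keyword):
    """Return every gene whose accession or symbol contains the keyword."""
    pattern = f"%{keyword}%"  # '%' matches any run of characters in LIKE
    cursor = conn.execute(
        "SELECT accession, symbol FROM gene "
        "WHERE accession LIKE ? OR symbol LIKE ? "
        "ORDER BY accession",
        (pattern, pattern),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for kw in ("KFB49800.1", "CYP 9J53", "9J53"):
        print(kw, fuzzy_search(db, kw))
```

Passing the keyword as a bound parameter rather than formatting it into the SQL string keeps the query safe against injection, which matters for any public search form.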
At the top of the gene information page, users can view basic information on CYP 9J53, such as description, length and location. Clicking the “JBrowse” button on the right lets users visualize CYP 9J53 in the context of its scaffold. Below this, the sequence information of CYP 9J53 is presented. The exon regions are highlighted in red and the remaining sequences are introns. Clicking the “show pep” button displays the amino acid sequence. The lower portion of the information page describes the functional features of CYP 9J53, including orthologs, GO terms and KEGG pathways. The ortholog part displays genes in other mosquito species and fruit fly that are many-to-one or many-to-many orthologs of An. sinensis CYP 9J53. The predicted GO terms assign CYP 9J53 to the “iron ion binding” (0005506), “electron carrier activity” (0009055), “heme binding” (0020037) and “oxidation-reduction process” (0055114) categories. Users can click a GO term for detailed term information. CYP 9J53 participates in the “Linoleic acid metabolism” pathway. Clicking the KO (ko00591) opens a new page showing the reference pathway map. Transcriptional results are shown at the bottom of the results page and include technology, comparison, regulation, fold change and published articles.
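The exon and intron display described above can be mimicked in plain text. In the minimal sketch below, exons are shown in upper case and introns in lower case as a text substitute for the red highlighting; the sequence and the 1-based, inclusive exon coordinates are invented for illustration and are not those of CYP 9J53.

```python
def mark_exons(sequence, exons):
    """Return the sequence with exon bases upper-cased and the rest lower-cased.

    exons is a list of (start, end) pairs, 1-based and inclusive.
    """
    marked = list(sequence.lower())  # start with everything treated as intron
    for start, end in exons:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"exon {start}..{end} lies outside the sequence")
        for i in range(start - 1, end):  # convert to 0-based indices
            marked[i] = marked[i].upper()
    return "".join(marked)


if __name__ == "__main__":
    # Hypothetical 10-base gene with two exons separated by a 3-base intron.
    print(mark_exons("atgcccgtaa", [(1, 3), (7, 10)]))
```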
For a BLAST search, users can copy and paste multiple query sequences in FASTA format into the text box, or upload a sequence file in FASTA format.
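Before pasting several queries, it can help to check that the text really is multi-record FASTA: each record starts with a “>” header line, followed by one or more sequence lines. The parser below is a minimal sketch for that check, with invented record names and sequences; it is not part of ASGDB or ViroBLAST.

```python
def parse_fasta(text):
    """Split FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between records
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)  # sequence may wrap over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    demo = ">query1\nATGC\nGGTA\n>query2\nMKV\n"
    for name, seq in parse_fasta(demo):
        print(name, len(seq))
```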
How to search and view insecticide-resistant phenotypic and genotypic data?
ASGDB provides an OpenLayers map-based interface for users to obtain data. All the mosquito-sampling sites are marked with small red pompons. Clicking on a pompon opens a pop-up text box on the map showing the most recent insecticide-resistance record for that site. Users can also browse all the relevant information for this region on the same page, just below the map. Note that when sampling sites are in close proximity, the red pompons may lie very close together or even overlap. To avoid clicking on the wrong pompon, users can zoom in by clicking the “plus” button in the upper left corner or by scrolling up with the mouse's scroll wheel. Double-clicking the left mouse button on the map centres the map on the clicked position and zooms in at the same time. To zoom out, users can click the “minus” button or scroll down with the scroll wheel. To move the map, users can click and drag at any point on it. ASGDB also provides a search engine for retrieving information of interest by criteria such as collection site, year, insecticide and resistance mechanism.
Users can also find more details on the same page, just below the map.
Auto-complete input boxes are also used. In these boxes, the user types at least one letter of the requested term, and a list of all possible matching terms appears. After filling in the desired search criteria, click the “Search” button at the bottom right of the form. A new page then opens in the browser with all matching results. The matching results can also be downloaded.
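The auto-complete behaviour can be sketched as a case-insensitive lookup over the known terms. The sketch assumes prefix matching, which is one plausible reading of the description above; the real form may match anywhere in the term. The insecticide names in the usage example are illustrative and not taken from the ASGDB records.

```python
def suggest(terms, typed, limit=10):
    """Return up to `limit` distinct terms that start with the typed text."""
    needle = typed.strip().lower()
    if not needle:
        return []  # nothing typed yet, so nothing to suggest
    matches = sorted(t for t in set(terms) if t.lower().startswith(needle))
    return matches[:limit]


if __name__ == "__main__":
    # Illustrative vocabulary for an "insecticide" input box.
    vocabulary = ["deltamethrin", "DDT", "malathion"]
    print(suggest(vocabulary, "d"))
```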
How might users contribute new data for incorporation into the database?
Researchers can submit comments, suggestions or questions regarding any aspect of ASGDB on the “Contact” page. To promote the sharing and exchange of knowledge and ideas among researchers, the submission process is simple. No registration is required, although users need to provide a valid email address so that our team can contact them if any questions arise.