[...] 72 °C for 30 sec, and finally, 5 min at 72 °C. The PCR products were separated by 8.0% non-denaturing polyacrylamide gel electrophoresis (PAGE) and then visualized by silver staining. The pBR322 Marker I DNA ladder (Zheping, Biological Technology Development Co., Ltd, Beijing, China) was used as the standard size marker.

Genetic diversity analysis

The number of alleles (Na), observed heterozygosity (Ho), gene diversity (expected heterozygosity; He), and polymorphism information content (PIC) for each of the EST-SSR markers were calculated using PowerMarker V3.25 [22]. A genetic similarity matrix based on the "proportion of shared alleles" among the 32 adzuki bean accessions was generated using PowerMarker. An unrooted neighbor-joining tree based on the shared allele distances was constructed using MEGA 4 software [23] to reveal genetic relationships among the 32 adzuki bean accessions.

Results

Sequencing and de novo assembly of Illumina paired-end reads from adzuki bean transcriptomes

A total of 54.79 and 57.24 million paired-end raw reads were obtained for the 'Jingnong2' and 'Zhonghong5' varieties, respectively. After removal of the low-quality reads, 52.11 and 54.30 million clean reads with GC contents of 45.9% and 46.3% were obtained for the 'Jingnong2' and 'Jinghong5' varieties, respectively. The Q20 percentage of the clean reads from both adzuki bean varieties was 98.4%. The combined reads were assembled into 65,950 unigenes by Trinity. The average unigene length was 1,213 bp (N50 = 1,889 bp). The lengths of the unigenes ranged from 200 to 19,090 bp.
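The two assembly summary statistics quoted above, mean unigene length and N50, can be reproduced from a list of sequence lengths. The following is a minimal sketch, not the authors' pipeline: it assumes the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembled bases), and the function names and the example lengths are hypothetical.

```python
def mean_length(lengths):
    """Arithmetic mean of the sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    return sum(lengths) / len(lengths)


def n50(lengths):
    """N50 under the common definition: walk the lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the total assembled bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly of five unigenes (lengths in bp).
toy_lengths = [200, 500, 1000, 1500, 3000]
print(mean_length(toy_lengths))  # 1240.0
print(n50(toy_lengths))          # 1500
```

Because N50 weights long sequences more heavily than the mean does, it is typically larger than the average length, which matches the pattern reported here (N50 of 1,889 bp against a mean of 1,213 bp).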
Of these unigenes, 22,188 (33.6%) were 201 to 500 bp; 13,444 (20.4%) were 501 to 1,000 bp; 10,421 (15.8%) were 1,001 to 1,500 bp; 7,682 (11.7%) were 1,501 to 2,000 bp; 4,956 (7.5%) were 2,001 to 2,500 bp; 2,871 (4.4%) were 2,501 to 3,000 bp; and 4,388 (6.7%) were more than 3,000 bp in length (Fig 1).

Sequence annotation

For annotation of the assembled contigs and unique singletons, all unigenes were searched against the five databases (see Materials and methods). A total of 65,950 unigenes showed significant BLAST hits. Among those unigenes, 47,009 (71.3%) showed significant similarity to known proteins in the NR sequence database, of which 31,776 (48.2%) were similar to proteins in Swiss-Prot. A total of 51,131 unigenes were annotated in all databases. Based on annotation against the NR database, 11,580 (18.4%) unigenes were assigned to gene ontology (GO) terms (Fig 2). The sequences that belonged to the biological process, cellular component and molecular function clusters were categorized into 50 terms. Under the biological process category, the largest sub-category was cellular process (23,555, 16.1%), followed by metabolic process (23,390, 16.0%) and single-organism process (15,823, 10.8%); locomotion (41, <0.1%) was among the smallest.

PLOS ONE | DOI:10.1371/journal.pone.0131939 | July 6 | Development of EST-SSR from the Transcriptome of Adzuki Bean

Fig 1. Length frequency distribution of Illumina read sequences. doi:10.1371/journal.pone.0131939.g

Fig 2. Gene ontology (GO) classification of assembled unigenes. doi:10.1371/journal.pone.0131939.g

Under the cellular component category, cell component (26,966, 25.2%) and organelle component (21,271, 19.9%) were the largest groups, whereas only a few unigenes were assigned to virion (17, <0.1%), virio[...]