Chloroplast Genome and Phylogenetic Analysis of Katmon (Dillenia philippinensis Rolfe): A Philippine Endemic Fruit

Jairus Jake M. Lucero1
Judy Ann M. Muñoz2
Lyka Y. Aglibot3
Roneil Christian S. Alonday2,4,*
1Genetic and Molecular Biology Division, Institute of Biological Sciences, College of Arts and Sciences, University of the Philippines Los Baños, Laguna, 4031, Philippines.  
2Philippine Genome Center-Program for Agriculture, Livestock, Fisheries and Forestry, Office of the Vice Chancellor for Research and Extension, University of the Philippines Los Baños, Laguna, 4031, Philippines.
3National Plant Genetic Resources Laboratory, Institute of Plant Breeding, College of Agriculture and Food Science, University of the Philippines, Los Baños, Laguna, 4031, Philippines.
4Biochemistry Laboratory, Crop Biotechnology Division, Institute of Plant Breeding, College of Agriculture and Food Science, University of the Philippines Los Baños, Laguna, 4031, Philippines.

Background: Katmon (Dillenia philippinensis Rolfe) is a Philippine-endemic fruit species with a relatively well-studied biochemical profile but poor genomic characterization. Studies involving the chloroplast genome can provide valuable insights into its evolution and support conservation efforts.

Methods: The complete chloroplast genome of D. philippinensis was sequenced using Illumina NovaSeqX. Reads were quality-checked, assembled with GetOrganelle and annotated using CPGAVAS2 and GeSeq. Simple sequence repeats, codon usage and inverted repeat boundaries were analyzed. Phylogenetic relationships were inferred using concatenated rbcL and matK sequences via maximum likelihood analysis.

Result: The chloroplast genome was 161,591 bp long with a GC content of 36.3%. It exhibited the typical quadripartite structure, consisting of a large single-copy (LSC) region (89,411 bp), a small single-copy (SSC) region (19,208 bp) and a pair of inverted repeats (IR) (26,486 bp each). A total of 113 unique genes were identified, comprising 79 protein-coding, 30 tRNA and four rRNA genes. Fifty-four SSRs, primarily A/T mononucleotide repeats, and 53,863 codons were observed. Phylogenetic analysis placed D. philippinensis closest to D. suffruticosa and most distant from D. ovata. The complete chloroplast genome of D. philippinensis provides a valuable resource for phylogenetic studies, germplasm characterization and future breeding and conservation programs.

Endemic plants are species found solely within a specific geographical region. They are well adapted to their local habitats, so they cause less damage to the environment and are valuable resources for reinforcing food security. They also exhibit greater climate resilience and pest tolerance than commercial crops (Nhamo et al., 2022). Moreover, indigenous fruits have nutritional value comparable to, if not higher than, that of widely cultivated species (Durst and Bayasgalanbat, 2014). Despite their health benefits and potential for diet diversification, indigenous plants remain poorly studied (Oraye et al., 2023; Villarino and Villarino, 2023), which leads to their underutilization and prevents their full potential from being harnessed.
       
Dillenia philippinensis is a medium-sized, evergreen fruit tree belonging to the family Dilleniaceae (Aquino et al., 2015, as cited by Fatallo and Panes, 2022). It is distributed in many provinces across the Philippines, such as Laguna, Quezon, Oriental Mindoro and Cebu (Magdalita et al., 2014). The leaves of this tree species were found to exhibit cytotoxic (Dante et al., 2019), antifungal (Ragasa et al., 2009) and antioxidant (Ansari et al., 2021) activities. Meanwhile, the fruit extracts display antimicrobial activity (Tubillo et al., 2016) and can be used as a natural food preservative (Pormento, 2024). While the biochemical profile of katmon is relatively well-studied, its genomic characteristics remain largely unexplored. Genomic characterization is limited to the work of Fatallo and Panes (2022), who used rbcL, matK and ITS markers for barcoding and phylogenetic analysis. No further studies have been conducted for the genetic characterization of this species.
       
The chloroplast genome is one of the three genomes present in plants (Rozov et al., 2022). It holds evolutionary significance, as it is absent in other eukaryotes and functions in photosynthesis, a process whose biochemical mechanisms are conserved among plants (Theeuwen et al., 2022; Cao and Zhang, 2025). Thus, the characterization of a chloroplast genome provides valuable insights into phylogenetic relationships and plant taxonomy (Daniell et al., 2016). Moreover, a comprehensive analysis of the chloroplast genome can equip plant breeders and conservation biologists with pertinent knowledge for breeding, conservation and germplasm characterization efforts (Madayag et al., 2026). Molecular data derived from the chloroplast genome can support the development of new cultivars with greater yield, higher nutritional content and higher genetic diversity (Swarup et al., 2020). Therefore, this study sought to characterize the chloroplast genome of D. philippinensis to gain a deeper understanding of its phylogeny.
Sample collection
 
Mature leaves of D. philippinensis accessions GB68492 and GB70976 were collected from the germplasm collections of the National Plant Genetic Resources Laboratory, Institute of Plant Breeding, University of the Philippines, Los Baños. The collected leaves were placed inside airtight plastic bags and kept in an icebox during transportation. The leaves were then disinfected using 70% ethanol, put in a resealable plastic bag and stored at -80°C until further use. 
 
Genomic DNA isolation
 
Genomic DNA was extracted from collected leaf samples following the protocol of Inglis et al. (2018). The purity and concentration of the isolated DNA were determined using an Epoch Multi-Volume Spectrophotometer System (BioTek Instruments Inc., USA). The extract from accession GB68492 was sent to Macrogen (South Korea) for complete chloroplast genome sequencing via the Illumina NovaSeqX platform (Illumina Inc., San Diego, CA). 
 
Chloroplast genome assembly and annotation
 
Raw reads generated from Illumina sequencing were subjected to quality checking using FastQC v0.12.1 (Andrews, 2010). Adapters and low-quality reads were trimmed using Trimmomatic v0.39 (Bolger et al., 2014), which was implemented using the following operations: “ILLUMINACLIP TruSeq3 2:30:10:8; LEADING:25 and SLIDINGWINDOW:4:20.” Poor-quality reads were discarded, while high-quality reads were subjected to de novo assembly using GetOrganelle v1.7.7.1+ (Jin et al., 2020), with the consensus rbcL sequence serving as a seed sequence for assembly. The assembled chloroplast genome was annotated using CPGAVAS2 (Shi et al., 2019), with the complete chloroplast genome of D. indica deposited in NCBI GenBank (NC_042740.1) serving as reference. Manual validation was done using Geneious Prime v2025.1.2 and by cross-checking the annotations made by GeSeq (Tillich et al., 2017). 
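To illustrate what the LEADING:25 and SLIDINGWINDOW:4:20 operations do to a single read, the following minimal Python sketch applies the two rules to a list of Phred scores. It is a simplified illustration and not Trimmomatic's actual implementation; the function names (trim_leading, sliding_window_cut, trim_read) and the example read are hypothetical.

```python
from typing import List


def trim_leading(quals: List[int], min_qual: int = 25) -> int:
    """Return the index of the first base whose quality is >= min_qual."""
    start = 0
    while start < len(quals) and quals[start] < min_qual:
        start += 1
    return start


def sliding_window_cut(quals: List[int], window: int = 4,
                       min_mean: float = 20.0) -> int:
    """Return how many bases to keep from the 5' end.

    Simplified rule: scan windows from the 5' end and cut at the start of
    the first window whose mean quality is below min_mean. Reads shorter
    than the window are kept unchanged.
    """
    for i in range(len(quals) - window + 1):
        if sum(quals[i:i + window]) / window < min_mean:
            return i
    return len(quals)


def trim_read(seq: str, quals: List[int]) -> str:
    """Apply LEADING-style, then SLIDINGWINDOW-style trimming to one read."""
    start = trim_leading(quals)
    seq, quals = seq[start:], quals[start:]
    return seq[:sliding_window_cut(quals)]


# Hypothetical read: one poor base at the 5' end, three at the 3' end.
example_seq = "ACGTACGTAC"
example_quals = [10, 30, 30, 30, 30, 30, 30, 10, 10, 10]
print(trim_read(example_seq, example_quals))
```

In this sketch the low-quality first base is removed by the leading rule, and the read is then cut where the four-base window mean first drops below 20.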
 
Amplification of rbcL and matK
 
The genomic DNA samples were diluted to 20 ng/μL. Polymerase chain reaction (PCR) was performed to amplify rbcL and matK. Primers from Thooptianrat et al. (2017) and Yu et al. (2011) were used to amplify rbcL and matK, respectively. PCR thermocycling conditions were adapted from the study of Fatallo and Panes (2022). The amplicons were electrophoresed using 1.5% agarose at 110 V for 30 min to check amplification success. They were then purified using a NucleoSpin™ Gel and PCR Clean-Up Kit (Macherey-Nagel, Germany). Spectrophotometry was performed to assess the purity and concentration of the purified amplicons. Lastly, the amplicons for rbcL and matK were sent to Macrogen (South Korea) for bidirectional Sanger sequencing.
 
Phylogenetic analysis
 
The consensus rbcL and matK sequences of the two D. philippinensis accessions were subjected to phylogenetic analysis to confirm their relationship with other species in the family Dilleniaceae. The reference rbcL and matK sequences of each species were obtained from NCBI GenBank. Jojoba (Simmondsia chinensis) (NC_040935) and pokeweed (Phytolacca americana) (NC_067846) were used as outgroups. Multiple sequence alignment of the rbcL and matK sequences was performed using ClustalW (Thompson et al., 1994). The aligned rbcL and matK sequences were then concatenated using FaBox v.1.61 (Villesen, 2007). Finally, phylogenetic trees were constructed in MEGA12 (Kumar et al., 2024) using the maximum likelihood method with the respective best-fit substitution model and 1000 bootstrap replicates. For the aligned rbcL + matK sequences, the best-fit model was the Tamura 3-parameter model with gamma distribution (T92 + G).
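The concatenation step can be sketched in a few lines of Python. This is only an illustration of what joining two aligned loci per taxon involves, not the FaBox implementation; the function name concatenate_alignments, the taxon labels and the toy sequences are hypothetical, and padding a missing locus with gaps is an assumption of this sketch.

```python
from typing import Dict


def concatenate_alignments(*alignments: Dict[str, str]) -> Dict[str, str]:
    """Join aligned sequences per taxon, padding missing loci with gaps."""
    taxa = sorted(set().union(*(aln.keys() for aln in alignments)))
    result = {taxon: "" for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within an alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon lacking this locus contributes a run of gap characters.
            result[taxon] += aln.get(taxon, "-" * width)
    return result


# Toy alignments (not real data): taxon_B lacks matK, taxon_C lacks rbcL.
rbcl = {"taxon_A": "ATGTCA", "taxon_B": "ATGTCG"}
matk = {"taxon_A": "GGA-TT", "taxon_C": "GGACTT"}

for name, seq in concatenate_alignments(rbcl, matk).items():
    print(name, seq)
```

Keeping every taxon at the same total length is what allows the concatenated matrix to be analyzed as a single alignment.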
       
This research was conducted at the Biochemistry Laboratory of the Institute of Plant Breeding, University of the Philippines Los Baños, from November 2024 to April 2025. 
Chloroplast genome assembly and annotation
 
The complete chloroplast genome of Dillenia philippinensis is 161,591 bp in length with 36.3% GC content and exhibits the typical quadripartite structure composed of a large single copy (LSC, 89,411 bp), a small single copy (SSC, 19,208 bp) and two inverted repeat (IRa and IRb) regions (26,486 bp each) (Fig 1). A total of 130 genes were identified (Table 1), comprising 113 unique genes, 16 genes duplicated in the IRs and a second annotated copy of the trans-spliced rps12 gene. Duplicated genes include five protein-coding (rpl2, rpl23, ycf2, ndhB, rps7), seven tRNA (trnI-CAU, trnL-CAA, trnV-GAU, trnA-UGC, trnR-ACG, trnN-GUU) and four rRNA genes (rrn23S, rrn4.5S, rrn5S, rrn16S). Functionally, the 113 unique genes comprise 79 protein-coding genes, 30 tRNA genes and 4 rRNA genes, of which 44 are involved in photosynthesis, 59 in self-replication and 10 in other functions.
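The reported figures can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers stated above; the variable names are illustrative, and treating the 130 annotated genes as 113 unique genes plus 16 IR duplicates plus one extra rps12 copy is an assumption of this sketch.

```python
# Region lengths (bp) and genome size as reported in the text.
LSC_BP, SSC_BP, IR_BP = 89_411, 19_208, 26_486
REPORTED_GENOME_BP = 161_591

# Quadripartite structure: LSC + SSC + two IR copies.
region_total = LSC_BP + SSC_BP + 2 * IR_BP

# Unique genes counted by type and by functional group.
unique_by_type = 79 + 30 + 4        # protein-coding + tRNA + rRNA
unique_by_function = 44 + 59 + 10   # photosynthesis + self-replication + other

# Assumption: annotated total = unique genes + IR duplicates + extra rps12 copy.
annotated_total = unique_by_type + 16 + 1

print(region_total == REPORTED_GENOME_BP)                    # True
print(unique_by_type, unique_by_function, annotated_total)   # 113 113 130
```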

Fig 1: Gene map of the chloroplast genome of D. philippinensis (GB68492). LSC, SSC and IR regions are indicated. Genes outside the circle are transcribed clockwise; those inside are transcribed counterclockwise.



Table 1: List of genes in the chloroplast genome of D. philippinensis. 


       
Initial annotation with CPGAVAS2 predicted 127 genes, while GeSeq identified 135 genes. The discrepancies arose from differences in annotation algorithms, with GeSeq uniquely detecting psbL, rps19, ndhK, ycf1 and accD. BLAST verification confirmed these as functional genes. Conversely, ycf15 was annotated as protein-coding by CPGAVAS2 but is considered non-coding in previous studies (Schmitz-Linneweber et al., 2001; Shi et al., 2013). After cross-validation, the finalized annotation included 130 genes. This shows that single-tool annotation may miss genes and highlights the need for multi-tool validation to ensure accurate plastome characterization. The assembled chloroplast genome sequence was deposited in NCBI GenBank under accession number PX171333.
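The cross-check between the two annotators amounts to comparing two sets of gene names. The sketch below shows the idea with small hypothetical subsets that include the genes named above; it does not reproduce the full 127- and 135-gene lists, and the function name compare_annotations is illustrative.

```python
from typing import List, Set, Tuple


def compare_annotations(a: Set[str], b: Set[str]) -> Tuple[List[str], List[str], List[str]]:
    """Return genes found only in a, only in b, and in both (each sorted)."""
    return sorted(a - b), sorted(b - a), sorted(a & b)


# Illustrative subsets only, not the complete annotation outputs.
cpgavas2_genes = {"rbcL", "matK", "psbA", "ycf15"}
geseq_genes = {"rbcL", "matK", "psbA", "psbL", "rps19", "ndhK", "ycf1", "accD"}

only_cpgavas2, only_geseq, shared = compare_annotations(cpgavas2_genes, geseq_genes)
print("CPGAVAS2 only:", only_cpgavas2)
print("GeSeq only:", only_geseq)
print("Shared:", shared)
```

Genes that appear in only one set are the candidates that need manual verification, for example by BLAST as described above.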
 
SSR and codon usage analysis
 
The chloroplast genome of D. philippinensis contained 54 simple sequence repeats (SSRs), 94.4% of which were A/T mononucleotide motifs, confirming a strong AT bias consistent with previous reports (de Souza et al., 2019). Dinucleotide (AT/TA) and trinucleotide (ATA) repeats were rare, each occurring once. A total of 53,863 codons were identified, with leucine the most abundant (9.7%, predominantly UUA) and cysteine the least (2.3%, mainly UGU), reflecting typical codon usage bias in chloroplast genomes (Nakamura and Sugiura, 2007).
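As an illustration of how perfect SSRs of this kind can be located, the following Python sketch scans a sequence with regular expressions. The minimum repeat counts (10 for mononucleotide, 5 for dinucleotide and 4 for trinucleotide motifs) are assumptions, since the thresholds and the SSR tool used are not stated here; the function name find_ssrs and the toy sequence are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Assumed minimum repeat counts per motif length (not taken from the study).
MIN_REPEATS: Dict[int, int] = {1: 10, 2: 5, 3: 4}


def find_ssrs(seq: str,
              min_repeats: Dict[int, int] = MIN_REPEATS) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # For motifs longer than 1 bp, a lookahead rejects homopolymer
        # motifs such as "AA", which belong to the mononucleotide class.
        guard = "" if k == 1 else r"(?!([ACGT])\1{%d})" % (k - 1)
        motif_group = 1 if k == 1 else 2
        pattern = re.compile(
            r"%s([ACGT]{%d})\%d{%d,}" % (guard, k, motif_group, n - 1)
        )
        for m in pattern.finditer(seq):
            hits.append((m.start(), m.group(motif_group), len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence with one mono-, one di- and one trinucleotide repeat.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 5 + "C" + "ATA" * 4 + "G"
for start, motif, count in find_ssrs(toy):
    print(start, motif, count)
```

Changing the thresholds in MIN_REPEATS changes which repeats qualify, so SSR counts are only comparable between studies that use the same settings.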
 
Comparative genome and IR junctions
 
Comparison with D. indica and D. turbinata revealed minor differences in LSC (88,305-90,907 bp) and SSC (18,047-19,349 bp) lengths, while IR regions were nearly identical (Fig 2). No G/C SSR motifs were detected, reinforcing the AT-rich nature of the plastome (Guo et al., 2021). IR boundaries were generally conserved, with a slight IRb expansion in D. indica due to the rps19 gene.

Fig 2: Comparison of IR junctions among D. philippinensis, D. indica and D. turbinata. JLB: LSC-IRb; JSB: IRb-SSC; JSA: SSC-IRa; JLA: IRa-LSC.


 
Phylogenetic analysis
 
The utility of the assembled chloroplast genome of D. philippinensis is limited by the available sequences deposited in NCBI. Currently, there are only three species belonging to the family Dilleniaceae with publicly available complete chloroplast genomes. As such, the rbcL and matK genes were used to conduct phylogenetic analysis, allowing for the representation of more taxa. 
       
Phylogenetic trees were generated from rbcL, matK and the concatenated rbcL and matK sequences. The concatenated sequences provided higher resolution than the single genes (not shown in this paper) and resolved the discrepancy observed between the trees constructed from individual genes. Tetracera, which was previously depicted as closely related to Dillenia and Hibbertia, was recovered as sister to the rest of the family, branching off earliest (Fig 3). This result is congruent with that of Horn (2009), who used four plastid loci (rbcL, infA, rps4 and the rpl16 intron). Gontcharov et al. (2004) likewise noted that combining genes in phylogenetic analysis offers greater resolution than single genes and resolves conflicts among single-gene phylogenies.

Fig 3: Maximum likelihood phylogenetic tree based on concatenated rbcL + matK sequences, showing D. philippinensis closely related to D. suffruticosa.


       
The chloroplast genome of D. philippinensis provides a foundational resource for species identification, phylogenomics and conservation of Philippine endemic and indigenous fruits. SSR markers and codon usage patterns can support future population genetics and breeding efforts to enhance the utilization of underexploited species like katmon (Xu et al., 2025).
This study reports a complete chloroplast genome of Dillenia philippinensis (katmon), a Philippine endemic fruit species. The genome is 161,591 bp with 36.3% GC content and exhibits a typical quadripartite structure comprising 113 unique genes. SSR analysis revealed an A/T-rich repeat pattern, while codon usage analysis indicated a bias toward leucine codons. Comparative analysis with other Dillenia species demonstrated conserved IR boundaries and minor length variations in LSC and SSC regions. Phylogenetic reconstruction based on concatenated sequences of rbcL and matK consistently placed D. philippinensis as closely related to D. suffruticosa. These findings provide a foundational genomic resource for germplasm characterization, phylogenomic studies and the conservation and potential breeding of underutilized Philippine fruit crops.
The present study was supported by the people of the Republic of the Philippines through the Department of Science and Technology - Philippine Council for Health Research and Development (DOST-PCHRD).
 
Disclaimers
 
The views and conclusions expressed in this article are solely those of the authors and do not necessarily represent the views of their affiliated institutions. The authors are responsible for the accuracy and completeness of the information provided, but do not accept any liability for any direct or indirect losses resulting from the use of this content.
The authors declare that there are no conflicts of interest regarding the publication of this article. No funding or sponsorship influenced the design of the study, data collection, analysis, decision to publish, or preparation of the manuscript.

  1. Andrews, S. (2010). FastQC: A Quality Control Tool for High-Throughput Sequence Data. Babraham Bioinformatics.

  2. Ansari, S.S., Diño, P.H., Castillo, A.L. and Santiago, L.A. (2021). Antioxidant activity, xanthine oxidase inhibition and acute oral toxicity of Dillenia philippinensis Rolfe (Dilleniaceae) leaf extract. Journal of Pharmacy and Pharmacognosy Research. 9: 846-858.

  3. Aquino, M.E.A., Wagan, A.J.M. and Omaña, M. (2015). Recapturing the food value of katmon. Bureau of Agricultural Research. 16: 1-16.

  4. Bolger, A.M., Lohse, M. and Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics. 30: 2114-2120.

  5. Cao, K. and Zhang, M. (2025). Features and phylogenetic analysis of the chloroplast genome of the medicinal plant Euchresta tubulosa. Legume Research. 48(12): 1947-1957. doi: 10.18805/LRF-881.

  6. Daniell, H., Lin, C.S., Yu, M. and Chang, W.J. (2016). Chloroplast genomes: Diversity, evolution and applications in genetic engineering. Genome Biology. 17: 134.

  7. Dante, R.A.S., Ferrer, R.J.E. and Jacinto, S.D. (2019). Leaf extracts from Dillenia philippinensis Rolfe exhibit cytotoxic activity to both drug-sensitive and multidrug-resistant cancer cells. Asian Pacific Journal of Cancer Prevention. 20: 3285-3290.

  8. De Souza, U.J.B., Nunes, R., Targueta, C.P., Diniz-Filho, J.A.F. and de Campos Telles, M.P. (2019). The complete chloroplast genome of Stryphnodendron adstringens (Leguminosae - Caesalpinioideae): Comparative analysis with related Mimosoid species. Scientific Reports. 9: 14206.

  9. Durst, P. and Bayasgalanbat, N. (2014). Proceedings of a Symposium on the Promotion of Underutilized Indigenous Food Resources for Food Security and Nutrition in Asia and the Pacific (May 31-June 1, 2014, Khon Kaen, Thailand). Food and Agriculture Organization, Regional Office for Asia and the Pacific, Rikkyo University.

  10. Fatallo, E.K.F. and Panes, V.A. (2022). DNA barcoding of Dillenia philippinensis Rolfe and Dillenia luzoniensis (S. Vidal) Merr. (Dilleniaceae) from Oriental Mindoro and Quezon City, Philippines. Philippine Journal of Systematic Biology. 16: 72-84.

  11. Gontcharov, A.A., Marin, B. and Melkonian, M. (2004). Are combined analyses better than single-gene phylogenies? A case study using SSU rDNA and rbcL sequence comparisons in the Zygnematophyceae (Streptophyta). Molecular Biology and Evolution. 21: 612-624.

  12. Guo, Y.Y., Yang, J.X., Li, H.K. and Zhao, H.S. (2021). Chloroplast genomes of two species of Cypripedium: Expanded genome size and proliferation of AT-biased repeat sequences. Frontiers in Plant Science. 12: 609729.

  13. Horn, J.W. (2009). Phylogenetics of Dilleniaceae using sequence data from four plastid loci (rbcL, infA, rps4, rpl16 intron). International Journal of Plant Sciences. 170: 794-813.

  14. Inglis, P.W., Pappas, M.C.R., Resende, L.V. and Grattapaglia, D. (2018). Fast and inexpensive protocols for consistent extraction of high-quality DNA and RNA from challenging plant and fungal samples for high-throughput SNP genotyping and sequencing applications. PLoS ONE. 13: e0206085.

  15. Jin, J.J., Yu, W.B., Yang, J.B., Song, Y., DePamphilis, C.W., Yi, T.S. and Li, D.Z. (2020). GetOrganelle: A fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biology. 21: 241.

  16. Kumar, S., Stecher, G., Suleski, M., Sanderford, M., Sharma, S. and Tamura, K. (2024). MEGA12: Molecular evolutionary genetic analysis version 12 for adaptive and green computing. Molecular Biology and Evolution. 41: msae263.

  17. Madayag, R.E. Jr., Quiñones-Arribado, K.J.O., Timog, E.B.S., Bartolome, M.C.B., Cruz, J.R.A.V., Borromeo, T.H., Endonela, L.E. and Coronado, N.B. (2026). The complete chloroplast genome of “Lasona” (Allium cepa var. aggregatum G. Don), an important indigenous vegetable in the Northern Philippines. Agricultural Science Digest. 46(1): 08-14. doi: 10.18805/ag.DF-531.

  18. Magdalita, P.M., Abrigo, M.I.K.M. and Coronel, R.E. (2014). Phenotypic evaluation of some promising rare fruit crops in the Philippines. Philippine Science Letters. 7: 376-386.

  19. Nakamura, M. and Sugiura, M. (2007). Translation efficiencies of synonymous codons are not always correlated with codon usage in tobacco chloroplasts. The Plant Journal. 49: 128-134.

  20. Nhamo, L., Paterson, G., Van Der Walt, M., Moeletsi, M., Modi, A., Kunz, R., Chimonyo, V., Masupha, T., Mpandeli, S., Liphadzi, S., Molwantwa, J. and Mabhaudhi, T. (2022). Optimal production areas of underutilized indigenous crops and their role under climate change: Focus on Bambara groundnut. Frontiers in Sustainable Food Systems. 6: 990213.

  21. Oraye, C., De Chavez, H., Aguilar, C., Makiling, F., Ladia, V.J., Enicola, E., Guevarra, L., Gueco, L., Maghirang, R., Anunciado, M., Oro, E., Gonsalves, J., Hunter, D., Borelli, T. and Mendonce, S. (2023). Initiatives on indigenous fruits in the Philippines: A scoping study. Bioversity International. 90: 1-45.

  22. Pormento, C.C. (2024). The potential of katmon fruit (Dillenia philippinensis) extract as a natural food preservative. Advances in Research. 25: 126-132.

  23. Ragasa, C., Alimboyoguen, A. and Shen, C. (2009). Antimicrobial triterpenes from Dillenia philippinensis. Philippine Scientist. 46: 78-87.

  24. Rozov, S.M., Zagorskaya, A.A., Konstantinov, Y.M. and Deineko, E.V. (2022). Three parts of the plant genome: On the way to success in the production of recombinant proteins. Plants. 12: 38.

  25. Sarhan, S., Hamed, F. and Al-Youssef, W. (2016). The rbcL gene sequence variations among and within Prunus species. Journal of Agricultural Science and Technology. 18: 1105-1115.

  26. Schmitz-Linneweber, C., Maier, R.M., Alcaraz, J.P., Cottet, A., Herrmann, R.G. and Mache, R. (2001). The plastid chromosome of spinach (Spinacia oleracea): Complete nucleotide sequence and gene organization. Plant Molecular Biology. 45: 307-315.

  27. Shi, C., Liu, Y., Huang, H., Xia, E.H., Zhang, H.B. and Gao, L.Z. (2013). Contradiction between plastid gene transcription and function due to complex posttranscriptional splicing: An exemplary study of ycf15 function and evolution in angiosperms. PLoS ONE. 8: e59620.

  28. Shi, L., Chen, H., Jiang, M., Wang, L., Wu, X., Huang, L. and Liu, C. (2019). CPGAVAS2: An integrated plastome sequence annotator and analyzer. Nucleic Acids Research. 47: W65-W73.

  29. Swarup, S., Cargill, E.J., Crosby, K., Flagel, L., Kniskern, J. and Glenn, K.C. (2020). Genetic diversity is indispensable for plant breeding to improve crops. Crop Science. 61: 839-852.

  30. Theeuwen, T.P.J.M., Logie, L.L., Harbinson, J. and Aarts, M.G.M. (2022). Genetics as a key to improving crop photosynthesis. Journal of Experimental Botany. 73: 3122-3137.

  31. Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994). CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research. 22: 4673-4680.

  32. Thooptianrat, T., Chaveerach, A., Runglawan, S. and Tanee, T. (2017). DNA profiles to identify Dillenia species (Dilleniaceae) in Thailand. Phytotaxa. 296: 239-252.

  33. Tillich, M., Lehwark, P., Pellizzer, T., Ulbricht-Jones, E.S., Fischer, A., Bock, R. and Greiner, S. (2017). GeSeq - Versatile and accurate annotation of organelle genomes. Nucleic Acids Research. 45: W6-W11.

  34. Tubillo, M., Tugade, M., Ugalde, R., Uy, C.A., Uy, M.C., Valenzuela, D.P., Vellesfin, J.A., Vallester, H., Vertudez, A.N. and Vicente, A.J. (2016). Phytochemical profile and antimicrobial activity of Dillenia philippinensis (katmon) fruit extract on S. aureus and E. coli. GreenPrints.

  35. Villarino, R.T. and Villarino, M.L. (2023). Indigenous knowledge of medicinal fruits in the Philippines: A systematic review. Research Journal of Pharmacognosy. 10: 77-89.

  36. Villesen, P. (2007). FaBox: An online toolbox for FASTA sequences. Molecular Ecology Notes. 7: 965-968.

  37. Xu, W., Ding, H., Mu, Y., Gao, C., Gao, Y., Wang, Q. and Shi, F. (2025). Comparative analysis of mitochondrial and chloroplast genomes in alfalfa (Medicago sativa L.). Legume Research. 49(3): 365-376. doi: 10.18805/LRF-882.

  38. Yu, J., Xue, J.H. and Zhou, S.L. (2011). New universal matK primers for DNA barcoding angiosperms. Journal of Systematics and Evolution. 49: 176-181.

Chloroplast Genome and Phylogenetic Analysis of Katmon (Dillenia philippinensis Rolfe): A Philippine Endemic Fruit

J
Jairus Jake M. Lucero1
J
Judy Ann M. Muñoz2
L
Lyka Y. Aglibot3
R
Roneil Christian S. Alonday2,4,*
1Genetic and Molecular Biology Division, Institute of Biological Sciences, College of Arts and Sciences, University of the Philippines Los Baños, Laguna, 4031, Philippines.  
2Philippine Genome Center-Program for Agriculture, Livestock, Fisheries and Forestry, Office of the Vice Chancellor for Research and Extension, University of the Philippines Los Baños, Laguna, 4031, Philippines.
3National Plant Genetic Resources Laboratory, Institute of Plant Breeding, College of Agriculture and Food Science, University of the Philippines, Los Baños, Laguna, 4031, Philippines.
4Biochemistry Laboratory, Crop Biotechnology Division, Institute of Plant Breeding, College of Agriculture and Food Science, University of the Philippines Los Baños, Laguna, 4031, Philippines.

Background: Katmon (Dillenia philippinensis Rolfe) is a Philippine-endemic fruit species with a relatively well-studied biochemical profile but poor genomic characterization. Studies involving the chloroplast genome can provide valuable insights into its evolution and support conservation efforts.

Methods: The complete chloroplast genome of D. philippinensis was sequenced using Illumina NovaSeqX. Reads were quality-checked, assembled with GetOrganelle and annotated using CPGAVAS2 and GeSeq. Simple sequence repeats, codon usage and inverted repeat boundaries were analyzed. Phylogenetic relationships were inferred using concatenated rbcL and matK sequences via maximum likelihood analysis.

Result: The chloroplast genome was 161,591 bp with a GC content of 36.3%. It exhibited the typical quadripartite structure, consisting of a large single-copy (LSC) region  (89,411 bp), a small single-copy (SSC) region (19,208 bp) and a pair of inverted repeats (IR) (26,486 bp each).  A total of 113 unique genes were identified, comprising 79 protein-coding, 30 tRNA and four rRNA genes. Fifty-four SSRs, primarily A/T mononucleotide repeats and 53,863 codons were observed. Phylogenetic analysis placed D. philippinensis as the closest relative to D. suffruticosa and the most distantly related to D. ovata. The complete chloroplast genome of D. philippinensis provides a valuable resource for phylogenetic studies, germplasm characterization and future breeding and conservation programs. 

Endemic plants are species that are solely found within a specific geographical region. They are well-adapted to their local habitats, thereby posing less damage to the environment and making them valuable resources for reinforcing food security. They also exhibit greater climate resiliency and pest tolerance than commercial crops (Nhamo et al., 2022). Moreover, indigenous fruits display substantial, if not higher, nutritional value compared to widely cultivated species (Durst and Bayasgalanbat, 2014). Despite the health benefits and potential for diet diversification, indigenous plants remain poorly studied (Oraye et al., 2023; Villarino and Villarino, 2023), leading to their underutilization and preventing their full potential from being harnessed. 
       
Dillenia philippinensis
is a medium-sized, evergreen fruit tree belonging to the family Dilleniaceae (Aquino et al., 2015, as cited by Fatallo and Panes, 2022). It is distributed in many provinces across the Philippines, such as Laguna, Quezon, Oriental Mindoro and Cebu (Magdalita et al., 2014). The leaves of this tree species were found to exhibit cytotoxic (Dante et al., 2019), antifungal (Ragasa et al., 2009) and antioxidant (Ansari et al., 2021) activities. Meanwhile, the fruit extracts display antimicrobial activity (Tubillo et al., 2016) and can be used as a natural food preservative (Pormento, 2024). While the biochemical profile of katmon is relatively well-studied, its genomic characteristics remain largely unexplored. Genomic characterization is limited to the work of Fatallo and Panes (2022), who used rbcL, matK and ITS markers for barcoding and phylogenetic analysis. No further studies have been conducted for the genetic characterization of this species. 
       
The chloroplast genome is one of the three genomes present in plants (Rozov et al., 2022). It holds evolutionary significance, as it is absent in other eukaryotes and functions for photosynthesis, a process whose biochemical mechanisms are conserved among plants (Theeuwen et al., 2022; Cao et al., 2025). Thus, the characterization of a chloroplast genome provides valuable insights into phylogenetic relationships and plant taxonomy (Daniell et al., 2016). Moreover, a comprehensive analysis of the chloroplast genome can equip plant breeders and conservation biologists with pertinent knowledge on breeding, conservation and germplasm characterization efforts (Madayag et al., 2026). Molecular data derived from the chloroplast genome can support the development of new cultivars with greater yield, higher nutritional content and higher genetic diversity (Swarup et al., 2020). Therefore, this study sought to characterize the chloroplast genome of D. philippinensis to gain a deeper understanding of its phylogeny. 
Sample collection
 
Mature leaves of D. philippinensis accessions GB68492 and GB70976 were collected from the germplasm collections of the National Plant Genetic Resources Laboratory, Institute of Plant Breeding, University of the Philippines, Los Baños. The collected leaves were placed inside airtight plastic bags and kept in an icebox during transportation. The leaves were then disinfected using 70% ethanol, put in a resealable plastic bag and stored at -80°C until further use. 
 
Genomic DNA isolation
 
Genomic DNA was extracted from collected leaf samples following the protocol of Inglis et al. (2018). The purity and concentration of the isolated DNA were determined using an Epoch Multi-Volume Spectrophotometer System (BioTek Instruments Inc., USA). The extract from accession GB68492 was sent to Macrogen (South Korea) for complete chloroplast genome sequencing via the Illumina NovaSeqX platform (Illumina Inc., San Diego, CA). 
 
Chloroplast genome assembly and annotation
 
Raw reads generated from Illumina sequencing were subjected to quality checking using FastQC v0.12.1 (Andrews, 2010). Adapters and low-quality reads were trimmed using Trimmomatic v0.39 (Bolger et al., 2014), which was implemented using the following operations: “ILLUMINACLIP TruSeq3 2:30:10:8; LEADING:25 and SLIDINGWINDOW:4:20.” Poor-quality reads were discarded, while high-quality reads were subjected to de novo assembly using GetOrganelle v1.7.7.1+ (Jin et al., 2020), with the consensus rbcL sequence serving as a seed sequence for assembly. The assembled chloroplast genome was annotated using CPGAVAS2 (Shi et al., 2019), with the complete chloroplast genome of D. indica deposited in NCBI GenBank (NC_042740.1) serving as reference. Manual validation was done using Geneious Prime v2025.1.2 and by cross-checking the annotations made by GeSeq (Tillich et al., 2017). 
 
Amplification of rbcL and matK
 
The genomic DNA samples were diluted to 20 ng/μL. Polymerase chain reaction (PCR) was performed to amplify rbcL and matK. Primers from Thooptianrat et al. (2017) and Yu et al. (2011) were used to amplify rbcL and matK, respectively. PCR thermocycling conditions were adapted from the study of Fatallo and Panes (2022). The amplicons were electrophoresed using 1.5% agarose at 110 V for 30 min to check amplification success. They were then purified using a NucleoSpinTM Gel and PCR Clean-Up Kit (Macherey-Nagel, Germany). Spectrophotometry was performed to assess the purity and concentration of the purified amplicons. Lastly, the amplicons for rbcL and matK were sent to Macrogen (South Korea) for bidirectional Sanger sequencing.
 
Phylogenetic analysis
 
The consensus rbcL and matK sequences of the two D. philippinensis accessions were subjected to phylogenetic analysis to confirm their relationship with other species in the family Dilleniaceae. Reference rbcL and matK sequences for each species were obtained from NCBI GenBank. Jojoba (Simmondsia chinensis; NC_040935) and pokeweed (Phytolacca americana; NC_067846) were used as outgroups. Multiple sequence alignment of the rbcL and matK sequences was performed using ClustalW (Thompson et al., 1994), and the aligned sequences were concatenated using FaBox v1.61 (Villesen, 2007). Phylogenetic trees were constructed in MEGA12 (Kumar et al., 2024) using the maximum likelihood method with the respective best-fit substitution model and 1,000 bootstrap replicates; for the aligned rbcL + matK sequences, the best-fit model was the Tamura 3-parameter model with gamma distribution (T92 + G).
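The concatenation step can be sketched as follows. This is a minimal illustration of joining per-gene alignments by taxon name, not a description of how FaBox works internally; the function name, the toy sequences and the choice to pad a missing gene with gaps are assumptions made for this example.

```python
def concatenate_alignments(*alignments):
    """Join aligned sequences gene by gene, matching rows by taxon name.

    Each alignment is a dict {taxon: aligned sequence}. A taxon missing from
    one gene is padded with gaps (an assumption of this sketch).
    """
    taxa = sorted(set().union(*(aln.keys() for aln in alignments)))
    combined = {taxon: "" for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within an alignment must share one length")
        width = lengths.pop()
        for taxon in taxa:
            combined[taxon] += aln.get(taxon, "-" * width)
    return combined


# Toy alignments; the sequences are invented for illustration.
rbcl = {"D_philippinensis": "ATG-", "D_indica": "ATGC"}
matk = {"D_philippinensis": "GG", "Tetracera_sp": "GT"}
print(concatenate_alignments(rbcl, matk))
```

Checking that every sequence in a gene alignment has the same length before joining guards against concatenating unaligned input.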
       
This research was conducted at the Biochemistry Laboratory of the Institute of Plant Breeding, University of the Philippines Los Baños, from November 2024 to April 2025. 
Chloroplast genome assembly and annotation
 
The complete chloroplast genome of Dillenia philippinensis is 161,591 bp in length with 36.3% GC content and exhibits the typical quadripartite structure composed of a large single copy (LSC, 89,411 bp), a small single copy (SSC, 19,208 bp) and two inverted repeat (IRa and IRb) regions (26,486 bp each) (Fig 1). A total of 130 genes were identified (Table 1), comprising 113 unique genes and 16 duplicated in the IRs. Duplicated genes include five protein-coding (rpl2, rpl23, ycf2, ndhB, rps7), seven tRNA (trnI-CAU, trnL-CAA, trnV-GAU, trnA-UGC, trnR-ACG, trnN-GUU) and four rRNA genes (rrn23S, rrn4.5S, rrn5S, rrn16S). The trans-spliced rps12 gene was also annotated twice. Functionally, the genome encodes 79 protein-coding genes, 30 tRNA genes and 4 rRNA genes, of which 44 are involved in photosynthesis, 59 in self-replication and 10 in other functions.
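The reported figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above; treating the second copy of the trans-spliced rps12 as the gene that brings the total to 130 is one reading that reconciles the counts, not a statement from the annotation itself.

```python
# Region lengths in bp, as reported for D. philippinensis.
lsc, ssc, ir = 89_411, 19_208, 26_486
genome_length = lsc + ssc + 2 * ir
print(genome_length)  # 161591

# Unique genes by type and by functional category.
unique_by_type = 79 + 30 + 4        # protein-coding + tRNA + rRNA
unique_by_function = 44 + 59 + 10   # photosynthesis + self-replication + other
print(unique_by_type, unique_by_function)  # 113 113

# Genes duplicated in the IRs, plus the second annotated copy of rps12
# (assumed here to account for the remaining gene in the total).
duplicated_in_ir = 5 + 7 + 4
total_annotated = unique_by_type + duplicated_in_ir + 1
print(total_annotated)  # 130
```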

Fig 1: Gene map of the chloroplast genome of D. philippinensis (GB68492). LSC, SSC and IR regions are indicated. Genes outside the circle are transcribed clockwise; those inside are transcribed counterclockwise.



Table 1: List of genes in the chloroplast genome of D. philippinensis. 


       
Initial annotation with CPGAVAS2 predicted 127 genes, while GeSeq identified 135 genes. The discrepancies arose from differences in annotation algorithms, with GeSeq uniquely detecting psbL, rps19, ndhK, ycf1 and accD. BLAST verification confirmed these as functional genes. Conversely, ycf15 was annotated as protein-coding by CPGAVAS2 but is considered non-coding in previous studies (Schmitz-Linneweber et al., 2001; Shi et al., 2013). After cross-validation, the finalized annotation included 130 genes. This shows that single-tool annotation may miss genes and that multi-tool validation is needed for accurate plastome characterization. The assembled chloroplast genome sequence was submitted to NCBI GenBank under accession number PX171333.
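The cross-validation of the two annotations amounts to a set comparison of gene names. The sketch below is illustrative only: the function name is hypothetical, and the two input lists are small invented subsets built around the genes named above, not the full 127-gene and 135-gene outputs.

```python
def compare_annotations(tool_a_genes, tool_b_genes):
    """Split gene names into those shared by both tools and those unique to each."""
    a, b = set(tool_a_genes), set(tool_b_genes)
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


# Hypothetical subsets of each tool's output, for illustration.
cpgavas2 = ["rbcL", "matK", "psbA", "ycf15"]
geseq = ["rbcL", "matK", "psbA", "psbL", "rps19", "ndhK", "ycf1", "accD"]

report = compare_annotations(cpgavas2, geseq)
print(report["only_a"])  # candidates to review, e.g. ycf15
print(report["only_b"])  # candidates to verify, e.g. by BLAST
```

Genes appearing in only one list are the ones that need manual review before the final annotation is fixed.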
 
SSR and codon usage analysis
 
The chloroplast genome of D. philippinensis contained 54 simple sequence repeats (SSRs), 94.4% of which (51 SSRs) were A/T mononucleotide motifs, indicating a strong AT bias consistent with previous reports (de Souza et al., 2019). The remaining three SSRs were dinucleotide (AT and TA) and trinucleotide (ATA) repeats, each occurring once. A total of 53,863 codons were identified, with leucine the most abundant (9.7%, predominantly UUA) and cysteine the least (2.3%, mainly UGU), reflecting typical codon usage bias in chloroplast genomes (Nakamura and Sugiura, 2007).
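A regular-expression search gives a sense of how mono-, di- and trinucleotide SSRs can be located. This is a minimal sketch, not the tool used in the study: the function name is hypothetical, and the minimum repeat counts (10, 5 and 4) are assumed thresholds that are not reported above.

```python
import re

# Minimum number of repeats per motif length (assumed values for this sketch).
MIN_REPEATS = {1: 10, 2: 5, 3: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs such as "AA" that are really mononucleotide runs.
            if unit_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence containing one SSR of each class.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG" + "ATA" * 4 + "G"
for start, motif, count in find_ssrs(toy):
    print(start, motif, count)
```

The share of A/T mononucleotide motifs is then the number of hits with motif "A" or "T" divided by the total; with the counts reported above, 51 of 54 gives 94.4%.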
 
Comparative genome and IR junctions
 
Comparison with D. indica and D. turbinata revealed minor differences in LSC (88,305-90,907 bp) and SSC (18,047-19,349 bp) lengths, while IR regions were nearly identical (Fig 2). No G/C SSR motifs were detected, reinforcing the AT-rich nature of the plastome (Guo et al., 2021). IR boundaries were generally conserved, with a slight IRb expansion in D. indica due to the rps19 gene.

Fig 2: Comparison of IR junctions among D. philippinensis, D. indica and D. turbinata. JLB: LSC-IRb; JSB: IRb-SSC; JSA: SSC-IRa; JLA: IRa-LSC.
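For orientation, the four junction coordinates can be derived from the region lengths alone. The sketch below assumes the conventional LSC, IRb, SSC, IRa order with position 1 at the start of the LSC; the function name is hypothetical, and the positions are computed from the reported lengths rather than taken from the annotation.

```python
def junction_positions(lsc, ir, ssc):
    """Return the end coordinate of each region for an LSC-IRb-SSC-IRa layout."""
    jlb = lsc          # LSC-IRb boundary
    jsb = jlb + ir     # IRb-SSC boundary
    jsa = jsb + ssc    # SSC-IRa boundary
    jla = jsa + ir     # IRa-LSC boundary (end of the circular genome)
    return {"JLB": jlb, "JSB": jsb, "JSA": jsa, "JLA": jla}


# Region lengths reported for D. philippinensis.
print(junction_positions(lsc=89_411, ir=26_486, ssc=19_208))
```

Under these assumptions the last junction coincides with the total genome length of 161,591 bp.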


 
Phylogenetic analysis
 
The utility of the assembled chloroplast genome of D. philippinensis is limited by the available sequences deposited in NCBI. Currently, there are only three species belonging to the family Dilleniaceae with publicly available complete chloroplast genomes. As such, the rbcL and matK genes were used to conduct phylogenetic analysis, allowing for the representation of more taxa. 
       
Trees were generated from the rbcL, matK and concatenated rbcL + matK sequences. The concatenated sequences provided higher resolution than the single genes (single-gene trees not shown) and resolved the discrepancy observed between the trees constructed from individual genes. Tetracera, which was previously depicted as closely related to Dillenia and Hibbertia, was recovered as sister to the rest of the family, branching off earliest (Fig 3). This result is congruent with that of Horn (2009), who used four plastid loci (rbcL, infA, rps4 and the rpl16 intron). Gontcharov et al. (2004) noted that combining genes in phylogenetic analysis offers greater resolution than single genes and resolves conflicts observed in single-gene analyses.

Fig 3: Maximum likelihood phylogenetic tree based on concatenated rbcL + matK sequences, showing D. philippinensis closely related to D. suffruticosa.


       
The chloroplast genome of D. philippinensis provides a foundational resource for species identification, phylogenomics and conservation of Philippine endemic and indigenous fruits. SSR markers and codon usage patterns can support future population genetics and breeding efforts to enhance the utilization of underexploited species like katmon (Xu et al., 2025).
This study reports a complete chloroplast genome of Dillenia philippinensis (katmon), a Philippine endemic fruit species. The genome is 161,591 bp with 36.3% GC content and exhibits a typical quadripartite structure comprising 113 unique genes. SSR analysis revealed an A/T-rich repeat pattern, while codon usage analysis indicated a bias toward leucine codons. Comparative analysis with other Dillenia species demonstrated conserved IR boundaries and minor length variations in LSC and SSC regions. Phylogenetic reconstruction based on concatenated sequences of rbcL and matK consistently placed D. philippinensis as closely related to D. suffruticosa. These findings provide a foundational genomic resource for germplasm characterization, phylogenomic studies and the conservation and potential breeding of underutilized Philippine fruit crops.
The present study was supported by the people of the Republic of the Philippines through the Department of Science and Technology - Philippine Council for Health Research and Development (DOST-PCHRD).
 
Disclaimers
 
The views and conclusions expressed in this article are solely those of the authors and do not necessarily represent the views of their affiliated institutions. The authors are responsible for the accuracy and completeness of the information provided, but do not accept any liability for any direct or indirect losses resulting from the use of this content.
The authors declare that there are no conflicts of interest regarding the publication of this article. No funding or sponsorship influenced the design of the study, data collection, analysis, decision to publish, or preparation of the manuscript.

  1. Andrews, S. (2010). FastQC: A Quality Control Tool for High-Throughput Sequence Data. Babraham Bioinformatics.

  2. Ansari, S.S., Diño, P.H., Castillo, A.L. and Santiago, L.A. (2021). Antioxidant activity, xanthine oxidase inhibition and acute oral toxicity of Dillenia philippinensis Rolfe (Dilleniaceae) leaf extract. Journal of Pharmacy and Pharmacognosy Research. 9: 846-858.

  3. Aquino, M.E.A., Wagan, A.J.M. and Omaña, M. (2015). Recapturing the food value of katmon. Bureau of Agricultural Research. 16: 1-16.

  4. Bolger, A.M., Lohse, M. and Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics. 30: 2114-2120.

  5. Cao, K. and Zhang, M. (2025). Features and phylogenetic analysis of the chloroplast genome of the medicinal plant Euchresta tubulosa. Legume Research. 48(12): 1947-1957. doi: 10.18805/LRF-881.

  6. Daniell, H., Lin, C.S., Yu, M. and Chang, W.J. (2016). Chloroplast genomes: Diversity, evolution and applications in genetic engineering. Genome Biology. 17: 134.

  7. Dante, R.A.S., Ferrer, R.J.E. and Jacinto, S.D. (2019). Leaf extracts from Dillenia philippinensis Rolfe exhibit cytotoxic activity to both drug-sensitive and multidrug-resistant cancer cells. Asian Pacific Journal of Cancer Prevention. 20: 3285-3290.

  8. De Souza, U.J.B., Nunes, R., Targueta, C.P., Diniz-Filho, J.A.F. and de Campos Telles, M.P. (2019). The complete chloroplast genome of Stryphnodendron adstringens (Leguminosae - Caesalpinioideae): Comparative analysis with related Mimosoid species. Scientific Reports. 9: 14206.

  9. Durst, P. and Bayasgalanbat, N. (2014). Proceedings of a Symposium on the Promotion of Underutilized Indigenous Food Resources for Food Security and Nutrition in Asia and the Pacific (May 31-June 1, 2014, Khon Khan, Thailand). Food and Agriculture Organization, Regional Office for Asia and the Pacific, Rikkyo University.

  10. Fatallo, E.K.F. and Panes, V.A. (2022). DNA barcoding of Dillenia philippinensis Rolfe and Dillenia luzoniensis (S. Vidal) Merr. (Dilleniaceae) from Oriental Mindoro and Quezon City, Philippines. Philippine Journal of Systematic Biology. 16: 72-84.

  11. Gontcharov, A.A., Marin, B. and Melkonian, M. (2004). Are combined analyses better than single-gene phylogenies? A case study using SSU rDNA and rbcL sequence comparisons in the Zygnematophyceae (Streptophyta). Molecular Biology and Evolution. 21: 612-624.

  12. Guo, Y.Y., Yang, J.X., Li, H.K. and Zhao, H.S. (2021). Chloroplast genomes of two species of Cypripedium: Expanded genome size and proliferation of AT-biased repeat sequences. Frontiers in Plant Science. 12: 609729.

  13. Horn, J.W. (2009). Phylogenetics of Dilleniaceae using sequence data from four plastid loci (rbcL, infA, rps4, rpl16 intron). International Journal of Plant Sciences. 170: 794-813.

  14. Inglis, P.W., Pappas, M.C.R., Resende, L.V. and Grattapaglia, D. (2018). Fast and inexpensive protocols for consistent extraction of high-quality DNA and RNA from challenging plant and fungal samples for high-throughput SNP genotyping and sequencing applications. PLoS ONE. 13: e0206085.

  15. Jin, J.J., Yu, W.B., Yang, J.B., Song, Y., DePamphilis, C.W., Yi, T.S. and Li, D.Z. (2020). GetOrganelle: A fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biology. 21: 241.

  16. Kumar, S., Stecher, G., Suleski, M., Sanderford, M., Sharma, S. and Tamura, K. (2024). MEGA12: Molecular evolutionary genetic analysis version 12 for adaptive and green computing. Molecular Biology and Evolution. 41: msae263.

  17. Madayag, R.E. Jr., Quiñones-Arribado, K.J.O., Timog, E.B.S., Bartolome, M.C.B., Cruz, J.R.A.V., Borromeo, T.H., Endonela, L.E. and Coronado, N.B. (2026). The complete chloroplast genome of “Lasona” (Allium cepa var. aggregatum G. Don), an important indigenous vegetable in the Northern Philippines. Agricultural Science Digest. 46(1): 08-14. doi: 10.18805/ag.DF-531.

  18. Magdalita, P.M., Abrigo, M.I.K.M. and Coronel, R.E. (2014). Phenotypic evaluation of some promising rare fruit crops in the Philippines. Philippine Science Letters. 7: 376-386.

  19. Nakamura, M. and Sugiura, M. (2007). Translation efficiencies of synonymous codons are not always correlated with codon usage in tobacco chloroplasts. The Plant Journal. 49: 128-134.

  20. Nhamo, L., Paterson, G., Van Der Walt, M., Moeletsi, M., Modi, A., Kunz, R., Chimonyo, V., Masupha, T., Mpandeli, S., Liphadzi, S., Molwantwa, J. and Mabhaudhi, T. (2022). Optimal production areas of underutilized indigenous crops and their role under climate change: Focus on Bambara groundnut. Frontiers in Sustainable Food Systems. 6: 990213.

  21. Oraye, C., De Chavez, H., Aguilar, C., Makiling, F., Ladia, V.J., Enicola, E., Guevarra, L., Gueco, L., Maghirang, R., Anunciado, M., Oro, E., Gonsalves, J., Hunter, D., Borelli, T. and Mendonce, S. (2023). Initiatives on indigenous fruits in the Philippines: A scoping study. Bioversity International. 90: 1-45.

  22. Pormento, C.C. (2024). The potential of katmon fruit (Dillenia philippinensis) extract as a natural food preservative. Advances in Research. 25: 126-132.

  23. Ragasa, C., Alimboyoguen, A. and Shen, C. (2009). Antimicrobial triterpenes from Dillenia philippinensis. Philippine Scientist. 46: 78-87.

  24. Rozov, S.M., Zagorskaya, A.A., Konstantinov, Y.M. and Deineko, E.V. (2022). Three parts of the plant genome: On the way to success in the production of recombinant proteins. Plants. 12: 38.

  25. Sarhan, S., Hamed, F. and Al-Youssef, W. (2016). The rbcL gene sequence variations among and within Prunus species. Journal of Agricultural Science and Technology. 18: 1105-1115.

  26. Schmitz-Linneweber, C., Maier, R.M., Alcaraz, J.P., Cottet, A., Herrmann, R.G. and Mache, R. (2001). The plastid chromosome of spinach (Spinacia oleracea): Complete nucleotide sequence and gene organization. Plant Molecular Biology. 45: 307-315.

  27. Shi, C., Liu, Y., Huang, H., Xia, E.H., Zhang, H.B. and Gao, L.Z. (2013). Contradiction between plastid gene transcription and function due to complex posttranscriptional splicing: An exemplary study of ycf15 function and evolution in angiosperms. PLoS ONE. 8: e59620.

  28. Shi, L., Chen, H., Jiang, M., Wang, L., Wu, X., Huang, L. and Liu, C. (2019). CPGAVAS2: An integrated plastome sequence annotator and analyzer. Nucleic Acids Research. 47: W65-W73.

  29. Swarup, S., Cargill, E.J., Crosby, K., Flagel, L., Kniskern, J. and Glenn, K.C. (2020). Genetic diversity is indispensable for plant breeding to improve crops. Crop Science. 61: 839-852.

  30. Theeuwen, T.P.J.M., Logie, L.L., Harbinson, J. and Aarts, M.G.M. (2022). Genetics as a key to improving crop photosynthesis. Journal of Experimental Botany. 73: 3122-3137.

  31. Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994). CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research. 22: 4673-4680.

  32. Thooptianrat, T., Chaveerach, A., Runglawan, S. and Tanee, T. (2017). DNA profiles to identify Dillenia species (Dilleniaceae) in Thailand. Phytotaxa. 296: 239-252.

  33. Tillich, M., Lehwark, P., Pellizzer, T., Ulbricht-Jones, E.S., Fischer, A., Bock, R. and Greiner, S. (2017). GeSeq - Versatile and accurate annotation of organelle genomes. Nucleic Acids Research. 45: W6-W11.

  34. Tubillo, M., Tugade, M., Ugalde, R., Uy, C.A., Uy, M.C., Valenzuela, D.P., Vellesfin, J.A., Vallester, H., Vertudez, A.N. and Vicente, A.J. (2016). Phytochemical profile and antimicrobial activity of Dillenia philippinensis (katmon) fruit extract on S. aureus and E. coli. GreenPrints.

  35. Villarino, R.T. and Villarino, M.L. (2023). Indigenous knowledge of medicinal fruits in the Philippines: A systematic review. Research Journal of Pharmacognosy. 10: 77-89.

  36. Villesen, P. (2007). FaBox: An online toolbox for FASTA sequences. Molecular Ecology Notes. 7: 965-968.

  37. Xu, W., Ding, H., Mu, Y., Gao, C., Gao, Y., Wang, Q. and Shi, F. (2025). Comparative analysis of mitochondrial and chloroplast genomes in alfalfa (Medicago sativa L.). Legume Research. 49(3): 365-376. doi: 10.18805/LRF-882.

  38. Yu, J., Xue, J.H. and Zhou, S.L. (2011). New universal matK primers for DNA barcoding angiosperms. Journal of Systematics and Evolution. 49: 176-181.
Published In
Agricultural Science Digest