It’s been a decade-long journey, but the publication of a completely gapless telomere-to-telomere (T2T) genome assembly finally marks the end of the Bauhinia Genome Project. With nothing left to sequence, the data advance our understanding of the captivating traits and origins of Hong Kong’s floral emblem and provide a model for assembling other curious hybrid species.
A Blooming Achievement
25 April is International DNA Day, and this year it marked the completion of our decade-long project to sequence the DNA of Hong Kong’s floral emblem, the Hong Kong Orchid Tree, Bauhinia × blakeana Dunn. The study, published in the open science journal GigaScience and led by our partners at the Chinese University of Hong Kong (CUHK), presents a complete, gapless sequence of the Bauhinia genome, with every chromosome spanned from one end to the other (telomere-to-telomere, or T2T). Featured on the Hong Kong flag and currency, this beautiful ornamental Bauhinia species, admired for its striking purplish orchid-like flowers, can be traced back to a chance discovery by French horticulturalist Jean-Marie Delavay on Hong Kong peak in the 1880s. It was later determined to be completely sterile and can only be propagated from cuttings, which made the taxonomic status and precise origin of this striking species a scientific mystery. As far back as 1903, the Hong Kong Botanical and Afforestation Department reported: “The mysterious origin of the tree and its magnificent flowers at once arrest the interest”.
Morphological, single-gene, and marker studies previously suggested the species may be a hybrid between Bauhinia purpurea and Bauhinia variegata, but until now definitive confirmation was lacking, especially at the genomic level. The historical and cultural interest of the Hong Kong Bauhinia led to our initially community-crowdfunded genome project to try to answer some of the questions about the species’ origin, and in 2015 we raised enough money on the Indiegogo platform to start the Bauhinia Genome Project and sequence the transcriptomes of the three key species. Work was handed over to CUHK, and their team started progressing to full genomes in 2019 by combining state-of-the-art second-generation and third-generation sequencing technologies. This multi-platform approach enabled the first chromosome-scale draft assemblies of the hybrid Hong Kong species and its most likely parent species.
With the advent of the first human telomere-to-telomere (T2T) genome in 2022, computational biologists from the National University of Singapore refined this assembly further using novel algorithms, achieving a T2T, haplotype-resolved genome. The final genome assembly reveals 28 complete chromosomes (14 inherited from each parent) forged through a single hybridization event, a genetic chimera powering its vibrant blooms and ecological adaptability. This new T2T-level assembly completes the sequencing effort, allows precise tracking and analysis of genetic variations across the parental lines and their hybrid offspring, and facilitates a comprehensive understanding of the genetic mechanisms underlying its unusual flowering and reproductive traits.
Who’s Your Daddy?
The project definitively resolves the question of the Hong Kong Bauhinia’s parentage, with B. variegata identified as the paternal parent and B. purpurea as the maternal parent. The hybrid arose despite the two species having been separated for over 13.4 million years. Transcriptome profiling (a method to track gene activity) of flower tissues highlighted a closer resemblance of B. blakeana to the maternal parent, and revealed distinct expression patterns among the three species, particularly in biosynthetic and metabolic processes. The new transcriptome data also shed light on the process of heterosis, or hybrid vigor, the phenomenon in which hybrid offspring display enhanced or superior traits compared to their parents.
Scott Edmunds, co-founder of this community-driven project, says of this effort: “When we launched this grass-roots community project to educate the public on the potential of the rapid development of genomics, little did we know we would end up with a complete genome with no gaps covering all the 290.7 million base pairs. Complete T2T genomics was not possible when we proposed the project due to the accuracy and cost of the technology at the time. Starting as a citizen-science project appealing to the public by doing outreach and talks in Hong Kong schools, radio and TV, it is immensely satisfying to be able to say ‘mission accomplished’ on the Bauhinia Genome.”
Through the methods developed in the production of this haplotype-resolved and gapless T2T genome, the researchers have advanced our understanding of the genomic structure and genetic mechanisms underlying the captivating flower colour traits and extended flowering period of this popular and enigmatic ornamental tree. This trio-of-species approach serves as a case study for investigating traits in other hybrid species.
Prof Stephen Tsui, who led the work at The Chinese University of Hong Kong, says of his journey to finish the T2T genome of Bauhinia: “Hybrid genomes present significant assembly challenges due to their high heterozygosity and structural complexity. To address this, we integrated trio-binning with custom algorithms, achieving complete T2T assemblies for both parental haplotypes of Bauhinia × blakeana using only ~60x long-read data, a notable improvement over conventional workflows. By decoding these haplotypes, we not only solved the century-old mystery of its origin but also revealed how parental allele interactions shape the hybrid’s iconic floral characteristics. Our work also serves as a model for exploring the characteristics of hybrid species using T2T haplotype-resolved genomes, providing a novel approach to understanding genetic interactions and evolutionary mechanisms in complex genomes with high heterozygosity. Lastly, my team is very proud to have been able to complete the genome of Bauhinia, which is the emblem of Hong Kong.”
As an open science project, all the resources generated in this study have been made freely available for future scientific studies, breeding programs, and conservation initiatives in Bauhinia species. Teaching materials and protocols are also available for wider educational reuse in Hong Kong and beyond. On top of genomics we also taught about biodiversity and plant physiology, and the observations collected in our #BauhiniaWatch project have now been deposited in the GBIF biodiversity database. The resulting open resources and data are listed below; on top of conclusively answering a scientific question, they provide a further legacy of this project.
The Open Science Legacy of Bauhinia Genome:
- Indiegogo Crowdfunding Page.
- Protocols.io protocols.
- OER Commons open teaching materials.
- Bauhinia Genome SciStarter page.
- Bauhinia Watch biodiversity data in GBIF.
- OSF Project (includes specimen pictures, QC reports and posters).
- GigaDB data (links to genomes and transcriptomes).
- NCBI Bioproject (sequencing data mirrored from US).
- CNSA Bioproject (sequencing data mirrored in China).
- Bauhinia Genome YouTube.
See the TEDx talk for the background on how this project was launched to raise genomic literacy in Hong Kong.
Thank You for the Music
We would like to thank everybody who has been on this journey with us: the crowdfunding community who kickstarted the funding of the project, Sarah Lazarus who first gave the project oxygen by featuring it as an (award-winning) SCMP cover story, the many schools and community groups who invited us to speak and teach genomics, the artists inspired by the project who inspired us back with their representations of Bauhinia Art, BGI Hong Kong for providing the initial at-cost sequencing, and Stephen and his many students at CUHK who completed this work to such an unprecedented quality. When we started the crowdfunding campaign (see the English and Cantonese videos), little did we know that, alongside conclusively solving a scientific mystery going back to 1903, we would see mission accomplished with every one of the roughly 290 million base pairs sequenced without a single gap. Thank you everybody!
References
Mu et al. The haplotype-resolved T2T genome for Bauhinia × blakeana sheds light on the genetic basis of flower heterosis. GigaScience. 2025. https://doi.org/10.1093/gigascience/giaf044