Research projects
Creation of the Bioinformatics and Data Science Unit at IBFG
Bioinformatics is a rapidly growing field with diverse applications, and it has become a crucial tool for analysing data generated by biological experimentation and sequencing (e.g., genomics, transcriptomics, proteomics, metabolomics). Despite high demand, recruiting qualified personnel remains one of the main bottlenecks for many research centres, because bioinformatics requires advanced knowledge spanning multiple areas of biology as well as computing and IT environments.
The Momentum project awarded to the IBFG proposes the creation of a Bioinformatics and Data Science Unit with the aim of strengthening the digital competences of the institute.
To this end, we have developed a strategy to identify the main computational challenges faced by the centre, with the goal of building a tailored infrastructure (both physical and computational) and establishing ad hoc working protocols (e.g., pipelines for multi-omics data integration). A further objective is to increase the digital competences of the institute through hands-on training courses for the general staff, as well as targeted training for identified personnel within the individual research groups, so that both the institute as a whole and each group become more self-sufficient in bioinformatics.
The Bioinformatics and Data Science Unit (UBCD) of the Institute of Functional Biology and Genomics (IBFG, CSIC-USAL) is a research-support structure whose primary objective is to improve the computational and digital capabilities of the institute. The unit organises its activities around three core pillars: advisory support and collaboration with the institute's research groups, provision and management of computational infrastructure, and training of research staff.
Project funded by the European Union (NextGenerationEU) through the Momentum Programme (CSIC), in collaboration with the Spanish Ministry of Science, Innovation and Universities.
Activity Areas
Advisory Support and Collaboration
The UBCD acts as an internal consultancy service for the research groups of the IBFG. Since its launch, the unit has handled requests from a large proportion of the groups, covering a wide range of disciplines: bacterial genomics and metagenomics, transcriptomics, eukaryotic genomics, RIP-seq, ChIP-seq, proteomics, metabolomics, lipidomics, and image analysis. Requests are managed through an internal tracking system that allows prioritisation and monitoring of each request. Since October 2025, 57 requests have been registered, of which more than two thirds have been completed successfully and the remainder are ongoing.
Computational Infrastructure
The IBFG has its own computational infrastructure managed by the UBCD, including high-performance servers ranging from 88 CPU cores and 384 GB of RAM up to configurations with more than 400 cores and 1.5 TB of RAM, as well as a storage system exceeding 100 TB. This virtualisation-based infrastructure allows individual virtual machines to be allocated to each research group, tailoring resources to the specific needs of each project. Additionally, the unit advises on and facilitates access to external high-performance computing systems, such as SCAYLE (Supercomputación Castilla y León) and the DRAGO system. Code and resources developed by the unit are shared through a public GitHub repository. The unit also participates in CSIC networks, including the CSIC Computational Biology and Bioinformatics Network (BCB).
Training
The UBCD offers a continuing training programme aimed at researchers, technicians, and students at the institute, regardless of their prior computing experience. The programme includes workshops on general-purpose bioinformatics tools, such as an introduction to the Linux operating system, advanced command-line usage, and the R programming language in its various applications: statistical analysis, data visualisation, and applied statistics. Training is complemented by individualised sessions tailored to the specific needs of each group or researcher, as well as online resources available through a dedicated website featuring tutorials and guides.
Research Areas
Computational Science
Scientific Computing, High Performance Computing (HPC), Bioinformatics, Computational Biology
Artificial Intelligence
Deep Learning, Machine Learning, Neural Networks, LLM, GenAI