I am currently working as a consultant at the Ontario Cancer Biomarker Network.
My main scientific interests are in computational biology, bioinformatics, and related fields.
As an undergraduate at the University of São Paulo, I completed my Bachelor's degree in Biology. During this period, I worked in different labs as an intern to grasp the practical side of research as a complement to the theory seen in class. I then entered the graduate program in Genetics/Biology at the Biological Sciences Institute of the University of São Paulo as a Master's student under the supervision of Dr Paulo Otto. My main research activity was the development and coding of a computer program that simulates and presents population genetics phenomena to illustrate key concepts to students learning the subject. The resulting software, WinPop, has an intuitive and easy-to-use interface for generating graphical representations that help undergraduate and graduate students understand population genetics. My Master's thesis was one of the first in the University to be based solely on Computational Biology. The software is still maintained by Dr Matthew Hamilton of Georgetown University and me. It is now named PopGene.S2 and is being actively expanded and improved as a teaching tool for use in population genetics courses around the world.
For my Ph.D. work, I shifted my attention to a project centered on determining the phylogenetic relationships of a small subfamily of South American frogs. During these four years of research, I worked with Dr. Francisca do Val at the Museum of Zoology, University of São Paulo. Throughout this term, I developed proficiency in all stages of phylogenetic analysis, from field work and scoring of morphological characters to DNA sequencing, final data set assembly, and tree estimation. My interest in phylogenetics was not only to determine the relationships among the frog species. My motivation was also to learn the aspects of each phylogenetic model and to explore the computational problems presented by the field. Therefore, as I worked on the specific case of frogs, I also looked for ways to speed up the process of tree estimation and analyzed the effect of distinct character types on the final cladogram topology.
After finishing my Ph.D. work, with a renewed enthusiasm for Computational Biology, I decided to focus on the development of new methodologies and biological applications of computational tools. My first post-doctoral experience was at the Department of Biology of McMaster University. At McMaster I worked in Dr. Brian Golding's lab, in collaboration with Dr Elizabeth Weretilnyk's group, as part of the development team for the ''GC/MS Analysis Software Package'' (GASP), a computer application that analyzes output data from gas-chromatography/mass-spectrometry (GC/MS) instruments. The release of this software to the academic community was one of the most rewarding moments of my career to this point, because GASP is a full-featured application that meets an array of objectives defined by bench biologists engaged in GC/MS research.
GASP's development required a multi-disciplinary team with diverse expertise in order to create a novel methodology for the field of metabolomics. My role was to gather requirements from the wet-lab personnel and translate them into computer algorithms and code in a way that would facilitate both the software interface and the final scientific analysis. Over the course of the GASP development project I also interacted with researchers in metabolomics and worked on some aspects of GC/MS analysis that need to be assessed and improved. For example, the quality of GC/MS data alignments and the specific methods of statistical analysis and comparison, as well as the development of databases that store experiment data in a format accessible to existing metabolomics software, are areas where additional work would improve research results. While at McMaster, I was also involved in setting up a state-of-the-art facility for the Sinorhizobium meliloti Genome Canada project at the Center for Environmental Genomics Biotechnology (CEGeBio). This work consisted of installing, managing, and supporting high-throughput equipment for the scientists involved in the project.
I also worked as a post-doctoral fellow at the University Health Network, Division of Cancer Genomics and Proteomics of the Ontario Cancer Institute, under the guidance of Dr Elisabeth Tillier. My main project dealt with the discovery of transcription factor binding sites in DNA sequence clusters of genes expressed in different tissues. This project included the development of novel algorithms to speed up the search process, new statistical methodologies for determining the binding sites unique to each tissue, and collaboration with laboratories that provide data for the analysis. I was also involved in the design of computer interfaces for software available in the lab, in order to improve its usability and accessibility for scientists with all levels of computer knowledge. Of these programs, Simprot stands out at the moment. This application, originally a command-line tool, simulates protein sequences with insertions and deletions. Using the experience accumulated from the GASP design, Dr Tillier and I were able to add a simple yet powerful interface, improving Simprot's usability and features.
During my career I have also cultivated some personal interests that arose from my research. One of them is cancer genomics and chromosomal structure. I have worked closely with Dr Jeremy Squire, providing in silico data that complement and improve results obtained experimentally in the lab. This collaboration showed me the importance of genomic architecture and its implications for cancer development, such as segmental duplications and their effect on prostate cancer.
During my Ph.D. work, I also experienced the difficulties of DNA sequencing and phylogenetic analysis, especially the ribosomal RNA (rRNA) alignment procedure. This is an interesting and difficult problem, and the development of powerful tools to address it is key to improving the reliability of an rRNA sequence alignment and to decreasing the amount of time spent performing it by eye. Some algorithms have been developed and published recently, but no software package is widely available and known to produce good and reliable results. Developing a package that annotates and aligns rRNA sequences at the same time is one of my objectives.
Another aspect of computational biology that attracts me is the development of Graphical User Interfaces (GUIs) for scientific software. A good GUI allows faster analysis of data and less time spent on operating manuals. In most cases biological software consists of command-line applications or has intricate interfaces. Effective interfaces instil a sense of control, do not concern users with how they work, perform a maximum of work for minimal input, and anticipate the users' wants and needs. As operating systems have moved toward a more graphical environment, the look of some biological applications has changed. Nevertheless, cross-platform coding remains one of the major issues in interface production. My main interest in GUI development is to generate code that is portable to different systems, offering the same usability and producing the same results on each. Close interaction with the final user, whether a student (e.g. WinPop) or a researcher (e.g. GASP), is a key point in the development of an effective interface.
Apart from collaborative work with other researchers, I intend to develop several other projects in my own lab. Among them is the continuous improvement of WinPop, with the inclusion of different population genetics simulation tools. Another avenue that I would like to explore is the use of different types of phylogenetic characters in tree estimation, with a possible effect on the choice of phylogenetic method. I also intend to continue developing biology-driven algorithms, such as the transcription factor binding site search procedure and GC/MS data alignment. Regarding GC/MS data alignment, the improvement of GASP and its code is among the projects I wish to develop. Comparative genomics and human chromosome structure analysis using computer tools are other interesting side projects on my list of possible future developments.
Finally, another goal of my career is to pursue an integration of Biology and High Performance Computing (HPC). While at McMaster, I was lucky to have access to the Canadian HPC network Sharcnet, which made me realize the importance of this type of application in biological analysis. In the past years, I have been improving my skills in HPC and parallel computing and seeking to exchange information and collaborate with researchers in Computer Science, in order to build parallel applications for biological analysis.
In summary, my main research interest is to work in computational biology, at the interface between the bench biologist and the computer. This interface can be a way of developing new computer applications and methodologies, of supplying fast and reliable results for experiments carried out in the lab, or of gathering data from available databases and literature and using computational techniques to obtain novel results. I firmly believe in the collaboration of different labs and scientists pursuing a common result. In computational biology, collaborations are indispensable, as the laboratory bench is the source of data and at the same time the place to corroborate conclusions obtained in silico.
I plan to continue my research work in computational biology, creating novel computer applications and methodologies and collaborating with other researchers. My research will be centered on the development of methods for the analysis of genomic architecture, the search for transcription factor binding sites, phylogenetic analysis, rRNA alignment, and algorithm development for biological applications, among others. I will also continue investigating the development of GUIs for biological applications and extend my work toward a greater integration of HPC and Biology.
In the next several years, I intend to make a significant contribution to the field of Computational Biology and at the same time teach undergraduate and graduate students. I also plan to have an established research program and to guide graduate students through the mysteries of informatics and biology. I firmly believe that funding is key for any scientific research, and I plan to apply for all possible opportunities in the coming years.
Regarding my teaching interests, I have been closely involved throughout my academic life in the teaching of genetics and population genetics. While in Brazil, I held two Teaching Assistantships in genetics courses. At McMaster, I also had the opportunity to become a Seasonal Lecturer, teaching Population Genetics to third-year students. I am interested in teaching courses on basic and advanced aspects of computational biology, genetics, phylogenetics, evolution, and population genetics. I believe that computational biology is not a detached subject and that teaching it is linked with many different fields of Biology, due to the ever-increasing use of computers. A greater integration of the informatics aspects of Biology into many different courses is necessary in order to make teaching and learning easier and more enjoyable. I am enthusiastic about both laboratory-oriented and more theoretical courses, and I also enjoy one-on-one teaching and advising.