Apprenticeship proposal in Computer science / Bioinformatics (Master 2)

Development of new functionalities for the information system FLAGdb++

Employer: French National Institute for Agricultural Research
Tutor: Jean Philippe TAMBY

The Genomic Networks (GNet) team of IPS2 develops statistical and bioinformatics methods, as well as databases and interfaces, for the analysis of 'omics' data. Its main research theme is determining gene functions and the interactions between genes. The team maintains several publicly available software tools, including databases such as FLAGdb++ for genomics and CATdb for transcriptomics.

FLAGdb++ is a generic information system used to explore a vast body of knowledge on six plant genomes (Arabidopsis, rice, poplar, grapevine, tomato and melon). It offers an original computing environment by aggregating structural and functional information from experimental sources, expert studies and bioinformatics predictions.
An interface, developed in Java, allows graphical visualization of genomes and their annotations and provides visual data analysis tools. However, the interface must evolve to (i) make genomic information more legible, (ii) offer original visualizations for new data, and (iii) improve the performance of analyses that handle increasing amounts of data.

The candidate will participate in the development of FLAGdb++ in three areas:
1. Graphic Environment Ergonomics:
The FLAGdb++ interface represents the structural and functional annotations of genomes using the graphical capabilities of the Java language. The planned improvements aim to facilitate access to information and exploration tools, categorize functional annotations, and modernise navigation across genomes with a pointing device. The apprentice will carry out these developments by making extensive use of Java libraries (Swing and others).
2. Knowledge representation:
FLAGdb++ will need to integrate new data from large-scale studies conducted by the GNet team. Depending on their nature, these annotations will be visualized on a single genome or across several species for comparative genomics. The apprentice will design and implement in Java the objects underlying these new annotations.
3. Benchmarking and Interfacing:
Growing data volumes call for ever faster software that returns results quickly to the user (e.g. alignments with DIAMOND). The apprentice will run comparative tests of bioinformatics software: building a test data set, running the software and writing a test report. The apprentice will then select the best-performing software and interface it with FLAGdb++.

Object-oriented programming in Java, relational databases (PostgreSQL). Possible use of Perl, Python and shell languages. Eclipse/Unix development environment.
Biology and genomics knowledge required.

Contact: Jean Philippe TAMBY (jean-philippe.tamby @ inra.fr)