Simone MARINI, PhD

Postdoctoral fellow, Kyoto University, Japan.


Room CB320 Akutsu Laboratory,

Bioinformatics Center, Institute for Chemical Research, Kyoto University,

Uji, Kyoto 611-0011, Japan.


smarini (_at) sunflower (_dot) kuicr (dot__) kyoto-u (_dot_) ac (dot_) jp

simone (_dot_) marini (at_) unipv (_dot) it

Who I am

As a scientist, I work in Bioinformatics, mainly applying Machine Learning to build predictive models and simulations.

I have worked with a wide variety of data (electronic health records, genomic variants, ontologies, protein sequences) and techniques (support vector machines, random forests, Bayesian networks, data fusion).

My research projects span three countries (Italy, China and Japan) and involve seven institutions: Kyoto University, University of Pavia, Italian National Research Council (CNR), Chinese Academy of Sciences, Tsinghua University, Hong Kong University of Science and Technology, and Maugeri Foundation.

I have lived in Pavia, Madrid, Hong Kong and Kyoto.



PhD in Bioengineering

Hong Kong University of Science and Technology, China.

Thesis: Qualitative and quantitative protein interaction prediction with machine learning.


MSc in Biomedical Engineering

University of Pavia, Italy.

Thesis: Design of a classifier by coevolution of genetic algorithms and genetic programming.


BSc in Biomedical Engineering

University of Pavia, Italy.

Thesis: Bone tissue engineering, effects of mechanical shear stress on human osteoblast SAOS2.



Data fusion and knowledge discovery in epilepsy, myelodysplastic syndromes and protein cleavage

Techniques > Matrix tri-factorization, Association Study, Random Forest, Burden Methods
Technology > Octave, High Performance Computing, Perl
Data > NGS, TCGA, TarBase, KEGG, GO, DO, MEROPS, GEO, DOMINE, Negatome, BioGRID, InterPro, STRING, genomic variants
Institutions > Kyoto University, University of Pavia

Cohort simulation of Type 1 and 2 diabetes patients

Techniques > Dynamic Bayesian Networks, Continuous Time Bayesian Networks
Technology > MATLAB, R
Data > EDIC, DCCT, Electronic Health Records
Institutions > University of Pavia, Maugeri Foundation

Genomic variant deleteriousness prediction (Classification, webtool)

Techniques > Ensemble Learning, Cost-sensitive Learning
Technology > Perl, Weka, AJAX, Glassfish
Data > NGS, HGMD, 1TGP, NHLBI GO Exome Sequencing Project
Institution > University of Pavia

SNP selection in sugar beet

Techniques > Markov Chain Monte Carlo, Logistic Regression
Technology > Weka, MATLAB
Data > Genotyping
Institutions > University of Pavia, (Italian) National Research Council

DNA-, RNA- and protein-protein interaction (or affinity) prediction

Techniques > Ensemble Learning, Support Vector Machines
Technology > Weka, Perl
Data > Negatome, DSCAM, Protein-interactions, Aptamers
Institutions > HKUST, Tsinghua University, Chinese Academy of Sciences




A Dynamic Bayesian Network model for long-term simulation of clinical complications in type 1 diabetes

Marini S, Trifoglio E, Barbarini N, Sambo F, Di Camillo B, Malovini A, Manfrini M, Cobelli C, Bellazzi R. Journal of Biomedical Informatics 2015, 57.

PaPI: pseudo amino acid composition to score human coding variants

Limongelli I, Marini S, Bellazzi R. BMC Bioinformatics 2015, 16:123

Developing a parsimonius predictor for binary traits in sugar beet (Beta vulgaris)

Biscarini F, Marini S, Stevanato P, Broccanello C, Bellazzi R, Nazzicari N. Molecular Breeding 2015, 35(10)


Improvement of Dscam homophilic binding affinity throughout Drosophila evolution

Wang GZ*, Marini S*, Ma X, Yang Q, Zhang X, Zhu Y. BMC Evolutionary Biology 2014, 14:186

*equally contributed


The role of SwrA, DegU and P(D3) in fla/che expression in B. subtilis.

Mordini S, Osera C, Marini S, Scavone F, Bellazzi R, Galizzi A, Calvio C. PLoS One 2013, 8(12):e85065.


In silico Protein-Protein Interaction prediction with sequence alignment and classifier stacking.

Marini S, Xu Q, Yang Q. Curr Protein Pept Sci. 2011, 12(7).



[to appear] Data Fusion for cleavage target prediction

Marini S, Demartini A, Vitali F, Bellazzi R, Akutsu T. Bioinformatics Italian Society National Congress 2016, podium presentation

Learning T2D evolving complexity from EMR and administrative data using Continuous Time Bayesian Networks

Marini S, Dagliati A, Sacchi L, Bellazzi R. 9th International Joint Conference on Biomedical Engineering Systems and Technologies, 2016


A genomic data fusion framework to exploit rare and common variants for association discovery.

Marini S, Limongelli I, Rizzo E, Da T, Bellazzi R. 15th Conference on Artificial Intelligence in Medicine 2015

Matrix tri-factorization for miRNA-gene association discovery in acute myeloid leukemia

De Martini A, Marini S, Vitali F, Bellazzi R. 15th Conference on Artificial Intelligence in Medicine [Workshop] 2015

A continuous time, multivariate model to simulate Type 2 Diabetes patients trajectories

Marini S, Dagliati A, Bellazzi R. American Medical Informatics Association joint Summits on Translational Science 2015

Predicting Microvascular Complications from Type 2 Diabetes Retrospective Data

Sacchi L, Colombo C, Dagliati A, Marini S, Cerra C, Chiovato L, Bellazzi R. 15th Annual Diabetes Technology Meeting, 2015


A multivariate data-driven model to investigate the arising of complications in T2D patients

Marini S, Malavolti M, Dagliati A, Bellazzi R. 14th Annual Diabetes Technology Meeting 2014

PaPI: the Pseudo Amino acid variant Predictor

Marini S, Limongelli I, Bellazzi R. Bioinformatics Italian Society National Congress 2014, podium presentation

A novel algorithm to predict the deleteriousness of genomic coding variants

Limongelli I, Marini S, Bellazzi R. NGS (ISCB) 2014

Dynamic Bayesian Networks to simulate type I diabetes patients cohorts

Barbarini N, Bellazzi R, Cobelli C, Di Camillo B, Manfrini F, Malovini A, Marini S, Sambo F, Trifoglio E. Economics, Modelling and Diabetes: Mount Hood Challenge 2014, podium presentation

PaPI: using pseudo amino acid composition to predict deleterious coding variants

Limongelli I, Marini S, Bellazzi R. Italian Bioengineering Group National Congress 2014



Outstanding contribution in reviewing, Journal of Biomedical Informatics (Elsevier)


Bioengineering Division Graduate Student Research Award, 1st ranked.


HKUST Overseas Research Award for PhD Students.



[upcoming, September] Rare disease association studies from multiple data sources. Bioengineering Division, The Hong Kong University of Science and Technology.

[upcoming, June] Leveraging publicly available databases for novel peptidase target discovery. Electrical, Computer and Biomedical Engineering Dept., University of Pavia.


May 13. Motif search, sequence alignment and Support Vector Regression for Dscam protein self- and hetero-binding affinity prediction. Institute of Biophysics, Chinese Academy of Sciences, Beijing.


1. Journal of Biomedical Informatics (since 2015)

2. Briefings in Bioinformatics (since 2015)

3. Artificial Intelligence in Medicine (conferences) (since 2016)

4. American Medical Informatics Association joint Summits on Translational Science (since 2016)

5. Computers in Biology and Medicine (since 2016)

(My profile on Publons.)


Italian (Native speaker), English (Fluent), Spanish (Fluent)


In my spare time I like (1) traveling alone, and very cheaply; (2) playing nerdy pen-and-paper role-playing games; and (3) learning (or trying to learn) languages, history and philosophy.


I build predictive models and simulations using a range of Machine Learning techniques. I work with a wide variety of data in both Health Informatics and Bioinformatics.