Alternative splicing (AS) is an effective way to expand the proteome, and thus biological complexity, without having to create and evolve new genes. Several studies have reported anecdotal evidence that alternative splicing is involved in the regulation of protein-protein interactions. AS events have been shown to occur significantly often in regions containing a protein interaction domain or a short linear motif. Several AS variants show partial or complete loss of interface residues, suggesting that AS can play a major role in regulating interactions by selectively targeting protein binding sites. In the first part of my PhD work I performed a statistical analysis of the alternative splicing of a non-redundant data set of human protein-protein interfaces known at the molecular level, to determine how important AS is as a means of modulating protein-protein interactions. I demonstrated that the alternative splicing-mediated partial removal of both heterodimeric and homodimeric binding sites occurs at lower frequencies than expected. This holds true even when I consider only those isoforms whose sequence differs least from that of the canonical protein and which therefore allow selective regulation of functional regions of the protein. On the other hand, large removals of the binding site are not significantly avoided, possibly because they are associated with drastic structural changes of the protein. The observed protection of binding sites from AS is not preferentially directed towards putative hot spot interface residues, and it extends across all protein functional classes. Using the same procedure as that applied to protein-protein interactions, I also evaluated the importance of AS-mediated removal of protein-ligand binding sites, obtained from three-dimensional structures of human proteins.
Again, I observed that AS tends to avoid partial removal of such sites, while being largely indifferent to complete or near-complete deletions. This tendency does not depend on the size of the binding site. The choice of the AS pattern of a gene is thus conditioned by constraints imposed by the three-dimensional structure of the protein products. Alternative splicing is not restricted to protein-coding genes: many long non-coding RNAs (lncRNAs) have two or more transcript isoforms. Since these transcripts are not translated, the differential usage of splice sites cannot be influenced by structural constraints, except for those related to RNA folding. Although AS occurs in about a quarter of lncRNA genes, little is known about its role in the regulation of lncRNA function and stability, mainly because few lncRNAs have been functionally characterized. The aim of the second half of my PhD work was to study the alternative splicing of lncRNA genes. First, I analyzed the evolutionary conservation of alternatively spliced lncRNA sequences (and their flanking regions) and found that their pattern of conservation is similar to that shown by protein-coding genes; this suggests that AS of lncRNA genes is as important as that of protein-coding genes, at least from an evolutionary standpoint. To study the impact of AS on lncRNA functional sites, I assembled a data set of protein-RNA interaction sites by reanalysing published CLIP-Seq, RIP-Seq and RIP-Chip experiments. The results of this reanalysis will be stored in a public database of protein-RNA interactions detected via high-throughput methods.
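The distinction drawn above between partial and complete removal of a binding site can be illustrated with a minimal sketch. It is not the procedure used in the thesis, which the abstract does not detail. The function names, the residue numbering and the null model (a removed segment of fixed length placed uniformly along the protein) are all assumptions made for illustration only.

```python
def classify_removal(site, removed):
    """Classify how much of a binding site is lost to a removed region.

    site    -- iterable of residue positions forming the binding site
    removed -- iterable of residue positions absent from the isoform
    Returns "none", "partial" or "complete".
    """
    site = set(site)
    lost = len(site & set(removed))
    if lost == 0:
        return "none"
    if lost == len(site):
        return "complete"
    return "partial"


def expected_partial_fraction(protein_length, site, region_length):
    """Fraction of placements of a contiguous removed segment that remove
    the site only partially.

    Hypothetical null model: every start position of the segment along the
    protein (1-based residue numbering) is equally likely.
    """
    starts = range(1, protein_length - region_length + 2)
    outcomes = [
        classify_removal(site, range(s, s + region_length)) for s in starts
    ]
    return outcomes.count("partial") / len(outcomes)


if __name__ == "__main__":
    # Toy protein of 100 residues with a binding site at residues 50-59.
    toy_site = range(50, 60)
    print(classify_removal(toy_site, range(55, 70)))
    print(expected_partial_fraction(100, toy_site, 20))
```

Under this toy null model, an observed fraction of partial removals well below the expected one would correspond to the kind of depletion described in the abstract; a real analysis would need a null model that respects exon boundaries and isoform structure.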

(2014). Computational characterization of alternative splicing events in coding and non-coding genes.

Computational characterization of alternative splicing events in coding and non-coding genes

COLANTONI, ALESSIO
2014-01-01

2014
2014/2015
Cellular and molecular biology
27.
Sector BIO/11 - Molecular Biology
English
Doctoral thesis
Use this identifier to cite or link to this document: https://hdl.handle.net/2108/202013