About me

My main research interest concerns the evolution of chloroplasts, and I am particularly interested in the evolution of the two protein complexes “Translocons at the Outer and Inner Chloroplast envelope membranes” (TOC and TIC respectively). These protein complexes transport nuclear-encoded proteins into the chloroplast, and the evolution of TOC/TIC is an important part of the transformation of the ancient cyanobiont into a modern chloroplast.

info@matstopel.se

Skeletonema codon usage

Introduction

I'm about to calculate the codon usage for Skeletonema marinoi transcripts, using Sandra's Trinity assembly from unfiltered/untrimmed non-normalised reads (/data4/skeletonema_sandra/trinity_out_dir/Trinity.fasta).
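As a rough illustration of what a codon usage calculation involves, here is a minimal Python sketch. It is not the script used for this analysis. It assumes the input sequences are already in-frame coding sequences (raw Trinity transcripts contain UTRs, so ORFs would have to be extracted first), and the function names read_fasta, count_codons and codon_usage are hypothetical.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_codons(seq, frame=0):
    """Count complete, unambiguous codons in one reading frame.

    Assumption: `seq` is a coding sequence that starts in the given frame.
    """
    seq = seq.upper().replace("U", "T")
    counts = Counter()
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        # Skip codons containing N or other ambiguity codes.
        if set(codon) <= set("ACGT"):
            counts[codon] += 1
    return counts


def codon_usage(records, frame=0):
    """Return the relative frequency of each codon over all records."""
    total = Counter()
    for _, seq in records:
        total += count_codons(seq, frame)
    n = sum(total.values())
    return {codon: c / n for codon, c in total.items()} if n else {}
```

With a file of coding sequences, this could be called as codon_usage(read_fasta(open(path))), where path points to the FASTA file.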

Identify single copy loci for sequence capture in the Orchidaceae family

Introduction

We (Filipe, Alex, me ...) have set up four criteria for the EST sequences that will be used to design the sequencing probes:

Webbased Phylogenomic analysis

Introduction

In this exercise you will practice your newly acquired skills in phylogenetic inference and "tree thinking" by analysing the evolutionary history of a gene family. Part of the exercise is also to collect the data needed for the analysis (in the form of homologous protein sequences from different species) and to interpret the result (you first have to draw a species tree for the taxa included in the analysis).

Gene family evolution - Exercise #2

Introduction

In this exercise you will redo the analysis from Exercise #1, but this time analyse the fungal genes you analysed in other parts of this course. Another difference from the last exercise is that you will do the alignment, alignment editing and tree manipulation on your local computer. For the latter you have to download and install the program FigTree.

Marine Genomics 2013 - Phylogenomics exercise

Introduction

Gene family evolution - Exercise #1

Introduction

In this exercise you will practice your newly acquired skills in phylogenetic inference and "tree thinking" by analysing the evolutionary history of a gene family. Part of the exercise is also to collect the data needed for the analysis (in the form of homologous protein sequences from different species) and to interpret the result (you first have to draw a species tree for the taxa included in the analysis).

Testing Trinity on Surirella brebissonii RNA data

Introduction

Testing the digital normalisation feature of the Trinity package.

Normalisation of unfiltered data

[2013-09-24]

Toc34 and Toc159 proteins in diatoms

Introduction

Kalanon & McFadden reported that Cyanidioschyzon merolae encodes two putative GTPase receptor proteins and classified one of them (CMP284C) in the Toc34 gene family and the other one (CMQ137C) as Toc159-like. Searching the NCBI non-redundant protein (nr) repository for Rhodophyceae (taxid:2763) proteins using either of the two proteins as query sequence identifies the other as the fifth best match.

Also, using these two proteins as query sequences when searching all proteins from the diatoms ...

Identifying bacterial sequences in diatom WGS data

Introduction

Our genome assemblies of the Skeletonema marinoi and Surirella brebissonii datasets contain a lot of bacterial contigs, and these have to be identified and removed before further analyses can be done. This section describes the method we will use for this.

Material

Diatom databases

Thalassiosira pseudonana CCMP1335

11673 sequences. Database contains all NCBI RefSeq sequences for "Thalassiosira pseudonana CCMP1335[orgn]" available 2013-05-08.

Performance test of ExaML

Introduction

Amphiura filiformis de novo genome project

Introduction

Coming soon

Data

  • 5_120719_AC0YY4ACXX_2_indexm1_1.fastq - 170'243'702 sequences (Data_1)
  • 5_120719_AC0YY4ACXX_2_indexm1_2.fastq - 170'243'702 sequences (Data_2)
  • 3_130111_BD1HWHACXX_P389_101_indexm1_1.fastq - 158'112'696 sequences (Data_3)
  • 3_130111_BD1HWHACXX_P389_101_indexm1_2.fastq - 158'112'696 sequences (Data_4)

Analyses

20130305

Fucus vesiculosus de novo genome project

Introduction

Coming soon

Data

  • 2_120706_BC0YYNACXX_4_indexm2_1.fastq - 169'696'580 sequences (Data_1)
  • 2_120706_BC0YYNACXX_4_indexm2_2.fastq - 169'696'580 sequences (Data_2)
  • 2_130111_BD1HWHACXX_P388_101_indexm2_1.fastq - 155'901'827 sequences (Data_3)
  • 2_130111_BD1HWHACXX_P388_101_indexm2_2.fastq - 155'901'827 sequences (Data_4)

Analyses

20130305

Littorina saxatilis de novo genome project

Introduction

Coming soon.

assemblyPipeline.py

Introduction

assemblyPipeline.py is a wrapper for running de novo genome assembly analyses using the tools fastx_trimmer, cutadapt and fastq_quality_filter to prepare the data and the CLC Assembly Cell for the actual assembly.
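The data preparation stage of such a wrapper can be sketched roughly as follows. This is not the actual assemblyPipeline.py code. The order of the three tools, the file naming scheme, the default parameter values and the function name preprocessing_commands are all assumptions made for illustration, and the CLC Assembly Cell step is left out. The function only builds the command lines and does not run them.

```python
def preprocessing_commands(reads, adapter, first_base=1, last_base=100,
                           min_quality=20, min_percent=80):
    """Build the command lines for three read preparation steps.

    All default values here are illustrative, not the pipeline's settings.
    """
    base = reads[:-len(".fastq")] if reads.endswith(".fastq") else reads
    trimmed = base + ".trimmed.fastq"
    clipped = base + ".clipped.fastq"
    filtered = base + ".filtered.fastq"
    return [
        # Keep bases first_base..last_base of every read.
        ["fastx_trimmer", "-f", str(first_base), "-l", str(last_base),
         "-i", reads, "-o", trimmed],
        # Remove the 3' adapter sequence.
        ["cutadapt", "-a", adapter, "-o", clipped, trimmed],
        # Keep reads where at least min_percent of the bases
        # have a quality score of min_quality or higher.
        ["fastq_quality_filter", "-q", str(min_quality),
         "-p", str(min_percent), "-i", clipped, "-o", filtered],
    ]
```

Each returned list could then be passed to subprocess.run(cmd, check=True) so that the wrapper stops at the first failing step.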

The Surirella brebissonii genome project

At CMB, University of Gothenburg, we are currently working on the de novo genome assembly of the diatom Surirella brebissonii. PI for the project is Anders Blomberg, and the main responsibility for the assembly work lies with me and Magnus Alm Rosenblad.

This page will describe the work on assembling the S. brebissonii data. I will not go into detail about the pre-sequencing work as I had no part in that.

Evolution of the POR gene family

Data at github.com/mtop-data/POR
Code at github.com/mtop/misc

Material & Methods

Peter has predicted the transit peptide (TP) region in all the sequences used in previous analyses (stored in "analysis/mrbayes/all_seqs.fst"). The TP regions have been removed from the sequences that are found in the file "all_seqs_No_TP.fst". I have aligned this dataset using mafft...

E&S

This page could have been a Q&A (Questions & Answers), but instead I have decided to call it an E&S - Error messages and Solutions!

zorro

Error message:

Reciprocal BLAST S. schafta - S. uralensis - S. vulgaris

Introduction

In order to compare the two transcriptome datasets from Silene schafta (78895 sequences) and S. uralensis (80151 sequences) to each other, we are going to do a reciprocal BLAST analysis. This analysis will also include a dataset from S. vulgaris (37874 sequences) (Yann please send me details about the reference).
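The core of a reciprocal BLAST comparison, finding pairs of sequences that are each other's best hit, can be sketched in Python as below. This is only an illustration under stated assumptions, not the procedure used in this analysis. It assumes both searches were saved in BLAST's default 12-column tabular format (query in column 1, subject in column 2, bit score in column 12) and that "best" means highest bit score. The function names best_hits and reciprocal_best_hits are hypothetical.

```python
def best_hits(blast_lines):
    """Map each query to its highest-scoring subject.

    Assumes tab-separated BLAST output with the bit score in column 12.
    """
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        # Keep the first hit seen with the highest bit score.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

For three datasets the same function would be run once per pair of species, for example S. schafta against S. uralensis, S. schafta against S. vulgaris and S. uralensis against S. vulgaris.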

Next Generation Sequencing - data handling and analyses (2 ECTS)

Course schedule
Suggested reading
Find your way around

This is the unofficial web page for the PhD course "Next Generation Sequencing - data handling and analyses (2 ECTS)" arranged by University of Gothenburg (Life Sciences) and ForBio - the Research School in Biosystematics. Here is where updated information about the course will be posted. The official course web page can be found here.

Lab Notebook - Toc75

Lab Notebook - Beagle optimiser

Subscribe to Mats Töpel RSS