PhD Student · Bioinformatics · Charles University

Vít Škrhák

I work on machine learning methods for protein–ligand binding site prediction, with special focus on protein language models.

Member of the Structural Bioinformatics Group, supervised by doc. David Hoksza.

Research Interests

Protein language models: using pLMs (ESM-2, ProtTrans) as sequence encoders for residue-level prediction tasks.
Cryptic binding site prediction: detecting hidden (cryptic) pockets that are absent in apo (ligand-free) structures and assessing their mechanisms.
pLM interpretability: probing what pLMs encode about protein structure and binding, using attention analysis and feature attribution.
Protein conformation analysis: computational analysis of structural flexibility in the context of pocket exposure and crypticity.
Large-scale protein databases: querying and mining PDB and UniProt; conformation search and dataset curation pipelines.

Open Student Topics

I supervise BSc and MSc theses. Contact me if you are interested in any of the topics below or have a related idea.

1. Cryptic Binding Site Prediction via Conformational Sampling

Predict cryptic binding sites by generating multiple protein conformations with a deep-learning-based conformational sampler, then running a structure-based predictor (P2Rank) on every conformation in the ensemble. Aggregate the pocket predictions across the ensemble to identify cryptic sites.

BSc · conformational sampling · cryptic sites
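
The aggregation step of topic 1 could look roughly like the sketch below. It is only an illustration under assumptions: the per-residue score dictionaries, the 0.5 score threshold, and the minimum fraction are hypothetical choices, and parsing real P2Rank output into this format is not shown.

```python
from collections import defaultdict


def aggregate_ensemble(per_conformation_scores, score_threshold=0.5, min_fraction=0.2):
    """Fraction of conformations in which each residue is predicted as pocket.

    per_conformation_scores: list of dicts mapping residue id -> pocket score
    (a hypothetical format; predictor output would have to be parsed into it).
    Only residues flagged in at least `min_fraction` of conformations are kept.
    """
    if not per_conformation_scores:
        return {}
    hits = defaultdict(int)
    for scores in per_conformation_scores:
        for residue, score in scores.items():
            if score >= score_threshold:
                hits[residue] += 1
    n = len(per_conformation_scores)
    return {res: count / n for res, count in hits.items() if count / n >= min_fraction}


def cryptic_candidates(ensemble_fractions, reference_scores, score_threshold=0.5):
    """Residues flagged in the ensemble but not in the reference (e.g. apo) structure."""
    return sorted(
        res for res in ensemble_fractions
        if reference_scores.get(res, 0.0) < score_threshold
    )


# Toy scores, made up for illustration only.
ensemble = [
    {"A45": 0.9, "A46": 0.1, "A80": 0.7},
    {"A45": 0.8, "A46": 0.6, "A80": 0.2},
    {"A45": 0.7, "A46": 0.7, "A80": 0.1},
]
apo = {"A45": 0.9, "A46": 0.1, "A80": 0.1}
fractions = aggregate_ensemble(ensemble, min_fraction=0.5)
print(cryptic_candidates(fractions, apo))  # ['A46']
```

Counting how often a residue is flagged is one simple aggregation rule; taking the maximum score or clustering pocket centers across the ensemble would be equally valid starting points for a thesis.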
2. Pocketome Analysis

Using LIGYSIS predictions, analyze which binding sites are well or poorly predicted by current models. Investigate whether failure cases correlate with ligand type (e.g. ions, ATP), pocket flexibility, or other biases from the training data.

MSc · pocket analysis · benchmark
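
A first pass at the failure analysis in topic 2 could be a grouped recall, as in the minimal sketch below. The record fields (`ligand_type`, `predicted`) and the toy records are hypothetical and do not reflect real LIGYSIS data or results.

```python
from collections import defaultdict


def recall_by_group(sites, key):
    """Fraction of true binding sites that were recovered, grouped by site[key]."""
    found = defaultdict(int)
    total = defaultdict(int)
    for site in sites:
        total[site[key]] += 1
        found[site[key]] += int(site["predicted"])
    return {group: found[group] / total[group] for group in total}


# Toy records, made up for illustration (not real benchmark results).
sites = [
    {"ligand_type": "ion", "predicted": False},
    {"ligand_type": "ion", "predicted": True},
    {"ligand_type": "nucleotide", "predicted": True},
    {"ligand_type": "nucleotide", "predicted": True},
]
print(recall_by_group(sites, "ligand_type"))  # {'ion': 0.5, 'nucleotide': 1.0}
```

The same function can be reused with any other categorical annotation, for example a binned flexibility measure, by passing a different `key`.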
3. Expanding CryptoBench with Crystallographic Heterogeneity

Inspired by the finding that unrefined ligand occupancy masks binding site heterogeneity in PDB structures, develop a pipeline to identify and incorporate previously overlooked cryptic sites into our dataset of cryptic binding sites.

MSc · PDB · crystallography · dataset curation
4. AF2Bind vs. AlphaFold3 for Binding Site Prediction

AF2Bind is a recently published binding site prediction method that leverages AlphaFold2 weights. The goal is to compare it against AlphaFold3 on PDB holo structures: strip the ligand, run AF3 (sequence + ligand → complex) and AF2Bind independently, and evaluate whether the binding pose predicted by AF3 coincides with the pocket predicted by AF2Bind.

BSc · AlphaFold3 · AF2Bind · binding site prediction
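
One possible way to quantify whether the two predictions in topic 4 coincide is sketched below. It assumes that residues contacting the AF3-placed ligand are defined by a 4 Å distance cutoff and that the AF2Bind output has already been thresholded into a set of residue ids. Both are illustrative choices, and reading coordinates from structure files is not shown.

```python
import math


def contact_residues(residue_atoms, ligand_atoms, cutoff=4.0):
    """Residues with any atom within `cutoff` angstroms of any ligand atom.

    residue_atoms: dict mapping residue id -> list of (x, y, z) coordinates
    ligand_atoms: list of (x, y, z) coordinates of the placed ligand
    """
    contacts = set()
    for residue, atoms in residue_atoms.items():
        if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand_atoms):
            contacts.add(residue)
    return contacts


def jaccard(a, b):
    """Jaccard overlap of two residue sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy coordinates and residue ids, made up for illustration only.
protein = {"A10": [(0.0, 0.0, 0.0)], "A11": [(3.0, 0.0, 0.0)], "A50": [(20.0, 0.0, 0.0)]}
ligand = [(1.0, 0.0, 0.0)]
pose_pocket = contact_residues(protein, ligand)  # residues near the placed ligand
predicted_pocket = {"A11", "A12"}  # hypothetical thresholded per-residue prediction
print(sorted(pose_pocket), jaccard(pose_pocket, predicted_pocket))
```

A set overlap such as Jaccard is only one candidate metric; a distance between the ligand centroid and the predicted pocket center would be a reasonable alternative.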

Publications

CryptoBench: cryptic protein–ligand binding sites dataset and benchmark
Vít Škrhák, Marian Novotný, Christos P. Feidakis, Radoslav Krivák, David Hoksza
Bioinformatics · 2025
Seq2Pocket: Augmenting protein language models for spatially consistent binding site prediction
Vít Škrhák, Lukáš Polák, Marian Novotný, David Hoksza
bioRxiv preprint · 2026

Full list on Google Scholar.