PhD Student · Bioinformatics · Charles University

Vít Škrhák

I work on machine learning methods for protein–ligand binding site prediction, with a special focus on protein language models.

Member of the Structural Bioinformatics Group, supervised by doc. David Hoksza.

GitHub · Google Scholar · LinkedIn · ORCID

Research Interests

Protein language models

Using pLMs (ESM-2, ProtTrans) as sequence encoders for residue-level prediction tasks.

Cryptic binding site prediction

Detecting hidden (cryptic) pockets that are absent in apo (ligand-free) structures and assessing their opening mechanisms.

pLM interpretability

Probing what pLMs encode about protein structure and binding; attention analysis and feature attribution.

Protein conformation analysis

Computational analysis of structural flexibility in the context of pocket exposure and crypticity.

Large-scale protein databases

Querying and mining PDB and UniProt; conformation search and dataset curation pipelines.

Open Student Topics

I supervise BSc and MSc theses. Contact me if you are interested in any of the topics below or have a related idea.

Cryptic Binding Site Prediction via Conformational Sampling

Predict cryptic binding sites by generating multiple protein conformations using a DL-based conformational sampler, then running a structure-based predictor (P2Rank) across the ensemble. Aggregate pocket predictions to identify cryptic sites.

BSc · conformational sampling · cryptic sites
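
The aggregation step of this topic could look roughly like the sketch below. It is only an illustration under stated assumptions: each conformation's pocket prediction is assumed to be already parsed into a set of residue IDs (parsing of the predictor's output is not shown), and the function names, the `min_fraction` threshold, and the frequency-based voting rule are hypothetical choices, not a fixed design.

```python
from collections import defaultdict


def aggregate_pocket_residues(ensemble_predictions, min_fraction=0.3):
    """Map each residue to the fraction of conformations in which it was
    predicted as pocket-lining, keeping only residues at or above
    `min_fraction`.

    ensemble_predictions: list of sets of residue IDs, one per conformation
    (assumed input format, hypothetical).
    """
    if not ensemble_predictions:
        return {}
    counts = defaultdict(int)
    for residues in ensemble_predictions:
        for res in residues:
            counts[res] += 1
    n = len(ensemble_predictions)
    return {res: c / n for res, c in counts.items() if c / n >= min_fraction}


def cryptic_candidates(input_prediction, ensemble_predictions, min_fraction=0.3):
    """Residues that recur across the sampled conformations but are absent
    from the prediction on the input structure."""
    support = aggregate_pocket_residues(ensemble_predictions, min_fraction)
    return {res: frac for res, frac in support.items() if res not in input_prediction}


# Toy example with made-up residue IDs.
input_pred = {"A:10", "A:11"}
ensemble = [
    {"A:10", "A:11", "A:45"},
    {"A:10", "A:45", "A:46"},
    {"A:11", "A:45"},
    {"A:10", "A:46"},
]
print(cryptic_candidates(input_pred, ensemble, min_fraction=0.5))
```

Frequency voting is only one possible aggregation rule; taking the maximum per-residue score across the ensemble, or clustering pocket centers, would be equally valid starting points.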

Pocketome Analysis

Using LIGYSIS predictions, analyze which binding sites are well or poorly predicted by current models. Investigate whether failure cases correlate with ligand type (e.g. ions, ATP), pocket flexibility, or other biases from the training data.

MSc · pocket analysis · benchmark

Expanding CryptoBench with Crystallographic Heterogeneity

Inspired by the finding that unrefined ligand occupancy masks binding site heterogeneity in PDB structures, develop a pipeline to identify and incorporate previously overlooked cryptic sites into our dataset of cryptic binding sites.

MSc · PDB · crystallography · dataset curation

AF2Bind vs. AlphaFold3 for Binding Site Prediction

AF2Bind is a recently published binding site prediction method that leverages AlphaFold2 weights. The goal is to compare it against AlphaFold3 on PDB holo structures: strip the ligand, run AF3 (sequence + ligand → complex) and AF2Bind independently, and evaluate whether AF3's predicted binding pose coincides with the AF2Bind pocket.

BSc · AlphaFold3 · AF2Bind · binding site prediction
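
One possible way to quantify whether the two predictions coincide is sketched below. Everything in it is an assumption made for illustration: the AF3 complex is assumed to be already parsed into per-residue atom coordinates and ligand atom coordinates, the AF2Bind output is assumed to be a per-residue score dictionary, and the 4 Å contact cutoff, the top-k selection, and the Jaccard index are hypothetical choices rather than an established protocol.

```python
import math


def contact_residues(residue_coords, ligand_coords, cutoff=4.0):
    """Residues with any atom within `cutoff` angstroms of any ligand atom.

    residue_coords: dict residue_id -> list of (x, y, z) atom coordinates
    ligand_coords: list of (x, y, z) ligand atom coordinates
    """
    return {
        res
        for res, atoms in residue_coords.items()
        if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand_coords)
    }


def top_k_residues(scores, k):
    """The k residues with the highest per-residue binding score."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def jaccard(a, b):
    """Jaccard index of two residue sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy example with made-up coordinates and scores.
residue_coords = {10: [(0.0, 0.0, 0.0)], 11: [(3.0, 0.0, 0.0)], 50: [(20.0, 0.0, 0.0)]}
ligand_coords = [(1.0, 0.0, 0.0)]
scores = {10: 0.9, 11: 0.2, 50: 0.8}

pose_site = contact_residues(residue_coords, ligand_coords)
score_site = top_k_residues(scores, k=2)
print(pose_site, score_site, jaccard(pose_site, score_site))
```

A residue-level overlap like this ignores pocket geometry; a distance between the ligand centroid and the center of the top-scoring residues would be a complementary measure.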

Publications

CryptoBench: cryptic protein–ligand binding sites dataset and benchmark

Vít Škrhák, Marian Novotný, Christos P. Feidakis, Radoslav Krivák, David Hoksza

Bioinformatics · 2025

PDF · Code · DOI

Seq2Pocket: Augmenting protein language models for spatially consistent binding site prediction

Vít Škrhák, Lukáš Polák, Marian Novotný, David Hoksza

bioRxiv preprint · 2026

Preprint · Code

Full list on Google Scholar.