Katarína Grešová

Genomics and Deep Learning PhD student

About Me

Hello! My name is Katarína.

I like Computer Science, Biology and everything in between. I am always excited about new challenges, but for now I am focusing on Deep Learning and Genomics.

I have a Bachelor’s degree in Information Technology and a Master’s degree in Bioinformatics and Biocomputing, and I am currently pursuing doctoral studies in Deep Learning and Genomics.

I have also combined my studies with work in industry: I have 3+ years of experience as a Software Developer, working with Java and SQL, and 2 years of experience as a Machine Learning Specialist, having fun with Python, NLP and computer vision.

Education

Masaryk University

PhD Genomics and Proteomics

2020 - Present

The aim of the Genomics and Proteomics program is to train top-class specialists in these subjects. Students acquire extensive and in-depth knowledge about the structure and function of the genome at all basic levels of living systems. They deepen their knowledge and skills in basic biological disciplines, in biochemistry and proteomics, and in biophysics. In addition to the theoretical principles of the discipline, students also become closely acquainted with the basic and advanced methods used in various disciplines.

Research topic: Modeling Small RNA binding rules using Machine Learning

Brno University of Technology

MSc Bioinformatics and Biocomputing

2018 - 2020

Students of the branch acquire deeper knowledge of bioinformatics and natural computing. This gives them the knowledge and skills to analyse, design and verify solutions to problems involving biological databases, as well as to use, design, implement and accelerate algorithms for the analysis of biological data. They are able to apply natural computing algorithms (evolutionary algorithms, artificial neural networks, fuzzy systems, etc.) to complex design and optimization problems.

Master’s Thesis: Bacteria Classification into Taxonomic Categories Based on Properties of 16s rRNA

Brno University of Technology

BSc Information Technology

2015 - 2018

The aim of the Information Technology Bachelor Degree Programme is to prepare alumni who are able to act as designers, programmers, and servicemen of computer systems, digital systems, computer networks, computer-based systems and programmers and administrators of database systems and information systems.

Bachelor’s Thesis: Searching Semantically Annotated Texts

Teaching

LF:DSIB01 Introduction to Bioinformatics (Autumn 2021)

https://is.muni.cz/course/med/autumn2021/DSIB01

Teaching several practical lessons focused on git, command line and NGS data preprocessing. Materials: https://katarinagresova.github.io/DSIB01_2021/.

Experience

University of Malta

Visiting Researcher

September 2023 - Present

https://www.um.edu.mt/projects/biogemt/

The ERA Chair in Bioinformatics for Genomics in Malta (BioGeMT) is a project that aims to bring structural change to the University of Malta in the field of Bioinformatics.

python, jupyter, tensorflow, pytorch, bioinformatics, machine learning, deep learning, genomics

Masaryk University

Researcher

July 2023 - Present

Working as a Researcher at the Intelligent Systems for Complex Data Research Group, Faculty of Informatics, Masaryk University.

python, jupyter, bioinformatics, Deep Learning, protein structures, single-cell sequencing analysis

Melown Technologies

Machine Learning Engineer

January 2023 - August 2023

https://www.melowntech.com/

The future of the society and your company is in the data. Start using them to your advantage and know their value. We will teach you to work with them and make the right decisions.

Working as a Machine Learning Engineer in a Machine Learning team. As a member of a dedicated team, I actively contribute to the exploration and mastery of a 3D reality-capture system, aimed at discovering the underlying meaning of pixel values. Notable projects within this context include developing robust techniques for extracting text from images and seamlessly merging results obtained from multiple viewpoints. Additionally, I have played a key role in enhancing the realism of LOD2 buildings by employing advanced adjustment techniques. Furthermore, I have been actively involved in preprocessing training data to optimize instance semantic segmentation algorithms, ultimately improving the efficiency of the models.

python, jupyter, C++, openCV, bash, computer vision, 3D

CEITEC MU - Central European Institute of Technology

Research Specialist

September 2020 - August

https://www.ceitec.eu/rbp-bioinformatics/rg281

Central European Institute of Technology is a young and dynamic center of scientific excellence, which strives to fulfill its mission in advancing knowledge in the field of life and materials sciences, based on the interconnection of the scientific capacities of its six partners.

Working as a Research Specialist in the Panagiotis Alexiou Research Group.

As a PhD student, my research journey has been driven by a passion for staying at the forefront of genomics and machine learning advancements, particularly in the domain of miRNA targeting and deep learning. A notable accomplishment during my doctoral studies was the development of a robust deep learning model for precise classification of miRNA targets, leveraging CLASH data. This achievement required extensive effort in both model development and interpretation, leading to novel biological insights learned by the model. Furthermore, I have actively contributed to the field by curating and collecting diverse genomic datasets, facilitating the development and benchmarking of machine and deep learning models. These datasets serve as invaluable resources for researchers in their pursuit of advancing the field of genomics through the power of artificial intelligence.

python, tensorflow, pytorch, fastai, deep learning, NLP, genomics

Proficio Marketing

Machine Learning Specialist

December 2021 - September 2022

https://proficiodigital.com/

Proficio Marketing is a digital full service agency that helps clients with performance campaigns, data and business analytics, brand building and social network management and more.

Working as a Machine Learning Specialist in Databy (formerly the Analytics team in Proficio).

In my position at Proficio Marketing, my primary responsibility was to identify opportunities for applying machine/deep learning techniques and develop proof of concept solutions. I focused on leveraging customer data to deliver enhanced value. This involved a systematic approach, beginning with a comprehensive understanding of the business problem at hand. I would then explore and cleanse the relevant data, carefully selecting the most suitable machine learning algorithms. Implementing these algorithms, I successfully generated insightful results and delivered compelling presentations to stakeholders.

python, pandas, numpy, matplotlib, scikit-learn, jupyter, tensorflow, NLP, time-series

Oracle NetSuite

Software Developer

July 2020 - September 2021

https://www.netsuite.com/

NetSuite is the world’s leading provider of cloud-based business management software. NetSuite helps companies manage core business processes with a single, fully integrated system covering ERP/financials, CRM, ecommerce, inventory and more. NetSuite transforms how businesses operate so they can achieve their business vision.

Working as a Software Developer in the SuiteAnalytics Connect team.

In my role, I actively contributed to the development of features, wrote comprehensive unit tests, and deployed tools for the NetSuite SuiteAnalytics Connect Service. This critical service enables users to efficiently archive, analyze, and report on NetSuite data using third-party tools or custom-built applications across various devices.

Java, SQL, ODBC, Perforce

Tieto

Software Developer

February 2018 - June 2020

https://www.tietoevry.com/

Tieto creates digital advantage for businesses and society. They are a leading digital services and software company with local presence and global capabilities.

Working as a Software Developer in the Nemesis team.

Starting as an intern, I swiftly progressed to a permanent role within a data center project. In this capacity, I provided crucial support to developers and testers, effectively enhancing the testing environment. I took on the responsibility of developing a comprehensive testing framework, ensuring streamlined and efficient testing processes. Additionally, I played a pivotal role in establishing and maintaining continuous integration practices, contributing to the overall quality and efficiency of the project.

Java, bash, python, Jenkins, TestNG, CI

Brno University of Technology

Research Scholar

February 2016 - September 2018

https://www.fit.vut.cz/.en

The Faculty of Information Technology is one of the leading institutions in the Czech Republic providing education and research in the field of information technology.

Working as a Research Scholar in the Knowledge Technology Research Group.

Starting from my second semester of studies and continuing until the completion of my awarded Bachelor thesis, I dedicated my efforts to the Corpora Processing Software project, emphasizing semantic search. Throughout this project, I focused on enhancing the user experience of the search engine and implementing advanced result filtering capabilities based on semantic annotations. By leveraging my skills, I successfully improved the overall functionality and usability of the software, contributing to its effectiveness in delivering precise and contextually relevant search results.

Java, Vaadin Framework, bash, python, HTML

Selected Projects

Informal notes from semi-formal biology discussions.

Here you can find notes from the Biology Crash Course held in the RBP Bioinformatics lab. They are far from a comprehensive explanation, but they might be a good starting point.

Create all the functional element datasets you ever wanted.

Command line tool for preparing datasets of functional element sequences downloaded from Ensembl. It can download selected data from Ensembl, run a quality check, split the data into train, validation and test sets, and generate a random negative example for each positive one.
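The last two steps can be illustrated with a minimal Python sketch. This is not the tool's actual code: the function names (`random_negative`, `split_dataset`, `build_dataset`), the 80/10/10 split, and the choice to draw each negative as a uniformly random nucleotide string of the same length as its positive are all assumptions made for illustration.

```python
import random


def random_negative(positive: str, rng: random.Random) -> str:
    """Return a random nucleotide string with the same length as `positive`.

    Assumption: negatives are uniformly random sequences; a real tool may
    instead sample them from the genome.
    """
    return "".join(rng.choice("ACGT") for _ in positive)


def split_dataset(items, fractions=(0.8, 0.1, 0.1), seed=42):
    """Shuffle `items` and split them into train/validation/test lists."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_valid = int(len(shuffled) * fractions[1])
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_valid],
        shuffled[n_train + n_valid:],  # the remainder goes to the test set
    )


def build_dataset(positives, seed=42):
    """Pair every positive (label 1) with one random negative (label 0), then split."""
    rng = random.Random(seed)
    labeled = []
    for seq in positives:
        labeled.append((seq, 1))
        labeled.append((random_negative(seq, rng), 0))
    return split_dataset(labeled, seed=seed)
```

Calling `build_dataset(list_of_sequences)` returns three lists of `(sequence, label)` pairs. Fixing the seed keeps the split reproducible between runs.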

Collection of easily accessible, well curated and diverse genomics classification datasets together with current state of the art metrics.

Recently, deep neural networks have been successfully applied in many biological fields. In 2020, the deep learning model AlphaFold won the protein folding competition with predicted structures within the error tolerance of experimental methods. However, this solution to the most prominent bioinformatic challenge of the past 50 years was possible only thanks to a carefully curated benchmark of experimentally determined protein structures. In Genomics, we have similar challenges (annotation of genomes and identification of functional elements), but we currently lack benchmarks similar to the protein folding competition. Therefore, we propose a collection of benchmark datasets for the classification of genomic sequences with an emphasis on transparency, reproducibility, comparability and ease of use. The datasets are open source and the process of their creation is fully documented. We provide an interface for the most commonly used deep learning libraries, an implementation of a simple neural network and a training framework that can be used as a starting point for future research. We believe that this work will reduce the overhead for researchers who want to apply their machine learning knowledge to the field of Genomics, and that it will create healthy competition leading to new discoveries.

Package for finding products across multiple sellers.

Primarily developed for the tire market, but it can be used for any product matching task where the data can be provided in the specified format.

Fine-tuning DNABERT on miRNA target prediction data.

Neural poetry inspired by Emily Dickinson.

Recurrent neural network trained on poetry by Emily Dickinson. The result is available as a web application: https://generator-poezie.herokuapp.com/