Extract and process data from multiple possible source systems at St. Michael’s Hospital (e.g., Enterprise Data Warehouse (EDW), Picture Archiving and Communication System (PACS))
Use specialized software for large and complex data sets including tools for preprocessing, registering, and viewing imaging or waveform data.
Curate, validate, and preprocess large and complex data sets. This includes cleaning and merging data from multiple sources (e.g., PACS and the EDW), as well as understanding overall data quality.
Create pipelines to feed real-time imaging and/or waveform data to models
Conduct analyses using basic supervised and unsupervised machine learning methods (an asset), including linear regression, logistic regression, GLMs, penalized regression, SVMs, random forests, XGBoost, and neural networks
Perform descriptive and inferential statistical analysis in R, Python, and/or Julia. Write HTML/PDF/Microsoft Word reports summarizing the analysis with R Markdown, Quarto, or other notebooks
Validate generated output tables, listings, and figures to ensure the accuracy and reliability of analyses
Produce high-quality ad hoc and standardized reports customized per project, using R/Python/Julia procedures, tailored to different end-users (e.g., clinicians, researchers, senior management, and hospital executives)
Develop efficient programs, algorithms, or systems to reduce programming time of standardized data analyses and reports
Assist with writing project protocols, including: data extraction, cleaning, predictive modelling, and validation plans
Qualifications
A Master's degree in a quantitative, health/clinical, or informatics discipline, with at least 1 year of relevant work experience
Expert-level programming in either R (preferred) or Python is required. Experience with Julia is a plus.
Experience working with and manipulating large datasets
Familiarity with SQL and databases such as Netezza and Oracle
Familiarity with version control (Git, GitHub, GitLab).
Experience writing reproducible reports using R Markdown and/or Jupyter Notebooks
A deep understanding of, and experience with, regression and classification tasks using generalized linear models, penalized regression, decision tree learning, random forests, and support vector machines
Experience with time series modelling and with waveform/signal processing and modelling is an asset.
Experience with graphical methods would be an asset
Experience with Bayesian methods would be an asset
Experience with ML/ModelOps would be an asset
Experience with medical imaging would be an asset
Experience with specialized platforms (e.g., XNAT, MONAI, MD.ai) would be an asset
Familiarity with pretrained models for imaging or waveform data would be an asset
Experience in healthcare is an asset
Demonstrated ability to manage statistical programming activities
Proficient with MS Office software (Word, Excel, PowerPoint, Outlook, etc.)
To be considered for this position, please include links to any public-facing code and tools you have created (e.g., personal or institutional GitHub repositories, your website).
We will also consider other degrees and disciplines in conjunction with applied experience