5 Day International Virtual Workshop on Data Analysis and Machine Learning in Bioscience Research using Programming in R | 30th April to 4th May 2025

5 Day International Virtual Workshop on Data Analysis and Machine Learning in Bioscience Research using Programming in R

International Workshop Series

Quaxon Bio & IT Solutions, India is conducting its 53rd international virtual workshop on Data Analysis and Machine Learning in Bioscience Research using Programming in R.

Date: 30th April to 4th May 2025

Early Bird Registration closes on: 27th April 2025

Last Date: 28th April (registration will close earlier if all seats are filled)

Time: 7 PM to 9 PM IST

Platform: Google Meet

Eligibility:

Students, research scholars, and faculty from all bioscience disciplines (botany, zoology, microbiology, biotechnology, bioinformatics), clinical research fellows, and participants from other allied fields who have basic computer skills are eligible to participate.

Visit Website post: https://qbiits.org/data-analysis-and-machine-learning-using-r-bioscience/  

ABOUT THE THEME TOPIC        

R programming for Data Analysis and Machine Learning in Bioscience Research

R offers broad capabilities for statistical computing, visualization, and reproducible research. Its open-source nature and extensive library support make it a versatile tool for handling complex biological datasets, such as genomic sequences, proteomics data, and clinical trial data. Beyond traditional data analysis, R provides frameworks for implementing machine learning techniques (such as classification, clustering, and predictive modeling), allowing researchers to uncover patterns, generate insights, and make data-driven predictions. Whether you are conducting exploratory data analysis, building machine learning models, or visualizing intricate trends in your data, R offers a robust and flexible platform. Learning R is not just a technical skill but an essential competency for bioscience researchers in today’s data-driven world, empowering them to make informed decisions and contribute to cutting-edge research.

About this Workshop

This 5-day hands-on workshop offers an immersive introduction to data analysis and machine learning tailored for bioscience research. With R programming at its core, participants will gain practical experience in applying key machine learning techniques (including classification, clustering, and predictive modeling) to real-world biological datasets. From genomic and proteomic data to clinical and experimental results, the workshop provides step-by-step guidance in using R’s libraries for statistical analysis, visualization, and model building. Each session is designed to provide one-to-one mentorship and personalized support, so that participants understand both the concepts and their applications. Whether you're exploring patterns in biological data or developing machine learning models, this workshop equips you with the skills to use R effectively in bioscience research.

Schedule

Date: 30 April to 4 May 2025

Time: 7 PM to 9 PM as per Indian Standard Time

Check the schedule in your time zone at https://savvytime.com/converter

Complete curriculum 

Day-1: Overview of R and RStudio

 
  • Data types and structures in R (vectors, lists, matrices, data frames)
  • Operators
  • Importing and exporting biological datasets from local disk or the internet (CSV, TSV, Excel, TXT)
  • Exploring datasets (summary(), str(), head(), tail())
  • Handling missing data (na.omit(), impute())
  • Installing & loading packages (tidyverse, ggplot2, Bioconductor packages)

Example Datasets:

  • Iris Dataset (Plant species dataset)
  • Gene Expression Data (CSV format)

Hands-on Exercises:

  • Loading and exploring datasets (head(), summary(), str())
  • Data filtering & subsetting (dplyr functions)
  • Handling missing values and outliers
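
As a rough idea of what these exercises can look like, here is a minimal sketch using only base R and the built-in iris dataset. The missing values are introduced artificially for illustration, and base `subset()` stands in for the dplyr functions named above; the workshop's own exercises may differ.

```r
# Load the built-in iris dataset (150 flowers, 4 measurements, 1 species column)
data(iris)

# First look at structure and contents
str(iris)
head(iris)
summary(iris)

# Introduce a few missing values purely for illustration
# (the real iris data has no missing values)
iris_na <- iris
iris_na$Sepal.Length[c(5, 20, 75)] <- NA

# Count missing values in each column
missing_per_column <- colSums(is.na(iris_na))

# Option 1: drop every row that contains a missing value
iris_complete <- na.omit(iris_na)

# Option 2: simple mean imputation of the affected column
iris_imputed <- iris_na
col_mean <- mean(iris_na$Sepal.Length, na.rm = TRUE)
iris_imputed$Sepal.Length[is.na(iris_imputed$Sepal.Length)] <- col_mean

# Filtering and subsetting: setosa flowers with sepal length above 5 cm
setosa_long <- subset(iris_complete, Species == "setosa" & Sepal.Length > 5)
```

Dropping rows is the simplest choice but discards data; mean imputation keeps every row at the cost of reducing the variance of the imputed column.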

Day-2: Data Manipulation

  • Using dplyr for filtering, selecting, and mutating data
  • Merging and reshaping datasets (tidyr package)
  • Working with categorical variables and factors
  • String manipulation using stringr
  • Date-time handling (lubridate)
  • Data cleaning and preprocessing (tidyverse, janitor)

Data Visualization and Exploratory Data Analysis (EDA)

  • Data visualization using ggplot2
  • Boxplots, scatter plots, violin plots
  • Heatmaps for gene expression data
  • Customizing plots (Themes, Labels, Colors)
  • Principal Component Analysis (PCA) for dimension reduction

Example Datasets:

  • Public RNA-Seq Data (Processed using Bioconductor)
  • Gene Expression Dataset (CSV)

Data normalization and transformation

  • Identifying differentially expressed genes
  • Creating volcano plots for visualization

Example Datasets:

  • Iris Dataset (Species distribution visualization)
  • Microarray Gene Expression Data (Heatmap & PCA)

Hands-on Exercises:

  • Creating scatter plots for different species in the Iris dataset
  • Generating a heatmap of gene expression levels
  • Performing PCA and visualizing clusters
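
The PCA exercise could be sketched roughly as follows, assuming base R's `prcomp()` and base graphics rather than ggplot2. The choice to scale each variable to unit variance is an assumption made for this illustration.

```r
# PCA on the four numeric measurements of the built-in iris dataset
data(iris)
measurements <- iris[, 1:4]

# Center and scale so that each variable contributes equally
pca <- prcomp(measurements, center = TRUE, scale. = TRUE)

# Proportion of total variance explained by each principal component
variance_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Scores on the first two components, labelled by species
scores <- as.data.frame(pca$x[, 1:2])
scores$Species <- iris$Species

# Scatter plot of PC1 vs PC2, one colour per species
plot(scores$PC1, scores$PC2,
     col = as.integer(scores$Species), pch = 19,
     xlab = "PC1", ylab = "PC2", main = "PCA of iris measurements")
legend("topright", legend = levels(scores$Species),
       col = seq_along(levels(scores$Species)), pch = 19)
```

For a gene expression matrix the same call applies, with samples in rows and genes in columns.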

Day-3: Statistical Analysis and Hypothesis Testing

Topics Covered:

  •  Descriptive statistics (Mean, Median, Variance, Standard Deviation)
  • Hypothesis testing (t-test, ANOVA, chi-square test)
  • Correlation and regression analysis
  • Non-parametric tests (Wilcoxon, Kruskal-Wallis)
  • Biological significance vs statistical significance

Example Datasets:

  • Iris Dataset (ANOVA for species differences)
  • Gene Expression Data (t-test for differential expression)

 Hands-on Exercises:

  • Running a t-test to compare gene expression between conditions
  • Performing ANOVA to check species variation
  • Pearson & Spearman correlation on biological variables
  • Plotting statistical analysis results using various plot types
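
The tests listed above might look like this in base R. This is a sketch on the iris dataset only; the choice of variables and species pairs is an assumption for illustration, and a gene expression comparison would follow the same pattern with expression values and condition labels.

```r
data(iris)

# Welch two-sample t-test: petal length, setosa vs versicolor
two_species <- droplevels(subset(iris, Species %in% c("setosa", "versicolor")))
t_result <- t.test(Petal.Length ~ Species, data = two_species)

# One-way ANOVA: does mean sepal length differ among the three species?
anova_fit <- aov(Sepal.Length ~ Species, data = iris)
anova_table <- summary(anova_fit)

# Non-parametric alternative to the one-way ANOVA
kw_result <- kruskal.test(Sepal.Length ~ Species, data = iris)

# Pearson and Spearman correlation between petal length and petal width
pearson <- cor.test(iris$Petal.Length, iris$Petal.Width, method = "pearson")
spearman <- cor.test(iris$Petal.Length, iris$Petal.Width,
                     method = "spearman", exact = FALSE)

# Boxplot to accompany the ANOVA result
boxplot(Sepal.Length ~ Species, data = iris,
        ylab = "Sepal length (cm)", main = "Sepal length by species")
```

A small p-value only indicates statistical significance; whether the size of the difference matters biologically has to be judged separately from the effect size.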

Day-4: Machine Learning Basics

Fundamentals of Machine Learning with R (Using caret)

Objective: Understand the basic principles of machine learning, data preprocessing, and preparing data for ML.

Introduction to Machine Learning Concepts

  • Supervised vs Unsupervised Learning
  • Types of ML algorithms
  • Real-life applications in biosciences (e.g., disease prediction, gene expression classification)

Getting Started with the caret Package

  • Overview of caret (Classification and Regression Training)
  • Loading caret and required libraries
  • Structure of a typical ML pipeline using caret

Data Preprocessing in ML

  • Importing biological datasets (Iris, Cancer biomarker data, gene expression matrix)
  • Handling missing values
  • Feature scaling and normalization
  • Encoding categorical variables

Hands-on Example

  • Dataset: Iris or Diabetes Dataset
  • Task: Prepare data for classification (Step-by-step)
  • Visual exploration of features (ggplot2 / base R)
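
A minimal sketch of the data preparation step, written in base R so that it runs without extra packages. The workshop itself uses caret for this; the 80/20 split ratio and the random seed here are assumptions chosen for illustration.

```r
data(iris)
set.seed(42)  # arbitrary seed, for reproducibility only

# Stratified 80/20 split: sample 80% of the rows within each species
rows_by_species <- split(seq_len(nrow(iris)), iris$Species)
train_idx <- unlist(
  lapply(rows_by_species, function(idx) sample(idx, size = round(0.8 * length(idx)))),
  use.names = FALSE
)
train <- iris[train_idx, ]
test <- iris[-train_idx, ]

# Feature scaling: compute means and standard deviations on the training set
# only, then apply the same transformation to the test set
feature_cols <- names(iris)[1:4]
train_means <- colMeans(train[, feature_cols])
train_sds <- apply(train[, feature_cols], 2, sd)

train_scaled <- train
test_scaled <- test
train_scaled[, feature_cols] <- as.data.frame(
  scale(train[, feature_cols], center = train_means, scale = train_sds))
test_scaled[, feature_cols] <- as.data.frame(
  scale(test[, feature_cols], center = train_means, scale = train_sds))

# Encoding a categorical variable as one indicator column per level
species_dummies <- model.matrix(~ Species - 1, data = train)
```

Scaling with training-set statistics only keeps information from the test set out of the model, which is why the test set is transformed with `train_means` and `train_sds` rather than its own values.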

Day-5: ML Modeling, Evaluation & Bioscience Applications

Objective: Apply machine learning models to classify and cluster biological data, and evaluate model performance.

Classification Models

  • Decision Tree (rpart)
  • Random Forest (randomForest / caret)
  • How to train, test, and interpret classification models

Clustering Methods

  • K-Means Clustering
  • Hierarchical Clustering
  • Applying clustering to gene expression data (or the data prepared on Day 4)

Model Evaluation Metrics

  • Accuracy, Confusion Matrix
  • ROC Curve and AUC (caret + pROC)
  • Applying cross-validation to estimate prediction accuracy more reliably (Iris dataset using the rf method)
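
The clustering and evaluation steps can be sketched as below using base R only. Note the swap: instead of the rpart, randomForest, or caret models named above, this illustration uses a hand-written nearest-centroid classifier (the helper `nearest_centroid_predict` is hypothetical) so that the cross-validation loop runs without extra packages.

```r
data(iris)
set.seed(123)  # arbitrary seed, for reproducibility only

# --- Clustering on scaled features ---
features_scaled <- scale(iris[, 1:4])
km <- kmeans(features_scaled, centers = 3, nstart = 25)
hc <- hclust(dist(features_scaled), method = "complete")
hc_groups <- cutree(hc, k = 3)

# Compare cluster membership with the known species
km_table <- table(Cluster = km$cluster, Species = iris$Species)
hc_table <- table(Cluster = hc_groups, Species = iris$Species)

# --- 5-fold cross-validation of a nearest-centroid classifier ---
# Hypothetical helper: assign each test row to the class with the closest mean
nearest_centroid_predict <- function(train_x, train_y, test_x) {
  # Rows = classes, columns = features
  centroids <- apply(train_x, 2, function(col) tapply(col, train_y, mean))
  # Squared Euclidean distance from every test row to every class centroid
  dists <- sapply(rownames(centroids), function(cls) {
    rowSums(sweep(test_x, 2, centroids[cls, ])^2)
  })
  factor(colnames(dists)[max.col(-dists, ties.method = "first")],
         levels = levels(train_y))
}

x <- as.matrix(iris[, 1:4])
k <- 5
fold_id <- sample(rep(seq_len(k), length.out = nrow(iris)))
predicted <- factor(rep(NA, nrow(iris)), levels = levels(iris$Species))

for (fold in seq_len(k)) {
  test_rows <- fold_id == fold
  predicted[test_rows] <- nearest_centroid_predict(
    x[!test_rows, , drop = FALSE], iris$Species[!test_rows],
    x[test_rows, , drop = FALSE])
}

# Confusion matrix and overall accuracy across all held-out folds
confusion <- table(Predicted = predicted, Actual = iris$Species)
cv_accuracy <- sum(diag(confusion)) / sum(confusion)
```

Every row is predicted exactly once while held out of training, so `cv_accuracy` estimates performance on unseen data rather than on the data the model was fitted to.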

Case Studies used in this workshop

Exploratory data analysis, statistical analysis/inference, and machine learning will be implemented on the following datasets (five or more):

  • Group cancer cell lines based on gene expression data (NCI60)
  • Predict species based on sepal and petal measurements
  • Predict blood-brain barrier permeability of drug-like compounds from molecular descriptors
  • Classify leukemia type (ALL or AML) based on gene expression (Golub dataset)
  • Identify biomarker genes for cancer based on gene expression data
  • Classify vegetable oil samples (e.g., pumpkin, sunflower) from fatty acid profiles (Brodnjak-Voncina et al., 2005)
  • Statistical analysis of Orange tree growth data
  • The CO₂ uptake rate in grass plants under various conditions
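
For the last two case studies, a minimal sketch is possible with datasets that ship with R. It assumes that the built-in `Orange` and `CO2` datasets are the ones meant above, which the announcement does not state explicitly; the models chosen are likewise only illustrative.

```r
# Built-in datasets from the 'datasets' package
data(Orange)  # growth of orange trees: Tree, age (days), circumference (mm)
data(CO2)     # CO2 uptake in grass plants: Plant, Type, Treatment, conc, uptake

# Orange trees: simple linear regression of trunk circumference on age
growth_fit <- lm(circumference ~ age, data = Orange)
growth_slope <- coef(growth_fit)[["age"]]
summary(growth_fit)

# CO2 uptake: mean uptake for each combination of plant origin and treatment
uptake_means <- aggregate(uptake ~ Type + Treatment, data = CO2, FUN = mean)

# Two-way ANOVA with interaction between origin and treatment
uptake_aov <- aov(uptake ~ Type * Treatment, data = CO2)
summary(uptake_aov)
```

Both datasets contain repeated measurements on the same tree or plant, so a fuller analysis would account for that grouping; the models here ignore it for brevity.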

Steps to Participate 

Step-1: Pay the participation fee as per your category 

Category-wise participation fee (on or before 27th April 2025):

Category                          For Indians    For International Participants
Students                          Rs. 1000/-     $25
Research Scholar/PhD Scholar      Rs. 1200/-     $30
Faculty/PDF/Other Job holders     Rs. 1400/-     $35

Call/WhatsApp +91-9692521875 for any information.

Payment Link for Indians: https://rzp.io/rzp/FZipj4P 

Payment link for International Participants: https://www.paypal.com/paypalme/Workshop334 

Step-2: After payment, fill in the registration form at the link below

https://docs.google.com/forms/d/1GMIaoHHHhQW1-fnyC6Ki8rdtk9ea7oJOfYTw8YldNRA/edit 

**Visit the main post https://qbiits.org/data-analysis-and-machine-learning-using-r-bioscience/ for other details and the terms and conditions.

About us   

Quaxon Bio & IT Solutions is a fast-growing EduTech start-up registered with the Ministry of Micro, Small and Medium Enterprises, Government of India. Our mission is to act as an industry-academia interface, to excel in knowledge transformation, and to produce a highly skilled workforce equipped with next-generation technology. We deliver high-demand skills through virtual workshops on bioinformatics and data science with international participants, facilitating the exchange of cutting-edge research and ideas around the globe.

www.qbiits.org 

Contact us for any queries 

WhatsApp/Call +919692521875 or write to us at support@qbiits.org


