Technical Documentation

Comprehensive API documentation and technical references for PreditX

Developer Resources

Overview

PreditX is a machine learning-based virtual screening platform that trains supervised models on ChEMBL bioactivity data to predict molecular activity against protein targets.

Data Source

ChEMBL & UniProt

ML Models

XGBoost, Neural Networks, Random Forest, SVM

Molecular Features

Morgan Fingerprints, Mordred Descriptors

Core Components

Pipeline Architecture

Module	Description	Key Functions
`pipeline_part1.py`	Data retrieval and cleaning	`retrieve_and_clean_data()`, `transform_and_label()`
`pipeline_part2.py`	Feature generation and model training	`calculate_descriptors_and_features()`, `train_and_evaluate_models()`
`pipeline_part3.py`	External predictions	`predict_with_model()`, `process_external_data()`
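The table names `transform_and_label()` but does not say how labels are derived. As an illustration only, the sketch below uses a common convention for ChEMBL potency data: convert IC50 values in nM to pIC50 and apply a cutoff. The helper names (`to_pic50`, `label_activity`), the 6.0 threshold, and the sample records are all hypothetical and are not taken from the PreditX code.

```python
import math

# Hypothetical cutoff: pIC50 >= 6.0 (IC50 <= 1 uM) is treated as active.
ACTIVE_THRESHOLD = 6.0


def to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 M, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm).
    return 9.0 - math.log10(ic50_nm)


def label_activity(ic50_nm: float, threshold: float = ACTIVE_THRESHOLD) -> int:
    """Return 1 (active) if pIC50 meets the threshold, else 0 (inactive)."""
    return 1 if to_pic50(ic50_nm) >= threshold else 0


# Made-up example records: (compound name, IC50 in nM).
records = [("compound_a", 100.0), ("compound_b", 50000.0)]
labels = {name: label_activity(ic50) for name, ic50 in records}
print(labels)
```

A binary label of this kind is what a classifier such as Random Forest or XGBoost would be trained on. The actual cutoff and units used by PreditX may differ.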

Technologies & Dependencies

Backend

Flask 2.0.3
scikit-learn 1.2.2
XGBoost 1.6.2
RDKit 2022.9.3 (`rdkit-pypi` package)
Mordred 1.2.0
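To check that a local environment matches the backend versions listed above, a small script along these lines may help. It is a sketch, not part of PreditX: the version pins come from the list, but the PyPI distribution names (for example `rdkit-pypi` and `mordred`) and the `check_pins` helper are assumptions.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the backend list; distribution names are assumed.
PINNED = {
    "Flask": "2.0.3",
    "scikit-learn": "1.2.2",
    "xgboost": "1.6.2",
    "rdkit-pypi": "2022.9.3",
    "mordred": "1.2.0",
}


def check_pins(pins: dict) -> dict:
    """Compare installed distribution versions against the expected pins."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            report[name] = "missing"
            continue
        report[name] = "ok" if installed == expected else f"mismatch ({installed})"
    return report


for name, status in check_pins(PINNED).items():
    print(f"{name}: {status}")
```

Each package is reported as `ok`, `missing`, or `mismatch` together with the installed version.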

Cloud Services

Google Cloud Run
Firebase Authentication
Cloud Firestore
Cloud Storage (GCS)
Brevo SMTP

Ready to Get Started?

Check out our user guide and start your first virtual screening project.
