Technical Documentation

Comprehensive API documentation and technical references for PreditX

Developer Resources

Overview

PreditX is a machine-learning-based virtual screening platform. It trains models on ChEMBL bioactivity data to predict the activity of molecules against protein targets.


Data Source

ChEMBL & UniProt

ML Models

XGBoost, Neural Networks, Random Forest, SVM

Molecular Features

Morgan Fingerprints, Mordred Descriptors
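
As an illustration of the data source named above, ChEMBL exposes a public REST API whose activity resource can be filtered by target and measurement type. The sketch below only builds such a query URL and makes no network call. The helper name build_activity_query and the choice of CHEMBL203 (EGFR) as the target are assumptions made for this example; how PreditX itself queries ChEMBL is not described here.

```python
from urllib.parse import urlencode

# Root of the public ChEMBL "activity" resource (JSON output).
CHEMBL_ACTIVITY_URL = "https://www.ebi.ac.uk/chembl/api/data/activity.json"


def build_activity_query(target_chembl_id: str,
                         standard_type: str = "IC50",
                         limit: int = 100) -> str:
    """Return a ChEMBL activity URL filtered by target and measurement type.

    Hypothetical helper for illustration; not part of the PreditX codebase.
    """
    params = {
        "target_chembl_id": target_chembl_id,
        "standard_type": standard_type,
        "limit": limit,
    }
    return f"{CHEMBL_ACTIVITY_URL}?{urlencode(params)}"


# CHEMBL203 (EGFR) is used purely as an example target.
url = build_activity_query("CHEMBL203")
print(url)
```

The resulting URL can be fetched with any HTTP client; paging through larger result sets would additionally need the API's offset parameter.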

Core Components

Pipeline Architecture

Module             Description                            Key Functions
pipeline_part1.py  Data retrieval and cleaning            retrieve_and_clean_data(), transform_and_label()
pipeline_part2.py  Feature generation and model training  calculate_descriptors_and_features(), train_and_evaluate_models()
pipeline_part3.py  External predictions                   predict_with_model(), process_external_data()
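
To make the labeling stage more concrete, here is a minimal sketch of one common way to turn ChEMBL potency values into binary labels: convert IC50 (in nM) to pIC50 and apply a cutoff. This is not taken from transform_and_label(); the function names, the nanomolar input unit, and the 6.0 threshold (IC50 of 1 µM or better counts as active) are all assumptions for illustration.

```python
import math

# Assumed cutoff: pIC50 >= 6.0 (IC50 <= 1 µM) is labeled active.
ACTIVE_PIC50_THRESHOLD = 6.0


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 M, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm).
    return 9.0 - math.log10(ic50_nm)


def label_activity(ic50_nm: float,
                   threshold: float = ACTIVE_PIC50_THRESHOLD) -> int:
    """Return 1 (active) if pIC50 meets the threshold, else 0 (inactive)."""
    return 1 if ic50_nm_to_pic50(ic50_nm) >= threshold else 0


# Toy values, not real measurements.
for ic50 in (50.0, 25000.0):
    print(ic50, round(ic50_nm_to_pic50(ic50), 2), label_activity(ic50))
```

A fixed threshold is the simplest choice; a project may prefer a target-specific cutoff or a regression model on pIC50 directly.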

Technologies & Dependencies

Backend
  • Flask 2.0.3
  • scikit-learn 1.2.2
  • XGBoost 1.6.2
  • RDKit 2022.9.3 (PyPI build)
  • Mordred 1.2.0

Cloud Services
  • Google Cloud Run
  • Firebase Authentication
  • Cloud Firestore
  • Cloud Storage (GCS)
  • Brevo SMTP
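
The pinned backend versions listed above can be compared against a local environment with the standard library alone. The sketch below is one possible check, not a PreditX tool: the function name check_pins is hypothetical, and the PyPI distribution names (in particular rdkit-pypi and xgboost) are assumptions about how these packages were installed.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Tuple

# Versions from the Backend list; distribution names are assumed.
PINNED = {
    "Flask": "2.0.3",
    "scikit-learn": "1.2.2",
    "xgboost": "1.6.2",
    "rdkit-pypi": "2022.9.3",
    "mordred": "1.2.0",
}


def check_pins(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


for name, (expected, installed) in check_pins(PINNED).items():
    print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at exactly the pinned version.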
