Features of DeepProtein

DeepProtein is a PyTorch-based deep learning library and benchmark for protein sequence learning. It lets researchers without a computational background apply deep learning to protein data, and it supports deep learning method research on this topic by providing a flexible framework (usable in a few lines of code) and baselines. The GitHub repository is located here.

Features

  • For computational researchers, 10+ powerful encodings for proteins, ranging from CNNs and transformers to GNNs and graph transformers.

  • Realistic and user-friendly design:

    • Applications in Protein Property Prediction, Localization Prediction, Protein-Protein Interaction, Antigen Epitope Prediction, Antibody Paratope Prediction, Antibody Developability Prediction, and more.

    • Easy monitoring of the training process with detailed training metrics output, such as test set figures (AUCs) and tables; early stopping is also supported.

    • Various evaluation metrics: ROC-AUC, PR-AUC, and F1 for binary tasks; MSE, R-squared, and Concordance Index for regression tasks.

    • Time reference for computationally expensive encodings.

    • PyTorch-based, with support for CPU, GPU, and multi-GPU.
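
To make the regression metrics listed above concrete, here is a minimal pure-Python sketch of MSE, R-squared, and the Concordance Index. It is not DeepProtein's own implementation: the function names `mse`, `r_squared`, and `concordance_index` are hypothetical and chosen for illustration, and the tie handling (tied predictions count as half-concordant, tied labels are skipped) is one common convention that the library may or may not follow.

```python
from itertools import combinations
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes the labels are not all identical (SS_tot > 0).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def concordance_index(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of comparable pairs whose predictions are ordered like the labels.

    Assumed convention: pairs with equal labels are not comparable and are
    skipped; pairs with equal predictions count as 0.5. Requires at least
    one pair with distinct labels.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # tied labels carry no ordering information
        comparable += 1
        if p_i == p_j:
            concordant += 0.5
        elif (p_i > p_j) == (t_i > t_j):
            concordant += 1.0
    return concordant / comparable
```

Under these assumptions, predictions that rank every sample in the same order as the labels give a Concordance Index of 1.0, a fully reversed ranking gives 0.0, and constant predictions give 0.5. The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch but slow on large test sets.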

TODO

  • List

    • Pretraining

    • Protein Structure Prediction

    • Protein design

    • Combination of Models

    • Lightweight Data Processing with Fewer Lines of Code