How AI Makes Identifying Biomarkers More Accurate and Efficient
Current approaches to drug development that rely solely on identifying statistical differences between healthy and diseased patients to pinpoint biomarkers, molecular targets, or suitable drugs often fail to provide biologically useful results. While these differences can help predict which patients are likely to have misregulation of a particular gene, receptor, or protein, the majority of the differences will not be associated with the disease of interest (i.e., the differences would be considered false positives).
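To see why purely statistical screens produce so many false positives, consider that testing thousands of genes at a conventional threshold will flag many by chance alone. The sketch below is an illustration, not the method described in this article: it applies the standard Benjamini-Hochberg false discovery rate procedure to a handful of made-up p-values and compares the result with a naive p < 0.05 cutoff. The gene count and p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return indices of hypotheses rejected at the given false discovery rate."""
    m = len(p_values)
    # Indices sorted by ascending p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff
    return sorted(order[:cutoff])


# Hypothetical p-values for six genes (illustrative numbers only)
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]

naive = [i for i, p in enumerate(p_values) if p < 0.05]
corrected = benjamini_hochberg(p_values, fdr=0.05)

print("naive hits:    ", naive)
print("corrected hits:", corrected)
```

Even after such a correction, the surviving genes are only statistically distinct; whether they are biologically tied to the disease still has to be checked against the literature, which is the step the rest of this article addresses.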
The usual method of determining which differences are clinically relevant consists of a manual review of thousands of documents. This approach is extremely time-consuming and unsustainable because of the speed at which the quantity of healthcare information is growing. To handle the vast amount of data available, the process needs to be automated, which will require implementation of AI technologies. Automation will in turn free up researchers to focus on identifying new biomarkers, targets, and drugs rather than spending an inordinate amount of time validating them through manual literature reviews.
To automate this process, we analyzed patient data from public and enterprise sources using AI-powered computational algorithms. We used this data to generate an initial set of potential biomarkers and estimates of patient responsiveness. A hypothesis was generated by examining data from 100+ patients, identifying 89 genes with altered expression, analyzing survival outcomes, and connecting these outcomes to specific biomarkers. These biomarkers were then validated statistically by assessing their implications via the Innoplexus life sciences data ocean.
We proceeded by identifying endotype responses in five steps:
- Patient data aggregation: Our machines crawled through publicly available data sources with patient genomics data to collect gene expressions in different patient groups. The data was then normalized.
- Mapping: Differentially expressed genes (DEGs) between data sets and important features were identified using AI technologies such as machine learning.
- Clustering: The genomic data collected was clustered using an AI technology that enabled subsequent analyses by identifying biomarkers linked to patient outcomes.
- Analysis: The next step was to identify biomarkers indicative of drug responders, nonresponders, and adversely affected responders (e.g., those who experienced toxicity).
- Hypothesis validation: By leveraging our proprietary life sciences data ocean and using a network analysis approach, we further validated the biomarkers.
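As a rough illustration of the mapping step only, the sketch below flags candidate DEGs between two patient groups. It substitutes a simple log2 fold-change threshold for the machine learning methods mentioned above, which the article does not specify. The gene names, expression values, pseudocount, and threshold are all hypothetical.

```python
import math
from statistics import mean


def log2_fold_changes(group_a, group_b, pseudocount=1.0):
    """Per-gene log2 ratio of mean expression in group_a over group_b.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return {
        gene: math.log2(
            (mean(group_a[gene]) + pseudocount)
            / (mean(group_b[gene]) + pseudocount)
        )
        for gene in group_a
        if gene in group_b
    }


def select_degs(fold_changes, min_abs_lfc=1.0):
    """Keep genes whose expression differs at least two-fold (|log2 FC| >= 1)."""
    return [g for g, lfc in fold_changes.items() if abs(lfc) >= min_abs_lfc]


# Hypothetical normalized expression values per gene, three patients per group
responders = {"GENE_A": [10, 12, 11], "GENE_B": [50, 55, 45], "GENE_C": [8, 9, 7]}
nonresponders = {"GENE_A": [11, 10, 12], "GENE_B": [12, 10, 14], "GENE_C": [35, 30, 31]}

lfc = log2_fold_changes(responders, nonresponders)
degs = select_degs(lfc)
print(degs)
```

A real analysis would add a significance test and multiple-testing control before passing candidates to the clustering and validation steps.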
Solutions offered by Innoplexus
Customized genomic marker dashboard
We offer clients a customized genomic marker dashboard to assist in optimizing clinical trial patient stratification by level of response to therapy. The dashboard enables the client to identify patient endotypes using their own and public data. It is capable of connecting endotype responses to published literature and the client’s internal clinical trial data and can continually integrate new data streams (e.g., from additional clinical trials) to update endotype response predictions.
Biomarker identification dashboard
Our proprietary ontology-annotated life sciences data ocean is populated by publicly available information from trials and literature. It provides the biomarker identification dashboard with biologically normalized data using patent-pending technologies. The dashboard empowers researchers to identify biomarkers with high precision, leveraging artificial intelligence over aggregated enterprise and public data. It allows stratification of patient groups with different endotypes for a given indication to improve disease diagnosis, prognosis, or treatment response. Extensive biological multigraph-based scientific validation over public knowledge is deployed to ensure high biomarker prediction accuracy while removing false positives.
Innoplexus’s sophisticated proprietary technology and in-house data collection make the time-consuming and complex task of lead identification easier, faster, and more accurate. Complex combinations of machine learning-driven traditional and advanced ligand- and structure-based approaches are applied iteratively to optimize hits into lead candidates. In addition, our product dashboards with pre-built machine intelligence networks of data points provide insights around obvious and nonobvious connections. These connections are systematically audited with respect to the optimized compound for their associated targets impacting a variety of indications.
We help generate bias-free insights that clarify complex, dynamic molecular interactions and enable faster target identification. Our dashboard visualizes both novel and well-established relationships, along with their correlation scores from the literature, allowing users to explore direct and indirect associations. Each target can be prioritized based on a custom druggability score, which includes the number of relevant data sources (publications, trials, etc.), approved drugs for a given target, experimentally known structures, number of antibodies, genetic associations, and more. Clients can examine each target in depth and access all real-world evidence and reports associated with it.
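One way to picture a score of this kind is as a weighted combination of the evidence counts listed above. The sketch below is an assumption for illustration, not the actual Innoplexus scoring formula: the `TargetEvidence` fields mirror the factors named in the text, while the weights, the log scaling, and the example targets are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class TargetEvidence:
    name: str
    data_sources: int          # publications, trials, etc.
    approved_drugs: int
    known_structures: int      # experimentally determined structures
    antibodies: int
    genetic_associations: int


# Hypothetical weights; a real score would be tuned per project
WEIGHTS = {
    "data_sources": 0.2,
    "approved_drugs": 0.3,
    "known_structures": 0.2,
    "antibodies": 0.1,
    "genetic_associations": 0.2,
}


def druggability_score(target, weights=WEIGHTS):
    """Weighted sum of log-scaled counts, so one large count cannot dominate."""
    return sum(w * math.log1p(getattr(target, field)) for field, w in weights.items())


# Hypothetical targets with made-up evidence counts
targets = [
    TargetEvidence("TARGET_X", 250, 3, 12, 40, 8),
    TargetEvidence("TARGET_Y", 15, 0, 1, 2, 1),
]

ranked = sorted(targets, key=druggability_score, reverse=True)
for t in ranked:
    print(f"{t.name}: {druggability_score(t):.2f}")
```

Log scaling is a design choice here: it rewards the presence of each kind of evidence more than sheer volume in any single category.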
Benefits to the client
The benefits of using Innoplexus’s technology set us apart as a leader in AI solutions for pharma and include:
- State-of-the-art machine learning-powered technology
- Access to a comprehensive chemical library connected to multiple data sources
- Real-time and continuously updated data
- Capacity to execute high-end computational calculations
- Projects executed by highly experienced data and computational scientists
We contextualize data from 95% of publicly available life sciences sources. These sources include clinical research, academic literature, experimental studies, commercial insights, published chemicals and their preclinical studies, and siloed datasets. This data is used to identify and validate drug leads via natural language processing (NLP) that understands life sciences. By leveraging our life sciences-specific ontology, which understands relationships between genes, pathways, targets, and diseases, we help clients automate, and thereby make more efficient, the identification and validation of drug targets.