Coverage

An honest map of what is in the dataset, what is thin, and what is left out on purpose.

Two ways a drug enters the map

Coverage comes from two pipes, and understanding them explains most of what follows.

  • Approved and marketed drugs and diagnostics enter comprehensively, regardless of what disease they treat. This pull is indication-agnostic, so the approved landscape is broadly covered: vaccines, antivirals, antibiotics, metabolic drugs, oncology drugs, and drugs used in aging. Coverage is broad but not guaranteed to be exhaustive, and it excludes the non-drug products noted below.
  • Pipeline drugs (still in trials, not yet approved) enter through a set of therapeutic-area searches: oncology, immunology, cardiology, metabolic disease, neurology, rare disease, and psychiatry, including the psychedelic pipeline. The pipeline is deep for these areas and thinner for areas not actively searched.
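
The two entry rules above can be read as a single predicate. The following is a minimal sketch in Python, not BioCosm's actual ingestion code. The field names (status, therapeutic_area, product_type) and their values are assumptions made for illustration. The rule is also an idealization: approved coverage is broad rather than exhaustive, and pipeline programs outside the search set are thinner rather than strictly absent.

```python
from dataclasses import dataclass

# Therapeutic areas the text names as the active pipeline search set.
SEARCHED_AREAS = {
    "oncology", "immunology", "cardiology", "metabolic disease",
    "neurology", "rare disease", "psychiatry",
}

@dataclass(frozen=True)
class Program:
    # All field names and values here are hypothetical, not the real schema.
    name: str
    status: str                  # "approved" or "pipeline"
    therapeutic_area: str
    product_type: str = "drug"   # "drug", "diagnostic", "device", "supplement"

def expected_on_map(p: Program) -> bool:
    """Return True if the two-pipe rule would be expected to capture p."""
    # Hardware devices and non-drug products are left out on purpose.
    if p.product_type not in {"drug", "diagnostic"}:
        return False
    # Pipe 1: approved products enter regardless of indication.
    if p.status == "approved":
        return True
    # Pipe 2: pipeline programs enter only via the searched areas.
    return p.therapeutic_area in SEARCHED_AREAS

approved_antibiotic = Program("example antibiotic", "approved", "infectious disease")
early_vaccine = Program("example vaccine", "pipeline", "infectious disease")
oncology_trial = Program("example oncology drug", "pipeline", "oncology")
```

Under this sketch the approved antibiotic and the oncology trial drug are expected on the map, while the early-stage vaccine may be missing because infectious disease is not in the search set.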

What is fully covered

  • Essentially all FDA-approved drugs, across every disease area. Approved drugs are pulled comprehensively from drug databases (ChEMBL and Drugs@FDA), so coverage of approved small molecules and biologics is broad and indication-agnostic. Coverage is not guaranteed to be complete: a handful of older, discontinued, or non-drug approved products (see "What is left out, on purpose") fall outside it.
  • Major molecular and genomic diagnostics (liquid biopsy, prenatal and carrier screening, companion diagnostics, and the like). Diagnostic coverage is curated rather than exhaustive: BioCosm tracks clinically meaningful molecular tests, not the full universe of FDA-cleared devices and assays.
  • The active clinical pipeline (Phase 1 through 3) for oncology, immunology, cardiology, metabolic disease, neurology, rare disease, and psychiatry.
  • For each program: its biological target, mechanism, trials, sponsor, and, where available, revenue and a probability-of-success estimate.

What is partial

A few areas have their approved drugs but a thinner experimental pipeline, because they sit outside the active search set.

  • Early-stage infectious disease. Approved antivirals, antibiotics, and vaccines are present; the youngest experimental vaccine and anti-infective programs may not be.
  • Longevity and anti-aging. These therapeutics are covered, but each is filed under the disease its trial targets, not under a single "longevity" heading. Regulators do not recognize aging as an indication, so a reprogramming, NAD-boosting, or senolytic program appears under the specific disease it is tested in (for example, an eye disease or a metabolic condition) rather than grouped with the others. These programs are on the map; they are simply spread across disease regions.

What is left out, on purpose

  • Hardware medical devices and lab instruments (analyzers, monitors, pumps, sensors). BioCosm maps drugs and molecular diagnostics, not hardware. A dedicated device view may come later.
  • Nutritional supplements, parenteral-nutrition components, and similar non-drug products. These are FDA-regulated but are not drug development programs.
  • Programs where the author has a conflict of interest. A small number are redacted; those pages show a disclosure in place of analysis.

One drug, one node

The number on the map counts distinct drug and diagnostic programs, not raw trial records. A single drug appears in public data under many names and many trial arms: "sotorasib", "AMG 510", and "sotorasib 960mg" all refer to the same molecule in different studies. BioCosm collapses all of those into one node through nightly deduplication.

That is why the count is a curated figure: tens of thousands of trial rows reduce to a few thousand real programs. It is also why the internal database is larger than what the map shows. The difference consists of merged duplicates and removed noise, so the map stays a clean count of distinct programs rather than an inflated trial tally.
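
The collapsing step can be illustrated with a small Python sketch. This is not BioCosm's deduplication pipeline, whose details are not described here. It assumes a hand-written alias table and a simple rule that strips a trailing dose; the trial IDs are placeholders.

```python
import re
from collections import defaultdict

# Hypothetical alias table: development codes mapped to a canonical name.
ALIASES = {"amg 510": "sotorasib"}

# Matches a trailing dose such as " 960mg" or " 960 mg" and anything after it.
DOSE = re.compile(r"\s+\d+(?:\.\d+)?\s*(?:mg|mcg|g|ml)\b.*$", re.IGNORECASE)

def canonical_name(raw: str) -> str:
    """Normalize a raw intervention string to one canonical drug name."""
    name = DOSE.sub("", raw.strip()).lower()
    # Treat hyphens and runs of whitespace as a single space ("AMG-510").
    name = re.sub(r"[\s\-]+", " ", name)
    return ALIASES.get(name, name)

def collapse(trial_rows):
    """Group (trial_id, raw_name) rows into one node per canonical name."""
    nodes = defaultdict(list)
    for trial_id, raw_name in trial_rows:
        nodes[canonical_name(raw_name)].append(trial_id)
    return dict(nodes)

# Placeholder trial IDs; the names are the examples from the text.
rows = [
    ("T1", "sotorasib"),
    ("T2", "AMG 510"),
    ("T3", "sotorasib 960mg"),
    ("T4", "AMG-510"),
]
nodes = collapse(rows)
```

Here four trial rows collapse into a single "sotorasib" node that keeps all four trial IDs, which is the same shape of reduction as the trial-rows-to-programs count described above. A real alias table would need to be curated, since development codes cannot be derived from the name alone.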

Coverage evolves as the pipeline moves and the searches expand. For how the data is reconciled and how the estimates work, see About and Methodology. Data freshness varies, so always verify against primary sources before acting on any data point.