The FDA receives a substantial volume of drug label information daily, encompassing both new submissions and revisions. By accessing the OpenFDA platform, particularly through the Drug API Endpoints under Product Labeling, you can visualize the trends in drug label submissions over the years. Astonishingly, around 40,000 labels were submitted in 2023 alone.
These drug labels, also known as prescribing information, contain a wealth of valuable data, including brand names, active and inactive ingredients, manufacturer details, drug types, dosage forms, and more. I analyzed this data in detail with the help of AI tools. OpenFDA aggregates every submitted label when it presents statistics on prescription and OTC drugs, so many entries are merely revisions under the same drug name and manufacturer.
To refine the data, I screened 228,420 labels (acquired from OpenFDA on August 3, 2024) that were submitted between 2009 and 2024, keeping one entry for each unique combination of generic name and manufacturer. This screening resulted in 40,807 distinct entries, which removes the repeated revisions and gives a more accurate dataset for analysis. Using this refined dataset, I generated a chart of the number of distinct drug entries submitted each year, offering a clearer view of industry trends.
(Data source: OpenFDA Drug Labeling datasets, 12 JSON files. Tools: ChatGPT, Python with pandas, matplotlib, and other Python packages.)
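The screening step can be sketched roughly as follows. This is a minimal illustration, not the exact script used for the analysis. It assumes each record carries an `effective_time` string (YYYYMMDD) and list-valued `openfda.generic_name` and `openfda.manufacturer_name` fields, as in the OpenFDA labeling downloads. Keeping the earliest record per pair, matching names case-insensitively, and skipping records with missing fields are all assumptions made for this example. The helper names (`label_key`, `distinct_labels`, `count_by_year`) and the sample records are hypothetical.

```python
from collections import Counter


def label_key(record):
    """Return a normalized (generic name, manufacturer) key, or None if either is missing."""
    openfda = record.get("openfda", {})
    generic = openfda.get("generic_name") or []
    maker = openfda.get("manufacturer_name") or []
    if not generic or not maker:
        return None
    # Assumption: use the first listed value and compare case-insensitively.
    return (generic[0].strip().upper(), maker[0].strip().upper())


def distinct_labels(records):
    """Keep the earliest record for each (generic name, manufacturer) pair."""
    dated = [r for r in records if "effective_time" in r]
    earliest = {}
    # YYYYMMDD strings sort chronologically, so a plain string sort is enough.
    for rec in sorted(dated, key=lambda r: r["effective_time"]):
        key = label_key(rec)
        if key is not None and key not in earliest:
            earliest[key] = rec
    return earliest


def count_by_year(distinct):
    """Count distinct entries by the year part of effective_time."""
    return Counter(rec["effective_time"][:4] for rec in distinct.values())


# Hypothetical sample records, made up purely for illustration.
sample = [
    {"effective_time": "20210105",
     "openfda": {"generic_name": ["IBUPROFEN"], "manufacturer_name": ["Acme Pharma"]}},
    {"effective_time": "20230310",
     "openfda": {"generic_name": ["ibuprofen"], "manufacturer_name": ["Acme Pharma"]}},
    {"effective_time": "20220101",
     "openfda": {"generic_name": ["IBUPROFEN"], "manufacturer_name": ["Other Labs"]}},
    {"effective_time": "20200101", "openfda": {}},
]

distinct = distinct_labels(sample)
per_year = count_by_year(distinct)
```

With pandas, the same idea is usually written with `DataFrame.drop_duplicates(subset=[...])` on the two name columns. The plain-Python version is shown here only to keep the sketch free of dependencies.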
Next, I wanted to understand how drug products are distributed across administration routes, such as oral (tablets or capsules) and injection (vials or syringes). To do this, I counted the number of drug labels associated with each route. Many drug products list more than one route (e.g., oral and injection, intravenous and subcutaneous). For entries with multiple routes, I selected a single route for analysis using the following order of precedence, where ">" means "is preferred over":
- subcutaneous > intramuscular > intravenous
- sublingual > buccal
- infiltration > intrathecal > epidural
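The precedence rule above could be applied with a small helper like the one below. This is a sketch under stated assumptions, not the code used for the analysis. The three chains come from the list above. Checking the chains in the order listed, and falling back to the first listed route when none of the routes appears in a chain, are assumptions for this example, since the rule does not cover those cases. `PRECEDENCE_CHAINS` and `select_route` are hypothetical names.

```python
# Each chain is ordered from most preferred to least preferred, as in the rule above.
PRECEDENCE_CHAINS = [
    ["SUBCUTANEOUS", "INTRAMUSCULAR", "INTRAVENOUS"],
    ["SUBLINGUAL", "BUCCAL"],
    ["INFILTRATION", "INTRATHECAL", "EPIDURAL"],
]


def select_route(routes):
    """Pick one route from a label's list of routes using the precedence chains."""
    normalized = [r.strip().upper() for r in routes]
    # Assumption: chains are checked in the order listed, so a label mixing
    # routes from two chains resolves to the earlier chain.
    for chain in PRECEDENCE_CHAINS:
        for candidate in chain:
            if candidate in normalized:
                return candidate
    # Assumption: routes outside every chain fall back to the first one listed.
    return normalized[0] if normalized else None


# Illustrative calls with made-up route lists.
picked_injection = select_route(["INTRAVENOUS", "SUBCUTANEOUS"])
picked_oral_mucosal = select_route(["buccal", "sublingual"])
picked_spinal = select_route(["EPIDURAL", "INTRATHECAL"])
```

Counting labels per route then reduces to applying `select_route` to each entry and tallying the results.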
Stay tuned for more analysis of these 40,807 distinct drug labels.