This study uses natural language processing (NLP) and social media data to identify novel adverse effects of sildenafil, complementing traditional pharmacovigilance systems. We analyzed 2.92 million English-language posts from Twitter, Reddit, and health forums (2020–2025) using a multi-stage NLP pipeline. First, BERTweet classified sildenafil-related content (F1=0.925). BioClinical-BERT with a CRF layer then performed named-entity recognition of adverse events (strict F1=0.866), and a DeBERTa-v3 model extracted drug–adverse event relationships (F1=0.812). Statistical signal detection using proportional reporting ratios (PRR) identified 42 significant adverse event signals, including eight potentially novel effects (e.g., tinnitus [PRR=2.87] and night sweats [PRR=2.46]) and six underreported effects (e.g., nasal congestion). Posts mentioning off-label sildenafil use (11.4% of the corpus) showed 1.8-fold higher adverse event reporting rates (χ²=182.4, p<0.001). Synthetic data augmentation improved named-entity recognition only marginally (+1.7% F1), and its effect on relation extraction was not statistically significant. The pipeline detected signals in a median of 14.2 months, faster than spontaneous reporting systems. Expert review judged 75% of the novel signals to be clinically plausible. These findings highlight the potential of social media mining to uncover patient-reported drug safety concerns, although challenges remain in causal inference and scalability. The study provides a reproducible framework for integrating real-world patient narratives into pharmacovigilance, offering insights that could enhance post-marketing drug safety monitoring while remaining aligned with regulatory standards. Our open-source pipeline and validation approach advance methods for detecting emerging adverse drug reactions from unstructured patient-generated content.
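For readers unfamiliar with the proportional reporting ratio, the following is a minimal sketch of the standard PRR calculation from a 2×2 contingency table, with an approximate 95% confidence interval computed on the log scale. The function names, the example counts, and the rule of flagging a signal when the lower confidence bound exceeds 1 are illustrative assumptions; the abstract does not state the thresholds or counts used in the study.

```python
import math


def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio from a 2x2 table.

    a: posts mentioning the drug AND the adverse event
    b: posts mentioning the drug WITHOUT the adverse event
    c: comparator posts WITH the adverse event
    d: comparator posts WITHOUT the adverse event
    """
    drug_rate = a / (a + b)
    comparator_rate = c / (c + d)
    return drug_rate / comparator_rate


def prr_confidence_interval(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Approximate confidence interval for the PRR (log-normal approximation)."""
    log_prr = math.log(prr(a, b, c, d))
    standard_error = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return (math.exp(log_prr - z * standard_error),
            math.exp(log_prr + z * standard_error))


def is_signal(a: int, b: int, c: int, d: int) -> bool:
    """Hypothetical decision rule (an assumption, not the study's criterion):
    flag a signal when the lower confidence bound is above 1."""
    lower, _ = prr_confidence_interval(a, b, c, d)
    return lower > 1.0


# Made-up counts, for illustration only.
example_prr = prr(20, 80, 10, 90)
example_ci = prr_confidence_interval(20, 80, 10, 90)
```

With these made-up counts the event is reported in 20% of drug posts and 10% of comparator posts, so the PRR is 2; the interval is wide enough to include 1, so the hypothetical rule would not flag it.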