Methodology Basis
Key Point
Discourses implements the methodology described in the academic research of Paolucci et al. (2024) on cross-sectional return predictability using text data. We do not use, distribute, or provide access to the underlying research data.
Our sentiment analysis model is a commercial implementation built from the ground up based on the methodological framework and analytical approaches described in published academic literature. The methodology includes:
- Financial domain-specific sentiment lexicon construction principles
- Context-aware negation and intensifier handling techniques
- Document-level sentiment aggregation approaches
- Cross-sectional normalization frameworks
These methodological elements are implemented independently using our own proprietary code, infrastructure, and training processes.
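For readers unfamiliar with these four elements, the following is a minimal, generic sketch of how lexicon scoring, negation and intensifier handling, document-level aggregation, and cross-sectional normalization can fit together. It is not the Discourses implementation and is not taken from the cited research. The word lists, the two-token look-back window, the averaging rule, and all function names are assumptions chosen purely for illustration.

```python
import re
from statistics import mean, pstdev

# Hypothetical toy word lists, for illustration only (not a real lexicon).
LEXICON = {"gain": 1.0, "strong": 1.0, "beat": 1.0,
           "loss": -1.0, "weak": -1.0, "miss": -1.0}
NEGATORS = {"not", "no", "never"}
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}
WINDOW = 2  # assumed look-back window for modifiers, in tokens


def sentence_score(sentence: str) -> float:
    """Sum lexicon values, adjusting each for nearby negators/intensifiers."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        value = LEXICON[tok]
        # Inspect up to WINDOW tokens immediately before the sentiment word.
        for prev in tokens[max(0, i - WINDOW):i]:
            if prev in NEGATORS:
                value = -value              # negation flips polarity
            elif prev in INTENSIFIERS:
                value *= INTENSIFIERS[prev]  # intensifier scales magnitude
        score += value
    return score


def document_score(text: str) -> float:
    """Aggregate to document level as the mean of sentence scores."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return mean(sentence_score(s) for s in sentences)


def cross_sectional_zscores(scores: dict[str, float]) -> dict[str, float]:
    """Standardize scores across entities observed in the same period."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # No dispersion across entities: every entity is at the mean.
        return {name: 0.0 for name in scores}
    return {name: (v - mu) / sigma for name, v in scores.items()}


if __name__ == "__main__":
    # Hypothetical tickers and texts.
    docs = {
        "AAA": "Results were very strong. The company beat estimates.",
        "BBB": "Revenue was not strong. The quarter ended with a loss.",
    }
    raw = {ticker: document_score(text) for ticker, text in docs.items()}
    print(cross_sectional_zscores(raw))
```

In this sketch the normalization step makes scores comparable across entities within one period, which is the property a cross-sectional analysis relies on. A production system would differ in lexicon size, modifier scope, and aggregation weights.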
Data Independence
Discourses does not use any underlying data from the original academic research. This includes but is not limited to:
- Training Data: We do not use any datasets that were compiled or used in the original academic research. Our model is trained on independently sourced and licensed data.
- Validation Data: We do not use academic test sets or validation datasets. Our model validation uses separate, independently acquired datasets.
- Lexicons: While informed by published methodological principles, our sentiment lexicons are constructed independently and do not copy or redistribute any proprietary word lists from academic sources.
- Model Weights: Our model parameters are derived entirely from our own training processes, not transferred from any academic implementation.
- Historical Sentiment Series: We do not distribute or provide access to any sentiment time-series data from the original research.
Academic License Compliance
We are committed to respecting academic intellectual property and licensing terms:
- No Data Redistribution: Academic research data is typically licensed for non-commercial, research purposes only. We do not redistribute any such data through our commercial service.
- Methodology vs. Data: Published methodologies in academic literature are generally available for implementation. We implement only the publicly described methodological concepts, not proprietary implementations or data.
- Independent Development: Our codebase, model architecture, and all associated intellectual property were developed independently by Discourses without access to or use of any academic source code.
Why This Matters: Academic data licenses typically restrict commercial use. By implementing only the methodology and not using the underlying data, we keep our commercial service within academic licensing terms while still providing users with sentiment analysis based on published research approaches.
Implementation Details
Our implementation differs from academic research in several important ways:
| Aspect | Academic Research | Discourses Model |
|---|---|---|
| Data Sources | Research-licensed datasets | Independently licensed commercial data |
| Infrastructure | Academic computing resources | Cloud-native production infrastructure |
| Lexicons | Research-specific word lists | Proprietary lexicons built from commercial sources |
| Updates | Point-in-time research snapshot | Continuously updated and improved |
| Purpose | Academic hypothesis testing | Real-time commercial analysis |
Attribution & Citation
Discourses acknowledges that our sentiment analysis methodology is informed by and builds upon the academic research contributions of Paolucci et al. (2024) in the field of financial text analysis and cross-sectional return predictability. The methodology whitepaper is available on SSRN.
Users interested in the academic foundations of our methodology are encouraged to review the original published research. Academic users may cite the original research when discussing the methodological basis of analyses conducted using the Discourses Model.
Note: Discourses is the commercial implementation. For academic research purposes requiring the original methodology and data, please refer to the original academic publications and their associated data repositories.
No Academic Affiliation
Discourses is an independent commercial entity. We are not affiliated with, endorsed by, or officially connected to any academic institution, university, or the authors of the underlying academic research, except where explicitly stated.
The Discourses Model is built on the methodological framework described in Paolucci et al. (2024). This reference is made for clarity and attribution purposes and does not imply:
- Official endorsement of Discourses by the research authors
- Partnership or collaboration with any academic institution
- Access to proprietary academic data or implementations
- Certification or validation by academic researchers
Questions?
For questions regarding our research methodology, data independence, or academic compliance, or for academic collaboration or licensing inquiries, please contact us at support@discourses.io.